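Given a string of length n composed of the characters A, B, and D.

Ex-1: AAAABAAAADADDDDADDDBBBBBBDDDDA

Threshold x: a given substring may contain any other contiguous substring of maximum length x.

Ex-2: For the A substring in Ex-1, AAAABAAAADA is a valid substring with boundaries (1,11) for threshold x = 2.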
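To make the threshold easier to see, here is a minimal sketch that run-length encodes the string, so each run of "other" characters can be compared against x directly. The helper name `run_lengths` is made up for this illustration and is not part of the question.

```python
from itertools import groupby


def run_lengths(s):
    """Return [(char, run_length), ...] for each run of equal consecutive characters."""
    return [(ch, len(list(group))) for ch, group in groupby(s)]


example = "AAAABAAAADADDDDADDDBBBBBBDDDDA"
runs = run_lengths(example)
print(runs)

# Runs inside the Ex-2 substring: every non-A run has length 1, which is <= x = 2.
print(run_lengths("AAAABAAAADA"))
```

In the encoded form of AAAABAAAADA the only non-A runs are a single B and a single D, which is why it qualifies as an A substring for x = 2.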

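Similarly, I want to extract the A substrings and the D substrings separately from the main string, ignoring B. There can be many substrings of each type in the main string.

Model output: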

Type Boundaries
A    1,11
D    12,20
D    26,29 

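A string of As is broken if the distance is greater than the threshold. I implemented this in an inefficient, non-algorithmic way by finding the distance between As. I had to run this separately for A and for D. This results in overlapping boundary regions.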
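The per-type approach described above might look roughly like the sketch below. It is not the original implementation; the function name `extract_segments` is hypothetical, and it assumes a segment is cut only when more than x consecutive other characters separate two occurrences of the target. It also reports single-character segments, which the model output above leaves out.

```python
def extract_segments(s, target, x):
    """Return 1-based inclusive (start, end) boundaries of segments of `target`.

    Assumption: a segment is cut when more than `x` consecutive non-target
    characters lie between two occurrences of `target`.
    """
    segments = []
    start = last = None
    for i, ch in enumerate(s):
        if ch != target:
            continue
        if last is not None and i - last - 1 > x:
            # Too many contaminants since the previous target: close the segment.
            segments.append((start + 1, last + 1))
            start = i
        if start is None:
            start = i
        last = i
    if start is not None:
        segments.append((start + 1, last + 1))
    return segments


example = "AAAABAAAADADDDDADDDBBBBBBDDDDA"
print(extract_segments(example, "A", 2))  # [(1, 11), (16, 16), (30, 30)]
print(extract_segments(example, "D", 2))  # [(10, 19), (26, 29)]
```

Running it once per type shows the overlap problem: the A segment (1, 11) and the D segment (10, 19) share positions 10 and 11.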

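Is there a better approach to solve this problem?

Edit-1

A valid substring can be of any length, but it should not be contaminated by other substrings longer than the threshold x. This means that while searching for an A substring, it should not contain other characters (B, D) running consecutively for more than the threshold.

If x = 2, then while searching for A, AABBAAAA and AABDAAAA are valid, but AADBDAAA and AABBBAAA are not. The same applies while searching for D (A and B would be the contaminants).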
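The rule in Edit-1 can be checked with a small sketch like the one below. The name `is_valid` is invented for this example, and it only checks the longest run of contaminants; it does not require the candidate to start or end with the target character.

```python
def is_valid(sub, target, x):
    """True if no run of consecutive non-target characters in `sub` exceeds x."""
    longest = current = 0
    for ch in sub:
        if ch == target:
            current = 0  # a target character ends the current contaminant run
        else:
            current += 1
            longest = max(longest, current)
    return longest <= x


for candidate in ["AABBAAAA", "AABDAAAA", "AADBDAAA", "AABBBAAA"]:
    print(candidate, is_valid(candidate, "A", 2))
```

With x = 2 the first two candidates come out valid and the last two invalid, matching the examples in Edit-1.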

EDIT-2: Implementation using "Pham Trung"'s answer

Code:

start = 0
lastA = -1
lastD = -1
x = 2

arr = ["A", "A", "A", "A", "B", "A", "A", "A", "A", "D", "A", "D", "D", "D", "D", "A", "D", "D", "D", "B", "B", "B", "B", "B", "B", "D", "D", "D", "D", "A"]

for i in range(0, len(arr)):
    if(arr[i] == 'A'):
        if(lastA != -1 and i - lastA > x):
            print("A", start + 1, lastA + 1)
            start = i
        lastA = i
    elif(arr[i] == 'D'):
        if(lastD != -1 and i - lastD > x):
            print("D", start + 1, lastD + 1)
            start = i
        lastD = i

Output:

A 1 11
D 16 19
A 26 16

The code is unable to extract the substrings after the 1st substring.

1 Answer

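So, here are some suggestions for your problem:

Since there are only three types of characters in the string, it is easy to keep track of the last position of each of them.

Starting from the beginning of the string, track the distance between the current character and its last position. If it is greater than the threshold, break there and start a new substring from that point.

Pseudocode: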

int start = 0;
int lastA = -1;
int lastD = -1;
for(int i = 0; i < input.length(); i++){
    if(input.charAt(i) == 'A'){
       if(lastA != -1 && i - lastA > x){
           create a substring from start to i - 1;
           start = i; //Update the new start for the next substring
           lastD = -1;//Reset count for D
       }
       lastA = i;
    }else if(input.charAt(i) == 'D'){
       //Do similar to what we do for character A
    }
}
create a substring from start to end of the string; //We need to add the last substring.

Updated Python code:

start = 0
lastA = -1
lastD = -1
x = 2

arr = ["A", "A", "A", "A", "B", "A", "A", "A", "A", "D", "A", "D", "D", "D", "D", "A", "D", "D", "D", "B", "B", "B", "B", "B", "B", "D", "D", "D", "D", "A"]

for i in range(0, len(arr)):
    if(arr[i] == 'A'):
        if(lastA != -1 and i - lastA > x):
            print("A", start + 1, lastA + 1)
            start = lastA + 1
            while(start < len(arr) and arr[start] == 'B'):
                start = start + 1
            lastD = -1 
        lastA = i
    elif(arr[i] == 'D'):
        if(lastD != -1 and i - lastD > x):
            print("D", start + 1, lastD + 1)
            start = lastD + 1
            while(start < len(arr) and arr[start] == 'B'):
                start = start + 1
            lastA = -1
        lastD = i
while(start < len(arr) and arr[start] == 'B'):
    start = start + 1 
if(start < len(arr)):
    print("A or D", start + 1, len(arr))
answered 2015-12-23T09:11:58.143