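I'm trying to write a program that lists the positions at which a substring occurs within a parent string. For example, if we search for "bc" in the parent string "abcabcabcabcabcabca", the program should return 1, 4, 7, 10, 13, 16.

So far, I have been using: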

import string

def subStringMatchExact():
    print "This program will index the locations a given sequence"
    print "occurs within a larger sequence"
    seq = raw_input("Please input a sequence to search within: ")
    sub = raw_input("Please input a sequence to search for: ")
    n = 0
    for i in seq:
        x = string.find(seq, sub [n:])
        print x
        n = x + 1

I have also tried replacing string.find with the string.index function. Any suggestions would be greatly appreciated.

3 Answers

3

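Just call .find() on the input string itself. It returns the position of the match, or -1 if no match is found. It also accepts a start argument, so you can look for the next match: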

def subStringMatchExact():
    print "This program will index the locations a given sequence"
    print "occurs within a larger sequence"
    seq = raw_input("Please input a sequence to search within: ")
    sub = raw_input("Please input a sequence to search for: ")

    positions = []
    pos = -1
    while True:
        pos = seq.find(sub, pos + 1)  # start searching *beyond* the previous match
        if pos == -1:   # Not found
            break
        positions.append(pos)
    return positions
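
The same loop can also be pulled into a function that takes both strings as arguments instead of prompting for them, which makes it easier to test and reuse. This is only a sketch: the name find_all_positions is made up for illustration, and returning an empty list for an empty search string is an assumption, not something the question specifies.

```python
def find_all_positions(seq, sub):
    """Return every index at which sub starts within seq (overlaps included)."""
    if not sub:
        # Assumption: an empty search string is treated as "no matches".
        return []
    positions = []
    pos = seq.find(sub)
    while pos != -1:
        positions.append(pos)
        # Resume one character past the previous match so overlaps are found.
        pos = seq.find(sub, pos + 1)
    return positions
```

Because the search resumes at pos + 1 rather than pos + len(sub), overlapping occurrences are reported as well: searching "aaaa" for "aa" gives [0, 1, 2].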
answered 2013-02-24T21:32:44.340
3

I'm lazy, so I would use re.finditer:

>>> import re
>>> s = "abcabcabcabcabcabca"
>>> for m in re.finditer('bc',s):
...     print m.start()
... 
1
4
7
10
13
16
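
Two things are worth keeping in mind with this approach: the search string is interpreted as a regular expression, and re.finditer reports only non-overlapping matches. The sketch below assumes a literal match is wanted, so it passes the input through re.escape, and it optionally wraps the pattern in a lookahead to pick up overlapping hits. The function name and the overlapping flag are hypothetical.

```python
import re

def regex_positions(seq, sub, overlapping=False):
    """Return the start index of each literal occurrence of sub in seq."""
    # re.escape makes characters such as "." or "*" match themselves.
    pattern = re.escape(sub)
    if overlapping:
        # A zero-width lookahead matches at a position without consuming it,
        # so the scan can report occurrences that overlap each other.
        pattern = "(?=" + pattern + ")"
    return [m.start() for m in re.finditer(pattern, seq)]
```

With the default setting, searching "aaaa" for "aa" yields [0, 2]; with overlapping=True it yields [0, 1, 2].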
answered 2013-02-24T21:33:29.950
0

If that matters to you, a list comprehension is a very elegant way to do it:

>>> seq = "abcabcabcabcabcabca"
>>> sub = "bc"
>>> [i for i in range(len(seq)) if seq[i:].startswith(sub)]
[1, 4, 7, 10, 13, 16]

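It walks through the string and, at each position, checks whether the rest of the string (from that position to the end) starts with the given substring. If it does, the position is collected; if not, it moves on to the next one. It is compact rather than fast, though: seq[i:] copies the remainder of the string at every position, so the cost grows quickly on long inputs.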
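
A variant of the same comprehension, offered as a sketch: str.startswith accepts a start offset, so the check can run against the original string without building a slice at every position. The function name is hypothetical, and stopping the range early is an added shortcut.

```python
def comprehension_positions(seq, sub):
    """Return every index at which sub starts within seq (overlaps included)."""
    # The range stops at the last index where sub could still fit;
    # if sub is longer than seq the range is empty.
    last_start = len(seq) - len(sub) + 1
    # startswith(sub, i) tests seq from offset i without copying seq[i:].
    return [i for i in range(last_start) if seq.startswith(sub, i)]
```

For the example strings above, comprehension_positions(seq, sub) gives the same list, [1, 4, 7, 10, 13, 16].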

answered 2013-02-24T21:36:03.247