python - Python next substring search

Question

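I am sending a message with a preamble/postamble multiple times. I want to be able to extract the message between a valid preamble and a valid postamble. My current code is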

print(msgfile[msgfile.find(preamble) + len(preamble):msgfile.find(postamble, msgfile.find(preamble))])

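The problem is that if the postamble is corrupted, it prints all the data between the first valid preamble and the next valid postamble. An example of the received text file: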

garbagePREAMBLEmessagePOSTcMBLEgarbage
garbagePRdAMBLEmessagePOSTAMBLEgarbage
garbagePREAMBLEmessagePOSTAMBLEgarbage

It prints

messagePOSTcMBLEgarbage
garbagePRdAMBLEmessage

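But what I actually want it to print is the message from the third line, since that one has both a valid preamble and a valid postamble. So I guess what I want is to be able to find and index from the next instance of a substring. Is there an easy way to do this?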

Edit: I don't expect my data to be in nice discrete lines. I just formatted it that way so it would be easier to see.

score 0 · Accepted Answer

import re

lines = ["garbagePREAMBLEmessagePOSTcMBLEgarbage",
        "garbagePRdAMBLEmessagePOSTAMBLEgarbage",
        "garbagePREAMBLEmessagePOSTAMBLEgarbage"]

# you can use regex
my_regex = re.compile("garbagePREAMBLE(.*?)POSTAMBLEgarbage")

# get the match found between the preambles and print it
for line in lines:
    found = my_regex.match(line)
    # if there is a match print it
    if found:
        print(found.group(1))

# you can use string slicing
def validate(pre, post, lines):
    for line in lines:
        # slicing would misbehave on a line shorter than both markers combined
        if len(line) < len(pre) + len(post):
            print("error line is too small")
            continue

        # see if the line starts with pre and ends with post
        if line.startswith(pre) and line.endswith(post):
            # print the message between the two markers
            print(line[len(pre):-len(post)])

validate("garbagePREAMBLE","POSTAMBLEgarbage", lines)

score 0 · Accepted Answer

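Are all the messages on single lines? Then you can use a regular expression to identify the lines that have a valid preamble and postamble: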

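import re

pat = re.compile('PREAMBLE(.+)POSTAMBLE')

# yourfilename is the path to the received text file
with open(yourfilename) as input_file:
    # search each line once and keep only the lines that matched
    matches = (pat.search(line) for line in input_file)
    messages = [m.group(1) for m in matches if m]

print(messages)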
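
One caveat with the pattern above, offered as a side note: `.+` is greedy, so if a single line ever held more than one message, the capture would run from the first PREAMBLE to the last POSTAMBLE. A minimal sketch of the difference, using a made-up line (`line` is illustrative and not taken from the question):

```python
import re

# Hypothetical line that contains two complete messages.
line = "xPREAMBLEfirstPOSTAMBLEyPREAMBLEsecondPOSTAMBLEz"

greedy = re.compile(r"PREAMBLE(.+)POSTAMBLE")   # longest possible capture
lazy = re.compile(r"PREAMBLE(.+?)POSTAMBLE")    # shortest possible capture

greedy_result = greedy.findall(line)
lazy_result = lazy.findall(line)

print(greedy_result)  # one capture spanning both messages
print(lazy_result)    # one capture per message
```

If every line is known to carry at most one message, the two patterns behave the same and either is fine.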

score 0 · Accepted Answer

Process it line by line:

>>> preamble, postamble = "PREAMBLE", "POSTAMBLE"
>>> test = "garbagePREAMBLEmessagePOSTcMBLEgarbage\n"
>>> test += "garbagePRdAMBLEmessagePOSTAMBLEgarbage\n"
>>> test += "garbagePREAMBLEmessagePOSTAMBLEgarbage\n"
>>> for line in test.splitlines():
...     if line.find(preamble) != -1 and line.find(postamble) != -1:
...         print(line[line.find(preamble) + len(preamble):line.find(postamble)])
...
message
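
Since the question's edit says the data will not really arrive as discrete lines, here is a rough sketch of the same idea on one continuous string, using the `start` argument of `str.find` to search from the next instance. It leans on an assumption that is not in the question: messages have a known maximum length (`max_len`, a hypothetical parameter). Without some bound like that, a corrupted postamble followed by a corrupted preamble cannot be told apart from one long message. The function name `extract_messages` is also made up for illustration.

```python
def extract_messages(data, preamble, postamble, max_len):
    """Return messages framed by an intact preamble/postamble pair.

    max_len is an ASSUMED upper bound on message length; it is what lets
    the scan reject a preamble whose own postamble was corrupted.
    """
    messages = []
    start = data.find(preamble)
    while start != -1:
        body = start + len(preamble)
        end = data.find(postamble, body)
        if end == -1:
            break  # no postamble anywhere after this point
        if end - body <= max_len:
            messages.append(data[body:end])
            # resume the search after this complete frame
            start = data.find(preamble, end + len(postamble))
        else:
            # nearest postamble is too far away: treat this frame as damaged
            # and move on to the next preamble
            start = data.find(preamble, body)
    return messages


# The three sample lines from the question, joined with no separators.
stream = (
    "garbagePREAMBLEmessagePOSTcMBLEgarbage"
    "garbagePRdAMBLEmessagePOSTAMBLEgarbage"
    "garbagePREAMBLEmessagePOSTAMBLEgarbage"
)
result = extract_messages(stream, "PREAMBLE", "POSTAMBLE", max_len=16)
print(result)  # ['message']
```

With the sample data, the first preamble is skipped because the nearest intact postamble is 45 characters away, and only the third frame is returned.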
