python - Python regex not matching at the start of a string?

Question

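I'm scanning through a binary file and extracting data with regular expressions, but I've run into a regex problem I can't track down.

Here is the code I'm having trouble with: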

        z = 0
        for char in string:
            self.response.out.write('|%s' % char.encode('hex'))
            z+=1
            if z > 20:
                self.response.out.write('<br>')
                break

        title = []
        string = re.sub('^\x72.([^\x7A]+)', lambda match: append_match(match, title), string, 1)
        print_info('Title', title)

def append_match(match, collection, replace = ''):
    collection.append(match.group(1))
    return replace

Here are the first 20 characters of the string at runtime:

|72|0a|50|79|72|65|20|54|72|6f|6c|6c|7a|19|54|72|6f|6c|6c|62|6c

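It returns nothing unless I remove the ^, in which case it returns "Troll" (without the quotes), i.e. 54726F6C6C. As I read it, it should return everything up to the \x7a.

What is going on here?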

score 2 · Accepted Answer

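The problem is that \x0A (a newline) is not matched by the dot by default. The second byte of your string is \x0a, so the dot that follows \x72 fails at position 0 and the anchored pattern cannot match at all. Try adding the dotall flag to your pattern, for example: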

re.sub('(?s)^\x72.([^\x7A]+)....
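
To make the difference concrete, here is a minimal sketch of the behavior described above. It assumes Python 3 with bytes patterns (the question's code is Python 2), rebuilds the input from the hex dump in the question, and uses a hypothetical helper named extract_title that is not part of the original code.

```python
import re

# Input rebuilt from the hex dump in the question: b"r\nPyre Trollz\x19Trollbl"
data = bytes.fromhex(
    "72 0a 50 79 72 65 20 54 72 6f 6c 6c 7a 19 54 72 6f 6c 6c 62 6c"
)

# Without DOTALL, "." refuses to match the newline at index 1,
# so the pattern anchored with "^" cannot match anywhere.
plain = re.compile(rb"^\x72.([^\x7A]+)")

# With the inline (?s) flag (same as re.DOTALL), "." also matches "\n".
dotall = re.compile(rb"(?s)^\x72.([^\x7A]+)")


def extract_title(raw):
    """Hypothetical helper: strip the leading field and return (titles, rest)."""
    titles = []

    def collect(match):
        # Remember the captured bytes and replace the match with nothing.
        titles.append(match.group(1))
        return b""

    rest = dotall.sub(collect, raw, count=1)
    return titles, rest


print(plain.search(data))             # no match
print(dotall.search(data).group(1))   # everything up to the first \x7a
print(extract_title(data))
```

Passing flags=re.DOTALL to re.sub should be equivalent to the inline (?s) form; the inline form just keeps the flag next to the pattern it affects.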
