I'm looking for a way to do string substitution with a regex on a binary file in Python 2.7.

s is a string I get from reading a binary file. It contains this sequence (in hex):

' 00 00 03 00 00 01 4A 50 20 43 52 55 4E 43 48 20 32 20 45 51 00 F7 00 F0 '

Here is the expression I use to find the string to replace:

f01 = re.findall( br'\x03\x00\x00\x01(.*?)\xF7\x00\xF0', s)

Here is my substitution:

f99 = re.sub( br'\x03\x00\x00\x01(.*?)\xF7\x00\xF0', br'\x03\x00\x00\x01\x4B\x4B\x4B\x4B\xF7\x00\xF0', s) 

I get no error, but the substitution doesn't seem to change my string. Am I missing something?

>>> f01 = re.findall( br'\x03\x00\x00\x01(.*?)\xF7\x00\xF0', s)
>>> print f01[0]
JP CRUNCH 2 EQ
>>> f99 = re.sub( br'\x03\x00\x00\x01(.*?)\xF7\x00\xF0', br'\x03\x00\x00\x01\x4B\x4B\x4B\x4B\xF7\x00\xF0', s)
>>> print f99
MThd
>>> print f99[0]
M
>>> print f01[0]
JP CRUNCH 2 EQ
>>> f01 = re.findall( br'\x03\x00\x00\x01(.*?)\xF7\x00\xF0', s)
>>> print f01[0]
JP CRUNCH 2 EQ

I would like to have my initial string changed to \x03\x00\x00\x01\x4B\x4B\x4B\x4B\xF7\x00\xF0 so I can store it to a file.

1 Answer

The r'' literal prefix makes all backslashes be interpreted literally, i.e. r'\x00' is not a single zero byte but 4 characters.
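As a quick check of the point above, here is a minimal sketch comparing a raw and a non-raw bytes literal. The variable names are made up for illustration. It also shows that the re module understands \xhh escapes in a pattern on its own, which would explain why the raw pattern in the question still finds a match.

```python
import re

raw = br'\x00'     # raw literal: backslash, 'x', '0', '0'
cooked = b'\x00'   # ordinary literal: one zero byte

print(len(raw))     # 4
print(len(cooked))  # 1

# In a *pattern*, re parses the \x00 escape itself,
# so the raw form still matches a real zero byte.
match = re.match(raw, cooked)
print(match is not None)  # True
```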

To avoid having arbitrary bytes interpreted as regex metacharacters, you can use the re.escape function.
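A small sketch of why escaping matters, assuming a hypothetical two-byte marker that happens to be `..` (0x2E 0x2E). The marker and data are invented for illustration only.

```python
import re

marker = b'\x2e\x2e'   # the two bytes '..', both are regex metacharacters
data = b'ab..cd'

# Unescaped, each '.' matches any byte, so every byte pair matches.
unescaped = re.findall(marker, data)

# Escaped, only the literal two-dot sequence matches.
escaped = re.findall(re.escape(marker), data)

print(unescaped)
print(escaped)
```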

To avoid repeating the prefix and suffix in the replacement string, you can use regex lookahead and lookbehind:

>>> s
'\x00\x00\x03\x00\x00\x01JP CRUNCH 2 EQ\x00\xf7\x00\xf0'
>>> pre = b'\x03\x00\x00\x01'
>>> suff = b'\xf7\x00\xf0'
>>> re.sub(br'(?<=%s).*?(?=%s)' % tuple(map(re.escape, [pre, suff])), b'\x4b'*4, s)
'\x00\x00\x03\x00\x00\x01KKKK\xf7\x00\xf0'

You might need the re.DOTALL regex flag to force . to match newlines.
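The effect of re.DOTALL can be sketched as follows, using a payload that contains a newline byte (0x0A). The helper `replace_between` and the sample data are hypothetical, not part of the original answer. The replacement is passed as a function so that it is inserted verbatim, without any backslash processing.

```python
import re

def replace_between(data, pre, suff, new, flags=0):
    """Hypothetical helper: replace the bytes between pre and suff with new."""
    pattern = b'(?<=' + re.escape(pre) + b').*?(?=' + re.escape(suff) + b')'
    # A callable replacement is used as-is (no template escapes are parsed).
    return re.sub(pattern, lambda m: new, data, flags=flags)

pre = b'\x03\x00\x00\x01'
suff = b'\xf7\x00\xf0'
s = b'\x00\x00' + pre + b'JP\nCRUNCH\x00' + suff   # payload contains a newline

without_flag = replace_between(s, pre, suff, b'KKKK')
with_flag = replace_between(s, pre, suff, b'KKKK', flags=re.DOTALL)

print(without_flag == s)   # True: '.' stops at the newline, nothing is replaced
print(repr(with_flag))
```

The result can then be written back with a file opened in binary mode ('wb').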

answered 2012-07-18T10:40:39.957