Python keeps returning a string with a broken character.

Python

test = re.sub('handle(.*?)', '<verse osisID="lol">\1</verse>', 'handle a bunch of random text here.')
print test

What I want:

<verse osisID="lol">a bunch of random text here.</verse>

What I get:

<verse osisID="lol">*broken character*</verse>a bunch of random text here.

2 Answers

You should either escape the \ character or use an r'' raw string:

>>> re.sub('handle(.*?)', r'<verse osisID="lol">\1</verse>', 'handle a bunch of random text here.')
'<verse osisID="lol"></verse> a bunch of random text here.'

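Without the r'' raw string literal, Python interprets the backslash as the start of an escape code, so '\1' becomes the single control character '\x01' before re.sub ever sees it. You can also double the backslash: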

>>> '\1'
'\x01'
>>> '\\1'
'\\1'
>>> r'\1'
'\\1'
>>> print r'\1'
\1
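
The same comparison can be sketched in Python 3 syntax, using the pattern and input from the question. The variable names text, broken and fixed are only illustrative:

```python
import re

text = 'handle a bunch of random text here.'

# In a normal string literal, '\1' is the single control character U+0001.
assert '\1' == '\x01' and len('\1') == 1

# Both of these spell the two-character sequence: backslash followed by '1'.
assert r'\1' == '\\1' and len(r'\1') == 2

# Without the raw string, the control character ends up in the output.
broken = re.sub('handle(.*?)', '<verse osisID="lol">\1</verse>', text)

# With the raw string, re.sub receives a real backreference to group 1.
fixed = re.sub('handle(.*?)', r'<verse osisID="lol">\1</verse>', text)

print(repr(broken))
print(repr(fixed))
```

Printing with repr makes the stray \x01 visible instead of showing up as a broken glyph in the terminal.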

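Note that you are only replacing the word handle there: the non-greedy .*? pattern matches as few characters as possible, which here is zero. Remove the question mark and the result matches your expected output, apart from the leading space that the group also captures: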

>>> re.sub('handle(.*)', r'<verse osisID="lol">\1</verse>', 'handle a bunch of random text here.')
'<verse osisID="lol"> a bunch of random text here.</verse>'
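
The difference between the lazy and the greedy group can be checked directly by inspecting what each one captures. The sketch below also shows one possible way to drop the leading space, on the assumption that the whitespace after handle should stay outside the group; the \s* is an addition for illustration, not part of the original pattern:

```python
import re

text = 'handle a bunch of random text here.'

# Lazy group: matches as little as possible, so it captures nothing.
lazy = re.search(r'handle(.*?)', text).group(1)

# Greedy group: captures everything up to the end of the line,
# including the space that follows 'handle'.
greedy = re.search(r'handle(.*)', text).group(1)

print(repr(lazy))
print(repr(greedy))

# Assumption: consume the separating whitespace outside the group
# so that it does not appear inside the <verse> element.
result = re.sub(r'handle\s*(.*)', r'<verse osisID="lol">\1</verse>', text)
print(result)
```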
Answered 2012-08-09T19:45:40.163

The following code was tested under Python 3.6:

import re

test = 'a bunch of random text here.'
resp = re.sub(r'(.*)', r'<verse osisID="lol">\1</verse>', test)
print(resp)

<verse osisID="lol">a bunch of random text here.</verse>
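
One caveat when running this on newer interpreters: since Python 3.7, re.sub also replaces an empty match that is adjacent to a previous non-empty match, so (.*) can match a second time at the end of the string and append an empty <verse> element. A minimal sketch of two possible workarounds follows; the names template, wrapped_plus and wrapped_once are illustrative only:

```python
import re

test = 'a bunch of random text here.'
template = r'<verse osisID="lol">\1</verse>'

# Option 1: require at least one character, so no empty match is possible.
wrapped_plus = re.sub(r'(.+)', template, test)

# Option 2: keep (.*) but allow only a single replacement.
wrapped_once = re.sub(r'(.*)', template, test, count=1)

print(wrapped_plus)
print(wrapped_once)
```

Both variants should produce a single <verse> element regardless of the Python 3 version in use.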
Answered 2017-06-02T02:31:08.987