OK regex masters, I have a long text and I'm trying to add quotation marks to sentences that contain "he said" and similar variants.
For example:
s = 'This should have no quotes. This one should he said. But this one should not. Neither should this. But this one should she said.'
should result in:
This should have no quotes. "This one should," he said. But this one should not. Neither should this. "But this one should," she said.
So far I can get very close, but not quite right:
>>> import re
>>> m = re.sub(r'\.\W(.*?) (he|she|it) said.', r'. "\1," \2 said.', s)
The result is:
>>> print m
This should have no quotes. "This one should," he said. But this one should not. "Neither should this. But this one should," she said.
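As you can see, it puts the quotes around the first instance correctly, but in the second instance the opening quote is placed too early, one sentence before it should be. Any help is appreciated!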
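
One possible direction, offered as a sketch rather than a confirmed fix: the lazy `.*?` only limits how far a match extends, not where it starts, so the regex engine still begins at the earliest period from which a match can succeed and then runs across the following sentence boundaries. Replacing `.` with `[^.]` keeps each match inside a single sentence. The function name `add_quotes` is hypothetical, and the sketch assumes that sentences end only with a period and contain no other periods (no abbreviations or decimals).

```python
import re

# Assumption: every sentence ends with "." and contains no other periods.
# [^.\s] anchors the match on the first non-space character of a sentence;
# [^.]*? then extends lazily but can never cross a period.
PATTERN = re.compile(r'([^.\s][^.]*?) (he|she|it) said\.')


def add_quotes(text):
    """Wrap the spoken part of '... he/she/it said.' sentences in quotes."""
    return PATTERN.sub(r'"\1," \2 said.', text)


s = ('This should have no quotes. This one should he said. '
     'But this one should not. Neither should this. '
     'But this one should she said.')

print(add_quotes(s))
```

Because the pattern does not consume the period of the preceding sentence, two consecutive "he said" sentences should each be quoted separately.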