I have a file structured like this:
A: some text
B: more text
even more text
on several lines
A: and we start again
B: more text
more
multiline text
I am trying to find a regular expression that splits my file like this:
>>> re.findall(regex, f.read())
[('some text','more text','even more text\non several lines'),
('and we start again','more text', 'more\nmultiline text')]
So far, I have tried the following:
>>> re.findall('A:(.*?)\nB:(.*?)\n(.*?)', f.read(), re.DOTALL)
[(' some text', ' more text', ''), (' and we start again', ' more text', '')]
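The multiline text is not captured. I assume this is because the lazy quantifier in the last group is so lazy that it is satisfied by an empty match, but when I remove it, the regex becomes far too greedy:
>>> re.findall('A:(.*?)\nB:(.*?)\n(.*)', f.read(), re.DOTALL)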
[(' some text',
' more text',
'even more text\non several lines\nA: and we start again\nB: more text\nmore\nmultiline text')]
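One direction that might be worth trying is to keep the last group lazy but give it something to stop at, for example a lookahead for the next "A:" line or the end of the input. The sketch below is only an illustration under assumptions that go beyond what I have verified on the real file: the sample text is inlined in a variable named text instead of being read with f.read(), every record is assumed to start with "A:" at the beginning of a line, and the optional space after each label is dropped so the groups come out without a leading space.

```python
import re

# Sample input inlined for illustration; the real data would come from f.read().
text = (
    "A: some text\n"
    "B: more text\n"
    "even more text\n"
    "on several lines\n"
    "A: and we start again\n"
    "B: more text\n"
    "more\n"
    "multiline text\n"
)

# The third group stays lazy, but the lookahead forces it to keep expanding
# until the next record ("\nA:") or the end of the input (ignoring trailing
# whitespace) is reached. The lookahead itself consumes nothing.
pattern = re.compile(r"A: ?(.*?)\nB: ?(.*?)\n(.*?)(?=\nA:|\s*\Z)", re.DOTALL)

records = pattern.findall(text)
for record in records:
    print(record)
```

Because the lookahead does not consume the "\nA:" that ends a record, the next search can still start at that "A:". A line inside the free text that itself begins with "A:" would be treated as the start of a new record, so this only fits if that never happens in the data.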
Does anyone have an idea? Thanks!