I need to get the text that is not enclosed in angle brackets.
My input looks like this:
> whatever something<X="Y" zzz="abc">this is a foo bar <this is a
> < whatever>and i ><only want this
The desired output is:
> whatever something
> this is a foo bar <this is a
>
> and i ><only want this
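I tried detecting what is inside the brackets first and then removing it. But it seems I am matching what is inside the <> rather than the whole <...> tag with its attributes. How can I get the desired output?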
import re
x = """whatever something<X="Y" zzz="abc">this is a foo bar <this is a\n< whatever>and i ><only want this"""
re.findall("<([^>]*)>", x.strip())
['X="Y" zzz="abc"', 'this is a\n ', ' whatever']
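One possible direction, as a minimal sketch rather than a definitive answer: if a "tag" is assumed to be a `<...>` pair that contains no other `<` or `>` inside it, then excluding both characters in the character class stops the match from starting at a stray `<`. The name `TAG` and the use of `re.split` are illustrative choices, not part of the attempt above.

```python
import re

x = """whatever something<X="Y" zzz="abc">this is a foo bar <this is a\n< whatever>and i ><only want this"""

# Assumption: a tag is "<...>" with no "<" or ">" between the delimiters.
# Excluding "<" as well as ">" means the unclosed "<this is a" cannot start a match.
TAG = re.compile(r"<[^<>]*>")

# Split on the tags and keep only the text between them.
parts = TAG.split(x.strip())
print(parts)
# ['whatever something', 'this is a foo bar <this is a\n', 'and i ><only want this']

# Joining with newlines gives one line per text piece.
print("\n".join(parts))
```

With this pattern the stray `<` in `<this is a` and the `><` in the last piece are left alone, because neither forms a `<...>` pair free of nested brackets. Input with genuinely nested or malformed markup would likely need a real parser instead of a regex.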