鉴于我如何提取以 <> 为界并以逗号分隔的引号中的元素列表中的解决方案 - python、正则表达式?,我能够捕获由尖括号内的 a 和值表示的所需模式的theprefix
和 thevalues
CAPITALIZED.PREFIX
< "value1" , "value2", ... >
"""calf_n1 := n_-_c_le & n_-_pn_le &\n [ ORTH.FOO < "cali.ber,kl", 'calf' , "done" >,\nLKEYS.KEYREL.PRED "_calf_n_1_rel",\n ORHT2BAR <"what so ever >", "this that mess < up"> ,\n LKEYS.KEYREL.CARG "<20>",\nLOOSE.SCREW ">20 but <30"\n JOKE <'whatthe ', "什么" >,\n THIS + ]."""
但是我遇到了问题,因为我有上面的字符串。所需的输出将是:
('ORTH.FOO', ['cali.ber,kl','calf','done'])
('ORHT2BAR', ['what so ever >', 'this that mess < up'])
('JOKE', ['whathe ', 'what'])
我尝试了以下方法,但它只给了我第一个元组,我如何获得所需输出中的所有可能的元组?:
import re
intext = """calf_n1 := n_-_c_le & n_-_pn_le &\n [ ORTH.FOO < "cali.ber,kl", 'calf' , "done" >,\nLKEYS.KEYREL.PRED "_calf_n_1_rel",\n ORHT2BAR <"what so ever >", "this that mess < up">\n LKEYS.KEYREL.CARG "<20>",\nLOOSE.SCREW ">20 but <30" ]."""
pattern = re.compile(r'.*?([A-Z0-9\.]*) < ([^>]*) >.*', flags=re.DOTALL)
f, v = pattern.match(intext).groups()
names = re.findall('[\'"](.*?)["\']', v)
print f, names