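I'm fairly new to Python and I'm trying to build a regular expression that matches the text between specific words. My REST call returns a long string in the following format: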

ip=1.0.8.0 statistic=rtt.std_dev predictions=iad-mci:114.717204,ord-cgnt:30.107700,nyc-inap:32.537077,iad-cgnt:0.000000,hkg-pccw:98.157281,ord-tata:6.058292,sjc-l3:57.089664,nyc-cgnt:36.489616,pvg-cu2:1039.978803,bgl-rel:115.671650,nyc-bgp:94.454690,pvg-cu1:377.429628,las-level3:0.000000,nyc-tgl:119.197070,atl-inap:42.021698

ip=1.0.8.0 statistic=rtt.match_length predictions=iad-mci:13.000000,ord-cgnt:16.000000,nyc-inap:16.000000,iad-cgnt:20.000000,hkg-pccw:16.000000,ord-tata:16.000000,sjc-l3:16.000000,nyc-cgnt:16.000000,pvg-cu2:16.000000,bgl-rel:13.000000,nyc-bgp:16.000000,pvg-cu1:16.000000,las-level3:20.000000,nyc-tgl:16.000000,atl-inap:16.000000

ip=1.0.8.0 statistic=rtt.mean predictions=iad-mci:348.247084,ord-cgnt:319.301775,nyc-inap:328.353336,iad-cgnt:248.600000,hkg-pccw:452.789753,ord-tata:313.643350,sjc-l3:321.487964,nyc-cgnt:315.238098,pvg-cu2:312.502609,bgl-rel:352.945035,nyc-bgp:382.419130,pvg-cu1:332.139637,las-level3:177.400000,nyc-tgl:392.333887,atl-inap:325.668400

ip=1.0.8.0 statistic=rtt.age predictions=iad-mci:3066.160981,ord-cgnt:3366.161424,nyc-inap:4266.160056,iad-cgnt:49566.161227,hkg-pccw:5166.165995,ord-tata:3066.158230,sjc-l3:5466.160068,nyc-cgnt:3366.161192,pvg-cu2:5166.160410,bgl-rel:1566.160768,nyc-bgp:3666.159675,pvg-cu1:2766.160713,las-level3:251466.160789,nyc-tgl:3966.159966,atl-inap:4866.167164

I only need to extract all the data on the ip=1.0.8.0 statistic=rtt.mean predictions line. How should I write the regular expression? Should I use re.findall or re.match?

2 Answers

>>> text = '''ip=1.0.8.0 statistic=rtt.std_dev predictions=iad-mci:114.717204,ord-cgnt:30.107700,nyc-inap:32.537077,iad-cgnt:0.000000,hkg-pccw:98.157281,ord-tata:6.058292,sjc-l3:57.089664,nyc-cgnt:36.489616,pvg-cu2:1039.978803,bgl-rel:115.671650,nyc-bgp:94.454690,pvg-cu1:377.429628,las-level3:0.000000,nyc-tgl:119.197070,atl-inap:42.021698

ip=1.0.8.0 statistic=rtt.match_length predictions=iad-mci:13.000000,ord-cgnt:16.000000,nyc-inap:16.000000,iad-cgnt:20.000000,hkg-pccw:16.000000,ord-tata:16.000000,sjc-l3:16.000000,nyc-cgnt:16.000000,pvg-cu2:16.000000,bgl-rel:13.000000,nyc-bgp:16.000000,pvg-cu1:16.000000,las-level3:20.000000,nyc-tgl:16.000000,atl-inap:16.000000

ip=1.0.8.0 statistic=rtt.mean predictions=iad-mci:348.247084,ord-cgnt:319.301775,nyc-inap:328.353336,iad-cgnt:248.600000,hkg-pccw:452.789753,ord-tata:313.643350,sjc-l3:321.487964,nyc-cgnt:315.238098,pvg-cu2:312.502609,bgl-rel:352.945035,nyc-bgp:382.419130,pvg-cu1:332.139637,las-level3:177.400000,nyc-tgl:392.333887,atl-inap:325.668400

ip=1.0.8.0 statistic=rtt.age predictions=iad-mci:3066.160981,ord-cgnt:3366.161424,nyc-inap:4266.160056,iad-cgnt:49566.161227,hkg-pccw:5166.165995,ord-tata:3066.158230,sjc-l3:5466.160068,nyc-cgnt:3366.161192,pvg-cu2:5166.160410,bgl-rel:1566.160768,nyc-bgp:3666.159675,pvg-cu1:2766.160713,las-level3:251466.160789,nyc-tgl:3966.159966,atl-inap:4866.167164'''
>>> import re
>>> # Escape the dots so they match a literal '.' instead of any character.
>>> re.findall(r'ip=1\.0\.8\.0 statistic=rtt\.mean predictions.*', text)
['ip=1.0.8.0 statistic=rtt.mean predictions=iad-mci:348.247084,ord-cgnt:319.301775,nyc-inap:328.353336,iad-cgnt:248.600000,hkg-pccw:452.789753,ord-tata:313.643350,sjc-l3:321.487964,nyc-cgnt:315.238098,pvg-cu2:312.502609,bgl-rel:352.945035,nyc-bgp:382.419130,pvg-cu1:332.139637,las-level3:177.400000,nyc-tgl:392.333887,atl-inap:325.668400']
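
On the findall-versus-match part of the question: re.match only matches at the very start of the string, so it would not find a line in the middle of the response, while re.findall and re.search scan the whole text. A minimal sketch of the re.search variant follows. The helper name find_predictions and the shortened sample text are assumptions made for illustration, not part of the original response; re.escape is used so that the dots in the IP address and statistic name are matched literally.

```python
import re

# Shortened stand-in for the REST response (values copied from the question).
text = (
    "ip=1.0.8.0 statistic=rtt.std_dev predictions=iad-mci:114.717204,ord-cgnt:30.107700\n"
    "ip=1.0.8.0 statistic=rtt.mean predictions=iad-mci:348.247084,ord-cgnt:319.301775\n"
    "ip=1.0.8.0 statistic=rtt.age predictions=iad-mci:3066.160981,ord-cgnt:3366.161424\n"
)


def find_predictions(text, ip, statistic):
    """Return the raw predictions string for one ip/statistic pair, or None.

    Hypothetical helper: the name and signature are illustrative only.
    """
    # re.escape turns '.' into a literal dot; re.MULTILINE lets ^ and $
    # anchor at the start and end of each line rather than the whole string.
    pattern = r"^ip={} statistic={} predictions=(.*)$".format(
        re.escape(ip), re.escape(statistic)
    )
    match = re.search(pattern, text, flags=re.MULTILINE)
    return match.group(1) if match else None


mean_predictions = find_predictions(text, "1.0.8.0", "rtt.mean")
print(mean_predictions)
```

The capturing group returns only the part after predictions=, which is usually what you want to process further. If several lines can match, re.findall with the same pattern and flags returns all of the captured groups as a list.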
answered 2013-06-10T20:04:42.153

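Try to avoid regular expressions when you don't need them. Plain string methods are not only faster, they also turn out to be more robust most of the time.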

In [10]: text = '''ip=1.0.8.0 statistic=rtt.std_dev predictions=iad-mci:114.717204,ord-cgnt:30.107700,nyc-inap:32.537077,iad-cgnt:0.000000,hkg-pccw:98.157281,ord-tata:6.058292,sjc-l3:57.089664,nyc-cgnt:36.489616,pvg-cu2:1039.978803,bgl-rel:115.671650,nyc-bgp:94.454690,pvg-cu1:377.429628,las-level3:0.000000,nyc-tgl:119.197070,atl-inap:42.021698
ip=1.0.8.0 statistic=rtt.match_length predictions=iad-mci:13.000000,ord-cgnt:16.000000,nyc-inap:16.000000,iad-cgnt:20.000000,hkg-pccw:16.000000,ord-tata:16.000000,sjc-l3:16.000000,nyc-cgnt:16.000000,pvg-cu2:16.000000,bgl-rel:13.000000,nyc-bgp:16.000000,pvg-cu1:16.000000,las-level3:20.000000,nyc-tgl:16.000000,atl-inap:16.000000
ip=1.0.8.0 statistic=rtt.mean predictions=iad-mci:348.247084,ord-cgnt:319.301775,nyc-inap:328.353336,iad-cgnt:248.600000,hkg-pccw:452.789753,ord-tata:313.643350,sjc-l3:321.487964,nyc-cgnt:315.238098,pvg-cu2:312.502609,bgl-rel:352.945035,nyc-bgp:382.419130,pvg-cu1:332.139637,las-level3:177.400000,nyc-tgl:392.333887,atl-inap:325.668400
ip=1.0.8.0 statistic=rtt.age predictions=iad-mci:3066.160981,ord-cgnt:3366.161424,nyc-inap:4266.160056,iad-cgnt:49566.161227,hkg-pccw:5166.165995,ord-tata:3066.158230,sjc-l3:5466.160068,nyc-cgnt:3366.161192,pvg-cu2:5166.160410,bgl-rel:1566.160768,nyc-bgp:3666.159675,pvg-cu1:2766.160713,las-level3:251466.160789,nyc-tgl:3966.159966,atl-inap:4866.167164'''

In [13]: ip = '1.0.8.0'

In [14]: result = filter(lambda s: s.startswith('ip={0} statistic=rtt.mean predictions'.format(ip)), text.split('\n'))

In [15]: list(result)
Out[15]: ['ip=1.0.8.0 statistic=rtt.mean predictions=iad-mci:348.247084,ord-cgnt:319.301775,nyc-inap:328.353336,iad-cgnt:248.600000,hkg-pccw:452.789753,ord-tata:313.643350,sjc-l3:321.487964,nyc-cgnt:315.238098,pvg-cu2:312.502609,bgl-rel:352.945035,nyc-bgp:382.419130,pvg-cu1:332.139637,las-level3:177.400000,nyc-tgl:392.333887,atl-inap:325.668400']
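
Once the matching line is selected, the same regex-free approach can break the predictions into individual values. This is only a sketch under stated assumptions: the helper name parse_predictions is hypothetical, the sample line is shortened from the question, and it assumes every entry has the form name:number.

```python
def parse_predictions(line):
    """Turn one 'ip=... statistic=... predictions=a:1,b:2' line into a dict.

    Hypothetical helper: assumes each entry looks like 'name:number'.
    """
    # Everything after 'predictions=' is the comma-separated payload.
    _, _, payload = line.partition("predictions=")
    result = {}
    for item in payload.split(","):
        if not item:
            # Skip empty entries, e.g. when the payload itself is empty.
            continue
        name, _, value = item.partition(":")
        result[name] = float(value)
    return result


line = (
    "ip=1.0.8.0 statistic=rtt.mean "
    "predictions=iad-mci:348.247084,ord-cgnt:319.301775,nyc-inap:328.353336"
)
predictions = parse_predictions(line)
print(predictions["ord-cgnt"])
```

str.partition is used instead of split because it always returns three parts, so a line without predictions= yields an empty dict rather than raising an error.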
answered 2013-06-10T20:12:12.233