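I have some real-time data packets stored in a file, log.log, as shown below. I am reading log.log with tail -f and parsing it. None of the lines have fixed values: the IPs and the other values vary from line to line. The data is laid out in data:: blocks, and each data:: introduces one column value. For example, log.log contains: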
Other types of lines...
[TCP]: incomeing data: 91 bytes, data=connect data::10.109.0.200data::10.109.0.86data::wandata::p4data::1400data::800data::end
[TCP]: incomeing data: 91 bytes, data=connect data::10.109.0.201data::10.109.8.86data::landata::p4data::1400data::700data::end
[TCP]: incomeing data: 91 bytes, data=connect data::10.109.0.200data::10.109.58.86data::3gdata::p4data::400data::800data::end
something.. else...
Now, how can I parse these lines? Everything else can be ignored; a line should be parsed only when it matches:
connect data::ANYdata::ANYdata::ANYdata::ANYdata::ANYdata::ANYdata::end
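To make the target shape concrete, here is a minimal sketch of how it could be written as a regular expression, with each ANY as a non-greedy capture group. It assumes that no field contains the text data:: itself; the names CONNECT_RE and parse_connect are hypothetical and only serve the illustration.

```python
import re

# Hypothetical pattern: six non-greedy fields between "connect data::" and "data::end".
CONNECT_RE = re.compile(
    r'connect data::(.*?)data::(.*?)data::(.*?)data::(.*?)data::(.*?)data::(.*?)data::end'
)

def parse_connect(line):
    """Return the six fields as a tuple, or None if the line does not match."""
    match = CONNECT_RE.search(line)
    return match.groups() if match else None

# Sample line copied from the log excerpt above.
sample = ("[TCP]: incomeing data: 91 bytes, data=connect "
          "data::10.109.0.200data::10.109.0.86data::wandata::p4"
          "data::1400data::800data::end")
fields = parse_connect(sample)
```

Because search() is used rather than match(), the [TCP]: prefix in front of "connect" does not have to be described by the pattern.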
Run:
$ tail -f /var/tmp/log.log | python -u /var/tmp/myparse.py
myparse.py:
import sys, time, os, subprocess
import re

def p(command):
    subprocess.Popen(command, shell=False, stdin=subprocess.PIPE, stdout=subprocess.PIPE)

while True:
    line = sys.stdin.readline()
    if line:
        if "command:start" in line:
            print "OK - working"
            p("/var/tmp/single_thread_process.sh")
        if "connect data::" in line:
            ..
        else:
            # ^(?:\+|00)(\d+)$ Parse the 0032, 32, +32
            #match = re.search(r'^(?:\+|00)(\d+)$', line)
            #if match:
            #    print "OK"
            ### NOT working ###
            match = re.search(r'^connect data::*data::*data::*data::*data::*data::*data::end$', line)
            if match:
                print "OK"
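A likely reason the pattern marked "NOT working" never matches, shown as a small sketch rather than a definitive fix: in a regular expression, ^ anchors at the start of the line (which begins with [TCP]:, not with connect), and :* means "zero or more colons", not "any text". The variable names broken and relaxed are hypothetical, and the relaxed pattern assumes exactly six fields as in the sample lines.

```python
import re

# Sample line from the log excerpt, with the trailing newline readline() would return.
line = ("[TCP]: incomeing data: 91 bytes, data=connect "
        "data::10.109.0.200data::10.109.0.86data::wandata::p4"
        "data::1400data::800data::end\n")

# The original pattern: '^' demands that the line start with "connect",
# and each ':*' only allows extra colons, so no field text can be consumed.
broken = re.search(
    r'^connect data::*data::*data::*data::*data::*data::*data::end$', line)

# Relaxed sketch: no start anchor, and '.*?' stands in for each field.
# The group repeats six times: field text followed by the next "data::".
relaxed = re.search(r'connect data::(?:.*?data::){6}end$', line)

print("original pattern matched: %s" % (broken is not None))
print("relaxed pattern matched: %s" % (relaxed is not None))
```

The trailing $ still works here because, without re.MULTILINE, it also matches just before a newline at the end of the string.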