-2

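I want to extract the second number from each line of a text document, but only if that line contains "Command". I want the command and the rest of the line printed next to that number. There are hundreds of lines.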

The lines look like this:

1376328501.285|1166703600|0|SimControl|4|Command 72FB0007: AC28200 - "Thrst History Reset" to DCDR 0 time=62

If programmed the way I need, this line should come out as

1166703600 Command 72FB0007: AC28200 - "Thrst History Reset" to DCDR 0 time=62

How would I go about doing this?

4 Answers

2

Use the csv module to treat the data as CSV data (albeit delimited by pipes):

import csv

with open('inputfile', 'rb') as inputfile:
    reader = csv.reader(inputfile, delimiter='|')
    for row in reader:
        if len(row) > 5 and row[5].lower().startswith('command'):
            print row[1], row[5]

csv.reader() gives you an iterator that yields a list for each line; your sample line would result in:

['1376328501.285', '1166703600', '0', 'SimControl', '4', 'Command 72FB0007: AC28200 - "Thrst History Reset" to DCDR 0 time=62']

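Indexes start at 0, so the column with the Command text is row[5]; the number in the second column is in row[1]. The code above tests whether the current row has enough columns, and whether row[5], once lowercased, starts with the word command.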

The above assumes Python 2; for Python 3 it looks slightly different:

import csv

with open('inputfile', newline='') as inputfile:
    reader = csv.reader(inputfile, delimiter='|')
    for row in reader:
        if len(row) > 5 and row[5].lower().startswith('command'):
            print(row[1], row[5])
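
To try this approach on the sample line from the question without a file on disk, here is a minimal Python 3 sketch that feeds the text to csv.reader through io.StringIO. The helper name extract_commands and the second sample row are made up for illustration. Passing quoting=csv.QUOTE_NONE is an assumption on my part: it makes the reader treat the double quotes in the log text as ordinary characters.

```python
import csv
import io

# Sample data: the line from the question plus a made-up non-command line.
SAMPLE = (
    '1376328501.285|1166703600|0|SimControl|4|'
    'Command 72FB0007: AC28200 - "Thrst History Reset" to DCDR 0 time=62\n'
    '13587918501.1|13|0|XCF|6|some other message\n'
)

def extract_commands(lines):
    """Return 'second-field command-text' strings for command rows.

    `lines` is any iterable of strings, e.g. an open file or io.StringIO.
    """
    # Assumption: quotes carry no CSV meaning in these logs, so disable quote handling.
    reader = csv.reader(lines, delimiter='|', quoting=csv.QUOTE_NONE)
    results = []
    for row in reader:
        # Keep only rows with at least six fields whose sixth field starts with 'command'.
        if len(row) > 5 and row[5].lower().startswith('command'):
            results.append(row[1] + ' ' + row[5])
    return results

for entry in extract_commands(io.StringIO(SAMPLE)):
    print(entry)
```

For a real file, the same helper could be called with the file object opened as in the Python 3 version above.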
answered 2013-10-30T17:48:20.200
0
import re

s = '''
1376328501.285|1166703600|0|SimControl|4|Command aaaaa
12347801.2|11660|0|Sim|5|Command bbb
13587918501.1|13|0|XCF|6|cccccc
101.285|285|0|pof|7|ddddd
137501|-2.87|457|run|8|Command eeee
'''
print s

regx = re.compile(r'^[^|]+\|([^|]+).+?(Command.+\n?)',
                  re.MULTILINE)

print ''.join('%s %s' % m.groups() for m in regx.finditer(s))

Result

1376328501.285|1166703600|0|SimControl|4|Command aaaaa
12347801.2|11660|0|Sim|5|Command bbb
13587918501.1|13|0|XCF|6|cccccc
101.285|285|0|pof|7|ddddd
137501|-2.87|457|run|8|Command eeee

1166703600 Command aaaaa
11660 Command bbb
-2.87 Command eeee
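
Since the pattern above comes without commentary, here is a hedged Python 3 sketch of a similar regex written with re.VERBOSE so each part can be annotated. It is not identical to the original: the character classes also exclude newlines so that a field cannot run over into the next line, and the trailing newline is left out of the second group. The names PATTERN and extract are made up for illustration.

```python
import re

PATTERN = re.compile(r"""
    ^[^|\n]+       # first field: everything up to the first pipe
    \|             # the pipe between field 1 and field 2
    ([^|\n]+)      # group 1: the second field (the number we want)
    .+?            # lazily skip the fields in between
    (Command.*)    # group 2: 'Command' and the rest of the line
    $              # end of the line (MULTILINE)
""", re.MULTILINE | re.VERBOSE)

def extract(text):
    """Return 'number command-text' strings for every line containing 'Command'."""
    return ['%s %s' % m.groups() for m in PATTERN.finditer(text)]

s = '''
1376328501.285|1166703600|0|SimControl|4|Command aaaaa
12347801.2|11660|0|Sim|5|Command bbb
13587918501.1|13|0|XCF|6|cccccc
101.285|285|0|pof|7|ddddd
137501|-2.87|457|run|8|Command eeee
'''

print('\n'.join(extract(s)))
```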
answered 2013-10-30T18:39:20.307
0
lines = '1376328501.285|1166703600|0|SimControl|4|Command 72FB0007: AC28200 - "Thrst History Reset" to DCDR 0 time=62'

if 'Command' in lines:
    lines_lst = lines.split('|')
    what_you_want = lines_lst[1] + ' '+ lines_lst[-1]

print what_you_want
>>> 1166703600 Command 72FB0007: AC28200 - "Thrst History Reset" to DCDR 0 time=62

So, if you have a file containing thousands of lines, it goes like this:

f = open(YOUR_FILE, 'r')
data = f.readlines()
f.close()

foo = []
for lines in data:
    if 'Command' in lines:
        lines_lst = lines.split('|')
        what_you_want = lines_lst[1] + ' '+ lines_lst[-1]
        foo.append(what_you_want)
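
A variation on the same split-based idea, sketched for Python 3 under the assumption that every log line has the six-field layout shown in the question. It limits the split to five pipes so that a pipe inside the command text is kept, and it strips the trailing newline that lines_lst[-1] would otherwise carry. The generator name command_lines is hypothetical.

```python
def command_lines(lines):
    """Yield 'second-field command-text' for lines whose sixth field starts with 'Command'."""
    for line in lines:
        # Split on at most five pipes, so the sixth field keeps any pipes it contains.
        fields = line.rstrip('\r\n').split('|', 5)
        if len(fields) == 6 and fields[5].startswith('Command'):
            yield fields[1] + ' ' + fields[5]

# Made-up sample lines; a real run could pass an open file object instead.
sample = [
    '1376328501.285|1166703600|0|SimControl|4|Command foo | bar\n',
    '13587918501.1|13|0|XCF|6|other message\n',
    'a line without enough fields\n',
]

for entry in command_lines(sample):
    print(entry)
```

Because it is a generator, it can be fed a file object directly and will process the file one line at a time instead of reading it all with readlines().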
answered 2013-10-30T17:55:00.203
0
>>> l = """1376328501.285|1166703600|0|SimControl|4|Command 72FB0007: AC28200 - "Thrst History Reset" to DCDR 0 time=62"""
>>> l = [l,l,l]

>>> [ele.split("|")[1] for ele in l if "command" in ele.lower()]
['1166703600', '1166703600', '1166703600']
answered 2013-10-30T17:52:08.580