I am trying to convert a bash command into subprocess calls in Python. The bash command is:

cat LogFile.msg.log | grep ABCD | awk '{print $14,$10,$5,$7}' | sort -t' ' -k4 -n -r | head -10 > output.csv

So far I have the following subprocess code:

cat = subprocess.Popen(['cat', 'LogFile.msg.log'],
                        stdout=subprocess.PIPE,
                        )
grep = subprocess.Popen(['grep', 'ABCD'],
                        stdin=cat.stdout,
                        stdout=subprocess.PIPE,
                        )
awk = subprocess.Popen(['awk', '{print $14,$10,$5,$7}'],
                        stdin=grep.stdout,
                        stdout=subprocess.PIPE,
                        )
sort = subprocess.Popen(['sort', '-t','' '', '-k4', '-n', '-r'],
                        stdin=awk.stdout,
                        stdout=subprocess.PIPE,
                        )
head = subprocess.Popen(['head', '-10'],
                        stdin=sort.stdout,
                        stdout=subprocess.PIPE,
                        )
out = subprocess.Popen(['>', 'output.csv'],
                        stdin=head.stdout,
                        stdout=subprocess.PIPE,
                        )
end_of_pipe = out.stdout

Now I get the following error:

sort: empty tab
Traceback (most recent call last):
  File "./latency2", line 39, in <module>
    stdout=subprocess.PIPE,
  File "/usr/lib64/python2.6/subprocess.py", line 639, in __init__
    errread, errwrite)
  File "/usr/lib64/python2.6/subprocess.py", line 1228, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory

I am sure I am missing something, but I am not sure what.

2 Answers

You have two problems. The first is that you did not translate the arguments to sort correctly. When you run this sort command:

sort -t' ' -k4 -n -r

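the shell joins the tokens -t and ' ' into the single argument "-t " (dash, t, space). Your list instead passes '-t' followed by '' '', which Python concatenates into an empty string, so sort reports an empty tab. The correct subprocess arguments are therefore: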

sort = subprocess.Popen(['sort', '-t ', '-k4', '-n', '-r'],
                        stdin=awk.stdout,
                        stdout=subprocess.PIPE,
                        )
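
To check how a POSIX shell would split a command line, the standard library's shlex.split can be used as a quick sanity check. The sketch below is only an illustration; the variable names are made up for this example. It also shows that the two adjacent string literals from the question collapse into one empty string.

```python
import shlex

# The sort stage exactly as it is written in the bash pipeline.
command = "sort -t' ' -k4 -n -r"

# shlex.split follows POSIX shell quoting rules, so -t and ' ' are
# joined into a single argument, just as the shell would do.
args = shlex.split(command)
print(args)  # ['sort', '-t ', '-k4', '-n', '-r']

# In the question's list, '' '' is two adjacent string literals, which
# Python concatenates into a single empty string.
question_arg = '' ''
print(repr(question_arg))  # ''
```

The printed list can be passed directly as the first argument to subprocess.Popen.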

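The second problem is the final redirection to a file with the > output.csv tokens. When the shell sees this, it does not run a command named >. Instead, it opens the file output.csv for writing and sets it as the standard output handle of the last command. So you should not try to run a command named > as a subprocess (that is what raises the OSError, because no such executable exists). Instead, you need to emulate the shell by opening the file yourself: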

head = subprocess.Popen(['head', '-10'],
                        stdin=sort.stdout,
                        stdout=open('output.csv', 'w'),  # Not a pipe here
                        )
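
Putting both fixes together, the whole pipeline might look like the sketch below. It is not the asker's exact code: the function name run_pipeline and its parameters are hypothetical, the cat stage is swapped for opening the log file as grep's stdin, and it assumes grep, awk, sort and head are available on the PATH. Closing each parent copy of an upstream stdout lets the upstream process receive SIGPIPE if a downstream one exits early.

```python
import subprocess


def run_pipeline(log_path, out_path, pattern="ABCD"):
    """Emulate: grep PATTERN < log | awk ... | sort ... | head -10 > out.

    Hypothetical helper for illustration; returns the exit codes of
    grep, awk, sort and head, in that order.
    """
    with open(log_path, "rb") as log, open(out_path, "wb") as out:
        grep = subprocess.Popen(["grep", pattern],
                                stdin=log, stdout=subprocess.PIPE)
        awk = subprocess.Popen(["awk", "{print $14,$10,$5,$7}"],
                               stdin=grep.stdout, stdout=subprocess.PIPE)
        grep.stdout.close()  # awk now holds the only read end of this pipe
        sort = subprocess.Popen(["sort", "-t ", "-k4", "-n", "-r"],
                                stdin=awk.stdout, stdout=subprocess.PIPE)
        awk.stdout.close()
        # The last stage writes to the opened file instead of a pipe.
        head = subprocess.Popen(["head", "-10"],
                                stdin=sort.stdout, stdout=out)
        sort.stdout.close()
        head.wait()
        return [p.wait() for p in (grep, awk, sort)] + [head.returncode]
```

Calling run_pipeline('LogFile.msg.log', 'output.csv') would then stand in for the original one-liner.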
answered 2013-04-10T05:05:07.183

You could rewrite:

cat LogFile.msg.log | grep ABCD | awk '{print $14,$10,$5,$7}' |
sort -t' ' -k4 -n -r | head -10 > output.csv

in pure Python:

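from heapq import nlargest
from operator import itemgetter

select_items = itemgetter(13, 9, 4, 6) # note: zero-based indices
with open('LogFile.msg.log') as file, open('output.csv', 'w') as outfile:
    # keep only lines containing ABCD, then pick fields $14, $10, $5, $7
    rows = (select_items(line.split()) for line in file if 'ABCD' in line)
    # the ten rows with the largest numeric value in the fourth column
    top10_rows = nlargest(10, rows, key=lambda row: int(row[3]))
    # write to the file rather than printing to stdout
    outfile.write("\n".join(map(" ".join, top10_rows)) + "\n")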
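
To see what the field selection and nlargest step do, here is a small in-memory sketch. The helper names top_rows and make_line and the sample data are invented for this illustration. Note that, unlike awk and sort -n, this version raises an error on lines with fewer than 14 fields or with a non-numeric seventh field.

```python
from heapq import nlargest
from operator import itemgetter


def top_rows(lines, n=10, pattern="ABCD"):
    """Return the n selected rows with the largest 4th column (hypothetical helper)."""
    select_items = itemgetter(13, 9, 4, 6)  # awk fields $14, $10, $5, $7
    rows = (select_items(line.split()) for line in lines if pattern in line)
    return nlargest(n, rows, key=lambda row: int(row[3]))


def make_line(value):
    """Build a made-up 14-field log line whose 7th field is `value`."""
    fields = ["ABCD"] + ["f%d" % k for k in range(2, 15)]
    fields[6] = str(value)
    return " ".join(fields)


sample = [make_line(5), make_line(30), "no match here", make_line(12)]
result = top_rows(sample, n=2)
print(result)
# [('f14', 'f10', 'f5', '30'), ('f14', 'f10', 'f5', '12')]
```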
answered 2013-04-10T20:05:00.380