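I process multiple files at a time, and each file has its own summary statistics. At the end of the process, I want to create a summary file that adds up all the statistics. I already know how to mine the statistics from the log files, but I want to be able to add the numbers and echo the result to another file. This is what I use to mine the times: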
find . -iname "$srch1*" -exec grep "It took" {} \; -print
The output looks like this:
It took 0 hours, 11 minutes and 4 seconds to process that file.
./filepart000010-20140204-154923.dat.gz.log
It took 0 hours, 11 minutes and 56 seconds to process that file.
./filepart000007-20140204-154923.dat.gz.log
It took 0 hours, 29 minutes and 54 seconds to process that file.
./filepart000001-20140204-154923.dat.gz.log
It took 0 hours, 22 minutes and 33 seconds to process that file.
./filepart000004-20140204-154923.dat.gz.log
It took 0 hours, 59 minutes and 38 seconds to process that file.
./filepart000000-20140204-154923.dat.gz.log
It took 0 hours, 11 minutes and 50 seconds to process that file.
./filepart000005-20140204-154923.dat.gz.log
It took 0 hours, 22 minutes and 10 seconds to process that file.
./filepart000002-20140204-154923.dat.gz.log
It took 0 hours, 10 minutes and 39 seconds to process that file.
./filepart000008-20140204-154923.dat.gz.log
It took 0 hours, 12 minutes and 27 seconds to process that file.
./filepart000009-20140204-154923.dat.gz.log
It took 0 hours, 22 minutes and 36 seconds to process that file.
./filepart000003-20140204-154923.dat.gz.log
It took 0 hours, 11 minutes and 40 seconds to process that file.
./filepart000006-20140204-154923.dat.gz.log
What I want is something like this:
Summary
filepart000006-20140204-154923.dat.gz.log 0 hours, 11 minutes and 40 seconds
Then find the longest of those times and print a message such as:
Total time taken =____________
I am running these in parallel, so the total time taken is the longest individual time.
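One possible way to produce that time summary without Perl is to have grep print the file name next to each match and let awk do the formatting and the comparison. This is only a minimal sketch: `time_summary` is a hypothetical helper name, it assumes every log contains a single "It took H hours, M minutes and S seconds" line in exactly the format shown above, that file names contain no colon, and that the local grep supports `-H` (GNU and BSD grep do).

```shell
# Sketch only: print "file time" for every log and the longest time at the end.
# time_summary is a hypothetical name; $1 is the file name prefix to search for.
time_summary() {
    find . -iname "$1*" -type f -exec grep -H "It took" {} + |
    awk '
        BEGIN { print "Summary" }
        {
            # grep -H prefixes each match with "path:", so split at the first colon.
            sep = index($0, ":")
            path = substr($0, 1, sep - 1)
            sub(/^.*\//, "", path)              # keep only the base name

            # Words of the message: w[3] = hours, w[5] = minutes, w[8] = seconds.
            split(substr($0, sep + 1), w, " ")
            secs = w[3] * 3600 + w[5] * 60 + w[8]

            printf "%s %d hours, %d minutes and %d seconds\n", path, w[3], w[5], w[8]
            if (secs > max) max = secs          # remember the longest run
        }
        END {
            printf "Total time taken = %d hours, %d minutes and %d seconds\n",
                   int(max / 3600), int((max % 3600) / 60), max % 60
        }
    '
}
```

It could be called as `time_summary "$srch1" > summary.txt`, where `summary.txt` is just an example output name. Converting each time to seconds keeps the comparison a plain numeric one.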
Then do some calculations like this:
find . -iname "$srch*" -exec grep "Processed Files" {} \; -print
Processed Files: 7936635
./filename-20131102-part000000-20140204-153310.dat.gz.log
Processed Files: 3264805
./filename-20131102-part000001-20140204-153310.dat.gz.log
Processed Files: 1607547
./filename-20131102-part000008-20140204-153310.dat.gz.log
Processed Files: 3180478
./filename-20131102-part000003-20140204-153310.dat.gz.log
Processed Files: 1595497
./filename-20131102-part000007-20140204-153310.dat.gz.log
Processed Files: 1568532
./filename-20131102-part000009-20140204-153310.dat.gz.log
Processed Files: 3259884
./filename-20131102-part000002-20140204-153310.dat.gz.log
Processed Files: 3141542
./filename-20131102-part000004-20140204-153310.dat.gz.log
Processed Files: 3124221
./filename-20131102-part000005-20140204-153310.dat.gz.log
Processed Files: 3136845
./filename-20131102-part000006-20140204-153310.dat.gz.log
If I only want the metrics:
( find . -iname "dl-aster-full-20131102*" -exec grep "Processed Files" {} \;) | cut -d":" -f2
7936635
3264805
1607547
3180478
1595497
1568532
3259884
3141542
3124221
3136845
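For a bare column of numbers like this, the addition itself can probably be done with awk alone. A minimal sketch, where `sum_numbers` is a hypothetical helper name and the input is assumed to hold one integer per line:

```shell
# Sketch only: add up one number per input line and print the total.
# sum_numbers is a hypothetical helper name.
sum_numbers() {
    awk '{ total += $1 } END { printf "%d\n", total }'
}

# Possible usage with the command above:
# ( find . -iname "dl-aster-full-20131102*" -exec grep "Processed Files" {} \;) | cut -d":" -f2 | sum_numbers
```

The leading space that `cut` leaves in front of each number is harmless, because awk strips leading whitespace when it splits a line into fields.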
Based on the two outputs above, I just want to create one summary file:
Filename Processed files
filename-20131102-part000000-20140204-153310.dat.gz.log 7936635
... and then a total of everything above:
( 7936635 +
3264805 +
1607547 +
3180478.....etc
1595497
1568532
3259884
3141542
3124221
3136845 ) as
Total Files = ____________
So overall it would look like this:
Filename Processed files
filename-20131102-part000000-20140204-153310.dat.gz.log 7936635
Total Files = ____________ ( sum of all above )
All that needs to be done is to get the output in this format:
Filename Processed files
filename-20131102-part000000-20140204-153310.dat.gz.log 7936635
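With my commands above, the value and the file name end up on separate lines. After that, the numbers that were already printed need to be summed.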
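My questions are:
- How can I perform the addition as shown above, using any tool? I would avoid Perl, because I am not sure it will be installed everywhere the shell script runs.
- How can I format the output as shown above? I already know how to extract the output.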
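Both points can probably be covered in one pass with grep and awk, neither of which needs Perl. The following is a minimal sketch under some assumptions: `files_summary` is a hypothetical helper name, every log contains one "Processed Files: N" line, file names contain no colon, and grep supports `-H` so that the file name and the value arrive on the same line.

```shell
# Sketch only: print "file count" for every log, then the grand total.
# files_summary is a hypothetical name; $1 is the file name prefix to search for.
files_summary() {
    find . -iname "$1*" -type f -exec grep -H "Processed Files" {} + |
    awk -F: '
        BEGIN { print "Filename Processed files" }
        {
            # Input looks like "./name.log:Processed Files: 7936635".
            path = $1
            sub(/^.*\//, "", path)      # keep only the base name
            count = $NF + 0             # last field is the number; +0 drops the space
            printf "%s %d\n", path, count
            total += count
        }
        END { printf "Total Files = %d\n", total }
    '
}
```

It could be called as `files_summary "$srch" > summary.txt`, with `summary.txt` again being only an example name. Using `:` as the field separator means the file name is the first field and the count is the last one, so no separate `cut` step is needed.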