
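Hello, I'm trying to search a McAfee log file and remove all the "is OK" lines and other reported instances I'm not interested in seeing. Previously we used a shell script that relied on grep's -v option, but now we want a Python script that works on both Linux and Windows. After a few attempts I got a regex working in an online regex generator, but I'm having trouble implementing it in my script.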

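Edit: I want to remove the "is OK", "is a broken", "is a block" and "file could not be opened" lines, so that I'm left with a file containing only the problems I'm interested in. In shell it looks something like this: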

grep -v "is OK" ${OUTDIR}/${OUTFILE} | grep -v "is a broken" | grep -v "file could not be opened" | grep -v "is a block" > ${OUTDIR}/${OUTFILE}.trimmed 2>&1

Here is where I read and search the file:

import re

f2 = open(outFilePath)
contents = f2.read()
print contents
p = re.compile("^((?!(is OK)|(file could not be opened)| (is a broken)|(is a block)))*$", re.MULTILINE | re.DOTALL)
m = p.findall(contents)
print len(m)
for iter in m:
    print iter
f2.close()

An example of the file I'm trying to search:

eth0
10.0.11.196
00:0C:29:AF:6A:A7
parameters passed to uvscan: --DRIVER /opt/McAfee/uvscan/datfiles/current --    ANALYZE --AFC=32 ATIME-PRESERVE --PLAD --RPTALL RPTOBJECTS SUMMARY --UNZIP -- RECURSIVE --SHOWCOMP --MIME --THREADS=4 /tmp
temp XML output is: /tmp/HIQZRq7t2R
McAfee VirusScan Command Line for Linux64 Version: 6.0.5.614
Copyright (C) 2014 McAfee, Inc.
(408) 988-3832 LICENSED COPY - April 03 2016

AV Engine version: 5700.7163 for Linux64.
Dat set version: 8124 created Apr 3 2016
Scanning for 670707 viruses, trojans and variants.


No file or directory found matching /root/SVN/swd-lhn-build/trunk/utils/ATIME-PRESERVE

No file or directory found matching /root/SVN/swd-lhn-build/trunk/utils/RPTOBJECTS

No file or directory found matching /root/SVN/swd-lhn-build/trunk/utils/SUMMARY
/tmp/tmp.BQshVRSiBo ... is OK.
/tmp/keyring-F6vVGf/socket ... file could not be opened.
/tmp/keyring-F6vVGf/socket.ssh ... file could not be opened.
/tmp/keyring-F6vVGf/socket.pkcs11 ... file could not be opened.
/tmp/yum.log ... is OK.
/tmp/tmp.oW75zGUh4S ... is OK.
/tmp/.X11-unix/X0 ... file could not be opened.
/tmp/tmp.LCZ9Ji6OLs ... is OK.
/tmp/tmp.QdAt1TNQSH ... is OK.
/tmp/ks-script-MqIN9F ... is OK.
/tmp/tmp.mHXPvYeKjb/mcupgrade.conf ... is OK.
/tmp/tmp.mHXPvYeKjb/uvscan/uninstall-uvscan ... is OK.
/tmp/tmp.mHXPvYeKjb/mcscan ... is OK.
/tmp/tmp.mHXPvYeKjb/uvscan/install-uvscan ... is OK.
/tmp/tmp.mHXPvYeKjb/uvscan/readme.txt ... is OK.
/tmp/tmp.mHXPvYeKjb/uvscan/uvscan_secure ... is OK.
/tmp/tmp.mHXPvYeKjb/uvscan/signlic.txt ... is OK.
/tmp/tmp.mHXPvYeKjb/uvscan/uvscan ... is OK.
/tmp/tmp.mHXPvYeKjb/uvscan/liblnxfv.so.4 ... is OK.

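But I'm not getting the correct output. I also tried removing the MULTILINE and DOTALL options, but still didn't get the right result. Below is the output when running with DOTALL and MULTILINE.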

9
('', '', '', '', '')
('', '', '', '', '')
('', '', '', '', '')
('', '', '', '', '')
('', '', '', '', '')
('', '', '', '', '')
('', '', '', '', '')
('', '', '', '', '')
('', '', '', '', '')

Any help would be greatly appreciated! Thanks!

4 Answers

It may be simpler to go line by line:

import re
import sys

pattern = re.compile(r"(is OK)|(file could not be opened)|(is a broken)|(is a block)")

with open(sys.argv[1]) as handle:
    for line in handle:
        if not pattern.search(line):
            sys.stdout.write(line)

Output:

eth0
10.0.11.196
00:0C:29:AF:6A:A7
parameters passed to uvscan: --DRIVER /opt/McAfee/uvscan/datfiles/current --    ANALYZE --AFC=32 ATIME-PRESERVE --PLAD --RPTALL RPTOBJECTS SUMMARY --UNZIP -- RECURSIVE --SHOWCOMP --MIME --THREADS=4 /tmp
temp XML output is: /tmp/HIQZRq7t2R
McAfee VirusScan Command Line for Linux64 Version: 6.0.5.614
Copyright (C) 2014 McAfee, Inc.
(408) 988-3832 LICENSED COPY - April 03 2016

AV Engine version: 5700.7163 for Linux64.
Dat set version: 8124 created Apr 3 2016
Scanning for 670707 viruses, trojans and variants.


No file or directory found matching /root/SVN/swd-lhn-build/trunk/utils/ATIME-PRESERVE

No file or directory found matching /root/SVN/swd-lhn-build/trunk/utils/RPTOBJECTS

No file or directory found matching /root/SVN/swd-lhn-build/trunk/utils/SUMMARY
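
To mirror the original grep -v pipeline, which writes a ".trimmed" file rather than printing, the same loop could write to a second file. This is only a sketch: the function name trim_log and the constant UNWANTED are made up for illustration, and the paths are whatever the caller passes in.

```python
import re

# The four phrases from the grep -v pipeline. No capturing groups are
# needed because the pattern is only used as a yes/no test per line.
UNWANTED = re.compile(r"is OK|file could not be opened|is a broken|is a block")


def trim_log(src_path, dst_path, pattern=UNWANTED):
    """Copy src_path to dst_path, skipping lines that contain an unwanted phrase.

    Returns the number of lines written. (Hypothetical helper, not part of
    any library.)
    """
    kept = 0
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            if not pattern.search(line):
                dst.write(line)
                kept += 1
    return kept


# Example call, assuming out_file_path names the scan log:
# trim_log(out_file_path, out_file_path + ".trimmed")
```

Because it only uses open() and the re module, this should behave the same on Linux and Windows.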
answered 2016-04-05T21:33:37.210

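Sometimes a regex is more complicated than it needs to be. If you really are only looking for these patterns, I would probably try the simple approach: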

terms = (
    'is OK',
    'file could not be opened',
    'is a broken',
    'is a block',
)

with open('/tmp/sample.log') as f:
    for line in f:
        if line.strip() and not any(term in line for term in terms):
            print(line, end='')

It may not be faster than a regex, but it is simple. Alternatively, you could use a stricter approach:

terms = (
    'is a broken',
    'is a block',
)

with open('/tmp/sample.log') as f:
    for line in f:
        line = line.strip()
        if not line:
            continue
        elif line.endswith('is OK.'):
            continue
        elif line.endswith('file could not be opened.'):
            continue
        elif any(term in line for term in terms):
            continue
        print(line)
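
As a side note, str.endswith accepts a tuple of suffixes, so the chain of elif branches could be folded into a single predicate. The following is a sketch under that assumption; the names SUFFIXES, TERMS and keep are hypothetical and chosen only for this example.

```python
# Phrases that must appear at the very end of the line to drop it.
SUFFIXES = ("is OK.", "file could not be opened.")
# Phrases that drop the line wherever they appear.
TERMS = ("is a broken", "is a block")


def keep(line):
    """Return True if the line should stay in the trimmed report."""
    stripped = line.strip()
    if not stripped:
        return False  # blank lines are dropped, as in the loop above
    if stripped.endswith(SUFFIXES):  # endswith accepts a tuple of suffixes
        return False
    return not any(term in stripped for term in TERMS)


# Example use inside the file loop:
# for line in f:
#     if keep(line):
#         print(line.strip())
```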

Which approach I would take depends largely on who I expect to use the script :)

answered 2016-04-05T21:40:46.137

Try this (it does the job in one line):

p = re.compile("^(?:[if](?!s OK|s a broken|s a block|ile could not be opened)|[^if])*$")

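This means that if a line contains an "i" or an "f", that character must not be followed by one of the listed suffixes; any character other than "i" or "f" is fine. The check is repeated for every character in the line.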

Edit: After testing at regex101.com, I found out why it wasn't working. Here is the single-line regex that works.

p = re.compile("^(?:[^if\n]|[if](?!s OK|s a broken|s a block|ile could not be opened))*$", re.MULTILINE)
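
A quick way to check a pattern like this is to run it against a few lines from the sample log. The sketch below applies the pattern line by line, with each lookahead alternative listed once; the helper name kept_lines is an assumption made for this example.

```python
import re

# Each character is either not i/f/newline, or an i/f that does not
# start one of the unwanted phrases.
p = re.compile(
    r"^(?:[^if\n]|[if](?!s OK|s a broken|s a block|ile could not be opened))*$",
    re.MULTILINE,
)


def kept_lines(lines):
    """Return the lines the pattern accepts, i.e. the ones to keep."""
    return [line for line in lines if p.match(line)]


sample = [
    "eth0",
    "temp XML output is: /tmp/HIQZRq7t2R",
    "/tmp/tmp.BQshVRSiBo ... is OK.",
    "/tmp/keyring-F6vVGf/socket ... file could not be opened.",
]
print(kept_lines(sample))
```

The first two lines are kept and the last two are rejected, because the "i" in "is OK" and the "f" in "file could not be opened" both fail the lookahead.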
answered 2016-04-05T21:45:05.070

I know it is late to answer, but I see that none of the answers gives the correct solution.

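Your regex is wrong in this case. It has unnecessary extra groups and is missing a dot "." to consume the rest of the line. Also, the lookahead is only tested at the start of the line, so it only detects "is OK|file could not be opened|is a broken" when that text begins the line:

"hello world is OK": not detected  
"is OK hello world": detected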

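For an inverse match, just use a non-capturing group '(?:)' instead of a capturing group '()'. Otherwise re.findall returns the contents of the groups rather than the whole match, and here those are empty strings.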

If you want to remove the whole line, you can use the following expression:

 r"^(?!.*(?:is OK|is a broken|file could not be opened)).*"
"is OK. hello world": line is excluded  
"hello world is OK.": line is excluded  
"is OK.": line is excluded

If you want to remove the whole line, but only lines that end with "is OK.|file could not be opened.|is a broken.", you can use the following expression:

r"^(?!.*(?:is OK|is a broken|file could not be opened)\.$).*"
"is OK. hello world": line is kept  
"hello world is OK.": line is excluded  
"is OK.": line is excluded

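The difference between the two expressions can be checked on three short strings. This is a minimal sketch; the names anywhere, at_end and is_kept are hypothetical and exist only for this comparison.

```python
import re

# Rejects a line if a phrase occurs anywhere in it.
anywhere = re.compile(r"^(?!.*(?:is OK|is a broken|file could not be opened)).*")
# Rejects a line only if it ends with a phrase followed by a period.
at_end = re.compile(r"^(?!.*(?:is OK|is a broken|file could not be opened)\.$).*")


def is_kept(pattern, line):
    """True when the inverse-match pattern accepts the line."""
    return pattern.match(line) is not None


for line in ("is OK. hello world", "hello world is OK.", "is OK."):
    print(repr(line), is_kept(anywhere, line), is_kept(at_end, line))
```

Only the first string is treated differently: the first pattern rejects it, while the second keeps it because "is OK." is not at the end of the line.
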
Remember to use a non-capturing group '(?:)' instead of a capturing group '()', otherwise you will get empty strings:

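import re

# Capturing group: findall returns the group's contents, which are empty here.
# `text` is assumed to hold the contents of the log file.
regex = r"^(?!.*(is OK|file could not be opened|is a broken|is a block)).*"
print(re.findall(regex, text, flags=re.MULTILINE))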

Output:

['', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '']

Use the join() function to get the full text:

# Non-capturing group: findall returns the whole matched line.
regex = r"^(?!.*(?:is OK|file could not be opened|is a broken|is a block)).*"
print("\n".join(re.findall(regex,text,flags=re.MULTILINE)))

Output:

eth0
10.0.11.196
00:0C:29:AF:6A:A7
parameters passed to uvscan: --DRIVER /opt/McAfee/uvscan/datfiles/current --    ANALYZE --AFC=32 ATIME-PRESERVE --PLAD --RPTALL RPTOBJECTS SUMMARY --UNZIP -- RECURSIVE --SHOWCOMP --MIME --THREADS=4 /tmp
temp XML output is: /tmp/HIQZRq7t2R
McAfee VirusScan Command Line for Linux64 Version: 6.0.5.614
Copyright (C) 2014 McAfee, Inc.
(408) 988-3832 LICENSED COPY - April 03 2016

AV Engine version: 5700.7163 for Linux64.
Dat set version: 8124 created Apr 3 2016
Scanning for 670707 viruses, trojans and variants.


No file or directory found matching /root/SVN/swd-lhn-build/trunk/utils/ATIME-PRESERVE

No file or directory found matching /root/SVN/swd-lhn-build/trunk/utils/RPTOBJECTS

No file or directory found matching /root/SVN/swd-lhn-build/trunk/utils/SUMMARY


answered 2020-10-11T10:27:32.420