I want to "grep" multiple regex on multiple files. I have all those regex in a file (one per line), that I load in the following way, constructing a "super regex" :
import re

dic = open('regex.dic')
rex = []
for l in dic:
    if not l.startswith('#'):  # skip comment lines
        rex.append('^.*%s.*$' % l.strip())
rex = '|'.join(rex)
debug('rex=' + rex)
global regex
regex = re.compile(rex, re.IGNORECASE | re.MULTILINE)
dic.close()
Then I check my files like this:
with open(fn, 'r') as f:
    data = f.readlines()
for i, line in enumerate(data):
    if len(line) <= 512:  # sanity check
        if regex.search(line):
            if not alreadyFound:
                log("[!]Found in %s:" % fn)
                alreadyFound = True
                found = True
                copyFile(fn)
            log("\t%s" % '\t'.join(data[i-args.context:i+args.context+1]).strip())
This works, but I feel it's really inefficient and fragile (a single bad regex in the dic could break the whole "super regex"). I was thinking about looping over the regex list instead, but that would mean scanning each file multiple times :/
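The best mitigation I've come up with so far (just a sketch of the loader, I'm not sure it's the right approach) is to compile each pattern on its own first, skip the ones that fail, and wrap the rest in non-capturing groups before joining them. Since I only ever call search() on single lines, I also dropped the ^.*...*$ wrapper and re.MULTILINE here:

import re

rex = []
with open('regex.dic') as dic:
    for l in dic:
        l = l.strip()
        if not l or l.startswith('#'):  # skip blanks and comment lines
            continue
        try:
            re.compile(l)  # reject lines that aren't valid regexes on their own
        except re.error as e:
            debug('skipping bad pattern %r: %s' % (l, e))
            continue
        rex.append('(?:%s)' % l)  # non-capturing group keeps alternation boundaries sane
regex = re.compile('|'.join(rex), re.IGNORECASE)

That at least keeps one bad line from silently breaking everything, but it still feels clunky.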
Any brilliant ideas on how to do this? Thanks!