I have some code to parse an apache log file(start_search
, and end_search
are date strings of the format found in an apache log):
with open("/var/log/apache2/access.log",'r') as log:
from itertools import takewhile, dropwhile
s_log = dropwhile(lambda L: start_search not in L, log)
e_log = takewhile(lambda L: end_search not in L, s_log)
query = [line for line in e_log if re.search(r'GET /(.+veggies|.+fruits)',line)]
import csv
query_dict = csv.DictReader(query,fieldnames=('ip','na-1','na-2','time', 'zone', 'url', 'refer', 'client'),quotechar='"',delimiter=" ")
import re
veggies = [ x for x in query_dict if re.search('veggies',x['url']) ]
fruits = [ x for x in query_dict if re.search('fruits',x['url']) ]
The second list generator is always empty; that is, if I switch the order of the last two lines:
fruits = [ x for x in query_dict if re.search('fruits',x['url']) ]
veggies = [ x for x in query_dict if re.search('veggies',x['url']) ]
the second list is always empty.
Why? (and how can I populate the fruits
and veggies
lists?)