
I have the following problem:

>>> lines = tuple(open('/var/log/fail2ban.log', 'r'))
>>> for item in lines:
...     item = item.strip('\n')
...     if "fail2ban.actions:" in item and "[postfix]" in item and "Ban" in item:
...             item = item.split(' ')
...             print item
...
['2013-01-17', '11:03:51,752', 'fail2ban.actions:', 'WARNING', '[postfix]', 'Ban', '87.111.253.157']
['2013-01-17', '11:10:42,612', 'fail2ban.actions:', 'WARNING', '[postfix]', 'Ban', '37.206.77.26']
['2013-01-17', '11:23:08,674', 'fail2ban.actions:', 'WARNING', '[postfix]', 'Ban', '37.2.185.188']
['2013-01-17', '12:40:44,997', 'fail2ban.actions:', 'WARNING', '[postfix]', 'Ban', '37.2.185.188']
['2013-01-17', '13:28:38,006', 'fail2ban.actions:', 'WARNING', '[postfix]', 'Ban', '194.106.26.177']
['2013-01-17', '13:43:56,959', 'fail2ban.actions:', 'WARNING', '[postfix]', 'Ban', '70.27.53.95']
['2013-01-17', '14:42:36,601', 'fail2ban.actions:', 'WARNING', '[postfix]', 'Ban', '95.120.42.12']
['2013-01-17', '14:45:35,147', 'fail2ban.actions:', 'WARNING', '[postfix]', 'Ban', '95.120.42.12']

I would like to know how to filter out the duplicates (item [6], which is the IP in this case) so that only unique values are printed.

3 Answers

You can keep a list or a set of the IPs you have already seen, and check it before printing each line.

Something like this:

lines = tuple(open('/var/log/fail2ban.log', 'r'))
seen = set()    
for item in lines:
  item = item.strip('\n')
  if "fail2ban.actions:" in item and "[postfix]" in item and "Ban" in item:
    item = item.split(' ')
    if item[6] not in seen:
      seen.add(item[6])
      print item
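
The same idea as a self-contained Python 3 sketch, for illustration only. The helper name `unique_bans`, its `jail` parameter, and the `sample` list are assumptions (the sample lines are rebuilt from the output shown in the question), and it assumes the IP is always the seventh space-separated field.

```python
def unique_bans(lines, jail="[postfix]"):
    """Return the split fields of the first Ban line seen for each IP."""
    seen = set()   # IPs that have already been reported
    result = []
    for line in lines:
        line = line.rstrip("\n")
        if "fail2ban.actions:" in line and jail in line and "Ban" in line:
            fields = line.split(" ")
            ip = fields[6]  # assumption: the IP is the 7th space-separated field
            if ip not in seen:
                seen.add(ip)
                result.append(fields)
    return result


# Sample data rebuilt from the question's output, not read from a real log.
sample = [
    "2013-01-17 11:23:08,674 fail2ban.actions: WARNING [postfix] Ban 37.2.185.188\n",
    "2013-01-17 12:40:44,997 fail2ban.actions: WARNING [postfix] Ban 37.2.185.188\n",
    "2013-01-17 13:43:56,959 fail2ban.actions: WARNING [postfix] Ban 70.27.53.95\n",
]

for fields in unique_bans(sample):
    print(fields)
```

Because an IP is added to `seen` the first time it appears, only the earliest ban for each IP is kept and the original log order is preserved.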
answered 2013-01-17T14:15:12.990

>>> lines = tuple(open('/var/log/fail2ban.log', 'r'))
>>> seen = set()    
>>> for item in lines:
...     item = item.strip('\n')
...     if "fail2ban.actions:" in item and "[postfix]" in item and "Ban" in item:
...             item = item.split(' ')
...             if item[6] not in seen:
...                 seen.add(item[6])
...                 print item
answered 2013-01-17T14:16:07.343

If you only want one entry per IP and it does not matter which entry that is, try the following:

item_dict = dict()
lines = tuple(open('/var/log/fail2ban.log', 'r'))
for item in lines:
    item = item.strip('\n')
    if "fail2ban.actions:" in item and "[postfix]" in item and "Ban" in item:
            item = item.split(' ')
            item_dict[item[6]]=item[:-1]

print(item_dict)

[Edit]: If order matters, you can use an OrderedDict. To do so, just replace

item_dict = dict()

with

from collections import OrderedDict
item_dict = OrderedDict()
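
To illustrate what the ordered variant actually keeps, here is a small Python 3 sketch. The names `events`, `last_seen`, and `first_seen` are assumptions, and the pairs are taken from the question's output with the times shortened. Plain assignment keeps the position of an IP's first occurrence but the value of its last one; `setdefault` keeps the first value instead.

```python
from collections import OrderedDict

# (ip, time) pairs based on the question's output; 37.2.185.188 appears twice.
events = [
    ("37.2.185.188", "11:23:08"),
    ("70.27.53.95", "13:43:56"),
    ("37.2.185.188", "12:40:44"),
]

last_seen = OrderedDict()
first_seen = OrderedDict()
for ip, time in events:
    last_seen[ip] = time             # overwrites: the last occurrence's value wins
    first_seen.setdefault(ip, time)  # stores a value only the first time the key appears

print(list(last_seen.items()))
print(list(first_seen.items()))
```

Both dictionaries list the IPs in first-seen order; they differ only in which timestamp is stored for the repeated IP.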

[Edit 2]: If all you need is the set of IPs that match your criteria, then you should use a set of IPs.

item_set = set()
lines = tuple(open('/var/log/fail2ban.log', 'r'))
for item in lines:
    item = item.strip('\n')
    if "fail2ban.actions:" in item and "[postfix]" in item and "Ban" in item:
            item = item.split(' ')
            item_set.add(item[6])

print('\n'.join(item_set))

By definition, every element of a set is unique.
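
One caveat worth sketching: a set does not remember insertion order, so the joined output above may come out in any order. If a stable order is wanted, `sorted()` or `dict.fromkeys` (which preserves insertion order on Python 3.7 and later) are two options. The list `ips` below is sample data taken from the question, and the variable names are assumptions.

```python
# Sample IPs from the question's output, duplicates included.
ips = [
    "87.111.253.157",
    "37.2.185.188",
    "37.2.185.188",
    "95.120.42.12",
    "95.120.42.12",
]

unique_sorted = sorted(set(ips))            # unique, in lexicographic string order
unique_in_order = list(dict.fromkeys(ips))  # unique, in first-seen order

print("\n".join(unique_in_order))
```

Note that `sorted()` compares the addresses as strings, not numerically, so it gives a repeatable order rather than a true numeric IP ordering.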

answered 2013-01-17T14:18:20.990