python - Removing duplicate entries?

Question

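I need to compare values across rows. Each row is a dictionary, and I need to compare the value of the key 'flag' in adjacent rows. How can I do that? Naively: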

for row in range(1, len(myjson)):
    if row['flag'] == (row-1)['flag']:
        print 'yes'

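This raises a TypeError: 'int' object is not subscriptable

even though range returns a list of integers...

In response to the comments:

The list of rows is a list of dictionaries. Originally, I imported a tab-delimited file and read it with csv.DictReader, which gave me a list of dictionaries whose keys correspond to the variable names.

Code (where myjson is the list of dictionaries):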

for row in myjson:
    print row

Output:

{'website': '', 'phone': '', 'flag': 0, 'name': 'Diane Grant Albrecht M.S.', 'email': ''}
{'website': 'www.got.com', 'phone': '111-222-3333', 'flag': 1, 'name': 'Lannister G. Cersei M.A.T., CEP', 'email': 'cersei@got.com'}
{'website': '', 'phone': '', 'flag': 2, 'name': 'Argle D. Bargle Ed.M.', 'email': ''}
{'website': 'www.daManWithThePlan.com', 'phone': '000-000-1111', 'flag': 3, 'name': 'Sam D. Man Ed.M.', 'email': 'dman123@gmail.com'}
{'website': '', 'phone': '', 'flag': 3, 'name': 'Sam D. Man Ed.M.', 'email': ''}
{'website': 'www.daManWithThePlan.com', 'phone': '111-222-333', 'flag': 3, 'name': 'Sam D. Man Ed.M.', 'email': 'dman123@gmail.com'}
{'website': '', 'phone': '', 'flag': 4, 'name': 'D G Bamf M.S.', 'email': ''}
{'website': '', 'phone': '', 'flag': 5, 'name': 'Amy Tramy Lamy Ph.D.', 'email': ''}

Also:

type(myjson)

<type 'list'>

score 2 · Accepted Answer

To compare adjacent items, you can use zip:

Example:

>>> lis = [1,1,2,3,4,4,5,6,7,7]
>>> for x, y in zip(lis, lis[1:]):
...     if x == y:
...         print x, y, 'are equal'
...
1 1 are equal
4 4 are equal
7 7 are equal

For your list of dictionaries, you can do something like this:

from itertools import izip  # Python 2 only
it1 = iter(list_of_dicts)
it2 = iter(list_of_dicts)
next(it2)  # advance the second iterator so each pair is (item, following item)
for x, y in izip(it1, it2):
    if x['flag'] == y['flag']:
        print 'yes'
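
The izip version above is Python 2 code. In Python 3 the built-in zip is already lazy, so the same two-iterator idea can be sketched without itertools. This is only an illustration: the helper name adjacent_matches and the small rows list are made up here and are not the asker's data.

```python
# Python 3 sketch: compare each row with the one that follows it.
# `rows` is illustrative sample data, not the asker's file.
rows = [
    {'name': 'a', 'flag': 0},
    {'name': 'b', 'flag': 3},
    {'name': 'c', 'flag': 3},
    {'name': 'd', 'flag': 4},
]

def adjacent_matches(rows, key='flag'):
    """Return (index, index + 1) pairs whose rows share the same value for `key`."""
    it1 = iter(rows)
    it2 = iter(rows)
    next(it2, None)  # shift the second iterator by one; the default avoids StopIteration on an empty list
    return [(i, i + 1)
            for i, (x, y) in enumerate(zip(it1, it2))
            if x[key] == y[key]]

print(adjacent_matches(rows))  # [(1, 2)]
```

Returning index pairs rather than printing inside the loop makes it easier to decide afterwards which of the two rows to keep.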

Update:

For runs of more than two adjacent items, you can use itertools.groupby:

>>> from itertools import groupby
>>> lis = [1,1,1,1,1,2,2,3,4]
>>> for k, group in groupby(lis):
...     print list(group)
...
[1, 1, 1, 1, 1]
[2, 2]
[3]
[4]

For your code, it would be:

>>> for k, group in groupby(dic, key = lambda x : x['flag']):
...     print list(group)
...     
[{'website': '', 'phone': '', 'flag': 0, 'name': 'Diane Grant Albrecht M.S.', 'email': ''}]
[{'website': 'www.got.com', 'phone': '111-222-3333', 'flag': 1, 'name': 'Lannister G. Cersei M.A.T., CEP', 'email': 'cersei@got.com'}]
[{'website': '', 'phone': '', 'flag': 2, 'name': 'Argle D. Bargle Ed.M.', 'email': ''}]
[{'website': 'www.daManWithThePlan.com', 'phone': '000-000-1111', 'flag': 3, 'name': 'Sam D. Man Ed.M.', 'email': 'dman123@gmail.com'}, {'website': '', 'phone': '', 'flag': 3, 'name': 'Sam D. Man Ed.M.', 'email': ''}, {'website': 'www.daManWithThePlan.com', 'phone': '111-222-333', 'flag': 3, 'name': 'Sam D. Man Ed.M.', 'email': 'dman123@gmail.com'}]
[{'website': '', 'phone': '', 'flag': 4, 'name': 'D G Bamf M.S.', 'email': ''}]
[{'website': '', 'phone': '', 'flag': 5, 'name': 'Amy Tramy Lamy Ph.D.', 'email': ''}]
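
Since the title asks about removing the repeated entries, one possible follow-up is to keep only the first row of each group. The sketch below assumes Python 3 and that the first row of a run is the one worth keeping; the function name first_of_each_run and the sample rows are hypothetical.

```python
from itertools import groupby
from operator import itemgetter

# Illustrative sample data, shaped like the rows in the question.
rows = [
    {'name': 'a', 'flag': 2, 'phone': ''},
    {'name': 'b', 'flag': 3, 'phone': '000-000-1111'},
    {'name': 'b', 'flag': 3, 'phone': ''},
    {'name': 'c', 'flag': 4, 'phone': ''},
]

def first_of_each_run(rows, key='flag'):
    """Keep only the first row of every run of adjacent rows sharing `key`."""
    # groupby never yields an empty group, so next(group) is always safe here.
    return [next(group) for _, group in groupby(rows, key=itemgetter(key))]

deduped = first_of_each_run(rows)
print([row['name'] for row in deduped])  # ['a', 'b', 'c']
```

groupby only merges rows that are next to each other, so if equal flags can appear in separate places the list would need to be sorted by that key first.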

score 1 · Accepted Answer

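Your exception indicates that list_of_rows is not what you think it is.

To look at the adjacent rows, provided list_of_rows really is a list, I would use enumerate() to get the current index and then use that index to load the next and previous rows: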

for i, row in enumerate(list_of_rows):
    previous = list_of_rows[i - 1] if i else None
    next = list_of_rows[i + 1] if i + 1 < len(list_of_rows) else None
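
To show how the neighbours might then be used, here is a small Python 3 sketch built on the same enumerate() pattern. The function name rows_matching_a_neighbour and the sample list are assumptions for illustration, and next_row is used instead of next so the built-in next() is not shadowed.

```python
def rows_matching_a_neighbour(list_of_rows, key='flag'):
    """Return the indices of rows whose `key` equals that of the previous or next row."""
    matches = []
    for i, row in enumerate(list_of_rows):
        # Index i - 1 would wrap around to the last row when i == 0, hence the guard.
        previous_row = list_of_rows[i - 1] if i else None
        next_row = list_of_rows[i + 1] if i + 1 < len(list_of_rows) else None
        if any(n is not None and n[key] == row[key]
               for n in (previous_row, next_row)):
            matches.append(i)
    return matches

# Illustrative data only.
sample = [{'flag': 0}, {'flag': 3}, {'flag': 3}, {'flag': 3}, {'flag': 4}]
print(rows_matching_a_neighbour(sample))  # [1, 2, 3]
```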

score 1 · Accepted Answer

It looks like you want to access the list elements in batches: http://code.activestate.com/recipes/303279/

Answered 2013-07-09T15:04:16.970
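
For readers who cannot open the link, batching a list can be sketched roughly as follows in Python 3. This is not the code from the linked recipe; the generator name batches and its size parameter are assumptions made for this illustration.

```python
from itertools import islice

def batches(items, size):
    """Yield successive lists of at most `size` items from any iterable."""
    if size < 1:
        raise ValueError('size must be at least 1')
    it = iter(items)
    while True:
        chunk = list(islice(it, size))  # take the next `size` items, fewer at the end
        if not chunk:
            return
        yield chunk

print(list(batches([1, 2, 3, 4, 5], 2)))  # [[1, 2], [3, 4], [5]]
```

Note that non-overlapping batches never compare the last item of one batch with the first item of the next, so for spotting adjacent duplicates a pairwise comparison is likely the better fit.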

score 1 · Accepted Answer

You can try this:

pre_item = list_of_rows[0]['flag']  # flag of the previous row
for row in list_of_rows[1:]:
    if row['flag'] == pre_item:
        print 'yes'
    pre_item = row['flag']

score 0 · Accepted Answer

list_of_rows = [ { 'a': 'foo',
                   'flag': 'bar' },
                 { 'a': 'blo',
                   'flag': 'bar' } ]
for row, successor_row in zip(list_of_rows, list_of_rows[1:]):
    if row['flag'] == successor_row['flag']:
        print "yes"

score 0 · Accepted Answer

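This is simple. If you need to remove the dictionaries that have the same value for the key "flag", as the title of your post implies (which is a bit misleading, since your dictionaries are not duplicates in the strict sense), you can iterate over the whole list of dictionaries and keep track of the flags in a separate list. If an item's flag is already in the list of flags, it is not added at all. It would look like this: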

def filterDicts(listOfDicts):
    result = []
    flags = []
    for di in listOfDicts:
        if di["flag"] not in flags:
            result.append(di)
            flags.append(di["flag"])
    return result

When called with the list of dictionaries you provided, it returns a list of 6 items, each with a unique flag value.
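
The membership test on a list rescans the flags for every row. A variant that tracks seen flags in a set avoids that, assuming the flag values are hashable. The sketch below is Python 3; the name filter_dicts and the position field are hypothetical, and the flag values are the ones shown in the question's output.

```python
def filter_dicts(list_of_dicts, key='flag'):
    """Keep the first dictionary seen for each distinct value of `key`."""
    seen = set()  # average O(1) membership test; values must be hashable
    result = []
    for d in list_of_dicts:
        if d[key] not in seen:
            seen.add(d[key])
            result.append(d)
    return result

flags = [0, 1, 2, 3, 3, 3, 4, 5]  # flag values from the question's output
rows = [{'flag': f, 'position': i} for i, f in enumerate(flags)]  # 'position' is illustrative
print(len(filter_dicts(rows)))  # 6
```

Unlike the groupby approach, this drops repeated flags wherever they occur in the list, not only when they are adjacent.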
