python - How do I remove duplicates from a list using a list comprehension?

Question

How can I remove duplicates from a list using a list comprehension? I have the following code:

a = [1, 2, 3, 3, 5, 9, 6, 2, 8, 5, 2, 3, 5, 7, 3, 5, 8]
b = []
b = [item for item in a if item not in b]

But it doesn't work; it just produces an identical list. Why does it produce an identical list?

score 14 · Accepted Answer

It produces an identical list because b contains no elements while the comprehension is running. What you want is:

>>> a = [1, 2, 3, 3, 5, 9, 6, 2, 8, 5, 2, 3, 5, 7, 3, 5, 8]
>>> b = []
>>> [b.append(item) for item in a if item not in b]
[None, None, None, None, None, None, None, None]
>>> b
[1, 2, 3, 5, 9, 6, 8, 7]
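
The comprehension above is used only for its side effect and builds a throwaway list of None values. A plain loop does the same job more directly. This is a minimal sketch, not part of the original answer: the function name dedupe_keep_order and the helper set seen are illustrative, and the set assumes the items are hashable.

```python
def dedupe_keep_order(items):
    """Return a new list with duplicates removed, keeping first occurrences."""
    seen = set()   # assumption: items are hashable
    result = []
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


a = [1, 2, 3, 3, 5, 9, 6, 2, 8, 5, 2, 3, 5, 7, 3, 5, 8]
b = dedupe_keep_order(a)
print(b)  # [1, 2, 3, 5, 9, 6, 8, 7]
```

The membership test on a set is constant time on average, whereas item not in b on a list scans the whole list each time.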

score 12 · Accepted Answer

If you don't mind using a technique other than a list comprehension, you can use a set:

>>> a = [1, 2, 3, 3, 5, 9, 6, 2, 8, 5, 2, 3, 5, 7, 3, 5, 8]
>>> b = list(set(a))
>>> print(b)
[1, 2, 3, 5, 6, 7, 8, 9]
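
Note that a set does not promise any particular order, so the result above should not be relied on to come out sorted or in input order. If the order of first appearance matters, one possible sketch (not from the original answer) is to sort the set by each value's first index in a:

```python
a = [1, 2, 3, 3, 5, 9, 6, 2, 8, 5, 2, 3, 5, 7, 3, 5, 8]

# set(a) removes the duplicates; sorting by a.index restores the
# order in which each value first appeared in the original list.
b = sorted(set(a), key=a.index)
print(b)  # [1, 2, 3, 5, 9, 6, 8, 7]
```

Each a.index call scans the list, so this is fine for short lists but likely slow for long ones.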

score 5 · Accepted Answer

Use keys on a dict constructed with the values in a as its keys.

b = dict([(i, 1) for i in a]).keys()

Or use a set:

b = [i for i in set(a)]

score 4 · Accepted Answer

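The list is unchanged because b starts out empty. This means that if item not in b is always True. Only after the whole list has been generated is this new, non-empty list assigned to the variable b.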
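
The order of evaluation can be made visible by splitting the assignment into two steps. This is only an illustrative sketch; the name new_list is not from the question, and the list is shortened for readability.

```python
a = [1, 2, 3, 3, 5]
b = []

# The comprehension is evaluated in full first. While it runs, the
# name b still refers to the empty list, so every item passes the test.
new_list = [item for item in a if item not in b]

print(b)         # [] (b was never modified)
print(new_list)  # [1, 2, 3, 3, 5]

# Only now is the name b rebound to the finished list.
b = new_list
```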

score 4 · Accepted Answer

Use groupby:

>>> from itertools import groupby
>>> a = [1, 2, 3, 3, 5, 9, 6, 2, 8, 5, 2, 3, 5, 7, 3, 5, 8]
>>> [k for k, _ in groupby(sorted(a, key=lambda x: a.index(x)))]
[1, 2, 3, 5, 9, 6, 8, 7]

If you don't care about the order in which the values first appeared in the original list, omit the key argument, e.g.

>>> [k for k, _ in groupby(sorted(a))]
[1, 2, 3, 5, 6, 7, 8, 9]

groupby can do more than this. To identify the items that occur more than once:

>>> [k for k, v in groupby(sorted(a)) if len(list(v)) > 1]
[2, 3, 5, 8]

Or build a frequency dictionary:

>>> {k: len(list(v)) for k, v in groupby(sorted(a))}
{1: 1, 2: 3, 3: 4, 5: 4, 6: 1, 7: 1, 8: 2, 9: 1}
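
For comparison, the same counts can be obtained without sorting by using collections.Counter from the standard library. This is a different technique from the groupby approach above, shown only as a sketch; the names counts and repeated are illustrative.

```python
from collections import Counter

a = [1, 2, 3, 3, 5, 9, 6, 2, 8, 5, 2, 3, 5, 7, 3, 5, 8]

# Counter maps each distinct value to the number of times it occurs.
counts = Counter(a)
print(counts[3])  # 4

# Values that occur more than once, sorted for a stable display order.
repeated = sorted(k for k, n in counts.items() if n > 1)
print(repeated)  # [2, 3, 5, 8]
```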

There are some very useful functions in the itertools module: chain, tee and product, to name a few!

score 1 · Accepted Answer

>>> from itertools import groupby
>>> repeated_items = [2,2,2,2,3,3,3,3,4,5,1,1,1]
>>> [
...     next(group)
...     for _, group in groupby(
...         repeated_items,
...         key=repeated_items.index
...     )
... ]
[2, 3, 4, 5, 1]
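
One caveat worth checking: groupby only merges consecutive items that share a key, so the snippet above relies on equal values already sitting next to each other. The sketch below illustrates this; the helper first_of_each_run and both sample lists are hypothetical.

```python
from itertools import groupby

grouped = [2, 2, 3, 3, 1, 1]   # equal values are adjacent
scattered = [2, 3, 2, 1, 3]    # equal values are not adjacent


def first_of_each_run(items):
    # groupby starts a new group whenever the key changes, so
    # non-adjacent duplicates end up in separate groups.
    return [next(group) for _, group in groupby(items, key=items.index)]


print(first_of_each_run(grouped))    # [2, 3, 1]
print(first_of_each_run(scattered))  # [2, 3, 2, 1, 3]

# Sorting by first index beforehand makes equal values adjacent.
regrouped = sorted(scattered, key=scattered.index)
print(first_of_each_run(regrouped))  # [2, 3, 1]
```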

score 1 · Accepted Answer

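For Python 3.6+, there is an improvement on Niek de Klein's excellent solution (whose main flaw is that it loses the input order). Since dicts are now insertion-ordered (an implementation detail of CPython 3.6, guaranteed by the language from 3.7), you can do: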

b = list(dict.fromkeys(a))

On earlier Pythons, you would do:

from collections import OrderedDict

b = list(OrderedDict.fromkeys(a))

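though it isn't as fast (even with OrderedDict moved to the C layer, it retains a lot of overhead to support reordering operations, which dict avoids by not supporting them).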
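
To check both claims on your own machine, a rough sketch like the following compares the two approaches. The variable names and the repeat count are arbitrary choices, and the timings depend on the Python version and hardware, so no figures are given here.

```python
from collections import OrderedDict
from timeit import timeit

a = [1, 2, 3, 3, 5, 9, 6, 2, 8, 5, 2, 3, 5, 7, 3, 5, 8]

via_dict = list(dict.fromkeys(a))
via_ordered = list(OrderedDict.fromkeys(a))
print(via_dict)                 # [1, 2, 3, 5, 9, 6, 8, 7]
print(via_dict == via_ordered)  # True on Python 3.7+

# Rough timing comparison; absolute numbers vary between runs.
t_dict = timeit(lambda: list(dict.fromkeys(a)), number=10_000)
t_ordered = timeit(lambda: list(OrderedDict.fromkeys(a)), number=10_000)
print(f"dict: {t_dict:.4f}s  OrderedDict: {t_ordered:.4f}s")
```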

score 1 · Accepted Answer

>>> a = [10,20,30,20,10,50,60,40,80,50,40,0,100,30,60]
>>> [a.pop(a.index(i, a.index(i)+1)) for i in a if a.count(i) > 1]
>>> print(a)
