12

I am trying to remove duplicates from two lists, so I wrote this function:

a = ["abc", "def", "ijk", "lmn", "opq", "rst", "xyz"]

b = ["ijk", "lmn", "opq", "rst", "123", "456", ]

for i in b:
    if i in a:
        print("found " + i)
        b.remove(i)

print(b)

But I find that a matching item that directly follows another matching item does not get removed.

I get a result like this:

found ijk
found opq
['lmn', 'rst', '123', '456']

But I expect the result to be:

['123', '456']

How can I fix my function so that it does what I want?

Thank you.

4

11 Answers

35

Here is what's going on. Suppose you have this list:

['a', 'b', 'c', 'd']

and you are looping over every element in the list. Suppose you are currently at index position 1:

['a', 'b', 'c', 'd']
       ^
       |
   index = 1

...and you remove the element at index position 1, giving you this:

['a',      'c', 'd']
       ^
       |
    index 1

After removing the item, the other items slide to the left, giving you this:

['a', 'c', 'd']
       ^
       |
    index 1

Then when the loop runs again, the loop increments the index to 2, giving you this:

['a', 'c', 'd']
            ^ 
            |
         index = 2

See how you skipped over 'c'? The lesson is: never delete an element from a list that you are looping over.
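
The skipping can be observed directly with a small trace. This is only a minimal sketch in Python 3 syntax; the names `items` and `visited` are illustrative and do not come from the question.

```python
# Record which elements a for loop actually visits when the list
# is mutated during iteration (illustrative names, Python 3 syntax).
items = ['a', 'b', 'c', 'd']
visited = []

for item in items:
    visited.append(item)
    if item == 'b':
        # 'c' slides into index 1, which the loop has already passed.
        items.remove(item)

print(visited)  # ['a', 'b', 'd']
print(items)    # ['a', 'c', 'd']
```

'c' is still in the list at the end, but the loop never looked at it.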

answered 2013-08-12T19:28:56.293
28

Your problem seems to be that you are changing the list you are iterating over. Iterate over a copy of the list instead.

for i in b[:]:
    if i in a:
        b.remove(i)


>>> b
['123', '456']

However, how about using a list comprehension instead?

>>> a = ["abc", "def", "ijk", "lmn", "opq", "rst", "xyz"]
>>> b = ["ijk", "lmn", "opq", "rst", "123", "456", ]
>>> [elem for elem in b if elem not in a ]
['123', '456']
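
If `a` is large, every `elem not in a` test scans the whole list. One possible variation, sketched here in Python 3 syntax, builds a set once for fast lookups and uses slice assignment so the existing list object is updated in place. The names `a_set` and `alias` are assumptions for illustration, and the approach only works when the elements are hashable.

```python
a = ["abc", "def", "ijk", "lmn", "opq", "rst", "xyz"]
b = ["ijk", "lmn", "opq", "rst", "123", "456"]
alias = b  # a second reference to the same list object

a_set = set(a)  # hypothetical helper: one-time conversion for fast membership tests

# Slice assignment replaces the contents of b without creating a new name binding,
# so every other reference to the list sees the change too.
b[:] = [elem for elem in b if elem not in a_set]

print(b)           # ['123', '456']
print(alias is b)  # True
```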
answered 2013-08-12T19:20:03.940
25

What about

b = set(b) - set(a)

If you need repeated items in b to stay repeated in the result, and/or need the original order retained, then

b = [x for x in b if x not in a]

will do.
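
The difference between the two forms only shows up when `b` contains repeated values. The sample lists in this sketch are made up for that purpose (they are not the lists from the question), and the names `as_set` and `as_list` are illustrative.

```python
# Made-up sample data in which '456' occurs twice in b.
a = ["ijk", "lmn"]
b = ["456", "ijk", "123", "456", "lmn"]

as_set = set(b) - set(a)                # duplicates collapse, order is not defined
as_list = [x for x in b if x not in a]  # duplicates and order are both kept

print(sorted(as_set))  # ['123', '456']
print(as_list)         # ['456', '123', '456']
```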

answered 2013-08-12T19:24:48.023
4

You asked to remove the duplicates from both lists; here is my solution:

from collections import OrderedDict
a = ["abc", "def", "ijk", "lmn", "opq", "rst", "xyz"]
b = ["ijk", "lmn", "opq", "rst", "123", "456", ]

x = OrderedDict.fromkeys(a)
y = OrderedDict.fromkeys(b)

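# Iterate over a snapshot of the keys: popping from x while iterating
# over x itself raises RuntimeError in Python 3.
for k in list(x):
    if k in y:
        x.pop(k)
        y.pop(k)


print(list(x.keys()))
print(list(y.keys()))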

Result:

['abc', 'def', 'xyz']
['123', '456']

The benefit here is that you keep the order of the items in both lists.
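
On Python 3.7 and later, a plain `dict` also preserves insertion order, so a similar result can probably be had without `OrderedDict`. This is a sketch of that idea rather than the answer's own method; the names `common`, `a_only` and `b_only` are assumptions for illustration.

```python
a = ["abc", "def", "ijk", "lmn", "opq", "rst", "xyz"]
b = ["ijk", "lmn", "opq", "rst", "123", "456"]

# dict.fromkeys keeps first-seen order and drops repeated keys.
x = dict.fromkeys(a)
y = dict.fromkeys(b)

# Key views support set operations; & yields the keys present in both.
common = x.keys() & y.keys()

# Build new lists instead of deleting while iterating.
a_only = [k for k in x if k not in common]
b_only = [k for k in y if k not in common]

print(a_only)  # ['abc', 'def', 'xyz']
print(b_only)  # ['123', '456']
```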

answered 2013-08-12T19:26:55.487
3

Or a set

set(b).difference(a)

Be forewarned that sets do not maintain order, if that is important
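
If the order does matter, one way to recover it is to sort the set by each element's position in the original list. This is a small sketch, not part of the answer above; `result` is an illustrative name, and `b.index` makes it quadratic, so it is only reasonable for short lists.

```python
a = ["abc", "def", "ijk", "lmn", "opq", "rst", "xyz"]
b = ["ijk", "lmn", "opq", "rst", "123", "456"]

# difference() accepts any iterable, so a does not need to be converted first.
# Sorting by b.index restores the order in which the survivors appeared in b.
result = sorted(set(b).difference(a), key=b.index)

print(result)  # ['123', '456']
```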

answered 2013-08-12T19:22:11.597
3

You can use a lambda function.

f = lambda list1, list2: list(filter(lambda element: element not in list2, list1))

Elements that also appear in list2 are removed from list1.

>>> a = ["abc", "def", "ijk", "lmn", "opq", "rst", "xyz"]
>>> b = ["ijk", "lmn", "opq", "rst", "123", "456"]
>>> f(a, b)
['abc', 'def', 'xyz']
>>> f(b, a)
['123', '456']
answered 2020-12-14T12:31:44.750
2

One way to avoid the problem of editing a list while iterating over it is to use a comprehension:

a = ["abc", "def", "ijk", "lmn", "opq", "rst", "xyz"]
b = ["ijk", "lmn", "opq", "rst", "123", "456", ]
b = [x for x in b if x not in a]
answered 2013-08-12T20:24:39.583
0

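There are already plenty of answers on "how do you fix it?", so this one is about "how do you improve it and make it more Pythonic?": since what you want is the difference between list b and list a, you should use the difference operation on sets: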

>>> a = ["abc", "def", "ijk", "lmn", "opq", "rst", "xyz"]
>>> b = ["ijk", "lmn", "opq", "rst", "123", "456", ]
>>> s1 = set(a)
>>> s2 = set(b)
>>> s2 - s1
set(['123', '456'])
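
The other set operators follow the same pattern. The sketch below is in Python 3 syntax, where a set prints as `{...}` rather than `set([...])`; results are passed through `sorted` here only so that the printed order is predictable.

```python
a = ["abc", "def", "ijk", "lmn", "opq", "rst", "xyz"]
b = ["ijk", "lmn", "opq", "rst", "123", "456"]
s1 = set(a)
s2 = set(b)

print(sorted(s2 - s1))  # only in b: ['123', '456']
print(sorted(s1 - s2))  # only in a: ['abc', 'def', 'xyz']
print(sorted(s1 & s2))  # in both: ['ijk', 'lmn', 'opq', 'rst']
print(sorted(s1 ^ s2))  # in exactly one: ['123', '456', 'abc', 'def', 'xyz']
```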
answered 2013-08-12T20:29:37.400
0

You can use a list comprehension

a = ["abc", "def", "ijk", "lmn", "opq", "rst", "xyz"]
b = ["ijk", "lmn", "opq", "rst", "123", "456", ]

To remove from a the values that also appear in b:

c = [value for value in a if value not in b]

To remove from b the values that also appear in a:

c = [value for value in b if value not in a]
answered 2021-09-08T11:44:26.967
0

Following 7stud's line of thinking, if you go through the list in reverse order you will not run into the problem you encountered:

a = ["abc", "def", "ijk", "lmn", "opq", "rst", "xyz"]

b = ["ijk", "lmn", "opq", "rst", "123", "456", ]

for i in reversed(b):
    if i in a:
        print("found " + i)
        b.remove(i)

print(b)

Output:
found rst
found opq
found lmn
found ijk
['123', '456']
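
A related variation walks the indices backwards and deletes by position, so it is explicit which slot is removed and `remove` does not have to search the list. This is a sketch with made-up sample data that includes a repeated value; `idx` is an illustrative name.

```python
# Made-up sample data in which 'ijk' occurs twice in b.
a = ["ijk", "lmn"]
b = ["ijk", "123", "ijk", "lmn", "456"]

# Going from the last index down to 0 means a deletion only shifts
# elements that have already been examined.
for idx in range(len(b) - 1, -1, -1):
    if b[idx] in a:
        del b[idx]

print(b)  # ['123', '456']
```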

answered 2020-12-14T13:09:01.203
-1

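A simple workaround is to walk the list by index, look at the element at that index, delete it if it matches, and advance the index only when nothing was deleted. Decrementing the loop variable inside a for loop over a range has no effect on the next iteration, so a while loop is used instead.
Mock-up code:

i = 0
while i < len(b):
    if b[i] in a:
        del b[i]  # the next element slides into index i, so do not advance
    else:
        i += 1

answered 2021-08-27T10:14:14.137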