3

Is it possible to run through a subset in a python list?

I have the following problem, I have two lists, list1 is very long and list2 is quite short. Now, I want to check which elements of the list2 are also in list1. My current version looks like this:

for item in list1:
    if item in list2:
        # do something

This takes a very long time. Is it possible to get a subset and then run through the list?

I need to do this many times.

4

3 回答 3

5

如果列表元素是可散列的,您可以使用集合找到交集:

>>> for x in set(list2).intersection(list1):
        print x

如果它们不可散列,您至少可以通过对较短的列表进行排序并进行二等分查找来加快搜索速度:

>>> from bisect import bisect_left
>>> list2.sort()
>>> n = len(list2)
>>> for x in list1:
        i = bisect_left(list2, x)
        if i != n and list2[i] == x:
            print x

如果您的数据元素既不可散列也不可排序,那么您将无法加速原始代码:

>>> for x in list1:
        if x in list2:
            print x

set-intersection 方法的运行时间与两个列表的长度之和成正比O(n1 + n2)。二等分搜索方法的运行时间为O((n1 + n2) * log(n2))。原始蛮力方法的运行时间为O(n1 * n2).

于 2013-05-10T08:03:12.083 回答
1

您可以sets在此处使用,它们提供与列表O(1)相比O(N)的查找。但是集合期望项目必须是可散列的(不可变的)。

s = set(list1)
for item in lis2:
    if item in s:
       #do something
于 2013-05-10T07:57:28.420 回答
0

你可以使用速记

list3 = [l for l in list1 if l in list2]

如果您的列表中有重复的元素

l2 = list(set(list2))
l1 = list(set(list1))

list3 = [l for l l1 if l in l2]
于 2013-05-10T08:00:20.537 回答