python - python中“in”和“for”操作的效率

Question

我在这里有两个列表，分别称为 A 和 B。

len(A) > len(B)

有两种方法可以做某事：

方法一：

def f():
    return [someFunc(i) for i in A if i in B]

方法二：

def f():
    return [someFunc(i) for i in B if i in A]

哪个更有效，为什么？

score 4 · Accepted Answer

4

Both are O(mn), since you're iterating over each list for each element of the other list.

于 2012-08-15T08:48:02.697 回答

score 4 · Accepted Answer

虽然操作的时间复杂度in是O(n)，所以整个操作是O(nm)，但这主要适用于它如何随问题规模扩展。A在性能方面，除了和B互斥的最坏情况之外，它应该更快地执行for B if i in A(where len(A) > len(B))，因为in一旦找到匹配项就会停止迭代。

A考虑所有条目和B具有相同值的最佳情况。该in操作将有效地O(1)和整个操作O(m)。

每个人都喜欢的一些基准：

$ python -m timeit "A=list(range(100000));B=list(range(100))" "[i for i in A if i in B]"
10 loops, best of 3: 113 msec per loop

$ python -m timeit "A=list(range(100000));B=list(range(100))" "[i for i in B if i in A]"
100 loops, best of 3: 2.6 msec per loop

抛开性能不谈，请注意您提供的两个函数做的不是同一件事。第一个迭代A并丢弃未出现在中的值，B这与迭代B并丢弃未出现在中的值不同A。回到两个列表中的所有值都相同的场景，第一个函数将返回len(A)元素，而第二个len(B)元素将返回。

score 3 · Accepted Answer

def f(A, B):
    return [someFunc(_) for _ in set(A).intersection(B)]

should be the most efficient way of doing this (at least if the lists are long enough for the time complexity to be of issue)

python - python中“in”和“for”操作的效率

3 回答 3

Related

Reference