python - 与 python 代码相比，库 itertools 的性能

Question

作为对我的问题的回答找到两个列表相同的基于 1 的位置，我得到了使用 C 库 itertools 来加快速度的提示。

为了验证我使用 cProfile 编写了以下测试：

from itertools import takewhile, izip

def match_iter(self, other):
    return sum(1 for x in takewhile(lambda x: x[0] == x[1],
                                        izip(self, other)))

def match_loop(self, other):
    element = -1
    for element in range(min(len(self), len(other))):
        if self[element] != other[element]:
            element -= 1
            break
    return element +1

def test():
    a = [0, 1, 2, 3, 4]
    b = [0, 1, 2, 3, 4, 0]

    print("match_loop a=%s, b=%s, result=%s" % (a, b, match_loop(a, b)))
    print("match_iter a=%s, b=%s, result=%s" % (a, b, match_iter(a, b)))

    i = 10000
    while i > 0:
        i -= 1
        match_loop(a, b)
        match_iter(a, b)

def profile_test():
    import cProfile
    cProfile.run('test()')

if __name__ == '__main__':
    profile_test()

函数 match_iter() 正在使用 itertools，而函数 match_loop() 是我在使用普通 python 之前实现的。

函数 test() 定义了两个列表，打印带有两个函数结果的列表以验证它是否正常工作。两个结果都具有预期值 5，即列表的长度相等。然后它在这两个函数上循环 10,000 次。

最后，使用 profile_test() 分析整个事情。

我学到的是 izip 没有在 python3 的 itertools 中实现，至少在我使用的 debian wheezy whitch 中没有实现。所以我用python2.7运行了测试

结果如下：

python2.7 match_test.py
match_loop a=[0, 1, 2, 3, 4], b=[0, 1, 2, 3, 4, 0], result=5
match_iter a=[0, 1, 2, 3, 4], b=[0, 1, 2, 3, 4, 0], result=5
         180021 function calls in 0.636 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.636    0.636 <string>:1(<module>)
        1    0.039    0.039    0.636    0.636 match_test.py:15(test)
    10001    0.048    0.000    0.434    0.000 match_test.py:3(match_iter)
    60006    0.188    0.000    0.275    0.000 match_test.py:4(<genexpr>)
    50005    0.087    0.000    0.087    0.000 match_test.py:4(<lambda>)
    10001    0.099    0.000    0.162    0.000 match_test.py:7(match_loop)
    20002    0.028    0.000    0.028    0.000 {len}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
    10001    0.018    0.000    0.018    0.000 {min}
    10001    0.018    0.000    0.018    0.000 {range}
    10001    0.111    0.000    0.387    0.000 {sum}

让我想知道的是，查看 cumtime 值，我的普通 python 版本对于 10,000 个循环的值为 0.162 秒，而 match_iter 版本需要 0.434 秒。

一方面，python 非常快，很棒，所以我不必担心。但这是否正确，C 库完成工作所需的时间是简单 python 代码的两倍多？还是我犯了一个致命的错误？

为了验证我也用 python2.6 运行了测试，这似乎更快，但循环和 itertools 之间的区别相同。

谁有经验并愿意提供帮助？

score 7 · Accepted Answer

首先，对实际计时的赞誉。
其次，可读性通常比编写快速代码更重要。如果您的代码运行速度提高了 3 倍，但您每 3 周花费 2 周来调试它，那么这不值得您花时间。
第三，您还可以使用timeit小段代码来计时。我发现这种方法比使用profile. （profile虽然有利于发现瓶颈）。

itertools一般来说，相当快。但是，特别是在这种情况下，您takewhile会减慢速度，因为 itertools 需要为沿途的每个元素调用一个函数。python 中的每个函数调用都有与之相关的合理数量的开销，因此可能会减慢您的速度（首先还有创建 lambda 函数的成本）。请注意，sum使用生成器表达式也会增加一些开销。但最终，在这种情况下，基本循环似乎总是获胜。

from itertools import takewhile, izip

def match_iter(self, other):
    return sum(1 for x in takewhile(lambda x: x[0] == x[1],
                                        izip(self, other)))

def match_loop(self, other):
    cmp = lambda x1,x2: x1 == x2

    for element in range(min(len(self), len(other))):
        if self[element] == other[element]:
            element += 1
        else:
            break

    return element

def match_loop_lambda(self, other):
    cmp = lambda x1,x2: x1 == x2

    for element in range(min(len(self), len(other))):
        if cmp(self[element],other[element]):
            element += 1
        else:
            break

    return element

def match_iter_nosum(self,other):
    element = 0
    for _ in takewhile(lambda x: x[0] == x[1],
                       izip(self, other)):
        element += 1
    return element

def match_iter_izip(self,other):
    element = 0
    for x1,x2 in izip(self,other):
        if x1 == x2:
            element += 1
        else:
            break
    return element



a = [0, 1, 2, 3, 4]
b = [0, 1, 2, 3, 4, 0]

import timeit

print timeit.timeit('match_iter(a,b)','from __main__ import a,b,match_iter')
print timeit.timeit('match_loop(a,b)','from __main__ import a,b,match_loop')
print timeit.timeit('match_loop_lambda(a,b)','from __main__ import a,b,match_loop_lambda')
print timeit.timeit('match_iter_nosum(a,b)','from __main__ import a,b,match_iter_nosum')
print timeit.timeit('match_iter_izip(a,b)','from __main__ import a,b,match_iter_izip')

但是请注意，最快的版本是循环+itertools 的混合体。这个（显式）循环izip也恰好更容易阅读（在我看来）。因此，我们可以由此得出结论，这takewhile是缓慢的部分，不一定itertools是一般的。

score 4 · Accepted Answer

我想这里的问题是您的测试列表很小 - 这意味着任何差异都可能很小，并且创建迭代器的成本超过了它们带来的收益。

在较大的测试中（性能可能更重要），使用的版本sum()可能会优于其他版本。

此外，还有样式问题 - 手动版本更长，并且依赖于按索引迭代，因此也不太灵活。

我认为最易读的解决方案是这样的：

def while_equal(seq, other):
    for this, that in zip(seq, other):
        if this != that:
            return
        yield this

def match(seq, other):
    return sum(1 for _ in while_equal(seq, other))

有趣的是，在我的系统上有一个稍微修改过的版本：

def while_equal(seq, other):
    for this, that in zip(seq, other):
        if this != that:
            return
        yield 1

def match(seq, other):
    return sum(while_equal(seq, other))

比纯循环版本执行得更好：

a = [0, 1, 2, 3, 4]
b = [0, 1, 2, 3, 4, 0]

import timeit

print(timeit.timeit('match_loop(a,b)', 'from __main__ import a, b, match_loop'))
print(timeit.timeit('match(a,b)', 'from __main__ import match, a, b'))

给予：

1.3171300539979711
1.291257290984504

也就是说，如果我们将纯循环版本改进为更加 Pythonic：

def match_loop(seq, other):
    count = 0
    for this, that in zip(seq, other):
        if this != that:
            return count
        count += 1
    return count

这次（使用与上述相同的方法）0.8548871780512854对我来说，比任何其他方法都快得多，同时仍然可读。这可能是由于原始版本中的索引循环，这通常非常慢。然而，我会选择这篇文章的第一个版本，因为我觉得它是最易读的。

python - 与 python 代码相比，库 itertools 的性能

2 回答 2

Related

Reference