正如 Patashu 建议的那样,让我们使用timeit
测试而不是尝试猜测。我将使用 中的魔法%timeit
,ipython
因为它更简单。这是代码:
In [275]: def ff(): pass
In [276]: def ft(): pass
In [277]: def f1(b): # naive implementation
.....: for i in range(1000000):
.....: if b: ft()
.....: else: ff()
In [278]: %timeit f1(True)
10 loops, best of 3: 117 ms per loop
In [279]: def f2(b): # DSM's implementation
.....: f = ft if b else ff
.....: for i in range(1000000):
.....: f()
In [280]: %timeit f2(True)
10 loops, best of 3: 99.2 ms per loop
所以,它要快一点,至少在我的 Mac 上的 CPython 3.3.0 64 位中。
However, if you know anything about Python optimization, you'll probably notice that this is about the same performance gain you'd expect just from moving a global variable to local. So, let's take that out of the equation by doing the same thing without hoisting the boolean expression:
In [277]: def f3(b): # Just local binding, no if hoisting
.....: f, g = ft, ff
.....: for i in range(1000000):
.....: if b: f()
.....: else: g()
In [286]: %timeit f3(True)
10 loops, best of 3: 94.8 ms per loop
I put together a more complete test, including the OP's intended optimization, and code that works in 3.x and 2.x without change, and ran it against Apple 2.7.2, python.org 3.3.0, PyPy 1.9.0/2.7.2, and Jython 2.5.2 (all 64-bit builds on a Mac, and then just using Cython 0.17.1 pyximport (under Python 3.3.0) to compile the same source as Cython code:
3.3.0 2.7.2 PyPy Jython Cython
orig 1.136 1.519 0.091 1.680 0.448
OP optimization 1.119 1.362 0.034 1.613 0.460
rebinding 0.936 1.369 0.030 1.492 0.137
DSM version 0.936 1.329 0.031 1.523 0.138
So, it looks like binding the names outside the loop gives a speed boost of somewhere between 1.1x and 3x; additionally hoisting the comparison out of the loop may give you another 3% or so—but all of this is nothing compared to using PyPy instead of CPython, Cython instead of Python, or even 3.x instead of 2.x. Writing actual Cython or custom C code, or moving the loop into numpy
, would be even faster.
And if you think about it, this makes sense. If the cost of a billion bool comparisons or global lookups matters, the cost of a billion function calls and a billion loops through the interpreter are going to matter far more. If you're not bothering to optimize that out (and you can often do that just by using a generator expression, list comprehension, map
call, etc. in place of a loop, even if switching interpreters, rewriting your code around numpy
, etc. aren't feasible), you shouldn't be worrying about the small stuff.
And obviously, if that last 3% ever really does make a difference, you'll need to perform more realistic tests, on the platforms you actually care about.
It's probably worth using DSM's implementation—but because it's more idiomatic and easier to read, not because it may or may not be faster.