python - Optimizing module imports in Python

Question

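I am reading David Beazley's Python reference book, and he makes this point:

For example, if you are performing a lot of square-root operations, it is faster to use "from math import sqrt" and "sqrt(x)" than to type "math.sqrt(x)".

and:

For calculations involving heavy use of methods or module lookups, it is almost always better to eliminate the attribute lookup by first putting the operation you want to perform into a local variable.

I decided to try it out:

first()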

def first():
    from collections import defaultdict
    x = defaultdict(list)

second()

def second():
    import collections
    x = collections.defaultdict(list)

The results were:

2.15461492538
1.39850616455

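Optimizations like this probably don't matter for my purposes. But I'm curious why the opposite of what Beazley wrote turns out to be true. Note that the difference is close to 1 second, which is significant given how trivial the task is.

Why is this happening?

Update:

I obtained the timings like this: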

from timeit import timeit

print timeit('first()', 'from __main__ import first')
print timeit('second()', 'from __main__ import second')

score 6 · Accepted Answer

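from collections import defaultdict and import collections should be outside the iterated timing loop, since you would not normally execute them repeatedly.

I guess that the from syntax has to do more work than the import syntax.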
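One way to check that guess is to look at the bytecode CPython generates for each function. The sketch below is written for Python 3 and relies on CPython's opcode names, which are an implementation detail; first() and second() mirror the question's functions, with a return added so they can be compared, and import_opcodes is a helper name introduced here for illustration.

```python
import dis


def first():
    from collections import defaultdict
    return defaultdict(list)


def second():
    import collections
    return collections.defaultdict(list)


def import_opcodes(func):
    """Return the names of the import-related bytecode instructions in func."""
    return [ins.opname for ins in dis.get_instructions(func)
            if ins.opname.startswith("IMPORT_")]


# On CPython, the from-import form needs an extra IMPORT_FROM step to pull
# the name out of the module after IMPORT_NAME has fetched the module itself.
print("first: ", import_opcodes(first))
print("second:", import_opcodes(second))
```

Both functions still execute IMPORT_NAME on every call, so neither form avoids the import machinery when the import statement sits inside the function being timed.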

Using this test code:

#!/usr/bin/env python

import timeit

from collections import defaultdict
import collections

def first():
    from collections import defaultdict
    x = defaultdict(list)

def firstwithout():
    x = defaultdict(list)

def second():
    import collections
    x = collections.defaultdict(list)

def secondwithout():
    x = collections.defaultdict(list)

print "first with import", timeit.timeit('first()', 'from __main__ import first')
print "second with import", timeit.timeit('second()', 'from __main__ import second')

print "first without import", timeit.timeit('firstwithout()', 'from __main__ import firstwithout')
print "second without import", timeit.timeit('secondwithout()', 'from __main__ import secondwithout')

I get these results:

first with import 1.61359190941
second with import 1.02904295921
first without import 0.344709157944
second without import 0.449721097946

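This shows the cost of the repeated imports: once the import statements are moved out of the timed functions, the from-import version (first) is the faster one, as Beazley says.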

score 4 · Accepted Answer

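I also get a similar ratio between first() and second(); the only difference is that my timings are at the microsecond level.

I don't think your timings measure anything useful. Try to come up with a better test case!

Update:
FWIW, here are some tests that support David Beazley's point.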

import math
from math import sqrt

def first(n=1000):
    for k in xrange(n):
        x = math.sqrt(9)

def second(n=1000):
    for k in xrange(n):
        x = sqrt(9)

In []: %timeit first()
1000 loops, best of 3: 266 us per loop
In []: %timeit second()
1000 loops, best of 3: 221 us per loop
In []: 266./ 221
Out[]: 1.2036199095022624

So first() is about 20% slower than second().
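The same idea can be taken one step further by binding the function to a local name inside the function that uses it, which is the local-variable trick Beazley describes. This is a minimal sketch for Python 3; the function names are made up for illustration, and the actual timings will vary by machine and interpreter version.

```python
import math
import timeit


def with_attribute_lookup(n=1000):
    total = 0.0
    for k in range(n):
        # Looks up the global name "math", then its attribute "sqrt",
        # on every iteration.
        total += math.sqrt(k)
    return total


def with_local_name(n=1000):
    sqrt = math.sqrt  # looked up once; afterwards it is a local variable
    total = 0.0
    for k in range(n):
        total += sqrt(k)
    return total


if __name__ == "__main__":
    for func in (with_attribute_lookup, with_local_name):
        # Take the minimum of several runs to reduce noise.
        best = min(timeit.repeat(func, number=200, repeat=3))
        print(func.__name__, best)
```

Both functions compute the same sum, so any difference in the printed times comes from how the name sqrt is resolved rather than from the arithmetic.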

score 1 · Accepted Answer

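My guess is that your test is biased: the second implementation benefits from the first one having already loaded the module, or simply from the module having been loaded recently.

How many times did you try it? Did you switch the order, and so on?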
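A small harness can rule out that kind of bias by timing the functions in both orders and keeping the best of several runs. The sketch below is written for Python 3; best_times is a hypothetical helper, and first() and second() mirror the question's functions.

```python
import timeit


def first():
    from collections import defaultdict
    return defaultdict(list)


def second():
    import collections
    return collections.defaultdict(list)


def best_times(funcs, number=10000, repeat=3):
    """Time each function `repeat` times and keep the minimum per function."""
    return {f.__name__: min(timeit.repeat(f, number=number, repeat=repeat))
            for f in funcs}


forward = best_times([first, second])   # first() is timed before second()
backward = best_times([second, first])  # order swapped
for name in ("first", "second"):
    print(name, forward[name], backward[name])
```

If the two columns agree for each function, the order of the measurements is not what causes the gap.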

score 1 · Accepted Answer

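first() doesn't save anything, since the module must still be accessed in order to import the name.

Also, you don't give your timing method, but judging by the function names it seems that first() performs the initial import, which will always take longer than subsequent imports because the module has to be compiled and executed.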
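The point about the initial import can be checked directly: a module's body runs only on the first import, and later imports reuse the entry cached in sys.modules. The sketch below, written for Python 3, creates a throwaway module in a temporary directory; the module name demo_once_mod and the helper count_module_executions are hypothetical names chosen for this illustration.

```python
import importlib
import os
import sys
import tempfile


def count_module_executions(times=3):
    """Import a throwaway module `times` times; return how often its body ran."""
    with tempfile.TemporaryDirectory() as tmp:
        log_path = os.path.join(tmp, "runs.log")
        # The module body appends one character to a log file each time it runs.
        source = "with open(%r, 'a') as log:\n    log.write('x')\n" % log_path
        with open(os.path.join(tmp, "demo_once_mod.py"), "w") as f:
            f.write(source)
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            for _ in range(times):
                # Only the first call compiles and executes the module;
                # the others find it in sys.modules.
                importlib.import_module("demo_once_mod")
            with open(log_path) as log:
                return len(log.read())
        finally:
            # Clean up so the demo leaves no trace in the interpreter state.
            sys.path.remove(tmp)
            sys.modules.pop("demo_once_mod", None)


print(count_module_executions())  # the module body should run exactly once
```

This is why a benchmark whose first timed call triggers the initial import will charge that one-time cost to whichever function happens to run first.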

score 1 · Accepted Answer

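There is also the question of efficiency in reading and understanding the source code. Here is a real-life example (code from a stackoverflow question).

Original: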

import math

def midpoint(p1, p2):
   lat1, lat2 = math.radians(p1[0]), math.radians(p2[0])
   lon1, lon2 = math.radians(p1[1]), math.radians(p2[1])
   dlon = lon2 - lon1
   dx = math.cos(lat2) * math.cos(dlon)
   dy = math.cos(lat2) * math.sin(dlon)
   lat3 = math.atan2(math.sin(lat1) + math.sin(lat2), math.sqrt((math.cos(lat1) + dx) * (math.cos(lat1) + dx) + dy * dy))
   lon3 = lon1 + math.atan2(dy, math.cos(lat1) + dx)
   return(math.degrees(lat3), math.degrees(lon3))

Alternative:

from math import radians, degrees, sin, cos, atan2, sqrt

def midpoint(p1, p2):
   lat1, lat2 = radians(p1[0]), radians(p2[0])
   lon1, lon2 = radians(p1[1]), radians(p2[1])
   dlon = lon2 - lon1
   dx = cos(lat2) * cos(dlon)
   dy = cos(lat2) * sin(dlon)
   lat3 = atan2(sin(lat1) + sin(lat2), sqrt((cos(lat1) + dx) * (cos(lat1) + dx) + dy * dy))
   lon3 = lon1 + atan2(dy, cos(lat1) + dx)
   return(degrees(lat3), degrees(lon3))

score 0 · Accepted Answer

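Write your code as usual, importing a module and referring to its functions and constants as module.attribute. Then either prefix your functions with the decorator for binding constants, or bind all the modules in your program using the bind_all_modules function below: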

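from sys import modules
# ClassType (old-style classes) exists only in Python 2.
from types import ClassType, FunctionType, ModuleType

# Note: _make_constants, which bind_all calls, is not included in this answer.

def bind_all_modules():
    # Apply constant binding to every module that is currently loaded.
    for name, module in modules.iteritems():
        if isinstance(module, ModuleType):
            bind_all(module)
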
def bind_all(mc, builtin_only=False, stoplist=[],  verbose=False):
    """Recursively apply constant binding to functions in a module or class.

    Use as the last line of the module (after everything is defined, but
    before test code).  In modules that need modifiable globals, set
    builtin_only to True.

    """
    try:
        d = vars(mc)
    except TypeError:
        return
    for k, v in d.items():
        if type(v) is FunctionType:
            newv = _make_constants(v, builtin_only, stoplist,  verbose)
            try: setattr(mc, k, newv)
            except AttributeError: pass
        elif type(v) in (type, ClassType):
            bind_all(v, builtin_only, stoplist, verbose)
