
Specifically, I need to convert an integer, say 9999, into bytes, b'9999', in code that runs on Python 2.6 up to Python 3.x.

In Python 2.x I do

b'%s'%n

while in Python 3.x I do

('%s'%n).encode()

Performance in Python 2.6:

>>> from timeit import Timer
>>> Timer('b"%s"%n','n=9999').timeit()
0.24728001750963813

Performance in Python 3.2:

>>> from timeit import Timer
>>> Timer('("%s"%n).encode()','n=9999').timeit()
0.534475012767416

Assuming my benchmarks are set up correctly, that is a hefty penalty in Python 3.x: the conversion takes roughly twice as long.

Is there a way to improve performance and close the gap with 2.6/2.7?

Maybe via the Cython route?

This is the generator function I'm trying to optimize. It is called over and over with args being a list of strings, bytes or numbers:

def pack_gen(self, args, encoding='utf-8'):
    crlf = b'\r\n'
    yield ('*%s\r\n'%len(args)).encode(encoding)
    for value in args:
        if not isinstance(value, bytes):
            value = ('%s'%value).encode(encoding)
        yield ('$%s\r\n'%len(value)).encode(encoding)
        yield value
        yield crlf

The function is called like this:

b''.join(pack_gen(args))
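
To make the output format concrete, here is a standalone sketch of the same generator with self dropped (an assumption made only so that it runs on its own) and the bytes it produces. The sample argument list is hypothetical.

```python
def pack_gen(args, encoding='utf-8'):
    # Same logic as the method above, minus `self` so it can run standalone.
    crlf = b'\r\n'
    # Header line: number of arguments.
    yield ('*%s\r\n' % len(args)).encode(encoding)
    for value in args:
        if not isinstance(value, bytes):
            # Strings and numbers are rendered as text, then encoded.
            value = ('%s' % value).encode(encoding)
        # Each argument is preceded by its length in bytes.
        yield ('$%s\r\n' % len(value)).encode(encoding)
        yield value
        yield crlf

# Hypothetical sample input: two strings and an integer.
packed = b''.join(pack_gen(['SET', 'key', 9999]))
# packed == b'*3\r\n$3\r\nSET\r\n$3\r\nkey\r\n$4\r\n9999\r\n'
```

Every non-bytes argument costs two format-and-encode round trips (one for the value, one for its length), which is why the integer conversion matters here.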

1 Answer


I get better results using repr() in Python 3.2:

>>> Timer('repr(n).encode()','n=9999').timeit()
0.32432007789611816
>>> Timer('("%s" % n).encode()','n=9999').timeit()
0.44790005683898926

But of course it is still not as good as "%s" % n.
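
If you want to repeat this comparison on your own interpreter, a small harness along these lines may help. It is only a sketch: best_time is a hypothetical helper name, and it runs 100000 iterations per measurement instead of timeit's default of one million, so its numbers are not directly comparable with the figures quoted here.

```python
from timeit import Timer

def best_time(stmt, setup='n=9999', repeat=3, number=100000):
    """Return the fastest of `repeat` runs, each executing `stmt` `number` times."""
    return min(Timer(stmt, setup).repeat(repeat, number))

# Candidate expressions; append others to compare them as well.
candidates = ['repr(n).encode()', '("%s" % n).encode()']

# Sanity check: every candidate must produce the same bytes.
for stmt in candidates:
    assert eval(stmt, {'n': 9999}) == b'9999'

# Collect the best time for each candidate.
timings = {}
for stmt in candidates:
    timings[stmt] = best_time(stmt)
    print('%-24s %.4f s' % (stmt, timings[stmt]))
```

Taking the minimum over several repeats reduces the influence of other processes on the measurement.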

I also checked Python 3.3.0a3, hoping that PEP 393 would speed up the conversion a little, but surprisingly the results were worse than with 3.2:

>>> Timer('repr(n).encode()','n=9999').timeit()
0.35951611599921307
>>> Timer('("%s"%n).encode()','n=9999').timeit()
0.4658188759985933

Finally, the following:

>>> Timer('str(n).encode()','n=9999').timeit()
0.49958825100111426

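suggests that most of the cost probably comes from function call overhead (judging from the results above, I assume that Python internally implements str() for integers by calling repr()). So if you need to optimize at this level, then, as you suggested, replacing the whole block of code with Cython may be the way to go. Better still, show us the loop you are trying to optimize, since there may be many other ways to speed it up.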
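
As one possible way to cut the per-item overhead without Cython, here is a sketch that assumes the code only has to run on Python 3.5 or later (or on Python 2, where str is bytes). PEP 461 brought % formatting back for bytes in Python 3.5, so integers and lengths can be formatted straight into bytes without an encode() call. The name pack is hypothetical, and whether this is actually faster for your data should be measured rather than assumed.

```python
def pack(args, encoding='utf-8'):
    """Hypothetical non-generator variant of pack_gen (assumes Python 3.5+)."""
    crlf = b'\r\n'
    # bytes % formatting (PEP 461) avoids the str -> bytes encode step.
    parts = [b'*%d\r\n' % len(args)]
    append = parts.append  # bind once to avoid repeated attribute lookups
    for value in args:
        if not isinstance(value, bytes):
            if type(value) is int:
                # Exact int only: bool is excluded so True still becomes b'True'.
                value = b'%d' % value
            else:
                value = ('%s' % value).encode(encoding)
        append(b'$%d\r\n' % len(value))
        append(value)
        append(crlf)
    return b''.join(parts)
```

Building a list and joining it once also avoids resuming a generator three times per argument.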

answered 2012-05-16T01:20:12.233