python - Is numpy slower than a list for huge arrays?

Question

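Take a look at my code below. It is part of a Python implementation of the sigma_2 function (using a crude sieve), which is one of the divisor functions: http://mathworld.wolfram.com/DivisorFunction.html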

from time import time
from itertools import count
import numpy

def sig2(N, nump=False):
    init = time()

    #initialize array with value=1 since every positive integer is divisible by 1
    if nump:
        print 'using numpy'
        nums = numpy.ones((N,), dtype=numpy.int64)
    else:
        nums = [1 for i in xrange(1, N)]

    #for each number n < N, add n*n to n's multiples
    for n in xrange(2, N):
        nn = n*n
        for i in count(1):
            if n*i >= N: break
            nums[n*i-1] += nn

    print 'sig2(n) done - {} ms'.format((time()-init)*1000)

I tried different values, and numpy was very disappointing.

For 2000:

sig2(n) done - 4.85897064209 ms
took : 33.7610244751 ms
using numpy
sig2(n) done - 31.5930843353 ms
took : 55.6900501251 ms

For 200000:

sig2(n) done - 1113.80600929 ms
took : 1272.8869915 ms
using numpy
sig2(n) done - 4469.48194504 ms
took : 4705.97100258 ms

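And so on. My code is not really scalable, since it is not O(n), but in both of these runs, and in others besides, using numpy made things slower. Shouldn't numpy be faster than Python lists and dicts? That was my impression of numpy.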

score 6 · Accepted Answer

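As @unutbu said, numpy really shines when you use vectorized operations. Here is an implementation optimized with numpy (consistent with the definition of the divisor function on MathWorld):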

import numpy as np

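def sig2_numpy(N):
    # candidate divisors 1..N
    x = np.arange(1,N+1)
    # zero out every candidate that does not divide N
    x[(N % x) != 0] = 0
    # sum the squares of the remaining divisors
    return np.sum(x**2)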

When you call it, it is much faster:

>> import time
>> init = time.time()
>> print sig2_numpy(20000)
>> print "It took", (time.time()-init)*1000., 'ms'
It took 0.916957855225 ms
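
Note that sig2_numpy(N) returns sigma_2 for the single number N, whereas the sieve in the question fills in sigma_2(k) for every k < N, so the two timings do not measure the same amount of work. As a rough sanity check, the vectorized function can be compared against a plain Python loop. This is only a sketch: sig2_plain is a hypothetical helper added for the comparison, and the explicit dtype=np.int64 and the int() cast are assumptions made here so the result does not depend on the platform's default integer width.

```python
import numpy as np

def sig2_numpy(N):
    # Same logic as the answer above; dtype is pinned as an assumption.
    x = np.arange(1, N + 1, dtype=np.int64)
    x[(N % x) != 0] = 0          # keep only the divisors of N
    return int(np.sum(x ** 2))   # sum of squared divisors as a Python int

def sig2_plain(N):
    # Hypothetical reference version: test every candidate divisor in turn.
    return sum(d * d for d in range(1, N + 1) if N % d == 0)

# The two versions should agree on every input tried here.
for N in (1, 6, 12, 20000):
    assert sig2_numpy(N) == sig2_plain(N)
```

For example, the divisors of 6 are 1, 2, 3 and 6, so both functions return 1 + 4 + 9 + 36 = 50.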

score 3 · Accepted Answer

NumPy gets its speed from performing computations on whole arrays at once, rather than on single values one at a time.

When you write

    for i in count(1):
        if n*i >= N: break
        nums[n*i-1] += nn

you are forcing the NumPy array nums to increment a single value, one index at a time. For NumPy arrays, that is a slow operation.
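
One way to hand the inner loop to NumPy, sketched here under a few assumptions, is to replace it with a single strided slice update per n. The function name sig2_sieve is hypothetical, and the array is given N-1 entries (like the list version in the question) so that index k-1 holds the value for the integer k.

```python
import numpy as np

def sig2_sieve(N):
    """Return an array whose entry k-1 holds sigma_2(k) for 1 <= k < N."""
    # Start at 1 because every positive integer is divisible by 1.
    nums = np.ones(N - 1, dtype=np.int64)
    for n in range(2, N):
        # The multiples n, 2n, 3n, ... live at indices n-1, 2n-1, 3n-1, ...
        # so one strided slice touches all of them in a single operation.
        nums[n - 1::n] += n * n
    return nums

print(sig2_sieve(7).tolist())  # sigma_2 for 1..6: [1, 5, 10, 21, 26, 50]
```

The outer loop over n is still ordinary Python, but each pass now performs one array operation instead of roughly N/n separate element updates, which is the kind of access pattern NumPy is built for.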
