python - Which is faster: np.vstack, np.append, np.concatenate, or a hand-written function in Cython?

Question

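I wrote a program that updates a numpy list on every iteration and performs some operations on it. The number of iterations depends on time: in 1 second, for example, there may be 1000 to 2500 iterations. This means the numpy list will hold no more than 2500 items when the program runs for 1 second.

I have implemented a basic algorithm, and I am not sure whether it is the fastest way to compute bonus: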

import numpy as np

cdef int[:, :] pl_list
cdef list pl_length
cdef list bonus
pl_list = np.array([[8, 7]], dtype=np.int32)

def modify(pl_list, pl_length):
    cdef int k_const = 10
    mean = np.mean(pl_list, axis=0)
    mean = np.subtract(mean, pl_length)
    dev = np.std(pl_list, axis=0)
    # Scale each component by its standard deviation (0 when the deviation is 0)
    mean[0] = mean[0] / dev[0] if dev[0] != 0 else 0
    mean[1] = mean[1] / dev[1] if dev[1] != 0 else 0

    bonus = -1 + (2 / (1 + np.exp(-k_const * mean)))
    return list(bonus)


for i in range(2499): # I just simplified the loop. the main loop works like startTime - time.clock() < seconds
    rand = np.random.randint(8, 64)
    pl_length = [rand, rand-1]

    pl_list = np.append(pl_list, [pl_length], axis=0)
    bonus = modify(pl_list, pl_length)

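I am considering these ideas to speed up the program:

Use np.vstack, np.stack, or np.concatenate instead of np.append(pl_list, [pl_length]). (Which one is likely to be faster?)
Compute np.std and np.mean with hand-written functions like this (since iterating over a memoryview is very fast in Cython):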

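cdef int i, sm = 0
for i in range(pl_list.shape[0]):
    sm += pl_list[i]
mean = sm/pl_list.shape[0]

I am also considering defining a fixed length for the memoryview (say 2500), so that I don't need np.append and can build a queue structure on top of that numpy list. (What about the queue library? Is it faster than a numpy list for this kind of operation?)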

Sorry if my questions are too many and too complicated. I just want the best possible performance in terms of speed.

score 17 · Accepted Answer

Ignoring the modify function, the core of your loop is:

pl_list = np.array([[8, 7]], dtype=np.int32)
....

for i in range(2499):
    ....
    pl_list = np.append(pl_list, [pl_length], axis=0)
    ...

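As a general rule, we discourage using np.concatenate and its derivatives in a loop. It is faster to append to a list and do a single concatenation at the end. (More on that later.)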
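A minimal sketch of that rule, assuming random rows like the ones in the question. The names new_rows, arr_slow, and arr_fast are made up for illustration, and no timings are claimed here:

```python
import numpy as np

rng = np.random.default_rng(0)
# Rows of the form [r, r - 1], like pl_length in the question.
new_rows = [[int(r), int(r) - 1] for r in rng.integers(8, 64, size=2499)]

# Pattern from the question: np.append builds a brand-new array on every iteration.
arr_slow = np.array([[8, 7]], dtype=np.int32)
for row in new_rows:
    arr_slow = np.append(arr_slow, [row], axis=0)

# Suggested pattern: grow a Python list, then build the array once at the end.
rows = [[8, 7]]
for row in new_rows:
    rows.append(row)
arr_fast = np.array(rows, dtype=np.int32)

# Both approaches hold the same values.
print(arr_slow.shape, np.array_equal(arr_slow, arr_fast))
```

Wrapping each loop in timeit on your own machine is the way to check how large the difference is for your sizes.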

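Is pl_list a list or an array? The name suggests a list, but it is created as an array. I haven't studied modify to see whether it requires an array or a list.

Look at the source code of functions like np.append. The base function is np.concatenate, which takes a list of arrays and joins them into a new array along the specified axis. In other words, it works well with a long list of arrays.

np.append replaces that list input with 2 arguments, so it has to be applied iteratively. That is slow: each append creates a new array.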

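np.hstack just makes sure the list elements are at least 1d, np.vstack makes them 2d, stack adds a dimension, and so on. So basically they all do the same thing, with small tweaks to the inputs.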
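To illustrate that these functions differ only in how they massage their inputs, here is a small sketch that appends one row three ways. The variable names are arbitrary:

```python
import numpy as np

a = np.array([[8, 7]], dtype=np.int32)
row = np.array([10, 9], dtype=np.int32)

# np.append takes exactly two arguments; the new row must already be 2d.
via_append = np.append(a, [row], axis=0)

# np.concatenate takes a list of arrays with matching dimensions.
via_concat = np.concatenate([a, row[None, :]], axis=0)

# np.vstack promotes the 1d row to 2d before concatenating.
via_vstack = np.vstack([a, row])

print(np.array_equal(via_append, via_concat), np.array_equal(via_concat, via_vstack))
```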

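Another model is to allocate a large enough array up front, e.g. res = np.zeros((n,2)), and insert values with res[i,:] = new_value. Speed is about the same as the list-append approach. This model can be moved to Cython and typed memoryviews for a (potentially) large speed improvement.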
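A rough sketch of the preallocation model in plain Python, assuming the 2500-row upper bound mentioned in the question. The names count and filled are illustrative, and the statistics are computed only on the rows written so far:

```python
import numpy as np

n = 2500                                 # upper bound on rows, taken from the question
res = np.zeros((n, 2), dtype=np.int32)   # allocated once, never resized
res[0, :] = [8, 7]
count = 1                                # number of rows filled so far

rng = np.random.default_rng(0)
for _ in range(n - 1):
    r = int(rng.integers(8, 64))
    res[count, :] = [r, r - 1]           # write in place instead of appending
    count += 1

    filled = res[:count]                 # a view of the filled rows, no copy
    mean = filled.mean(axis=0)
    dev = filled.std(axis=0)
```

Because res[:count] is a view, slicing the filled part does not copy data; only the writes and the reductions cost time on each iteration.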

score 1 · Accepted Answer

About four years late, but someone like me might stumble across this,

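If possible, you want to use something like a list comprehension. It is usually one of the best options if you want speed, though you may end up trading readability for it.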

Evidence that a list comprehension is faster than a standard for loop when appending to a file: https://towardsdatascience.com/speeding-up-python-code-fast-filtering-and-slow-loops-8e11a09a9c2f

For example, if you want to append to a list, you can do [append_item for append_item in range(repetitions)]

There is an added bonus (at the cost of readability): you can add a second for loop inside the comprehension:

my_list = [append_item for append_item in range(repetitions) for _ in range(repeat)]

Or more neatly:

my_list = [append_item
           for append_item in range(repetitions)
           for _ in range(repeat)]

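However, what is probably more interesting about this feature is that you can carry out computation-heavy function calls inside the list definition.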

my_list = [
    heavy_comp_item
    for append_item in range(repetitions)
    # one-element list: function_call runs once per append_item
    for heavy_comp_item in [function_call(append_item)]
    for _ in range(x)
]

I included a "for _ in range(x)" here to allow the same value (computed as heavy_comp_item) to be duplicated x times.
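A runnable sketch of that pattern, assuming a stand-in function. slow_square and the calls list are hypothetical and exist only to show that the expensive call runs once per item, not once per copy:

```python
calls = []

def slow_square(v):
    # Hypothetical stand-in for an expensive computation; records each call.
    calls.append(v)
    return v * v

repetitions = 3
x = 2

my_list = [
    heavy_comp_item
    for append_item in range(repetitions)
    for heavy_comp_item in [slow_square(append_item)]  # evaluated once per append_item
    for _ in range(x)                                  # repeat the cached value x times
]

print(my_list)   # [0, 0, 1, 1, 4, 4]
print(calls)     # [0, 1, 2]
```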

Sorry if what I've given you doesn't translate directly to your code, but hopefully it helps with future projects :).
