python - How to improve the performance of a for loop in Python

Question

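I run a lot of simulations in Python, simulating system responses.

So far I have been using a Runge-Kutta scheme, but I stumbled upon another scheme that I have been testing.

When testing it in Matlab, I got far better performance than with my Runge-Kutta. However, when I ported it to Python, it was noticeably slower.

I'm not sure whether that is just how it is or whether I could improve the way I coded it, so I'd love to hear your input if possible.

Code in Matlab, as an example: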

dt = 0.0001;
f = randn(1, (60 / dt));
ns = length(f);
yo = zeros(3,1);
P1 = [0; 0.001; 0];
F = [1 0.0001 0; 0.001 1 0.0001; 0.001 0 1];
y1 = zeros(3, ns);
tic
for i = 1:ns
    y1(:, i) = P1*f(:, i) + F*yo;
    yo = y1(:, i);
end
toc

Here the loop executes in 0.55-0.61 seconds.

Code in Python, as an example:

import time
import numpy as np

dt = 0.0001
f = np.random.randn(1, int(60 / dt))
ns = np.size(f)
yo = np.zeros((3))
F = np.array([[1, 0.0001, 0], [0.001, 1, 0.0001], [0.001, 0, 1]])
P1 = np.transpose(np.array([[0, 0.0001, 0]]))
y1 = np.zeros((3, ns), order='F')
start_time = time.time()
for i in range(ns-1):
    y1[:, i] = np.dot(P1, f[:, i]) + np.reshape(np.dot(F, yo), (3))
    yo = y1[: , i]
print("--- %s seconds ---" % (time.time() - start_time))

Here the loop executes in 2.8-3.1 seconds.

Is there anything I can do to improve this?

Thank you for considering my question.

score 2 · Accepted Answer

I suggested numba in the comments. Here is an example:

import numba
import numpy as np

def py_func(dt, F, P1):
    f = np.random.randn(1, int(60 / dt))
    ns = f.size
    yo = np.zeros((3))
    y1 = np.zeros((3, ns), order='F')
    for i in range(ns-1):
        y1[:, i] = np.dot(P1, f[:, i]) + np.reshape(np.dot(F, yo), (3))
        yo = y1[: , i]
    return yo

@numba.jit(nopython=True)
def numba_func(dt, F, P1):
    f = np.random.randn(1, int(60 / dt))
    ns = f.size
    yo = np.zeros((3))
    y1 = np.zeros((3, ns))
    for i in range(ns-1):
        y1[:, i] = np.dot(P1, f[:, i]) + np.reshape(np.dot(F, yo), (3))
        yo = y1[: , i]
    return yo

You cannot use 'F' order with numba, because it uses C-type arrays, not FORTRAN arrays.
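For context on what the order argument changes, here is a small sketch in plain NumPy. It makes no claim about numba internals, and the array names are made up for illustration. It shows why the question used order='F': with a (3, n) array, a column slice is contiguous in memory only in Fortran order.

```python
import numpy as np

# Hypothetical small arrays, used only to inspect memory layout.
a_f = np.zeros((3, 5), order='F')  # column-major (Fortran order)
a_c = np.zeros((3, 5))             # row-major (C order, the NumPy default)

col_f = a_f[:, 0]  # one column of the Fortran-ordered array
col_c = a_c[:, 0]  # one column of the C-ordered array

# Strides are in bytes; float64 elements are 8 bytes each.
print("F order strides:", a_f.strides, "column contiguous:", col_f.flags['C_CONTIGUOUS'])
print("C order strides:", a_c.strides, "column contiguous:", col_c.flags['C_CONTIGUOUS'])
```

Storing the result as (n, 3) in C order and writing one row per step gives the same contiguous access pattern without the order argument.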

The timing difference is shown below:

Pure Python loop:

%%timeit
py_func(dt, F, P1)

Result:

2.88 s ± 100 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Numba:

%%timeit
numba_func(dt, F, P1)

Result:

588 ms ± 10.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
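%%timeit is an IPython/Jupyter cell magic, so it will not run in a plain script. As a rough sketch, a similar measurement can be taken with the standard-library timeit module. The larger dt below is an assumption made only to keep the run short; it is not the value used for the timings above.

```python
import timeit
import numpy as np

def py_func(dt, F, P1):
    # Same loop as the pure Python version above.
    f = np.random.randn(1, int(60 / dt))
    ns = f.size
    yo = np.zeros((3))
    y1 = np.zeros((3, ns), order='F')
    for i in range(ns - 1):
        y1[:, i] = np.dot(P1, f[:, i]) + np.reshape(np.dot(F, yo), (3))
        yo = y1[:, i]
    return yo

F = np.array([[1, 0.0001, 0], [0.001, 1, 0.0001], [0.001, 0, 1]])
P1 = np.transpose(np.array([[0, 0.0001, 0]]))
dt = 0.01  # assumption: coarser step than 0.0001 so the example finishes quickly

# Run the function 3 times, one call per measurement, and keep the best time.
times = timeit.repeat(lambda: py_func(dt, F, P1), repeat=3, number=1)
best = min(times)
print(f"best of {len(times)} runs: {best:.4f} s")
```

When timing a JIT-compiled function this way, calling it once before the measurement keeps the compilation time out of the result.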

score 1 · Accepted Answer

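I optimized your code a little, and my execution time dropped from 2.8 seconds to about 1.2 seconds. Before you go looking for a faster interpreter, I suggest you profile (see line_profiler) and try to remove everything you can from the innermost loop. Better still, try to avoid explicit "for" loops altogether and rely on numpy functions such as dot, einsum, etc.

I think there are still a few places left to optimize. I don't think I changed your values, but it is best to check. With other tools such as numba, cython ( cython.org ) or pypy ( pypy.org ), I would guess your execution time will improve considerably.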

#!/usr/bin/env python3

import numpy as np
import time

np.random.seed(0)

#@profile
def run():
    dt = 0.0001
    f = np.random.randn(1, int(60 / dt))
    ns = np.size(f)
    yo = np.zeros((3))
    F = np.array([[1, 0.0001, 0], [0.001, 1, 0.0001], [0.001, 0, 1]])
    P1 = np.transpose(np.array([[0, 0.0001, 0]]))
    start_time = time.time()
    y1 = np.outer(f, P1)
    for i in range(ns-1):
        y1[i] += F@yo
        yo = y1[i]
    print("--- %s seconds ---" % (time.time() - start_time))
    y1 = y1.T
    print(yo)

run()
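Since the answer recommends checking that the values are unchanged, here is a minimal sketch of such a check. The function names original_loop and hoisted_loop, and the short input length, are assumptions chosen for illustration. Both functions are given the same f so that the outputs can be compared directly.

```python
import numpy as np

def original_loop(f, F, P1):
    # Loop from the question: columns of a (3, ns) Fortran-ordered array.
    ns = f.size
    yo = np.zeros(3)
    y1 = np.zeros((3, ns), order='F')
    for i in range(ns - 1):
        y1[:, i] = np.dot(P1, f[:, i]) + np.reshape(np.dot(F, yo), (3))
        yo = y1[:, i]
    return y1

def hoisted_loop(f, F, P1):
    # Loop from this answer: the P1 * f product is computed once, outside the loop.
    ns = f.size
    yo = np.zeros(3)
    y1 = np.outer(f, P1)  # shape (ns, 3), row i holds P1 * f[i]
    for i in range(ns - 1):
        y1[i] += F @ yo
        yo = y1[i]
    return y1.T  # back to shape (3, ns)

rng = np.random.default_rng(0)
f = rng.standard_normal((1, 1000))  # assumption: short input, just for the check
F = np.array([[1, 0.0001, 0], [0.001, 1, 0.0001], [0.001, 0, 1]])
P1 = np.transpose(np.array([[0, 0.0001, 0]]))

a = original_loop(f, F, P1)
b = hoisted_loop(f, F, P1)

# Compare every column the loop actually visits (all but the last one).
same = np.allclose(a[:, :-1], b[:, :-1])
print("columns 0..ns-2 match:", same)
```

Both loops stop at ns-2, so the last column is never visited. The original leaves it at zero, while the np.outer version still holds P1 * f there, so that column has to be excluded from the comparison or handled separately.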
