python - Efficient way to create a matrix with Cython

Question

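I have a function that computes a matrix for me, but it is really slow. It runs slowly even in Cython, so I would like to know whether anything can be done to improve the code below.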

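Edit: I have changed or added the following.

I changed des = np.zeros([n-m+1,m]) to cdef np.ndarray des = np.zeros([n-m+1,m], dtype=DTYPE) (which is faster than without it; np.empty...). Instead of writing m/2 everywhere I added cdef int m2 = m/2, but that doesn't seem to help at all.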

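import numpy as np   # runtime NumPy API (np.zeros below)
cimport numpy as np  # compile-time NumPy declarations (np.ndarray, np.float_t)
cimport cython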

DTYPE = float
ctypedef np.float_t DTYPE_t

@cython.boundscheck(False)
@cython.cdivision(True)
@cython.wraparound(False)
cpdef map4(np.ndarray[DTYPE_t, ndim=1] s, int m): 

  cdef int n = len(s)
  cdef int i
  cdef int j

  des = np.zeros([n-m+1,m])
  for j in xrange(m):
      for i in xrange(m/2,n-m/2-1):
          des[i-m/2,j] = s[i-j+m/2]

  return des, s, m, n

Typically n ~ 10000 and m = 1001.

score 3 · Accepted Answer

Try:

cdef np.ndarray des = np.zeros([n-m+1,m])

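You can also make the declaration more specific, as you did for the parameter s. You can also turn off bounds checking. See the Cython NumPy tutorial.

You may also want to create a variable: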

cdef int m_2 = m/2

and use it everywhere you currently have m/2, because I don't know whether Cython will do that optimization for you.
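
As a rough illustration of the suggestion above, here is a plain-Python model of the question's double loop with the half-width computed once. It is only a sketch: it uses lists instead of NumPy arrays, the helper name build_rows is made up for this example, and // is used because / is true division in Python 3.

```python
def build_rows(s, m):
    """Plain-Python model of the question's double loop (hypothetical helper)."""
    n = len(s)
    half = m // 2  # computed once instead of on every iteration
    des = [[0.0] * m for _ in range(n - m + 1)]
    for j in range(m):
        for i in range(half, n - half - 1):
            des[i - half][j] = s[i - j + half]
    return des


# Small example: n = 10, m = 5
s = [float(k) for k in range(10)]
des = build_rows(s, 5)
```

For an odd m, every row r that the loop writes ends up being the window s[r:r+m] in reverse order, which can make the function easier to reason about when tuning it.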

score 2 · Accepted Answer

Assuming you are going to assign every element, it may also help to use np.empty instead of np.zeros:

des = np.empty([n-m+1,m])
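
Whether every element really is assigned depends on the loop bounds. The following plain-Python check lists the row indices touched by the question's inner loop, using the sizes quoted in the question and assuming m/2 is meant as integer (floor) division. The variable names are chosen for this sketch only.

```python
n, m = 10000, 1001       # sizes quoted in the question
half = m // 2            # assumes m/2 is meant as integer division

rows_allocated = n - m + 1
# Row indices the inner loop writes: i - half for i in range(half, n - half - 1)
rows_written = {i - half for i in range(half, n - half - 1)}
rows_missed = sorted(set(range(rows_allocated)) - rows_written)

print(rows_allocated, len(rows_written), rows_missed)
```

Under these assumptions the last allocated row is never written. If that also holds for the real code, np.empty would leave that row uninitialized, so either the loop's upper bound needs adjusting or np.zeros remains the safer choice.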

score 0 · Accepted Answer

I'm not seeing m being set anywhere. At the bottom of your code, you mention that n~10,000, and m=1001. Does that mean that m is a constant integer of 32 bits? Not seeing your compilation flags, it's frequently worthwhile to try it with and without -ffast-math to see if that makes a difference. With large arrays and matrices, using a smaller data type usually shows a significant speedup, provided that the smaller data type preserves the range and accuracy that your program needs, though I'm not seeing a large potential benefit on this calculation.
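
For a sense of scale on the data-type point, here is a back-of-the-envelope calculation with the sizes from the question. It assumes des holds 8-byte doubles by default and that 4-byte floats would be precise enough for the application, which the question does not confirm.

```python
n, m = 10000, 1001            # sizes quoted in the question
elements = (n - m + 1) * m    # des has shape (n - m + 1, m)

bytes_float64 = elements * 8  # 8 bytes per double-precision value
bytes_float32 = elements * 4  # 4 bytes per single-precision value

print(f"float64: {bytes_float64 / 1e6:.1f} MB, float32: {bytes_float32 / 1e6:.1f} MB")
```

The loop is essentially a copy, so halving the amount of memory written might help somewhat, but as noted above the gain on this particular calculation may be small.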

If you could show us the C code that is generated by this, that might help, as well.
