python - Cythonize two small numpy functions, help needed

Question

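I am trying to Cythonize two small functions that mostly deal with numpy ndarrays, for a scientific application. Both functions are called millions of times inside a genetic algorithm and account for most of the algorithm's running time.

I have made some progress on my own and both functions work, but I only get a small speed improvement (10%). More importantly, cython --annotate shows that most of the code is still going through Python.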

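The code

The first function:

The aim of this function is to return slices of the data; it is called millions of times in an inner nested loop. Depending on the flag in data[1][1], the slices are taken in forward or reverse order.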

#Ipython notebook magic for cython
%%cython --annotate
import numpy as np
from scipy import signal as scisignal

cimport cython
cimport numpy as np
def get_signal(data):
    #data[0] contains the data structure containing the numpy arrays
    #data[1][0] contains the position to slice
    #data[1][1] contains the orientation to slice, forward = 0, reverse = 1

    cdef int halfwinwidth = 100
    cdef int midpoint = data[1][0]
    cdef int strand = data[1][1]
    cdef int start = midpoint - halfwinwidth
    cdef int end = midpoint + halfwinwidth
    #the arrays we want to slice
    cdef np.ndarray r0 = data[0]['normals_forward']
    cdef np.ndarray r1 = data[0]['normals_reverse']
    cdef np.ndarray r2 = data[0]['normals_combined']
    if strand == 0:
        normals_forward = r0[start:end]
        normals_reverse = r1[start:end]
        normals_combined = r2[start:end]
    else:
        normals_forward = r1[end - 1:start - 1: -1]
        normals_reverse = r0[end - 1:start - 1: -1]
        normals_combined = r2[end - 1:start - 1: -1]
    #return the result as a tuple
    row = (normals_forward,
           normals_reverse,
           normals_combined)
    return row

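The second function:

This one receives a list of tuples of numpy arrays. We want to add the arrays up element-wise, normalize them, and then integrate the intersection of the forward and reverse profiles.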

def calculate_signal(list signal):
    cdef int halfwinwidth = 100
    cdef np.ndarray profile_normals_forward = np.zeros(halfwinwidth * 2, dtype='f')
    cdef np.ndarray profile_normals_reverse = np.zeros(halfwinwidth * 2, dtype='f')
    cdef np.ndarray profile_normals_combined = np.zeros(halfwinwidth * 2, dtype='f')
    #b is a tuple of 3 np.ndarrays containing 200 floats
    #here we add them up elementwise
    for b in signal:
        profile_normals_forward += b[0]
        profile_normals_reverse += b[1]
        profile_normals_combined += b[2]
    #normalize the arrays
    cdef int count = len(signal)

    #print "Normalizing to number of elements"
    profile_normals_forward /= count
    profile_normals_reverse /= count
    profile_normals_combined /= count
    intersection_signal = scisignal.detrend(np.fmin(profile_normals_forward, profile_normals_reverse))
    intersection_signal[intersection_signal < 0] = 0
    intersection = np.sum(intersection_signal)

    results = {"intersection": intersection,
               "profile_normals_forward": profile_normals_forward,
               "profile_normals_reverse": profile_normals_reverse,
               "profile_normals_combined": profile_normals_combined,
               }
    return results

Any help is appreciated. I tried using memoryviews, but for some reason the code got much, much slower.

score 0 · Accepted Answer

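After fixing the array cdefs (as has been pointed out, by specifying the dtype), you should probably put the routine in a cdef function (which can only be called by a def function in the same script).

In the function declaration you need to provide the type (and, for a numpy array, the number of dimensions):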

cdef get_signal(numpy.ndarray[DTYPE_t, ndim=3] data):
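
A typed buffer argument like this only accepts arrays whose dtype and number of dimensions match the declaration; a mismatch raises a ValueError at call time. One way to deal with that is to normalize the inputs once on the Python side, before the hot loop starts. The sketch below assumes 1-D float32 data (based on the dtype='f' in the question), and the helper name as_float32_1d is hypothetical.

```python
import numpy as np

def as_float32_1d(arr):
    """Return arr as a C-contiguous 1-D float32 array (hypothetical helper).

    The data is converted only when the input does not already match.
    """
    out = np.ascontiguousarray(arr, dtype=np.float32)
    if out.ndim != 1:
        raise ValueError("expected a 1-D array, got ndim=%d" % out.ndim)
    return out

# A float64 input is converted once, outside the inner loop.
track = as_float32_1d(np.arange(5, dtype=np.float64))
```

Doing the conversion up front keeps dtype checks and copies out of the code that runs millions of times.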

I'm not sure that using a dict is a good idea, though. You could use numpy's column or row slices instead, such as data[:, 0].
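
As a rough illustration of the slicing idea, the three tracks could be kept as rows of one 2-D array instead of dict entries. This is only a sketch in plain Python/numpy: the (3, n) row layout (forward, reverse, combined) and the names get_window and mean_profile are assumptions, and midpoint is assumed to be at least half a window from either end. Flipping the forward window also avoids the case start == 0, where end - 1:start - 1:-1 turns into an empty slice.

```python
import numpy as np

HALF = 100  # half window width, as in the question

def get_window(tracks, midpoint, strand, half=HALF):
    """Return a (3, 2*half) array cut from tracks around midpoint.

    tracks is assumed to be a (3, n) array whose rows are
    forward, reverse and combined (a hypothetical layout).
    """
    window = tracks[:, midpoint - half:midpoint + half]
    if strand == 0:
        return window
    # Reverse strand: swap the forward/reverse rows and flip the positions.
    return window[[1, 0, 2], ::-1]

def mean_profile(windows):
    """Element-wise mean over a list of (3, 2*half) windows."""
    return np.stack(windows).mean(axis=0)

tracks = np.arange(3 * 400, dtype=np.float32).reshape(3, 400)
fwd = get_window(tracks, 100, 0)
rev = get_window(tracks, 100, 1)
profile = mean_profile([fwd, rev])
```

With this layout the summing and normalizing loop in calculate_signal reduces to a single stack-and-mean call, and a 2-D array is also easier to declare with a type in Cython than a dict of arrays.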
