python - Cython function output differs slightly from Python function output

Question

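I converted a Python function to a Cython equivalent by adding types to some variables. However, the Cython function produces slightly different output than the original Python function.

I learned some of the reasons for this difference from the post "Cython: unsigned int indices for numpy arrays gives different result". But even with what I learned there, I still cannot get the Cython function to produce the same results as the Python one.

So I put together 4 functions that illustrate what I have tried. Could someone help explain why I get slightly different results from each function, and how to get a Cython function that returns exactly the same values as function1? I make some comments below: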

%%cython
import numpy as np
cimport numpy as np    

def function1(response, max_loc):    
    x, y = int(max_loc[0]), int(max_loc[1])

    tmp1 = (response[y,x+1] - response[y,x-1]) / 2*(response[y,x] - min(response[y,x-1], response[y,x+1]))
    tmp2 = (response[y,x+1] - response[y,x-1])
    tmp3 = 2*(response[y,x] - min(response[y,x-1], response[y,x+1]))

    print tmp1, tmp2, tmp3        
    return tmp1, tmp2, tmp3

cpdef function2(np.ndarray[np.float32_t, ndim=2] response, np.ndarray[np.float64_t, ndim=1] max_loc):
    cdef unsigned int x, y 
    x, y = int(max_loc[0]), int(max_loc[1])

    tmp1 = (response[y,x+1] - response[y,x-1]) / 2*(response[y,x] - min(response[y,x-1], response[y,x+1]))        
    tmp2 = (response[y,x+1] - response[y,x-1])
    tmp3 = 2*(response[y,x] - min(response[y,x-1], response[y,x+1]))     

    print tmp1, tmp2, tmp3        
    return tmp1, tmp2, tmp3


cpdef function3(np.ndarray[np.float32_t, ndim=2] response, np.ndarray[np.float64_t, ndim=1] max_loc):     
    cdef unsigned int x, y 
    x, y = int(max_loc[0]), int(max_loc[1])

    cdef np.float32_t tmp1, tmp2, tmp3
    cdef np.float32_t r1 =response[y,x+1]
    cdef np.float32_t r2 =response[y,x-1]
    cdef np.float32_t r3 =response[y,x]
    cdef np.float32_t r4 =response[y,x-1]
    cdef np.float32_t r5 =response[y,x+1]    

    tmp1 = (r1 - r2) / 2*(r3 - min(r4, r5))  
    tmp2 = (r1 - r2)
    tmp3 = 2*(r3 - min(r4, r5))

    print tmp1, tmp2, tmp3        
    return tmp1, tmp2, tmp3

def function4(response, max_loc):     
    x, y = int(max_loc[0]), int(max_loc[1])

    tmp1 = (float(response[y,x+1]) - response[y,x-1]) / 2*(float(response[y,x]) - min(response[y,x-1], response[y,x+1]))
    tmp2 = (float(response[y,x+1]) - response[y,x-1])
    tmp3 = 2*(float(response[y,x]) - min(response[y,x-1], response[y,x+1]))

    print tmp1, tmp2, tmp3        
    return tmp1, tmp2, tmp3

max_loc = np.asarray([ 15., 25.], dtype=np.float64) 
response = np.zeros((49,49), dtype=np.float32)     
x, y = int(max_loc[0]), int(max_loc[1])

response[y,x] = 0.959878861904  
response[y,x-1] = 0.438348740339
response[y,x+1] = 0.753262758255  

result1 = function1(response, max_loc)
result2 = function2(response, max_loc)
result3 = function3(response, max_loc)
result4 = function4(response, max_loc)
print result1
print result2
print result3
print result4

Results:

0.0821185777156 0.314914 1.04306030273
0.082118573023 0.314914017916 1.04306024313
0.0821185708046 0.314914017916 1.04306030273
0.082118573023 0.314914017916 1.04306024313
(0.082118577715618812, 0.31491402, 1.043060302734375)
(0.08211857302303427, 0.3149140179157257, 1.0430602431297302)
(0.08211857080459595, 0.3149140179157257, 1.043060302734375)
(0.082118573023034269, 0.31491401791572571, 1.0430602431297302)

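function1 represents what I do in the original Python function. tmp1 is the result.

function2 is my first Cython version, which produces slightly different results. Apparently, if the response array is indexed with a typed variable (unsigned int or int), the result is coerced to a double (using PyFloat_FromDouble), even though the array's type is np.float32_t. However, if the array is indexed with a Python int, PyObject_GetItem is used and I get an np.float32_t, which is what happens in function1. So the expressions in function1 are computed with np.float32_t operands, whereas the expressions in function2 are computed with doubles. The printed results differ slightly from those of function1.

function3 is my second attempt to get the same output as function1. Here I use unsigned int indices to access the response array, but the results are stored in np.float32_t intermediate variables, which I then use in the calculation. Again I get slightly different results. Apparently the print statement uses PyFloat_FromDouble, so it cannot print an np.float32_t.

Then I tried changing the Python function to match the Cython one. function4 attempts this by converting at least one operand in each expression to float, so that the remaining operands are also coerced to Python float, which is a double in Cython, and the expressions are computed with doubles, as in function2. The print inside the function is exactly the same as in function2, but the returned values are slightly different?!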

score 2 · Accepted Answer

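Let's compare:

function1 stays in float32_t the whole time.
function2 converts to float when indexing, does the intermediate steps with float, and then converts back to float32_t for the final result.
function3 converts to float, but then immediately back to float32_t, and does the intermediate steps with that.
function4 converts to float, does the intermediate steps, and then returns the final result as float.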
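
The differences between the first three functions can be reproduced without Cython by controlling where values are rounded to single precision. The following is a minimal sketch in plain Python, not the code Cython generates. The helper f32 is hypothetical and emulates a float32 store with struct, and the three evaluation orders are assumptions inferred from the printed results in the question.

```python
import struct

def f32(x):
    """Round a Python float (a C double) to the nearest float32 value."""
    return struct.unpack('f', struct.pack('f', x))[0]

# The three array entries as they are actually stored in the float32 array.
center = f32(0.959878861904)   # response[y, x]
left = f32(0.438348740339)     # response[y, x-1]
right = f32(0.753262758255)    # response[y, x+1]

# Assumed model of function1: each difference is rounded to float32,
# the division and the product are carried out in double precision.
like_f1 = f32(right - left) / 2 * f32(center - min(left, right))

# Assumed model of function2: everything is evaluated in double precision.
like_f2 = (right - left) / 2 * (center - min(left, right))

# Assumed model of function3: evaluated in double, then rounded to
# float32 when the result is assigned to the typed variable.
like_f3 = f32(like_f2)

print('%.17g' % like_f1)
print('%.17g' % like_f2)
print('%.17g' % like_f3)
```

Under these assumptions the three values should agree with the tmp1 results in the question to the digits shown there (0.0821185777..., 0.0821185730..., 0.0821185708...).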

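As for why function4 prints the same thing as function2 but returns something different: if you look at the types, it's simple. The values are evidently close enough that they print the same way, but not close enough to repr the same way. That's not surprising, given that they are not the same type.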
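
A closer look at the printed tuples suggests that the values returned by function2 and function4 may in fact be the same doubles, differing only in how their types format themselves. The check below is a small sketch in plain Python. It assumes (this is not stated in the thread) that the longer values come from a 17-significant-digit repr of np.float64 and the shorter ones from the shortest round-trip repr of a Python float.

```python
# Returned tmp1 values copied from the question's output.
from_function2 = float('0.08211857302303427')    # printed as a Python float
from_function4 = float('0.082118573023034269')   # printed as an np.float64

# Both strings parse to the same double.
print(from_function2 == from_function4)

# The same double formatted two ways:
print(repr(from_function2))        # shortest string that round-trips
print('%.17g' % from_function2)    # fixed 17 significant digits
```

If the comparison prints True, the two functions return equal values and only the type of the returned object differs.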

score 2 · Accepted Answer

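If you are using single-precision floats, which have only 7.225 decimal digits of precision, I would not expect small differences from coercion to double to matter.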
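
The 7.225 figure follows from the 24-bit significand of IEEE 754 single precision. A short illustration in plain Python follows; the helper f32 is hypothetical and emulates a float32 store with struct.

```python
import math
import struct

def f32(x):
    """Round a Python float (a C double) to the nearest float32 value."""
    return struct.unpack('f', struct.pack('f', x))[0]

# 24 significand bits correspond to 24 * log10(2) decimal digits.
digits = 24 * math.log10(2)
print('%.3f' % digits)

# Storing one of the question's inputs as float32 changes it after
# roughly that many significant digits.
value = 0.959878861904
stored = f32(value)
print('%.12f' % value)
print('%.12f' % stored)
print(abs(stored - value) / value)   # relative error of the float32 store
```

The relative error of the store is bounded by 2**-24, which is why the results in the question only start to diverge around the eighth significant digit.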

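To clarify your description of function2: if you index with an object, Cython uses PyObject_GetItem to get an np.float32 scalar object (not np.float32_t, which is just a typedef for C float). If you instead index directly into the buffer and Cython needs an object, it calls PyFloat_FromDouble. It needs objects to assign tmp1, tmp2, and tmp3, because they are not typed.

In function3, on the other hand, you typed the tmp variables, but it still needs to create float objects to print and return the results. If you use a NumPy ndarray instead (as in the rewritten functions further down), you will not have this problem.

By the way, when you divide by 2 in function1, you promote the result to np.float64. For example: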

>>> type(np.float32(1) / 2)
<type 'numpy.float64'>

versus

>>> type(np.float32(1) / np.float32(2))
<type 'numpy.float32'>

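Even if you make sure all operations are float32 in both the def and cpdef functions, the final result can still differ between the two in the compiled extension module. In the following example, I checked that the intermediate results in function1 are all np.float32 objects. In the generated C for function2, I checked that there is no cast to double (or an equivalent typedef). Yet the two functions still produce slightly different results. I would probably have to dig into the compiled assembly to find out why, but maybe I am overlooking something simple.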

def function1(response, max_loc):    
    tmp = np.zeros(3, dtype=np.float32)
    x, y = int(max_loc[0]), int(max_loc[1])
    tmp[0] = (((response[y,x+1] - response[y,x-1]) / np.float32(2)) *
             (response[y,x] - min(response[y,x-1], response[y,x+1])))
    tmp[1] = response[y,x+1] - response[y,x-1]
    tmp[2] = 2*(response[y,x] - min(response[y,x-1], response[y,x+1]))

    print tmp[0], tmp[1], tmp[2]
    return tmp

cpdef function2(np.ndarray[np.float32_t, ndim=2] response, max_loc):
    cdef np.ndarray[np.float32_t, ndim=1] tmp = np.zeros(3, dtype=np.float32)
    cdef unsigned int x, y
    x, y = int(max_loc[0]), int(max_loc[1])
    tmp[0] = (((response[y,x+1] - response[y,x-1]) / <np.float32_t>2) *
             (response[y,x] - min(response[y,x-1], response[y,x+1])))
    tmp[1] = response[y,x+1] - response[y,x-1]
    tmp[2] = 2*(response[y,x] - min(response[y,x-1], response[y,x+1]))

    print tmp[int(0)], tmp[int(1)], tmp[int(2)]
    return tmp

Comparison:

>>> function1(response, max_loc)
0.0821186 0.314914 1.04306
array([ 0.08211858,  0.31491402,  1.0430603 ], dtype=float32)

>>> function2(response, max_loc)
0.0821186 0.314914 1.04306
array([ 0.08211857,  0.31491402,  1.0430603 ], dtype=float32)
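
One candidate explanation for the remaining difference, which this sketch cannot confirm for the actual compiled module, is the point at which intermediate values are rounded. NumPy scalar arithmetic rounds after every float32 operation, whereas compiled C code may evaluate the whole expression at higher precision and round only when storing into tmp[0]. The emulation below is plain Python with a hypothetical helper f32 based on struct.

```python
import struct

def f32(x):
    """Round a Python float (a C double) to the nearest float32 value."""
    return struct.unpack('f', struct.pack('f', x))[0]

center = f32(0.959878861904)   # response[y, x]
left = f32(0.438348740339)     # response[y, x-1]
right = f32(0.753262758255)    # response[y, x+1]

# Round after every operation, as float32 scalar arithmetic does.
half_diff = f32(f32(right - left) / 2)
drop = f32(center - min(left, right))
rounded_each_step = f32(half_diff * drop)

# Evaluate in double and round once, when the result is stored.
rounded_once = f32((right - left) / 2 * (center - min(left, right)))

print('%.8f' % rounded_each_step)
print('%.8f' % rounded_once)
```

With the question's inputs the two lines should read 0.08211858 and 0.08211857, the same last-digit split as in the two arrays above. Whether the compiled function2 really keeps extra precision in its intermediates would still have to be checked in the generated assembly.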
