image - CUDA floating point gives different results

Question

I am using CUDA 5 / VC 2008 to convert an image from color to grayscale.

The CUDA kernel is:

__global__ static void rgba_to_grayscale( const uchar4* const rgbaImage, unsigned char * const greyImage,
                                     int numRows, int numCols) 
{
    int pos = blockIdx.x * blockDim.x + threadIdx.x;
    if (pos < numRows * numCols) {
        uchar4 zz = rgbaImage[pos];
        float out = 0.299f * zz.x + 0.587f * zz.y + 0.114f * zz.z;
        greyImage[pos] = (unsigned char) out;
    }

}

The C++ function is:

inline unsigned char rgba_to_grayscale( uchar4 rgbaImage) 
{
    return (unsigned char) 0.299f * rgbaImage.x + 0.587f * rgbaImage.y + 0.114f * rgbaImage.z;
}

Both are called correctly, yet they produce different results.

Original image:

[image: the color image]

CUDA version:

[image: CUDA result]

Serial CPU version:

[image: serial result]

Can anyone explain why the results differ?

score 8 · Accepted Answer

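There is nothing wrong with your CUDA function; the CPU version is incorrect. A cast binds more tightly than multiplication and addition, so (unsigned char) applies only to the first term rather than to the whole sum. Strictly, it applies to 0.299f alone, which truncates to 0, so the rgbaImage.x term contributes nothing. Your code is equivalent to the following: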

inline unsigned char rgba_to_grayscale( uchar4 rgbaImage) 
{
    return ((unsigned char) 0.299f * rgbaImage.x) + 0.587f * rgbaImage.y + 0.114f * rgbaImage.z;
}

You have to cast the final result to unsigned char instead:

inline unsigned char rgba_to_grayscale( uchar4 rgbaImage) 
{
    return (unsigned char) (0.299f * rgbaImage.x + 0.587f * rgbaImage.y + 0.114f * rgbaImage.z);
}
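The difference between the two placements of the cast can be checked on a single pixel. The following is a minimal sketch, not the asker's actual program: Pixel4 is a hypothetical stand-in for CUDA's uchar4 so that it builds without the CUDA headers, and both function names are invented for illustration.

```cpp
// Hypothetical stand-in for CUDA's uchar4 (four unsigned 8-bit channels).
struct Pixel4 {
    unsigned char x, y, z, w;
};

// Cast written as in the question: it binds to 0.299f only.
// (unsigned char)0.299f is 0, so the x term is always 0.
inline unsigned char gray_cast_binds_first(Pixel4 p)
{
    return (unsigned char) 0.299f * p.x + 0.587f * p.y + 0.114f * p.z;
}

// Cast applied to the whole weighted sum, matching the CUDA kernel.
inline unsigned char gray_cast_whole_sum(Pixel4 p)
{
    return (unsigned char) (0.299f * p.x + 0.587f * p.y + 0.114f * p.z);
}
```

For a pixel with only the x channel set, {255, 0, 0, 255}, the first function returns 0 and the second returns 76. For a pixel with only the y channel set, {0, 255, 0, 255}, both return 149, because only the x term is affected by the misplaced cast.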

score 0 · Accepted Answer

@sga91 is almost there... but it seems the byte order is different as well.

inline unsigned char rgba_to_grayscale( uchar4 rgbaImage) 
{
    return (unsigned char) (0.299f * rgbaImage.z + 0.587f * rgbaImage.y + 0.114f * rgbaImage.x);
}

Note that x and z are swapped...

I do remember reading about this somewhere, but I can't find the reference right now...
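Whether x holds red or blue depends on how the pixel buffer was filled, which the question does not say. Some loaders store channels blue-first (OpenCV's imread, for example, returns BGR by default). The sketch below only illustrates how the same weights give different gray values under the two layouts. Channels4 and both function names are hypothetical, and the BGRA layout is an assumption, not something the question confirms.

```cpp
// Hypothetical stand-in for CUDA's uchar4; the meaning of x/y/z depends on the layout.
struct Channels4 {
    unsigned char x, y, z, w;
};

// Weights applied assuming an RGBA layout: x = red, y = green, z = blue.
inline unsigned char gray_assuming_rgba(Channels4 p)
{
    return (unsigned char) (0.299f * p.x + 0.587f * p.y + 0.114f * p.z);
}

// Weights applied assuming a BGRA layout: x = blue, y = green, z = red.
inline unsigned char gray_assuming_bgra(Channels4 p)
{
    return (unsigned char) (0.299f * p.z + 0.587f * p.y + 0.114f * p.x);
}
```

A pure red pixel stored blue-first is {0, 0, 255, 255}. Read with the BGRA assumption it maps to 76; read with the RGBA assumption it maps to 29, because the red value is weighted as if it were blue.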
