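I am trying to run a kernel with the CUDA API on a compute capability 1.3 GPU. Binding a one-dimensional array works as expected, but the following code produces an error: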

#include <cuda.h>
#include <stdio.h>
#include <stdlib.h>

#define checkCudaErrors(err)           __checkCudaErrors (err, __FILE__, __LINE__)

inline static void __checkCudaErrors( cudaError err, const char *file, const int line )     {

    if( cudaSuccess != err) {
        fprintf(stderr, "%s(%i) : CUDA Runtime API error %d: %s.\n", file, line, (int)err, cudaGetErrorString( err ) );
        exit(-1);
    }
}

texture<int, cudaTextureType2D> tex_transition;

int main ( void ) {

    int m = 8, p_size = 100, alphabet = 20;

    size_t pitch;

    int *transition = ( int * ) malloc ( ( m * p_size + 1 ) * alphabet * sizeof ( int ) );
    memset ( transition, -1, ( m * p_size + 1 ) * alphabet * sizeof ( int ) );

    int *d_transition;

    checkCudaErrors ( cudaMallocPitch ( &d_transition, &pitch, alphabet * sizeof ( int ), ( m * p_size + 1 ) ) );

    checkCudaErrors ( cudaMemcpy2D ( d_transition, pitch, transition, alphabet * sizeof ( int ), alphabet * sizeof ( int ), ( m * p_size + 1 ), cudaMemcpyHostToDevice ) );

    cudaChannelFormatDesc desc = cudaCreateChannelDesc<int>();
    checkCudaErrors ( cudaBindTexture2D ( 0, tex_transition, d_transition, desc, alphabet * sizeof ( int ), ( m * p_size + 1 ), pitch ) );

    cudaFree ( d_transition );

    return 0;
}

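Running it gives the error "test.cu(33): CUDA Runtime API error 11: invalid argument.". Setting alphabet to 10 makes the error go away. If I remember correctly, an array bound to a texture can be as large as 65000 x 65000 elements (ints in this case), but the transition array is far smaller than that.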

1 Answer

One of the arguments in your cudaBindTexture2D call is wrong. The texture dimensions are given in texels, not bytes, so the call should be:

cudaChannelFormatDesc desc = cudaCreateChannelDesc<int>();
cudaBindTexture2D ( 0, 
                    tex_transition, 
                    d_transition, desc, 
                    alphabet,            // in texels
                    ( m * p_size + 1 ),  // in texels
                    pitch );

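The width in bytes is only needed in the allocation and copy calls (cudaMallocPitch and cudaMemcpy2D). The texture binding uses the pitch argument, which is in bytes, to work out the memory layout of the 2D allocation.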
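
To make the bytes-versus-texels distinction concrete, here is a minimal host-side sketch in plain C of the arithmetic involved. The helper names (width_in_texels, texel_offset, row_fits_in_pitch) are hypothetical and not part of the CUDA API, and the last check is only a sanity test of my own, not the runtime's documented validation rule.

```c
#include <assert.h>
#include <stddef.h>

/* Convert the byte width used for allocation and copy into the
   texel width expected by the texture binding call. */
size_t width_in_texels(size_t width_in_bytes, size_t texel_size)
{
    /* The byte width must be a whole number of texels. */
    assert(texel_size != 0 && width_in_bytes % texel_size == 0);
    return width_in_bytes / texel_size;
}

/* Byte offset of texel (row, col) in a pitched 2D allocation:
   rows are pitch_in_bytes apart, columns are texel_size apart. */
size_t texel_offset(size_t row, size_t col,
                    size_t pitch_in_bytes, size_t texel_size)
{
    return row * pitch_in_bytes + col * texel_size;
}

/* Sanity check (an assumption, not the CUDA runtime's own rule):
   one row of width_texels texels must fit inside the pitch.
   Returns 1 if it fits, 0 otherwise. */
int row_fits_in_pitch(size_t width_texels, size_t texel_size,
                      size_t pitch_in_bytes)
{
    return width_texels * texel_size <= pitch_in_bytes;
}
```

With alphabet = 20 and 4-byte ints, the allocation width is 80 bytes, which is 20 texels. Passing 80 as the texture width instead describes rows of 320 bytes. Whether that exceeds the pitch depends on the value cudaMallocPitch returned, which the question does not state. With a hypothetical pitch of 256 bytes, row_fits_in_pitch(20, 4, 256) returns 1 and row_fits_in_pitch(80, 4, 256) returns 0.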

answered 2012-11-28T16:37:59.233