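I am trying to run a kernel on a compute capability 1.3 GPU using the CUDA runtime API. Binding a one-dimensional array to a texture works as expected, but the following code produces an error: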
#include <cuda.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>  // memset

#define checkCudaErrors(err) __checkCudaErrors (err, __FILE__, __LINE__)

// Abort with file and line information if a CUDA runtime call fails.
inline static void __checkCudaErrors( cudaError err, const char *file, const int line ) {
    if( cudaSuccess != err) {
        fprintf(stderr, "%s(%i) : CUDA Runtime API error %d: %s.\n", file, line, (int)err, cudaGetErrorString( err ) );
        exit(-1);
    }
}

// 2D texture reference for the transition array.
texture<int, cudaTextureType2D> tex_transition;

int main ( void ) {
    int m = 8, p_size = 100, alphabet = 20;
    size_t pitch;

    // Host array: ( m * p_size + 1 ) rows of alphabet ints, every byte set to 0xFF so each int reads as -1.
    int *transition = ( int * ) malloc ( ( m * p_size + 1 ) * alphabet * sizeof ( int ) );
    memset ( transition, -1, ( m * p_size + 1 ) * alphabet * sizeof ( int ) );

    // Pitched device allocation, then a row-by-row copy of the host array into it.
    int *d_transition;
    checkCudaErrors ( cudaMallocPitch ( &d_transition, &pitch, alphabet * sizeof ( int ), ( m * p_size + 1 ) ) );
    checkCudaErrors ( cudaMemcpy2D ( d_transition, pitch, transition, alphabet * sizeof ( int ), alphabet * sizeof ( int ), ( m * p_size + 1 ), cudaMemcpyHostToDevice ) );

    // Bind the pitched allocation to the 2D texture reference.
    cudaChannelFormatDesc desc = cudaCreateChannelDesc<int>();
    checkCudaErrors ( cudaBindTexture2D ( 0, tex_transition, d_transition, desc, alphabet * sizeof ( int ), ( m * p_size + 1 ), pitch ) );

    cudaFree ( d_transition );
    return 0;
}
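Running it fails with "test.cu(33): CUDA Runtime API error 11: invalid argument.". If I set alphabet to 10, the error disappears. If I remember correctly, a 2D array bound to a texture can be as large as 65000 x 65000 elements (ints in this case), but the transition array is much smaller than that.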
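One thing that may be worth checking, offered as a possible explanation rather than a confirmed diagnosis: cudaBindTexture2D takes its width and height in texels and only its pitch in bytes, while the posted call passes alphabet * sizeof ( int ) as the width. The host-only sketch below works through that arithmetic. The helpers row_bytes and row_fits are hypothetical names introduced for illustration, and the 256-byte pitch is an assumption, since the value cudaMallocPitch actually returns depends on the device. It also assumes the runtime rejects a bind whose rows are wider than the pitch.

```cpp
#include <cstddef>

// Hypothetical helper: bytes spanned by one row that is `width_texels`
// texels wide, where each texel occupies `texel_bytes` bytes.
inline std::size_t row_bytes(std::size_t width_texels, std::size_t texel_bytes) {
    return width_texels * texel_bytes;
}

// Hypothetical helper: true when such a row fits inside a pitch given in bytes.
// This mirrors the check the runtime is assumed to perform when binding.
inline bool row_fits(std::size_t width_texels, std::size_t texel_bytes,
                     std::size_t pitch_bytes) {
    return row_bytes(width_texels, texel_bytes) <= pitch_bytes;
}

// Assumed pitch in bytes; print the real value returned by cudaMallocPitch
// to confirm it on a given device.
const std::size_t kAssumedPitch = 256;

// Width as passed in the posted code: alphabet * sizeof(int), read as texels.
// alphabet = 20 -> 80 texels -> 320 bytes per row, wider than 256.
// alphabet = 10 -> 40 texels -> 160 bytes per row, still within 256.
inline bool posted_width_fits(std::size_t alphabet) {
    return row_fits(alphabet * 4, 4, kAssumedPitch);
}

// Width passed in texels: alphabet = 20 -> 20 texels -> 80 bytes per row.
inline bool texel_width_fits(std::size_t alphabet) {
    return row_fits(alphabet, 4, kAssumedPitch);
}
```

Under these assumptions, posted_width_fits(20) is false and posted_width_fits(10) is true, which matches the behaviour described above. If that is indeed the cause, passing alphabet instead of alphabet * sizeof ( int ) as the width argument of cudaBindTexture2D should let the bind succeed for both values.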