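I have exactly the same problem as the one described in the post "CUDA Error on cudaBindTexture2D".
I even get the same error:
"error 18: invalid texture reference", and I also see that "no error is thrown on cudaMalloc, only on cudaBindTexture".
Unfortunately, for someone like me who has just started using CUDA, the way the poster (Anton Roth) answered his own question is a bit too cryptic:
"The answer was in the comments: I used an sm that was not compatible with my GPU."
"Not compatible with my GPU" makes sense, because the sample program FluidsGL (listed as "Fluids (OpenGL Version)" in the NVIDIA CUDA Samples Browser) fails on my laptop but runs fine on my work desktop. Unfortunately, I still don't know what "in the comments" refers to, or even how to check GPU SM compatibility.
Here is the code that seems to cause the problem: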
#define DIM 512
In main:
setupTexture(DIM, DIM);
bindTexture();
In fluidsGL_kernels.cu:
texture<float2, 2> texref;
static cudaArray *array = NULL;
void setupTexture(int x, int y)
{
// Wrap mode appears to be the new default
texref.filterMode = cudaFilterModeLinear;
cudaChannelFormatDesc desc = cudaCreateChannelDesc<float2>();
cudaMallocArray(&array, &desc, y, x);
getLastCudaError("cudaMalloc failed");
}
void bindTexture(void)
{
cudaBindTextureToArray(texref, array); // this call itself does not report the error; error 18 is caught by the check below
getLastCudaError("cudaBindTexture failed");
}
Hardware info
Here is the output of deviceQuery:
Device 0: "GeForce 9800M GS"
CUDA Driver Version / Runtime Version 5.0 / 5.0
CUDA Capability Major/Minor version number: 1.1
Total amount of global memory: 1024 MBytes (1073741824 bytes)
( 8) Multiprocessors x ( 8) CUDA Cores/MP: 64 CUDA Cores
GPU Clock rate: 1325 MHz (1.32 GHz)
Memory Clock rate: 799 Mhz
Memory Bus Width: 256-bit
Max Texture Dimension Size (x,y,z) 1D=(8192), 2D=(65536,32768), 3D=(2048,2048,2048)
Max Layered Texture Size (dim) x layers 1D=(8192) x 512, 2D=(8192,8192) x 512
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 16384 bytes
Total number of registers available per block: 8192
Warp size: 32
Maximum number of threads per multiprocessor: 768
Maximum number of threads per block: 512
Maximum sizes of each dimension of a block: 512 x 512 x 64
Maximum sizes of each dimension of a grid: 65535 x 65535 x 1
Maximum memory pitch: 2147483647 bytes
Texture alignment: 256 bytes
Concurrent copy and kernel execution: Yes with 1 copy engine(s)
Run time limit on kernels: Yes
Integrated GPU sharing Host Memory: No
Support host page-locked memory mapping: Yes
Alignment requirement for Surfaces: Yes
Device has ECC support: Disabled
CUDA Device Driver Mode (TCC or WDDM): WDDM (Windows Display Driver Model)
Device supports Unified Addressing (UVA): No
Device PCI Bus ID / PCI location ID: 8 / 0
Compute Mode:
< Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 5.0, CUDA Runtime Version = 5.0, NumDevs = 1, Device0 = GeForce 9800M GS
I know my GPU is a bit old, but it still runs most of the samples just fine.
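My best guess at what an "incompatible sm" means, written as a minimal host-side C++ sketch. It encodes the rule as I understand it from the CUDA documentation: binary code built for sm_XY runs only on devices with the same major version and an equal or higher minor version, while PTX built for compute_XY can be JIT-compiled for any device of equal or higher compute capability. The struct and function names are hypothetical helpers, not CUDA API, and the 2.0 target is only an example value; I do not know which target the sample was actually built for. The device value 1.1 is taken from the deviceQuery output above.

```cpp
#include <cassert>

// Hypothetical helper type (not part of the CUDA API): a compute capability
// such as 1.1, as printed by deviceQuery.
struct ComputeCapability {
    int ccMajor;
    int ccMinor;
};

// Binary (cubin, sm_XY) code: assumed to run only when the major versions
// match and the device's minor version is at least the target's.
constexpr bool canRunBinary(ComputeCapability device, ComputeCapability target) {
    return device.ccMajor == target.ccMajor && device.ccMinor >= target.ccMinor;
}

// PTX (compute_XY) code: assumed to be JIT-compilable for any device whose
// compute capability is equal to or higher than the target's.
constexpr bool canRunPtx(ComputeCapability device, ComputeCapability target) {
    return device.ccMajor > target.ccMajor ||
           (device.ccMajor == target.ccMajor && device.ccMinor >= target.ccMinor);
}

// Device from the deviceQuery output: GeForce 9800M GS, capability 1.1.
constexpr ComputeCapability kLaptopGpu{1, 1};

// Example targets only: a build for 1.0 should load, a build for 2.0 should not.
static_assert(canRunBinary(kLaptopGpu, ComputeCapability{1, 0}), "sm_10 binary fits a 1.1 device");
static_assert(!canRunBinary(kLaptopGpu, ComputeCapability{2, 0}), "sm_20 binary does not fit");
static_assert(!canRunPtx(kLaptopGpu, ComputeCapability{2, 0}), "compute_20 PTX does not fit either");
```

If this reading is right, a project built only for a target above 1.1 would contain no code my laptop GPU can load, which would be consistent with the texture reference being reported as invalid at bind time.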