cuda - CUDA error on cudaBindTexture

Question

I have exactly the same problem as the one described in the post: CUDA error on cudaBindTexture2D

I even get the same error:

"Error 18: invalid texture reference." I also see the same behavior: the error "isn't thrown on cudaMalloc, only on cudaBindTexture".

Unfortunately, for someone like me who is just getting started with CUDA, the way the poster (Anton Roth) answered his own question is a bit too cryptic:

The answer was in the comments: I used an sm that my GPU is not compatible with.

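"Not compatible with my GPU" makes sense, because the sample program FluidsGL (called "Fluids (OpenGL Version)" in the NVIDIA CUDA Samples Browser) fails on my laptop but runs fine on my work desktop. Unfortunately, I still don't know what "in the comments" refers to, or even how to check which SM versions my GPU is compatible with.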

Here is the code that seems to cause the problem:

#define DIM 512

In main:

setupTexture(DIM, DIM);
bindTexture();

In fluidsGL_kernels.cu:

texture<float2, 2> texref;
static cudaArray *array = NULL;

void setupTexture(int x, int y)
{
    // Wrap mode appears to be the new default
    texref.filterMode = cudaFilterModeLinear;
    cudaChannelFormatDesc desc = cudaCreateChannelDesc<float2>();

    cudaMallocArray(&array, &desc, y, x);
    getLastCudaError("cudaMalloc failed");
}

void bindTexture(void)
{
    cudaBindTextureToArray(texref, array);//this function itself doesn't throw the error but error 18 is caught by the function below
    getLastCudaError("cudaBindTexture failed");
}

Hardware info

Here is the output of deviceQuery:

Device 0: "GeForce 9800M GS"
  CUDA Driver Version / Runtime Version          5.0 / 5.0
  CUDA Capability Major/Minor version number:    1.1
  Total amount of global memory:                 1024 MBytes (1073741824 bytes)
  ( 8) Multiprocessors x (  8) CUDA Cores/MP:    64 CUDA Cores
  GPU Clock rate:                                1325 MHz (1.32 GHz)
  Memory Clock rate:                             799 Mhz
  Memory Bus Width:                              256-bit
  Max Texture Dimension Size (x,y,z)             1D=(8192), 2D=(65536,32768), 3D=(2048,2048,2048)
  Max Layered Texture Size (dim) x layers        1D=(8192) x 512, 2D=(8192,8192) x 512
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       16384 bytes
  Total number of registers available per block: 8192
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  768
  Maximum number of threads per block:           512
  Maximum sizes of each dimension of a block:    512 x 512 x 64
  Maximum sizes of each dimension of a grid:     65535 x 65535 x 1
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             256 bytes
  Concurrent copy and kernel execution:          Yes with 1 copy engine(s)
  Run time limit on kernels:                     Yes
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  CUDA Device Driver Mode (TCC or WDDM):         WDDM (Windows Display Driver Model)
  Device supports Unified Addressing (UVA):      No
  Device PCI Bus ID / PCI location ID:           8 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 5.0, CUDA Runtime Version = 5.0, NumDevs = 1, Device0 = GeForce 9800M GS

I know my GPU is somewhat old, but it still runs most of the samples just fine.

score 1 · Accepted Answer

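You need to compile the code for the correct architecture (as explained in the post you linked).

Since you have a CC 1.1 device (see "CUDA Capability Major/Minor version number" in your deviceQuery output), use the following nvcc compile option: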

-gencode arch=compute_11,code=sm_11
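
To make the mapping from compute capability to compiler option explicit, here is a minimal host-side C++ sketch. The helper name `gencodeFlag` is hypothetical (it is not part of the CUDA toolkit); it only formats the string shown above from the major/minor numbers that deviceQuery prints, or that `cudaGetDeviceProperties` reports in the `major` and `minor` fields of `cudaDeviceProp`.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: builds the nvcc -gencode option for a device whose
// compute capability is major.minor (e.g. 1.1 -> compute_11 / sm_11).
// Assumes single-digit major and minor numbers, as in the deviceQuery output.
std::string gencodeFlag(int major, int minor) {
    const std::string cc = std::to_string(major) + std::to_string(minor);
    return "-gencode arch=compute_" + cc + ",code=sm_" + cc;
}
```

Calling `gencodeFlag(1, 1)` yields the option given above for the GeForce 9800M GS.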

The default Visual Studio project or Makefile may not compile for the correct architecture, so always make sure that it does.
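
As a rough illustration of what "the correct architecture" means, the sketch below encodes the general compatibility rule as I understand it from the CUDA documentation: binary code built for `sm_XY` runs only on devices with the same major version and an equal or higher minor version, while PTX built for `compute_XY` can be JIT-compiled for any device of equal or higher compute capability. The `Target` struct and `canRunOn` function are hypothetical names for this example, and the rule is simplified.

```cpp
#include <cassert>
#include <vector>

// Hypothetical description of one -gencode entry used in a build.
struct Target {
    int cc;      // compute capability as major * 10 + minor, e.g. 11 or 20
    bool isPtx;  // true for code=compute_XY (PTX), false for code=sm_XY (binary)
};

// Returns true if at least one target can run on a device with the given
// compute capability (simplified rule, see the assumptions above).
bool canRunOn(const std::vector<Target>& targets, int deviceMajor, int deviceMinor) {
    const int deviceCc = deviceMajor * 10 + deviceMinor;
    for (const Target& t : targets) {
        if (t.isPtx) {
            // PTX is forward compatible with any newer or equal device.
            if (t.cc <= deviceCc) return true;
        } else {
            // Binary code needs the same major and an equal or lower minor.
            if (t.cc / 10 == deviceMajor && t.cc % 10 <= deviceMinor) return true;
        }
    }
    return false;
}
```

Under these assumptions, a build that contains only sm_20 and sm_30 binaries gives `canRunOn(..., 1, 1) == false`, which would be consistent with a sample running on a newer desktop GPU but failing on a CC 1.1 laptop GPU.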

For Visual Studio, see this answer: https://stackoverflow.com/a/14413360/1043187

For a Makefile, it depends. The CUDA SDK samples usually have a GENCODE_FLAGS variable that you can modify.
