c++ - CUDA runtime error when copying to the elements of a jagged array

Question

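On the host, I have a jagged array implemented as a vector of vectors of ints.

To set up a jagged array on the device, I first allocate a pointer to pointers to int: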

int **   adjlist;    // host pointer
int ** d_adjlist;    // device pointer

Just to clarify some terminology, I will call the array of pointers adjlist the "base", and the arrays that each adjlist[i] points to the "teeth".

// this is the width of the base
const int ens_size = 12;    

// allocate the base on the device
cutilSafeCall( cudaMalloc( (void***)&d_adjlist, ens_size*sizeof(int*) ) );

// to store the contents of base on host (I can't cudaMalloc the teeth directly, as that would require dereferencing a pointer to device memory)
adjlist = static_cast<int**>( malloc( ens_size*sizeof(int*) ) );

// copy the contents of base from the device to the host
cutilSafeCall( cudaMemcpy( adjlist, d_adjlist, ens_size*sizeof(int*), cudaMemcpyDeviceToHost) );

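All of this works fine, and the base is now set up. The original vector of vectors I mentioned at the start is stored in nets[i]->adjlist. I now allocate the teeth with the following loop: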

int N = 6;
int numNets = 2;

for(int i=0; i < numNets; ++i)
{
    for(int j=0; j < N; ++j)
    {
        k = nets[i]->adjlist[j].size();

        // allocate the "teeth" of the adjacency list
        cutilSafeCall( cudaMalloc( (void**)&(adjlist[N*i+j]), k ) );
    }
 }
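One thing that may be worth checking in the loop above: the size argument of cudaMalloc is a byte count, so passing k reserves k bytes rather than room for k ints. The following is a minimal host-only sketch of that arithmetic, not a confirmed diagnosis. The helper names tooth_bytes and copy_fits are hypothetical and are not part of CUDA or of the original code.

```cpp
#include <cstddef>

// Hypothetical helper: number of bytes needed to store one tooth of k ints.
inline std::size_t tooth_bytes(std::size_t k) {
    return k * sizeof(int);
}

// Hypothetical helper: true when a copy of copy_bytes bytes fits inside
// an allocation of alloc_bytes bytes.
inline bool copy_fits(std::size_t alloc_bytes, std::size_t copy_bytes) {
    return copy_bytes <= alloc_bytes;
}
```

Under this assumption, an allocation of k bytes followed by a copy of sizeof(int)*k bytes does not fit whenever k > 0, whereas allocating tooth_bytes(k) bytes would.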

My problem arises when I copy the teeth from the vector of vectors to the teeth on the device. Here is the code:

// this holds the tooth to be copied to the device
int h_adjlist[Kmax];    // k <= Kmax

for(int i=0; i < numNets; ++i)
{
    for(int j=0; j < N; ++j)
    {
        k = nets[i]->adjlist[j].size();

        // copy the adjacency list of the (Ni+j)-th node
        copy( nets[i]->adjlist[j].begin(), nets[i]->adjlist[j].end(), h_adjlist );

        cutilSafeCall( cudaMemcpy( adjlist[N*i+j], 
                                   h_adjlist, 
                                   sizeof(int)*k,  
                                   cudaMemcpyHostToDevice ) );

    }
}
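As an alternative to one allocation and one copy per tooth, a jagged array is often flattened into a single contiguous buffer plus a table of row offsets (a CSR-style layout). This is a different technique from the pointer-of-pointers approach used above, shown only as a host-side sketch. The names FlatAdjacency and flatten are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical container for a flattened jagged array (CSR-style layout).
struct FlatAdjacency {
    std::vector<int> offsets;  // row r occupies values[offsets[r]] up to values[offsets[r + 1] - 1]
    std::vector<int> values;   // all teeth stored back to back
};

// Flatten a vector of vectors into one contiguous buffer plus row offsets.
inline FlatAdjacency flatten(const std::vector<std::vector<int>>& rows) {
    FlatAdjacency flat;
    flat.offsets.reserve(rows.size() + 1);
    flat.offsets.push_back(0);
    for (const std::vector<int>& row : rows) {
        // Append this tooth to the end of the shared buffer.
        flat.values.insert(flat.values.end(), row.begin(), row.end());
        // Record where the next tooth starts.
        flat.offsets.push_back(static_cast<int>(flat.values.size()));
    }
    return flat;
}
```

With this layout, each of the two vectors could be sent to the device with a single cudaMemcpy of size() * sizeof(int) bytes, and no device pointers would need to be stored on the host.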

When I try to run the code, I get a "Runtime API error: invalid argument." error at the following line:

                                   cudaMemcpyHostToDevice ) );

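At least, that is the line where the cudaSafeCall function says the error occurred.

Why is this flagged as an invalid argument? Or, if a different argument is the problem, which one is it?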
