I have the following __global__ kernel:
__global__ void pdegpu(PDE_ParabolicD1_Num_GPU **pdes)
{
    PDE_ParabolicD1_Num_GPU *loc;
    loc = new PDE_ParabolicD1_Num_GPU();
    loc->Setup();
    delete loc;
    // the code above is just an example to show that new and delete work fine
    *pdes = new PDE_ParabolicD1_Num_GPU(); // error occurs here
    (*pdes)->Setup();
}
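I call it to create an object of type PDE_ParabolicD1_Num_GPU and set it up. I want to use that same object later in main(), which is why the kernel takes a double pointer as its parameter. In main() I do the following: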
PDE_ParabolicD1_Num_GPU pdes_host;
PDE_ParabolicD1_Num_GPU *pdes_dev=0;
pdegpu<<<1,1>>>(&pdes_dev);
cudaStatus = cudaMemcpy(&pdes_host, pdes_dev, sizeof(PDE_ParabolicD1_Num_GPU), cudaMemcpyDeviceToHost);
...
delete [] pdes_dev;
However, I get an error at the line marked in the kernel above. The CUDA Memory Checker output for it is:
Memory Checker detected 1 access violations.
error = access violation on store (global memory)
gridid = 16
blockIdx = {0,0,0}
threadIdx = {0,0,0}
address = 0x0018f420
accessSize = 4
error MemoryChecker: #misaligned=0 #invalidAddress=1
As far as I understand, the error is caused by an invalid address.
Can anyone help me solve this? Thanks.
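One possible reading of the checker output, offered as a sketch rather than a confirmed diagnosis: &pdes_dev is the address of a variable in host memory, so the store *pdes = ... inside the kernel writes through a pointer the device cannot access. The example below shows one way to give the kernel a pointer slot that lives in device memory instead. Widget, create, export_and_destroy and fetch_widget are hypothetical names introduced only for illustration, since the definition of PDE_ParabolicD1_Num_GPU is not shown. It also assumes that memory obtained with in-kernel new cannot be used with cudaMemcpy or cudaFree, so the object is copied into a cudaMalloc'ed staging buffer and released with a device-side delete.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical stand-in for PDE_ParabolicD1_Num_GPU (its definition is not in the post).
struct Widget {
    int value;
    __device__ void Setup() { value = 42; }
};

// Allocates the object on the device heap and stores its address in a slot
// that itself lives in device global memory (allocated with cudaMalloc).
__global__ void create(Widget **slot)
{
    *slot = new Widget();
    if (*slot) (*slot)->Setup();
}

// Copies the object into a cudaMalloc'ed staging buffer, then frees it on the
// device: an object created with in-kernel new is released with in-kernel delete.
__global__ void export_and_destroy(Widget **slot, Widget *staging)
{
    if (*slot) {
        *staging = **slot;
        delete *slot;
        *slot = nullptr;
    }
}

// Host helper: returns 0 on success and fills *out with the device-built object.
int fetch_widget(Widget *out)
{
    Widget **d_slot = nullptr;   // device memory holding one Widget*
    Widget *d_staging = nullptr; // device memory holding one Widget
    if (cudaMalloc(&d_slot, sizeof(Widget *)) != cudaSuccess) return 1;
    if (cudaMalloc(&d_staging, sizeof(Widget)) != cudaSuccess) {
        cudaFree(d_slot);
        return 1;
    }
    cudaMemset(d_slot, 0, sizeof(Widget *));
    cudaMemset(d_staging, 0, sizeof(Widget));

    create<<<1, 1>>>(d_slot);                        // pass a device address, not &host_variable
    export_and_destroy<<<1, 1>>>(d_slot, d_staging);

    cudaError_t err = cudaDeviceSynchronize();       // surfaces kernel execution errors
    if (err == cudaSuccess)
        err = cudaMemcpy(out, d_staging, sizeof(Widget), cudaMemcpyDeviceToHost);

    cudaFree(d_staging);
    cudaFree(d_slot);
    return err == cudaSuccess ? 0 : 1;
}

int main()
{
    Widget host_copy;
    if (fetch_widget(&host_copy) != 0) {
        std::printf("CUDA error\n");
        return 1;
    }
    std::printf("value = %d\n", host_copy.value);
    return 0;
}
```

Under the same assumption, the delete [] pdes_dev; on the host side would also need to go, since the host cannot free a device-heap allocation and the object was not created with new[].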