c++ - Function keywords for CUDA unified memory allocation

Question

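I am starting out with CUDA programming, and as a first step toward implementing a particle integrator I wrote an integrator class that holds data about the particles and should be able to integrate them. The data comes from another container class, and I want to allocate it in unified memory. For this purpose I have a member function "_allocate", which does nothing but call cudaMallocManaged for the member variables. Now I am wondering which function keyword (execution space specifier) I should wrap this function in.

I read that you cannot use __global__ in a class definition, so for now I use both __host__ and __device__, since unified memory should be accessible from both host and device, but I am not sure this is the right approach.

Here is the class I want to implement this in: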


template <typename T>
class Leapfrog : public Integrator<T> {
  public:

   ...

  private:
    T *positions; 
    T *masses; 
    T *velocities; 
    T *types; 
    __device__ __host__ bool _allocate();
    __device__ __host__ bool _free();
    __device__ __host__ bool _load_data();
};

// allocates space on the unified memory for the 
// private variables positions, masses, velocities, types

template <typename T>
__host__ __device__ void Leapfrog<T>::_allocate(){
  cudaMallocManaged(&positions, particleset.N*3*sizeof(T));
  cudaMallocManaged(&masses, particleset.N*sizeof(T));
  cudaMallocManaged(&velocities, particleset.N*3*sizeof(T));
  cudaMallocManaged(&types, particleset.N*sizeof(T));
}

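I don't know whether this is related to the function keyword, but I would also like to check the cudaError after the allocation to see whether it succeeded.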

score 0 · Accepted Answer

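Every callable that is only called on the device should be marked __device__. If it is only called on the host, mark it __host__.

Use __host__ __device__ only for callables that will be called on both the host and the device.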
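As a rough illustration of the three cases, here is a minimal sketch. The function names are hypothetical and not part of the question's code, and the fallback macros are an assumption added only so the same file also builds with a plain host compiler; with nvcc the specifiers are built in.

```cpp
// With nvcc these specifiers are built in. The empty fallbacks below are an
// assumption of this sketch so it also compiles with a plain C++ compiler.
#ifndef __CUDACC__
#define __host__
#define __device__
#endif

// Host only (hypothetical helper): this is the place for host-only runtime
// calls such as cudaMallocManaged. Leaving the specifier off means the same.
__host__ int coordinate_count(int n_particles) { return n_particles * 3; }

// Device only (hypothetical helper): callable from kernels and from other
// device functions, not from host code when built with nvcc.
__device__ float kinetic_energy(float mass, float speed) {
  return 0.5f * mass * speed * speed;
}

// Host and device (hypothetical helper): compiled for both sides, so the body
// may only use operations that are valid on both.
__host__ __device__ float drift(float x, float v, float dt) {
  return x + v * dt;
}
```

A function like drift is a reasonable candidate for __host__ __device__ because it is pure arithmetic; a function that calls the runtime allocation API is not.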

cudaMallocManaged is host-only code:

__host__ cudaError_t cudaMallocManaged ( void** devPtr, size_t size, unsigned int flags = cudaMemAttachGlobal )
Allocates memory that will be automatically managed by the Unified Memory system.

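So your code can only run on the host: declare _allocate as __host__ (or leave the specifier off, since host is the default) rather than __host__ __device__.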
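For the second part of the question, checking the cudaError, one possible shape is sketched below. ManagedParticles is a hypothetical, simplified stand-in for the Leapfrog class, and the block under #else is an assumption of this sketch: a small malloc-based substitute so the file also builds without the CUDA toolkit. With nvcc the real runtime API from cuda_runtime.h is used.

```cpp
#include <cstddef>
#include <cstdio>

#ifdef __CUDACC__
#include <cuda_runtime.h>
#else
// Assumption for this sketch only: a minimal stand-in for the few runtime
// calls used below, so the example builds with a plain C++ compiler.
#include <cstdlib>
enum cudaError_t { cudaSuccess = 0, cudaErrorMemoryAllocation = 2 };
inline const char *cudaGetErrorString(cudaError_t e) {
  return e == cudaSuccess ? "no error" : "out of memory";
}
template <typename T>
cudaError_t cudaMallocManaged(T **ptr, std::size_t size) {
  *ptr = static_cast<T *>(std::malloc(size));
  return *ptr ? cudaSuccess : cudaErrorMemoryAllocation;
}
inline cudaError_t cudaFree(void *ptr) {
  std::free(ptr);
  return cudaSuccess;
}
#endif

// Hypothetical, simplified stand-in for the Leapfrog class in the question.
template <typename T>
class ManagedParticles {
 public:
  explicit ManagedParticles(std::size_t n) : n_(n) {}
  ~ManagedParticles() { release(); }
  ManagedParticles(const ManagedParticles &) = delete;
  ManagedParticles &operator=(const ManagedParticles &) = delete;

  // No execution space specifier: host-only, as cudaMallocManaged requires.
  // Returns false as soon as one allocation fails.
  bool allocate() {
    return ok(cudaMallocManaged(&positions, n_ * 3 * sizeof(T)), "positions") &&
           ok(cudaMallocManaged(&masses, n_ * sizeof(T)), "masses") &&
           ok(cudaMallocManaged(&velocities, n_ * 3 * sizeof(T)), "velocities") &&
           ok(cudaMallocManaged(&types, n_ * sizeof(T)), "types");
  }

  // Host-only as well. Passing a null pointer to cudaFree is a no-op.
  void release() {
    cudaFree(positions);
    cudaFree(masses);
    cudaFree(velocities);
    cudaFree(types);
    positions = masses = velocities = types = nullptr;
  }

  T *positions = nullptr;
  T *masses = nullptr;
  T *velocities = nullptr;
  T *types = nullptr;

 private:
  // Turns the returned cudaError_t into a bool and reports failures.
  static bool ok(cudaError_t err, const char *what) {
    if (err == cudaSuccess) return true;
    std::fprintf(stderr, "cudaMallocManaged(%s) failed: %s\n", what,
                 cudaGetErrorString(err));
    return false;
  }

  std::size_t n_;
};
```

Host code would construct the object, call allocate() and check the returned bool before launching any kernel. The pointers themselves can then be passed to kernels, since managed memory is accessible from both host and device even though the allocation call is not.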
