cuda - Synchronizing CUDA kernels

Question

Hi, I have a question about programming in CUDA. I have the following code:

int main () {

    for (;;) {
        kernel_1 (x1, x2, ....);
        kernel_2 (x1, x2 ...);
        kernel_3_Reduction (x1);

        // Host-side manipulation of host_x1:
        // copy x1 from the device to the host
        cpy (host_x1, x1, DeviceToHost);
        cpu_code_x1_manipulation;
        kernel_ (x1, x2, ....);
    }

}

So how can I make sure that kernel_1, kernel_2 and kernel_3 have finished their work by the time the copy is made?

score 11 · Accepted Answer

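All operations launched on the same stream execute in order, one after another. A kernel launch returns to the host immediately, but the device does not start an operation until the previous one on that stream has finished. In the code above every kernel goes to the default stream, so the kernels run one after the other, and the copy (assuming cpy is a plain cudaMemcpy) starts only after kernel_3_Reduction has completed. If you need kernel_1 and kernel_2 to run in parallel, you have to specify streams explicitly.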
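As a minimal sketch of that ordering, the program below launches two kernels on the default stream and then copies the result back with cudaMemcpy. The kernels add_one and times_two are hypothetical stand-ins for kernel_1 and kernel_2 from the question, and the array size is an arbitrary choice.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernels standing in for kernel_1 and kernel_2.
__global__ void add_one(float *x, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] += 1.0f;
}

__global__ void times_two(float *x, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] *= 2.0f;
}

int main() {
    const int n = 1024;
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;

    float host_x1[n];
    float *x1 = nullptr;
    if (cudaMalloc(&x1, n * sizeof(float)) != cudaSuccess) return 1;
    cudaMemset(x1, 0, n * sizeof(float));  // all elements start at 0.0f

    // Both launches go to the default stream. Neither call blocks the host,
    // but times_two does not start on the device until add_one has finished.
    add_one<<<blocks, threads>>>(x1, n);
    times_two<<<blocks, threads>>>(x1, n);

    // cudaMemcpy is ordered after the kernels on the same stream and returns
    // only once the data has arrived in host_x1.
    cudaError_t err = cudaMemcpy(host_x1, x1, n * sizeof(float),
                                 cudaMemcpyDeviceToHost);
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Each element should now hold (0 + 1) * 2 if the kernels ran in order.
    printf("host_x1[0] = %f\n", host_x1[0]);

    cudaFree(x1);
    return 0;
}
```

No explicit synchronization call is needed here, because the blocking copy already waits for the earlier work on the stream.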

score 4 · Accepted Answer

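Use cudaDeviceSynchronize(); only at the points where you want to make sure that all kernels have finished. The call blocks the host until then, so once it returns you can assume that all kernels and all pending device function calls have completed.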
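One case where the call is genuinely needed is when kernels run on several streams and the host reads the results directly, with no blocking copy in between. The sketch below assumes managed memory and a hypothetical fill kernel; neither appears in the question.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: writes the same value into every element.
__global__ void fill(int *x, int n, int value) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] = value;
}

int main() {
    const int n = 1 << 20;
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;

    // Managed memory is accessible from both host and device.
    int *a = nullptr, *b = nullptr;
    if (cudaMallocManaged(&a, n * sizeof(int)) != cudaSuccess) return 1;
    if (cudaMallocManaged(&b, n * sizeof(int)) != cudaSuccess) return 1;

    // Two explicit streams: the two launches may overlap on the device.
    cudaStream_t s1, s2;
    cudaStreamCreate(&s1);
    cudaStreamCreate(&s2);

    fill<<<blocks, threads, 0, s1>>>(a, n, 1);
    fill<<<blocks, threads, 0, s2>>>(b, n, 2);

    // Block the host until all work on all streams has finished.
    // The return value also reports errors from the asynchronous launches.
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Only now is it safe for the host to read a and b.
    printf("a[0] = %d, b[n-1] = %d\n", a[0], b[n - 1]);

    cudaStreamDestroy(s1);
    cudaStreamDestroy(s2);
    cudaFree(a);
    cudaFree(b);
    return 0;
}
```

If only one stream has to finish, cudaStreamSynchronize on that stream is a narrower alternative that does not wait for unrelated work.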
