c - Freeing CUDA memory is painfully slow

Question

I allocate memory on the GPU with cudaMalloc((void**)&(storage->data), size * sizeof(float)). At the end of my program, I free it with cudaFree(storage->data);.

The problem is that the first free is really slow, around 10 seconds, while the others are almost instantaneous.

My question is: what could cause this difference? Is freeing memory on the GPU usually that slow?

score 3 · Accepted Answer

As pointed out on the NVIDIA forums, this is almost certainly a problem with the way you are timing the call rather than with cudaFree itself.

score 1 · Accepted Answer

It should not be that slow; on Linux with CUDA 2.2 it takes a fraction of a second. Have you tried running the host and device profilers to see exactly why it is slow? How many separate allocations do you perform? That does carry some penalty, but not one that large.
