If my concurrency::array_view is being operated on inside a concurrency::parallel_for_each loop, my understanding is that I can carry on with other tasks on the CPU while the loop is executing:
using namespace Concurrency;
array_view<int> av(number, data); // assumed: data is a host buffer holding number ints
parallel_for_each(extent<1>(number), [=](index<1> idx) restrict(amp)
{
    // do some intense computations on av
});
// do some stuff on the CPU while we wait
av.synchronize(); // wait for the parallel_for_each loop to finish and copy the data
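But what if I don't want to wait on the parallel for loop, and instead want to start copying data back from the GPU as soon as possible? Would the following work?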
using namespace Concurrency;
array_view<int> av(number, data); // assumed: data is a host buffer holding number ints
parallel_for_each(extent<1>(number), [=](index<1> idx) restrict(amp)
{
    // do some intense computations on av
});
// I know that we won't be waiting to synch when I call this, but will we be waiting here
// until the data is available on the GPU end to START copying?
completion_future waitOnThis = av.synchronize_async();
// will this line execute before parallel_for_each has finished processing, or only once it
// has finished processing and the data from "av" has started copying back?
waitOnThis.wait();
I read up on this topic on The Moth, but after reading the following I was not really any the wiser:
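Note that the execution of parallel_for_each appears to be synchronous with the calling code, but in reality it is asynchronous. That is, once the parallel_for_each call has been made and the kernel has been passed to the runtime, the some_code_B region continues to be executed immediately by the CPU thread, while the kernel is executed in parallel by the GPU threads. However, if you try to access the (array or array_view) data that you captured in the lambda from within the some_code_B region, your code will block until the results become available. Hence the correct statement is: parallel_for_each is as-if synchronous in terms of visible side effects, but asynchronous in reality.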
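To check my reading of that paragraph, here is a rough analogy in standard C++ only (std::async and std::future, not C++ AMP itself). The names fake_kernel, DemoResult and overlap_cpu_and_kernel are made up for illustration, and the sleep merely stands in for device latency. It models "the launch returns immediately, unrelated CPU work proceeds, and the first access to the results blocks"; it says nothing about how the AMP runtime actually schedules copies.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <future>
#include <numeric>
#include <thread>
#include <vector>

// Hypothetical stand-in for a GPU kernel: squares every element in place.
// In C++ AMP this work would be the body of parallel_for_each.
inline void fake_kernel(std::vector<int>& data) {
    std::this_thread::sleep_for(std::chrono::milliseconds(50)); // simulated latency
    for (int& x : data) x *= x;
}

struct DemoResult {
    int cpu_side;             // result of the independent CPU work
    long long sum_of_squares; // result read back after waiting
};

// Launch the "kernel" asynchronously, do independent CPU work, then block
// only at the point where the results are actually needed.
inline DemoResult overlap_cpu_and_kernel(std::vector<int> data) {
    // The launch returns right away; the work runs on another thread.
    std::future<void> done =
        std::async(std::launch::async, fake_kernel, std::ref(data));
    assert(done.valid());

    // "some_code_B": CPU work that does not touch `data`.
    int cpu_side = 0;
    for (int i = 1; i <= 100; ++i) cpu_side += i;

    // First access to the results: wait until the kernel has finished.
    done.wait();
    long long sum = std::accumulate(data.begin(), data.end(), 0LL);
    return DemoResult{cpu_side, sum};
}
```

Calling overlap_cpu_and_kernel({1, 2, 3}) would do the summation loop while the squaring is still pending, and only the final accumulate has to wait. In this analogy the explicit done.wait() plays the role that the blocking access (or av.synchronize()) plays in the AMP version.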