I'm having trouble adapting my thinking to the OpenMP way of doing things.
Roughly, what I want is:
for(int i=0; i<50; i++)
{
doStuff();
thread t;
t.start(callback(i)); //each time around the loop create a thread to execute callback
}
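I think I know how this would be done in C++11, but I need to be able to do something similar with OpenMP.
The closest thing to what you want is OpenMP tasks, which are available in compilers compliant with OpenMP v3.0 and later. It goes like this: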
#pragma omp parallel
{
    #pragma omp single
    for (int i = 0; i < 50; i++)
    {
        doStuff();
        #pragma omp task
        callback(i);
    }
}
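This code makes the loop execute in one thread only, and it creates 50 OpenMP tasks that call callback() with different arguments. It then waits for all tasks to finish before leaving the parallel region. Tasks are picked up (possibly in random order) by idle threads for execution. OpenMP imposes an implicit barrier at the end of each parallel region, because its fork-join execution model requires that only the main thread runs outside of parallel regions.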
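To illustrate the implicit barrier described above, here is a minimal sketch in which each task stores a result and the caller reads the results only after the parallel region has ended. The names run_tasks and square are hypothetical and are not part of the original answer; square merely stands in for callback(). The firstprivate clause is written out explicitly so that each task visibly gets its own copy of i.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the work done by callback(i).
static int square(int i) { return i * i; }

// Creates one OpenMP task per iteration and returns once all of them are done.
// Without -fopenmp the pragmas are ignored and the loop simply runs serially.
std::vector<int> run_tasks(int n)
{
    assert(n >= 0);
    std::vector<int> results(static_cast<std::size_t>(n), -1);
    int *out = results.data();   // plain pointer, copied into every task

    #pragma omp parallel
    {
        #pragma omp single
        for (int i = 0; i < n; i++)
        {
            // Each task receives its own copy of i and of the pointer,
            // and writes to a distinct element, so there is no data race.
            #pragma omp task firstprivate(i, out)
            out[i] = square(i);
        }
    }   // implicit barrier: every task has completed at this point

    return results;
}
```

Calling run_tasks(5) should give a vector holding the squares of 0 through 4, regardless of which thread ran which task or in what order.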
Here is a sample program (ompt.cpp):
#include <stdio.h>
#include <unistd.h>
#include <omp.h>

void callback (int i)
{
    printf("[%02d] Task stated with thread %d\n", i, omp_get_thread_num());
    sleep(1);
    printf("[%02d] Task finished\n", i);
}

int main (void)
{
    #pragma omp parallel
    {
        #pragma omp single
        for (int i = 0; i < 10; i++)
        {
            #pragma omp task
            callback(i);
            printf("Task %d created\n", i);
        }
    }
    printf("Parallel region ended\n");
    return 0;
}
Compile and run:
$ g++ -fopenmp -o ompt.x ompt.cpp
$ OMP_NUM_THREADS=4 ./ompt.x
Task 0 created
Task 1 created
Task 2 created
[01] Task stated with thread 3
[02] Task stated with thread 2
Task 3 created
Task 4 created
Task 5 created
Task 6 created
Task 7 created
[00] Task stated with thread 1
Task 8 created
Task 9 created
[03] Task stated with thread 0
[01] Task finished
[02] Task finished
[05] Task stated with thread 2
[04] Task stated with thread 3
[00] Task finished
[06] Task stated with thread 1
[03] Task finished
[07] Task stated with thread 0
[05] Task finished
[08] Task stated with thread 2
[04] Task finished
[09] Task stated with thread 3
[06] Task finished
[07] Task finished
[08] Task finished
[09] Task finished
Parallel region ended
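Note that the tasks are not executed in the same order in which they were created.
GCC versions prior to 4.4 do not support OpenMP 3.0. Unrecognized OpenMP directives are silently ignored, and the resulting executable runs that section of the code serially: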
$ g++-4.3 -fopenmp -o ompt.x ompt.cpp
$ OMP_NUM_THREADS=4 ./ompt.x
[00] Task stated with thread 3
[00] Task finished
Task 0 created
[01] Task stated with thread 3
[01] Task finished
Task 1 created
[02] Task stated with thread 3
[02] Task finished
Task 2 created
[03] Task stated with thread 3
[03] Task finished
Task 3 created
[04] Task stated with thread 3
[04] Task finished
Task 4 created
[05] Task stated with thread 3
[05] Task finished
Task 5 created
[06] Task stated with thread 3
[06] Task finished
Task 6 created
[07] Task stated with thread 3
[07] Task finished
Task 7 created
[08] Task stated with thread 3
[08] Task finished
Task 8 created
[09] Task stated with thread 3
[09] Task finished
Task 9 created
Parallel region ended
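Because unsupported directives are dropped silently, it can help to check at compile time what the compiler actually provides. The sketch below relies on the standard _OPENMP macro, which an OpenMP-enabled compiler defines as a yyyymm date identifying the supported specification (200805 corresponds to OpenMP 3.0). The function names openmp_version_date and tasks_supported are hypothetical helpers introduced only for this illustration.

```cpp
// Returns the value of _OPENMP (a yyyymm date), or 0 when the code
// was compiled without OpenMP support (e.g. without -fopenmp).
int openmp_version_date()
{
#ifdef _OPENMP
    return _OPENMP;
#else
    return 0;
#endif
}

// True only if the compiler claims OpenMP 3.0 (May 2008) or newer,
// which is the first version of the specification that defines tasks.
bool tasks_supported()
{
#if defined(_OPENMP) && _OPENMP >= 200805
    return true;
#else
    return false;
#endif
}
```

A program could print openmp_version_date() at startup, or turn the same condition into an #error directive, so that a build with a pre-3.0 compiler fails loudly instead of quietly running the tasks serially.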
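Have a look at http://en.wikipedia.org/wiki/OpenMP, for example.
#pragma omp for is your friend. OpenMP does not require you to think about threads. You just declare(!) what you want to run in parallel, and an OpenMP-compliant compiler performs the needed transformations on your code during compilation.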
The OpenMP specifications are also worth reading. They explain well what can be done and how: http://openmp.org/wp/openmp-specifications/
Your example could look like this:
#pragma omp parallel for
for (int i = 0; i < 50; i++)
{
    doStuff();
    callback(i); // each iteration runs on one of the team's threads; no explicit thread object is needed
}
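Everything inside the for loop runs in parallel. You have to pay attention to data dependencies. The 'doStuff()' function runs sequentially in your pseudocode, but it would run in parallel in my example. You also need to specify which variables are thread-private and the like; these specifications go into the #pragma statement as well.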
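As a small illustration of declaring data-sharing in the #pragma statement, here is a sketch that sums squares with a parallel for loop. The function sum_of_squares is a hypothetical example, not code from the question. The reduction clause gives every thread its own copy of total and combines the copies at the end, which avoids the data race that a plain shared accumulator would cause.

```cpp
#include <cstddef>
#include <vector>

// Sums the squares of all elements using an OpenMP parallel for loop.
// Without -fopenmp the pragma is ignored and the loop runs serially.
long sum_of_squares(const std::vector<int>& values)
{
    long total = 0;
    const long n = static_cast<long>(values.size());

    // reduction(+:total): each thread accumulates into a private copy,
    // and the private copies are added together after the loop.
    #pragma omp parallel for reduction(+:total)
    for (long i = 0; i < n; i++)
    {
        // v is declared inside the loop body, so it is private to each iteration.
        const long v = values[static_cast<std::size_t>(i)];
        total += v * v;
    }
    return total;
}
```

For the input {1, 2, 3} this returns 14. Variables declared outside the loop that every iteration needs a separate copy of would instead be listed in a private or firstprivate clause on the same pragma.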