c - Private "for" loop for each thread in OpenMP

Question

See the edit below for my preliminary solution.

Consider the following code:

#include <stdio.h>
#include <stdlib.h>
#include <omp.h>

int main(void) {

int counter = 0;
int i;

omp_set_num_threads(8); 

#pragma omp parallel
        { 
            int id = omp_get_thread_num();
            #pragma omp for private(i)
            for (i = 0; i<10; i++) {
                printf("i: %d thread: %d\n", i, id);
                #pragma omp critical // or atomic
                counter++;
            }
        }

printf("counter %d\n", counter);

return 0;
}

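I set the number of threads to 8. I want each of the 8 threads to run its own complete for loop, with every iteration incrementing the variable counter. However, OpenMP appears to split the iterations of the for loop among the threads instead: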

i: 0 thread: 0
i: 1 thread: 0
i: 4 thread: 2
i: 6 thread: 4
i: 2 thread: 1
i: 3 thread: 1
i: 7 thread: 5
i: 8 thread: 6
i: 5 thread: 3
i: 9 thread: 7
counter 10

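As a result, counter=10, but I want counter=80. What can I do so that every thread executes its own for loop while all threads increment counter?

Edit: The following code gives the expected result. I added an outer for loop that runs from 0 to the maximum number of threads. Inside this loop, I can declare my inner for loop private to each thread. Indeed, counter=80 in this case. Is this the best solution to the problem, or is there a better one?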

int main(void) {

int counter = 0;
int i, n;

omp_set_num_threads(8);

int mthreads = omp_get_max_threads();

#pragma omp parallel for private(i)
    for (n=0; n<mthreads; n++) {
        int id = omp_get_thread_num();
        for (i = 0; i<10; i++) {
            printf("i: %d thread: %d\n", i, id);
            #pragma omp critical
            counter++;
        }
    }

printf("counter %d\n", counter);

return 0;
}

score 3 · Accepted Answer

The solution is quite simple: remove the worksharing construct for:

#pragma omp parallel
    { 
        int id = omp_get_thread_num();
        for (int i = 0; i<10; i++) {
            printf("i: %d thread: %d\n", i, id);
            #pragma omp critical // or atomic
            counter++;
        }
    }

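Declaring i inside the control part of the for loop is a C99 feature and may require passing the compiler an option such as -std=c99. Otherwise, you can simply declare i at the beginning of the block. Alternatively, you can declare it outside the region and make it private: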

int i;

#pragma omp parallel private(i)
    { 
        int id = omp_get_thread_num();
        for (i = 0; i<10; i++) {
            printf("i: %d thread: %d\n", i, id);
            #pragma omp critical // or atomic
            counter++;
        }
    }
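
To make the difference concrete, here is a minimal sketch that puts both variants side by side. The function names count_workshared and count_replicated are made up for this illustration, and the team size is counted with an atomic increment rather than an OpenMP runtime call so that the snippet also compiles when OpenMP is disabled.

```c
/* With "omp for", the n iterations are divided among the threads,
   so the loop body runs n times in total. */
static int count_workshared(int n) {
    int counter = 0;
    #pragma omp parallel
    {
        #pragma omp for
        for (int i = 0; i < n; i++) {
            #pragma omp atomic
            counter++;
        }
    }
    return counter;
}

/* Without the worksharing construct, every thread runs the whole loop,
   so the body runs n * (team size) times. The team size is reported
   through *nthreads (hypothetical out-parameter for this sketch). */
static int count_replicated(int n, int *nthreads) {
    int counter = 0;
    int team = 0;
    #pragma omp parallel
    {
        #pragma omp atomic
        team++;                 /* each thread registers itself once */

        for (int i = 0; i < n; i++) {
            #pragma omp atomic
            counter++;
        }
    }
    *nthreads = team;
    return counter;
}
```

Calling count_workshared(10) yields 10 regardless of the number of threads, while count_replicated(10, &t) yields 10 * t. That is 80 only if the runtime actually starts eight threads, which it is not obliged to do.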

Since you do not use the value of counter inside the parallel region, you can also use a sum reduction instead:

#pragma omp parallel reduction(+:counter)
    { 
        int id = omp_get_thread_num();
        for (int i = 0; i<10; i++) {
            printf("i: %d thread: %d\n", i, id);
            counter++;
        }
    }
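
As a rough, self-contained sketch of the reduction variant: the function name count_with_reduction and its parameters are assumptions made for this example. It requests a team size with the num_threads clause and reports how many threads actually took part, because the runtime may provide fewer threads than requested.

```c
/* Each thread runs all n iterations on its own private copy of counter.
   At the end of the region OpenMP adds the private copies into the
   original variable. The same mechanism is used to count the threads. */
static int count_with_reduction(int n, int requested, int *nthreads) {
    int counter = 0;
    int team = 0;
    #pragma omp parallel num_threads(requested) reduction(+:counter, team)
    {
        team++;                 /* private copy: one increment per thread */
        for (int i = 0; i < n; i++) {
            counter++;          /* private copy: no critical or atomic needed */
        }
    }
    *nthreads = team;
    return counter;
}
```

The result equals n times the reported team size. Inside the region each thread sees only its own partial count, so this form fits only when the running total is not needed before the region ends.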

score 2 · Accepted Answer

OpenMP has a concept for this: reduction. Sticking with your example:

#pragma omp parallel for reduction(+:counter)
  for (unsigned n=0; n<mthreads; n++) {
    int id = omp_get_thread_num();
    for (unsigned i = 0; i<10; i++) {
      printf("i: %u thread: %d\n", i, id);
      counter++;
    }
  }

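The advantage is that no critical section is needed around the increment. OpenMP itself combines the sums from the different per-thread copies of counter, which is probably more efficient.

This can be written even more simply as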

#pragma omp parallel for reduction(+:counter)
  for (unsigned i=0; i<mthreads*10; i++) {
    int id = omp_get_thread_num();
    printf("i: %u thread: %d\n", i, id);
    counter++;
  }

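With some compilers you may still need a flag such as -std=c99 to declare variables inside the for loop. The advantage of declaring variables as locally as possible is that you do not have to mark them private or anything similar. The simplest approach is certainly to let OpenMP split the for loop itself.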
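
To picture what the reduction clause does with the per-thread copies of counter, the combining step can be written out by hand. This is only an illustrative sketch: count_manual_reduction is a made-up name, and an actual OpenMP implementation is free to combine the partial sums differently.

```c
/* Hand-written counterpart of
   "#pragma omp parallel for reduction(+:counter)" over the nested loop. */
static int count_manual_reduction(int outer, int inner) {
    int counter = 0;
    #pragma omp parallel
    {
        int local = 0;                  /* private partial sum of this thread */

        #pragma omp for nowait
        for (int n = 0; n < outer; n++) {
            for (int i = 0; i < inner; i++) {
                local++;                /* unsynchronized: local is private */
            }
        }

        #pragma omp atomic
        counter += local;               /* one synchronized update per thread */
    }
    return counter;
}
```

Here the outer loop is workshared, so count_manual_reduction(8, 10) gives 80 no matter how many threads run it, and the shared variable is touched once per thread instead of once per iteration.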
