
I'm trying to offload an array computation to a GPU (GTX 1080Ti) using OpenMP and C++, with this dummy code I wrote:

#include <omp.h>
#include <iostream>

using namespace std;

int main(){

        //int totalSum, ompSum;
        int totalSum=0, ompSum=0;
        const int N = 1000;
        int array[N];
        for (int i=0; i<N; i++){
                array[i]=i;
        }
        #pragma omp target
        {
                #pragma omp parallel private(ompSum) shared(totalSum)
                {
                        ompSum=0;
                        omp_set_num_threads(100);
                        printf ( "Total number of threads are %d!\n", omp_get_num_threads() );
                        #pragma omp for
                        for (int i=0; i<N; i++){
                                ompSum += array[i];
                        }

                        #pragma omp critical
                        totalSum += ompSum;

                }

                printf ( "Caculated sum should be %d but is %d\n", N*(N-1)/2, totalSum );
        }
        return 0;


}

After running the code, this is the output I get:

Total number of threads are 8!
Total number of threads are 8!
Total number of threads are 8!
Total number of threads are 8!
Total number of threads are 8!
Total number of threads are 8!
Total number of threads are 8!
Total number of threads are 8!
Caculated sum should be 499500 but is 499500

The computed sum is correct, but I'm curious why it reports only 8 threads when I set 100 threads in the code.

When I move the omp_set_num_threads call to directly below #pragma omp target, the runtime reports:

libgomp: cuCtxSynchronize error: an illegal memory access was encountered

I'm new to OpenMP, so I would appreciate it if someone could help explain this.
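
For reference, omp_set_num_threads only affects parallel regions that start after the call, so calling it inside the region that is already running cannot change that region's team size. A minimal sketch of the variant I would expect to work is below. It requests the thread count with a num_threads clause and maps the data explicitly. The function name sum_on_device and its parameters are my own, hypothetical names, and the runtime may still grant a different number of threads than requested.

```cpp
#include <vector>

// Hypothetical helper (not from the code above): sums 0..n-1 in a target
// region and asks for `threads` threads via the num_threads clause.
// The value is a request; the OpenMP runtime decides the actual team size.
long long sum_on_device(int n, int threads) {
    std::vector<int> host(n);
    for (int i = 0; i < n; ++i) {
        host[i] = i;
    }

    int *data = host.data();   // raw pointer so the array section can be mapped
    long long total = 0;

    // map(to: ...) copies the input to the device, map(tofrom: total) brings
    // the result back, and reduction(+: total) replaces the manual
    // private-sum-plus-critical pattern.
    #pragma omp target parallel for \
        map(to: data[0:n]) map(tofrom: total) \
        reduction(+: total) num_threads(threads)
    for (int i = 0; i < n; ++i) {
        total += data[i];
    }

    return total;
}
```

Calling sum_on_device(1000, 100) should return 499500, matching N*(N-1)/2. If the compiler has no offloading support, or OpenMP is disabled, the loop simply runs on the host and gives the same result.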
