I have installed gcc-7, gcc-8, gcc-7-offload-nvptx, and gcc-8-offload-nvptx.

I tried to compile a simple OpenMP code with offloading using both compilers:

#include <omp.h>
#include <stdio.h>

int main(){
    #pragma omp target
    #pragma omp teams distribute parallel for
    for (int i=0; i<omp_get_num_threads(); i++)
        printf("%d in %d of %d\n",i,omp_get_thread_num(), omp_get_num_threads());
}

I used the following command line (and the same with gcc-7):

gcc-8 code.c -fopenmp -foffload=nvptx-none

But it does not build; it fails with the following errors:

/tmp/ccKESWcF.o: In function "main":
teste.c:(.text+0x50): undefined reference to "GOMP_target_ext"
/tmp/cc0iOH1Y.target.o: In function "init":
ccPXyu6Y.c:(.text+0x1d): undefined reference to "GOMP_offload_register_ver"
/tmp/cc0iOH1Y.target.o: In function "fini":
ccPXyu6Y.c:(.text+0x41): undefined reference to "GOMP_offload_unregister_ver"
collect2: error: ld returned 1 exit status

Any clues?

1 Answer

Your code compiles and runs for me with -foffload=disable -fno-stack-protector, using gcc-7 and gcc-7-offload-nvptx on Ubuntu 17.10.

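But for the GPU (without -foffload=disable) it fails to compile. You can't call printf from the GPU. Instead, you can collect the values in the target region and print them on the host, like this: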

#include <omp.h>
#include <stdio.h>
#include <stdlib.h>

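int main(){
  int nthreads;
  /* Query the thread count inside a target region; map(tofrom:) copies it back to the host. */
  #pragma omp target teams map(tofrom:nthreads)
  #pragma omp parallel
  #pragma omp single
  nthreads = omp_get_num_threads();

  /* Host buffer that receives one thread number per loop iteration. */
  int *ithreads = malloc(sizeof *ithreads *nthreads);

  /* Fill the buffer inside the target region; no printf is called there. */
  #pragma omp target teams distribute parallel for map(tofrom:ithreads[0:nthreads])
  for (int i=0; i<nthreads; i++) ithreads[i] = omp_get_thread_num();

  /* Print on the host once the buffer has been mapped back. */
  for (int i=0; i<nthreads; i++)
    printf("%d in %d of %d\n", i, ithreads[i], nthreads);

  free(ithreads);
}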

For me, this outputs

0 in 0 of 8
1 in 0 of 8
2 in 0 of 8
3 in 0 of 8
4 in 0 of 8
5 in 0 of 8
6 in 0 of 8
7 in 0 of 8
Answered 2018-03-16T09:24:14.867