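I am trying to parallelize my C program with OpenACC 2.5 on Ubuntu 16.04 LTS. After a simple one-line modification, I can compile all the .c files to .o files. At the link step, however, the pgcc compiler reports
undefined reference to `__pgi_uacc_multicorestart'
and
undefined reference to `__pgi_uacc_multicoreend'
A Google search turns up nothing related to these error messages. Please help me solve this problem.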
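Below are the details of my system and program, along with the relevant source code. I have tried to post only the essential parts; if you need anything else, please let me know.
OS and software: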
LSB Version: core-9.20160110ubuntu0.2-amd64:core-9.20160110ubuntu0.2-noarch:printing-9.20160110ubuntu0.2-amd64:printing-9.20160110ubuntu0.2-noarch:security-9.20160110ubuntu0.2-amd64:security-9.20160110ubuntu0.2-noarch
Distributor ID: Ubuntu
Description: Ubuntu 16.04.3 LTS
Release: 16.04
Codename: xenial
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/5/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 5.4.0-6ubuntu1~16.04.5' --with-bugurl=file:///usr/share/doc/gcc-5/README.Bugs --enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-5 --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-libmpx --enable-plugin --with-system-zlib --disable-browser-plugin --enable-java-awt=gtk --enable-gtk-cairo --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-5-amd64/jre --enable-java-home --with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-5-amd64 --with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-5-amd64 --with-arch-directory=amd64 --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --enable-objc-gc --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.5)
pgcc 17.10-0 64-bit target on x86-64 Linux -tp haswell
PGI Compilers and Tools
Copyright (c) 2017, NVIDIA CORPORATION. All rights reserved.
.bashrc:
#CUDA
export PATH=/usr/local/cuda/bin:$PATH;
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH;
#####
ulimit -s unlimited
#####
#Environment Modules
source /usr/share/modules/init/bash
module add /opt/pgi/modulefiles/pgi64/17.10
module add /opt/pgi/modulefiles/openmpi/2.1.2/2017
#####
#intel compiler
source /opt/intel/bin/compilervars.sh intel64
#intel vtune
source /opt/intel/vtune_amplifier/amplxe-vars.sh
#intel advisor
source /opt/intel/advisor/advixe-vars.sh
#intel inspector
source /opt/intel/inspector/inspxe-vars.sh
#intel mkl
source /opt/intel/mkl/bin/mklvars.sh intel64
Makefile:
CC = pgcc
CFLAGS_pgcc = -O0 -Minform=inform -Minfo -ta=multicore -g -pg -Mprof=time
CFLAGS = $(CFLAGS_$(CC)) -c
LFLAGS = $(LFLAGS_$(CC)) -L${MKLROOT}/lib/intel64 -lmkl_rt -lpthread -lm -ldl
IFLAGS = $(IFLAGS_$(CC)) -I${MKLROOT}/include
<content is partially neglected>
serial: $(C_OBJ)
	$(CC) $(IFLAGS) $(CFLAGS) -c msg_ser.c
	$(CC) $(IFLAGS) -o dplbe $(C_OBJ) msg_ser.o $(LFLAGS)
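One thing that may be worth checking in the fragment above: LFLAGS_pgcc is not defined in the part shown, so unless it is set in the omitted portion, the link line runs pgcc without -ta=multicore. A minimal sketch of a possible fix follows. It assumes the PGI driver needs the same -ta target at link time as at compile time in order to pull in the OpenACC multicore runtime that provides the two missing symbols. The LFLAGS_pgcc value is an assumption and is not part of the original makefile.

```makefile
# Assumption: give the linker the same accelerator target used for compilation,
# so that pgcc adds the OpenACC multicore runtime libraries to the link line.
LFLAGS_pgcc = -ta=multicore

# Unchanged from the original makefile; it now expands LFLAGS_pgcc when CC = pgcc.
LFLAGS = $(LFLAGS_$(CC)) -L${MKLROOT}/lib/intel64 -lmkl_rt -lpthread -lm -ldl
```

If this is the cause, the serial target should link without further changes, since its last command already appends $(LFLAGS).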
Error message:
lbe.o: In function `equilibrium_distrib':
<content is partially neglected>lbe.c:548: undefined reference to `__pgi_uacc_multicorestart'
<content is partially neglected>lbe.c:583: undefined reference to `__pgi_uacc_multicoreend'
makefile:57: recipe for target 'serial' failed
make: *** [serial] Error 2
lbe.c, where I added only one line as a first step toward using OpenACC:
#include "header.h"
extern int max_x, max_y, max_z;
extern int num_x, x_min, x_max;
extern int num_proc, n_proc;
extern double tau[2], tau_v[2];
<content is partially neglected>
void equilibrium_distrib(int xy, int z, double ***velcs_df, double dt,
                         struct vector forceDen, struct vector *correctedVel, double *f_eq)
{
  <content is partially neglected>
  #pragma acc kernels
  {
    for(int q=0; q < 19; q++)
    {
      double term1 = (c_x[q] * correctedVel->x + c_y[q] * correctedVel->y +
                      c_z[q] * correctedVel->z)*3.;
      double term2 = 0.5*term1*term1;
      f_eq[q] = weight[q]*density*(1 + term1 + term2 - term3);
    }
  }
}
Messages from compiling lbe.c to lbe.o:
pgcc-Warning--Mprof=time is not supported
PGC-I-0222-Redundant definition for symbol __THROW (/usr/include/x86_64-linux-gnu/sys/cdefs.h: 74)
PGC-I-0222-Redundant definition for symbol __extension__ (/usr/include/x86_64-linux-gnu/sys/cdefs.h: 358)
lbe_zcol:
<content is partially neglected>
equilibrium_distrib:
558, FMA (fused multiply-add) instruction(s) generated
559, FMA (fused multiply-add) instruction(s) generated
560, FMA (fused multiply-add) instruction(s) generated
565, FMA (fused multiply-add) instruction(s) generated
566, FMA (fused multiply-add) instruction(s) generated
567, FMA (fused multiply-add) instruction(s) generated
573, FMA (fused multiply-add) instruction(s) generated
577, Loop is parallelizable
Generating Multicore code
577, #pragma acc loop gang
580, FMA (fused multiply-add) instruction(s) generated