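I am installing openmpi-1.4.5 from source on Gentoo, built for backward compatibility with GLIBC_2.11, so that it can run over NFS (HPC cluster) alongside a Debian GNU/Linux (squeeze) compute node. It is like having two systems with independent libraries, where both can execute the same files over a high-performance network. The idea is that both operating systems can run the binaries through MPI. These are the steps I ran for configure and make: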

configure:
.././configure --prefix=/usr/local/ompi-compat --build=x86_64-pc-linux-gnu --with-openib CC="x86_64-pc-linux-gnu-gcc -include /usr/local/include/gcc-preinclude.h"

make:
make LDFLAGS="-Wl,-rpath -Wl,/spoa/usr/lib64 -Wl,-rpath -Wl,/usr/local/ompi-compat/lib" 2>&1 | tee make02.log

Later, the make stage failed:

...

..

libtool: compile:  x86_64-pc-linux-gnu-gcc -include /usr/local/include/gcc-preinclude.h  -DHAVE_CONFIG_H -I. -I../../.././opal/asm -I../../opal/include -I../../orte/include -I../../ompi/include -I../../opal/mca/paffinity/linux/plpa/src/libplpa -I../../../. -I../.. -I../../.././opal/include -I../../.././orte/include -I../../.././ompi/include -O3 -DNDEBUG -finline-functions -fno-strict-aliasing -MT atomic-asm.lo -MD -MP -MF .deps/atomic-asm.Tpo -c atomic-asm.S  -fPIC -DPIC -o .libs/atomic-asm.o

/usr/local/include/gcc-preinclude.h: Assembler messages:

/usr/local/include/gcc-preinclude.h:1: Error: invalid character '(' in mnemonic
make[2]:  [atomic-asm.lo] Error 1
make[2]: Leaving directory `/usr/local/src/openmpi-1.4.5/build/opal/asm'
make[1]:  [all-recursive] Error 1
make[1]: Leaving directory `/usr/local/src/openmpi-1.4.5/build/opal'
make:  [all-recursive] Error 1

The asm module fails because I use a C pre-include header, /usr/local/include/gcc-preinclude.h, to bind memcpy to the older memcpy@GLIBC_2.2.5 symbol version:

__asm__(".symver memcpy,memcpy@GLIBC_2.2.5");
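The "invalid character '(' in mnemonic" error is consistent with the header being force-included into atomic-asm.S: the C statement `__asm__(...)` reaches the assembler as if it were an instruction. A possible alternative to editing the compile command is to guard the directive with `__ASSEMBLER__`, which GCC defines while preprocessing assembly input. The sketch below writes such a guarded header to a scratch path (the file name `gcc-preinclude-guarded.h` is hypothetical, not the header from the post) and, if gcc is available, preprocesses an empty input both as C and as assembly; the directive should survive only in the C run.

```shell
# Scratch copy for illustration only; the header used in the post is
# /usr/local/include/gcc-preinclude.h.
hdr="${TMPDIR:-/tmp}/gcc-preinclude-guarded.h"

cat > "$hdr" <<'EOF'
/* GCC defines __ASSEMBLER__ while preprocessing assembly (.S) input,
   so the C-only statement below stays hidden from the assembler. */
#ifndef __ASSEMBLER__
__asm__(".symver memcpy,memcpy@GLIBC_2.2.5");
#endif
EOF

# Optional check when gcc is on PATH: preprocess an empty input as C and
# as assembly, then count the lines that still carry the directive.
if command -v gcc >/dev/null 2>&1; then
    as_c=$(gcc -E -P -x c -include "$hdr" /dev/null | grep -c 'memcpy@GLIBC' || true)
    as_asm=$(gcc -E -P -x assembler-with-cpp -include "$hdr" /dev/null | grep -c 'memcpy@GLIBC' || true)
    echo "lines with directive: C=$as_c assembly=$as_asm"
fi
```

With a guard like this the same CC setting could, in principle, be used for both the C and the .S sources; this has not been tested against the Open MPI 1.4.5 build itself.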

I worked around this inside the opal/asm folder by adding --tag=CC and cutting "-include /usr/local/include/gcc-preinclude.h" from the x86_64-pc-linux-gnu-gcc compiler command:

/bin/sh ../../libtool  --tag=CC --mode=compile x86_64-pc-linux-gnu-gcc -DHAVE_CONFIG_H -I. -I../../.././opal/asm -I../../opal/include -I../../orte/include -I../../ompi/include -I../../opal/mca/paffinity/linux/plpa/src/libplpa   -I../../../. -I../.. -I../../.././opal/include -I../../.././orte/include -I../../.././ompi/include    -O3 -DNDEBUG -finline-functions -fno-strict-aliasing -MT atomic-asm.lo -MD -MP -MF $depbase.Tpo -c -o atomic-asm.lo atomic-asm.S &&mv -f $depbase.Tpo $depbase.Plo

Running this line in the opal/asm/ folder compiles fine!

Then I went back to the main build folder and continued the compilation:

make LDFLAGS="-Wl,-rpath -Wl,/spoa/usr/lib64 -Wl,-rpath -Wl,/usr/local/ompi-compat/lib" 2>&1 | tee make02.log

After a few minutes the compilation finished.

But when I test some of the resulting executables, they fail with "Segmentation fault":

cd opal/tools/wrappers/.libs

.libs # ./opal_wrapper 
Segmentation fault

The same happens for all the executables...

Now, verifying the shared library paths linked into the ELF file:
ldd opal_wrapper 
linux-vdso.so.1 =>  (0x00007fff3ffff000)
libopen-pal.so.0 => /usr/local/ompi-compat/lib/libopen-pal.so.0 (0x00002b8ea8239000)
libdl.so.2 => /spoa/usr/lib64/libdl.so.2 (0x00002b8ea8493000)
libnsl.so.1 => /spoa/usr/lib64/libnsl.so.1 (0x00002b8ea8697000)
libutil.so.1 => /spoa/usr/lib64/libutil.so.1 (0x00002b8ea88af000)
libm.so.6 => /lib64/libm.so.6 (0x00002b8ea8ad5000)
libpthread.so.0 => /spoa/usr/lib64/libpthread.so.0 (0x00002b8ea8d56000)
libc.so.6 => /spoa/usr/lib64/libc.so.6 (0x00002b8ea8f72000)
/lib64/ld-linux-x86-64.so.2 (0x00002b8ea8017000)
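To see what is actually recorded in the ELF file, the interpreter path (PT_INTERP) and the RPATH/RUNPATH entries can be read directly with readelf from binutils. The following is a minimal sketch; the helper names `elf_interp` and `elf_runpath` are made up for this example, and `/bin/sh` is only a stand-in target so the snippet runs anywhere (for this build the target would be `.libs/opal_wrapper`).

```shell
# elf_interp FILE: print the dynamic linker path stored in FILE's PT_INTERP header.
elf_interp() {
    readelf -l "$1" | sed -n 's/.*Requesting program interpreter: \(.*\)\]/\1/p'
}

# elf_runpath FILE: print the RPATH/RUNPATH entries of FILE's dynamic section.
elf_runpath() {
    readelf -d "$1" | grep -E '\((RPATH|RUNPATH)\)'
}

# Stand-in target; pass a path such as .libs/opal_wrapper as the first argument.
target=${1:-/bin/sh}
elf_interp "$target"
elf_runpath "$target" || echo "no RPATH/RUNPATH recorded in $target"
```

If the first helper prints /lib64/ld-linux-x86-64.so.2 for opal_wrapper, the binary is started by the host's own dynamic linker even though -rpath points most libraries at /spoa/usr/lib64.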

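The shared objects are resolved through -rpath. I copied the main original shared objects that ship with Debian "squeeze" into the /spoa/usr/lib64 folder, and they link OK! But I think the failure comes from linking against the original Gentoo dynamic linker, /lib64/ld-linux-x86-64.so.2.

These are my Debian "squeeze" main toolchain shared objects: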
ls  /spoa/usr/lib64/
ld-2.11.3.so          libdl.so.2            libnsl-2.11.3.so      libstdc++.so.6.0.13        libz.so.1.2.3.4
ld-linux-x86-64.so.2  libgcc_s.so           libnsl.so.1           libutil-2.11.3.so
libc-2.11.3.so        libgcc_s.so.1         libpthread-2.11.3.so  libutil.so.1
libc.so.6             libgfortran.so.3.0.0  libpthread.so.0       libz.so
libdl-2.11.3.so       libm.so               libstdc++.so.6        libz.so.1

If I manually change the dynamic linker of this executable, it finally runs on both Gentoo and Debian (over NFS):

/bin/sh ../../../libtool --tag=CC   --mode=link x86_64-pc-linux-gnu-gcc -include    /usr/local/include/gcc-preinclude.h  -O3 -DNDEBUG -finline-functions -fno-strict-aliasing -pthread -fvisibility=hidden  -L/spoa/usr/lib64 -Wl,-rpath -Wl,/spoa/usr/lib64 -o opal_wrapper opal_wrapper.o ../../../opal/libopen-pal.la -lnsl -lutil  -lm -Wl,-rpath -Wl,/usr/local/ompi-compat/lib -Wl,-dynamic-linker /spoa/usr/lib64/ld-linux-x86-64.so.2

Result:

libtool: link: x86_64-pc-linux-gnu-gcc -include /usr/local/include/gcc-preinclude.h -O3 -DNDEBUG -finline-functions -fno-strict-aliasing -pthread -fvisibility=hidden -Wl,-rpath -Wl,/spoa/usr/lib64 -o .libs/opal_wrapper opal_wrapper.o -Wl,-rpath -Wl,/usr/local/ompi-compat/lib -Wl,-dynamic-linker /spoa/usr/lib64/ld-linux-x86-64.so.2 -L/spoa/usr/lib64 ../../../opal/.libs/libopen-pal.so -ldl -lnsl -lutil -lm -pthread

And ldd now shows the intended dynamic linker:

wrappers ldd .libs/opal_wrapper 

linux-vdso.so.1 =>  (0x00007fffc75a7000)
libopen-pal.so.0 => /usr/local/ompi-compat/lib/libopen-pal.so.0 (0x00002b396dd38000)
libdl.so.2 => /spoa/usr/lib64/libdl.so.2 (0x00002b396df91000)
libnsl.so.1 => /spoa/usr/lib64/libnsl.so.1 (0x00002b396e196000)
libutil.so.1 => /spoa/usr/lib64/libutil.so.1 (0x00002b396e3ae000)
libm.so.6 => /lib64/libm.so.6 (0x00002b396e5d3000)
libpthread.so.0 => /spoa/usr/lib64/libpthread.so.0 (0x00002b396e855000)
libc.so.6 => /spoa/usr/lib64/libc.so.6 (0x00002b396ea71000)
/spoa/usr/lib64/ld-linux-x86-64.so.2 (0x00002b396db18000)

My question is: can I use -Wl,-dynamic-linker to generate a correct ELF that is linked against the appropriate shared objects?
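Rather than relinking one tool by hand, the same setting could in principle be folded into the LDFLAGS passed to make. The sketch below only assembles and prints the command; it uses the paths from this post and the comma form `-Wl,--dynamic-linker=PATH`, so the path cannot be separated from its option when libtool reorders arguments. The variable names `compat_lib` and `ompi_lib` are hypothetical, and whether every Open MPI 1.4.5 target accepts these flags is an assumption that has not been tested.

```shell
# Paths taken from the post; adjust them for another layout.
compat_lib=/spoa/usr/lib64
ompi_lib=/usr/local/ompi-compat/lib

# Each -Wl, group is handed to the linker; the commas separate linker arguments.
LDFLAGS="-L${compat_lib}"
LDFLAGS="${LDFLAGS} -Wl,-rpath,${compat_lib}"
LDFLAGS="${LDFLAGS} -Wl,-rpath,${ompi_lib}"
LDFLAGS="${LDFLAGS} -Wl,--dynamic-linker=${compat_lib}/ld-linux-x86-64.so.2"

# Print the command instead of running it, so the flags can be reviewed first.
echo "make LDFLAGS=\"${LDFLAGS}\""
```

Note that shared libraries do not carry an interpreter entry of their own, so the option matters for the executables; the libraries are still located through the rpath entries.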

Running: ./opal_wrapper
Cannot open configuration file /usr/local/ompi-compat/share/openmpi/opal_wrapper-wrapper-data.txt
Error parsing data file opal_wrapper: Not found

Some information about my host:

hostname = master
uname -m = x86_64
uname -r = 3.4.5-gentoo
uname -s = Linux
uname -v = #1 SMP Mon Jul 23 21:35:06 UTC 2012