I have recently been trying to compile and run my MPI code on a single machine (Ubuntu 12.04, 64-bit, Core i7-2670QM). I installed MPICH2 version 1.2 with the following configuration:
./configure --prefix=/opt/mpich2 --enable-f77 --enable-fc --enable-cxx --with-device=ch3:sock --with-pm=mpd CC=icc CXX=icpc F77=ifort FC=ifort 2>&1 | tee configure.log
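The installation went fine and mpd runs well. I tested mpd with the examples and everything worked perfectly.
I compile my code with mpif77, because for some reason mpif90 was not created when I built MPICH2. Even with mpif77, my code compiles without errors.
The flags I use to compile the code are:
For the compiler: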
LN_FLAGS= -lm -larpack -lsparskit -lfftw3 -lrt -llapack -lblas
For the MPI linker:
LN_FLAGS_MPI= $(LN_FLAGS) -I$(MPIHOME)/include -L$(MPIHOME) $(MPIHOME)/lib/libmpich.a -lfmpich -lopa -lmpe
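To see what the compiler wrapper adds on its own, before any of the flags above, a small sketch like the following could be used. It assumes the MPICH wrapper's `-show` option, which prints the underlying compile and link command without running it. `MPI_WRAPPER` and `show_wrapper_command` are hypothetical names introduced here for illustration only.

```shell
#!/bin/sh
# Sketch: print the real compiler command behind an MPICH wrapper.
# Assumption: the wrapper supports MPICH's -show option.

show_wrapper_command() {
    wrapper="$1"
    if command -v "$wrapper" >/dev/null 2>&1; then
        # Prints the underlying compiler invocation, including -I/-L/-l flags.
        "$wrapper" -show
    else
        echo "wrapper not found: $wrapper"
    fi
}

# MPI_WRAPPER is a hypothetical override; by default use mpif77 from PATH.
show_wrapper_command "${MPI_WRAPPER:-mpif77}"
```

Comparing that output with LN_FLAGS_MPI should show whether the include path and libraries named by hand point at the same MPI installation as the wrapper.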
The problem appears when I try to run the code on my machine.
First I start mpd:
mpd &
Then I run the code with:
mpirun -np 4 ./code_mpi
I have also tried several variations:
mpiexec -np 4 ./code_mpi
mpirun -n 2 ./code_mpi
mpiexec -n 2 ./code_mpi
All of these commands produce the same error:
Fatal error in MPI_Comm_rank: Invalid communicator, error stack:
MPI_Comm_rank(106): MPI_Comm_rank(MPI_COMM_NULL, rank=0x14b46a0) failed
MPI_Comm_rank(64).: Null communicator
Fatal error in MPI_Comm_rank: Invalid communicator, error stack:
MPI_Comm_rank(106): MPI_Comm_rank(MPI_COMM_NULL, rank=0x14b46a0) failed
MPI_Comm_rank(64).: Null communicator
[cli_2]: aborting job:
Fatal error in MPI_Comm_rank: Invalid communicator, error stack:
MPI_Comm_rank(106): MPI_Comm_rank(MPI_COMM_NULL, rank=0x14b46a0) failed
MPI_Comm_rank(64).: Null communicator
[cli_1]: aborting job:
Fatal error in MPI_Comm_rank: Invalid communicator, error stack:
MPI_Comm_rank(106): MPI_Comm_rank(MPI_COMM_NULL, rank=0x14b46a0) failed
MPI_Comm_rank(64).: Null communicator
rank 2 in job 1 ubuntu_38132 caused collective abort of all ranks
exit status of rank 2: killed by signal 9
Fatal error in MPI_Comm_rank: Invalid communicator, error stack:
MPI_Comm_rank(106): MPI_Comm_rank(MPI_COMM_NULL, rank=0x14b46a0) failed
MPI_Comm_rank(64).: Null communicator
[cli_3]: aborting job:
Fatal error in MPI_Comm_rank: Invalid communicator, error stack:
MPI_Comm_rank(106): MPI_Comm_rank(MPI_COMM_NULL, rank=0x14b46a0) failed
MPI_Comm_rank(64).: Null communicator
Fatal error in MPI_Comm_rank: Invalid communicator, error stack:
MPI_Comm_rank(106): MPI_Comm_rank(MPI_COMM_NULL, rank=0x14b46a0) failed
MPI_Comm_rank(64).: Null communicator
[cli_0]: aborting job:
Fatal error in MPI_Comm_rank: Invalid communicator, error stack:
MPI_Comm_rank(106): MPI_Comm_rank(MPI_COMM_NULL, rank=0x14b46a0) failed
MPI_Comm_rank(64).: Null communicator
rank 1 in job 1 ubuntu_38132 caused collective abort of all ranks
exit status of rank 1: return code 1
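Because every launcher variant aborts in MPI_Comm_rank, one check that might help narrow this down is whether the mpif77 and mpirun found on PATH belong to the same installation (here expected under /opt/mpich2). The sketch below is only a rough check: `install_prefix` and `compare_prefixes` are hypothetical helper names, and symbolic links are not resolved.

```shell
#!/bin/sh
# Sketch: check that the compiler wrapper and the launcher share an install prefix.

# Print the install prefix of a tool path, e.g. /opt/mpich2/bin/mpirun -> /opt/mpich2
install_prefix() {
    dirname "$(dirname "$1")"
}

# Print "match" when both tool paths share a prefix, "mismatch" otherwise.
compare_prefixes() {
    if [ "$(install_prefix "$1")" = "$(install_prefix "$2")" ]; then
        echo "match"
    else
        echo "mismatch"
    fi
}

# Look up both tools on PATH; either may be missing.
wrapper_path=$(command -v mpif77 || true)
launcher_path=$(command -v mpirun || true)

if [ -n "$wrapper_path" ] && [ -n "$launcher_path" ]; then
    echo "mpif77: $wrapper_path"
    echo "mpirun: $launcher_path"
    compare_prefixes "$wrapper_path" "$launcher_path"
else
    echo "mpif77 or mpirun not found on PATH"
fi
```

A "mismatch" would only be a hint that two MPI installations are being mixed, not proof of the cause.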
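I have spent almost two weeks trying to solve this, because I really need to run this code on my personal computer so that I can work from home. I would appreciate any help!
This is how I initialize the MPI library: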
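subroutine init()
  ! Note: rank, an_proc and mpi_err are not declared in this snippet.
  integer :: provided   ! declared but not used below
  call mpi_init(mpi_err)
  ! Query this process's rank and the total number of processes.
  call mpi_comm_rank(mpi_comm_world,rank,mpi_err)
  call mpi_comm_size(mpi_comm_world,an_proc,mpi_err)
  ! Wait until every process has reached this point.
  call MPI_BARRIER(MPI_COMM_WORLD,mpi_err)
end subroutine init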