I am currently trying to convert the following mpi_send/recv calls into a single mpi_scatterv call, because copying my array into a temporary buffer and sending that buffer costs me a significant amount of performance. It is still worthwhile compared to the equivalent serial implementation, but I would like to distribute the work without copying into a temporary buffer first. mpi_scatterv seems to be the function I need, but my various attempts at implementing it have not worked, and most of them have left me fairly confused.
The code that performs the mpi_send/recv calls is the following:
if(me_image.eq.root_image) then
    do i = 0, max_proc-1, 1
        allocate(temp_dCpqdR(3*nat_sl, 3*nat_sl, n_pairs(i+1), 3))
        do j = 1, n_pairs(i+1), 1
            temp_dCpqdR(:,:,j,:) = dCpqdR(:,:,j+offset,:)
        end do
        offset = offset + n_pairs(i+1)
        if(i.ne.0) then
            call mpi_send(temp_dCpqdR, 3*nat_sl*3*nat_sl*3*n_pairs(i+1), mpi_double_precision,&
                          i, 0, intra_image_comm, ierr)
            call mpi_send(Cpq, 3*nat_sl*3*nat_sl, mpi_double_precision,&
                          i, 1, intra_image_comm, ierr)
            call mpi_send(eigenvalues, 3*nat_sl, mpi_double_precision,&
                          i, 2, intra_image_comm, ierr)
        else
            my_dCpqdR(:,:,:,:) = temp_dCpqdR(:,:,:,:)
        end if
        deallocate(temp_dCpqdR)
    end do
else
    if(me_image.le.(max_proc-1)) then
        call mpi_recv(my_dCpqdR,&                                  ! Buffer
                      3*nat_sl*3*nat_sl*3*n_pairs(me_image+1),&    ! Count
                      mpi_double_precision,&                       ! Type
                      0,&                                          ! Source
                      0,&                                          ! Tag
                      intra_image_comm,&                           ! Communicator
                      rstatus,&                                    ! Status var
                      ierr)                                        ! Error flag
        call mpi_recv(Cpq,&                                        ! Buffer
                      3*nat_sl*3*nat_sl,&                          ! Count
                      mpi_double_precision,&                       ! Type
                      0,&                                          ! Source
                      1,&                                          ! Tag
                      intra_image_comm,&                           ! Communicator
                      rstatus,&                                    ! Status var
                      ierr)                                        ! Error flag
        call mpi_recv(eigenvalues,&                                ! Buffer
                      3*nat_sl,&                                   ! Count
                      mpi_double_precision,&                       ! Type
                      0,&                                          ! Source
                      2,&                                          ! Tag
                      intra_image_comm,&                           ! Communicator
                      rstatus,&                                    ! Status var
                      ierr)                                        ! Error flag
    end if
end if
I have tried to translate the code above into a scatterv call myself, but I cannot work out how to do it. I think I need something like these lines:
call mpi_type_create_subarray(4, (/ nat_sl, nat_sl, nat, 3 /), (/nat_sl, nat_sl, n_pairs(me_image+1), 3/),&
(/0, 0, 0, 0/), mpi_order_fortran, mpi_double_precision, subarr_typ, ierr)
call mpi_type_commit(subarr_typ, ierr)
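For what it's worth, my reading of mpi_type_create_subarray is that array_of_sizes has to describe the full array the type is built against and array_of_subsizes the part being picked out. One approach I have seen described for scatterv with per-rank chunk sizes is to build a type for a single slice and resize its extent, rather than a type for a whole per-process chunk. A minimal sketch of that, assuming dCpqdR is dimensioned (3*nat_sl, 3*nat_sl, nat, 3) as the send/recv loop above suggests (slice_typ is just a placeholder name I made up):

    integer :: slice_typ   ! placeholder handle for a type covering one pair slice

    call mpi_type_create_subarray(4,                               &
            (/ 3*nat_sl, 3*nat_sl, nat, 3 /),                      & ! sizes of the full array
            (/ 3*nat_sl, 3*nat_sl, 1,   3 /),                      & ! one slice along the pair dimension
            (/ 0, 0, 0, 0 /),                                      & ! start at the first element
            mpi_order_fortran, mpi_double_precision, slice_typ, ierr)
    call mpi_type_commit(slice_typ, ierr)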
and a scatterv call along these lines:
call mpi_scatterv(dCpqdR, n_pairs(me_image+1), f_displs, subarr_typ,&
my_dCpqdR, 3*nat_sl*3*nat_sl*3*n_pairs(me_image+1), subarr_typ,&
root_image, intra_image_comm, ierr)
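One thing I am unsure about in my own attempt: as far as I can tell, the sendcounts and displs arguments of mpi_scatterv have to be integer arrays with one entry per rank of the communicator (significant only on the root), not the scalar n_pairs(me_image+1) I am passing. A sketch of how I imagine they would be filled, counting in units of whatever sendtype ends up being used (nproc is a placeholder I am using for the size of intra_image_comm):

    ! Sketch only: per-rank counts and running displacements, in units of the send type.
    integer, allocatable :: sendcounts(:), displs(:)
    integer :: p

    allocate(sendcounts(0:nproc-1), displs(0:nproc-1))
    do p = 0, nproc-1
        if (p .le. max_proc-1) then
            sendcounts(p) = n_pairs(p+1)    ! number of pair slices for rank p
        else
            sendcounts(p) = 0               ! ranks beyond max_proc receive nothing
        end if
    end do
    displs(0) = 0
    do p = 1, nproc-1
        displs(p) = displs(p-1) + sendcounts(p-1)   ! running offset, in slices
    end do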
I have read that I need to set the extents, so I implemented them like this:
extent = 3*nat_sl*3*nat_sl*3*n_pairs(me_image+1)
call MPI_Type_create_resized(subarr_typ, 0, extent, resized_subarr, ierr)
call MPI_Type_commit(resized_subarr, ierr)
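I am also not certain my extent is in the right units or of the right kind: MPI_Type_create_resized takes the lower bound and extent in bytes, as INTEGER(KIND=MPI_ADDRESS_KIND), whereas I am passing an element count as a default integer. Something like the following is what I would expect, if the goal is for successive displacements of the per-slice type to step one pair slice forward in dCpqdR (again only a sketch; the 8 assumes 8-byte double precision, and slice_typ is the placeholder type from above):

    ! Sketch only: lower bound and extent in bytes, using the MPI_ADDRESS_KIND integer kind.
    integer(kind=MPI_ADDRESS_KIND) :: lb, slice_extent

    lb = 0_MPI_ADDRESS_KIND
    slice_extent = int(3*nat_sl, MPI_ADDRESS_KIND) * int(3*nat_sl, MPI_ADDRESS_KIND) &
                   * 8_MPI_ADDRESS_KIND           ! bytes in one (3*nat_sl x 3*nat_sl) slice
    call MPI_Type_create_resized(slice_typ, lb, slice_extent, resized_subarr, ierr)
    call MPI_Type_commit(resized_subarr, ierr)

I realise the receive side would presumably need its own matching type and count as well, since my_dCpqdR is dimensioned with n_pairs(me_image+1) rather than nat along the third dimension, but I have not got that far.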
But this gives me a number of errors, including:
[MathBook Pro:58100] *** An error occurred in MPI_Type_create_subarray
[MathBook Pro:58100] *** reported by process [2560884737,2314885530279477248]
[MathBook Pro:58100] *** on communicator MPI_COMM_WORLD
[MathBook Pro:58100] *** MPI_ERR_ARG: invalid argument of some other kind
[MathBook Pro:58100] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[MathBook Pro:58100] *** and potentially your MPI job)
In any case, I believe the error lies in how I am handling the memory layout. Let me know if you need any more information from me; I look forward to any suggestions you may have.