I am new to parallel programming and MPI, and I am stuck on this possibly easy problem ...
I am trying to write code that evolves a system of N gravitationally interacting particles forward in time. This is quite easy with a naive algorithm, which is what I have done. But now I want to parallelize my code. Specifically, I am writing in Python using mpi4py. A simplified (and heavily optimizable, but that is not the point), non-parallel implementation would look something like this:
# pos and vel are arrays storing the positions and velocities of all particles
dt = 0.01 # The time step
for i in range(N):
    force = 0  # Accumulate the total force on the i'th particle
    for j in range(N):
        if i == j:
            continue
        # Calculate force between i'th and j'th particle
        r = pos[j] - pos[i]
        force -= r/norm(r)**3
    # Update velocity
    vel[i] -= dt*force
# Now use vel to update pos ...
How do I go about parallelizing this algorithm? Since the number of particles N could be very large, I want to store pos and vel only on the root process, to save memory. My initial thought was to parallelize the i-loop, but every process still needs access to pos and vel in their entirety! That is, a Scatter/Gather scheme is not helpful.
Am I forced to have a copy of pos and vel in memory for every process, or is there some way out of this?
A simple way out would be to share the memory containing pos and vel across all processes, without making duplicates. I do not know if this is possible with MPI (and specifically mpi4py).
Any help would be gratefully accepted!