I am new to parallel programming and MPI, and I am stuck on this possibly easy problem ...
I am trying to write code that evolves a system of N gravitationally interacting particles forward in time. This is quite easy with a naive algorithm, which is what I have done. But now I want to parallelize my code. Specifically, I am writing in Python using mpi4py. A simplified (and heavily optimizable, but that is not the point), non-parallel implementation would look something like this:
# pos and vel are arrays storing the positions and velocities of all particles
dt = 0.01 # The time step
for i in range(N):
    force = 0  # Accumulate the total force on the i'th particle
    for j in range(N):
        if i == j:
            continue
        # Calculate force between i'th and j'th particle
        r = pos[j] - pos[i]
        force -= r/norm(r)**3
    # Update velocity
    vel[i] -= dt*force
# Now use vel to update pos ...
How do I go about parallelizing this algorithm? Since the number of particles N could be very large, I want to store pos and vel only on the root process, to save memory. My initial thought was to parallelize the i-loop, but every process still needs access to pos and vel in their entirety! That is, a Scatter/Gather scheme is not helpful.
Am I forced to have a copy of pos and vel in memory for every process, or is there some way out of this?
A simple way out would be to share the memory containing pos and vel across all processes, without making duplicates. I do not know if this is possible with MPI (and specifically mpi4py).
Any help would be gratefully accepted!