MPI
Work distribution and shared memory.
- class elphmod.MPI.Buffer(buf=None)[source]
Wrapper for pickle in parallel context.
- Parameters:
- elphmod.MPI.distribute(size, bounds=False, comm=<elphmod.MPI.Communicator object>, chunks=None)[source]
Distribute work among processes.
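A hedged usage sketch, assuming that the module-level default communicator is available as elphmod.MPI.comm and that distribute returns the number of tasks per process (plus, with bounds=True, the cumulative task bounds):

    from elphmod import MPI

    comm = MPI.comm

    # Split 100 tasks as evenly as possible among all processes
    # (return values as assumed above):
    sizes, bounds = MPI.distribute(100, bounds=True)

    # Each process handles the tasks between its bounds:
    for task in range(bounds[comm.rank], bounds[comm.rank + 1]):
        pass  # do work on this task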
- elphmod.MPI.elphmodenv(num_threads=1)[source]
Print commands to change number of threads.
Run eval $(elphmodenv) before your Python scripts to prevent NumPy from using multiple processors for linear algebra. This is advantageous when already parallelizing with MPI (mpirun python3 script.py) or on shared servers, where CPU resources must be used with restraint.
Combine eval $(elphmodenv X) with mpirun -n Y python3 script.py, where the product of X and Y is the number of available CPUs, for hybrid parallelization.
- elphmod.MPI.info(message, error=False, comm=<elphmod.MPI.Communicator object>)[source]
Print status message from first process.
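A minimal usage sketch; since only the first process prints, the message appears once even when the script runs under mpirun:

    from elphmod import MPI

    MPI.info('Setting up model')  # printed once, by the first process only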
- elphmod.MPI.load(filename, shared_memory=False, comm=<elphmod.MPI.Communicator object>)[source]
Read and broadcast NumPy data.
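A hedged usage sketch, assuming the file is a NumPy file (e.g., written with numpy.save); the filename 'data.npy' is only a placeholder:

    from elphmod import MPI

    # The first process reads the file; the data is broadcast to all processes:
    data = MPI.load('data.npy')

    # Alternatively, place the array in node-local shared memory:
    data = MPI.load('data.npy', shared_memory=True)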
- elphmod.MPI.shared_array(shape, dtype=<class 'float'>, shared_memory=True, single_memory=False, comm=<elphmod.MPI.Communicator object>)[source]
Create array whose memory is shared among all processes on same node.
With shared_memory=False (single_memory=True) a conventional array is created on each (only one) processor, which nevertheless allows for the same broadcasting syntax as shown below.

    # Set up huge array:
    node, images, array = shared_array(2 ** 30, dtype=np.uint8)

    # Write data on one node:
    if comm.rank == 0:
        array[:] = 0

    # Broadcast data to other nodes:
    if node.rank == 0:
        images.Bcast(array)

    # Wait if node.rank != 0:
    comm.Barrier()
- elphmod.MPI.shm_split(comm=<elphmod.MPI.Communicator object>, shared_memory=True)[source]
Create communicators for use with shared memory.
- Parameters:
- comm : MPI.Intracomm
Overarching communicator.
- shared_memory : bool, default True
Use shared memory? Provided for convenience.
- Returns:
- node : MPI.Intracomm
Communicator between processes that share memory (on the same node).
- images : MPI.Intracomm
Communicator between processes that have the same node.rank.
Warning
If shared memory is not implemented, each process shares memory only with itself. A warning is issued.
Notes
Visualization for a machine with 2 nodes with 4 processors each:
 ________________ ________________ ________________ ________________
| comm.rank:   0 | comm.rank:   1 | comm.rank:   2 | comm.rank:   3 |
| node.rank:   0 | node.rank:   1 | node.rank:   2 | node.rank:   3 |
| images.rank: 0 | images.rank: 0 | images.rank: 0 | images.rank: 0 |
|________________|________________|________________|________________|
 ________________ ________________ ________________ ________________
| comm.rank:   4 | comm.rank:   5 | comm.rank:   6 | comm.rank:   7 |
| node.rank:   0 | node.rank:   1 | node.rank:   2 | node.rank:   3 |
| images.rank: 1 | images.rank: 1 | images.rank: 1 | images.rank: 1 |
|________________|________________|________________|________________|
Since both node.rank and images.rank are sorted by comm.rank, comm.rank == 0 is equivalent to node.rank == images.rank == 0.
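A hedged usage sketch, assuming the module-level default communicator is available as elphmod.MPI.comm; it only splits the communicator and reports the layout shown in the visualization above:

    from elphmod import MPI

    comm = MPI.comm

    # Split the overarching communicator into node-local and cross-node parts:
    node, images = MPI.shm_split(comm)

    # For the layout above, node.size is the number of processes per node and
    # images.size the number of nodes (assuming equally sized nodes):
    MPI.info('Processes per node: %d, number of nodes: %d'
        % (node.size, images.size))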