Message Passing Interface (MPI)
Jonathan Carroll-Nellenback
CIRC Summer School
Background
• MPI – Message Passing Interface
• Language-independent communications protocol
• MPI-1, MPI-2, and MPI-3 standards
• Implementations typically consist of a specific set of routines callable from C, C++, or Fortran
  • MPICH 3.1.2 (implements MPI-3)
  • Open MPI 1.8 (implements MPI-3)
  • Intel MPI
  • MVAPICH
  • Commercial implementations
• Bindings available for Python, Java, etc.
http://web.eecs.utk.edu/~dongarra/WEB-PAGES/SPRING-2006/mpi-quick-ref.pdf
Outline
• First MPI program: /public/jcarrol5/mpi/example1.f90 (a C sketch follows below)
  • MPI_Init – initializes MPI
  • MPI_Comm_rank – gets the calling task's rank within the communicator (starting at 0)
  • MPI_Comm_size – gets the number of tasks in the communicator
  • MPI_Finalize – shuts down MPI
  • MPI_Reduce – collective communication
  • MPI_Allreduce – collective communication
• Compiling and running
  • module load openmpi
  • mpif90 example1.f90 (mpicc example1.c)
  • srun -p debug -n 4 -o output_%t a.out
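For reference, here is a minimal C sketch of the routines on this slide. It is illustrative only, not the course's example1.f90; the file name hello_mpi.c is made up.

    /* hello_mpi.c - minimal sketch of MPI_Init, MPI_Comm_rank,
       MPI_Comm_size, MPI_Allreduce, and MPI_Finalize. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank, size, sum;

        MPI_Init(&argc, &argv);                  /* initialize MPI */
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);    /* this task's rank, 0..size-1 */
        MPI_Comm_size(MPI_COMM_WORLD, &size);    /* number of tasks */

        printf("Hello from rank %d of %d\n", rank, size);

        /* MPI_Allreduce: every rank contributes its rank number and every
           rank receives the sum; MPI_Reduce would deliver it to one root. */
        MPI_Allreduce(&rank, &sum, 1, MPI_INT, MPI_SUM, MPI_COMM_WORLD);
        printf("Rank %d sees sum of ranks = %d\n", rank, sum);

        MPI_Finalize();                          /* shut down MPI */
        return 0;
    }

Compile and run it just as above: mpicc hello_mpi.c, then srun -p debug -n 4 ./a.out.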
Exercise 1
• Parallelize exercise1.f90 or exercise1.c
Collective Communication Routines
/public/jcarrol5/mpi/example2.f90
• 1 to all
  • MPI_Bcast – broadcasts the same data to all ranks
  • MPI_Scatter – evenly distributes data to all ranks
  • MPI_Scatterv – unevenly distributes data to all ranks
• All to 1
  • MPI_Reduce – performs a reduction operation towards a single rank
  • MPI_Gather – collects evenly distributed data on one rank
  • MPI_Gatherv – collects unevenly distributed data on one rank
• All to all
  • MPI_Allreduce – performs a reduction and broadcasts the result
  • MPI_Allgather – collects evenly distributed data onto all ranks
  • MPI_Allgatherv – collects unevenly distributed data onto all ranks
  • MPI_Alltoall – combined scatter/gather: every rank sends distinct data to every rank
  • MPI_Alltoallv – combined scatterv/gatherv
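A minimal sketch of the 1-to-all and all-to-1 patterns (illustrative, not the course's example2.f90): rank 0 scatters an array, each rank sums its piece, and MPI_Reduce combines the partial sums back on rank 0.

    /* scatter_reduce.c - illustrative MPI_Scatter + MPI_Reduce sketch */
    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char **argv)
    {
        int rank, size;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        const int n = 4;                  /* elements per rank */
        int *data = NULL;
        if (rank == 0) {                  /* only the root fills the full array */
            data = malloc(n * size * sizeof(int));
            for (int i = 0; i < n * size; i++) data[i] = i;
        }

        int local[4];
        /* root sends n elements to each rank, including itself */
        MPI_Scatter(data, n, MPI_INT, local, n, MPI_INT, 0, MPI_COMM_WORLD);

        int local_sum = 0, total = 0;
        for (int i = 0; i < n; i++) local_sum += local[i];

        /* combine the partial sums onto rank 0 */
        MPI_Reduce(&local_sum, &total, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);
        if (rank == 0) printf("total = %d\n", total);

        free(data);                       /* free(NULL) is a no-op off-root */
        MPI_Finalize();
        return 0;
    }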
Important Constants
• Reduction operations – MPI_MAX, MPI_MIN, MPI_SUM, MPI_PROD, MPI_BAND, MPI_BOR, MPI_BXOR, MPI_LAND, MPI_LOR, MPI_LXOR
• C data types – MPI_CHAR, MPI_SHORT, MPI_INT, MPI_LONG, MPI_UNSIGNED_CHAR, MPI_UNSIGNED_SHORT, MPI_UNSIGNED, MPI_UNSIGNED_LONG, MPI_FLOAT, MPI_DOUBLE, MPI_LONG_DOUBLE, MPI_BYTE, MPI_PACKED
• Fortran data types – MPI_CHARACTER, MPI_INTEGER, MPI_REAL, MPI_LOGICAL, MPI_INTEGER1, MPI_INTEGER2, MPI_INTEGER4, MPI_REAL2, MPI_REAL4, MPI_REAL8, MPI_DOUBLE_PRECISION, MPI_COMPLEX, MPI_DOUBLE_COMPLEX, MPI_BYTE, MPI_PACKED
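These constants supply the op and datatype arguments of the reduction calls. For example, a small helper (hypothetical, not from the course files) that finds the global maximum of a per-rank double:

    #include <mpi.h>

    /* Global maximum of a per-rank value: MPI_MAX is the reduction op,
       MPI_DOUBLE the datatype matching the C buffers. */
    double global_max(double local, MPI_Comm comm)
    {
        double result;
        MPI_Allreduce(&local, &result, 1, MPI_DOUBLE, MPI_MAX, comm);
        return result;
    }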
Exercise 2
• Parallelize exercise2.f90 or exercise2.c
Basic Sending and Receiving
/public/jcarrol5/mpi/example3.f90
• Tags – additional identifiers attached to messages
• MPI_Send – blocking send
• MPI_Recv – blocking receive
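A minimal point-to-point sketch in C (illustrative, not the course's example3.f90): rank 0 sends one integer to rank 1, and the tag must match on both sides.

    /* pingpong.c - illustrative MPI_Send / MPI_Recv with a tag */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank;
        const int TAG = 99;        /* arbitrary identifier; must match on both ends */
        MPI_Status status;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        if (rank == 0) {
            int msg = 42;
            MPI_Send(&msg, 1, MPI_INT, 1, TAG, MPI_COMM_WORLD);   /* to rank 1 */
        } else if (rank == 1) {
            int msg;
            /* source and tag must match (MPI_ANY_SOURCE / MPI_ANY_TAG also work) */
            MPI_Recv(&msg, 1, MPI_INT, 0, TAG, MPI_COMM_WORLD, &status);
            printf("rank 1 received %d\n", msg);
        }

        MPI_Finalize();
        return 0;
    }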
Exercise 3
• Modify your program from exercise 2 so that it does not use any collective communication routines.
Sending Modes
• Blocking vs. non-blocking
  • Non-blocking sends and receives return control to the calling routine immediately, but they usually require buffering and a later test to see whether the send/recv has completed.
  • Good for overlapping communication with computation (see the sketch below)
  • May lead to extra buffering
• Synchronous vs. asynchronous
  • A synchronous send does not return until a matching recv has been posted, so it blocks only if the recv has not yet been posted. It requires no additional buffering.
• Buffered vs. non-buffered
  • A buffered send explicitly copies the data into a buffer so that the calling routine can immediately reuse the memory.
• Ready send
  • Assumes that the receiver has already posted the matching recv.
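The overlap idea looks like this in outline form (a sketch under assumed names; exchange_and_compute, sendbuf, recvbuf, and neighbor are illustrative):

    #include <mpi.h>

    /* Post nonblocking communication, compute on independent data,
       then wait; the buffers must not be touched until MPI_Waitall. */
    void exchange_and_compute(double *sendbuf, double *recvbuf, int n,
                              int neighbor, MPI_Comm comm)
    {
        MPI_Request reqs[2];

        MPI_Irecv(recvbuf, n, MPI_DOUBLE, neighbor, 0, comm, &reqs[0]);
        MPI_Isend(sendbuf, n, MPI_DOUBLE, neighbor, 0, comm, &reqs[1]);

        /* ... computation that does not depend on sendbuf/recvbuf ... */

        MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE);
    }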
Send Routines
/public/jcarrol5/mpi/example4.f90
• MPI_Send – standard send; may or may not block
• MPI_Bsend – buffered send; returns immediately (see the buffered-send sketch below)
• MPI_Ssend – synchronous send (returns after the matching recv is posted)
• MPI_Rsend – ready send (the matching recv must already be posted)
• MPI_Isend – nonblocking send (must check later for completion)
• MPI_Ibsend – nonblocking buffered send
• MPI_Issend – nonblocking synchronous send
• MPI_Irsend – nonblocking ready send
• MPI_Recv – blocking receive
• MPI_Irecv – nonblocking receive
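One wrinkle worth showing: MPI_Bsend requires the user to attach buffer space first. A minimal sketch (the function name and message are made up):

    #include <mpi.h>
    #include <stdlib.h>

    /* Attach buffer space, Bsend (returns once the message is copied out),
       then detach; MPI_Buffer_detach blocks until buffered sends drain. */
    void bsend_example(int dest, MPI_Comm comm)
    {
        int msg = 7;
        int bufsize = MPI_BSEND_OVERHEAD + sizeof(int);
        void *buf = malloc(bufsize);

        MPI_Buffer_attach(buf, bufsize);
        MPI_Bsend(&msg, 1, MPI_INT, dest, 0, comm);
        MPI_Buffer_detach(&buf, &bufsize);
        free(buf);
    }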
Exercise 5
• Rewrite exercise 3 using ready sends (MPI_Rsend), synchronous sends (MPI_Ssend), and nonblocking sends (MPI_Isend) and see whether any of them is faster.
Communicators and Groups
/public/jcarrol5/mpi/example5.f90
• MPI starts with one communicator containing all tasks (MPI_COMM_WORLD)
• Separate communicators can be formed using MPI_Comm_split (sketch below)
• Alternatively, you can extract the group belonging to MPI_COMM_WORLD and create subgroups through various group-manipulation routines
• Multiple communicators can use the same group
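A minimal MPI_Comm_split sketch (illustrative, not the course's example5.f90): split MPI_COMM_WORLD into "even" and "odd" sub-communicators.

    /* comm_split.c - illustrative MPI_Comm_split example */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int world_rank, sub_rank;
        MPI_Comm subcomm;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

        /* color selects which new communicator a rank joins;
           key (here world_rank) orders the ranks within it */
        int color = world_rank % 2;
        MPI_Comm_split(MPI_COMM_WORLD, color, world_rank, &subcomm);

        MPI_Comm_rank(subcomm, &sub_rank);
        printf("world rank %d -> rank %d in subcomm %d\n",
               world_rank, sub_rank, color);

        MPI_Comm_free(&subcomm);
        MPI_Finalize();
        return 0;
    }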