Message Passing Interface (MPI)
Jonathan Carroll-Nellenback
CIRC Summer School
Review
• Global Communicator
  • MPI_COMM_WORLD
• Global Communication Routines:
  • [All]Gather[v]
  • Scatter[v]
  • [All]Reduce[v]
  • Alltoall[v]
  • Bcast
  • Barrier
• Reduction Operators
  • MPI_[MAX,MIN,SUM,PROD], MPI_[B,L][AND,OR,XOR]
• Basic Data Types (put MPI_ in front of the name of the data type)
  • Fortran – MPI_[CHARACTER,INTEGER,REAL,LOGICAL,...]
  • C – MPI_[CHAR,SHORT,INT,LONG,...]
Types of MPI Arguments
• Send Buffer – The starting address of the data to be sent
• Send Count – The number of elements in the send buffer
• Send Type – The type of the elements in the send buffer
• Recv Buffer – The starting address of the recv buffer
• Recv Count – The number of elements to recv
• Recv Type – The type of the elements to recv
• Displacements – The offsets for Gatherv, Scatterv, etc.
• Tag – A message identifier
• Root – The '1' in all-to-1 or 1-to-all communication
• Dest – The destination for a point-to-point send
• Source – The source for a point-to-point recv
• Communicator – An independent collection of MPI tasks
• Request – A handle to keep track of non-blocking sends or receives
• Status – The status of a non-blocking send or any receive
• (See the annotated call below for how these line up in practice.)
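To make the argument roles concrete, here is a minimal sketch of an MPI_Gatherv call with each argument annotated. It is illustrative only: the program name, variable names, and the choice of per-rank counts are assumptions, not part of the course examples.

program gatherv_args_sketch
   use mpi
   implicit none
   integer :: err, rank, nprocs, i
   integer, allocatable :: counts(:), displs(:)
   real, allocatable :: sendbuf(:), recvbuf(:)
   integer, parameter :: root = 0                     ! Root - the '1' in all-to-1 communication

   call MPI_Init(err)
   call MPI_Comm_rank(MPI_COMM_WORLD, rank, err)
   call MPI_Comm_size(MPI_COMM_WORLD, nprocs, err)

   ! each rank contributes rank+1 values (an arbitrary, uneven distribution)
   allocate(sendbuf(rank+1))
   sendbuf = real(rank)

   allocate(counts(nprocs), displs(nprocs))
   counts = (/ (i, i=1,nprocs) /)                     ! Recv Counts - elements expected from each rank
   displs = (/ 0, (sum(counts(1:i)), i=1,nprocs-1) /) ! Displacements - offsets into the recv buffer
   allocate(recvbuf(sum(counts)))

   call MPI_Gatherv(sendbuf, rank+1, MPI_REAL,  &     ! send buffer, send count, send type
                    recvbuf, counts, displs,    &     ! recv buffer, recv counts, displacements
                    MPI_REAL, root,             &     ! recv type, root
                    MPI_COMM_WORLD, err)              ! communicator, error return

   if (rank == root) print *, 'gathered', size(recvbuf), 'values'
   call MPI_Finalize(err)
end program gatherv_args_sketch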
Various ways to parallelize DO loops

! serial loop
DO i=1,n
  a(i)=f(i)
END DO

! cyclic distribution of iterations, combined with an all-reduce
a=0
DO i=rank+1,n,procs
  a(i)=f(i)
END DO
CALL MPI_Allreduce(MPI_IN_PLACE, a, n, MPI_REAL, MPI_SUM, MPI_COMM_WORLD, err)

! block distribution of iterations, combined with an all-gather (assumes procs divides n)
m=n/procs
ALLOCATE(b(m))
DO i=1,m
  b(i)=f(m*rank+i)
END DO
CALL MPI_Allgather(b, m, MPI_REAL, a, m, MPI_REAL, MPI_COMM_WORLD, err)
DEALLOCATE(b)
Gather vs. Gatherv

! Gather/Allgather: every rank contributes the same number of elements
! (assumes procs divides n)
m=n/procs
ALLOCATE(b(m))
DO i=1,m
  b(i)=f(m*rank+i)
END DO
CALL MPI_Allgather(b, m, MPI_REAL, a, m, MPI_REAL, MPI_COMM_WORLD, err)
DEALLOCATE(b)

! Gatherv/Allgatherv: per-rank counts and displacements, so procs need not divide n
! (ranks are 0-based, so this rank's entries are sizes(rank+1) and displacements(rank+1))
m=n/procs
rem=mod(n,procs)
ALLOCATE(sizes(procs), displacements(procs+1))
sizes=(/(m+1,i=1,rem),(m,i=rem+1,procs)/)
displacements=(/0,(sum(sizes(1:i)), i=1,procs)/)
ALLOCATE(b(sizes(rank+1)))
DO i=1,sizes(rank+1)
  b(i)=f(displacements(rank+1)+i)
END DO
CALL MPI_Allgatherv(b, sizes(rank+1), MPI_REAL, a, sizes, displacements, &
     MPI_REAL, MPI_COMM_WORLD, err)
DEALLOCATE(b,sizes,displacements)
Amdahl's Law (The Hard Truth)
The speedup S expected for a program run on n processors, where P is the fraction of the program that runs in parallel:

S = 1 / ((1 - P) + P/n)

Glass half full: speedup keeps growing as you add processors. Glass half empty: it can never exceed 1/(1 - P), no matter how many processors you use.
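As a quick worked example (the numbers are chosen for illustration, not taken from the slide), suppose P = 0.95 and n = 16:

S(n) = \frac{1}{(1-P) + P/n}, \qquad
S(16) = \frac{1}{0.05 + 0.95/16} \approx 9.1, \qquad
\lim_{n \to \infty} S(n) = \frac{1}{1-P} = 20

So even a code that is 95% parallel gets only about a 9x speedup on 16 processors and tops out at 20x no matter how many are added.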
Measuring Performance
• Fortran:
  • time = MPI_WTIME()
• C:
  • time = MPI_Wtime();
• MPI_Wtime is a function (not a subroutine) in both bindings; it returns wall-clock time in seconds as a double.
• Measure the performance of exercise2p.f90 or exercise2p.c
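A minimal sketch of timing a code region with MPI_Wtime; the program name and the loop being timed are placeholders, not part of the course examples.

program timing_sketch
   use mpi
   implicit none
   integer :: err, rank, i
   double precision :: t0, t1
   real :: s

   call MPI_Init(err)
   call MPI_Comm_rank(MPI_COMM_WORLD, rank, err)

   call MPI_Barrier(MPI_COMM_WORLD, err)   ! start all ranks from the same point
   t0 = MPI_Wtime()

   s = 0.0                                 ! placeholder work to be timed
   do i = 1, 10000000
      s = s + sin(real(i))
   end do

   t1 = MPI_Wtime()
   if (rank == 0) print *, 'elapsed seconds:', t1 - t0, ' (s =', s, ')'

   call MPI_Finalize(err)
end program timing_sketch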
Exercise 3
• Parallelize exercise3.f90 using MPI_Reduce and measure the scaling with N=512 and N=1024 on 1, 4, and 16 procs.
Basic Sending and Receiving
• /public/jcarrol5/mpi/example4.f90
• Tags – additional identifiers on messages
• MPI_Send
• MPI_Recv
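A minimal point-to-point sketch showing the calling pattern (this is an illustration, not the contents of example4.f90): rank 0 sends one integer to rank 1, which receives it with a matching tag. It assumes at least two ranks.

program sendrecv_sketch
   use mpi
   implicit none
   integer :: err, rank, nprocs, val
   integer :: status(MPI_STATUS_SIZE)
   integer, parameter :: tag = 17          ! arbitrary message identifier

   call MPI_Init(err)
   call MPI_Comm_rank(MPI_COMM_WORLD, rank, err)
   call MPI_Comm_size(MPI_COMM_WORLD, nprocs, err)

   if (rank == 0) then
      val = 42
      call MPI_Send(val, 1, MPI_INTEGER, 1, tag, MPI_COMM_WORLD, err)
   else if (rank == 1) then
      call MPI_Recv(val, 1, MPI_INTEGER, 0, tag, MPI_COMM_WORLD, status, err)
      print *, 'rank 1 received', val
   end if

   call MPI_Finalize(err)
end program sendrecv_sketch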
Exercise 4
• Modify your program from exercise 2 to use only point-to-point communication routines. (You can start with exercise2p.f90 or exercise2p.c.)
Sending Modes
• Blocking vs. non-blocking
  • Non-blocking sends and receives return control to the calling routine immediately. However, they usually require buffering and a later test to see whether the send/recv has completed.
  • Good for overlapping communication with computation
  • May lead to extra buffering
• Synchronous vs. asynchronous
  • Synchronous sends require a matching recv to be posted before returning. Blocking only if the recv has not been posted. Does not require any additional buffering.
• Buffered vs. non-buffered
  • Buffered sends explicitly buffer the data to be sent so that the calling routine can release the memory.
• Ready send
  • Assumes that the receiver has already posted the recv.
Send Routines
• /public/jcarrol5/mpi/example4.f90
• MPI_Send – May or may not block
• MPI_Bsend – May buffer – returns immediately
• MPI_Ssend – Synchronous send (returns after the matching recv is posted)
• MPI_Rsend – Ready send (matching recv must already be posted)
• MPI_Isend – Non-blocking send (must check for completion)
• MPI_Ibsend – Non-blocking buffered send
• MPI_Issend – Non-blocking synchronous send
• MPI_Irsend – Non-blocking ready send
• MPI_Recv – Blocking receive
• MPI_Irecv – Non-blocking receive
• (A non-blocking exchange is sketched below.)
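A minimal non-blocking sketch (illustrative, not example4.f90): each of two ranks posts an MPI_Irecv and an MPI_Isend for a neighbor exchange, then waits on both requests before using the data. It assumes exactly two ranks.

program nonblocking_sketch
   use mpi
   implicit none
   integer :: err, rank, other
   integer :: requests(2), statuses(MPI_STATUS_SIZE, 2)
   real :: sendval, recvval

   call MPI_Init(err)
   call MPI_Comm_rank(MPI_COMM_WORLD, rank, err)

   other = 1 - rank                        ! partner rank (assumes exactly 2 ranks)
   sendval = real(rank)

   ! post the receive first, then the send; neither call blocks
   call MPI_Irecv(recvval, 1, MPI_REAL, other, 0, MPI_COMM_WORLD, requests(1), err)
   call MPI_Isend(sendval, 1, MPI_REAL, other, 0, MPI_COMM_WORLD, requests(2), err)

   ! ... independent computation could overlap with the communication here ...

   call MPI_Waitall(2, requests, statuses, err)   ! both buffers are now safe to reuse
   print *, 'rank', rank, 'received', recvval

   call MPI_Finalize(err)
end program nonblocking_sketch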
Exercise 5
• Rewrite exercise 3 using ready sends (Rsend), synchronous sends (Ssend), and non-blocking sends (Isend) and see if it is any faster.
Communicators and Groups
• /public/jcarrol5/mpi/example5.f90
• MPI starts with one communicator (MPI_COMM_WORLD).
• Separate communicators can be formed using MPI_Comm_split (sketched below),
• or you can extract the group belonging to MPI_COMM_WORLD and create subgroups through various routines.
• Multiple communicators can use the same group.
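A minimal MPI_Comm_split sketch (illustrative, not the contents of example5.f90): split MPI_COMM_WORLD into two sub-communicators by even/odd rank and report each rank's position within its new communicator.

program split_sketch
   use mpi
   implicit none
   integer :: err, world_rank, color, newcomm, new_rank, new_size

   call MPI_Init(err)
   call MPI_Comm_rank(MPI_COMM_WORLD, world_rank, err)

   color = mod(world_rank, 2)     ! 0 = even ranks, 1 = odd ranks
   ! ranks with the same color land in the same new communicator;
   ! the key (here world_rank) determines their ordering within it
   call MPI_Comm_split(MPI_COMM_WORLD, color, world_rank, newcomm, err)

   call MPI_Comm_rank(newcomm, new_rank, err)
   call MPI_Comm_size(newcomm, new_size, err)
   print *, 'world rank', world_rank, '-> color', color, &
            'is rank', new_rank, 'of', new_size

   call MPI_Comm_free(newcomm, err)
   call MPI_Finalize(err)
end program split_sketch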