
Lecture 6: Message Passing Interface (MPI)



1. Lecture 6: Message Passing Interface (MPI)

2. Parallel Programming Models

Message Passing Model
• Used on distributed-memory MIMD architectures
• Multiple processes execute in parallel, asynchronously
• Process creation may be static or dynamic
• Processes communicate by using send and receive primitives

3. Parallel Programming Models

Example: Pi calculation

π = ∫₀¹ f(x) dx = ∫₀¹ 4/(1+x²) dx ≈ w ∑ f(xᵢ)

where f(x) = 4/(1+x²), n = 10, w = 1/n, xᵢ = w(i − 0.5)

(Figure: plot of f(x) over [0, 1] showing the midpoint sample points xᵢ with spacing w.)

4. Parallel Programming Models

Sequential Code

#include <stdio.h>
#define f(x) (4.0/(1.0+x*x))

int main(){
  int n, i;
  float w, x, sum, pi;
  printf("n?\n");
  scanf("%d", &n);
  w = 1.0/n;              /* width of each interval */
  sum = 0.0;
  for (i = 1; i <= n; i++){
    x = w*(i-0.5);        /* midpoint of the i-th interval */
    sum += f(x);
  }
  pi = w*sum;
  printf("%f\n", pi);
  return 0;
}

π ≈ w ∑ f(xᵢ), with f(x) = 4/(1+x²), n = 10, w = 1/n, xᵢ = w(i − 0.5)

  5. Message-Passing Interface (MPI) http://www.mpi-forum.org

6. SPMD Parallel MPI Code

#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>
#define f(x) (4.0/(1.0+x*x))

int main(int argc, char *argv[]){
  int myid, nproc, root, err;
  int n, i, start, end;
  float w, x, sum, pi;
  FILE *f1;

  err = MPI_Init(&argc, &argv);
  if (err != MPI_SUCCESS) {
    fprintf(stderr, "initialization error\n");
    exit(1);
  }
  MPI_Comm_size(MPI_COMM_WORLD, &nproc);
  MPI_Comm_rank(MPI_COMM_WORLD, &myid);
  root = 0;
  if (myid == root) {                       /* only the root reads the input */
    f1 = fopen("indata", "r");
    fscanf(f1, "%d", &n);
    fclose(f1);
  }
  MPI_Bcast(&n, 1, MPI_INT, root, MPI_COMM_WORLD);   /* every process needs n */
  w = 1.0/n;
  sum = 0.0;
  start = myid*(n/nproc);                   /* each process sums its own block of intervals */
  end = (myid+1)*(n/nproc);
  for (i = start; i < end; i++){
    x = w*(i+0.5);                          /* midpoint; i starts at 0 here, unlike the sequential loop */
    sum += f(x);
  }
  MPI_Reduce(&sum, &pi, 1, MPI_FLOAT, MPI_SUM, root, MPI_COMM_WORLD);
  if (myid == root) {
    pi = w*pi;                              /* scale by the interval width, as in the sequential code */
    f1 = fopen("outdata", "w");
    fprintf(f1, "pi=%f", pi);
    fclose(f1);
  }
  MPI_Finalize();
  return 0;
}

7. Message-Passing Interface (MPI)

• MPI_INIT (int *argc, char ***argv): Initiate an MPI computation.
• MPI_FINALIZE (): Terminate a computation.
• MPI_COMM_SIZE (comm, size): Determine number of processes.
• MPI_COMM_RANK (comm, pid): Determine my process identifier.
• MPI_SEND (buf, count, datatype, dest, tag, comm): Send a message.
• MPI_RECV (buf, count, datatype, source, tag, comm, status): Receive a message.
• tag: message tag, or MPI_ANY_TAG
• source: process id of the source process, or MPI_ANY_SOURCE
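
As a sketch of how these calls fit together, here is a minimal two-process send/receive program in C; the payload value, tag 0, and the fixed ranks 0 and 1 are illustrative choices, not part of the slides:

#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[]){
  int rank, data;
  MPI_Status status;

  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);

  if (rank == 0) {
    data = 42;                                                  /* arbitrary payload */
    MPI_Send(&data, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);          /* to rank 1, tag 0 */
  } else if (rank == 1) {
    MPI_Recv(&data, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &status); /* from rank 0 */
    printf("rank 1 received %d from rank %d\n", data, status.MPI_SOURCE);
  }

  MPI_Finalize();
  return 0;
}

Run with at least two processes (e.g., mpirun -np 2 ./a.out); every rank other than 0 and 1 simply calls MPI_Init and MPI_Finalize.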

8. Message-Passing Interface (MPI)

Deadlock:
• MPI_SEND and MPI_RECV are blocking. Consider the program where the two processes exchange data:

...
if (rank .eq. 0) then
  call mpi_send( abuf, n, MPI_INTEGER, 1, 0, MPI_COMM_WORLD, ierr )
  call mpi_recv( buf, n, MPI_INTEGER, 1, 0, MPI_COMM_WORLD, status, ierr )
else if (rank .eq. 1) then
  call mpi_send( abuf, n, MPI_INTEGER, 0, 0, MPI_COMM_WORLD, ierr )
  call mpi_recv( buf, n, MPI_INTEGER, 0, 0, MPI_COMM_WORLD, status, ierr )
endif

Both processes send first; if the messages are too large to be buffered by the system, each send blocks waiting for a receive that is never posted, and the program deadlocks.
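
A deadlock-free variant, sketched in C to match the other examples in this lecture (the buffer size n = 4 and tag 0 are illustrative): MPI_Sendrecv posts the send and the receive together, so the library can pair them up regardless of order.

#include <mpi.h>

int main(int argc, char *argv[]){
  int rank, n = 4;
  int abuf[4] = {1, 2, 3, 4}, buf[4];
  MPI_Status status;

  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);

  if (rank == 0 || rank == 1) {
    int partner = 1 - rank;                     /* rank 0 exchanges with rank 1 */
    MPI_Sendrecv(abuf, n, MPI_INT, partner, 0,  /* what we send, and to whom      */
                 buf,  n, MPI_INT, partner, 0,  /* what we receive, and from whom */
                 MPI_COMM_WORLD, &status);
  }

  MPI_Finalize();
  return 0;
}

Reordering so that one process receives first (rank 0: send then receive; rank 1: receive then send) achieves the same effect with plain MPI_Send/MPI_Recv.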

9. Message-Passing Interface (MPI)

Communicators
• If two processes use different contexts for communication, there can be no danger of their communication being confused.
• Each MPI communicator contains a separate communication context; this defines a separate virtual communication space.
• Communicator handle: identifies the process group and context with respect to which the operation is to be performed.
• MPI_COMM_WORLD: contains all the processes in a parallel computation.

10. Message-Passing Interface (MPI)

Collective Operations
These operations are all executed in a collective fashion, meaning that each process in a process group calls the communication routine:
• Barrier: Synchronize all processes.
• Broadcast: Send data from one process to all processes.
• Gather: Gather data from all processes to one process.
• Scatter: Scatter data from one process to all processes.
• Reduction operations: addition, multiplication, etc. of distributed data.

11. Message-Passing Interface (MPI)

Collective Operations
• Barrier (comm): Synchronize all processes.

12. Message-Passing Interface (MPI)

Collective Operations
• MPI_BCAST (inbuf, incnt, intype, root, comm): 1-to-all
Ex: MPI_BCAST(A, 5, MPI_INT, 0, MPI_COMM_WORLD);

(Figure: the 5-element array A0…A4 held by the root P0 is copied to every process P0–P3.)

13. Message-Passing Interface (MPI)

Collective Operations
• MPI_SCATTER (inbuf, incnt, intype, outbuf, outcnt, outtype, root, comm): 1-to-all
Ex: int A[100], B[25];
MPI_SCATTER(A, 25, MPI_INT, B, 25, MPI_INT, 0, MPI_COMM_WORLD);

(Figure: the root P0's array A is split into blocks A0–A3; block Ai arrives in B on process Pi.)

14. Message-Passing Interface (MPI)

Collective Operations
• MPI_GATHER (inbuf, incnt, intype, outbuf, outcnt, outtype, root, comm): all-to-1
Ex: int A[100], B[25];
MPI_GATHER(B, 25, MPI_INT, A, 25, MPI_INT, 0, MPI_COMM_WORLD);

(Figure: each process Pi contributes its block B = Bi; the root P0 collects B0–B3 into A in rank order.)
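
Putting the two calls together, here is a sketch of the usual scatter/compute/gather pattern; the 4-process run and the doubling step are assumptions for illustration, matching the 25-element blocks of the slide's example:

#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[]){
  int rank, i;
  int A[100], B[25];

  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);        /* run with 4 processes: 4 * 25 = 100 */

  if (rank == 0)
    for (i = 0; i < 100; i++) A[i] = i;        /* root fills the full array */

  /* every process receives its own 25-element block into B */
  MPI_Scatter(A, 25, MPI_INT, B, 25, MPI_INT, 0, MPI_COMM_WORLD);

  for (i = 0; i < 25; i++) B[i] *= 2;          /* local work on the block */

  /* root collects the modified blocks back into A, in rank order */
  MPI_Gather(B, 25, MPI_INT, A, 25, MPI_INT, 0, MPI_COMM_WORLD);

  if (rank == 0) printf("A[99] = %d\n", A[99]);

  MPI_Finalize();
  return 0;
}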

15. Message-Passing Interface (MPI)

Collective Operations
• Reduction operations: combine the values in the input buffer of each process using an operator.
Operations:
• MPI_MAX, MPI_MIN
• MPI_SUM, MPI_PROD
• MPI_LAND, MPI_LOR, MPI_LXOR (logical)
• MPI_BAND, MPI_BOR, MPI_BXOR (bitwise)

16. Message-Passing Interface (MPI)

Collective Operations
• MPI_REDUCE (inbuf, outbuf, count, type, op, root, comm)
• Returns the combined value to the output buffer of a single root process
Ex: int A[2], B[2];
MPI_REDUCE(A, B, 2, MPI_INT, MPI_MIN, 0, MPI_COMM_WORLD);

(Figure: with A = {2,4}, {5,7}, {0,3}, {6,2} on P0–P3, the element-wise minimum B = {0,2} is produced only on the root P0.)

17. Message-Passing Interface (MPI)

Collective Operations
• MPI_ALLREDUCE (inbuf, outbuf, count, type, op, comm)
• Returns the combined value to the output buffers of all processes (note: no root argument)
Ex: int A[2], B[2];
MPI_ALLREDUCE(A, B, 2, MPI_INT, MPI_MIN, MPI_COMM_WORLD);

(Figure: same inputs as above, but the result B = {0,2} appears on every process P0–P3.)

18. Message-Passing Interface (MPI)

Asynchronous Communication
• Data is distributed among processes, which must then poll periodically for pending read and write requests
• Local computation may interleave with the processing of incoming messages

Non-blocking send/receive
• MPI_ISEND (buf, count, datatype, dest, tag, comm, request): Start sending a message.
• MPI_IRECV (buf, count, datatype, source, tag, comm, request): Start receiving a message.
• MPI_WAIT (MPI_Request *request, MPI_Status *status): Complete a non-blocking operation.
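
A short sketch of a non-blocking exchange between two processes (the two-process layout and the overlap comment are illustrative assumptions): both operations are posted immediately, local work may proceed, and MPI_WAIT completes them.

#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[]){
  int rank, in = 0, out;
  MPI_Request sreq, rreq;
  MPI_Status status;

  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  out = rank;

  if (rank < 2) {
    int partner = 1 - rank;                 /* assumes exactly two participating processes */
    /* post both operations; neither call blocks */
    MPI_Isend(&out, 1, MPI_INT, partner, 0, MPI_COMM_WORLD, &sreq);
    MPI_Irecv(&in, 1, MPI_INT, partner, 0, MPI_COMM_WORLD, &rreq);

    /* ... local computation can overlap with the transfer here ... */

    MPI_Wait(&sreq, &status);               /* complete the send: out may now be reused */
    MPI_Wait(&rreq, &status);               /* complete the receive: in is now valid */
    printf("rank %d received %d\n", rank, in);
  }

  MPI_Finalize();
  return 0;
}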

19. Message-Passing Interface (MPI)

Asynchronous Communication
• MPI_IPROBE (source, tag, comm, flag, status): Polls for a pending message without receiving it, and sets a flag. The message can then be received by using MPI_RECV.
• MPI_PROBE (source, tag, comm, status): Blocks until the message is available.
• MPI_GET_COUNT (status, datatype, count): Determines the size of the message.
• status (must be set by a previous probe):
  • status.MPI_SOURCE
  • status.MPI_TAG

20. Message-Passing Interface (MPI)

Asynchronous Communication
Ex:
int count, *buf, source;
MPI_Status status;
MPI_Probe(MPI_ANY_SOURCE, 0, MPI_COMM_WORLD, &status);   /* block until some tag-0 message is pending */
source = status.MPI_SOURCE;
MPI_Get_count(&status, MPI_INT, &count);                 /* how many MPI_INTs does it contain? */
buf = malloc(count*sizeof(int));
MPI_Recv(buf, count, MPI_INT, source, 0, MPI_COMM_WORLD, &status);
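
The example above uses the blocking MPI_PROBE. A non-blocking variant with MPI_IPROBE, sketched below, lets a process drain whatever messages happen to be pending and then return to local work; the function name poll_for_messages and the tag 0 are illustrative assumptions.

#include <stdlib.h>
#include <mpi.h>

void poll_for_messages(void){
  int flag, count, source;
  int *buf;
  MPI_Status status;

  /* check once; flag is set only if a tag-0 message is already pending */
  MPI_Iprobe(MPI_ANY_SOURCE, 0, MPI_COMM_WORLD, &flag, &status);
  while (flag) {
    source = status.MPI_SOURCE;
    MPI_Get_count(&status, MPI_INT, &count);
    buf = malloc(count * sizeof(int));
    MPI_Recv(buf, count, MPI_INT, source, 0, MPI_COMM_WORLD, &status);
    /* ... process buf ... */
    free(buf);
    MPI_Iprobe(MPI_ANY_SOURCE, 0, MPI_COMM_WORLD, &flag, &status);   /* any more pending? */
  }
}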

21. Message-Passing Interface (MPI)

Communicators
• Communicator handle: identifies the process group and context with respect to which the operation is to be performed.
• MPI_COMM_WORLD: contains all the processes in a parallel computation (default).
• New communicators are formed by either including or excluding processes from an existing communicator.
• MPI_COMM_SIZE (): Determine number of processes.
• MPI_COMM_RANK (): Determine my process identifier.

22. Message-Passing Interface (MPI)

Communicators
• MPI_COMM_DUP (comm, newcomm): creates a new handle for the same process group
• MPI_COMM_SPLIT (comm, color, key, newcomm): creates a new handle for a subset of a given process group
• MPI_INTERCOMM_CREATE (comm, leader, peer, rleader, tag, inter): links processes in two groups
• MPI_COMM_FREE (comm): destroys a handle

23. Message-Passing Interface (MPI)

Communicators
Ex: Two processes communicating with a new handle

MPI_Comm newcomm;
MPI_Status status;
MPI_Comm_dup(MPI_COMM_WORLD, &newcomm);    /* same process group, separate communication context */
if (myid == 0)
  MPI_Send(A, 100, MPI_INT, 1, 0, newcomm);
else
  MPI_Recv(A, 100, MPI_INT, 0, 0, newcomm, &status);
MPI_Comm_free(&newcomm);

24. Message-Passing Interface (MPI)

Communicators
Ex: Creating a new group with 4 members

MPI_Comm comm, newcomm;
int myid, color;
...
MPI_Comm_rank(comm, &myid);
if (myid < 4) color = 1; else color = MPI_UNDEFINED;   /* only ranks 0-3 join the new group */
MPI_Comm_split(comm, color, myid, &newcomm);           /* ranks 4-7 receive MPI_COMM_NULL */
if (newcomm != MPI_COMM_NULL)
  MPI_Scatter(A, 10, MPI_INT, B, 10, MPI_INT, 0, newcomm);

Processes:          P0 P1 P2 P3 P4 P5 P6 P7
Ranks in comm:      0  1  2  3  4  5  6  7
Color:              1  1  1  1  (MPI_UNDEFINED for P4-P7)
Ranks in newcomm:   0  1  2  3  (P4-P7 are not members of newcomm)

25. Message-Passing Interface (MPI)

Communicators
Ex: Splitting processes into 3 independent groups

MPI_Comm comm, newcomm;
int myid, color;
...
MPI_Comm_rank(comm, &myid);
color = myid % 3;                              /* 0, 1, or 2 */
MPI_Comm_split(comm, color, myid, &newcomm);

Processes:       P0 P1 P2 P3 P4 P5 P6 P7
Ranks in comm:   0  1  2  3  4  5  6  7
Color:           0  1  2  0  1  2  0  1
Resulting groups: {P0, P3, P6} with newcomm ranks 0 1 2; {P1, P4, P7} with ranks 0 1 2; {P2, P5} with ranks 0 1

26. Message-Passing Interface (MPI)

Communicators
MPI_INTERCOMM_CREATE (comm, local_leader, peer_comm, remote_leader, tag, intercomm): links processes in two groups
• comm: intracommunicator (within the group)
• local_leader: leader within the group
• peer_comm: parent communicator
• remote_leader: the other group's leader within the parent communicator

27. Message-Passing Interface (MPI)

Communicators
Ex: Communication of processes in two different groups

MPI_Comm newcomm, intercomm;
MPI_Status status;
int myid, newid, color, count;
...
MPI_Comm_size(MPI_COMM_WORLD, &count);
if (count % 2 == 0){
  MPI_Comm_rank(MPI_COMM_WORLD, &myid);
  color = myid % 2;                                 /* even world ranks form group 0, odd ranks group 1 */
  MPI_Comm_split(MPI_COMM_WORLD, color, myid, &newcomm);
  MPI_Comm_rank(newcomm, &newid);
  if (color == 0){  /* group 0: the remote leader (group 1's leader P1) has world rank 1 */
    MPI_Intercomm_create(newcomm, 0, MPI_COMM_WORLD, 1, 99, &intercomm);
    MPI_Send(msg, 1, type, newid, 0, intercomm);            /* to the partner with the same local rank */
  } else {          /* group 1: the remote leader (group 0's leader P0) has world rank 0 */
    MPI_Intercomm_create(newcomm, 0, MPI_COMM_WORLD, 0, 99, &intercomm);
    MPI_Recv(msg, 1, type, newid, 0, intercomm, &status);   /* from the partner with the same local rank */
  }
  MPI_Comm_free(&intercomm);
  MPI_Comm_free(&newcomm);
}

(msg and type stand for an application buffer and its MPI datatype; the leader and destination roles are illustrated on the next slide.)

28. Message-Passing Interface (MPI)

Communicators
Ex: Communication of processes in two different groups

Processes:                 P0 P1 P2 P3 P4 P5 P6 P7
Rank in MPI_COMM_WORLD:    0  1  2  3  4  5  6  7

Group 0 (newcomm): P0 P2 P4 P6 (world ranks 0 2 4 6) -> ranks in newcomm: 0 1 2 3; local_leader P0, remote_leader P1
Group 1 (newcomm): P1 P3 P5 P7 (world ranks 1 3 5 7) -> ranks in newcomm: 0 1 2 3; local_leader P1, remote_leader P0

29. Message-Passing Interface (MPI)

Derived Types
Allow noncontiguous data elements to be grouped together in a message.
Constructor functions:
• MPI_TYPE_CONTIGUOUS (): constructs a data type from contiguous elements
• MPI_TYPE_VECTOR (): constructs a data type from blocks separated by a stride
• MPI_TYPE_INDEXED (): constructs a data type with variable indices and sizes
• MPI_TYPE_COMMIT (): commits a data type so that it can be used in communication
• MPI_TYPE_FREE (): used to reclaim storage

30. Message-Passing Interface (MPI)

Derived Types
• MPI_TYPE_CONTIGUOUS (count, oldtype, newtype): constructs a data type from contiguous elements
Ex: MPI_TYPE_CONTIGUOUS (10, MPI_REAL, &newtype);
• MPI_TYPE_VECTOR (count, blocklength, stride, oldtype, newtype): constructs a data type from blocks separated by a stride
Ex: MPI_TYPE_VECTOR (5, 1, 4, MPI_FLOAT, &floattype);

(Figure: memory layout of an array A; the vector type selects 5 single elements spaced 4 elements apart.)
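
The vector example above is exactly the layout of one column of a 5×4 row-major matrix. Here is a hedged sketch of using it to send that column; the matrix contents, ranks, and tag are illustrative assumptions.

#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[]){
  int rank, i, j;
  float A[5][4];                    /* 5x4 row-major matrix: elements of a column are 4 floats apart */
  float col[5];
  MPI_Datatype coltype;
  MPI_Status status;

  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);

  MPI_Type_vector(5, 1, 4, MPI_FLOAT, &coltype);   /* 5 blocks of 1 float, stride 4 */
  MPI_Type_commit(&coltype);

  if (rank == 0) {
    for (i = 0; i < 5; i++)
      for (j = 0; j < 4; j++)
        A[i][j] = 10*i + j;
    MPI_Send(&A[0][1], 1, coltype, 1, 0, MPI_COMM_WORLD);       /* send column 1 */
  } else if (rank == 1) {
    MPI_Recv(col, 5, MPI_FLOAT, 0, 0, MPI_COMM_WORLD, &status); /* arrives as 5 contiguous floats */
    for (i = 0; i < 5; i++) printf("%.0f ", col[i]);
    printf("\n");
  }

  MPI_Type_free(&coltype);
  MPI_Finalize();
  return 0;
}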

31. Message-Passing Interface (MPI)

Derived Types
• MPI_TYPE_INDEXED (count, blocklengths, indices, oldtype, newtype): constructs a data type with variable indices and sizes
Ex: MPI_TYPE_INDEXED (3, Blengths, Indices, MPI_INT, &newtype);

(Figure: over data elements 0–10, Blengths = {2, 3, 1} and Indices = {1, 5, 10} select block 0 = elements 1–2, block 1 = elements 5–7, and block 2 = element 10.)

32. Message-Passing Interface (MPI)

Derived Types
• MPI_TYPE_COMMIT (type): commits a data type so that it can be used in communication
• MPI_TYPE_FREE (type): used to reclaim storage

33. Message-Passing Interface (MPI)

Derived Types
Ex:
MPI_Type_indexed(3, Blengths, Indices, MPI_INT, &newtype);
MPI_Type_commit(&newtype);                           /* newtype can now be used in communication */
MPI_Send(A, 1, newtype, dest, 0, MPI_COMM_WORLD);    /* sends elements 1-2, 5-7, and 10 of A */
MPI_Type_free(&newtype);

(Same layout as the previous slide: Blengths = {2, 3, 1}, Indices = {1, 5, 10}.)
