MPI Basics

Learn about message passing interface (MPI) basics, specifications, and libraries with examples in C, C++, and FORTRAN for writing efficient parallel programs. Understand MPI initialization and termination processes for effective parallelization.

  2. MPI
     • MPI = Message Passing Interface
     • A specification of message passing libraries for developers and users
     • Not a library by itself; it specifies what such a library should be
     • Specifies the application programming interface (API) for such libraries
     • Many libraries implement this API on different platforms (the MPI libraries)
     • Goal: provide a standard for writing message passing programs that is portable, efficient, and flexible
     • Language bindings: C, C++, and FORTRAN

  3. The Program

     #include <stdio.h>
     #include <string.h>
     #include "mpi.h"

     int main(int argc, char* argv[]) {
         int my_rank;          /* rank of process */
         int p;                /* number of processes */
         int source;           /* rank of sender */
         int dest;             /* rank of receiver */
         int tag = 0;          /* tag for messages */
         char message[100];    /* storage for message */
         MPI_Status status;    /* return status for receive */

         /* Start up MPI */
         MPI_Init(&argc, &argv);

         /* Find out process rank */
         MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);

  4. The Program

         /* Find out number of processes */
         MPI_Comm_size(MPI_COMM_WORLD, &p);

         if (my_rank != 0) {
             /* Create message */
             sprintf(message, "Greetings from process %d!", my_rank);
             dest = 0;
             /* Use strlen+1 so that '\0' gets transmitted */
             MPI_Send(message, (int)strlen(message) + 1, MPI_CHAR,
                      dest, tag, MPI_COMM_WORLD);
         } else { /* my_rank == 0 */
             for (source = 1; source < p; source++) {
                 MPI_Recv(message, 100, MPI_CHAR, source, tag,
                          MPI_COMM_WORLD, &status);
                 printf("%s\n", message);
             }
         }

         /* Shut down MPI */
         MPI_Finalize();
         return 0;
     } /* main */
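
The strlen+1 comment in the program can be made concrete with a small helper. This is only a sketch: build_greeting is a hypothetical name that does not appear in the slides, and it uses snprintf instead of sprintf (an assumption made here) so that the write stays within the buffer. It returns the count that would be passed to MPI_Send.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of the slides): formats the greeting into
   buf and returns the number of chars to pass as the MPI_Send count,
   i.e. strlen + 1, so that the terminating '\0' is transmitted as well.
   Returns -1 if the text does not fit into buf. */
int build_greeting(char *buf, size_t size, int rank)
{
    int n = snprintf(buf, size, "Greetings from process %d!", rank);
    if (n < 0 || (size_t)n >= size) {
        return -1;  /* encoding error or truncated output */
    }
    return (int)strlen(buf) + 1;
}
```

A sender could then call MPI_Send(message, count, MPI_CHAR, dest, tag, MPI_COMM_WORLD) with the returned count, and the receiver gets a properly terminated string it can print directly.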

  5. General MPI programs

     #include <mpi.h>

     int main(int argc, char** argv) {
         MPI_Init(&argc, &argv);

         /* main part of the program */
         /* Which MPI function calls to use depends on your data
            partitioning and the parallelization architecture */

         MPI_Finalize();
         return 0;
     }

  6. MPI Basics
     • MPI's pre-defined constants, function prototypes, etc., are declared in a header file. This file must be included wherever MPI function calls appear (in main and in user subroutines/functions):
       • #include "mpi.h" for C codes
       • #include "mpi++.h" for C++ codes (a header name used by some older implementations; the C++ bindings were removed in MPI-3.0, and C++ codes normally include "mpi.h")
       • include "mpif.h" for f77 and f9x codes
     • MPI_Init must be the first MPI function called
     • MPI is terminated by calling MPI_Finalize
     • Each of these two functions must be called only once in user code.

  8. MPI Basics
     • C is a case-sensitive language. MPI function names always begin with "MPI_", followed by a specific name whose leading character is capitalized, e.g., MPI_Comm_rank. MPI pre-defined constants are written in upper case, e.g., MPI_COMM_WORLD.

  9. Basic MPI Datatypes

     MPI datatype          C datatype
     MPI_CHAR              char
     MPI_SIGNED_CHAR       signed char
     MPI_UNSIGNED_CHAR     unsigned char
     MPI_SHORT             signed short
     MPI_UNSIGNED_SHORT    unsigned short
     MPI_INT               signed int
     MPI_UNSIGNED          unsigned int
     MPI_LONG              signed long
     MPI_UNSIGNED_LONG     unsigned long
     MPI_FLOAT             float
     MPI_DOUBLE            double
     MPI_LONG_DOUBLE       long double

  10. MPI is Simple
     • Many parallel programs can be written using just these six functions, only two of which are non-trivial:
       • MPI_Init
       • MPI_Finalize
       • MPI_Comm_size
       • MPI_Comm_rank
       • MPI_Send
       • MPI_Recv

  11. Initialization
     • MPI_Init() initializes the MPI environment
     • Must be called before any other MPI routine (so put it at the beginning of the code)
     • Can be called only once; subsequent calls are erroneous.

         int MPI_Init(int *argc, char ***argv)

  12. Termination
     • MPI_Finalize() cleans up the MPI environment
     • Must be called before the program exits.
     • No other MPI routine can be called after this call, not even MPI_Init()

  14. Processes
     • MPI is process-oriented: a program consists of multiple processes, each corresponding to one processor.
     • MIMD: each process runs its own code. In practice, each runs its own copy of the same code (SPMD).
     • MPI processes are identified by their ranks:
       • If there are nprocs processes in the computation, ranks range over 0, 1, …, nprocs-1.
       • nprocs does not change during the computation.

  15. Communicators
     • A communicator is a group of processes that can communicate with one another.
     • Most MPI routines require a communicator argument to specify the collection of processes the communication is based on.
     • All processes in the computation form the communicator MPI_COMM_WORLD.
     • MPI_COMM_WORLD is pre-defined by MPI and available anywhere.
     • Subgroups/subcommunicators can be created within MPI_COMM_WORLD.
     • A process may belong to different communicators and have different ranks in different communicators.

  16. Size and Rank
     • Number of processors: MPI_Comm_size()
     • Which processor: MPI_Comm_rank()
     • These values let each process compute the data decomposition:
       • Knowing the total number of grid points, the total number of processors, and the current processor id, a process can calculate which portion of the data it is to work on.
     • Ranks are also used to specify the source and destination of communications.

         int my_rank, ncpus;
         MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
         MPI_Comm_size(MPI_COMM_WORLD, &ncpus);
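
The data decomposition mentioned above can be sketched in plain C. The slides do not say which scheme to use, so this assumes a simple block decomposition in which the first n % nprocs ranks each take one extra point; block_range is a hypothetical name.

```c
/* Hypothetical helper: block decomposition of n grid points over nprocs
   processes. The first (n % nprocs) ranks receive one extra point.
   On return, the calling rank owns the half-open index range
   [*start, *end). */
void block_range(int n, int nprocs, int rank, int *start, int *end)
{
    int base  = n / nprocs;   /* points every rank gets */
    int extra = n % nprocs;   /* leftover points to hand out */

    *start = rank * base + (rank < extra ? rank : extra);
    *end   = *start + base + (rank < extra ? 1 : 0);
}
```

Each process would call block_range(n, ncpus, my_rank, &start, &end) with the values obtained from MPI_Comm_size and MPI_Comm_rank. For 10 points on 4 processes this gives the ranges [0,3), [3,6), [6,8), and [8,10).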

  17. Compile and Run Program
     • Compile the MPI program:

         mpicc -o greetings greetings.c

     • After compiling, an executable file named greetings is generated.
     • If running on the head node:

         mpirun -np 4 ./greetings
         Greetings from process 1!
         Greetings from process 2!
         Greetings from process 3!

     • Running jobs directly on the head node is NOT allowed on HPC supercomputers.

  18. PBS scripts
     • PBS: Portable Batch System
     • A cluster is shared with other users, so a job submission system is needed
     • PBS will allocate the job to some other computer, log in as the user, and execute it
     • Useful commands:
       • qsub: submits a job
       • qstat: monitors job status
       • qdel: deletes a job from a queue

  19. A Job with PBS scripts
     Create the job script in an editor (vi myjob1):

         #!/bin/bash
         #PBS -N job1
         #PBS -q production
         #PBS -l select=4:ncpus=1
         #PBS -l place=free
         #PBS -V
         cd $PBS_O_WORKDIR
         mpirun -np 4 -machinefile $PBS_NODEFILE ./greetings

  20. Submit Jobs
     • Submit the job:

         qsub myjob1
         283724.service0

     • Check the job status:

         qstat
         PBS Pro Server andy.csi.cuny.edu at CUNY CSI HPC Center
         Job id           Name             User             Time Use S Queue
         ---------------- ---------------- ---------------- -------- - -----
         276540.service0  methane_g09      michael.green    10265259 R qlong8_gau
         276544.service0  methane_g09      michael.green    10265100 R qlong8_gau
         277189.service0  BEAST_serial     edward.myers     2373:38: R qserial
         277828.service0  2xTDR            e.sandoval              0 H qlong16_qdr

  21. Submit Jobs
     • See the output:

         cat job1.o283724
         Greetings from process 1!
         Greetings from process 2!
         Greetings from process 3!

     • See the error file:

         cat job1.e283724

  22. PBS scripts
     • -q production: production is the normal queue for processing your work.
     • -N <job_name>: The user must assign a name to each job they run. Names can be up to 15 alphanumeric characters in length.
     • -l select=<chunks>: A chunk is a collection of resources (cores, memory, disk space, etc.).
     • -l ncpus=<cpus>: The number of CPUs (or cores) that the user wants to use on a node.
     • -l mem=<mem>mb: This parameter is optional. It specifies how much memory is needed per chunk. If it is not included, PBSpro assumes a default memory size on a per-CPU (core) basis.
     • -l ngpus=<gpus>: The number of graphics processing units that the user wants to use on a node (this parameter is only available on Penzias).

  23. PBS scripts
     • -l place=<placement>: This parameter tells PBSpro how to distribute the requested chunks of resources across nodes. placement can take one of three values: free, scatter, or pack.
       • If you select free, PBSpro will place your job chunks on any nodes that have the required number of available resources.
       • If you select scatter, PBSpro will schedule your job so that only one chunk is taken from any virtual compute node.
       • If you select pack, PBSpro will only schedule your job to take all the requested chunks from one node (and if no such node is available, the job will be queued).

