Portable MPI and Related Parallel Development Tools Rusty Lusk Mathematics and Computer Science Division Argonne National Laboratory (The rest of our group: Bill Gropp, Rob Ross, David Ashton, Brian Toonen, Anthony Chan)
Outline • MPI • What is it? • Where did it come from? • One implementation • Why has it “succeeded”? • Case study: an MPI application • Portability • Libraries • Tools • Future developments in parallel programming • MPI development • Languages • Speculative approaches
What is MPI? • A message-passing library specification • extended message-passing model • not a language or compiler specification • not a specific implementation or product • For parallel computers, clusters, and heterogeneous networks • Full-featured • Designed to provide access to advanced parallel hardware for • end users • library writers • tool developers
Where Did MPI Come From? • Early vendor systems (NX, EUI, CMMD) were not portable. • Early portable systems (PVM, p4, TCGMSG, Chameleon) were mainly research efforts. • Did not address the full spectrum of message-passing issues • Lacked vendor support • Were not implemented at the most efficient level • The MPI Forum organized in 1992 with broad participation by vendors, library writers, and end users. • MPI Standard (1.0) released June, 1994; many implementation efforts. • MPI-2 Standard (1.2 and 2.0) released July, 1997.
Informal Status Assessment • All MPP vendors now have MPI-1 (1.0, 1.1, or 1.2). • Public implementations (MPICH, LAM, CHIMP) support heterogeneous workstation networks. • MPI-2 implementations are being undertaken now by all vendors. • MPI-2 is harder to implement than MPI-1 was. • MPI-2 implementations will appear piecemeal, with I/O first.
MPI Sources • The Standard itself: • at http://www.mpi-forum.org • All official MPI releases, in both PostScript and HTML • Books on MPI and MPI-2: • Using MPI: Portable Parallel Programming with the Message-Passing Interface (2nd edition), by Gropp, Lusk, and Skjellum, MIT Press, 1999. • Using MPI-2: Extending the Message-Passing Interface, by Gropp, Lusk, and Thakur, MIT Press, 1999. • MPI: The Complete Reference, volumes 1 and 2, MIT Press, 1999. • Other information on the Web: • at http://www.mcs.anl.gov/mpi • pointers to lots of material, including other talks and tutorials, a FAQ, and other MPI pages
The MPICH Implementation of MPI • As a research project: exploring tradeoffs between performance and portability; conducting research in implementation issues. • As a software project: providing a free MPI implementation on most machines; enabling vendors and others to build complete MPI implementations on their own communication services. • MPICH 1.2.2 just released, with complete MPI-1, parts of MPI-2 (I/O and C++), and a port to Windows 2000. • Available at http://www.mcs.anl.gov/mpi/mpich
Lessons From MPI: Why Has It Succeeded? • The MPI Process • Portability • Performance • Simplicity • Modularity • Composability • Completeness
The MPI Process • Started with open invitation to all those interested in standardizing message-passing model • Participation from • Parallel computing vendors • Computer Scientists • Application scientists • Open process • All invited, but hard work required • All deliberations available at all times • Reference implementation developed during design process • Helped debug design • Immediately available when design completed
Portability • Most important property of a programming model for high-performance computing • Application lifetimes: 5 to 20 years • Hardware lifetimes much shorter • (not to mention corporate lifetimes!) • Need not lead to a lowest-common-denominator approach • Example: MPI semantics allow direct copy of data from a user-space send buffer to a user-space receive buffer • Might be implemented by a hardware data mover • Might be implemented by network hardware • Might be implemented by sockets • The hard part: portability with performance
Performance • MPI can help manage the crucial memory hierarchy • Local vs. remote memory is explicit • A received message is likely to be in cache • MPI provides collective operations for both communication and computation that hide the complexity or non-portability of scalable algorithms from the programmer (see the sketch below). • Can interoperate with optimizing compilers • Promotes use of high-performance libraries • Doesn't provide performance portability • This problem is still too hard, even for the best compilers • E.g., BLAS
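As an illustration of how a collective hides the scalable algorithm from the programmer, here is a minimal sketch (not from the slides) in which one call to MPI_Allreduce replaces a hand-written reduction; the implementation is free to choose whatever tree, ring, or hardware-assisted algorithm suits the machine.

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Each process contributes one value; the MPI library picks the
       reduction algorithm appropriate to the platform. */
    double local = (double) rank;
    double global_sum;
    MPI_Allreduce(&local, &global_sum, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);

    if (rank == 0)
        printf("sum of ranks = %g\n", global_sum);

    MPI_Finalize();
    return 0;
}
```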
Simplicity • Simplicity is in the eye of the beholder • MPI-1 has about 125 functions • Too big! • Too small! • MPI-2 has about 150 more • Even this is not very many by comparison • Few applications use all of MPI • But few MPI functions go unused • One can write serious MPI programs with as few as six functions (see the sketch below) • Other programs use a different six… • Economy of concepts • Communicators encapsulate both process groups and contexts • Datatypes both enable heterogeneous communication and allow non-contiguous message buffers • Symmetry helps make MPI easy to understand.
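A minimal sketch of the classic six-function subset (MPI_Init, MPI_Comm_size, MPI_Comm_rank, MPI_Send, MPI_Recv, MPI_Finalize); the program itself is illustrative, not from the slides.

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank != 0) {
        /* Every non-root process sends its rank to process 0. */
        MPI_Send(&rank, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
    } else {
        int value;
        MPI_Status status;
        for (int src = 1; src < size; src++) {
            MPI_Recv(&value, 1, MPI_INT, src, 0, MPI_COMM_WORLD, &status);
            printf("received %d from process %d\n", value, src);
        }
    }

    MPI_Finalize();
    return 0;
}
```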
Modularity • Modern applications often combine multiple parallel components. • MPI supports component-oriented software through its use of communicators • Support of libraries means applications may contain no MPI calls at all.
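A common pattern behind this modularity, sketched with a hypothetical library (the mylib names are illustrative, not from the slides): the library duplicates the caller's communicator, so its internal messages can never collide with the application's traffic.

```c
#include <mpi.h>

/* Hypothetical library context; the names are illustrative. */
typedef struct {
    MPI_Comm comm;   /* private communication context for the library */
} mylib_t;

int mylib_init(mylib_t *lib, MPI_Comm user_comm)
{
    /* MPI_Comm_dup gives the library its own context, so tags and
       messages used inside the library cannot be confused with the
       application's messages on user_comm. */
    return MPI_Comm_dup(user_comm, &lib->comm);
}

int mylib_finalize(mylib_t *lib)
{
    return MPI_Comm_free(&lib->comm);
}
```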
Composability • MPI works with other tools • Compilers • Since it is a library • Debuggers • Debugging interface used by MPICH, TotalView, others • Profiling tools • The MPI “profiling interface” is part of the standard • MPI-2 provides precise interaction with multi-threaded programs • MPI_THREAD_SINGLE • MPI_THREAD_FUNNELED (OpenMP loops) • MPI_THREAD_SERIALIZED (OpenMP single) • MPI_THREAD_MULTIPLE • The interface provides for both portability and performance
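A minimal sketch of requesting one of these thread levels with the MPI-2 routine MPI_Init_thread; the funneled level matches the common pattern of OpenMP loops with MPI calls made only by the main thread.

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int provided;

    /* Ask for funneled threading: only the main thread makes MPI calls,
       e.g. outside OpenMP parallel loops. */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);

    if (provided < MPI_THREAD_FUNNELED)
        printf("warning: implementation only provides thread level %d\n",
               provided);

    /* ... computation mixing threaded loops with MPI calls from the
       main thread ... */

    MPI_Finalize();
    return 0;
}
```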
Completeness • MPI provides a complete programming model. • Any parallel algorithm can be expressed. • Collective operations operate on subsets of processes. • Easy things are not always easy, but • Hard things are possible.
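A minimal sketch of a collective operation over a subset of processes, using MPI_Comm_split to build a sub-communicator; the even/odd split is purely illustrative.

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Split the processes into two groups by parity of rank. */
    MPI_Comm subcomm;
    MPI_Comm_split(MPI_COMM_WORLD, rank % 2, rank, &subcomm);

    /* This reduction involves only the processes in the caller's group. */
    int one = 1, group_size;
    MPI_Allreduce(&one, &group_size, 1, MPI_INT, MPI_SUM, subcomm);

    printf("rank %d belongs to a group of %d processes\n", rank, group_size);

    MPI_Comm_free(&subcomm);
    MPI_Finalize();
    return 0;
}
```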
The Center for the Study of Astrophysical Thermonuclear Flashes • To simulate matter accumulation on the surface of compact stars, nuclear ignition of the accreted (and possibly underlying stellar) material, and the subsequent evolution of the star’s interior, surface, and exterior • X-ray bursts (on neutron star surfaces) • Novae (on white dwarf surfaces) • Type Ia supernovae (in white dwarf interiors)
FLASH Scientific Results • Wide range of compressibility • Wide range of length and time scales • Many interacting physical processes • Only indirect validation possible • Rapidly evolving computing environment • Many people in collaboration • Simulation highlights: flame-vortex interactions, compressible turbulence, laser-driven shock instabilities, nova outbursts on white dwarfs, Richtmyer-Meshkov instability, cellular detonations, helium burning on neutron stars, Rayleigh-Taylor instability • Gordon Bell prize at SC2000
The FLASH Code: MPI in Action • Solves complex systems of equations for hydrodynamics and nuclear burning • Written primarily in Fortran 90 • Uses the Paramesh library for adaptive mesh refinement; Paramesh is implemented with MPI • I/O (for checkpointing, visualization, and other purposes) done with the HDF5 library, which is implemented with MPI-IO • Debugged with TotalView, using the standard debugger interface • Tuned with Jumpshot and Vampir, using the MPI profiling interface • Gordon Bell prize winner in 2000 • Portable to all parallel computing environments (since it uses MPI)
MPI Performance Visualization with Jumpshot (diagram: processes → logfile → Jumpshot display) • For detailed analysis of parallel program behavior, timestamped events are collected into a log file during the run. • A separate display program (Jumpshot) aids the user in conducting a post-mortem analysis of program behavior. • Log files can become large, making it impossible to inspect the entire program at once. • The FLASH Project motivated an indexed file format (SLOG) that uses a preview to select a time of interest and quickly display an interval.
Using Jumpshot • MPI functions and messages automatically logged • User-defined states • Nested states • Zooming and scrolling • Spotting opportunities for optimization
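For the user-defined states, one route is MPICH's MPE logging library; the following is a rough sketch assuming the MPE calls MPE_Init_log, MPE_Describe_state, MPE_Log_event, and MPE_Finish_log (their exact forms vary across MPE versions, and the event numbers and names here are illustrative).

```c
#include <mpi.h>
#include "mpe.h"   /* MPE logging library distributed with MPICH (assumed available) */

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    MPE_Init_log();

    /* User-chosen event numbers bracket a user-defined "compute" state. */
    int ev_start = 1, ev_end = 2;
    MPE_Describe_state(ev_start, ev_end, "compute", "red");

    MPE_Log_event(ev_start, 0, "begin compute");
    /* ... application work to be timed ... */
    MPE_Log_event(ev_end, 0, "end compute");

    MPE_Finish_log("flash_run");   /* writes a logfile viewable with Jumpshot */
    MPI_Finalize();
    return 0;
}
```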
Future Developments in Parallel Programming: MPI and Beyond • MPI is not perfect • Any widely used replacement will have to share the properties that made MPI a success. • Some directions (in increasing order of speculativeness): • Improvements to MPI implementations • Improvements to the MPI definition • Continued evolution of libraries • Research and development for parallel languages • Further out: radically different programming models for radically different architectures.
MPI Implementations • Implementations beget implementation research • Datatypes, I/O, memory-motion elimination • On most platforms, better collective operations are needed • Most MPI implementations build collective operations on point-to-point, which is too high-level (a sketch of this layering follows below) • Need stream-oriented methods that understand MPI datatypes • Optimize for new hardware • In progress for VIA, InfiniBand • Need more emphasis on collective operations • Off-loading message processing onto the NIC • Scaling beyond 10,000 processes • Parallel I/O • Clusters • Remote • Fault tolerance • Intercommunicators provide an approach • Working with multithreading approaches
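To make the "collectives built on point-to-point" point concrete, here is a rough binomial-tree broadcast from rank 0 written entirely with sends and receives (the function name is illustrative); a production MPI_Bcast can do much better with hardware collectives, pipelining, datatype-aware streaming, or NIC offload.

```c
#include <mpi.h>

/* Illustrative tree broadcast from rank 0, layered on point-to-point. */
void tree_bcast(void *buf, int count, MPI_Datatype type, MPI_Comm comm)
{
    int rank, size;
    MPI_Comm_rank(comm, &rank);
    MPI_Comm_size(comm, &size);

    /* Receive from the parent (if any): the parent differs in the
       lowest set bit of this process's rank. */
    int mask = 1;
    while (mask < size) {
        if (rank & mask) {
            MPI_Recv(buf, count, type, rank - mask, 0, comm, MPI_STATUS_IGNORE);
            break;
        }
        mask <<= 1;
    }

    /* Forward to children at successively smaller offsets. */
    mask >>= 1;
    while (mask > 0) {
        if (rank + mask < size)
            MPI_Send(buf, count, type, rank + mask, 0, comm);
        mask >>= 1;
    }
}
```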
Improvements to MPI Itself • A better remote-memory-access interface • Simpler for some simple operations • Atomic fetch-and-increment • Some minor fixes already in progress • MPI 2.1 • Building on experience with MPI-2 • Interactions with compilers
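For context on the RMA discussion, a rough sketch of the existing MPI-2 one-sided interface (window creation plus MPI_Accumulate); note that returning the old value, i.e. a true atomic fetch-and-increment as mentioned on the slide, cannot be expressed this directly in MPI-2, which is part of the motivation for a better interface. The code is illustrative, not from the slides.

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Each process exposes one integer counter through an RMA window. */
    int counter = 0;
    MPI_Win win;
    MPI_Win_create(&counter, sizeof(int), sizeof(int),
                   MPI_INFO_NULL, MPI_COMM_WORLD, &win);

    /* Every process atomically adds 1 to the counter on rank 0; getting
       the old value back (fetch-and-increment) is what MPI-2 makes awkward. */
    int one = 1;
    MPI_Win_fence(0, win);
    MPI_Accumulate(&one, 1, MPI_INT, 0, 0, 1, MPI_INT, MPI_SUM, win);
    MPI_Win_fence(0, win);

    if (rank == 0)
        printf("counter on rank 0 = %d\n", counter);

    MPI_Win_free(&win);
    MPI_Finalize();
    return 0;
}
```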
Libraries and Languages • General libraries • Global Arrays • PETSc • ScaLAPACK • Application-specific libraries • Most are built on MPI, at least for the portable version.
More Speculative Approaches • HTMT for Petaflops • Blue Gene • PIMS • MTA • All will need a programming model that explicitly manages a deep memory hierarchy. • Exotic + small benefit = dead
Summary • MPI is a successful example of a community defining, implementing, and adopting a standard programming methodology. • It happened because of the open MPI process, the MPI design itself, and early implementation. • MPI research continues to refine implementations on modern platforms, and this is the “main road” ahead. • Tools that work with MPI programs are thus a good investment. • MPI provides portability and performance for complex applications on a variety of architectures.