Parallel MxN Communication Using MPI-I/O

    1. Parallel MxN Communication Using MPI-I/O
Felipe Bertrand, Yongquan Yuan, Kenneth Chiu, Randy Bramley
Department of Computer Science, Indiana University
Supported by NSF grants 0116050 and EIA-0202048 and by Department of Energy Office of Science SciDAC grants.

    2. Motivation
Interesting problems span multiple regimes of time, space, and physics. Multiphysics, multidiscipline examples: climate models, combustion models, fusion simulation, protein simulation.
Climate models: the biggest supercomputer (at least outside of the NSA) is being built in Japan and used for weather simulation; this topic was very prominent at the Supercomputing 2002 conference. Combustion models: rockets, explosives. Fusion simulation: there is a huge international effort to build the next generation of fusion reactors, expected to sustain fusion for a few minutes. Protein simulation: find the shape of a protein from its sequence of amino acids (DNA -> RNA, a sequence of codons of 3 nucleotide bases each -> proteins). Our task as computer scientists in scientific computation is ...

    3. Community Climate System Model (CCSM)
CCSM models are parallel, but the inter-model communication is serial. CCSM is an evolution of a number of projects: a fully coupled, global climate model that provides state-of-the-art computer simulations of the Earth's past, present, and future climate states. It is based on a framework that divides the complete climate system into component models connected by a coupler; this design requires four components: atmosphere, land, ocean, and ice. It has been in development by the National Center for Atmospheric Research since 1984.
Resolution is very poor, so there is a lot of room for improvement. Atmosphere and land grid: high-resolution grid (T42), 128x64 = 8,192 points; resolution is 300 km at the equator (2.8 degrees); 26 levels in the vertical (about 213k points total). Ocean and ice grid: high-resolution grid (gx1v3), 320x384 = about 123k points; resolution is 1 degree longitudinal (constant) and 0.3 degrees latitudinal at the equator; 40 levels in the vertical, 10 to 250 meters thick (about 5M points total). Components have to communicate state variables (for example, the atmosphere component has to communicate to the ocean component the pressure at the sea surface, which may be relevant to evaporation) and fluxes (like heat or salt). Albedo: the fraction of radiation that is reflected by a surface.
What matters most to us is the architectural design of the application: we can see components, processes, and communication links. The coupler does more work than just being a hub of communication: it interpolates data between the grids and guarantees conservation of the fluxes (if a certain amount of heat is emitted, the coupler guarantees that the target grid receives that much heat). This picture suggests that a parallel communication scheme would be very beneficial. Things I would ponder before deciding to migrate to a parallel communication scheme: the load of communication versus computation (also, if communication becomes cheaper, we may benefit from more frequent communication) and the effort to maintain the parallel communication infrastructure (a management decision), including training and the cost of software dependencies.

    4. Existing Approaches
Refactor codes: integrate all components into a single, much larger program. This gives more efficient data sharing, but requires a large time investment in rewriting programs and closer coordination between teams, which is more costly in development, testing, and deployment.
Component models: simplify overall application development, but complicate the efficient coordination and sharing of data between components (the MxN problem).

    5. Goals
Scalably connect codes to create new multiphysics simulations. Target existing and currently evolving codes, created by teams with disparate research interests and spanning multiple time scales, spatial domains, and disciplines. Enable rapid prototyping and testing without extensive rewriting. Use standard APIs and paradigms familiar to most application-area scientists (MPI I/O).

    6. The MxN Problem
Transfer data from a parallel program running on M processors to another running on N processors. M and N may differ, which may require complex all-to-all communication and data redistribution.
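To make the data-redistribution aspect concrete, the following is a minimal sketch, not taken from the presentation, of the index bookkeeping an MxN transfer implies. It assumes a 1-D block decomposition of a global array on both sides and computes which element ranges each of the M writer ranks must send to each of the N reader ranks; the array length and the counts M and N are made-up example values.

    /* Illustrative sketch (not from the slides): which pieces of a 1-D
     * block-distributed array of `total` elements must travel from each of
     * the M writer ranks to each of the N reader ranks. */
    #include <stdio.h>

    /* Start index of rank r's block when `total` elements are split over p ranks. */
    static long block_start(long total, int p, int r) { return (total * r) / p; }

    int main(void) {
        const long total = 1000;      /* global array length (assumed) */
        const int M = 4, N = 3;       /* writer and reader process counts (assumed) */

        for (int m = 0; m < M; ++m) {
            long ws = block_start(total, M, m), we = block_start(total, M, m + 1);
            for (int n = 0; n < N; ++n) {
                long rs = block_start(total, N, n), re = block_start(total, N, n + 1);
                long lo = ws > rs ? ws : rs, hi = we < re ? we : re;
                if (lo < hi)   /* overlapping index range => a message m -> n */
                    printf("writer %d -> reader %d : elements [%ld, %ld)\n", m, n, lo, hi);
            }
        }
        return 0;
    }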

    7. Solving the MxN Problem
Existing solutions: funnel all traffic through process 0 on each component (used by the CCSM model), which is not scalable; or read/write through files, which is scalable if parallel I/O is used but slow because it involves disk reads and writes.
Our solution: use the MPI I/O interface and create middleware that transfers the data over the network. Treat application codes as software components. Provide an easy migration path for existing applications.
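As an illustration of that migration path, here is a hedged, minimal sketch (not code from the paper) of a writer component that uses only standard MPI-IO calls: each rank writes its contiguous block collectively, so the same code can target a real file or, in principle, a network-backed virtual file. The buffer size and file name are assumptions.

    #include <mpi.h>
    #include <stdlib.h>

    int main(int argc, char **argv) {
        MPI_Init(&argc, &argv);

        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        /* Number of elements owned by this rank; 1024 is an arbitrary example. */
        const int nlocal = 1024;
        double *buf = malloc(nlocal * sizeof *buf);
        for (int i = 0; i < nlocal; ++i)
            buf[i] = rank + i * 1e-3;       /* fill with recognizable data */

        /* Open the shared output; "field.dat" is a placeholder name. With the
           MxN backend the only change would be an "mxn:"-prefixed name (slide 9). */
        MPI_File fh;
        MPI_File_open(MPI_COMM_WORLD, "field.dat",
                      MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);

        /* Each rank writes its block at its own offset, collectively. */
        MPI_Offset offset = (MPI_Offset)rank * nlocal * sizeof(double);
        MPI_File_write_at_all(fh, offset, buf, nlocal, MPI_DOUBLE, MPI_STATUS_IGNORE);

        MPI_File_close(&fh);
        free(buf);
        MPI_Finalize();
        return 0;
    }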

    8. Solving the MxN Problem
MPI-I/O defines an API for parallel I/O using file-like semantics. ROMIO is an implementation of MPI-IO; it provides an abstract device interface (ADIO) that allows different physical I/O mechanisms to be plugged in.
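To illustrate the pluggable-device idea, here is a hypothetical, heavily simplified sketch in plain C. It is not ROMIO's actual ADIO interface; it only shows how a dispatcher could pick a physical backend (an ordinary file system versus a network MxN device) from the filename, which is the mechanism the next slide relies on. All names in it are invented for illustration.

    #include <stdio.h>
    #include <string.h>

    /* One entry per physical I/O mechanism; a real device table carries many
       more operations (open, read, write, seek, close, ...), elided here. */
    typedef struct backend_ops {
        const char *name;
        int (*open)(const char *path);
    } backend_ops;

    static int ufs_open(const char *path) { printf("file-system open: %s\n", path); return 0; }
    static int mxn_open(const char *path) { printf("MxN stream open:  %s\n", path); return 0; }

    static const backend_ops ufs_ops = { "ufs", ufs_open };   /* ordinary files */
    static const backend_ops mxn_ops = { "mxn", mxn_open };   /* network MxN device */

    /* Pick the backend from the filename prefix, as the "mxn:" switch on slide 9 suggests. */
    static const backend_ops *select_backend(const char *filename) {
        return strncmp(filename, "mxn:", 4) == 0 ? &mxn_ops : &ufs_ops;
    }

    int main(void) {
        select_backend("results.dat")->open("results.dat");
        select_backend("mxn:coupler_stream")->open("mxn:coupler_stream");
        return 0;
    }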

    9. MxN MPI-IO Communication
Application level: components live in different MPI instances. Transparency: components are not aware of the MxN communication; they read and write data through the regular MPI interface. No change to the source code is required: switch to the MxN backend with the filename prefix mxn:. Communication can be established between different ROMIO-based MPI implementations.
This is one solution to the MxN problem, developed by us: a back-end to an MPI-IO implementation (MPI is a message-passing library, a library for communication). The design goals were those shown on the slide. It also allows extreme decoupling for debugging and testing, introduces no new paradigm (so it is easy to learn), and offers easy migration from current file-based applications: some only require relinking and changing a file name, while others might need to change the access pattern to the file.
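The migration described above amounts to changing the name passed to MPI_File_open. A small hedged sketch follows; only the mxn: prefix itself comes from the slide, and the stream name is a hypothetical example.

    #include <mpi.h>

    int main(int argc, char **argv) {
        MPI_Init(&argc, &argv);

        MPI_File fh;
        /* Previously the component wrote to a real file, e.g.
             MPI_File_open(MPI_COMM_WORLD, "coupler_exchange.dat", ...);
           With the MxN backend the only source change is the name: the "mxn:"
           prefix routes the data over the network to the peer component
           instead of to disk. */
        MPI_File_open(MPI_COMM_WORLD, "mxn:coupler_exchange",
                      MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);

        /* ... the MPI_File_write_* calls stay exactly as in the file-based code ... */

        MPI_File_close(&fh);
        MPI_Finalize();
        return 0;
    }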

    10. MxN MPI-IO Communication
MxN backend: logical serialization gives an intuitive paradigm; the parallel implementation gives high performance. The interface between the components is the format of the (virtual) file. The abstraction is that of a stream file, like writing to a tape: random access is not supported, because the data is released as soon as it is read by the other side. The transfer is done through many read and write operations.
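For symmetry with the writer sketch after slide 7, here is a minimal, assumed reader-side counterpart: the consumer ranks pull their portions of the virtual stream with collective MPI-IO reads, consistent with the tape-like, read-once semantics described on this slide. The buffer size and stream name are illustrative assumptions.

    #include <mpi.h>
    #include <stdlib.h>

    int main(int argc, char **argv) {
        MPI_Init(&argc, &argv);

        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        /* Number of elements this reader expects; 1024 is an arbitrary example. */
        const int nlocal = 1024;
        double *buf = malloc(nlocal * sizeof *buf);

        MPI_File fh;
        MPI_File_open(MPI_COMM_WORLD, "mxn:coupler_exchange",
                      MPI_MODE_RDONLY, MPI_INFO_NULL, &fh);

        /* Collective read of this rank's slice. Offsets advance in file order
           only: the stream abstraction releases data once it has been read,
           so there is no seeking back. */
        MPI_Offset offset = (MPI_Offset)rank * nlocal * sizeof(double);
        MPI_File_read_at_all(fh, offset, buf, nlocal, MPI_DOUBLE, MPI_STATUS_IGNORE);

        MPI_File_close(&fh);
        free(buf);
        MPI_Finalize();
        return 0;
    }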

    11. MxN MPI-IO Communication
MxN backend: the stream-file interface and release-on-read semantics are as described under slide 10.

    12. MxN MPI-IO Communication
Timing of the first MxN connection between the discretizer and solver components, with 4 discretizer processes and 16 solver processes. This technology was demonstrated at the Supercomputing conference with the setup shown above, using four components.

    13. Thor Results: Time vs. Bytes

    14. Future work
Incorporate the MxN communication system into a CCA component. Explore a standard API for MxN components. Identify current computational challenges in scientific application areas and design supporting middleware.
MxN introduces a new capability into scientific computing, one not previously available. This means a new application space is open and not yet fully explored or utilized by application scientists.

    15. References
Ian Foster, David Kohr, Jr., Rakesh Krishnaiyer, Jace Mogill. Remote I/O: Fast Access to Distant Storage. Proceedings of the Fifth Workshop on Input/Output in Parallel and Distributed Systems, 1997.
Climate and Global Dynamics Division, UCAR. Community Climate System Model. http://www.cgd.ucar.edu/csm
Message Passing Interface Forum. http://www.mpi-forum.org
R. Thakur, W. Gropp, E. Lusk. An Abstract-Device Interface for Implementing Portable Parallel-I/O Interfaces. In Proceedings of the Sixth Symposium on the Frontiers of Massively Parallel Computation, p. 180, 1996.
S. A. Hutchinson, J. N. Shadid, R. S. Tuminaro. Aztec User's Guide: Version 2.30. Sandia National Laboratories. http://www.cs.sandia.gov/CRF/aztec1.html, 1998.

    16. Extra

    17. Goals
Large-scale scientific computations span multiple scales, domains, and disciplines and are developed by large, diverse teams (multiphysics simulations). Create multiphysics simulations using existing community parallel codes. Enable rapid prototyping and testing without rewriting codes.

    18. MxN Problem Defined
The transfer of data from a parallel program running on M processors to another parallel program running on N processors. Ideally, neither program knows the number of processes in the other.

    19. Solving the MxN Problem
Existing approaches: use process 0 on all components (for example, the CCSM models), or read/write through files.
Our approach: decouple the application into components, provide an easy migration path for existing applications, enable an intuitive model, and use the MPI I/O interface.

    20. RI Support
It is critical to have a cluster where we can install variant file systems and modified middleware, such as ROMIO with new abstract devices. Next phase: components on different clusters; a fast network connection to university clusters is critical for testing. Storage updates provide the ability to switch between MxN and standard file I/O.
MxN introduces a new capability into scientific computing, one not previously available. This means a new application space is open and not yet fully explored or utilized by application scientists.
