
A GCM-Based Runtime Support for Parallel Grid Applications

A GCM-Based Runtime Support for Parallel Grid Applications. Elton Mathias, Françoise Baude and Vincent Cave {Elton.Mathias, Francoise.Baude, Vincent.Cave}@inria.fr CBHPC’08: Component-Based High Performance Computing, Karlsruhe, October 16th, 2008.


Presentation Transcript


  1. A GCM-Based Runtime Support for Parallel Grid Applications Elton Mathias, Françoise Baude and Vincent Cave {Elton.Mathias, Francoise.Baude, Vincent.Cave}@inria.fr CBHPC’08: Component-Based High Performance Computing Karlsruhe - October 16th, 2008

  2. Outline • Introduction • Related works and positioning • Context • The ProActive Middleware • The Grid Component Model (GCM) • Extensions to GCM • The DiscoGrid Project and Runtime • Evaluation • Conclusion and perspectives

  3. Introduction • Strong trend to integrate parallel resources into grids • Non-embarrassingly parallel applications must deal with heterogeneity, scalability and performance (+ legacy applications) • Message passing (MPI!) is established as the main paradigm for developing scientific applications • Asynchronism and group communications at application level • Applications must adapt to cope with a changing environment • The DiscoGrid project intends to solve these issues by offering a higher-level API and handling them at the runtime level

  4. Related works and positioning • Grid-oriented MPI • optimizations at the communication layer • unmodified applications • strong code coupling • Ex.: GridMPI, MPICH-G2, PACX-MPI, MagPIE • Code coupling • use of components • standalone applications • weak code coupling • simplified communication • Ex.: DCA, Xchangemxn, Seine • Approach: code coupling, but with advanced support for multipoint interactions (fine-grained, tight component-based code coupling) • DiscoGrid Project / Runtime: MPI boosted with advanced collective operations, supported by a flexible component-based runtime that provides inter-cluster communication

  5. ProActive Middleware • Grid middleware for parallel, distributed and multi-threaded computing • Featuring: • Deployment with support for several network protocols and cluster/grid tools • Reference implementation of the GCM • Legacy code wrapping • C <-> Java communication • Deployment and control of legacy MPI applications

  6. Grid Component Model (GCM) • Defined in the context of the Institute on Programming Models of the CoreGRID Network of Excellence (EU project) • Extension of the Fractal component model addressing key grid concerns: programmability, interoperability, code reuse and efficiency • Main characteristics: • Hierarchical component model • primitive and composite components • Collective interfaces

  7. ProActive/GCM standard interfaces • Collective interfaces are complementary: • gathercast (many-to-one): synchronization, parameter gathering and result dispatch • multicast (one-to-many): parallel invocation, parameter dispatch and result gathering • Standard collective interfaces are enough to support broadcast, scatter, gather and barriers • But they are not general enough to define many-to-many (MxN) operations
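The semantics of the two collective interfaces on slide 7 can be illustrated with a small plain-Python sketch. This is not ProActive/GCM code: `multicast`, `gathercast`, `broadcast` and `scatter` are hypothetical names chosen for illustration, plain callables stand in for server components, and a thread pool stands in for remote parallel invocation.

```python
from concurrent.futures import ThreadPoolExecutor


def broadcast(params, n):
    """Dispatch policy (assumed): every server receives the full parameter."""
    return [params] * n


def scatter(params, n):
    """Dispatch policy (assumed): strided split of the parameter list."""
    return [params[i::n] for i in range(n)]


def multicast(servers, params, dispatch):
    """One-to-many: dispatch the parameters, invoke all servers in
    parallel, and gather the results in server order."""
    chunks = dispatch(params, len(servers))
    with ThreadPoolExecutor(max_workers=len(servers)) as pool:
        futures = [pool.submit(s, c) for s, c in zip(servers, chunks)]
        return [f.result() for f in futures]


def gathercast(server, client_params):
    """Many-to-one: once every client has contributed (the list is
    assumed complete here), invoke the server once on the gathered
    parameters and hand the same result back to each client."""
    result = server(client_params)
    return {client: result for client in range(len(client_params))}
```

With `broadcast` as the dispatch policy, `multicast` behaves like a broadcast followed by a gather of results; with `scatter` it behaves like a scatter. A `gathercast` whose server does nothing acts as a barrier, which matches the slide's claim that these two interfaces cover broadcast, scatter, gather and barriers.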

  8. Extending GCM collective interfaces: gather-multicast interfaces • server gather-multicast • exposes a gathercast interface • connected to internal components • client gather-multicast • connects internal components • exposes a multicast interface • Communication semantics rely on 2 policies: the gather policy and the dispatch policy • A naïve gather-multicast MxN leads to bottlenecks in both communication policies
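To see why the naïve scheme on slide 8 becomes a bottleneck, the following Python sketch routes an MxN exchange through a single gather step followed by a single dispatch step. The concatenating gather policy and the block-wise dispatch policy are assumptions made for illustration; the slides do not specify the actual policies.

```python
def gather_policy(contributions):
    """Assumed gather policy: concatenate the M contributions in sender order."""
    merged = []
    for part in contributions:
        merged.extend(part)
    return merged


def dispatch_policy(data, n):
    """Assumed dispatch policy: n contiguous blocks of near-equal size."""
    base, extra = divmod(len(data), n)
    blocks, start = [], 0
    for i in range(n):
        size = base + (1 if i < extra else 0)
        blocks.append(data[start:start + size])
        start += size
    return blocks


def naive_mxn(contributions, n):
    """Route everything through one interface.

    Returns the N receiver blocks and the number of elements that had
    to cross the single gather-multicast interface.
    """
    merged = gather_policy(contributions)
    return dispatch_policy(merged, n), len(merged)
```

In this sketch every element crosses the central interface, so the relayed volume equals the total data size regardless of M and N.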

  9. Gather-Multicast Optimization • Efficient communication requires direct bindings • Solution: controllers responsible for establishing direct MxN bindings + distribution of the communication policies • Configured along 3 operations: • (re)binding configuration (R) • dispatch policy (D) • gather policy (G) • For now, these operations must be coded by developers
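As a rough illustration of what a binding configuration (R) could compute, the Python sketch below derives direct sender-to-receiver bindings for one specific case: a data set that is block-distributed over M senders and must be block-distributed over N receivers. The function names and the block distribution are assumptions; on the slide, the R, D and G operations are left to the developer.

```python
def block_bounds(total, parts):
    """Half-open index ranges of a near-equal block distribution."""
    base, extra = divmod(total, parts)
    bounds, start = [], 0
    for i in range(parts):
        size = base + (1 if i < extra else 0)
        bounds.append((start, start + size))
        start += size
    return bounds


def direct_bindings(total, m, n):
    """Map (sender, receiver) to the index range they exchange directly.

    A binding exists only where a sender block and a receiver block
    overlap, so no data passes through a central interface.
    """
    bindings = {}
    for s, (s_lo, s_hi) in enumerate(block_bounds(total, m)):
        for r, (r_lo, r_hi) in enumerate(block_bounds(total, n)):
            lo, hi = max(s_lo, r_lo), min(s_hi, r_hi)
            if lo < hi:
                bindings[(s, r)] = (lo, hi)
    return bindings
```

For 6 elements, 2 senders and 3 receivers, this yields four direct bindings instead of one shared relay, and each sender only contacts the receivers whose ranges overlap its own.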

  10. The DiscoGrid Project • Promotes a new paradigm to develop non-embarrassingly parallel grid applications • Target applications: • domain decomposition, requiring the solution of PDEs • Ex.: electromagnetic wave propagation, fluid flow problems • DiscoGrid API • Resources seen as a hierarchical organization • Hierarchical identifiers • Neighborhood-based communication (update) • C/C++ and Fortran bindings • DiscoGrid Runtime • Grid-aware partitioner • Modeling of the resource hierarchy • Support for the DG API • converging to MPI when possible • Support for inter-cluster communication

  11. ProActive/GCM DiscoGrid Runtime

  12. Optimizing the “update” operation DGOptimizationController.optimize (AggregationMode, DispatchMode, DGNeighborhood, … )

  13. DiscoGrid Communication • The DG Runtime converts calls to the DG API into MPI or DG calls • Point-to-point communications • Collective communications • Bcast, gather, scatter • Neighborhood-based • update
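The conversion of DG API calls into MPI or DG calls can be pictured as a routing decision per peer. The Python sketch below is an assumption-laden illustration, not the DiscoGrid API: it models hierarchical identifiers as `(site, cluster, rank)` tuples and assumes that peers in the same cluster are reached through MPI while all others go through the component runtime.

```python
def same_cluster(a, b):
    """Hierarchical ids are assumed to be (site, cluster, rank) tuples."""
    return a[:2] == b[:2]


def route_update(me, neighbors):
    """Split a neighborhood into intra-cluster and inter-cluster peers.

    "mpi" lists peers assumed reachable through the local MPI library;
    "dg" lists peers that need inter-cluster communication.
    """
    plan = {"mpi": [], "dg": []}
    for peer in neighbors:
        plan["mpi" if same_cluster(me, peer) else "dg"].append(peer)
    return plan
```

A process with identifier `("sophia", 0, 1)` would then exchange borders with `("sophia", 0, 2)` over MPI and with `("rennes", 0, 0)` through the runtime.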

  14. Evaluation • Conducted on Grid5000 • 8 sites (sophia, rennes, grenoble, lyon, bordeaux, toulouse, lille) • Machines with different processors (Intel Xeon EM64T 3GHz and IA32 2.4GHz, AMD Opterons 218, 246, 248 and 285) • Memory of 2 or 4GB/node • 2 clusters with Myrinet-10G and Myrinet-2000, backbone at 2.5Gb/s • ProActive 3.9, MPICH 1.2.7p1, Java SDK 1.6.0_02

  15. Experiment: P3D • Poisson3D equation discretized by finite differences and solved iteratively with the Jacobi method • Bulk-synchronous behavior: • Concurrent computation • Jacobi over each subdomain • Reduction of results • reduce operation • Update of subdomain borders • update operation • Regular mesh of 1024³ elements (4GB of data) • 100 iterations of the algorithm • 2 versions: legacy version (pure MPI), DG version
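The bulk-synchronous structure of P3D (compute, reduce, update) can be sketched in a few lines of Python. To stay short, the sketch solves a 1D Poisson problem on two subdomains instead of the 3D problem of the experiment, runs sequentially, and uses the maximum change per iteration as the reduced quantity; all of these are simplifying assumptions, and the function names are hypothetical.

```python
def jacobi_sweep(u, f, h):
    """One Jacobi sweep for -u'' = f; the two end values are left untouched."""
    new = u[:]
    for i in range(1, len(u) - 1):
        new[i] = 0.5 * (u[i - 1] + u[i + 1] + h * h * f[i])
    return new


def solve_split(u, f, h, iterations):
    """Jacobi on two subdomains that each keep one ghost cell."""
    mid = len(u) // 2
    # The last cell of `left` and the first cell of `right` are ghosts.
    left, right = u[:mid + 1], u[mid - 1:]
    f_left, f_right = f[:mid + 1], f[mid - 1:]
    change = 0.0
    for _ in range(iterations):
        # Concurrent computation: one sweep per subdomain.
        new_left = jacobi_sweep(left, f_left, h)
        new_right = jacobi_sweep(right, f_right, h)
        # Reduction: global maximum change over both subdomains.
        change = max(
            max(abs(a - b) for a, b in zip(new_left, left)),
            max(abs(a - b) for a, b in zip(new_right, right)),
        )
        # Update: refresh each ghost cell from the neighbor's border.
        new_left[-1] = new_right[1]
        new_right[0] = new_left[-2]
        left, right = new_left, new_right
    # Drop the ghost cells when reassembling the global vector.
    return left[:-1] + right[1:], change
```

Because the ghost cells always hold exact copies of the neighbor's border values, the split solver reproduces the single-domain Jacobi iterates. The update step is the only point where subdomains communicate, which is why its cost matters in the evaluation.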

  16. Experiment: P3D (cont.) • P3D execution time • The entire DG communication is asynchronous • DG update is faster • ReduceAll in the DG version happens in parallel • P3D update time • Time decreases as the data per node decreases • Simple gather-mcast interface is not scalable • The update with the neighborhood happens in parallel

  17. Conclusion • Extensions to GCM provide many-to-many (MxN) communication • versatile mechanism • any kind of communication • even with limited connectivity • optimizations ensure efficiency and scalability • The goal is not to compete with MPI, but experimental results showed good performance • The DiscoGrid Runtime itself can be considered a successful component-based grid programming approach supporting an SPMD model • The API and Runtime also permitted a higher-level approach to developing non-embarrassingly parallel applications, where group communication and synchronization are handled as non-functional aspects

  18. Perspectives • Better evaluate the work through more complex simulations (BHE, CEM) and real-size data meshes • Evaluate the work done in comparison to grid-oriented versions of MPI • Explore deeper: • the separation of concerns in component architectures: consider SPMD parallel programming as a component configuration and (re)assembling activity instead of message passing. • adaptation of applications to contexts • the definition and usage of collective interfaces (GridComp)

  19. Questions
