A GCM-Based Runtime Support for Parallel Grid Applications Elton Mathias, Françoise Baude and Vincent Cave {Elton.Mathias, Francoise.Baude, Vincent.Cave}@inria.fr CBHPC’08: Component-Based High Performance Computing Karlsruhe - October 16th, 2008
Outline • Introduction • Related work and positioning • Context • The ProActive Middleware • The Grid Component Model (GCM) • Extensions to GCM • The DiscoGrid Project and Runtime • Evaluation • Conclusion and perspectives
Introduction • Strong trend to integrate parallel resources into grids • Non-embarrassingly parallel applications must deal with heterogeneity, scalability and performance (+ legacy applications) • Message passing (MPI!) is established as the main paradigm to develop scientific applications • Asynchronism and group communications at application level • Applications must adapt to cope with a changing environment • The DiscoGrid project addresses these issues by offering a higher-level API and handling them at the runtime level
Related work and positioning • Grid-oriented MPI • optimizations at the communication layer • unmodified applications • strong code coupling • Ex.: GridMPI, MPICH-G2, PACX-MPI, MagPIE • Code coupling • use of components • standalone applications • weak code coupling • simplified communication • Ex.: DCA, Xchangemxn, Seine • Approach: code coupling, but with advanced support for multipoint interactions (fine-grained, tightly coupled component-based code coupling) • DiscoGrid Project / Runtime: MPI boosted with advanced collective operations, supported by a flexible component-based runtime that provides inter-cluster communication
ProActive Middleware • Grid middleware for parallel, distributed and multi-threaded computing • Featuring: • Deployment with support for several network protocols and cluster/grid tools • Reference implementation of the GCM • Legacy code wrapping • C <-> Java communication • Deployment and control of legacy MPI applications
Grid Component Model (GCM) • Defined in the context of the Institute on Programming Models of the CoreGRID Network of Excellence (EU project) • Extension of the Fractal component model addressing key grid concerns: programmability, interoperability, code reuse and efficiency • Main characteristics: • Hierarchical component model • primitive and composite components • Collective interfaces
ProActive/GCM standard interfaces • Collective interfaces are complementary: • gathercast (many-to-one): synchronization, parameter gathering and result dispatch • multicast (one-to-many): parallel invocation, parameter dispatch and result gathering • Standard collective interfaces are enough to support broadcast, scatter, gather and barriers • But they are not general enough to define many-to-many (MxN) operations
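To make the dispatch/gather semantics concrete, here is a minimal plain-Java sketch, not the actual ProActive/GCM interfaces or annotations; Worker, multicast and gathercast are illustrative names. A multicast call splits its parameter list across the bound components and gathers their results, while a gathercast call aggregates the contributions of several clients.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of multicast/gathercast semantics only; the real
// ProActive/GCM interfaces are defined through the component framework
// and its annotations, which are not shown here.
public class CollectiveItfSketch {

    // Server interface exposed by each of the N bound components.
    interface Worker {
        double process(List<Double> chunk);
    }

    // Multicast (one-to-many): split the parameter list across the bound
    // workers, invoke each of them (conceptually in parallel), and gather
    // the individual results.
    static List<Double> multicast(List<Worker> workers, List<Double> data) {
        int n = workers.size();
        int chunk = (data.size() + n - 1) / n;            // block dispatch of the parameter list
        List<Double> results = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            int from = Math.min(i * chunk, data.size());
            int to = Math.min(from + chunk, data.size());
            results.add(workers.get(i).process(data.subList(from, to)));
        }
        return results;                                   // result gathering
    }

    // Gathercast (many-to-one): wait for the M client contributions
    // (synchronization) and aggregate them into a single value.
    static double gathercast(List<Double> contributions) {
        return contributions.stream().mapToDouble(Double::doubleValue).sum();
    }

    public static void main(String[] args) {
        Worker sum = chunk -> chunk.stream().mapToDouble(Double::doubleValue).sum();
        List<Worker> workers = List.of(sum, sum);
        List<Double> partial = multicast(workers, List.of(1.0, 2.0, 3.0, 4.0));
        System.out.println("global sum = " + gathercast(partial)); // 10.0
    }
}
```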
Extending GCM collective interfaces: gather-multicast itfs • server gather-mcast • exposes a gathercast itf • connected to internal components • client gather-mcast • connects internal components • exposes a multicast itf • Communication semantics rely on 2 policies: a gather policy and a dispatch policy • A naïve gather-multicast MxN leads to bottlenecks in both communication policies
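A minimal sketch of how the two policies compose into the naïve gather-multicast MxN exchange; GatherPolicy, DispatchPolicy and mxn are illustrative names, not the GCM API. The single aggregation point that every message crosses is exactly the bottleneck mentioned above.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the two policies behind a gather-multicast interface and of the
// naïve MxN exchange they define. Names are illustrative, not the GCM API.
public class GatherMulticastSketch {

    interface GatherPolicy<T> { List<T> gather(List<List<T>> fromMSenders); }

    interface DispatchPolicy<T> { List<List<T>> dispatch(List<T> aggregated, int nReceivers); }

    // Naïve MxN: every message crosses a single aggregation point, which is
    // the bottleneck in both policies.
    static <T> List<List<T>> mxn(List<List<T>> inputs, int nReceivers,
                                 GatherPolicy<T> g, DispatchPolicy<T> d) {
        return d.dispatch(g.gather(inputs), nReceivers);
    }

    public static void main(String[] args) {
        GatherPolicy<Integer> concatenate = parts -> {
            List<Integer> all = new ArrayList<>();
            parts.forEach(all::addAll);
            return all;
        };
        DispatchPolicy<Integer> blocks = (data, n) -> {
            List<List<Integer>> out = new ArrayList<>();
            int size = (data.size() + n - 1) / n;
            for (int i = 0; i < n; i++) {
                int from = Math.min(i * size, data.size());
                int to = Math.min(from + size, data.size());
                out.add(new ArrayList<>(data.subList(from, to)));
            }
            return out;
        };
        // 2 senders, 3 receivers: [[1, 2], [3, 4, 5]] -> [[1, 2], [3, 4], [5]]
        System.out.println(mxn(List.of(List.of(1, 2), List.of(3, 4, 5)), 3, concatenate, blocks));
    }
}
```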
Gather-Multicast Optimization • Efficient communication requires direct bindings • Solution: controllers responsible for establishing direct MxN bindings + distribution of the communication policies • Configured through 3 operations: • (re)binding configuration (R) • dispatch policy (D) • gather policy (G) • For now, these operations must be coded by developers
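A hedged sketch of the shape such a controller could take, assuming the three operations map to methods; the interface name, method names and signatures are assumptions for illustration, not the actual ProActive/GCM controller API.

```java
import java.util.List;

// Hypothetical shape of a controller establishing direct MxN bindings;
// names and signatures are assumptions, not the ProActive/GCM API.
public interface MxNBindingController<T> {

    // (R) (re)binding configuration: decide which sender is bound directly
    // to which receiver, bypassing the central gather-multicast interface.
    void rebind(List<String> senderIds, List<String> receiverIds);

    // (D) dispatch policy: how a sender splits its local data among the
    // receivers it is now directly bound to.
    List<List<T>> dispatch(List<T> localData, int nDirectReceivers);

    // (G) gather policy: how a receiver merges the pieces it gets from the
    // senders directly bound to it.
    List<T> gather(List<List<T>> receivedPieces);
}
```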
The DiscoGrid Project • Promotes a new paradigm to develop non-embarrassingly parallel grid applications • Target applications: • domain decomposition, requiring the solution of PDEs • Ex.: electromagnetic wave propagation, fluid flow problems • DiscoGrid API • Resources seen as a hierarchical organization • Hierarchical identifiers • Neighborhood-based communication (update) • C/C++ and Fortran bindings • DiscoGrid Runtime • Grid-aware partitioner • Modeling of the resource hierarchy • Support for the DG API, converging to MPI when possible • Support for inter-cluster communication
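A small Java sketch of what the hierarchical identifiers enable, assuming a simple cluster.process numbering; DGRank and sameCluster are illustrative names, not the published C/C++ or Fortran bindings. Intra-cluster neighbors can converge to plain MPI, while inter-cluster neighbors need the runtime's support.

```java
// Illustrative model of hierarchical identifiers (cluster level, process
// level); names and structure are assumptions for this sketch only.
public class HierarchicalIdSketch {

    record DGRank(int cluster, int process) {
        // Neighborhood updates between ranks of the same cluster can
        // converge to plain MPI; the others need inter-cluster support.
        boolean sameCluster(DGRank other) { return cluster == other.cluster; }
        @Override public String toString() { return cluster + "." + process; }
    }

    public static void main(String[] args) {
        DGRank a = new DGRank(0, 3), b = new DGRank(0, 7), c = new DGRank(1, 0);
        System.out.println(a + " / " + b + " intra-cluster? " + a.sameCluster(b)); // true  -> MPI
        System.out.println(a + " / " + c + " intra-cluster? " + a.sameCluster(c)); // false -> runtime
    }
}
```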
Optimizing the “update” operation • DGOptimizationController.optimize(AggregationMode, DispatchMode, DGNeighborhood, …)
DiscoGrid Communication • The DG Runtime converts DG API calls into MPI or DG Runtime calls • Point-to-point communications • Collective communications • Bcast, gather, scatter • Neighborhood-based • update
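A hedged sketch of that routing decision for a point-to-point send, assuming the choice is driven by whether source and destination sit in the same cluster; Transport, choose and the class name are illustrative, not the DiscoGrid implementation. Intra-cluster traffic thus converges to plain MPI, as stated above, while inter-cluster traffic goes through the component-based runtime.

```java
// Hedged sketch of the transport selection for a point-to-point send;
// all names here are placeholders for illustration.
public class DGRoutingSketch {

    interface Transport { void send(int dest, byte[] payload); }

    static Transport choose(int srcCluster, int dstCluster,
                            Transport nativeMpi, Transport gcmRuntime) {
        // Intra-cluster traffic converges to MPI; inter-cluster traffic goes
        // through the component-based runtime.
        return (srcCluster == dstCluster) ? nativeMpi : gcmRuntime;
    }

    public static void main(String[] args) {
        Transport mpi = (dest, payload) -> System.out.println("MPI send to " + dest);
        Transport gcm = (dest, payload) -> System.out.println("GCM runtime send to " + dest);
        choose(0, 0, mpi, gcm).send(5, new byte[0]); // same cluster    -> MPI
        choose(0, 1, mpi, gcm).send(5, new byte[0]); // across clusters -> GCM runtime
    }
}
```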
Evaluation • Conducted on the Grid5000 testbed • 8 sites (sophia, rennes, grenoble, lyon, bordeaux, toulouse, lille) • Machines with different processors (Intel Xeon EM64T 3GHz and IA32 2.4GHz, AMD Opteron 218, 246, 248 and 285) • 2 or 4GB of memory per node • 2 clusters with Myrinet-10G and Myrinet-2000, 2.5Gb/s backbone • ProActive 3.9, MPICH 1.2.7p1, Java SDK 1.6.0_02
Experiment: P3D • Poisson3D equation discretized by finite differences and solved iteratively with Jacobi • Bulk-synchronous behavior: • Concurrent computation • Jacobi over the subdomain • Reduction of results • reduce operation • Update of subdomain borders • update operation • Regular mesh of 1024³ elements (4GB of data) • 100 iterations of the algorithm • 2 versions: legacy version (pure MPI) and DG version
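A structural sketch of this bulk-synchronous loop; this is not the benchmark code (which is a legacy MPI application), and Subdomain, jacobiSweep, updateBorders and solve are placeholder names used only to show the compute / reduce / update pattern.

```java
import java.util.function.DoubleUnaryOperator;

// Structural sketch of the bulk-synchronous loop: Jacobi sweep, global
// reduction, border update. Names are placeholders, not the benchmark code.
public class P3DLoopSketch {

    interface Subdomain {
        double jacobiSweep();   // concurrent computation: Jacobi over the subdomain
        void updateBorders();   // neighborhood-based "update" of subdomain borders
    }

    static void solve(Subdomain sub, DoubleUnaryOperator reduceAll, int maxIter, double tol) {
        for (int it = 0; it < maxIter; it++) {
            double localResidual = sub.jacobiSweep();
            double globalResidual = reduceAll.applyAsDouble(localResidual); // reduce operation
            if (globalResidual < tol) break;
            sub.updateBorders();                                            // update operation
        }
    }

    public static void main(String[] args) {
        Subdomain dummy = new Subdomain() {
            double residual = 1.0;
            public double jacobiSweep() { return residual /= 2; }
            public void updateBorders() { /* no-op in this single-process sketch */ }
        };
        solve(dummy, r -> r, 100, 1e-3); // identity "reduction" stands in for a global ReduceAll
        System.out.println("converged");
    }
}
```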
Experiment: P3D (cont.) • P3D execution time: • the entire DG communication is asynchronous • the DG update is faster • the ReduceAll in the DG version happens in parallel • P3D update time: • time decreases as the data per node decreases • the simple gather-mcast interface is not scalable • the update with the neighborhood happens in parallel
Conclusion • Extensions to GCM provide many-to-many (MxN) communication • versatile mechanism • any kind of communication • even with limited connectivity • optimizations ensure efficiency and scalability • The goal is not to compete with MPI, but experimental results showed good performance • The DiscoGrid Runtime itself can be considered a successful component-based grid programming approach supporting an SPMD model • The API and Runtime also permitted a higher-level approach to develop non-embarrassingly parallel applications, where group communication/synchronization are handled as non-functional aspects
Perspectives • Better evaluate the work through more complex simulations (BHE, CEM) and real-size data meshes • Evaluate the work in comparison to grid-oriented versions of MPI • Explore further: • the separation of concerns in component architectures: consider SPMD parallel programming as a component configuration and (re)assembling activity instead of message passing • the adaptation of applications to their execution context • the definition and usage of collective interfaces (GridComp)