A GCM-Based Runtime Support for Parallel Grid Applications Elton Mathias, Françoise Baude and Vincent Cave {Elton.Mathias, Francoise.Baude, Vincent.Cave}@inria.fr CBHPC’08: Component-Based High Performance Computing Karlsruhe - October 16th, 2008
Outline • Introduction • Related work and positioning • Context • The ProActive Middleware • The Grid Component Model (GCM) • Extensions to GCM • The DiscoGrid Project and Runtime • Evaluation • Conclusion and perspectives
Introduction • Strong trend to integrate parallel resources into grids • Non-embarrassingly parallel applications must deal with heterogeneity, scalability and performance (+ legacy applications) • Message passing (MPI!) is established as the main paradigm to develop scientific applications • Asynchronism and group communications at the application level • Applications must adapt to cope with a changing environment • The DiscoGrid project intends to address these issues by offering a higher-level API and handling them at the runtime level
Related work and positioning • Grid-oriented MPI • optimizations at the communication layer • unmodified applications • strong code coupling • Ex.: GridMPI, MPICH-G2, PACX-MPI, MagPIe • Code coupling • use of components • standalone applications • weak code coupling • simplified communication • Ex.: DCA, XChangeMxN, Seine Approach: code coupling, but with advanced support for multipoint interactions (fine-grained, tightly coupled component-based code coupling) DiscoGrid Project / Runtime: MPI boosted with advanced collective operations, supported by a flexible component-based runtime that provides inter-cluster communication
ProActive Middleware • Grid middleware for parallel, distributed and multi-threaded computing • Featuring: • Deployment with support for several network protocols and cluster/grid tools • Reference implementation of the GCM • Legacy code wrapping • C <-> Java communication • Deployment and control of legacy MPI applications
Grid Component Model (GCM) • Defined in the context of the Institute on Programming Models of the CoreGRID Network of Excellence (EU project) • Extension of the Fractal component model addressing key grid concerns: programmability, interoperability, code reuse and efficiency • Main characteristics: • Hierarchical component model • primitive and composite components • Collective interfaces
ProActive/GCM standard interfaces • Collective interfaces are complementary: • gathercast (many-to-one): synchronization, parameter gathering and result dispatch • multicast (one-to-many): parallel invocation, parameter dispatch and result gathering • Standard collective interfaces are enough to support broadcast, scatter, gather and barriers • But they are not general enough to define many-to-many (MxN) operations (see the sketch below)
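To make the two interface kinds concrete, here is a minimal, hypothetical Java sketch of what a multicast and a gathercast interface could look like; the names and signatures (SolverMulticast, ReductionGathercast, computeResidual, reduce) are illustrative assumptions and do not reproduce the actual ProActive/GCM API.

```java
import java.util.List;

// Multicast (one-to-many): one client call is turned into parallel invocations
// on all bound server interfaces; parameters may be broadcast or scattered and
// the individual results are gathered into a list on the caller's side.
interface SolverMulticast {
    List<Double> computeResidual(double[] subdomainData);
}

// Gathercast (many-to-one): calls issued by many clients are synchronized and
// their parameters aggregated into a single invocation on the server side,
// whose result is then dispatched back to each caller.
interface ReductionGathercast {
    double reduce(List<Double> partialResults);
}
```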
Extending GCM collective interfaces: gather-multicast interfaces • server gather-multicast • exposes a gathercast interface • connected to internal components • client gather-multicast • connects internal components • exposes a multicast interface • Communication semantics rely on 2 policies: a gather policy and a dispatch policy • A naïve gather-multicast MxN implementation leads to bottlenecks in both communication policies
Gather-Multicast Optimization • Efficient communication requires direct bindings • Solution: controllers responsible for establishing direct MxN bindings + distribution of the communication policies • Configured through 3 operations: • (re)binding configuration (R) • dispatch policy (D) • gather policy (G) • For now, these operations must be coded by developers (see the sketch below)
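As a rough illustration of what developers currently have to code, here is a hypothetical Java sketch of a controller exposing the three operations (R, D, G); all type names (MxNOptimizationController, RebindPolicy, DispatchPolicy, GatherPolicy) are assumptions, not the actual runtime API.

```java
// Hypothetical controller for optimizing gather-multicast (MxN) bindings.
// All type names here are illustrative assumptions.
interface MxNOptimizationController {
    // R: replace the centralized gather-multicast path with direct MxN bindings
    void rebind(RebindPolicy rebindPolicy);

    // D: how a call and its parameters are split among the N receivers
    void setDispatchPolicy(DispatchPolicy dispatchPolicy);

    // G: how the M incoming calls/results are synchronized and aggregated
    void setGatherPolicy(GatherPolicy gatherPolicy);
}

interface RebindPolicy   { /* mapping of the M senders onto the N receivers */ }
interface DispatchPolicy { /* e.g. broadcast, scatter, or a custom split */ }
interface GatherPolicy   { /* e.g. wait-for-all, reduce, or a custom merge */ }
```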
The DiscoGrid Project • Promotes a new paradigm to develop non-embarrassingly parallel grid applications • Target applications: • domain decomposition, requiring the solution of PDEs • Ex.: electromagnetic wave propagation, fluid flow problems • DiscoGrid API (sketched below) • Resources seen as a hierarchical organization • Hierarchical identifiers • Neighborhood-based communication (update) • C/C++ and Fortran bindings • DiscoGrid Runtime • Grid-aware partitioner • Modeling of the resource hierarchy • Support for the DG API • converging to MPI when possible • Support for inter-cluster communication
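To fix ideas, here is a hypothetical sketch of the programming model just listed: hierarchical identifiers and neighborhood-based communication. The real API is exposed through C/C++ and Fortran bindings; the Java-flavoured names below (HierarchicalId, DiscoGridApi, Neighborhood) are assumptions used only for illustration.

```java
// Illustrative, hypothetical sketch of the DiscoGrid programming model.
// The concrete API is provided as C/C++ and Fortran bindings.

// A hierarchical identifier, e.g. {clusterId, rankWithinCluster},
// mirroring the hierarchical organization of the resources.
class HierarchicalId {
    final int[] levels;
    HierarchicalId(int... levels) { this.levels = levels; }
}

interface DiscoGridApi {
    HierarchicalId myId();                                   // position in the resource hierarchy
    void send(HierarchicalId dest, double[] data);           // hierarchical point-to-point
    double reduceAll(double localValue);                     // collective reduction
    void update(Neighborhood neighbors, double[] borders);   // exchange of subdomain borders
}

interface Neighborhood { /* neighboring subdomains, produced by the grid-aware partitioner */ }
```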
Optimizing the "update" operation DGOptimizationController.optimize(AggregationMode, DispatchMode, DGNeighborhood, …)
DiscoGrid Communication • The DG Runtime converts DG API calls into either plain MPI calls or DG runtime calls, as sketched below • Point-to-point communications • Collective communications • Bcast, gather, scatter • Neighborhood-based • update
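A minimal sketch of this dispatch rule, assuming a hypothetical DgCallRouter inside the runtime and reusing the illustrative HierarchicalId type from the earlier sketch: intra-cluster traffic falls back to plain MPI, while inter-cluster traffic goes through the component (gather-multicast) bindings.

```java
// Hypothetical router: not the actual DG Runtime code, just the stated rule.
class DgCallRouter {
    private final MpiBackend mpi;   // wraps the local MPI library
    private final GcmBackend gcm;   // wraps the gather-multicast component bindings

    DgCallRouter(MpiBackend mpi, GcmBackend gcm) {
        this.mpi = mpi;
        this.gcm = gcm;
    }

    void send(HierarchicalId src, HierarchicalId dest, double[] data) {
        if (sameCluster(src, dest)) {
            mpi.send(dest, data);        // converges to a plain MPI point-to-point
        } else {
            gcm.remoteSend(dest, data);  // crosses cluster boundaries via component bindings
        }
    }

    private boolean sameCluster(HierarchicalId a, HierarchicalId b) {
        return a.levels[0] == b.levels[0];  // first hierarchy level = cluster
    }
}

interface MpiBackend { void send(HierarchicalId dest, double[] data); }
interface GcmBackend { void remoteSend(HierarchicalId dest, double[] data); }
```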
Evaluation • Conducted on Grid'5000 • 8 sites (sophia, rennes, grenoble, lyon, bordeaux, toulouse, lille) • Machines with different processors (Intel Xeon EM64T 3 GHz and IA32 2.4 GHz, AMD Opteron 218, 246, 248 and 285) • 2 or 4 GB of memory per node • 2 clusters with Myrinet-10G and Myrinet-2000 interconnects, 2.5 Gb/s backbone • ProActive 3.9, MPICH 1.2.7p1, Java SDK 1.6.0_02
Experiment: P3D • Poisson 3D equation discretized by finite differences and solved iteratively with the Jacobi method • Bulk-synchronous behavior: • Concurrent computation • Jacobi sweep over the subdomain • Reduction of results • reduce operation • Update of subdomain borders • update operation • Regular mesh of 1024³ elements (4 GB of data) • 100 iterations of the algorithm • 2 versions: legacy version (pure MPI), DG version
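The bulk-synchronous structure above can be summarized by the following sketch, written against the hypothetical DiscoGrid-style API from the earlier sketches (the actual P3D code is a legacy MPI application; jacobiSweep and borders are placeholder helpers, not real kernels).

```java
// Illustrative bulk-synchronous Jacobi iteration: compute, reduce, update.
class P3dSketch {
    void jacobiLoop(DiscoGridApi dg, Neighborhood neighbors, double[] subdomain, int iterations) {
        for (int i = 0; i < iterations; i++) {
            double localResidual = jacobiSweep(subdomain);        // concurrent computation on the local subdomain
            double globalResidual = dg.reduceAll(localResidual);  // reduction of results across all subdomains
            dg.update(neighbors, borders(subdomain));             // exchange of subdomain borders with neighbors
            // globalResidual could drive a convergence test; the experiment runs a fixed 100 iterations
        }
    }

    // Placeholder helpers: the real kernels belong to the legacy application.
    private double jacobiSweep(double[] subdomain) { /* one Jacobi sweep */ return 0.0; }
    private double[] borders(double[] subdomain)   { /* extract border values */ return new double[0]; }
}
```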
Experiment: P3D (cont.) P3D Execution Time • The entire DG communication is asynchronous • The DG update is faster • The ReduceAll in the DG version happens in parallel P3D Update Time • Time decreases as the data per node decreases • The simple gather-multicast interface is not scalable • The update with the neighborhood happens in parallel
Conclusion • Extensions to the GCM provide many-to-many (MxN) communication • versatile mechanism • any kind of communication • even with limited connectivity • optimizations ensure efficiency and scalability • The goal is not to compete with MPI, but the experimental results showed good performance • The DiscoGrid Runtime itself can be considered a successful component-based grid programming approach supporting an SPMD model • The API and Runtime also permitted a higher-level approach to developing non-embarrassingly parallel applications, where group communication/synchronization are handled as non-functional aspects
Perspectives • Better evaluate the work through more complex simulations (BHE, CEM) and real-size data meshes • Evaluate the work in comparison to grid-oriented versions of MPI • Explore further: • the separation of concerns in component architectures: consider SPMD parallel programming as a component configuration and (re)assembling activity instead of message passing • the adaptation of applications to their execution context • the definition and usage of collective interfaces (GridCOMP)