Coupling Parallel Programs via MetaChaos
Alan Sussman
Computer Science Dept., University of Maryland
With thanks to Mike Wiltberger (Dartmouth/NCAR)
What is MetaChaos?
• A runtime meta-library that achieves direct data transfers between data structures managed by different parallel libraries
• "Runtime meta-library" means that it interacts with the data parallel libraries and languages used by separate programs (including MPI)
• Can exchange data between separate (sequential or parallel) programs, running on different machines
• Also manages data transfers between different libraries in the same application
• This is often referred to as the MxN problem in parallel programming (e.g. by the CCA Forum)
How does MetaChaos work?
• It all starts with the data descriptor (cf. an ESMF state)
  • information about how the data in each program is distributed across the processors
  • usually supplied by the library/program developer
  • we are working on generalizing this to work with complex data distributions
• MetaChaos then uses a linearization of the data to be moved (the regions) to determine the best method for moving data from a set of regions in program A (S_A) to a set of regions in program B (S_B)
• Moving the data is a three-step process:
  1. LS_A = l_ProgX(S_A) – linearize the source regions in program X
  2. LS_B = LS_A – copy the linearized data from X to Y
  3. S_B = l^-1_ProgY(LS_B) – apply the inverse linearization on the destination
• The only constraint on this operation is that each region must have the same number of elements
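To make the three steps concrete, here is a minimal, purely illustrative C++ sketch of the pattern (the Region type and both helper functions are invented for this example, not MetaChaos calls); in the real system the copy step is a message between programs rather than a local assignment:

// Generic sketch of the linearize / copy / delinearize idea (not the
// MetaChaos API): each side knows only its own layout, and the linear
// buffer is the common intermediate form.
#include <cstddef>
#include <vector>

// Hypothetical region: a strided slice [lo, hi) of a local array.
struct Region { std::size_t lo, hi, stride; };

// Step 1, l_ProgX: pack A's regions into a linear buffer.
std::vector<double> linearize(const double* a, const std::vector<Region>& sA) {
    std::vector<double> buf;
    for (const Region& r : sA)
        for (std::size_t i = r.lo; i < r.hi; i += r.stride)
            buf.push_back(a[i]);
    return buf;
}

// Step 3, l^-1_ProgY: unpack the buffer into B's regions, which may be
// laid out completely differently, as long as element counts match.
void delinearize(double* b, const std::vector<Region>& sB,
                 const std::vector<double>& buf) {
    std::size_t k = 0;
    for (const Region& r : sB)
        for (std::size_t i = r.lo; i < r.hi; i += r.stride)
            b[i] = buf[k++];
}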
MetaChaos goals
• Main goal is minimal modification to existing programs
• To enable a program to be coupled to others, add calls to:
  • describe the data distribution across processors – build a data descriptor
  • describe the data to be moved (imported or exported) – build a set of regions
  • move the data – build a communication pattern/schedule, then use it
    • this is the part that requires interaction with the other program
MetaChaos goals
• The other main goal is low overhead and efficient data transfers
• Low overhead comes from building schedules efficiently
  • take advantage of the characteristics of the data descriptor
• Efficient data transfers come from customized all-to-all message passing between source and destination processes
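A schedule-driven transfer can be sketched generically: once the schedule records exactly which elements each process exchanges with each peer, the transfer reduces to point-to-point messages between only the processes that actually share data. Below is a minimal, illustrative C++/MPI sketch of executing such a schedule (MetaChaos itself currently runs over PVM, and its real schedule format is not shown in these slides; Message and run_schedule are invented names):

// Minimal sketch of executing a precomputed communication schedule.
#include <mpi.h>
#include <vector>

struct Message {                  // one entry of the schedule
    int peer;                     // rank of the other process
    std::vector<double> buf;      // packed elements for/from that peer
};

void run_schedule(std::vector<Message>& sends, std::vector<Message>& recvs) {
    std::vector<MPI_Request> reqs;
    reqs.reserve(sends.size() + recvs.size());
    // Post receives first, then sends; only peers named in the schedule
    // are contacted, so sparse patterns avoid a full all-to-all exchange.
    for (Message& m : recvs) {
        reqs.emplace_back();
        MPI_Irecv(m.buf.data(), (int)m.buf.size(), MPI_DOUBLE,
                  m.peer, 0, MPI_COMM_WORLD, &reqs.back());
    }
    for (Message& m : sends) {
        reqs.emplace_back();
        MPI_Isend(m.buf.data(), (int)m.buf.size(), MPI_DOUBLE,
                  m.peer, 0, MPI_COMM_WORLD, &reqs.back());
    }
    MPI_Waitall((int)reqs.size(), reqs.data(), MPI_STATUSES_IGNORE);
}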
More details
• Bindings for C/C++ and Fortran77; Fortran90 coming (data descriptor issues)
  • similar interface to MCEL, but with direct communication (no server)
• Currently message passing and program interconnection via PVM
  • programs/components can run wherever is convenient
  • heading towards Globus and other Grid services
• Each model/program can do whatever it wants internally (MPI, pthreads, sockets, …) – and can start up by whatever mechanism it wants (e.g. CCSM)
A Simple Example: Wave Eq Using P++

#include <A++.h>

// Problem parameters (iNPES, iNumX, iNumY, dW, dTime, daX, dPi, dLenX,
// dC, dDT, dDX, iNSteps) are set up elsewhere – omitted for space.
int main(int argc, char **argv) {
  Optimization_Manager::Initialize_Virtual_Machine("", iNPES, argc, argv);

  // Solution at time levels n-1, n, n+1 (interior plus ghost cells)
  doubleArray daUnm1(iNumX+2, iNumY+2), daUn(iNumX+2, iNumY+2);
  doubleArray daUnp1(iNumX+2, iNumY+2);
  Index I(1, iNumX), J(1, iNumY);  // Indices for computational domain

  // Initialize the first two time levels
  for (int j = 1; j < iNumY+1; j++) {
    daUnm1(I, j) = sin(dW*dTime + (daX(I)*2*dPi)/dLenX);
    daUn(I, j)   = sin(dW*0 + (daX(I)*2*dPi)/dLenX);
  }
  // Apply BC omitted for space

  // Evolve forward in time: second-order wave equation update
  for (int i = 1; i < iNSteps; i++) {
    daUnp1(I, J) = ((dC*dC*dDT*dDT)/(dDX*dDX)) *
                     (daUn(I-1, J) - 2*daUn(I, J) + daUn(I+1, J))
                   + 2*daUn(I, J) - daUnm1(I, J);
    // Apply BC omitted for space
  }

  Optimization_Manager::Exit_Virtual_Machine();
}
Split into two using MetaChaos

#include <A++.h>

int main(int argc, char **argv) {
  Optimization_Manager::Initialize_Virtual_Machine("", NPES, argc, argv);

  // Register this program and rendezvous with the other program
  this_pgm  = InitPgm(pgm_name, NPES);
  other_pgm = WaitPgm(other_pgm_name, NPES_other);
  Sync2Pgm(this_pgm, other_pgm);

  // Build the set of regions to export (the boundary columns)
  BP_set = Alloc_setOfRegion();
  left[0] = 4; right[0] = 4; stride[0] = 1;
  left[1] = 5; right[1] = 5; stride[1] = 1;
  reg = Alloc_R_Block(DIM, left, right, stride);
  Add_Region_setOfRegion(reg, BP_set);

  // Build the data descriptor for daUn, then the communication schedule
  BP_da = getPartiDescriptor(&daUn);
  sched = ComputeScheduleForSender(…, BP_da, BP_set, …);

  for (i = 1; i < iNSteps; i++) {
    daUnp1(I, J) = ((dC*dC*dDT*dDT)/(dDX*dDX)) *
                     (daUn(I-1, J) - 2*daUn(I, J) + daUn(I+1, J))
                   + 2*daUn(I, J) - daUnm1(I, J);
    // Exchange boundary data with the other program via the schedule
    iDataMoveSend(other_pgm, sched, daUn.getLocalArray().getDataPointer());
    iDataMoveRecv(other_pgm, sched, daUn.getLocalArray().getDataPointer());
    Sync2Pgm(this_pgm, other_pgm);
  }

  Optimization_Manager::Exit_Virtual_Machine();
}
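The slide shows only the sending program's side. For orientation, the receiving program would follow the same pattern: build its own descriptor and region set, build the matching schedule, and mirror the send/receive calls. A rough sketch, assuming a receiver-side scheduling call symmetric to ComputeScheduleForSender (the actual MetaChaos receiver API is not shown in these slides):

// Receiver-side sketch (hypothetical: ComputeScheduleForReceiver is
// assumed by symmetry with ComputeScheduleForSender shown above).
// The receiver describes where the incoming elements should land.
BR_set = Alloc_setOfRegion();                 // regions to import
left[0] = 0; right[0] = 0; stride[0] = 1;     // e.g. the ghost columns
left[1] = 1; right[1] = 1; stride[1] = 1;
reg = Alloc_R_Block(DIM, left, right, stride);
Add_Region_setOfRegion(reg, BR_set);

BR_da = getPartiDescriptor(&daUn);            // descriptor for local daUn
sched = ComputeScheduleForReceiver(…, BR_da, BR_set, …);  // assumed name

// Inside the time loop, mirror the sender's calls:
iDataMoveRecv(other_pgm, sched, daUn.getLocalArray().getDataPointer());
iDataMoveSend(other_pgm, sched, daUn.getLocalArray().getDataPointer());
Sync2Pgm(this_pgm, other_pgm);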
Space weather framework
• A set of tools/services
  • not an integrated framework
• Allows new models/programs to interoperate (exchange data) with ones that already use the tools/interfaces
• An application builder plugs together the various models and specifies how/when they interact (exchange data)
• At least 5 physical models already, with more to come
  • from CISM (Center for Integrated Space Weather Modeling, led by Boston U.)
What are we working on now?
• Adding generalized block data distributions and completely irregular, explicit distributions (sketched below)
• Infrastructure for controlling interactions between programs
  • the tools for building coupled applications that run in a high performance, distributed, heterogeneous Grid environment – not just a coordination language
  • built on top of basic Grid services (Globus, NWS, resource schedulers/co-schedulers, etc.)
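As a rough illustration of what those distributions have to record (a generic sketch, not MetaChaos's descriptor format; both struct names are invented for this example): a generalized block distribution stores explicit, possibly uneven block boundaries, while a completely irregular distribution maps every element to its owner explicitly.

#include <cstddef>
#include <vector>

// Generalized block: each process owns one contiguous block, but the
// boundaries may be uneven, so they must be stored explicitly.
struct GeneralizedBlock1D {
    std::vector<std::size_t> boundaries;  // boundaries[p] = first global
                                          // index owned by process p
    int owner(std::size_t g) const {      // which process owns index g?
        int p = 0;
        while (p + 1 < (int)boundaries.size() && boundaries[p + 1] <= g) ++p;
        return p;
    }
};

// Completely irregular: one (owner, local offset) entry per element.
struct IrregularDistribution {
    std::vector<int>         owner;   // owner[g] = owning process
    std::vector<std::size_t> local;   // local[g] = offset on that process
};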
What is Overture?
• A collection of C++ classes that can be used to solve PDEs on overlapping grids
• Key features
  • High-level interface for PDEs on adaptive and curvilinear grids
  • Provides a library of finite difference operators
    • conservative and non-conservative
    • 2nd and 4th order
  • Uses the A++/P++ array classes for serial/parallel array operations
  • Extensive grid generation capabilities
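To give a flavor of that high-level interface, here is a small example in the style of the Overture primer, reconstructed from memory – treat the exact class and function names, and the grid file name, as assumptions rather than verified API:

#include "Overture.h"

int main(int argc, char *argv[]) {
  Overture::start(argc, argv);           // initialize Overture (and A++/P++)

  // Read an overlapping grid produced by the grid generator Ogen
  // ("mygrid.hdf" is a placeholder file name)
  aString nameOfOGFile = "mygrid.hdf";
  CompositeGrid cg;
  getFromADataBase(cg, nameOfOGFile);
  cg.update();

  // A grid function living on every component grid of cg
  Range all;
  realCompositeGridFunction u(cg, all, all, all);
  u = 1.;

  // Attach difference operators, then apply one with a single call
  CompositeGridOperators op(cg);
  u.setOperators(op);
  realCompositeGridFunction ux = u.x();  // x-derivative on all grids

  Overture::finish();
  return 0;
}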
Overture: A toolkit for solving PDEs – layered components
• Solvers: Oges, Ogmg, OverBlown
• Operators: div, grad, BCs
• Grid generator: Ogen
• Adaptive mesh refinement
• Mappings (geometry)
• Grids: MappedGrid, GridCollection
• Grid functions: MappedGridFunction, GridCollectionFunction
• A++/P++ array class
• Graphics (OpenGL), database (HDF), Boxlib (LBL)