Charm++ Data-driven Objects. L. V. Kale
Parallel Programming
• Decomposition
  • What to do in parallel
• Mapping
  • Which processor does each task
• Scheduling (sequencing)
  • On each processor
• Machine-dependent expression
  • Express the above decisions for the particular parallel machine
The parallel objects model of Charm++ automates mapping, scheduling, and machine-dependent expression.
Shared objects model
• Basic philosophy:
  • Let the programmer decide what to do in parallel
  • Let the system handle the rest:
    • Which processor executes what, and when
    • With some override control for the programmer, when needed
• Basic model:
  • The program is a set of communicating objects
  • Objects only know about other objects (not processors)
  • The system maps objects to processors
    • And may remap the objects dynamically, e.g. for load balancing
• Shared objects, not shared memory
  • In between the “shared nothing” of message passing and the “shared everything” of SAS (shared address space)
  • Additional information sharing mechanisms
  • “Disciplined” sharing
Charm++
• Charm++ programs specify parallel computations consisting of a number of “objects”
• How do they communicate?
  • By invoking methods on each other, typically asynchronously
  • Also by sharing data using “specifically shared variables”
• What kinds of objects?
  • Chares: singleton objects
  • Chare arrays: generalized collections of objects
  • Advanced: chare groups (used by library writers and the system)
Data Driven Execution in Charm++
[Figure: objects on two processors; each processor has its own scheduler and message queue.]
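The idea in the figure can be illustrated with a small plain C++ sketch. It is not Charm++ code: the names Message, Counter and Scheduler are hypothetical, and a single FIFO queue stands in for one processor's message queue (Charm++ also supports prioritized scheduling, which this sketch ignores). The scheduler repeatedly picks the next message and invokes a method on the object it is addressed to, so execution is driven by the arrival of data.

```cpp
#include <cstddef>
#include <queue>
#include <vector>

// Hypothetical message: names the destination object and carries some data.
struct Message {
    std::size_t target;  // index of the destination object on this processor
    int payload;         // data delivered to the object
};

// A toy data-driven object: it only runs when a message arrives for it.
class Counter {
public:
    int total = 0;
    void receive(int value) { total += value; }
};

// Stand-in for one processor's scheduler and its message queue.
class Scheduler {
public:
    explicit Scheduler(std::size_t numObjects) : objects_(numObjects) {}

    void enqueue(const Message& m) { queue_.push(m); }

    // Deliver messages in FIFO order until the queue is empty.
    // Returns the number of messages processed.
    int run() {
        int processed = 0;
        while (!queue_.empty()) {
            Message m = queue_.front();
            queue_.pop();
            objects_.at(m.target).receive(m.payload);
            ++processed;
        }
        return processed;
    }

    const Counter& object(std::size_t i) const { return objects_.at(i); }

private:
    std::vector<Counter> objects_;
    std::queue<Message> queue_;
};
```

No object ever blocks waiting for a particular message in this sketch; an object with nothing addressed to it simply does not run.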
Need for Proxies
• Consider:
  • Object x of class A wants to invoke method f of object y of class B
  • x and y are on different processors
  • What should the syntax be?
    • y->f(…)? Doesn’t work, because y is not a local pointer
• Needed:
  • Instead of “y” we must use an ID that is valid across processors
  • Method invocation should use this ID
  • Some part of the system must pack the parameters and send them
  • Some part of the system on the remote processor must invoke the right method on the right object with the parameters supplied
Charm++ solution: proxy classes
• Classes with remotely invocable methods
  • Inherit from the “chare” class (system defined)
  • Entry methods can have only one parameter: a subclass of message
• For each chare class D
  • which has methods that we want to invoke remotely
  • the system automatically generates a proxy class CProxy_D
  • Proxy objects know where the real object is
  • Methods invoked on the proxy simply put the data in an “envelope” and send it out to the destination
• Each chare object has a global ID
  • CkChareID thishandle; // thishandle inherited from “chare”
  • You can also get the ID of a chare when you create it:
    • CProxy_D *p = new CProxy_D(msgPtr);
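A rough plain C++ sketch of the proxy idea follows. It is not the generated CProxy_D code: Envelope, Runtime and ProxyD are hypothetical names, and an in-process queue stands in for the network. The proxy holds only a global ID; calling f on it packs the parameters into an envelope and returns immediately, and the "remote" side later looks up the real object by ID and invokes the method.

```cpp
#include <cstddef>
#include <deque>
#include <map>

// Hypothetical envelope: destination ID plus the packed parameters of f.
struct Envelope {
    int destId;
    int a;
    int b;
};

// The real object, which lives wherever the runtime placed it.
class D {
public:
    int last = 0;
    void f(int a, int b) { last = a + b; }
};

// Stand-in for the runtime system: owns objects and in-flight envelopes.
class Runtime {
public:
    // Create an object and return its global ID.
    int create() {
        int id = nextId_++;
        objects_[id] = D{};
        return id;
    }

    void send(const Envelope& e) { inFlight_.push_back(e); }

    // Deliver every pending envelope: find the object, unpack, invoke.
    std::size_t deliverAll() {
        std::size_t delivered = 0;
        while (!inFlight_.empty()) {
            Envelope e = inFlight_.front();
            inFlight_.pop_front();
            objects_.at(e.destId).f(e.a, e.b);
            ++delivered;
        }
        return delivered;
    }

    const D& lookup(int id) const { return objects_.at(id); }

private:
    int nextId_ = 0;
    std::map<int, D> objects_;
    std::deque<Envelope> inFlight_;
};

// Hypothetical proxy: same method name as D, but it only sends a message.
class ProxyD {
public:
    ProxyD(Runtime& rt, int id) : rt_(rt), id_(id) {}

    // Asynchronous invocation: returns before D::f has run.
    void f(int a, int b) { rt_.send(Envelope{id_, a, b}); }

private:
    Runtime& rt_;
    int id_;
};
```

The caller never holds a pointer to the D object, which is what lets a runtime place or move the object without changing the calling code.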
Chare creation and method invocation
    Msg * m = new Msg();
    m->arg = 25;
    CProxy_D *x = new CProxy_D(m);
    Msg2 * m2 = new Msg2();
    m2->a = 5;
    m2->b = 7;
    x->f(m2);
Sequential equivalent:
    y = new D(25);
    y->f(5,7);
Alternatively:
    x->f(new Msg2(5,7));
Chares (Data-driven Objects)
• Regular C++ classes,
  • with some methods designated as remotely invocable (called entry methods)
• Entry methods have only one parameter:
  • of type message
• Creation of an instance of chare class C:
  • CProxy_C * p = new CProxy_C(msg);
• To create an instance of C on a specified processor “pe”:
  • new CProxy_C(msg, pe);
• CProxy_C: a proxy class generated by Charm for the chare class C declared by the user
Messages
• A user-defined C++ class
  • Inherits from a system-defined class
  • Messages can be communicated to others as parameters
  • Has regular data fields
• Declaration: normal C++
  • Inherit from a system-defined class
• Creation (just the usual C++):
  • MsgType * m = new MsgType;
Remote method invocation
• Proxy classes:
  • For each chare class C, the system generates a proxy class
    • (C : CProxy_C)
• Each chare has a global ID (ChareID)
  • Global in the sense of being valid on all processors
  • thishandle (analogous to this) gets you the ChareID
  • You can send thishandle in messages
• Given a handle h, you can create a proxy:
  • CProxy_C p(h); // or q = new CProxy_C(h)
  • p.method(msg); // or q->method(msg);
    CkChareID mainhandle;

    // Execution begins here; the CkArgMsg carries argc/argv.
    main::main(CkArgMsg * m) {
      int i = 0;
      for (i=0; i<100; i++)
        new CProxy_piPart();
      responders = 100;
      count = 0;
      mainhandle = thishandle; // readonly initialization
    }

    void main::results(DataMsg *msg) {
      count += msg->count;
      if (0 == --responders) {
        CkPrintf("pi=: %f \n", 4.0*count/100000);
        CkExit(); // exit scheduler after method returns
      }
    }
    piPart::piPart() {
      // declarations..
      CProxy_main mainproxy(mainhandle);
      srand48((long) this);
      mySamples = 100000/100;
      // i < mySamples, so that the 100 parts together draw exactly the
      // 100000 samples that main::results divides by
      for (i = 0; i < mySamples; i++) {
        x = drand48();
        y = drand48();
        if ((x*x + y*y) <= 1.0) localCount++;
      }
      DataMsg *result = new DataMsg;
      result->count = localCount;
      mainproxy.results(result);
      delete this;
    }
Alternatively, the message can be created and sent in one step:
    mainproxy.results(new DataMsg(localCount));
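For comparison, here is a sequential sketch of the same Monte Carlo estimate in standard C++. It is an illustration only: countHits plays the role of one piPart, estimatePi plays the role of main summing the results, and std::mt19937 with an arbitrary per-part seed is swapped in for srand48/drand48. The function names and the seeding scheme are assumptions, not part of the slides.

```cpp
#include <random>

// Role of one piPart: count the random points of the unit square that
// fall inside the quarter circle of radius 1.
long countHits(long samples, unsigned seed) {
    std::mt19937 gen(seed);
    std::uniform_real_distribution<double> dist(0.0, 1.0);
    long hits = 0;
    for (long i = 0; i < samples; ++i) {
        double x = dist(gen);
        double y = dist(gen);
        if (x * x + y * y <= 1.0) ++hits;
    }
    return hits;
}

// Role of main: split the work into parts, sum the partial counts,
// and scale by 4 because the quarter circle has area pi/4.
double estimatePi(int parts, long totalSamples) {
    long perPart = totalSamples / parts;
    long count = 0;
    for (int p = 0; p < parts; ++p) {
        // Arbitrary distinct seed per part (an assumption of this sketch).
        count += countHits(perPart, 1000u + static_cast<unsigned>(p));
    }
    return 4.0 * count / static_cast<double>(perPart * parts);
}
```

Calling estimatePi(100, 100000) mirrors the 100 parts of 1000 samples each used in the chare version; the parts are independent, which is what makes the decomposition into chares straightforward.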
Generation of proxy classes
• How does Charm generate the proxy classes?
  • It needs help from the programmer
  • Name the classes and methods that can be remotely invoked
  • Declare them in a special “charm interface” file (pgm.ci)
  • Include the generated code in your program

pgm.ci (generates PiMod.decl.h and PiMod.def.h):
    mainmodule PiMod {
      message DataMsg;
      mainchare main {
        entry main();
        entry results(DataMsg *);
      };
      chare piPart {
        entry piPart(void);
      };
    };

pgm.h:
    #include "PiMod.decl.h"
    ..

pgm.C:
    …
    #include "PiMod.def.h"
Charm++
• Data-driven objects
• Message classes
• Asynchronous method invocation
• Prioritized scheduling
• Object arrays
• Object groups:
  • Global object with a “representative” on each PE
• Information sharing abstractions
  • Readonly data
  • Accumulators
  • Distributed tables
Object Arrays
• A collection of chares,
  • with a single global name for the collection, and
  • each member addressed by an index
• Mapping of element objects to processors is handled by the system
[Figure: the user's view is the whole array A[0], A[1], A[2], A[3], A[..]; the system view on one processor holds only some elements, e.g. A[0] and A[3].]
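The split between the user's view and the system view can be sketched in plain C++. This is a hypothetical illustration, not how Charm++ implements placement: LocationTable is an invented name, and the round-robin default map is an assumption chosen only because it is simple. The point is that callers address elements by index, while a separate table records which processor currently holds each element and can change without the callers noticing.

```cpp
#include <map>
#include <vector>

// Hypothetical location table: array index -> processor currently holding it.
class LocationTable {
public:
    explicit LocationTable(int numPes) : numPes_(numPes) {}

    // Assumed default placement: round-robin over processors.
    // Indices are assumed to be non-negative.
    int homePe(int index) const { return index % numPes_; }

    // Insert a new element at its default location.
    void insert(int index) { where_[index] = homePe(index); }

    // Record a migration; throws std::out_of_range for unknown elements.
    void migrate(int index, int destPe) { where_.at(index) = destPe; }

    // Where would a message addressed to this index be delivered?
    int lookup(int index) const { return where_.at(index); }

    // The "system view" of one processor: the elements it holds, in index order.
    std::vector<int> elementsOn(int pe) const {
        std::vector<int> result;
        for (const auto& entry : where_) {
            if (entry.second == pe) result.push_back(entry.first);
        }
        return result;
    }

private:
    int numPes_;
    std::map<int, int> where_;
};
```

With three processors and elements 0 through 4 inserted, elementsOn(0) yields elements 0 and 3 under this assumed map, the same elements as the system view in the figure.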
Introduction
• Elements are parallel objects like chares
• Elements are indexed by a user-defined data type: [sparse] 1D, 2D, 3D, tree, ...
• Send messages to an index, receive messages at the element
• Reductions and broadcasts across the array
• Dynamic insertion, deletion, and migration, and everything still has to work!
• Interfaces with the automatic load balancer
1D Declare & Use
In the interface (.ci) file:
    module m {
      message HiMsg;
      array [1D] Hello {
        entry Hello(void);
        entry void SayHi(HiMsg *);
      };
    };
In the .C file:
    CProxy_Hello p = CProxy_Hello::ckNew();
    for (int i=12; i<73; i+=7)
      p[i].insert();
    p.doneInserting();
    p[12].SayHi(new HiMsg(...));
1D Definition
    class Hello : public ArrayElement1D {
    public:
      Hello(void) {
        // thisArrayID and thisIndex are inherited from ArrayElement1D
        ... thisArrayID ...
        ... thisIndex ...
      }
      void SayHi(HiMsg *m) {
        ...
      }
      Hello(CkMigrateMessage *m) {}
    };
3D Declare & Use
In the interface (.ci) file:
    module m {
      message HiMsg;
      array [3D] Hello {
        entry Hello(void);
        entry void SayHi(HiMsg *);
      };
    };
In the .C file:
    CProxy_Hello p = CProxy_Hello::ckNew();
    for (int i=0; i<800000; i++)
      p(x(i),y(i),z(i)).insert();
    p.doneInserting();
    p(12,23,7).SayHi(new HiMsg(...));
3D Definition
    class Hello : public ArrayElement3D {
    public:
      Hello(void) {
        ... thisArrayID ...
        ... thisIndex.x, thisIndex.y, thisIndex.z ...
      }
      void SayHi(HiMsg *m) {
        ...
      }
      Hello(CkMigrateMessage *m) {}
    };
3D Definition (with pup)
    class Hello : public ArrayElement3D {
    public:
      Hello(void) {
        ... thisArrayID ...
        ... thisIndex.x, .y, .z ...
      }
      void SayHi(HiMsg *m) {
        ...
      }
      Hello(CkMigrateMessage *m) {}
      void pup(PUP::er &p) {
        ArrayElement3D::pup(p);
        p(myVar1); p(myVar2);
        ...
      }
    };
Generalized “arrays”: Declare & Use
In the interface (.ci) file:
    module m {
      message HiMsg;
      array [Foo] Hello {
        entry Hello(void);
        entry void SayHi(HiMsg *);
      };
    };
In the .C file:
    CProxy_Hello p = CProxy_Hello::ckNew();
    for (...)
      p[CkArrayIndexFoo(..)].insert();
    p.doneInserting();
    p[CkArrayIndexFoo(..)].SayHi(..);
General Definition
    class CkArrayIndexFoo : public CkArrayIndex {
      Bar b; // char b[8]; float b[2]; ..
    public:
      CkArrayIndexFoo(...) {
        ...
        nInts = sizeof(b)/sizeof(int);
      }
    };

    class Hello : public ArrayElementT<CkArrayIndexFoo> {
    public:
      Hello(void) {
        ... thisIndex ...
Collective ops
• Broadcast message SayHi:
    p.SayHi(new HiMsg(...));
• Reduce x across all elements:
    contribute(sizeof(x), &x, CkReduction::sum_int);
• Where do reduction results go?
  • To a reduction “client” function, registered by the caller (typically as soon as the array is created):
    CProxy_A a = CProxy_A::ckNew();
    a.setReductionClient(clientFunction, (void *) refData);
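The contribute-then-client pattern can be sketched in plain C++ as follows. This is only an illustration of the control flow under simplifying assumptions: SumReduction is a hypothetical class, all elements live in one process, and the combination is an integer sum. Each element contributes its value, and once every element has contributed, the combined result is handed to the client function that was registered up front.

```cpp
#include <functional>
#include <utility>

// Hypothetical sum reduction over a fixed number of contributing elements.
class SumReduction {
public:
    SumReduction(int numElements, std::function<void(int)> client)
        : expected_(numElements), client_(std::move(client)) {}

    // Called once per element per reduction. The client runs only when
    // the last expected contribution has arrived.
    void contribute(int x) {
        sum_ += x;
        ++received_;
        if (received_ == expected_) {
            int result = sum_;
            // Reset first so the client may start the next reduction.
            sum_ = 0;
            received_ = 0;
            client_(result);
        }
    }

private:
    int expected_;
    int received_ = 0;
    int sum_ = 0;
    std::function<void(int)> client_;
};
```

No element waits for the result here: contributors return immediately, and the result reaches the program only through the client callback.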
Migration support
• Delete element i:
    p[i].destroy();
• Migrate to processor destPe:
    migrateMe(destPe);
• Enable the load balancer: by creating a load balancing object
• Provide pack/unpack functions:
  • Each object that needs this provides a “pup” method
  • (pup is a single abstraction that allows data traversal for determining size, packing, and unpacking)
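To make the "single abstraction" point concrete, here is a minimal plain C++ sketch of a pup-style traversal. It is not the PUP::er class from Charm++: Pupper and Particle are hypothetical, and only trivially copyable fields are handled. The object writes one pup method listing its fields, and the same method is reused for sizing, packing, and unpacking depending on the mode of the traversal object.

```cpp
#include <cstddef>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical traversal object with three modes.
class Pupper {
public:
    enum Mode { Sizing, Packing, Unpacking };

    explicit Pupper(Mode mode, std::vector<char>* buf = nullptr)
        : mode_(mode), buf_(buf) {}

    // Visit one field. What happens depends on the mode.
    template <typename T>
    void operator()(T& v) {
        static_assert(std::is_trivially_copyable<T>::value,
                      "this sketch only handles trivially copyable fields");
        switch (mode_) {
        case Sizing:
            break;  // only the byte count below matters
        case Packing: {
            const char* bytes = reinterpret_cast<const char*>(&v);
            buf_->insert(buf_->end(), bytes, bytes + sizeof(T));
            break;
        }
        case Unpacking:
            std::memcpy(&v, buf_->data() + pos_, sizeof(T));
            break;
        }
        pos_ += sizeof(T);
    }

    // Number of bytes traversed so far.
    std::size_t size() const { return pos_; }

private:
    Mode mode_;
    std::vector<char>* buf_;
    std::size_t pos_ = 0;
};

// Example object: one pup method serves all three purposes.
struct Particle {
    int id = 0;
    double mass = 0.0;
    void pup(Pupper& p) {
        p(id);
        p(mass);
    }
};
```

Because the field list is written once, the size computation, the packed layout, and the unpacking order cannot drift apart, which is the main attraction of the approach.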
Object Groups
• A group of objects (chares)
  • with exactly one representative (branch) on each processor
  • A single ID for the group as a whole
  • Invoke methods in one branch (asynchronously), in all branches (broadcast), or in the local branch
• Creation:
  • groupId = new CProxy_C(msg)
• Remote invocation:
  • CProxy_C p(groupId);
  • p.methodName(msg); // p.methodName(msg, peNum);
  • p.LocalBranch->f(….);
Information sharing abstractions
• Observation:
  • Information is shared in several specific modes in parallel programs
• Other models support only a limited set of modes:
  • Shared memory: everything is shared (the sledgehammer approach)
  • Message passing: messages are the only method
• Charm++ identifies and supports several modes:
  • Readonly / writeonce
  • Tables (hash tables)
  • Accumulators
  • Monotonic variables
Compiling Charm++ programs
• Need to define an interface specification file
  • mod.ci for each module mod
  • Contains declarations that the system uses to produce proxy classes
  • These produced classes must be included in your mod.C file
  • See the examples provided on the class web site
• More information:
  • Manuals, example programs, papers
  • http://charm.cs.uiuc.edu
• These slides are currently at:
  • http://charm.cs.uiuc.edu/kale/cse320
Fortran 90 version
• Quick implementation on top of Charm++
• How to use:
  • Follow the example program, with the same basic concepts
  • Only use object arrays, for now
    • The most useful construct
  • Object groups can be implemented in C++, if needed