Parallel Virtual Machines in Kepler. Daniel Zinn Xuan Li Bertram Ludaescher. Eighth Biennial Ptolemy Mini-conference. Thursday, April 16, 2009, Berkeley, California. Outline. Motivation & Related Work Lightweight Parallel PN Engine Kepler PPN Director Demo
Outline • Motivation & Related Work • Lightweight Parallel PN Engine • Kepler PPN Director • Demo • Future Directions & Conclusion
Motivation • Kepler is used to build and automate scientific applications • may be compute-intensive • may be data-intensive and long-running • Speed up executions to save scientists' time • … by leveraging distributed resources
Distribution Efforts in Kepler / PTII • Remote execution of a complete workflow • Hydrant (Tristan King) • Web service for remote execution (Jianwu Wang) • Parameter sweeps with Nimrod/K (Colin Enticott, David Abramson, Ilkay Altintas) • Distribution within actors • “Plumbing Workflows” with ad-hoc ssh-control (Norbert Podhorszki) • Globus actors in Kepler: GlobusJob, GlobusProxy, GridFTP, GridJob • gLite actors available through ITER • Web service executions by actors • Distribution of few or all actors • Distributed SDF Director (Daniel Cuadrado) • Pegasus Director (Daniel Cuadrado and Yang Zhao) • Master-Slave Distributed Execution (Chad Berkley and Lucas Gilbert) with DistributedCompositeActor • PPN Director (Daniel Zinn and Xuan Li) Thanks to Jianwu for help with overview
Outline • Motivation & Related Work • Lightweight Parallel PN Engine • Kepler PPN Director • Demo • Future Directions & Conclusion
Lightweight Parallel PN Engine (LPPN) • Motivation • PN is an inherently parallel MoC • Build a simple, efficient distributed PN engine • Design Requirements • KISS • Avoid centralization as much as possible • Provide Actor and Port abstractions • Allow actors to be written in different languages • “Experimentation Platform” for scheduling, data routing, … • Design Principles • One actor = one process • Communication directly between actors • Central component only for setup, termination detection, …
LPPN – Technology Choices • C++ for core libraries • Actor, Port, Token as C++ classes • Parallel Virtual Machine (PVM) for parallelization • Thin layer on top of machine clusters (pool of hosts) • Message passing • Simple RPC implemented on top of PVM messages • SWIG for adding higher-level languages on top of the core • Perl/Python interfaces for writing actors • Perl interfaces for composing and starting workflows • Java interface for composing, starting, and monitoring workflows
LPPN – C++ Core Library: Ports • Ports are parameterized by the data type sent through the port. • Data types: • string • int, double, … • Custom structs • BLOB • Transfer via PVM messages
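To illustrate what "parameterized by the data type" can mean in practice, here is a minimal Python sketch of a type-checked port. It is not LPPN's C++ API: the `Port` class, its `send` method, and the in-memory `sent` list (standing in for the PVM message transfer) are assumptions made for this example.

```python
from typing import Generic, List, Type, TypeVar

T = TypeVar("T")


class Port(Generic[T]):
    """Hypothetical port that only accepts tokens of one declared type."""

    def __init__(self, name: str, token_type: Type[T]) -> None:
        self.name = name
        self.token_type = token_type
        self.sent: List[T] = []  # stands in for the PVM message transfer

    def send(self, token: T) -> None:
        # Reject tokens whose runtime type does not match the port's type.
        if not isinstance(token, self.token_type):
            raise TypeError(
                f"port {self.name!r} carries {self.token_type.__name__}, "
                f"got {type(token).__name__}"
            )
        self.sent.append(token)


sizes = Port("sizes", int)
sizes.send(640)
```

Sending a string through `sizes` raises `TypeError`, which is the kind of mismatch a typed port can catch before data leaves the actor.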
LPPN – C++ Core Library: Tokens • BLOB_Token to encapsulate BLOB as files • Construct from files • LinkTo(path) • File data is known to the workflow system • Data is sent via rsync/scp by the LPPN system • Each actor has private workspace in file system • Generic Token • Polymorphic Tokens for COMAD workflows • Open, Close, BLOB, …
LPPN – Actor Communication (Ports) • During Setup-Time • Connect OutPorts to InPorts (1:1 mapping) • During Runtime • Port implementation sends data from port to port • BlobToken-Ports handle file movement • Block on read • Block on write when the buffer is full (buffer size is still fixed) • Token buffers in PVM, file buffers in actor directory • Gathering of statistics • Actor status, token counts, …
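The blocking behavior listed above (block on read, block on write when a fixed-size buffer is full) can be sketched with Python's standard `queue.Queue`. This is only an analogy: threads stand in for LPPN's actor processes, the queue stands in for the PVM token buffer, and `BUFFER_SIZE` and the `None` end-of-stream sentinel are assumptions of this example.

```python
import queue
import threading

BUFFER_SIZE = 2  # assumed fixed buffer capacity, in tokens

channel = queue.Queue(maxsize=BUFFER_SIZE)  # one OutPort connected to one InPort
received = []


def producer():
    for token in range(5):
        channel.put(token)  # blocks while the buffer is full
    channel.put(None)  # sentinel: no more tokens


def consumer():
    while True:
        token = channel.get()  # blocks while the buffer is empty
        if token is None:
            break
        received.append(token)


threads = [threading.Thread(target=producer), threading.Thread(target=consumer)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Because the producer can never run more than `BUFFER_SIZE` tokens ahead, a slow consumer throttles its upstream actor without any central scheduler.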
Command-line Actor and Actors in Perl • Example: ConvertResize Actor • Automatically create input and output ports • Read from input ports, call command, write to output ports
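The read, call, write loop of a command-line actor might look like the following Python sketch. The slide's example is written in Perl and its code is not reproduced in this transcript, so everything here is hypothetical: `command_line_actor`, `upper_cmd`, and the use of plain lists as ports are assumptions, and the Python interpreter is used as a portable stand-in for the wrapped command.

```python
import subprocess
import sys
from typing import Callable, Iterable, List


def command_line_actor(
    build_argv: Callable[[str], List[str]],
    in_port: Iterable[str],
    out_port: List[str],
) -> None:
    """For every input token, run one command and emit its stdout."""
    for token in in_port:
        result = subprocess.run(
            build_argv(token), capture_output=True, text=True, check=True
        )
        out_port.append(result.stdout.strip())


def upper_cmd(token: str) -> List[str]:
    # Stand-in command: the Python interpreter upper-casing its argument.
    return [sys.executable, "-c", "import sys; print(sys.argv[1].upper())", token]


outputs: List[str] = []
command_line_actor(upper_cmd, ["a.png", "b.png"], outputs)
```

Passing the command as an argument list rather than a shell string avoids quoting problems when tokens contain spaces.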
Workflow Setup • Start actors • Connect all ports • Unleash actors
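The three setup phases can be sketched as follows. This is an illustration under assumptions, not LPPN code: threads replace processes, a `queue.Queue` replaces the PVM channel, and a shared `threading.Event` named `go` plays the role of the "unleash" signal.

```python
import queue
import threading

go = threading.Event()  # set once every port is connected
log = []


class Actor(threading.Thread):
    def __init__(self, name, fire):
        super().__init__(name=name)
        self.fire = fire
        self.inp = None
        self.out = None

    def run(self):
        go.wait()  # started, but idle until unleashed
        self.fire(self)


def source(actor):
    for i in range(3):
        actor.out.put(i)
    actor.out.put(None)  # end-of-stream sentinel


def sink(actor):
    for token in iter(actor.inp.get, None):
        log.append(token)


# 1. Start actors (they wait for the go signal).
src, snk = Actor("source", source), Actor("sink", sink)
src.start()
snk.start()

# 2. Connect all ports.
link = queue.Queue()
src.out = link
snk.inp = link

# 3. Unleash actors.
go.set()
src.join()
snk.join()
```

Separating "start" from "unleash" guarantees that no actor sends a token before its downstream port exists.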
Workflow-Script (sneak preview) • Simple DSL for defining workflows • Create Composite Actors • Specify partial parameterizations • Connect ports easily • Check sanity (all ports connected, types OK, …) • Give deployment directives (hostname, co-locations) • Run workflow
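The sanity checks named above (all ports connected, matching types) could be implemented roughly like this. The slides do not show the DSL itself, so `ActorSpec`, `check_workflow`, and the tuple format for links are hypothetical names chosen for this sketch.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class ActorSpec:
    name: str
    in_ports: Dict[str, type] = field(default_factory=dict)
    out_ports: Dict[str, type] = field(default_factory=dict)


# A link is (source actor, out port, destination actor, in port).
Link = Tuple[str, str, str, str]


def check_workflow(actors: List[ActorSpec], links: List[Link]) -> List[str]:
    """Return a list of problems; an empty list means the workflow is sane."""
    by_name = {a.name: a for a in actors}
    problems = []
    connected = set()
    for src, out, dst, inp in links:
        out_type = by_name[src].out_ports[out]
        in_type = by_name[dst].in_ports[inp]
        if out_type is not in_type:
            problems.append(
                f"type mismatch: {src}.{out} ({out_type.__name__}) -> "
                f"{dst}.{inp} ({in_type.__name__})"
            )
        connected.add((src, out))
        connected.add((dst, inp))
    for a in actors:
        for port in list(a.in_ports) + list(a.out_ports):
            if (a.name, port) not in connected:
                problems.append(f"unconnected port: {a.name}.{port}")
    return problems


reader = ActorSpec("reader", out_ports={"files": str})
resize = ActorSpec("resize", in_ports={"in": str}, out_ports={"out": str})
problems = check_workflow([reader, resize], [("reader", "files", "resize", "in")])
```

Here `problems` reports that `resize.out` is left dangling, which is the kind of error worth catching before any actor is started on a remote host.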
Outline • Motivation & Related Work • Lightweight Parallel PN Engine • Kepler PPN Director • Demo • Future Directions & Conclusion
Kepler PPN Director • Idea: Use Kepler as sophisticated GUI • Create, run and monitor LPPN workflows • Marrying LPPN and Kepler – The PPN Director • Drag’n’drop workflow creation (1:1 mapping for actors) • Parameter support • Hints for deployment from user • Monitor token sending and receiving • Monitor actor status • …
PPN Director – Architecture Overview • [Figure: architecture diagram with the labels Kepler, Local Machine, and LPPN]
PPN Director – Design Decisions • Proxy-Actors in Kepler represent Actors in LPPN • Repository of available LPPN Actors in XML file (next slide) • Actor-name • Parameters and default values • Ports • Generic PPN-Actor is configured using this information • Monitor actor state • Send data from Kepler Actors to LPPN actors and vice versa • PPN Director • Start Actors with parameters, deployment info • Connect Actors according to Kepler workflow • Unleash and stop workflow execution
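A repository entry with an actor name, parameters with defaults, and ports could be read as in the sketch below. The actual XML format is on a slide not captured in this transcript, so the element and attribute names, the `size` parameter, and its default value are all invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical repository format; the real schema is not shown in the slides.
REPOSITORY_XML = """
<actors>
  <actor name="ConvertResize">
    <parameter name="size" default="50%"/>
    <port name="in" direction="in" type="BLOB"/>
    <port name="out" direction="out" type="BLOB"/>
  </actor>
</actors>
"""


def load_repository(xml_text):
    """Map each actor name to its parameter defaults and port list."""
    repo = {}
    for actor in ET.fromstring(xml_text).findall("actor"):
        repo[actor.get("name")] = {
            "parameters": {
                p.get("name"): p.get("default") for p in actor.findall("parameter")
            },
            "ports": [
                (p.get("name"), p.get("direction"), p.get("type"))
                for p in actor.findall("port")
            ],
        }
    return repo


repo = load_repository(REPOSITORY_XML)
```

A generic proxy actor can then create its Kepler ports and parameters from `repo["ConvertResize"]` instead of requiring one hand-written Java class per LPPN actor.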
Monitoring Support • PPN Actors periodically probe LPPN actors for info • Number of tokens sent and received • Current actor state: • Working • Block on receive • Block on write • Sending BLOB tokens • Displayed on actor while workflow is running …
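The periodic probing described above can be sketched as a small polling loop. The state names come from the slide; the `Status` record, the `poll` function, the label format, and the fake probe are assumptions, since the slides do not show the actual RPC used to query an LPPN actor.

```python
import time
from dataclasses import dataclass
from enum import Enum


class ActorState(Enum):
    WORKING = "Working"
    BLOCK_ON_RECEIVE = "Block on receive"
    BLOCK_ON_WRITE = "Block on write"
    SENDING_BLOB = "Sending BLOB tokens"


@dataclass
class Status:
    state: ActorState
    tokens_sent: int
    tokens_received: int


def poll(probe, rounds, interval_s=0.01):
    """Probe an actor `rounds` times and return the labels to display."""
    labels = []
    for _ in range(rounds):
        s = probe()
        labels.append(
            f"{s.state.value} (sent {s.tokens_sent}, received {s.tokens_received})"
        )
        time.sleep(interval_s)
    return labels


# Fake probe that replays two canned status reports.
_samples = iter(
    [
        Status(ActorState.BLOCK_ON_RECEIVE, 0, 0),
        Status(ActorState.WORKING, 1, 2),
    ]
)
labels = poll(lambda: next(_samples), rounds=2)
```

Polling from the proxy side keeps the LPPN actors passive: they only answer status requests and never need to know that a GUI is attached.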
Communication with Regular PN Actors? • Sending data from regular Kepler actors to LPPN and vice versa
Communication with Regular PN Actors! • Sending data from regular Kepler actors to LPPN and vice versa
Outline • Motivation & Related Work • Lightweight Parallel PN Engine • Kepler PPN Director • Demo • Future Directions & Conclusion
Future Directions • Adding black-box (Java) actors as actors in LPPN • Detailed measurements of where actors spend their time • Automatic movement of actors to relieve CPU congestion (deploying a spring/mass model) • Automatic data parallelism (actor cloning and scatter+gather) • Overhaul of LPPN, maybe in Java, RMI, JNI • Better resource management
Conclusions • LPPN – a simple, fast and extensible PN engine • Kepler successfully used as front-end for LPPN • Kepler for staging & monitoring • Interoperability between Kepler and LPPN