Cactus in GrADS (HFA) Ian Foster Dave Angulo, Matei Ripeanu, Michael Russell
Presentation Outline • Introduction to Cactus • Cactus Applications • Cactus Architecture • Cactus Worm Thorn • Tequila Thorn (Cactus in GrADS) • Tequila Architecture • Issues
What is Cactus? Cactus is a freely available, modular, portable and manageable environment for collaboratively developing parallel, high-performance multidimensional simulations
Example Cactus Output Example output from Numerical Relativity Simulations
Cactus Applications • Application Thorns are Astrophysics applications • Calculate Schwarzschild Event Horizons for colliding black holes
Cactus Applications (cont.) • Candidate apps are Elliptical Solver or BenchADM • Abstract Topologies are simple 3D Grids
Cactus Applications (cont.) • Applications can easily be “linked” in with the other thorns used as tools. • Application Thorns are just selected and run with the other selected thorns
Cactus Architecture • Cactus Thorns: Computational Toolkit, Toolkit, Toolkit • Flesh • Make, Configure, CST • Operating Systems: Irix, SuperUX, Linux, Unicos, HP-UX, Solaris, OSF, NT, AIX
Cactus Model (cont.): Building an executable • Cactus Source: Flesh and Thorns (IOBasic, IOASCII, WaveToy, LDAP, Worm, …) • Configuration: Compiler options, Tool options, MPI options, HDF5 options
Running Cactus: Parameter File • Specify which thorns to activate • Specify global parameters • Specify restricted parameters • Specify private parameters
Cactus Model • Diagram: Cactus “flesh” internals, Cactus Application Thorn(s), Other Cactus Thorn(s), Cactus “Worm” Thorn • This is the currently working Cactus application framework that we will modify
Worm Thorn Functions • Initiates moving to new resource when scheduled time is exhausted • Contacts IS to get a new node to run on • Checkpoints application • Restarts application on the new node • Runs on single node only
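The Worm thorn's migration cycle can be sketched in a few lines of Python. This is only an illustration of the four steps listed above, not the thorn's actual code: `WormRun`, `migrate_if_expired`, and the `find_node` callback (standing in for the query to the IS) are hypothetical names, and the checkpoint is reduced to a single iteration counter.

```python
from dataclasses import dataclass, field


@dataclass
class WormRun:
    """Toy stand-in for a single-node Cactus run (hypothetical)."""
    node: str
    iteration: int = 0
    history: list = field(default_factory=list)  # nodes visited so far


def migrate_if_expired(run, elapsed_s, allotted_s, find_node):
    """Move the run to a new node once its scheduled time is exhausted."""
    if elapsed_s < allotted_s:
        return run  # time remains: keep running in place
    new_node = find_node()  # assumption: stands in for contacting the IS
    checkpoint = {"iteration": run.iteration}  # checkpoint the application
    # Restart on the new (single) node from the checkpoint.
    return WormRun(node=new_node,
                   iteration=checkpoint["iteration"],
                   history=run.history + [run.node])


run = WormRun(node="nodeA", iteration=120)
unchanged = migrate_if_expired(run, 1800, 3600, lambda: "nodeB")
moved = migrate_if_expired(run, 3600, 3600, lambda: "nodeB")
print(moved.node, moved.iteration, moved.history)
```

The restarted run keeps the iteration count from the checkpoint, which is the property the checkpoint/restart steps are meant to preserve.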
GrADS Cactus Model • Diagram: Cactus “flesh” internals, Cactus Application Thorn(s), Other Cactus Thorn(s), Cactus “Tequila” Thorn • We will start with “Worm” thorn code to make new “Tequila” thorn (Apotheosized Cactus Worm).
Tequila thorn functions • Receives event (generated by user) to initiate adapting resources. • Contacts ResourceSelector to get new bag of resources • Checkpoints application • Restarts application on the new resources
Events • Events that cause the user to want to adapt resources: • User changes parameters at runtime in a way that requires additional resources • Example: starting an analysis routine • Example: running an event horizon finder • User specifies that performance is not meeting expectations
Future Events • Possible future plans for automatic resource adaptation: • User changes parameters at runtime in a way that requires additional resources • Contract violations fire similar events • we were wrong the first time • resources get overloaded • more (or fewer) (or different) processors appear • distribution changed • resolution changed
Tequila thorn contacts ResourceSelector • ResourceSelector must be set up as service • Tequila thorn sends request for new bag of resources • ResourceSelector responds with the new bag
Request and Response • The Request to the ResourceSelector will be stored in the InformationService • Only the pointer to the data in the IS will be passed to the ResourceSelector • The Response from the ResourceSelector will also be stored in the IS • Only the pointer to the data in the IS will be passed back.
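The pointer-passing scheme described above can be illustrated with a toy in-memory store. This is a minimal sketch under stated assumptions: the `InformationService` class, its `put`/`get` methods, the `is://entry/N` key format, and the request fields are all invented for illustration and do not reflect the real GrADS interfaces.

```python
import itertools


class InformationService:
    """Toy in-memory stand-in for the IS (hypothetical interface)."""

    def __init__(self):
        self._store = {}
        self._ids = itertools.count(1)

    def put(self, data):
        """Store data and return a pointer (key) to it."""
        key = f"is://entry/{next(self._ids)}"  # made-up key format
        self._store[key] = data
        return key

    def get(self, key):
        return self._store[key]


def resource_selector(info, request_key):
    """Receive only a pointer, read the request, store the response."""
    request = info.get(request_key)
    bag = [f"host{i}" for i in range(request["processors"])]
    return info.put({"resources": bag})  # only the pointer goes back


info = InformationService()
request_key = info.put({"processors": 3})             # Tequila stores request
response_key = resource_selector(info, request_key)   # pointer in, pointer out
bag = info.get(response_key)["resources"]             # Tequila reads response
print(request_key, response_key, bag)
```

Neither side ever sends the request or the bag of resources directly; both travel through the store, so they remain available after the exchange.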
Tequila communication overview • Diagram: Resource Selector, Information Service, Cactus Tequila Thorn
Cactus Architecture in GrADS • Cactus Thorns: Computational Toolkit, Toolkit, Toolkit, GrADS Communication library • Flesh • Make, Configure, CST • Operating Systems: Irix, SuperUX, Linux, Unicos, HP-UX, Solaris, OSF, NT, AIX
Open Issues • How does Contract Monitor fit into architecture? • How does PPS fit into architecture? • How do COP and Application Launcher fit into architecture (Cactus has its own launcher and compiles its own code)? • How does Pablo fit into architecture (Which thorns are monitored, is flesh monitored)?
Slides Explaining Communication Details
Resource Selector Cactus Tequila Thorn Information Service Communication Details step 1 • Event sent to Tequila thorn requesting restart
Resource Selector Cactus Tequila Thorn Information Service Communication Details step 2 • Tequila stores AART in IS
Resource Selector Cactus Tequila Thorn Information Service Communication Details step 3 • Tequila sends request to ResourceSelector passing pointer to data in IS
Resource Selector Cactus Tequila Thorn Information Service Communication Details step 4 • ResourceSelector retrieves AART from IS
Resource Selector Cactus Tequila Thorn Information Service Communication Details step 5 • ResourceSelector stores bag of resources (in AART) in IS
Resource Selector Cactus Tequila Thorn Information Service Communication Details step 6 • ResourceSelector responds to Tequila passing pointer to data in IS
Resource Selector Cactus Tequila Thorn Information Service Communication Details step 7 • Tequila retrieves AART with new bag of resources from IS
Requirements • Using the IS for communication adds overhead. • Why do this? • GrADS requirement 1: do some things (e.g. compile) at one time and have the results stored in a persistent storage area. Pick these stored results up later and complete other phases.
Requirements (cont.) • GrADS requirement 2: Application people want to be able to allow users to manually interact in any of the "module interfaces." Tequila allows this to be done with a web client.
Slide Explaining Parallelism in Cactus
Parallelism in Cactus • Cactus is designed around a distributed memory model. Each thorn is passed a section of the global grid. • The actual parallel driver (implemented in a thorn) can use whatever method it likes to decompose the grid across processors and exchange ghost zone information - each thorn is presented with a standard interface, independent of the driver. • Standard driver distributed with Cactus (PUGH) is for a parallel unigrid and uses MPI for the communication layer • PUGH can do custom processor decomposition and static load balancing
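To make the idea of decomposing a grid and exchanging ghost zones concrete, here is a one-dimensional sketch. It is not PUGH's algorithm: the function `decompose`, the even block split, and the ghost width of one point are assumptions chosen only to show what each processor owns and which neighbouring points it needs copies of.

```python
def decompose(n_points, n_procs, ghost=1):
    """Split indices 0..n_points-1 into contiguous blocks, one per processor.

    Each block records the half-open range it owns and the wider range
    it must hold once ghost points from its neighbours are included.
    """
    base, extra = divmod(n_points, n_procs)
    blocks, start = [], 0
    for rank in range(n_procs):
        size = base + (1 if rank < extra else 0)  # spread the remainder
        end = start + size                        # owned range is [start, end)
        lo = max(0, start - ghost)                # no ghosts past the boundary
        hi = min(n_points, end + ghost)
        blocks.append({"rank": rank,
                       "owned": (start, end),
                       "with_ghosts": (lo, hi)})
        start = end
    return blocks


blocks = decompose(10, 3)
for block in blocks:
    print(block)
```

With 10 points on 3 processors, the middle block owns indices 4 to 6 and additionally holds copies of indices 3 and 7, which are the values a driver would have to refresh from the neighbouring processors after each update.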
Slide with Alternate Tequila Architecture
Sample Tequila Scenario • User asks to run an ADM simulation 400x400x400 for 1000 timesteps in 10s. • Resource selector contacted to obtain virtual machines • Best virtual machine selected based on performance model. • AM starts Cactus on that virtual machine (and monitors execution Contracts?) • User (or application manager) decides that the computation is advancing too slowly and decides to search for a better virtual machine • AM finds a better machine, commands the Cactus run to checkpoint, transfers files and restarts Cactus
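The "best virtual machine selected based on performance model" step might look like the sketch below. The performance model here is a deliberately crude assumption (total point updates divided by an aggregate update rate), and the candidate machines, their processor counts, and their rates are made-up numbers; the slides do not specify the model GrADS would use.

```python
def predicted_seconds(grid_points, timesteps, vm):
    """Toy model: total point updates divided by aggregate update rate."""
    return grid_points * timesteps / (vm["processors"] * vm["updates_per_s"])


def best_vm(candidates, grid_points, timesteps):
    """Pick the candidate with the smallest predicted run time."""
    return min(candidates,
               key=lambda vm: predicted_seconds(grid_points, timesteps, vm))


# Hypothetical candidates returned by a resource selector.
candidates = [
    {"name": "vmA", "processors": 64, "updates_per_s": 5.0e6},
    {"name": "vmB", "processors": 128, "updates_per_s": 4.0e6},
]

grid_points = 400 ** 3  # the 400x400x400 grid from the scenario
timesteps = 1000
choice = best_vm(candidates, grid_points, timesteps)
print(choice["name"], predicted_seconds(grid_points, timesteps, choice))
```

The same function could be re-evaluated later in the run: if a newly offered machine has a lower predicted time for the remaining timesteps, that is the point at which the AM would trigger checkpoint, transfer, and restart.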
Slides Explaining Different Tequila Architectures
Tequila Architecture Choices • Main presentation explained the short term Tequila Architecture • Open issues covered not-yet-resolved architectural choices for longer term integration
Worm Spawning • Diagram: two Cactus Flesh instances, each with a Worm Thorn; GIIS; Application Manager • 1. Resources Obtained • 2. Application Manager instructed to spawn new instance • 3. New instance spawned
Tequila Spawning • The short-term plan is to simply replace the GIIS with the UCSD Resource Selector • Tequila would make the request for new resources to the RS instead of the GIIS
Tequila Spawning • Diagram: two Cactus Flesh instances, each with a Tequila Thorn; UCSD Resource Selector; Application Manager • 1. Resources Obtained • 2. Application Manager instructed to spawn new instance • 3. New instance spawned
Tequila Spawning • The longer-term plan is not yet resolved. • One possibility is to put all GrADS pieces into the Application Manager
Diagram: two Cactus Flesh instances, each with a Tequila Thorn; Application Manager; UCSD Resource Selector • 1. Application Manager instructed to spawn new instance • 2. Resources Obtained • 3. New instance spawned