
Cactus in GrADS

Cactus in GrADS. Dave Angulo, Ian Foster Matei Ripeanu, Michael Russell Distributed Systems Laboratory The University of Chicago With: Gabrielle Allen, Thomas Dramlitsch, Ed Seidel, John Shalf, Thomas Radke. Presentation Outline. Cactus Overview Architecture Applications


Presentation Transcript


  1. Cactus in GrADS • Dave Angulo, Ian Foster, Matei Ripeanu, Michael Russell • Distributed Systems Laboratory, The University of Chicago • With: Gabrielle Allen, Thomas Dramlitsch, Ed Seidel, John Shalf, Thomas Radke

  2. Presentation Outline • Cactus Overview • Architecture • Applications • Cactus and Grid computing • Metacomputing, Worms, … • Proposed Cactus-GrADS project • The “Cactus-G worm” • Tequila thorn and architecture • Issues

  3. What is Cactus? Cactus is a freely available, modular, portable and manageable environment for collaboratively developing parallel, high-performance multidimensional simulations • Originally developed for astrophysics, but nothing about it is astrophysics-specific

  4. Cactus Applications • Example output from numerical relativity simulations (images)

  5. Cactus Architecture • Codes are constructed by linking a small core (flesh) with selected modules (thorns) • Custom linking/configuration tools • Core provides basic management services • A wide variety of thorns are supported • Numerical methods • Grids and domain decompositions • Visualization and steering • Etc.

  6. Cactus Architecture (diagram) • Thorns: Computational Toolkit and other toolkits • Flesh • Build tools: Make, Configure, CST • Operating systems: Irix, SuperUX, Linux, Unicos, HP-UX, Solaris, OSF, NT, AIX

  7. Cactus Applications • A Cactus “application” is just another thorn, “linked” with other tool thorns • Numerous astrophysics applications • E.g., calculate Schwarzschild event horizons for colliding black holes • Potential candidates for GrADS work • Elliptical Solver, BenchADM • Both use a 3-D grid abstract topology

  8. Cactus Model (cont.): Building an executable (diagram) • Cactus source: Flesh and Thorns (IOBasic, IOASCII, WaveToy, LDAP, Worm, …) • Configuration: Compiler options, Tool options, MPI options, HDF5 options

  9. Running Cactus: Parameter File • Specify which thorns to activate • Specify global parameters • Specify restricted parameters • Specify private parameters

  10. Parallelism in Cactus • Distributed memory model: each thorn is passed a section of the global grid • The parallel driver (implemented in a thorn) can use whatever method it likes to decompose the grid across processors and exchange ghost zone information - each thorn is presented with a standard interface, independent of the driver • Standard driver distributed with Cactus (PUGH) is for a parallel unigrid and uses MPI for the communication layer • PUGH can do custom processor decomposition and static load balancing • AMR driver also provided
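The grid decomposition and ghost zones described in slide 10 can be illustrated with a minimal one-dimensional sketch. This is not PUGH code: the function names `decompose` and `local_with_ghosts` are hypothetical, and the real driver exchanges ghost zones over MPI rather than slicing a shared list.

```python
from typing import List


def decompose(n_points: int, n_procs: int) -> List[range]:
    """Split n_points grid indices into n_procs contiguous, near-equal blocks."""
    base, extra = divmod(n_points, n_procs)
    blocks, start = [], 0
    for rank in range(n_procs):
        # The first `extra` ranks take one additional point each.
        size = base + (1 if rank < extra else 0)
        blocks.append(range(start, start + size))
        start += size
    return blocks


def local_with_ghosts(global_grid: List[float], block: range, ghost: int = 1) -> List[float]:
    """Return a processor's section plus ghost zones copied from its neighbours.

    Hypothetical stand-in for the driver's ghost-zone exchange: here the
    neighbour data is simply read from the global grid.
    """
    lo = max(block.start - ghost, 0)
    hi = min(block.stop + ghost, len(global_grid))
    return global_grid[lo:hi]


grid = [float(i) for i in range(10)]
blocks = decompose(len(grid), 3)
sections = [local_with_ghosts(grid, b) for b in blocks]
```

Each entry of `sections` is what a thorn on that processor would see: its own points plus one ghost point on each interior boundary.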

  11. Cactus and Grid Computing: General Observations • Reasons to work with Cactus • Rich structure, computationally intensive, numerous opportunities for Grid computing • Talented and motivated developer/user community • Issues • At core, relatively simple structure • Cactus system is relatively complex • User community is relatively small

  12. Cactus-G: Possible Opportunities • “Metacomputing”: use heterogeneous systems as source of low-cost cycles • Departmental pool or multi-site system • Dynamic resource selection, e.g. • “Cheapest” resources to achieve interactivity • “Fastest” resource for best turnaround • “Best” resolution to meet turnaround goal • Spawn independent tasks: e.g., analysis • Migration to “better” resource for all above
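The “cheapest” and “fastest” selection policies in slide 12 can be sketched as two small functions over a pool of candidate resources. Everything here is an assumption made for illustration: the `Resource` fields, the cost unit, and the runtime estimates are hypothetical and do not come from GrADS or Cactus.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Resource:
    name: str
    cost_per_hour: float   # hypothetical accounting unit
    est_runtime_s: float   # hypothetical predicted time to finish the run


def run_cost(r: Resource) -> float:
    """Total cost of completing the run on resource r."""
    return r.cost_per_hour * r.est_runtime_s / 3600.0


def cheapest_meeting_deadline(pool: List[Resource], deadline_s: float) -> Optional[Resource]:
    """'Cheapest' policy: lowest total cost among resources that meet the deadline."""
    eligible = [r for r in pool if r.est_runtime_s <= deadline_s]
    return min(eligible, key=run_cost, default=None)


def fastest(pool: List[Resource]) -> Resource:
    """'Fastest' policy: best turnaround regardless of cost."""
    return min(pool, key=lambda r: r.est_runtime_s)


pool = [
    Resource("dept_pool", cost_per_hour=1.0, est_runtime_s=7200.0),
    Resource("site_a", cost_per_hour=5.0, est_runtime_s=1800.0),
    Resource("site_b", cost_per_hour=12.0, est_runtime_s=900.0),
]
choice_cheap = cheapest_meeting_deadline(pool, deadline_s=3600.0)
choice_fast = fastest(pool)
```

With a one-hour deadline the departmental pool is excluded as too slow, so the cheapest policy and the fastest policy pick different sites.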

  13. Cactus-G: Common Building Blocks • Resource selection based on resource and application characterizations • Implementation and management of distributed output • (De)centralized logging, accounting for resource usage, parameter selection, etc. • Fault discovery, recovery, tolerance • Code/executable management and creation • Next-generation Cactus that increases flexibility with respect to parameter selection

  14. Proposed Cactus-G Challenge Problem: Cactus-G Worm • Migrate to “faster/cheaper/bigger” system • When system identified by resource discovery • When resource requirements change • Why? • Tests much of the machinery required for Cactus-G (source code mgmt, discovery, …) • Places substantial demands on GrADS • Good potential to show real benefit • Migration approach simplifies infrastructure demands (MPI-2 support not required)

  15. Cactus-G Worm: Basic Architecture and Operation (diagram) • Components: Application Manager; compute resources running the application and other thorns, the Cactus “flesh”, and the “Tequila” thorn; storage resources; code repositories; GrADS Resource Selector; Grid Information Service • Steps: (0) Possible user input, (1) Adaptation request, (1’) Resource notification, (2) Resource request, (3) Write checkpoint, (4) Migration request, (5) Cactus startup, (6) Load code, (7) Read checkpoint • Other labels: Store models, etc.; Query

  16. Tequila Thorn Functions • Initiates adaptation on application request or on notification of new resources • Can include user input (e.g., HTTP thorn) • Requests resources from external entity • GIS or ResourceSelector • Checkpoints application • Contacts Application Manager to request restart on new resources • AppManager has security, robustness advantages vs. direct restart
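The sequence of Tequila thorn functions in slide 16 (request resources, checkpoint, ask the Application Manager to restart) can be sketched as a single adaptation routine. This is a toy model, not the thorn's code: `StaticSelector`, `RecordingAppManager`, `adapt`, and the JSON checkpoint format are all hypothetical stand-ins.

```python
import json
import os
import tempfile


def write_checkpoint(state: dict, directory: str) -> str:
    """Persist application state; JSON is a stand-in for a real checkpoint format."""
    path = os.path.join(directory, "checkpoint.json")
    with open(path, "w") as f:
        json.dump(state, f)
    return path


class StaticSelector:
    """Hypothetical resource selector that always returns the same bag."""

    def __init__(self, bag):
        self.bag = bag

    def request_resources(self, requirements):
        return list(self.bag)


class RecordingAppManager:
    """Hypothetical Application Manager that only records restart requests."""

    def __init__(self):
        self.requests = []

    def request_restart(self, resources, checkpoint_path):
        self.requests.append((resources, checkpoint_path))


def adapt(state, requirements, selector, app_manager, directory):
    """One adaptation: request resources, checkpoint, then ask for a restart."""
    resources = selector.request_resources(requirements)
    if not resources:
        return None  # nothing offered: keep running where we are
    path = write_checkpoint(state, directory)
    # The restart goes through the manager rather than being launched directly.
    app_manager.request_restart(resources, path)
    return path


with tempfile.TemporaryDirectory() as tmp:
    manager = RecordingAppManager()
    selector = StaticSelector(["hostA", "hostB"])
    path = adapt({"iteration": 120}, {"min_processors": 4}, selector, manager, tmp)
    with open(path) as f:
        restored = json.load(f)
```

Routing the restart through a manager object mirrors the slide's point that the thorn does not restart itself directly.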

  17. Cactus-G Worm: Approach • Uniproc Tequila thorn that speaks to GIS, adapts periodically [done: Cactus group] • Tequila thorn that speaks to UCSD Resource Selector [current focus] • Integrate accurate performance models • Support multiprocessor execution • Detailed evaluation • Add adaptation triggers: e.g., contract violation, new regime, user input

  18. Tequila Thorn + ResourceSelector • ResourceSelector must be set up as service • Tequila thorn sends request for new bag of resources • ResourceSelector responds with the new bag

  19. Current Status • Tequila thorn prototype developed that speaks to ResourceSelector • Dummy ResourceSelector that returns a static bag of resources • Demonstrated Cactus+Tequila operating • Performance model developed • Expected by May: multiprocessor support, ResourceSelector interface, real performance model

  20. Open Issues • Should we move more management logic into Application Manager? • How does Contract Monitor fit into architecture? • How does PPS fit into architecture? • How do COP and Application Launcher fit into architecture (Cactus has its own launcher and compiles its own code)? • How does Pablo fit into architecture (which thorns are monitored, is flesh monitored)?

  21. The End

  22. Request and Response • The Request to the ResourceSelector will be stored in the InformationService • Only the pointer to the data in the IS will be passed to the ResourceSelector • The Response from the ResourceSelector will also be stored in the IS • Only the pointer to the data in the IS will be passed back.

  23. Tequila communication overview (diagram) • Components: Cactus Tequila Thorn, Resource Selector, Information Service

  24. Cactus Architecture in GrADS (diagram) • Thorns: Computational Toolkit, other toolkits, and a GrADS communication library • Flesh • Build tools: Make, Configure, CST • Operating systems: Irix, SuperUX, Linux, Unicos, HP-UX, Solaris, OSF, NT, AIX

  25. Communication Details, step 1 • Event sent to Tequila thorn requesting restart

  26. Communication Details, step 2 • Tequila stores AART in IS

  27. Communication Details, step 3 • Tequila sends request to ResourceSelector, passing pointer to data in IS

  28. Communication Details, step 4 • ResourceSelector retrieves AART from IS

  29. Communication Details, step 5 • ResourceSelector stores bag of resources (in AART) in IS

  30. Communication Details, step 6 • ResourceSelector responds to Tequila, passing pointer to data in IS

  31. Communication Details, step 7 • Tequila retrieves AART with new bag of resources from IS
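The seven communication steps can be sketched as a small in-memory model in which only pointers travel between the Tequila thorn and the ResourceSelector, while the data itself lives in the Information Service. The classes, the `is://` key scheme, and the dictionary used for the AART are assumptions for illustration; the slides do not specify any of these interfaces.

```python
import itertools


class InformationService:
    """Hypothetical in-memory IS: stores data and hands back a pointer (key)."""

    def __init__(self):
        self._store = {}
        self._ids = itertools.count(1)

    def put(self, data) -> str:
        key = f"is://{next(self._ids)}"  # made-up pointer format
        self._store[key] = data
        return key

    def get(self, key: str):
        return self._store[key]


class ResourceSelector:
    """Hypothetical selector returning a fixed bag, like the dummy in slide 19."""

    def __init__(self, info_service: InformationService, bag):
        self.info_service = info_service
        self.bag = bag

    def select(self, request_ptr: str) -> str:
        aart = dict(self.info_service.get(request_ptr))  # step 4: retrieve AART
        aart["resources"] = list(self.bag)               # fill in the bag
        return self.info_service.put(aart)               # step 5: store; step 6: return pointer


def tequila_request(info_service, selector, aart: dict) -> dict:
    """Steps 2 to 7, as seen from the Tequila thorn after a restart event (step 1)."""
    request_ptr = info_service.put(aart)         # step 2: store AART in IS
    response_ptr = selector.select(request_ptr)  # step 3: send only the pointer
    return info_service.get(response_ptr)        # step 7: retrieve updated AART


info = InformationService()
selector = ResourceSelector(info, bag=["hostA", "hostB"])
result = tequila_request(info, selector, {"app": "WaveToy"})
```

Because the request and the response are stored as separate entries, both remain available in the IS after the exchange, which matches the persistence motivation given in slide 32.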

  32. Requirements • Using the IS for communication adds overhead. • Why do this? • GrADS requirement 1: do some things (e.g. compile) at one time and have the results stored in a persistent storage area. Pick these stored results up later and complete other phases.

  33. Sample Tequila Scenario • User asks to run an ADM simulation 400x400x400 for 1000 timesteps in 10s. • Resource selector is contacted to obtain virtual machines. • Best virtual machine is selected based on the performance model. • AM starts Cactus on that virtual machine (and monitors execution Contracts?) • User (or application manager) decides that the computation is advancing too slowly and decides to search for a better virtual machine • AM finds a better machine, commands the Cactus run to checkpoint, transfers files, and restarts Cactus
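The selection and migration decisions in this scenario can be sketched with a deliberately naive performance model. The model is an assumption, not the one developed for the project: it counts grid-point updates and divides by an aggregate update rate, ignoring communication cost. The machine names, rates, and migration costs are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class VirtualMachine:
    name: str
    processors: int
    updates_per_sec_per_proc: float  # hypothetical grid-point updates per second


def predicted_runtime_s(vm: VirtualMachine, dims: Tuple[int, int, int], timesteps: int) -> float:
    """Naive model: total grid-point updates divided by aggregate update rate."""
    nx, ny, nz = dims
    total_updates = nx * ny * nz * timesteps
    return total_updates / (vm.processors * vm.updates_per_sec_per_proc)


def best_vm(candidates: List[VirtualMachine], dims, timesteps) -> VirtualMachine:
    """Pick the candidate with the smallest predicted runtime."""
    return min(candidates, key=lambda vm: predicted_runtime_s(vm, dims, timesteps))


def should_migrate(current, candidate, dims, remaining_steps, migration_cost_s) -> bool:
    """Migrate only if the remaining work plus the checkpoint/transfer/restart cost is faster."""
    stay = predicted_runtime_s(current, dims, remaining_steps)
    move = predicted_runtime_s(candidate, dims, remaining_steps) + migration_cost_s
    return move < stay


dims = (400, 400, 400)
vm_a = VirtualMachine("vm_a", processors=16, updates_per_sec_per_proc=1e6)
vm_b = VirtualMachine("vm_b", processors=64, updates_per_sec_per_proc=2e6)
chosen = best_vm([vm_a, vm_b], dims, timesteps=1000)
migrate = should_migrate(vm_a, vm_b, dims, remaining_steps=500, migration_cost_s=300.0)
```

Including the migration cost in the comparison matters: a nominally faster machine is not worth moving to if checkpointing, transferring files, and restarting take longer than the time saved.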
