GridLab WP-2: Cactus GAT (CGAT)
Ed Seidel, AEI & LSU; Co-chair, GGF Apps RG, Gridstart Apps TWG
Gabrielle Allen, Robert Engel, Tom Goodale, *Thomas Radke
Others from WP-1
Vision
• Goal: enable all Cactus apps to use GAT for Grid scenarios.
• Very important WP, prototyped in the first year, now being fully developed as a GAT implementation in the first full release
• Cactus (http://www.cactuscode.org)
  • Leading application framework used by dozens of groups in astrophysics (EU Network), climate, CFD, numerical relativity, bio-informatics
  • Used for many years in (pre-)Grid environments, and commonly deployed for demonstrations of the use of the Grid
  • Previously used hand-crafted Cactus modules and ad hoc mechanisms, relying on the authors' extensive experience of Grid computing
• The CGAT work aims to replace these with functionality available through the GAT API
• Exemplar: modules for other Cactus apps, worked example for others
WP2
Dynamic Grid Computing
• Migration: “Cactus Worm” demonstrated at SC00
  • Launch job
  • Run awhile, write checkpoint
  • Migrate itself to next site
  • Register new location to
  • User tracks/steers
• Proof of concept, but a dirty hack
• Created our community!
• Spawning: SC01
  • User invokes “Spawner”
  • Analysis tasks outsourced
  • Globus-enabled login, data transfer
• It worked!
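The migration loop listed above can be sketched as a small simulation. This is only an illustration of the launch / run / checkpoint / migrate / register cycle: the `Checkpoint` and `Registry` classes, the site names, and `run_worm` are hypothetical and are not part of Cactus, GAT, or the original Worm demonstration.

```python
from dataclasses import dataclass, field


@dataclass
class Checkpoint:
    """Minimal stand-in for a checkpoint file: only the iteration reached."""
    iteration: int


@dataclass
class Registry:
    """Hypothetical location registry that a user could poll to track the job."""
    locations: list = field(default_factory=list)

    def register(self, site, iteration):
        self.locations.append((site, iteration))


def run_worm(sites, steps_per_site, registry):
    """Visit each site in turn, restarting from the previous checkpoint."""
    checkpoint = Checkpoint(iteration=0)
    for site in sites:
        # "Launch job" on this site and register the new location.
        registry.register(site, checkpoint.iteration)
        iteration = checkpoint.iteration
        # "Run awhile": advance the toy simulation a fixed number of steps.
        for _ in range(steps_per_site):
            iteration += 1
        # "Write checkpoint": the next site restarts from this state.
        checkpoint = Checkpoint(iteration=iteration)
    return checkpoint


registry = Registry()
final = run_worm(["site-a", "site-b", "site-c"], steps_per_site=10, registry=registry)
print(final.iteration)
print(registry.locations)
```

Each hop carries only the checkpoint forward, which is what lets the job continue on a different machine without the user restarting it by hand.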
Conceptual View
• CGAT consists of a set of “thorns”, linked to the GAT Engine, which provide services to Cactus applications.
• The vast majority of Cactus thorns will be unaware of the CGAT or GAT.
[Figure: block diagram with the labels Cactus Flesh, CGAT thorns, GAT, Application Thorns, and GridLab Services]
CGAT Functionality
• Ability to remotely trigger an app checkpoint, retrieve the checkpoint file, and stage it to a new host
• Provide performance and other data to external applications using the GAT monitoring infrastructure
• Export the list of application-created files via GAT advert and/or replica functionality, through a generic advertising service
• Query information about the current machine, such as cache size, memory size, size of file systems, name of machine
• Spawning of tasks, e.g. for task farming, and monitoring their status
• Automated/triggered announcement of app events, such as app startup, reaching a particular iteration, termination, etc.
• Etc.: working with app communities to determine need: GGF, Gridstart TWG, other projects
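As a rough illustration of the triggered event announcements in the list above, the sketch below announces startup, every N-th iteration, and termination to any registered listener. The `EventAnnouncer` class and its methods are hypothetical names for this example only; a real CGAT thorn would publish such events through GAT services rather than through in-process callbacks.

```python
class EventAnnouncer:
    """Toy announcer: notifies subscribers about application events."""

    def __init__(self, announce_every):
        self.announce_every = announce_every  # announce every N-th iteration
        self.subscribers = []

    def subscribe(self, callback):
        """Register a callable taking (event_name, iteration)."""
        self.subscribers.append(callback)

    def _announce(self, event, iteration):
        for callback in self.subscribers:
            callback(event, iteration)

    def run(self, total_iterations):
        """Drive a dummy evolution loop and announce the events of interest."""
        self._announce("startup", 0)
        for iteration in range(1, total_iterations + 1):
            if iteration % self.announce_every == 0:
                self._announce("iteration", iteration)
        self._announce("termination", total_iterations)


received = []
announcer = EventAnnouncer(announce_every=5)
announcer.subscribe(lambda event, iteration: received.append((event, iteration)))
announcer.run(total_iterations=12)
print(received)
```

Keeping the announcement logic in one place means the simulation code itself does not need to know who is listening.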
Status
• SC03: thorns were written with the prototype GAT Engine to enable the GridLab migration scenario:
  • Remote monitoring of the status of the running Cactus application
  • Triggering of Cactus checkpointing
  • Advertisement of Cactus checkpointing data
• Now: thorns converted to use the new GAT implementation and the specified GAT API
• Will be demonstrated at this review and at GGF this week
• Any application in Cactus can take advantage of this without any other modification
  • E.g., black holes on regular meshes, CFD on unstructured meshes (planned), ocean-atmosphere modeling
Some Specifics
• Gridmake: https://sourceforge.net/projects/gridmake/
  • Distribute/compile source code on an arbitrary number of machines
  • Needed for GridLab migration, Cactus remote testing, creation of executables for MPI simulations across multiple machines
  • Good for codes with configurable make environments, machine configuration scripts, use of CVS, etc.
  • Developed using public key infrastructure, soon to be available as a grid service
  • To be incorporated as a GridSphere portlet
• Thorn_cgat
  • Initializes GAT
  • Registers with grms that this is a checkpointable app
  • Receives requests from grms (or any broker)
  • Steers a Cactus parameter to initiate a checkpoint
  • Reports on success of the checkpoint
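The Thorn_cgat steps above (register, receive a request, steer a parameter, report) can be sketched as follows. Everything here is a toy stand-in under stated assumptions: `Broker`, `ToyCactusApp`, `CgatThorn`, and the `checkpoint_next` flag are hypothetical names for this example and do not reproduce the actual thorn, the grms interface, or the GAT API.

```python
class Broker:
    """Hypothetical stand-in for grms or any other broker."""

    def __init__(self):
        self.checkpointable = set()
        self.reports = []

    def register_checkpointable(self, app_name):
        self.checkpointable.add(app_name)

    def report(self, app_name, success):
        self.reports.append((app_name, success))


class ToyCactusApp:
    """Toy application with one steerable parameter (an assumed name)."""

    def __init__(self, name):
        self.name = name
        self.parameters = {"checkpoint_next": False}
        self.iteration = 0
        self.checkpoints = []

    def step(self):
        """Advance one iteration; checkpoint if the parameter was steered."""
        self.iteration += 1
        if self.parameters["checkpoint_next"]:
            self.checkpoints.append(self.iteration)
            self.parameters["checkpoint_next"] = False
            return True
        return False


class CgatThorn:
    """Sketch of the register / receive / steer / report sequence."""

    def __init__(self, app, broker):
        self.app = app
        self.broker = broker
        # Tell the broker that this application can be checkpointed.
        broker.register_checkpointable(app.name)

    def handle_request(self, request):
        if request != "checkpoint":
            self.broker.report(self.app.name, False)
            return False
        # Steer the parameter; the app acts on it at its next iteration.
        self.app.parameters["checkpoint_next"] = True
        success = self.app.step()
        self.broker.report(self.app.name, success)
        return success


broker = Broker()
app = ToyCactusApp("toy-app")
thorn = CgatThorn(app, broker)
for _ in range(3):
    app.step()
thorn.handle_request("checkpoint")
print(app.checkpoints, broker.reports)
```

Steering a parameter rather than calling the checkpoint routine directly keeps the application code unchanged, which matches the claim that most thorns stay unaware of CGAT.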
Near Future
• Review current thorns, make production versions, distribute
• Add remaining functionality
• Work with AEI/LSU numerical relativity group
  • Ensure correct functionality
  • Train on use of CGAT infrastructure
  • Develop task farming infrastructure for physics surveys
• Deploy across GridLab testbed and LSU-AEI-KISTI Grid
• Work with other Cactus app groups in astrophysics, climate, CFD, bioinformatics, others; NSF and DOE projects in the US
• New, experienced personnel just added now that GAT is ready