Checkpoint & Restart for Distributed Components in XCAT3 • Sriram Krishnan* (Indiana University, San Diego Supercomputer Center) & Dennis Gannon (Indiana University) • *srikrish@cs.indiana.edu
Long-running Distributed Applications on the Grid • The Problem: 1. Launch simulation at Y 2. Launch simulation at Z 3. Link both simulations 4. Execute both simulations 5. Store results at X • [Figure: sites X, Y, and Z on the Grid] • Need an effective way to orchestrate such computations
Checkpoint & Restart • Motivation • Basic fault tolerance via periodic checkpointing • Rollback to saved checkpoint upon failure • Dynamic rescheduling of jobs • Checkpoint and restart on another location • Checkpointing Goals • Correctness • Portability • Minimal checkpoint size • Scalability • Interoperability • Checkpoint Availability
Outline • Motivation • Background • The XCAT3 framework • Checkpoint & Restart • Checkpointing & Restart in XCAT3 • Software Techniques • Algorithms • Experiments • Conclusions & Future work
Application Orchestration: Component Architectures • A Component Architecture consists of two parts: • Components • Software objects that implement a set of required behaviors • Frameworks • A runtime environment • A set of services used by components • Benefits • Encapsulation, modular construction of programs (via composition), reuse • Component Architectures adopted in various domains • Business: EJB, CCM, COM/DCOM • Scientific Computing: CCA
Common Component Architecture • A ComponentID for identification & management purposes • Ports: the public interfaces of a component • Define the different ways we can interact with a component and the ways the component uses other services and components • Provides Ports: interfaces of functions provided by the component • Uses Ports: interfaces of services used by the component • [Figure: Image Processing Component exposing setImage(Image I), Image getImage(), adjustColor(), and setFilter(Filter); adjustColor() calls doFFT(…)]
XCAT3: CCA Framework for the Grid • Grid Service Extensions (GSX) Toolkit used for OGSI-compatible Grid services • Standard protocols used by Grid services: SOAP, HTTP • http://www.extreme.indiana.edu/xgws/GSX • A Component is represented as a set of Grid services • Provides ports and ComponentIDs are Grid services • Uses ports are Grid service clients • Sriram Krishnan and Dennis Gannon. XCAT3: A Framework for CCA Components as OGSA Services. In HIPS 2004, 9th International Workshop on High-Level Parallel Programming Models and Supportive Environments. April 2004.
Checkpointing: Software Techniques • System-level Techniques • Automatic transparent checkpointing for an application at the operating system or middleware level • User-defined Techniques • Non-transparent checkpointing for an application that relies on the programmer to identify the minimal information needed for restart
Checkpointing: Software Techniques • System-level • Transparent to the user: no expertise required • Not very portable across platforms • Larger checkpoint sizes: typically complete process images stored • Less flexible: application is treated as a black box • User-defined • Not transparent to the user: considerable expertise required • More portable across platforms • Smaller checkpoint sizes: only minimal state stored • More flexible: application information can be used
Checkpointing: Examples • System-level Techniques • Condor • LAM-MPI • Enterprise Java Beans • CORBA Components • User-defined Techniques • CUMULVS • Enterprise Java Beans • CORBA Components • Global Grid Forum: Grid Checkpoint/Recovery Group • User-defined checkpointing APIs for Grid services • Do not address consistent global checkpoints for distributed applications • Consistent global checkpoint: a set of individual checkpoints that constitute a state that occurs in a failure-free, correct execution
Checkpointing Technique in XCAT3 • User-defined & System-assisted • User is responsible for identifying local component state • Framework is responsible for: • Generating complete state of the component, viz. local component state, connection state, and environment state • Algorithms for generating global component states, and storing them into stable storage • Component writer implements the following methods: • generateComponentState() • loadComponentState() • resumeExecution()
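A minimal sketch of the component writer's side of this contract, written in Python for brevity (XCAT3 itself is a Java framework). Only the three method names come from the slide; the WorkerComponent class, its fields, the compute() helper, and the use of JSON for the saved state are assumptions made for illustration.

```python
import json


class WorkerComponent:
    """Hypothetical component; only the three checkpoint method names come from the slide."""

    def __init__(self, total_steps):
        self.total_steps = total_steps
        self.next_step = 0
        self.partial_sum = 0

    def compute(self, max_steps=None):
        """Sum 0..total_steps-1; stop early after max_steps to simulate a failure."""
        done = 0
        while self.next_step < self.total_steps:
            if max_steps is not None and done >= max_steps:
                return None  # interrupted before completion
            self.partial_sum += self.next_step
            self.next_step += 1
            done += 1
        return self.partial_sum

    def generateComponentState(self):
        """Return only the minimal local state needed for a restart."""
        return json.dumps({
            "total_steps": self.total_steps,
            "next_step": self.next_step,
            "partial_sum": self.partial_sum,
        })

    def loadComponentState(self, state):
        """Restore the local state produced by generateComponentState()."""
        data = json.loads(state)
        self.total_steps = data["total_steps"]
        self.next_step = data["next_step"]
        self.partial_sum = data["partial_sum"]

    def resumeExecution(self):
        """Continue the main loop from the restored position."""
        return self.compute()


# Usage: run partway, checkpoint, then restart in a fresh instance.
worker = WorkerComponent(10)
worker.compute(max_steps=4)
saved = worker.generateComponentState()

restarted = WorkerComponent(0)
restarted.loadComponentState(saved)
print(restarted.resumeExecution())
```

In this sketch the saved state is a few integers rather than a process image, which is the point of the user-defined approach: the component writer decides what is needed to resume.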
Distributed Checkpointing • Algorithm Overview: Coordinated blocking checkpoint algorithm • Block all port communication between components • Take individual checkpoints, and commit them atomically • Resume port communication between components • Novelty: Application to RPC-based component framework • Typically, such algorithms are applied to messaging frameworks
The Big Picture • [Figure: distributed components X, Y, and Z on the Grid, an Application Coordinator, and Persistent Storage formed by a federation of Master (MS) & Individual Storage (IS) services; the figure shows one MS and four IS]
Checkpoint Algorithm • [Figure sequence: Application Coordinator, components X, Y, and Z, MS and IS Storage services, Persistent Storage] • 1. Checkpoint components • 2. Block all port communication between components • 3. All communication between components blocked • 4. Find best available Storage service URLs • 5. Store checkpoints into Storage services • 6. Return storageIDs for stored state • 7. Atomically update locators for individual checkpoints • 8. Un-block communication between components
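The checkpoint sequence can be sketched as a single-process Python toy. Everything here is an assumption for illustration: the Component, StorageService, and Coordinator classes, the round-robin choice standing in for "find best available Storage service", and the fact that a blocked port call is refused rather than made to wait. It is not the XCAT3 implementation, which is Java and runs over Grid services.

```python
class StorageService:
    """In-memory stand-in for an Individual Storage service (hypothetical)."""

    def __init__(self, url):
        self.url = url
        self._blobs = {}

    def store(self, state):
        storage_id = f"{self.url}#{len(self._blobs)}"
        self._blobs[storage_id] = state
        return storage_id

    def load(self, storage_id):
        return self._blobs[storage_id]


class Component:
    """Toy component whose local state is a single number."""

    def __init__(self, name, state):
        self.name = name
        self.state = state
        self.blocked = False

    def block_ports(self):
        self.blocked = True

    def unblock_ports(self):
        self.blocked = False

    def send(self, other, delta):
        # A real framework would make the caller wait; this toy refuses the call.
        if self.blocked or other.blocked:
            raise RuntimeError("port communication is blocked")
        other.state += delta

    def generateComponentState(self):
        return self.state


class Coordinator:
    def __init__(self, components, storage_services):
        self.components = components
        self.storage_services = storage_services
        self.locators = {}  # component name -> storageID of committed checkpoint

    def checkpoint(self):
        # Block all port communication between components.
        for comp in self.components:
            comp.block_ports()
        try:
            staged = {}
            for i, comp in enumerate(self.components):
                # Round-robin is a placeholder for choosing the best service.
                service = self.storage_services[i % len(self.storage_services)]
                staged[comp.name] = service.store(comp.generateComponentState())
            # Publish all new locators in one step, only after every store succeeded.
            self.locators = staged
        finally:
            # Un-block communication whether or not the checkpoint succeeded.
            for comp in self.components:
                comp.unblock_ports()
        return dict(self.locators)


# Usage
x, y = Component("X", 1), Component("Y", 2)
storage = StorageService("is-1")
coordinator = Coordinator([x, y], [storage])
print(coordinator.checkpoint())
```

Because no port call can complete while the components are blocked, the individual states captured in the loop belong to the same point in the computation, which is what makes the global checkpoint consistent.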
Checkpointing: Correctness • Consistency of Global Checkpoint • A flavor of coordinated blocking algorithms – well accepted to be correct • Atomicity of Checkpoints • Locators for the global checkpoint are updated atomically after all components have been checkpointed • Not possible to have a scenario where a global checkpoint consists of a combination of old and new individual checkpoints
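The atomicity argument can be illustrated with a small Python sketch. The LocatorTable class, the take_global_checkpoint() function, and the make_store() fake are hypothetical names introduced here; the slides do not describe how the Master Storage service records locators. The idea shown is only that new storageIDs are staged privately and published in one step after every component has been stored.

```python
class LocatorTable:
    """Hypothetical record of the committed global checkpoint: name -> storageID."""

    def __init__(self):
        self._committed = {}

    def commit(self, new_locators, expected_components):
        missing = set(expected_components) - set(new_locators)
        if missing:
            raise ValueError(f"incomplete checkpoint, missing: {sorted(missing)}")
        # One reference swap: readers see either the old set or the new set.
        self._committed = dict(new_locators)

    def current(self):
        return dict(self._committed)


def take_global_checkpoint(table, components, store):
    """Stage every individual checkpoint, then commit all locators together."""
    staged = {}
    for name, state in components.items():
        staged[name] = store(name, state)  # may raise; nothing is committed yet
    table.commit(staged, components.keys())


def make_store(version, fail_on=None):
    """Fake store(name, state) callable; raises for `fail_on` to simulate a failure."""
    def store(name, state):
        if name == fail_on:
            raise IOError(f"storage failed for {name}")
        return f"{name}-v{version}"
    return store


# Usage: a failed second checkpoint leaves the first one fully intact.
table = LocatorTable()
components = {"X": 1, "Y": 2, "Z": 3}
take_global_checkpoint(table, components, make_store(1))
try:
    take_global_checkpoint(table, components, make_store(2, fail_on="Z"))
except IOError:
    pass
print(table.current())
```

In the failed run, X and Y were already stored as version 2, but their new storageIDs never reach the table, so a restart would still see a global checkpoint made only of version 1 entries.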
Restart Algorithm • Also implemented by the Application Coordinator • Details • Destroy executing instances, if need be • Restart all components (possibly on other resources) • Load state of components from the Storage services • Resume execution of all control threads, after the state of every component has been loaded from the Storage services
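The restart steps can be sketched in Python as follows. The Component class, the restart() function, and its load_state parameter are hypothetical; only loadComponentState() and resumeExecution() are names from the slides. The sketch keeps the ordering constraint: no component resumes until every component has loaded its state.

```python
class Component:
    """Toy component instance (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.state = None
        self.running = False
        self.destroyed = False

    def destroy(self):
        self.running = False
        self.destroyed = True

    def loadComponentState(self, state):
        self.state = state

    def resumeExecution(self):
        if self.state is None:
            raise RuntimeError(f"{self.name}: state not loaded")
        self.running = True


def restart(old_instances, locators, load_state, create=Component):
    """Restart all components from the committed global checkpoint.

    locators maps component name -> storageID; load_state(storage_id)
    returns the stored state (a stand-in for a Storage service call).
    """
    # 1. Destroy executing instances, if any are left.
    for instance in old_instances:
        instance.destroy()
    # 2. Restart all components (here simply new objects; possibly elsewhere).
    fresh = {name: create(name) for name in locators}
    # 3. Load the state of every component from storage.
    for name, storage_id in locators.items():
        fresh[name].loadComponentState(load_state(storage_id))
    # 4. Resume only after all states are loaded.
    for component in fresh.values():
        component.resumeExecution()
    return fresh


# Usage with a dictionary standing in for the Storage services.
storage = {"id-x": 10, "id-y": 20}
old = [Component("X"), Component("Y")]
new = restart(old, {"X": "id-x", "Y": "id-y"}, storage.__getitem__)
print({name: comp.state for name, comp in new.items()})
```

If loading fails for any component, restart() raises before step 4, so no component starts running against peers that have not yet been restored.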
Test Application: Chem-Eng Simulation • Based on the simulation of copper electro-deposition on resistive substrate (NCSA-UIUC) • Master-Worker model of execution • Variable number of workers, and data size per worker • generateComponentState(), loadComponentState(), and resumeExecution() methods added to support checkpointing and restart • Required identification of the various execution states of the master and worker components
Experiment Setup • Hardware setup • 8-node Linux cluster • Dual 2.8 GHz Intel Xeon processors • Red Hat Linux 8.0 • 2 GB memory • 1 Gbps Ethernet • Sun's JDK 1.4.2_04 • Federation of 1 Master & 8 Individual Storage services used • Single GSX-based Handle Resolver
Future Work • Framework • Integration with the Web Service Resource Framework (WSRF) • Fault Tolerance • Fault Monitoring • Reliable communication between components • Checkpoint Optimizations • Storage Service Optimizations • Applications • Use of XCAT3 for LEAD (http://lead.ou.edu)
Conclusions • A framework for checkpointing & restart of distributed applications on the Grid • CCA-based component framework consistent with Grid standards • User-defined, platform-independent checkpoints • APIs for checkpointing, and algorithms for capturing global checkpoints and for restart provided by the framework • http://www.extreme.indiana.edu/xcat/
OGSI Compatibility • Representation for Provides ports • In traditional Grid/Web services, multiple ports of the same portType are semantically equivalent • CCA allows multiple ports of the same type • CCA ports cannot be mapped to Web service ports! • Hence, every Provides port is mapped as a separate Grid service • A single portType containing the Provides port interface • Representation for Uses ports • Clients of Grid services (Provides ports) • Connections to Provides ports made at runtime
OGSI Compatibility • Representation for the ComponentID • Also a Grid service • Acts as a Manager for the other Provides ports • Contains SDEs containing GSH/GSRs for the various Provides ports • The Provides ports and ComponentID services, and the Uses ports communicate via shared state
Building Applications by Composition • Connect Uses Ports to Provides Ports • [Figure: Image database component, Image Processing Component, Acme FFT component, and Image tool graphical interface component, connected through setImage(…), getImage(), doFFT(…), and adjustColor()]
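Connecting a Uses port to a Provides port can be sketched in Python with plain objects. The class names echo the figure, but the connect() method, the "fft" port name, and the behavior of adjustColor() are assumptions, and doFFT() here is a naive discrete Fourier transform used as a stand-in rather than a real FFT. In XCAT3 the Provides ports would be Grid services and the connection would be made remotely at runtime.

```python
import cmath


class AcmeFFTComponent:
    """Provides a doFFT port; naive O(n^2) DFT as a stand-in for a real FFT."""

    def doFFT(self, samples):
        n = len(samples)
        return [
            sum(samples[k] * cmath.exp(-2j * cmath.pi * j * k / n) for k in range(n))
            for j in range(n)
        ]


class ImageProcessingComponent:
    """Provides setImage/getImage/adjustColor; uses an 'fft' port."""

    def __init__(self):
        self._uses = {"fft": None}  # Uses ports start out unconnected
        self._image = []

    def connect(self, uses_port_name, provider):
        """Bind one of this component's Uses ports to a provider (hypothetical API)."""
        if uses_port_name not in self._uses:
            raise KeyError(f"no such Uses port: {uses_port_name}")
        self._uses[uses_port_name] = provider

    def setImage(self, image):
        self._image = list(image)

    def getImage(self):
        return list(self._image)

    def adjustColor(self):
        # What adjustColor does with the transform is not given in the slides;
        # this toy simply returns the result of the call through the Uses port.
        fft = self._uses["fft"]
        if fft is None:
            raise RuntimeError("Uses port 'fft' is not connected")
        return fft.doFFT(self._image)


# Usage: compose the application by wiring the port, then call through it.
processor = ImageProcessingComponent()
processor.connect("fft", AcmeFFTComponent())
processor.setImage([1, 1, 1, 1])
spectrum = processor.adjustColor()
print([round(abs(value), 6) for value in spectrum])
```

The Image Processing Component never names a concrete FFT class, so a different provider of doFFT() could be connected without changing it.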