GGF10 - GridCPR-WG
PARIS project-team Activities in Checkpoint Recovery
Christine Morin (Christine.morin@irisa.fr)
PARIS INRIA project-team, IRISA – Rennes (France)
http://www.irisa.fr/paris
Berlin, March 11th, 2004

Cluster Federations
[Figure: cluster federation diagram; labels: SAN, LAN, WAN, SAN]
• A particular case of grid
  • Interconnection of several clusters of moderate size
• Homogeneity and heterogeneity
  • More and more homogeneous platforms: PC, Linux
  • Heterogeneous networks (SAN, LAN, WAN)
  • Clusters with different amounts and kinds of resources
• Considered applications
  • Scientific applications (numerical simulation)
  • Sequential and parallel applications based on either the shared-memory or the message-passing communication paradigm
  • Code coupling applications
  • Applications requiring a huge amount of resources (memory, computing power)
• Dynamicity
  • A cluster may join or leave the federation at any time
  • Individual nodes may fail in a cluster

Grid-aware OS for Cluster Federations
• A single system image OS on each cluster
  • A cluster appears as a single machine which offers a kind of standard interface
  • Mosix, Amoeba, Kerrighed
• A cluster federation is seen as a set of peers
  • Structured peer-to-peer (P2P) network (instead of a hierarchy)
  • Fully decentralized control
  • Native support for dynamicity
  • Designed for scalability
    • Routing table size in O(log N), where N is the number of peers
    • Probabilistic O(log N) bounds on the number of routing hops
  • "Standardization" of the APIs (IRIS project)
  • Promising work to take into account the network's topology and security issues (Pastry)
• Structured P2P systems usually provide distributed hash tables (DHT)
  • Building block for higher level services
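
The logarithmic bounds on routing table size and hop count can be illustrated with a small simulation. The sketch below assumes a Chord-style identifier ring, which is one well-known structured P2P design and not necessarily the one used by the PARIS project-team. The names `Ring`, `ident`, `fingers` and `lookup`, and the 16-bit identifier space, are hypothetical choices made for illustration only.

```python
import hashlib
from bisect import bisect_left

M = 16            # identifier length in bits (illustrative choice)
RING = 1 << M     # size of the identifier space


def ident(name):
    """Hash a name (node or key) onto the identifier ring."""
    digest = hashlib.sha1(name.encode()).digest()
    return int.from_bytes(digest[:4], "big") % RING


def in_open_interval(x, a, b):
    """True if x lies strictly between a and b, going clockwise on the ring."""
    if a < b:
        return a < x < b
    return x > a or x < b


class Ring:
    """Global view of the ring, used only to simulate per-node routing."""

    def __init__(self, names):
        self.ids = sorted({ident(n) for n in names})

    def successor(self, point):
        """First node at or after `point`, going clockwise."""
        i = bisect_left(self.ids, point % RING)
        return self.ids[i % len(self.ids)]

    def fingers(self, node):
        """Routing table of `node`: M entries, successor(node + 2^k)."""
        return [self.successor(node + (1 << k)) for k in range(M)]

    def lookup(self, start, key):
        """Route from node `start` to the owner of `key`; return (owner, hops)."""
        key %= RING
        owner = self.successor(key)  # known globally, only used to stop the loop
        current, hops = start, 0
        while current != owner:
            table = self.fingers(current)
            preceding = [f for f in table if in_open_interval(f, current, key)]
            if preceding:
                # Jump to the farthest finger that still precedes the key.
                current = max(preceding, key=lambda f: (f - current) % RING)
            else:
                # No finger precedes the key: the immediate successor owns it.
                current = table[0]
            hops += 1
        return owner, hops
```

For example, `Ring([f"node-{i}" for i in range(64)]).lookup(start, ident("some-key"))` returns the owning node and the number of forwarding steps. In this sketch each table has M entries and a lookup never needs more than M + 1 hops, because every jump at least halves the remaining distance to the node preceding the key.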

Current Work on Checkpoint Recovery
• Cluster Federation
  • Execution of multithreaded applications in cluster federations
  • A coherence protocol for cached copies of volatile objects in peer-to-peer systems (multiple failures tolerated)
  • Hierarchical checkpointing protocol for code coupling applications
• Cluster single system image (SSI) operating system: Kerrighed
  • Full POSIX thread interface
  • Global process and memory management
  • Configurable global scheduler
  • High availability
    • Dynamic resource management for tolerating cluster reconfigurations (node addition, eviction or failure)
    • Checkpoint recovery mechanisms

Goals for Checkpoint Recovery in Kerrighed
• Experimental platform for checkpointing strategies for parallel applications
  • Basic mechanisms common to different checkpointing protocols in message-passing (MP) and shared-memory (SM) systems
  • Being able to checkpoint any kind of parallel application
  • Transparent checkpointing
• Implementation in a single system of various checkpointing strategies
  • To allow the programmer to choose a suitable strategy for a particular application
  • To be able to compare several strategies with realistic (industrial) applications
  • Avoid code duplication in the system
    • Robustness
    • Fair comparison
• Common framework
  • Checkpoint and rollback servers
  • Checkpoint numbering
  • Dependency management
    • Unified model for message-passing and shared memory models
    • Direct Dependency Vector (DDV) management
  • Message logging
  • Incremental checkpointing
  • Checkpointing in background
• Communication system
  • Atomic multicast
• Stable storage
  • Different implementations
    • Disk
    • Memory
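
Incremental checkpointing and checkpoint numbering, two of the building blocks listed above, can be sketched in a few lines. This is a user-level illustration under stated assumptions, not Kerrighed's kernel implementation: the class `IncrementalCheckpointer`, its page-granularity dirty tracking, and the in-memory list standing in for stable storage are all hypothetical.

```python
class IncrementalCheckpointer:
    """Saves only the pages modified since the previous checkpoint."""

    def __init__(self, num_pages):
        self.pages = [b""] * num_pages
        # Before the first checkpoint every page counts as modified.
        self.dirty = set(range(num_pages))
        # Stand-in for stable storage: one {page: data} dict per checkpoint.
        self.increments = []

    def write(self, page, data):
        """Application write: update the page and mark it dirty."""
        self.pages[page] = data
        self.dirty.add(page)

    def checkpoint(self):
        """Store the dirty pages only; return the checkpoint number."""
        self.increments.append({p: self.pages[p] for p in self.dirty})
        self.dirty.clear()
        return len(self.increments) - 1

    def restore(self, number):
        """Roll back to checkpoint `number` by replaying increments 0..number."""
        pages = [b""] * len(self.pages)
        for increment in self.increments[: number + 1]:
            for page, data in increment.items():
                pages[page] = data
        self.pages = pages
        self.dirty.clear()
        # Checkpoints taken after the restored one are no longer valid.
        del self.increments[number + 1:]
```

The first checkpoint is a full one, and each later checkpoint costs storage proportional to the number of pages written in between. The trade-off is that recovery has to replay the whole chain of increments.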

Checkpoint Recovery in Kerrighed: Current Status and Work Directions
• Current Status
  • Kerrighed prototype based on Linux 2.4
    • Small kernel patch and a set of modules
  • Transparent checkpoint recovery for individual (computing) processes
  • Virtualization of a process in the cluster
    • Unique ghost mechanism for process migration, checkpointing and restoration
    • Easy specialization of the stable storage implementation
    • Ghost can be sent to or retrieved from network, memory or disk
• Work Directions
  • Complete the debugging of coordinated checkpointing (and recovery) for multithreaded and message-passing based applications
  • Checkpointable locks and barriers in a cluster
  • Disk I/O management
  • POSIX extension for a proper integration of transparent checkpointing/recovery in the operating system
[Figure: ghost process diagram; labels: Duplication, Migration, Checkpoint/restart, Ghost process, Memory, Disk, Network]
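
The idea of a single ghost mechanism with interchangeable destinations can be sketched as one export/import pair working against a small backend interface. Everything here is an assumption made for illustration: the names `MemoryBackend`, `DiskBackend`, `export_ghost` and `import_ghost` are hypothetical, a process is reduced to a JSON-serializable dictionary, and the real mechanism operates inside the kernel.

```python
import json
import os
import tempfile


class MemoryBackend:
    """Keeps ghosts in memory (for example on a remote node in a real system)."""

    def __init__(self):
        self._blobs = {}

    def put(self, name, blob):
        self._blobs[name] = blob

    def get(self, name):
        return self._blobs[name]


class DiskBackend:
    """Keeps ghosts as files in a directory."""

    def __init__(self, directory=None):
        self.directory = directory or tempfile.mkdtemp()

    def put(self, name, blob):
        with open(os.path.join(self.directory, name), "wb") as f:
            f.write(blob)
            f.flush()
            os.fsync(f.fileno())  # make sure the data reaches the disk

    def get(self, name):
        with open(os.path.join(self.directory, name), "rb") as f:
            return f.read()


def export_ghost(state, backend, name):
    """Serialize a (simplified) process state and hand it to any backend."""
    backend.put(name, json.dumps(state, sort_keys=True).encode())


def import_ghost(backend, name):
    """Rebuild the process state from any backend."""
    return json.loads(backend.get(name).decode())
```

Because the export and import functions only depend on `put` and `get`, the same code path serves checkpointing to memory or to disk. A network backend for migration would only need to implement the same two methods.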

Hierarchical Checkpoint Recovery for Cluster Federations
• Relaxed inter-cluster synchronism to reflect the architecture
  • Coordinated checkpointing in a cluster
  • Communication-induced checkpointing between clusters
    • Independent checkpoints in each cluster
    • Forced checkpoints when a communication generates a new dependency
    • Force a checkpoint only if the sender has saved a checkpoint since its last send
• Several cluster checkpoints are kept
  • Management of Direct Dependency Vectors (DDV) to detect dependencies
    • DDV included in inter-cluster messages
    • DDV associated with cluster checkpoints
  • Garbage collection of useless cluster checkpoints
• Evaluation by discrete-event simulation
  • Works well if
    • Few inter-cluster communications
    • Inter-cluster communications are "quasi-unidirectional"
[Figure: code coupling application diagram; labels: Simulation, Processing, Display, Simulation, Simulation]
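
The forced-checkpoint rule between clusters can be made concrete with a small model. This is a sketch under assumptions, not the protocol's actual implementation: each cluster is reduced to one object, the intra-cluster coordinated checkpoint is a single method call, and messages piggyback only the sender's own checkpoint sequence number (its own DDV entry) instead of the full vector. The names `Cluster`, `Message`, `sn` and `ddv` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Message:
    sender: int       # identifier of the sending cluster
    sender_sn: int    # sender's last checkpoint number when the message was sent
    payload: object


class Cluster:
    def __init__(self, cid, n_clusters):
        self.cid = cid
        self.sn = 0                  # number of the last cluster checkpoint
        self.ddv = [0] * n_clusters  # 0 means "no dependency on that cluster yet"
        self.checkpoints = []        # (checkpoint number, DDV at that time)
        self.delivered = []
        self.checkpoint()            # initial checkpoint, numbered 1

    def checkpoint(self):
        """Stands in for a coordinated checkpoint of the whole cluster."""
        self.sn += 1
        self.ddv[self.cid] = self.sn
        self.checkpoints.append((self.sn, list(self.ddv)))

    def send(self, payload):
        return Message(self.cid, self.sn, payload)

    def receive(self, msg):
        """Deliver `msg`; return True if a checkpoint was forced first."""
        # A new dependency appears only if the sender has checkpointed
        # since the last message received from it.
        forced = msg.sender_sn > self.ddv[msg.sender]
        if forced:
            self.checkpoint()                       # state before the dependency
            self.ddv[msg.sender] = msg.sender_sn    # then record the dependency
        self.delivered.append(msg.payload)
        return forced
```

With two clusters `a` and `b`, the first message from `a` forces a checkpoint in `b`, further messages do not until `a` takes a new checkpoint. This is why the scheme is cheap when inter-cluster messages are rare or flow mostly in one direction.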

Future Work
• Checkpoint recovery in the large (we plan to hire a PhD student)
  • Dealing with applications with huge data sets executed in cluster federations
  • Follow-up of our preliminary work on a hierarchical checkpointing protocol for code coupling applications in cluster federations
  • Based on the Kerrighed experimental platform
    • Not only basic coordinated checkpointing but also several variants of independent and communication-induced strategies
    • Standard interface and basic building blocks
• Implementation in Kerrighed of ideas studied in previous projects
  • ICARE fault tolerant software DSM
    • Combining replication inherent to the DSM with the replication needed for ensuring recovery data stability
    • Extension of the coherence protocol to manage recovery data in memory
  • HA-PSLS
    • Integration of a DSM and a parallel file system
    • Upgrading ICARE
    • Cohabitation of persistent and memory checkpoints
    • Swap management (to avoid memory size limitation and to evict recovery data from memory)
    • Mapped file management (in-place checkpoints)

Kerrighed is registered as a Community trademark.
http://www.kerrighed.org
kerrighed.users@irisa.fr

Software Distribution
• Kerrighed web site
  • http://www.kerrighed.org (open since mid-November 2002)
  • Open source under GPL licence
  • Current version: Kerrighed V0.81 based on Linux 2.4.24
• Kerrighed users mailing-list
  • Kerrighed.users@irisa.fr (created in April 2003)
• Kerrighed forum (created February 2004)
• Notes
  • Kerrighed is a registered trademark
  • Kerrighed deposit at APP for each public release
  • Kerrighed tutorial (in conjunction with ICS’04, Saint-Malo (France), June 27th, 2004)

RoadMap for Kerrighed Prototype
• March 2004
  • MPI (with migration)
• April 2004: Kerrighed V1.00 (SSI-OSCAR)
  • SGFD
• January 2005: Kerrighed V1.10
  • 64 bits (Opteron)
  • Checkpointing for parallel applications
• July 2005: Kerrighed V2.0
  • High availability

Current Support: EDF
• Kerrighed research prototype (2000-2003)
  • CRECO EDF/INRIA
  • CIFRE Ph.D. grant (Geoffroy Vallée)
  • Industrial Post-Doc (Renaud Lottiaux)
  • Experiments with the first industrial applications provided by EDF
    • HRM1D, CATHARE, Cyrano 3, Aster
• Kerrighed integration in OSCAR (2004-2005)
  • INRIA Industrial Post-Doc (G. Vallée) with EDF & ORNL
  • SSI-OSCAR

Current Support: DGA
• Kerrighed robustness and full set of functionalities (2003-2005)
  • COCA PEA funded by DGA
  • Partnership with CGEY and ONERA-CERT
  • 2 full-time engineers (Renaud Lottiaux, David Margery)
  • Experiments with industrial applications
    • Ligase, Gorf3D, Mixsar, RTI HLA

Current Kerrighed Team (part of the PARIS project-team)
• Faculty
  • Christine Morin (DR, INRIA)
• PhD students
  • Pascal Gallard (INRIA)
  • Gaël Utard (INRIA)
  • Louis Rilling (ENS-Cachan)
• Post-doc
  • Geoffroy Vallée (PDI-EDF)
• Engineers
  • Renaud Lottiaux (INRIA)
  • David Margery (INRIA)
• Invited researcher
  • Isaac Scherson (UCI)
• Master students
  • Jamal Ghaffour
  • Etienne Rivière
• Former members
  • Ramamurthy Badrinath (assistant professor, IIT Kharagpur, India), May 2002 – April 2003
  • Viet Hoa Dinh (engineer), September 2001 – September 2002
  • Jean-Yves Burlett (Master student, univ. Rennes 1), February – June 2001
  • Sébastien Monnet (Master student, univ. Rennes 1), February – June 2003
  • H. Maka (Bachelor student, IIT Kharagpur), May – July 2003

Academic Collaborations
• University of Ulm, Germany
  • Checkpointing for shared memory parallel applications
• Rutgers University, USA
  • Myrinet, Infiniband
  • Self healing clusters
• ORNL
  • SSI-OSCAR
• University of California, Irvine, USA
  • Global scheduling
• Deakin University, Australia
  • SSI (informal contacts)