Portable Checkpointing for BSP Applications on Grid Environments
Raphael Y. de Camargo, Fabio Kon, Alfredo Goldman
Department of Computer Science, IME / USP
SBAC 2005
INTRODUCTION
• Computational Grids: ubiquitous access and coordinated usage of distributed resources
• Opportunistic Grids: usage of idle time of non-dedicated resources (desktop PCs)
  • Resources are heterogeneous (Mac, Windows, Linux)
  • Failure rate is higher than for dedicated resources
  • Failures occur on a daily basis
INTEGRADE
• Grid middleware: usage of idle computing power from personal computers
• Federation of clusters
  • Each cluster is composed of a collection of resource-providing nodes
• Supports sequential, parameter-sweeping, and BSP applications
MOTIVATION
• Fault tolerance is essential, especially when running parallel applications
  • Without it, the failure of a single node requires restarting the application from the beginning
• Checkpointing can be used as a fault-tolerance mechanism
• Mechanisms supporting heterogeneity improve resource utilization
  • A portable checkpointing mechanism allows an application to be restarted on machines with a different architecture
OUR APPROACH
• Source code instrumentation
  • Perform additional tasks
    • Logging, profiling, persistence
  • BSP application on heterogeneous nodes
  • Portable checkpointing of applications
• Pre-compiler based on OpenC++
  • Open-source tool for compile-time reflection
BSP MODEL
• Bridging model
  • Links architecture to software
• Execution performed in supersteps
  • Computation and synchronization phases
• Two communication mechanisms:
  • Direct Remote Memory Access (DRMA)
  • Bulk Synchronous Message Passing (BSMP)
• Existing implementations:
  • Oxford BSPLib, PUB, BSP-G
  • Work only on homogeneous clusters
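The superstep structure can be illustrated with a small single-process simulation. This is only a sketch and not the BSPLib API: the names `run_supersteps`, `ring_shift`, and the `(target, name, value)` tuple format are assumptions made for illustration. It mimics DRMA-style writes that become visible at the target only after the synchronization phase.

```python
# Hypothetical single-process simulation of BSP supersteps (not the BSPLib API).
# A "put" queued during the computation phase becomes visible at the target
# process only after the synchronization phase that ends the superstep.


def run_supersteps(nprocs, nsteps, step_fn):
    memories = [dict() for _ in range(nprocs)]
    for step in range(nsteps):
        pending = []
        # Computation phase: each process sees only its own memory.
        for pid in range(nprocs):
            pending.extend(step_fn(pid, step, memories[pid]))
        # Synchronization phase: deliver every queued put.
        for target, name, value in pending:
            memories[target][name] = value
    return memories


def ring_shift(pid, step, mem, nprocs=4):
    # Each process forwards the token it currently holds to its right neighbour.
    token = mem.get("token", pid)
    return [((pid + 1) % nprocs, "token", token)]


result = run_supersteps(4, 2, ring_shift)
tokens = [m["token"] for m in result]
print(tokens)  # [2, 3, 0, 1]: every token moved two positions to the right
```

Because the puts are applied only after all processes finish computing, no process can observe a value written during the same superstep, which is the property the barrier provides.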
HETEROGENEOUS NODES
• Extended BSPLib API
  • Some methods receive an extra parameter describing data type information → used to convert data
• Pointer data types are defined by their declaration
  • Arbitrary data casts are not allowed
  • Reasonable requirement for portability
• Pre-compiler automatically modifies application source code to use the extended API
  • No need for manual modifications
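Why type information is needed for conversion can be shown with a short sketch. It does not reproduce the extended BSPLib API: `pack_for_send`, `unpack_on_receive`, and the `BYTE_ORDER` table are hypothetical, and Python `struct` format codes stand in for the data type descriptor passed as the extra parameter.

```python
import struct

# Hypothetical type descriptors: struct format codes ("i" = 32-bit int,
# "d" = 64-bit double) stand in for the extra data type parameter.
BYTE_ORDER = {"x86": "<", "ppc": ">"}  # little-endian vs. big-endian


def pack_for_send(values, type_code, sender_arch):
    """Serialize values in the sender's native byte order."""
    fmt = BYTE_ORDER[sender_arch] + str(len(values)) + type_code
    return struct.pack(fmt, *values)


def unpack_on_receive(payload, type_code, sender_arch):
    """Decode a payload using the element type and the sender's byte order."""
    order = BYTE_ORDER[sender_arch]
    count = len(payload) // struct.calcsize(order + type_code)
    return list(struct.unpack(order + str(count) + type_code, payload))


payload = pack_for_send([1, 2, 3], "i", "ppc")
converted = unpack_on_receive(payload, "i", "ppc")   # [1, 2, 3]
misread = struct.unpack("<3i", payload)[0]           # 16777216, not 1
```

Without the element type, the receiver could not know where each value starts or how many bytes to swap, which is why raw byte copies are not enough between x86 and PowerPC nodes.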
CHECKPOINTING APPROACHES
• System-level checkpoint
  • Data is copied into checkpoints directly from the application address space
• Application-level checkpoint
  • Instrument application source code to save its state
  • Semantic information about data types is available
  • Allows generation of portable checkpoints
• Drawbacks
  • Need to modify application source code
  • Checkpoints can be taken only at certain points in the application
CHECKPOINTING LIBRARY
• Pre-compiler instruments application source code
  • No manual instrumentation of source code
  • Requires access to the source code
• Checkpointing Library
  • Timer with a minimum checkpoint interval
  • Saving performed by a separate thread
  • Checkpoint can be stored in a filesystem (NFS) or a remote checkpoint repository (TCP/IP)
• Execution Manager
  • Coordinates checkpointing of BSP parallel applications
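The combination of a minimum-interval timer and a separate saving thread might look like the sketch below. The class name `Checkpointer`, its methods, and the `store` callback are hypothetical, and `pickle` merely stands in for the library's portable checkpoint format.

```python
import pickle
import threading
import time


class Checkpointer:
    """Hypothetical sketch: minimum-interval timer plus background saving."""

    def __init__(self, min_interval_s, store):
        self.min_interval_s = min_interval_s
        self.store = store                  # callable that receives the bytes
        self._last = time.monotonic()
        self._writer = None

    def maybe_checkpoint(self, state):
        """Called at the checkpoint candidates inserted into the application."""
        now = time.monotonic()
        if now - self._last < self.min_interval_s:
            return False                    # too soon since the last checkpoint
        if self._writer is not None and self._writer.is_alive():
            return False                    # previous save still in progress
        snapshot = pickle.dumps(state)      # copy state before the app changes it
        self._writer = threading.Thread(target=self.store, args=(snapshot,))
        self._writer.start()                # storage runs off the main thread
        self._last = now
        return True

    def wait(self):
        """Block until any in-flight save has finished."""
        if self._writer is not None:
            self._writer.join()


saved = []
cp = Checkpointer(0.0, saved.append)
took = cp.maybe_checkpoint({"superstep": 7})
cp.wait()
```

Here `store` could write to a local file, an NFS path, or a socket; the application thread only pays for the in-memory copy, while the slower transfer happens in the background.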
SAVING EXECUTION DATA
• Necessary to save
  • Execution stack + global variables
  • Data in heap area
  • Other information
• Execution stack
  • Information from active function calls
  • Local variables, function parameters, return address, and control information
  • Dependent on architecture and OS
• Heap area
  • Memory chunks allocated by application
EXECUTION STACK
• Save only data necessary for reconstruction
  • List of function calls
  • Value of parameters and local variables
• Data added to an auxiliary stack during execution
• Recovery
  • Data read from checkpoint
  • Functions called
  • Local variables and parameter values assigned
  • Data conversion is performed if necessary
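A simplified sketch of the auxiliary-stack idea follows. It is not the code generated by the pre-compiler: the helpers `enter`, `leave`, and `take_checkpoint` are hypothetical, and the example uses a single active function, whereas a real application would replay a chain of nested calls.

```python
# Hypothetical sketch: each instrumented function registers its local state
# on entry and removes it on exit; a checkpoint is a copy of the auxiliary stack.

aux_stack = []        # one entry per active call: {"func": ..., "locals": ...}
restore_frames = []   # frames read from a checkpoint, outermost first


def enter(func_name, local_vars):
    """Register a frame; on restart, overwrite locals with the saved values."""
    if restore_frames:
        saved = restore_frames.pop(0)
        assert saved["func"] == func_name
        local_vars.update(saved["locals"])
    aux_stack.append({"func": func_name, "locals": local_vars})
    return local_vars


def leave():
    aux_stack.pop()


def take_checkpoint():
    return [{"func": f["func"], "locals": dict(f["locals"])} for f in aux_stack]


def sum_range(n, checkpoint_at=None):
    v = enter("sum_range", {"i": 0, "total": 0})
    snapshot = None
    while v["i"] < n:
        v["total"] += v["i"]
        v["i"] += 1
        if v["i"] == checkpoint_at:
            snapshot = take_checkpoint()
    leave()
    return v["total"], snapshot


total, snap = sum_range(10, checkpoint_at=4)   # snap holds i=4, total=6
restore_frames.extend(snap)                    # simulate a restart
resumed, _ = sum_range(10)                     # continues from i=4
```

The restarted call is invoked normally, but its locals are assigned from the checkpoint, so the loop resumes at iteration 4 instead of starting over. Only names and values are stored, not raw stack memory, which is what keeps the saved state independent of architecture and OS.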
POINTERS AND HEAP MEMORY
• Memory addresses
  • Specific to an execution
  • Architecture dependent
• Checkpoint generation
  • Data from heap area is copied to checkpoint
  • Memory addresses → offsets in checkpoint
• Recovery
  • Memory areas are allocated
  • Data is copied to these memory areas
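The translation of addresses into checkpoint offsets, and back, can be sketched as follows. The `Node` class and the `save_graph` / `restore_graph` functions are hypothetical, and object references play the role of C pointers.

```python
class Node:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next


def save_graph(root):
    """Flatten the nodes reachable from root; references become offsets."""
    offsets = {}   # id(node) -> position of the node in the checkpoint
    order = []
    node = root
    while node is not None and id(node) not in offsets:
        offsets[id(node)] = len(order)
        order.append(node)
        node = node.next
    return [
        {"value": n.value,
         "next": offsets[id(n.next)] if n.next is not None else None}
        for n in order
    ]


def restore_graph(records):
    """Allocate fresh nodes first, then turn offsets back into references."""
    nodes = [Node(r["value"]) for r in records]
    for node, rec in zip(nodes, records):
        if rec["next"] is not None:
            node.next = nodes[rec["next"]]
    return nodes[0] if nodes else None


a, b, c = Node(1), Node(2), Node(3)
a.next, b.next, c.next = b, c, a        # cyclic list: 1 -> 2 -> 3 -> 1
records = save_graph(a)
restored = restore_graph(records)
```

The two-pass recovery (allocate, then patch) mirrors the slide: memory areas are created first, so every offset has a valid target by the time references are rebuilt, even when the structure contains cycles.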
EXPERIMENTS
• Parallel BSP applications
  • Similarity between large sequences of characters
  • Matrix multiplication
• Testbed:
  • Machines in two labs:
    • 11 AthlonXP 1700+, 512MB
    • 1 Power PC G4, 512MB
    • 2 Athlon 64 2800+, 512MB
  • 100Mbps Ethernet in 2 connected LANs
CHECKPOINTING OVERHEAD
• Simulation parameters
  • Matrix multiplication application using 9 nodes
  • Matrix size: 450x450 and 1800x1800
  • Checkpoint sizes: 2.3MB and 37.1MB
  • Checkpointing intervals: 10, 30, and 60s
CHECKPOINTING OVERHEAD
• Storage on local machine or remote repository is faster than with NFS
• When using a remote repository, the overhead was consistently below 10%, even with a 10s interval
DYNAMIC GRID SIMULATION
• We simulated a dynamic environment where machines can fail unexpectedly
• Sequence similarity application using 10 nodes
• Machines fail according to an exponential distribution
  • MTBF (1/λ) = 600s and 1800s
• Smaller checkpointing intervals → smaller execution times
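The effect of the checkpointing interval under exponentially distributed failures can be explored with a toy model. This is not the simulation used in the experiments: `simulate_run` is hypothetical, it models a single machine, and it assumes that checkpoints and restarts cost nothing, so it shows only the work lost to rollbacks.

```python
import random


def simulate_run(work_s, ckpt_interval_s, mtbf_s, rng):
    """Return the wall-clock time needed to finish work_s seconds of computation.

    Simplifying assumptions (not from the slides): checkpoints and restarts
    are free, and a failure rolls execution back to the last checkpoint.
    """
    elapsed = 0.0
    done = 0.0                                      # progress held in the last checkpoint
    next_failure = rng.expovariate(1.0 / mtbf_s)    # rate lambda = 1 / MTBF
    while done < work_s:
        segment = min(ckpt_interval_s, work_s - done)
        if elapsed + segment <= next_failure:
            elapsed += segment
            done += segment                         # checkpoint reached
        else:
            elapsed = next_failure                  # work since the checkpoint is lost
            next_failure = elapsed + rng.expovariate(1.0 / mtbf_s)
    return elapsed


def mean_time(interval_s, mtbf_s=600, work_s=3600, runs=200, seed=1):
    rng = random.Random(seed)
    return sum(simulate_run(work_s, interval_s, mtbf_s, rng) for _ in range(runs)) / runs


short_interval = mean_time(10)
long_interval = mean_time(60)
```

Under these assumptions the 10s interval finishes sooner on average than the 60s interval because less work is lost per failure. A fuller model would also charge a cost per checkpoint, which pushes the best interval upward.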
HETEROGENEOUS NODES
• Matrix multiplication on 4 heterogeneous nodes
  • 3 AthlonXP (x86) + 1 PowerPC G4 (ppc)
  • Elements of type long double
• Time spent on data conversion is small compared to total execution time
RESTART AN APPLICATION
• Time to recover from a checkpoint saved on different architectures
• Application that generates a graph of structures containing 20K nodes
• When recovering on an x86 machine
  • From x86: 0.179s
  • From x86-64: 0.186s → 3.9% slower than x86
  • From PPC: 0.192s → 7.2% slower than x86
• Overhead when reading checkpoint data
CONCLUSIONS
• Overhead of portability is small, and can lead to better resource utilization
• Possible to execute BSP applications on heterogeneous nodes
• Ongoing work
  • Distributed checkpoint repository
    • Scalability and fault tolerance
  • Simulations in large-scale and wide-area Grids
  • Support for multithreaded C++ applications
QUESTIONS
For more information, please visit the project page: http://gsd.ime.usp.br/integrade