This paper presents stdchk, a checkpointing-optimized storage system designed for high-throughput, scalable, and reliable storage specifically tailored for checkpointing applications. By supporting transparent incremental checkpointing and efficient data management techniques, stdchk minimizes the performance overhead typically associated with checkpointing operations. The system offers support for write-intensive applications, ensuring high throughput while reducing the load on the file system. With features such as simplified data management and replication for high reliability, stdchk allows for seamless integration into existing applications without requiring modifications. Through experimentation, stdchk demonstrates significant improvements in storage bandwidth performance for checkpointing workloads.
stdchk: A Checkpoint Storage System for Desktop Grid Computing
Samer Al-Kiswany (UBC), Matei Ripeanu (UBC), Sudharshan S. Vazhkudai (ORNL), Abdullah Gharaibeh (UBC)
The University of British Columbia; Oak Ridge National Laboratory
Checkpointing Introduction
Checkpointing is used for fault tolerance, debugging, or migration. Typically, an application running for days on hundreds of nodes (e.g., a desktop grid) saves checkpoint images periodically.
[Figure: application timeline with periodic checkpoints (C).]
ICDCS ‘08
Deployment Scenario
The Challenge
Although checkpointing is necessary:
• It is pure overhead from a performance point of view: most of the time is spent writing to the storage system.
• It generates a high load on the storage system.
Requirement: A high-performance, scalable, and reliable storage system optimized for checkpointing applications.
Challenge: Low-cost, transparent support for checkpointing at the file-system level.
Checkpointing Workload Characteristics
• Write-intensive and bursty: e.g., a job running on hundreds of nodes periodically checkpoints 100s of GB of data.
• Write once, rarely read during application execution.
• Potentially high similarity between consecutive checkpoints.
• Application-specific checkpoint image life span: when is it safe to delete an image?
Why a Checkpointing-Optimized Storage System?
Optimizing for the checkpointing workload can bring valuable benefits:
• High throughput through specialization.
• Considerable savings in storage space and network effort through transparent support for incremental checkpointing.
• Simplified data management by exploiting the particularities of checkpoint usage scenarios.
• Reduced load on the shared file system.
• Can be built atop scavenged resources, at low cost.
stdchk: A checkpointing-optimized storage system built using scavenged resources.
Outline
• stdchk architecture
• stdchk features
• stdchk system evaluation
stdchk Architecture
• Manager (metadata management)
• Benefactors (storage nodes)
• Client (FS interface)
stdchk Features
• High throughput for write operations
• Support for transparent incremental checkpointing
• Simplified data management
• High reliability through replication
• POSIX file system API, so using stdchk does not require modifications to the application
Optimized Write Operation Alternatives
Write procedure alternatives:
• Complete local write
• Incremental write
• Sliding window write
Optimized Write Operation Alternatives
Write procedure alternatives:
• Complete local write
• Incremental write
• Sliding window write
[Figure: compute node with the application, the stdchk FS interface, and Disk.]
Optimized Write Operation Alternatives
Write procedure alternatives:
• Complete local write
• Incremental write
• Sliding window write
[Figure: compute node with the application, the stdchk FS interface, and Memory (in place of Disk).]
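The slides only name the three write alternatives, so the following is a minimal sketch of what a sliding window write could look like, assuming the diagram means that data is pushed to storage from a bounded in-memory buffer instead of being staged on local disk. The function name, the `window` parameter, and the list standing in for the benefactor nodes are all hypothetical.

```python
import queue
import threading

def sliding_window_write(chunks, window=4):
    """Push chunks to storage through a bounded in-memory window.

    Hypothetical sketch: `stored` stands in for remote benefactor nodes,
    and `window` is the maximum number of chunks buffered in memory.
    """
    buffer = queue.Queue(maxsize=window)
    stored = []

    def sender():
        # Drain the window and "send" each chunk until the end marker arrives.
        while True:
            chunk = buffer.get()
            if chunk is None:
                break
            stored.append(chunk)  # placeholder for a network send

    worker = threading.Thread(target=sender)
    worker.start()
    for chunk in chunks:
        buffer.put(chunk)  # blocks the writer while the window is full
    buffer.put(None)       # end-of-file marker
    worker.join()
    return stored

result = sliding_window_write([b"a", b"b", b"c"], window=2)
```

The bounded queue is the design point here: the application's write call only stalls when the window is full, so memory use stays fixed no matter how large the checkpoint image is.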
Write Operation Evaluation
Testbed: 28 machines, each with two 3.0 GHz Xeon processors, 1 GB RAM, and two 36.5 GB SCSI disks.
Achieved Storage Bandwidth
• Sliding window write achieves high bandwidth (110 MBps).
• This saturates the 1 Gbps link.
[Figure: the average achieved storage bandwidth (ASB) over a 1 Gbps testbed.]
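As a quick sanity check on the saturation claim, the arithmetic below compares the reported bandwidth with the raw link capacity. It assumes "MBps" means megabytes per second and uses decimal units (1 Gbps = 1000 Mbps); the variable names are illustrative.

```python
# Figures from the slide.
LINK_GBPS = 1.0         # link speed in gigabits per second
ACHIEVED_MBPS = 110.0   # sliding window write bandwidth, assumed to be MB/s

# 1 Gbps = 1000 Mbps = 125 MB/s of raw capacity (8 bits per byte).
link_mbytes_per_s = LINK_GBPS * 1000 / 8
utilization = ACHIEVED_MBPS / link_mbytes_per_s

print(f"raw link capacity: {link_mbytes_per_s:.0f} MB/s")
print(f"utilization:       {utilization:.0%}")
```

Under these assumptions the write path uses 88% of the raw link rate. The remainder is plausibly protocol overhead, though the slides do not break it down.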
stdchk Features
• High-throughput write operation
• Transparent incremental checkpointing
• Checkpointing-optimized data management
• POSIX file system interface: no modification to the application required
Transparent Incremental Checkpointing
Incremental checkpointing may bring valuable benefits:
• Lower network effort.
• Less storage space used.
But:
• How much similarity is there between consecutive checkpoints?
• How can we detect similarities between checkpoints? Is this fast enough?
Similarity Detection Mechanism: Compare-by-Hash
[Figure: checkpoint T0 is divided into blocks X, Y, and Z; each block is hashed and stored.]
Similarity Detection Mechanism: Compare-by-Hash
[Figure: checkpoint T1 is divided into blocks W, Y, and Z and hashed; the store then holds W (from T1) alongside X, Y, and Z (from T0).]
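The compare-by-hash idea in these two slides can be sketched as follows. This is a minimal illustration, not the stdchk implementation: it assumes T0 consists of blocks X, Y, Z and T1 of blocks W, Y, Z (as read from the diagrams), uses SHA-1 as an arbitrary choice of hash, and keeps the hash index in a plain Python set.

```python
import hashlib

def store_new_blocks(known_hashes, blocks):
    """Return only the blocks whose hash is not yet in `known_hashes`.

    `known_hashes` is updated in place, so later checkpoints are compared
    against everything stored so far.
    """
    fresh = []
    for block in blocks:
        digest = hashlib.sha1(block).digest()
        if digest not in known_hashes:
            known_hashes.add(digest)
            fresh.append(block)
    return fresh

known = set()
checkpoint_t0 = [b"X", b"Y", b"Z"]
checkpoint_t1 = [b"W", b"Y", b"Z"]

stored_for_t0 = store_new_blocks(known, checkpoint_t0)  # all three blocks are new
stored_for_t1 = store_new_blocks(known, checkpoint_t1)  # only W is new
```

Only one of T1's three blocks has to be transferred and stored, which is the source of the network and storage savings the slides attribute to incremental checkpointing.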
Similarity Detection Mechanism
How to divide the file into blocks?
• Fixed-size blocks + compare-by-hash (FsCH)
• Content-based blocks + compare-by-hash (CbCH)
FsCH Insertion Problem
[Figure: checkpoint i as blocks B1 to B5; checkpoint i+1 as blocks B1 to B6.]
Result: Lower similarity detection ratio.
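The insertion problem can be reproduced with a small experiment. With fixed-size blocks, a single inserted byte shifts every later block boundary, so blocks after the insertion point no longer hash to anything already stored. The block size, the synthetic data, and the helper names below are assumptions chosen for illustration.

```python
import hashlib
import random

def fixed_blocks(data, size=64):
    """Split data into consecutive blocks of `size` bytes (last one may be short)."""
    return [data[i:i + size] for i in range(0, len(data), size)]

def detected_similarity(old, new, size=64):
    """Fraction of blocks in `new` whose hash already appears in `old`."""
    old_hashes = {hashlib.sha1(b).digest() for b in fixed_blocks(old, size)}
    new_blocks = fixed_blocks(new, size)
    hits = sum(hashlib.sha1(b).digest() in old_hashes for b in new_blocks)
    return hits / len(new_blocks)

rng = random.Random(0)
checkpoint_i = bytes(rng.getrandbits(8) for _ in range(1024))  # 16 blocks of 64 bytes

# Case 1: one byte is overwritten in place; only the block containing it changes.
overwritten = checkpoint_i[:500] + bytes([checkpoint_i[500] ^ 0xFF]) + checkpoint_i[501:]
# Case 2: one byte is inserted; every block from the insertion point on is shifted.
inserted = checkpoint_i[:500] + b"\x00" + checkpoint_i[500:]

sim_overwrite = detected_similarity(checkpoint_i, overwritten)  # 15 of 16 blocks match
sim_insert = detected_similarity(checkpoint_i, inserted)        # 7 of 17 blocks match
```

Both variants differ from the original by a single byte, yet the insertion case loses every block after offset 500. That is the lower detection ratio the slide refers to.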
Content-based Compare-by-Hash (CbCH)
[Figure: a window of m bytes advances over checkpoint i by a fixed offset; each window is hashed and k bits of the hash value are tested ("HashValueK = 0?"), dividing the checkpoint into blocks B1 to B4.]
Content-based Compare-by-Hash (CbCH)
[Figure: checkpoint i as blocks B1, B2, B3, B4; checkpoint i+1 as blocks B1, BX, B3, B4.]
Result: Higher similarity detection ratio.
But: Computationally intensive.
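A rough sketch of content-based chunking, following the parameters named in the diagram (window of m bytes, k tested bits, window offset), is shown below. It assumes a block boundary is placed at the end of any window whose hash has its low k bits equal to zero. Hashing every window with SHA-1 is a deliberately simple stand-in and is one reason the approach is computationally intensive. The parameter values and function names are illustrative only.

```python
import hashlib
import random

def cbch_chunks(data, m=16, k=4, offset=1):
    """Split data where the content says so, not at fixed positions."""
    mask = (1 << k) - 1
    chunks, start, pos = [], 0, 0
    while pos + m <= len(data):
        window = data[pos:pos + m]
        value = int.from_bytes(hashlib.sha1(window).digest()[-4:], "big")
        if value & mask == 0:              # low k bits are zero: cut after this window
            chunks.append(data[start:pos + m])
            start = pos + m
        pos += offset
    if start < len(data):
        chunks.append(data[start:])        # trailing bytes form the last block
    return chunks

def detected_similarity(old, new):
    """Fraction of blocks in `new` whose hash already appears in `old`."""
    old_hashes = {hashlib.sha1(c).digest() for c in cbch_chunks(old)}
    new_chunks = cbch_chunks(new)
    hits = sum(hashlib.sha1(c).digest() in old_hashes for c in new_chunks)
    return hits / len(new_chunks)

rng = random.Random(0)
checkpoint_i = bytes(rng.getrandbits(8) for _ in range(4096))
checkpoint_next = b"\x00" + checkpoint_i   # one byte inserted at the front

ratio = detected_similarity(checkpoint_i, checkpoint_next)
print(f"blocks of checkpoint i+1 already stored: {ratio:.0%}")
```

Because boundaries depend only on the bytes inside each window, an insertion disturbs just the blocks around it; the boundaries after it move along with the data, so later blocks still match.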
Evaluating Similarity Between Consecutive Checkpoints
Applications: BMS* and BLAST
Checkpointing intervals: 1, 5, and 15 minutes
* Checkpoints by Pratul Agarwal (ORNL)
Similarity Ratio and Detection Throughput
The table presents the average rate of detected similarity and the throughput in MB/s (in brackets) for each heuristic.
But: Using the GPU, CbCH achieves over 190 MBps throughput.
Reference: S. Al-Kiswany, A. Gharaibeh, E. Santos-Neto, G. Yuan, M. Ripeanu, "StoreGPU: Exploiting Graphics Processing Units to Accelerate Distributed Storage Systems," HPDC, 2008.
Compare-by-Hash Results
FsCH slightly degrades the achieved bandwidth, but reduces the storage space used and the network effort by 24%.
[Figure: achieved storage bandwidth.]
Outline
• stdchk architecture
• stdchk features
• stdchk overall system evaluation
stdchk Scalability
stdchk sustains high loads:
• Number of nodes
• Workload
Setup: 7 clients, each writing 100 files (100 MB each), for a total of 70 GB; stdchk pool of 20 benefactor nodes.
[Figure: plot with phases labeled "Steady", "Nodes Leave", and "Nodes Join".]
Experiment with Real Application
Application: BLAST
Execution time: > 5 days
Checkpointing interval: 30 s
Stripe width: 4 benefactors
Client machine: two 3.0 GHz Xeon processors, SCSI disks.
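To make the "stripe width" parameter concrete, the sketch below spreads the chunks of one file across four benefactors. The slide does not say how chunks are assigned to benefactors, so round-robin placement is purely an assumption here, as are the function and variable names.

```python
def stripe(num_chunks, width=4):
    """Assign chunk indices to `width` benefactors in round-robin order.

    Hypothetical placement policy, used only to illustrate stripe width.
    """
    placement = {benefactor: [] for benefactor in range(width)}
    for index in range(num_chunks):
        placement[index % width].append(index)
    return placement

layout = stripe(10, width=4)
for benefactor, chunk_ids in layout.items():
    print(f"benefactor {benefactor}: chunks {chunk_ids}")
```

With a stripe width of 4, a client can write to four benefactors in parallel, so no single benefactor's disk or network link bounds the write.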
Summary
stdchk: A checkpointing-optimized storage system built using scavenged resources.
stdchk features:
• High-throughput write operation
• Saves considerable disk space and network effort
• Checkpointing-optimized data management
• Easy to adopt: implements a POSIX file system interface
• Inexpensive: built atop scavenged resources
Consequently, stdchk:
• Offloads the checkpointing workload from the shared FS.
• Speeds up checkpointing operations (reduces checkpointing overhead).
Thank you
netsyslab.ece.ubc.ca