Uniprocessor Checkpointing CS 717 – Fall 2001 9/25/01
The Need to Save State • Many of the FT systems we have discussed need a way to restart processes from previous points in their computation • A checkpoint is just a ‘snapshot’ of a process (or system) at a certain point in time • A checkpointing system provides a way to take these snapshots, and to restart from them
Types of Ckpt Systems • Kernel Level • OS supports ckpt & recovery • Transparent to the application and developer • User Level • Application linked against (user) library • Library functions perform ckpt and recovery • Transparent to application • Limitations (cannot restore PID, PPID, etc.) • Application Level • Applications coded to ckpt themselves, and to restart from a checkpoint
Comparison of Levels • Kernel & User (System) Level • Easy to add checkpointing to existing code • Works with (almost) any programs • General, ‘coarse’, approach • Application Level • Could require complete re-write, or extensive modifications • Specific, ‘fine-grained’ solutions
System Level Checkpointing • Libckpt (1994) • Plank, Beck, Kingsley (UTK), Li (Princeton) • User level library for UNIX
Libckpt • User Level Checkpoint Library • Goals • Transparent • Requires minimal modifications to code and re-linking • Low Overhead • Automatic optimizations to reduce ckpt file size • Allow user directed checkpointing
Libckpt Overview • Taking the ‘snapshot’ • Suspend the process • Write process’ memory and registers to a file • Recovery • Reload executable from original file • Reconstruct memory and register state from checkpoint file
Libckpt Operation • Application main() is re-named ckpt_target() • Library main() checks if in restore mode (specified using command line option); otherwise reads checkpoint parameters from file
Libckpt Operation (2) • main() sets a timer to interrupt application every n seconds • On signal • Uses setjmp to record registers, pc, etc. • Writes the stack and heap segments to file • Resumes application code
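The timer step above could be sketched as follows on a POSIX system. This is a minimal sketch, not libckpt's code: it assumes `setitimer` and `SIGALRM` (the slides do not say which timer or signal the library uses), and the names `start_ckpt_timer`, `stop_ckpt_timer`, `wait_for_ckpt_requests` and `ckpt_requests` are hypothetical.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/time.h>

/* Hypothetical counter: incremented once per timer signal. */
static volatile sig_atomic_t ckpt_requests = 0;

static void on_alarm(int signo) {
    (void)signo;
    ckpt_requests++;   /* a real library would take the checkpoint here */
}

/* Arm a repeating timer that fires every interval_ms milliseconds.
 * Returns 0 on success, -1 on failure. */
int start_ckpt_timer(long interval_ms) {
    assert(interval_ms > 0);   /* a zero interval would disarm the timer */

    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;  /* restart interrupted system calls */
    if (sigaction(SIGALRM, &sa, NULL) != 0)
        return -1;

    struct itimerval tv;
    tv.it_interval.tv_sec = interval_ms / 1000;
    tv.it_interval.tv_usec = (interval_ms % 1000) * 1000;
    tv.it_value = tv.it_interval;   /* first expiry after one interval */
    return setitimer(ITIMER_REAL, &tv, NULL);
}

/* Disarm the timer. */
int stop_ckpt_timer(void) {
    struct itimerval off;
    memset(&off, 0, sizeof off);
    return setitimer(ITIMER_REAL, &off, NULL);
}

/* Spin until at least n timer signals have arrived; returns the count seen. */
int wait_for_ckpt_requests(int n) {
    while (ckpt_requests < n) {
        /* application work would go here */
    }
    return ckpt_requests;
}
```

Calling `start_ckpt_timer(10)` and then `wait_for_ckpt_requests(3)` returns once three signals have been delivered, which is the point at which a checkpointing library would have taken three snapshots.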
Libckpt Operation (3) • If application started with =recover as command line option • Application starts, which reloads the text segment from the executable • Open checkpoint file • Recover heap from file • Recover stack from file • Restore registers (using longjmp)
Virtual Address Space
Bottom of Stack
    Stack
SP
sbrk(0)
    Heap
&edata
    Data (Static)
&etext
    Text
0
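The boundaries in this picture can be probed from a running program. The sketch below assumes a Linux-style toolchain in which the linker provides the `etext` and `edata` symbols and `sbrk` is available; `struct layout` and `probe_layout` are hypothetical names, and the exact ordering of segments is platform dependent.

```c
#define _DEFAULT_SOURCE
#include <stdint.h>
#include <unistd.h>

/* Linker-provided symbols marking the end of the text and
 * initialized-data segments (assumed present, as on Linux). */
extern char etext, edata;

struct layout {
    uintptr_t text_end;   /* &etext */
    uintptr_t data_end;   /* &edata */
    uintptr_t heap_end;   /* sbrk(0), the current program break */
    uintptr_t stack_ptr;  /* address of a local, close to SP */
};

/* Sample the four boundaries shown on the slide. */
struct layout probe_layout(void) {
    int local = 0;
    struct layout l;
    l.text_end = (uintptr_t)&etext;
    l.data_end = (uintptr_t)&edata;
    l.heap_end = (uintptr_t)sbrk(0);
    l.stack_ptr = (uintptr_t)&local;
    return l;
}
```

A user-level checkpointer that knows these four addresses knows which ranges to write: the data and heap between the end of the text segment and `sbrk(0)`, and the stack between SP and the bottom of the stack.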
Checkpoint And Recovery Algorithms

main()
    if (recovery)
        restore stack
        restore heap
        pos = top of stack
        longjmp(pos, 1)    // restore regs.
    else
        run usual code

signal_handler()
    jmp_buf pos
    if (setjmp(pos) == 0)
        // saved regs. in known position on stack
        write stack
        write heap
    else
        // process recovered
    return
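The two-way return of `setjmp` is what lets one piece of code serve both the checkpoint path and the recovery path. The sketch below shows only that control flow inside a single live process; it does not write or restore the stack and heap, and `checkpoint_demo` and `restore` are hypothetical names.

```c
#include <setjmp.h>

static jmp_buf pos;   /* saved register context */

/* Stand-in for the recovery path: jump back to the saved context. */
static void restore(void) {
    longjmp(pos, 1);
}

/* Returns 11 when both branches ran: 10 for the checkpoint branch
 * plus 1 for the recovery branch. */
int checkpoint_demo(void) {
    /* volatile: these locals are modified between setjmp and longjmp */
    volatile int saved = 0;
    volatile int recovered = 0;

    if (setjmp(pos) == 0) {
        /* First return: a real library would write stack and heap here. */
        saved = 1;
        restore();        /* simulate a later recovery; does not return */
    } else {
        /* Second return: execution resumes as if from the checkpoint. */
        recovered = 1;
    }
    return saved * 10 + recovered;
}
```

In a real recovery the jump targets a context rebuilt from the checkpoint file, which is why the stack must be restored before `longjmp` is called; here the frame that called `setjmp` is still live, so the jump is well defined.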
Illustration
Checkpoint: main() → user_main() → fun1() → fun2() → signal: save regs on stack, save stack to file, save heap to file, resume
Recovery: main() → restore(): restore stack, restore heap, take jump
Optimization: Incremental Checkpointing • Observation: between taking two checkpoints, only a portion of the memory has actually been changed • Optimization: save only what has been changed since last ckpt, the rest can be read from previous ckpts
Taking Incremental Ckpts. • After taking a ckpt (and after init.), set protection on all pages to ‘read-only’ • Write to page will cause a protection violation • Libckpt library catches that signal, and sets page protection to ‘read-write’, page is marked as dirty • When writing checkpoint file, only write dirty pages
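The page-protection scheme above can be sketched with `mprotect` and a `SIGSEGV` handler. This is an illustrative sketch for Linux, not libckpt's implementation: it tracks one small `mmap`ed region rather than the whole address space, all function names are hypothetical, and calling `mprotect` from a signal handler is common practice but not guaranteed async-signal-safe by POSIX. Some systems report the fault as `SIGBUS` instead.

```c
#define _DEFAULT_SOURCE
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define NPAGES 8

static char *region;                          /* tracked, page-aligned memory */
static size_t page_size;
static volatile sig_atomic_t dirty[NPAGES];   /* one dirty flag per page */

/* Write fault: mark the page dirty and make it writable again. */
static void on_fault(int signo, siginfo_t *info, void *ctx) {
    (void)signo; (void)ctx;
    uintptr_t addr = (uintptr_t)info->si_addr;
    uintptr_t base = (uintptr_t)region;
    if (addr < base || addr >= base + NPAGES * page_size)
        _exit(1);                             /* a genuine crash */
    size_t page = (addr - base) / page_size;
    dirty[page] = 1;
    mprotect(region + page * page_size, page_size, PROT_READ | PROT_WRITE);
}

/* Allocate the region and install the fault handler. Returns 0 on success. */
int tracking_init(void) {
    page_size = (size_t)sysconf(_SC_PAGESIZE);
    region = mmap(NULL, NPAGES * page_size, PROT_READ | PROT_WRITE,
                  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (region == MAP_FAILED)
        return -1;
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_fault;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSEGV, &sa, NULL);
}

/* Called after each checkpoint: clear flags, write-protect every page. */
int tracking_reset(void) {
    for (int i = 0; i < NPAGES; i++)
        dirty[i] = 0;
    return mprotect(region, NPAGES * page_size, PROT_READ);
}

/* Number of pages the next incremental checkpoint would have to write. */
int count_dirty(void) {
    int n = 0;
    for (int i = 0; i < NPAGES; i++)
        n += dirty[i];
    return n;
}

/* Start address of page i inside the tracked region. */
char *tracked_page(int i) {
    return region + (size_t)i * page_size;
}
```

Each page faults at most once per checkpoint interval, so the tracking cost is one signal per dirtied page rather than one per write.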
Drawbacks to Incremental Ckpt • Required to keep multiple copies of the checkpoint file • On recovery, will unnecessarily restore old copies of data
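Both drawbacks show up in a toy model of recovery. The sketch below assumes each checkpoint is simply a list of (page index, page contents) records and that recovery replays the chain oldest first; the types, the 4-byte page size and the function `replay` are all hypothetical and chosen only to keep the example small.

```c
#include <string.h>

#define PAGE   4   /* toy page size in bytes */
#define NPAGES 3   /* pages in the toy address space */

struct page_rec {
    int index;             /* which page this record holds */
    char data[PAGE];       /* page contents at checkpoint time */
};

struct ckpt {
    int npages;                     /* records in this checkpoint */
    struct page_rec pages[NPAGES];
};

/* Rebuild memory by replaying every checkpoint in the chain, oldest first.
 * Returns the number of page copies performed. */
int replay(const struct ckpt *chain, int n, char mem[NPAGES][PAGE]) {
    int copied = 0;
    for (int c = 0; c < n; c++) {
        for (int p = 0; p < chain[c].npages; p++) {
            memcpy(mem[chain[c].pages[p].index], chain[c].pages[p].data, PAGE);
            copied++;
        }
    }
    return copied;
}
```

With one full checkpoint of three pages followed by an incremental checkpoint of one page, `replay` performs four copies to rebuild three pages: the stale copy of the rewritten page is restored and then overwritten, and both files have to be kept.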
Optimization: Asynchronous Checkpointing • Observation: the process must be suspended while the checkpoint file is written • Optimization: a separate thread could write the checkpoint file while the main thread continues
Asynchronous Checkpointing • Make a copy of the process space • 2nd thread writes the copy to disk • 1st thread continues without halting
Asynchronous Checkpointing(2) • Unix fork() provides the necessary behavior • When about to take ckpt, process forks • OS makes a complete copy of the original process’ space • Clone writes ckpt file, then dies • Original continues computing
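The fork-based scheme can be sketched as below. This is a minimal illustration rather than libckpt's code: it saves a single buffer instead of the whole address space, and `checkpoint_async` and `checkpoint_wait` are hypothetical names.

```c
#define _DEFAULT_SOURCE
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that writes buf[0..len) to path. The parent returns
 * immediately with the child's pid, or -1 if fork failed. */
pid_t checkpoint_async(const char *path, const char *buf, size_t len) {
    pid_t pid = fork();
    if (pid == 0) {
        /* Child: sees the memory exactly as it was at the fork. */
        FILE *f = fopen(path, "wb");
        if (f == NULL)
            _exit(1);
        size_t n = fwrite(buf, 1, len, f);
        if (fclose(f) != 0 || n != len)
            _exit(1);
        _exit(0);          /* clone writes the file, then dies */
    }
    return pid;            /* parent: continue computing */
}

/* Wait for the child; returns 0 if the checkpoint was written. */
int checkpoint_wait(pid_t pid) {
    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

If the parent modifies the buffer right after `checkpoint_async` returns, the file still holds the contents from the moment of the fork, because the child works on its own copy of the address space.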
Copy-On-Write Checkpointing • Like asynchronous checkpointing, but only copy a page when the two versions are about to differ • Some (most?) OSes implement fork() in this manner, so the benefit is automatic
Checkpoint Compression • Use a standard data compression algorithm to shrink the size of the checkpoint file • Only improves overhead if the speed of compression is faster than the speed of disk writes, and compression is significant • “For uniprocessor checkpointing, this is not the case” • Not implemented in libckpt
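The break-even condition can be made concrete with a small cost model. The model is an assumption, not taken from the slides: compression and disk writes run one after the other (no pipelining), `ratio` is compressed size divided by original size, and `compression_helps` is a hypothetical name.

```c
/* Returns 1 if compressing before writing is expected to shorten the
 * checkpoint, 0 otherwise.
 *
 * Cost per megabyte of checkpoint data:
 *   uncompressed: 1 / disk_mb_s
 *   compressed:   1 / compress_mb_s + ratio / disk_mb_s
 */
int compression_helps(double compress_mb_s, double disk_mb_s, double ratio) {
    double plain = 1.0 / disk_mb_s;
    double packed = 1.0 / compress_mb_s + ratio / disk_mb_s;
    return packed < plain;
}
```

Rearranging the inequality gives compress_mb_s > disk_mb_s / (1 - ratio): the compressor has to be faster than the disk, and by a wider margin the less the data shrinks, which matches both conditions on the slide.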
User Directed Checkpointing • As described so far, libckpt is (almost) entirely transparent to the programmer • Compare to application level checkpoint requiring extensive code changes • Is there a middle ground? • Libckpt allows programmers to annotate application code with directives that guide the checkpointing
Memory Exclusion • Certain areas of memory can be excluded from the checkpoint • Dead memory – will never be read or written • Clean memory – values have not changed since previous checkpoint • Incremental Ckpt provides clean memory opt. at a coarse level (page size) • Only writing the ‘active’ areas of the stack and heap provides dead memory opt.
User Directed Memory Exclusion • Libckpt provides the app. programmer with two functions • exclude_bytes(ptr, length, usage) • Specify an area of memory to exclude from future checkpoints • include_bytes(ptr, length) • Add a previously excluded area of memory to future checkpoints
Clean Memory • If mem is clean • exclude_bytes(mem, …, CKPT_READONLY) • Include mem in next checkpoint, but exclude it from all subsequent ones • Cannot write to mem until after call to include_bytes(mem) • On recovery, the last saved version of mem is restored
Clean Memory: Example
for (…) {
    A = init_A()
    exclude_bytes(A, …, CKPT_READONLY)
    do_stuff(A)    // assuming A does not change
    include_bytes(A, …)
}
Dead Memory • If mem is dead • exclude_bytes(mem, …, CKPT_DEAD) • Do not checkpoint mem • Cannot read mem until after include_bytes(mem) • On recovery, mem is not restored
Dead Memory: Example
for (…) {
    A = init_A()
    do_stuff(A)
    exclude_bytes(A, …, CKPT_DEAD)
    do_other_stuff()    // assumes will not read A
    include_bytes(A, …)
}
Using Memory Exclusion • There can be a dramatic reduction in the size of the checkpoint file • Must be used very carefully • Inadvertently excluding a live region from a checkpoint could cause erroneous behavior on restart
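The bookkeeping behind memory exclusion might look like the toy table below. This is not libckpt's implementation: the `toy_` functions are hypothetical stand-ins, the enum values are assumptions (only the names `CKPT_READONLY` and `CKPT_DEAD` come from the slides), and regions are assumed not to overlap.

```c
#include <stddef.h>

enum usage { CKPT_READONLY, CKPT_DEAD };

struct region {
    const void *ptr;
    size_t len;
    enum usage usage;
    int saved_once;    /* READONLY regions are written exactly once */
};

#define MAX_REGIONS 16
static struct region table[MAX_REGIONS];
static int nregions;

/* Record a region to leave out of future checkpoints. 0 on success. */
int toy_exclude_bytes(const void *ptr, size_t len, enum usage u) {
    if (nregions == MAX_REGIONS)
        return -1;
    table[nregions++] = (struct region){ptr, len, u, 0};
    return 0;
}

/* Put a previously excluded region back into future checkpoints. */
void toy_include_bytes(const void *ptr, size_t len) {
    for (int i = 0; i < nregions; i++) {
        if (table[i].ptr == ptr && table[i].len == len) {
            table[i] = table[--nregions];   /* swap-remove the entry */
            return;
        }
    }
}

/* Bytes the next checkpoint would write, out of `total` live bytes. */
size_t toy_checkpoint_size(size_t total) {
    size_t skip = 0;
    for (int i = 0; i < nregions; i++) {
        if (table[i].usage == CKPT_DEAD || table[i].saved_once)
            skip += table[i].len;           /* dead, or clean and saved */
        else
            table[i].saved_once = 1;        /* clean: save it this once */
    }
    return total - skip;
}
```

With 1000 live bytes, a 100-byte read-only region and a 50-byte dead region, the first checkpoint writes 950 bytes and later ones write 850. The saving is real only if the program keeps its promise not to write the read-only region or read the dead one.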
Synchronous Checkpointing • At different points in the program’s execution the amount of ‘live’ state varies widely • The stack might be much smaller (shallower call graph) • Heap items might have been de-allocated • Regions of memory might be dead or clean
Synchronous Ckpt (2) • If checkpoints are taken at times where there is relatively little live state, the checkpoint file size (and overhead) will be smaller • Allow user to specify where in a program a checkpoint should be taken • Independent of timers (signals)
Sync. Ckpt. Example
for (…) {
    checkpoint_here()
    A = malloc(…)
    do_stuff(A)
    free(A)
}
Synchronous Ckpt (3) • To avoid checkpointing too frequently, mintime parameter specifies the minimal amount of time between two checkpoints • If checkpoint_here() is called less than mintime seconds after the last checkpoint, the call is ignored
Synchronous Ckpt (4) • To ensure that checkpoints are taken frequently enough to be of use, maxtime parameter specifies the maximum time allowed to elapse between two checkpoints • If maxtime passes without a checkpoint, an asynchronous (timer-driven) checkpoint is taken
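The interaction of mintime and maxtime can be sketched as two small predicates. Only the parameter names `mintime` and `maxtime` come from the slides; the struct, the function names and the use of seconds stored as `double` are assumptions made for illustration.

```c
struct ckpt_policy {
    double mintime;   /* minimum seconds between two checkpoints */
    double maxtime;   /* maximum seconds allowed without a checkpoint */
    double last;      /* time of the most recent checkpoint */
};

/* A checkpoint_here() call is honored only once mintime has elapsed. */
int sync_request_allowed(const struct ckpt_policy *p, double now) {
    return now - p->last >= p->mintime;
}

/* The timer forces a checkpoint once maxtime has elapsed. */
int forced_checkpoint_due(const struct ckpt_policy *p, double now) {
    return now - p->last >= p->maxtime;
}

/* Record that a checkpoint of either kind was just taken. */
void record_checkpoint(struct ckpt_policy *p, double now) {
    p->last = now;
}
```

With mintime = 5 and maxtime = 60, a request 3 seconds after the last checkpoint is ignored, one at 5 seconds is honored, and if no request arrives the timer forces a checkpoint at 60 seconds.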
Combining Mem. Exclusion and Sync. Checkpointing

Before:
main() {
    D = malloc
    f = file
    while (!done) {
        D = read(f)
        perform_calc(D)
        output_result()
    }
}

After:
ckpt_target() {
    D = malloc
    f = file
    while (!done) {
        D = read(f)
        perform_calc(D)
        output_result()
        exclude_bytes(D, …, CKPT_DEAD)
        checkpoint_here()
        include_bytes(D, …)
    }
}