Transaction-Oriented Database Recovery
Application Programmer (e.g., business analyst, data architect) • Application • Sophisticated Application Programmer (e.g., SAP admin) • Query Processor • Indexes • Storage Subsystem • Concurrency Control • Recovery • DBA, Tuner • Operating System • Hardware [Processor(s), Disk(s), Memory]
Outline • Principles of transaction-oriented database recovery • Recovery tuning
Transaction-Oriented Database Recovery • Transaction properties • A: Atomicity • C: Consistency • I: Isolation • D: Durability • A database is transaction-consistent (or logically consistent) iff it contains the results of successful transactions
Failures To Recover From • Transaction failure • Self- or system-initiated abort • Recovery should take about as long as a normal transaction • 10-100 times per minute • System failure • OS or DBMS crash • Recovery should take about as long as executing all interrupted transactions • A few times per week • Media failure • Disk crash • Recovery takes hours • A few times per year
Recovery Actions • Transaction UNDO: roll back a specific active transaction • Global UNDO: roll back all active transactions • Partial REDO: reinstate some committed transactions • Global REDO: reinstate all committed transactions
Failure Type | Recovery Action
Transaction | Transaction UNDO
System | Global UNDO, Partial REDO
Media | Global REDO
Log for UNDO/REDO • Logical logging: operators and their arguments • Requires atomic actions from the physical layer • Not always possible or justifiable • Physical state logging • Before and/or after image • Physical transition logging • Use XOR, which is commutative and associative • Log entry = before image XOR after image • Log entry XOR before image = after image (REDO) • Log entry XOR after image = before image (UNDO) • Lower space consumption (1 entry per change; long runs of 0s compress well when only a few bytes change)
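The XOR trick behind physical transition logging can be sketched in a few lines. This is a minimal illustration, not any particular DBMS's log format; the function names and the sample record are made up for the example.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length images."""
    if len(a) != len(b):
        raise ValueError("images must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def make_transition_entry(before: bytes, after: bytes) -> bytes:
    # A single entry serves both directions because XOR is its own inverse.
    return xor_bytes(before, after)


def redo(before: bytes, entry: bytes) -> bytes:
    # before XOR entry = after
    return xor_bytes(before, entry)


def undo(after: bytes, entry: bytes) -> bytes:
    # after XOR entry = before
    return xor_bytes(after, entry)


# Hypothetical record images of equal length; only two bytes differ.
before = b"balance=0100;name=ann"
after = b"balance=0250;name=ann"
entry = make_transition_entry(before, after)
changed_bytes = sum(1 for x in entry if x != 0)
```

Every unchanged byte becomes 0 in the entry, which is why a small update yields a log entry that compresses well.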
System Framework Source: T. Haerder, A. Reuter
Log Timing • UNDO entries must reach log file before changes are written out – Write-Ahead Logging (WAL) principle • To enable roll-back if necessary • REDO entries must reach log file before End-Of-Transaction (EOT) is acknowledged • To enable re-instatement after failure
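The two timing rules can be sketched with a toy in-memory model. Everything here (the class `MiniWAL`, its fields, the tuple layout of a log entry) is an assumption made for illustration; a real system would write to a log file and issue an fsync where this sketch moves entries between lists.

```python
class MiniWAL:
    """Toy model: log_tail is volatile, stable_log stands in for the log file."""

    def __init__(self):
        self.log_tail = []    # entries not yet on stable storage
        self.stable_log = []  # entries that reached the log file
        self.buffer = {}      # page_id -> (value, LSN of last change)
        self.disk = {}        # page_id -> value
        self.next_lsn = 0

    def update(self, txn, page_id, new_value):
        old = self.buffer.get(page_id, (self.disk.get(page_id), None))[0]
        lsn = self.next_lsn
        self.next_lsn += 1
        # One entry carries both the UNDO (old) and the REDO (new) image.
        self.log_tail.append((lsn, txn, page_id, old, new_value))
        self.buffer[page_id] = (new_value, lsn)
        return lsn

    def flush_log(self, up_to_lsn):
        while self.log_tail and self.log_tail[0][0] <= up_to_lsn:
            self.stable_log.append(self.log_tail.pop(0))

    def write_page(self, page_id):
        value, lsn = self.buffer[page_id]
        self.flush_log(lsn)         # WAL: UNDO information reaches the log first
        self.disk[page_id] = value  # only then is the page propagated

    def commit(self, txn):
        lsn = self.next_lsn
        self.next_lsn += 1
        self.log_tail.append((lsn, txn, None, None, "EOT"))
        self.flush_log(lsn)         # REDO entries are stable before the acknowledgement
        return "committed"
```

In this model a page can never reach `disk` while its log entry is still in `log_tail`, and `commit` returns only after the transaction's entries are in `stable_log`.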
Dependency with Buffer Management • UNDO • STEAL: modified pages may be written out at any time • ~STEAL: modified pages are kept in the buffer until after the transaction commits (large buffers required; no global UNDO; transaction UNDO within memory; no logging required for UNDO) • REDO • FORCE: all modified pages are written out during EOT (no need to log for partial REDO; logging still needed for global REDO) • ~FORCE: no propagation during EOT • At least one of global UNDO or partial REDO is always required. Why?
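The dependency between buffer policy and crash-recovery work can be written down as a small lookup. This is a sketch of the slide's rules only; the function name is hypothetical, and the ~STEAL with FORCE case is modelled under the idealized assumption that all of a transaction's pages reach disk atomically at EOT.

```python
def crash_recovery_actions(steal: bool, force: bool) -> set:
    """Return the crash-recovery actions implied by the buffer policy."""
    actions = set()
    if steal:
        # Uncommitted changes may already be on disk, so they must be rolled back.
        actions.add("global UNDO")
    if not force:
        # Committed changes may exist only in the lost buffer, so they must be redone.
        actions.add("partial REDO")
    # Assumption: with ~STEAL and FORCE the result is empty only if the EOT write
    # of several pages is atomic. A crash in the middle of that write would leave
    # a partial result on disk, which needs either UNDO or REDO after all.
    return actions


steal_noforce = crash_recovery_actions(steal=True, force=False)
nosteal_noforce = crash_recovery_actions(steal=False, force=False)
steal_force = crash_recovery_actions(steal=True, force=True)
```

Media failure is outside this table: global REDO from the archive needs logging under every policy.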
Checkpointing to Optimize Recovery • Problem • With LRU buffer replacement, frequently used pages will remain in buffer • Partial REDO has to go back very far • Checkpointing limits amount of partial REDO • Checkpoint • Write BEGIN-CHECKPOINT to temporary log • Write checkpoint data to log • Write END-CHECKPOINT to temporary log
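The BEGIN/END bracket lets recovery ignore a checkpoint that was interrupted by the crash. The sketch below assumes a log modelled as a list of tuples and an invented checkpoint identifier; neither comes from the slides.

```python
def write_checkpoint(log, ckpt_id, data):
    """Append the three checkpoint records in the order the slide lists them."""
    log.append(("BEGIN-CHECKPOINT", ckpt_id))
    log.append(("CHECKPOINT-DATA", ckpt_id, data))
    log.append(("END-CHECKPOINT", ckpt_id))


def last_complete_checkpoint(log):
    """Index of the BEGIN record of the newest checkpoint that also has an END record."""
    ended = set()
    for i in range(len(log) - 1, -1, -1):  # scan backwards from the crash point
        record = log[i]
        if record[0] == "END-CHECKPOINT":
            ended.add(record[1])
        elif record[0] == "BEGIN-CHECKPOINT" and record[1] in ended:
            return i
    return None


log = [("UPDATE", "T1")]
write_checkpoint(log, 1, {"active": ["T1"]})
log.append(("UPDATE", "T2"))
log.append(("BEGIN-CHECKPOINT", 2))  # crash before END-CHECKPOINT was written
start = last_complete_checkpoint(log)
```

Here checkpoint 2 has no END record, so recovery falls back to checkpoint 1.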
Crash Recovery with Checkpoint • [Figure: timeline marking the oldest page in buffer, the checkpoint, and the crash, with transactions T1 to T5 labeled Nothing, REDO, or UNDO] • Recovery process: Analyze, UNDO, REDO
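The analysis step can be sketched as a single pass over the log that sorts transactions into the three classes named in the figure. The record layout and the transaction names below are invented, and the sketch assumes a checkpoint that wrote out all modified pages, so that transactions committed before it need no work.

```python
def analyze(log):
    """Classify each transaction as 'nothing', 'REDO', or 'UNDO'."""
    checkpoint_pos = max(
        (i for i, record in enumerate(log) if record[0] == "CHECKPOINT"),
        default=-1,
    )
    plan = {}
    for i, record in enumerate(log):
        kind = record[0]
        if kind == "BOT":
            # Assume the worst until an EOT record is seen.
            plan[record[1]] = "UNDO"
        elif kind == "EOT":
            # Committed after the checkpoint: changes may be missing on disk.
            plan[record[1]] = "REDO" if i > checkpoint_pos else "nothing"
    return plan


# Hypothetical log ending at the crash.
log = [
    ("BOT", "A"), ("EOT", "A"),
    ("BOT", "B"),
    ("CHECKPOINT",),
    ("EOT", "B"),
    ("BOT", "C"),
]
plan = analyze(log)
```

Transactions without an EOT record at the crash are rolled back; the UNDO and REDO passes then act on this plan.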
Transaction-Oriented Checkpoint (TOC) • FORCE implies TOC • The EOT record acts as both BEGIN-CHECKPOINT and END-CHECKPOINT • Frequently used pages need to be written out each time a transaction commits • Not suitable for large applications Source: T. Haerder, A. Reuter
Transaction-Consistent Checkpoint (TCC) Source: T. Haerder, A. Reuter
Transaction-Consistent Checkpoint (TCC) • When checkpoint generation is triggered • All new update transactions are put on hold • All incomplete update transactions are completed • All modified pages are written out • Both REDO and UNDO are bounded • REDO starts from the latest checkpoint • UNDO goes back no further than the latest checkpoint • Drawbacks • Delays new update transactions; not suitable for large multi-user DBMSs • High checkpointing costs
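The TCC protocol can be sketched as a single-threaded simulation. The class `TCCManager` and its methods are hypothetical; a real implementation would block threads where this sketch queues transaction names.

```python
class TCCManager:
    def __init__(self):
        self.active = set()   # running update transactions
        self.waiting = []     # update transactions held back by a pending checkpoint
        self.dirty = {}       # modified pages in the buffer
        self.disk = {}
        self.log = []
        self.checkpoint_pending = False

    def begin_update(self, txn):
        if self.checkpoint_pending:
            self.waiting.append(txn)  # new update transactions are put on hold
            return False
        self.active.add(txn)
        return True

    def write(self, txn, page, value):
        if txn not in self.active:
            raise RuntimeError("transaction is not active")
        self.dirty[page] = value

    def commit(self, txn):
        self.active.remove(txn)
        self.log.append(("EOT", txn))
        self._maybe_checkpoint()

    def request_checkpoint(self):
        self.checkpoint_pending = True
        self._maybe_checkpoint()

    def _maybe_checkpoint(self):
        if not self.checkpoint_pending or self.active:
            return  # incomplete update transactions must finish first
        self.disk.update(self.dirty)  # write out all modified pages
        self.dirty.clear()
        self.log.append(("CHECKPOINT",))
        self.checkpoint_pending = False
        self.active.update(self.waiting)  # admit the delayed transactions
        self.waiting.clear()
```

The drawback is visible in `begin_update`: every update transaction that arrives between the request and the checkpoint has to wait for the slowest active transaction plus the page writes.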
Action-Consistent Checkpoint (ACC) Source: T. Haerder, A. Reuter
Action-Consistent Checkpoint (ACC) • When checkpoint generation is triggered • All new actions are put on hold • All incomplete actions are completed • Write out all modified pages • Less disruptive than TCC • Partial REDO only from the most recent checkpoint • Global UNDO not bounded • Still costly when buffers are large
Fuzzy ACC • During checkpointing, the page numbers of all dirty pages in the buffer are written to the log • If a modified page was already listed in the previous checkpoint and has not been written out since then, write it out now • Partial REDO starts from the penultimate checkpoint
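A fuzzy checkpoint can be sketched as follows. The class and field names are invented for illustration, and the sketch records the dirty-page list after forcing out the stale pages, which is one possible reading of the slide.

```python
class FuzzyCheckpointer:
    def __init__(self):
        self.dirty = {}             # page number -> value, modified pages in buffer
        self.disk = {}
        self.log = []
        self.prev_listed = set()    # page numbers recorded by the previous checkpoint
        self.flushed_since = set()  # pages written out since the previous checkpoint

    def modify(self, page, value):
        self.dirty[page] = value

    def write_out(self, page):
        self.disk[page] = self.dirty.pop(page)
        self.flushed_since.add(page)

    def checkpoint(self):
        # Pages listed last time that are still dirty and were never written since.
        stale = [p for p in self.dirty
                 if p in self.prev_listed and p not in self.flushed_since]
        for page in stale:
            self.write_out(page)
        listed = set(self.dirty)
        self.log.append(("CHECKPOINT", sorted(listed)))  # only page numbers are logged
        self.prev_listed = listed
        self.flushed_since = set()
        return stale


f = FuzzyCheckpointer()
f.modify(1, "a")
f.modify(2, "b")
first_stale = f.checkpoint()   # nothing is forced out at the first checkpoint
f.write_out(2)                 # normal buffer replacement writes page 2
f.modify(2, "b2")
f.modify(3, "c")
second_stale = f.checkpoint()  # page 1 survived a full interval, so it is forced out
```

Because every page that was dirty at one checkpoint is on disk by the next one, partial REDO never has to look further back than the penultimate checkpoint.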
Archive Recovery • Make sure the two paths are independent! Source: T. Haerder, A. Reuter
Multi-Generation Archive Copies • Archive copies are accessed very infrequently • Subject to magnetic decay • Keep several generations Source: T. Haerder, A. Reuter
Duplicate Archive Logs Source: T. Haerder, A. Reuter
Duplicate Archive Logs • The archive log must extend back to the oldest archive copy • The log is susceptible to magnetic decay as well • Duplicate the archive log • Both archive logs must be synchronized with the temporary log at EOT • Very expensive!
Decouple Archive Logs from EOT Source: T. Haerder, A. Reuter
Decouple Archive Logs from EOT • Log entries written only to temporary log during EOT • Asynchronous process copies REDO entries to archive log • Need to replicate temporary log • Synchronize both temporary logs at EOT
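The decoupling can be sketched with an explicit archiving step standing in for the asynchronous process. The class name, the entry layout, and the choice to copy EOT records along with REDO entries are assumptions made for this example.

```python
class DecoupledLogs:
    def __init__(self):
        self.temp_log_a = []    # temporary log
        self.temp_log_b = []    # its replica, written at the same time
        self.archive_log = []
        self.archived_upto = 0  # position in the temporary log already copied

    def end_of_transaction(self, txn, entries):
        # Synchronous part of EOT: only the two temporary logs are written.
        for entry in list(entries) + [("EOT", txn)]:
            self.temp_log_a.append(entry)
            self.temp_log_b.append(entry)

    def archive_step(self, max_entries=100):
        # Stand-in for the asynchronous copier; it runs outside the commit path.
        batch = self.temp_log_a[self.archived_upto:self.archived_upto + max_entries]
        for entry in batch:
            if entry[0] in ("REDO", "EOT"):  # UNDO entries are not needed in the archive
                self.archive_log.append(entry)
        self.archived_upto += len(batch)
        return len(batch)


logs = DecoupledLogs()
logs.end_of_transaction("T1", [("UNDO", "T1", "p", "old"), ("REDO", "T1", "p", "new")])
archived_before = list(logs.archive_log)
copied = logs.archive_step()
```

The commit path touches only the temporary logs; until `archive_step` catches up, the replicated temporary log is the only protection for the entries not yet archived, which is why it has to be duplicated.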
Summary • Failure types
Failure Type | Recovery Action
Transaction | Transaction UNDO
System | Global UNDO, Partial REDO
Media | Global REDO
• Crash recovery • TOC: per transaction • TCC: transaction boundary • ACC: action boundary • Archive recovery • Multi-generation archive copies • Duplicate archive logs • Decouple archive log from EOT