ICS 214A: Database Management Systems Fall 2002 Lecture 17: Checkpoints Professor Chen Li
Recovery is very, very SLOW!
[Figure: an undo log from First Record to Last Record (1 year ago), followed by a crash]
• We do not want to rescan all the log records!
• Some of them can be removed.
Notes 17
Solution: Checkpoint (simple version)
Periodically:
(1) Do not accept new transactions (“quiescent”)
(2) Wait until all current transactions finish
(3) Flush all log records to disk
(4) Flush all data buffers to disk
(5) Write log record <CKPT> and flush the log
(6) Resume accepting transactions
Example: Undo log, quiescent ckpt
Log:
    <T1, START>
    <T1, A, 5>
    <T2, START>
    <T2, B, 10>
Do a checkpoint:
• Wait until both T1 and T2 finish (commit or abort);
• Then flush the data and log, and write <CKPT> to the log.
Final log:
    <T1, START>
    <T1, A, 5>
    <T2, START>
    <T2, B, 10>
    <T2, C, 15>
    <T1, D, 20>
    <T1, COMMIT>
    <T2, COMMIT>
    <CKPT>
    …
Recovery: Undo log, quiescent ckpt
Log after a crash:
    <T1, START>
    <T1, A, 5>
    <T2, START>
    <T2, B, 10>
    <T2, C, 15>
    <T1, D, 20>
    <T1, COMMIT>
    <T2, COMMIT>
    <CKPT>
    <T3, START>
    <T3, E, 25>
    <T3, F, 30>
• Scan the log backwards from the end and identify incomplete transactions
• On reaching a <CKPT> record, stop and ignore every record before this <CKPT>
• Why? All transactions that started before this ckpt must have finished, and their changes were flushed to disk.
• Other operations are the same as before
• Example:
    • T3 is the only incomplete transaction
    • Undo F and E (restore F to 30, then E to 25). Write <T3, ABORT>
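The backward scan above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tuple layout of the log records, the dict standing in for the on-disk database, and the function name `recover_undo_quiescent` are all hypothetical and not part of the lecture.

```python
def recover_undo_quiescent(log, db):
    """Undo incomplete transactions, scanning backwards and stopping at <CKPT>.

    Hypothetical record layout (an assumption, not from the slides):
      ("START", txn), ("UPDATE", txn, element, old_value),
      ("COMMIT", txn), ("ABORT", txn), ("CKPT",)
    `db` is a dict standing in for the database on disk.
    Returns the <T, ABORT> records that recovery would append to the log.
    """
    finished = set()    # transactions whose COMMIT/ABORT we have passed
    incomplete = []     # transactions with a START but no COMMIT/ABORT
    for rec in reversed(log):
        kind = rec[0]
        if kind == "CKPT":
            break       # everything before the checkpoint has finished
        if kind in ("COMMIT", "ABORT"):
            finished.add(rec[1])
        elif kind == "UPDATE":
            _, txn, element, old_value = rec
            if txn not in finished:
                db[element] = old_value   # restore the old value
        elif kind == "START":
            if rec[1] not in finished:
                incomplete.append(rec[1])
    return [("ABORT", txn) for txn in incomplete]
```

On the log shown above, the scan touches only the three T3 records before it reaches <CKPT>, which is the point of the checkpoint.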
Nonquiescent checkpoint (undo)
• We don’t want the system to “halt” to do a checkpoint
• How can we accept xacts during a checkpoint?
• Write (flush) log record <START CKPT (T1,…,Tk)>, where T1,…,Tk are the active (not finished) transactions.
• Wait until they finish (commit or abort). Meanwhile, accept new transactions.
• After these k transactions complete, write (flush) a log record <END CKPT>.
Ex: Undo log, nonquiescent ckpt
Undo Log:
    <T1, START>
    <T1, A, 5>
    <T2, START>
    <T2, B, 10>
    <START CKPT(T1,T2)>    (start checkpointing)
    <T2, C, 15>            (continue, accept new xacts, until T1 and T2 complete)
    <T3, START>
    <T1, D, 20>
    <T1, COMMIT>
    <T3, E, 25>
    <T2, COMMIT>
    <END CKPT>             (end checkpointing)
    <T3, F, 30>            (continue)
Recovery: Undo log, nonquiescent ckpt (case 1)
• Scan the log backwards from the end
• Case 1: we meet an <END CKPT> first
    • Then all incomplete xacts began after the previous <START CKPT(…)> log record
    • Thus we only need to scan backwards until the previous <START CKPT(…)> log record
    • Ignore the log before this record
• Ex:
    • T3 is the only incomplete xact, and should be undone
    • Restore data element F back to 30 (and, continuing the scan, E back to 25).
Log:
    <T1, START>
    <T1, A, 5>
    <T2, START>
    <T2, B, 10>
    <START CKPT(T1,T2)>
    <T2, C, 15>
    <T3, START>
    <T1, D, 20>
    <T1, COMMIT>
    <T3, E, 25>
    <T2, COMMIT>
    <END CKPT>
    <T3, F, 30>
Recovery: Undo log, nonquiescent ckpt (case 2)
• Scan the log backwards from the end
• Case 2: we meet a <START CKPT(T1,…,Tk)> first (the crash happened during the checkpoint)
    • Then the incomplete xacts are:
        • Those incomplete xacts we met before reaching this <START CKPT()> log record; and
        • Those of (T1,…,Tk) that are incomplete
    • Thus we need to scan back to the start of the earliest incomplete xact
    • Discard the log records before that point
    • Undo the incomplete xacts
• Ex:
    • Incomplete xacts: (T2, T3)
    • T1 is complete!
    • Scan until the start of T2 (the earliest)
Log:
    <T1, START>
    <T1, A, 5>
    <T2, START>
    <T2, B, 10>
    <START CKPT(T1,T2)>
    <T2, C, 15>
    <T3, START>
    <T1, D, 20>
    <T1, COMMIT>
    <T3, E, 25>
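Both recovery cases for an undo log with nonquiescent checkpoints can be combined in one backward scan. The sketch below is a hedged illustration, not the lecture's implementation: the tuple record layout, the dict `db`, and the name `recover_undo_nonquiescent` are assumptions made for the example.

```python
def recover_undo_nonquiescent(log, db):
    """Backward scan for an undo log with nonquiescent checkpoints.

    Hypothetical record layout (an assumption, not from the slides):
      ("START", txn), ("UPDATE", txn, element, old_value),
      ("COMMIT", txn), ("ABORT", txn),
      ("START_CKPT", (txn, ...)), ("END_CKPT",)
    Returns the incomplete transactions in the order they were identified.
    """
    finished = set()
    incomplete = []
    seen_end = False   # True once an <END CKPT> is met first (case 1)
    pending = None     # case 2: listed xacts whose START is still ahead
    for rec in reversed(log):
        kind = rec[0]
        if kind in ("COMMIT", "ABORT"):
            finished.add(rec[1])
        elif kind == "UPDATE":
            _, txn, element, old_value = rec
            if txn not in finished:
                db[element] = old_value          # undo
        elif kind == "START":
            txn = rec[1]
            if txn not in finished:
                incomplete.append(txn)
            if pending is not None:
                pending.discard(txn)
                if not pending:
                    break    # reached the earliest incomplete xact
        elif kind == "END_CKPT":
            if pending is None:
                seen_end = True
        elif kind == "START_CKPT" and pending is None:
            if seen_end:
                break        # case 1: stop at the matching <START CKPT>
            pending = set(rec[1]) - finished     # case 2
            if not pending:
                break
    return incomplete
```

In case 2 the scan continues past <START CKPT(T1,T2)> only as far as <T2, START>, so the records of T1 before that point are never read.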
Improvement
Log:
    <T1, START>
    <T1, A, 5>
    <T2, START>
    <T2, B, 10>
    <START CKPT(T1,T2)>
    <T2, C, 15>
    <T3, START>
    <T1, D, 20>
    <T1, COMMIT>
    <T3, E, 25>
• Use pointers to chain together the log records of the same xact
• Then we can follow the chain to find the “start” record of this xact.
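One way to picture the pointer chain is to store, with each record, the index of the previous record of the same xact. The sketch below assumes a hypothetical tuple layout for log records and uses list indices as "pointers"; a real log would more likely store byte offsets or log sequence numbers.

```python
def add_prev_pointers(log):
    """Pair each record with the index of the previous record of the same xact.

    Hypothetical record layout (an assumption, not from the slides):
      ("START", txn), ("UPDATE", txn, element, old_value),
      ("COMMIT", txn), ("ABORT", txn), ("START_CKPT", (txn, ...)), ("END_CKPT",)
    Checkpoint records belong to no xact and get a None pointer.
    """
    last_seen = {}   # txn -> index of its most recent record
    chained = []
    for index, rec in enumerate(log):
        if rec[0] in ("START", "UPDATE", "COMMIT", "ABORT"):
            txn = rec[1]
            chained.append((rec, last_seen.get(txn)))
            last_seen[txn] = index
        else:
            chained.append((rec, None))
    return chained


def find_start(chained, index):
    """Follow the chain from any record of a xact back to its START record."""
    while chained[index][1] is not None:
        index = chained[index][1]
    return index
```

Starting from <T2, C, 15> in the log above, the chain visits only <T2, B, 10> and <T2, START>, skipping every record of the other xacts.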
General rule: Undo log, nonquiescent ckpt
Log:
    <T1, START>
    <T1, A, 5>
    <T2, START>
    <T2, B, 10>
    <START CKPT(T1,T2)>
    <T2, C, 15>
    <T3, START>
    <T1, D, 20>
    <T1, COMMIT>
    <T3, E, 25>
    <T2, COMMIT>
    <END CKPT>
    <T3, F, 30>
• Once an <END CKPT> record has been written to disk, we can delete the log prior to the previous (matching) <START CKPT> record
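The truncation rule can be sketched as follows. The tuple record layout and the function name `truncate_undo_log` are assumptions for illustration, and the sketch assumes a well-formed log in which every <END CKPT> is preceded by a <START CKPT>.

```python
def truncate_undo_log(log):
    """Drop the records that precede the last completed checkpoint.

    Hypothetical record layout (an assumption, not from the slides):
    checkpoint records are ("START_CKPT", (txn, ...)) and ("END_CKPT",).
    If no checkpoint has completed, nothing can be deleted.
    """
    end_positions = [i for i, rec in enumerate(log) if rec[0] == "END_CKPT"]
    if not end_positions:
        return list(log)
    end_index = end_positions[-1]
    # The matching <START CKPT> is the latest one before that <END CKPT>.
    start_index = max(
        i for i in range(end_index) if log[i][0] == "START_CKPT"
    )
    return list(log[start_index:])
```

For the log above, the four records before <START CKPT(T1,T2)> are removed and the remaining nine are kept.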
Next: checkpoint in Redo Logging
Complications
• For a xact whose <COMMIT> log record is already on disk, its changed data elements may be copied to disk much later
• Thus, between a <START CKPT> and an <END CKPT>:
    • We must write to disk all DB elements that have been modified by committed xacts but not yet written to disk
    • We need to keep track of all the dirty buffers
• We can complete the ckpt without waiting for the active (uncommitted) xacts to commit or abort, since they are not allowed to write their pages to disk at that time anyway
Nonquiescent checkpoint (redo)
• Write (flush) log record <START CKPT (T1,…,Tk)>, where T1,…,Tk are the active (uncommitted) xacts.
• Write to disk all DB elements that were written to buffers, but not yet to disk, by xacts that had already committed when the <START CKPT> record was written to the log
• Write (flush) a log record <END CKPT>.
Ex: redo, checkpoint, nonquiescent
Redo Log:
    <T1, START>
    <T1, A, 5>
    <T2, START>
    <T1, COMMIT>
    <START CKPT(T2)>    (start checkpoint)
    <T2, C, 15>         (continue, accept new xacts, make sure A=5 by T1 is on disk)
    <T3, START>
    <T3, D, 20>
    <END CKPT>          (end checkpoint)
    <T2, COMMIT>        (continue)
    <T3, COMMIT>
Recovery: redo, nonquiescent (case 1)
• Scan the log backwards
• Case 1: <END CKPT> is seen before <START CKPT(T1,…,Tk)>
    • All xacts that committed before <START CKPT> have their data element changes on disk. These xacts can be ignored
    • Xacts T1,…,Tk, and the new xacts started after <START CKPT>, need to be redone if they have committed
    • Find the earliest of the <START Ti> records
    • Can use pointers to improve the performance
• Ex:
    • T2 and T3 need to be considered
    • Since both have “COMMIT” records, both need to be redone
Redo Log:
    <T1, START>
    <T1, A, 5>
    <T2, START>
    <T1, COMMIT>
    <START CKPT(T2)>
    <T2, C, 15>
    <T3, START>
    <T3, D, 20>
    <END CKPT>
    <T2, COMMIT>
    <T3, COMMIT>
Recovery: redo, nonquiescent (case 1, continued)
• Ex:
    • T2 and T3 need to be considered
    • Since T2 has a “COMMIT” record, it needs to be redone
    • T3 has no “COMMIT” record, so it can be ignored
Redo Log:
    <T1, START>
    <T1, A, 5>
    <T2, START>
    <T1, COMMIT>
    <START CKPT(T2)>
    <T2, C, 15>
    <T3, START>
    <T3, D, 20>
    <END CKPT>
    <T2, COMMIT>
Recovery: redo, nonquiescent (case 2)
• Scan the log backwards
• Case 2: <START CKPT(T1,…,Tk)> is seen before <END CKPT> (the crash happened during the checkpoint)
    • We cannot be sure that xacts committed prior to this <START CKPT> have their data element changes on disk.
    • Need to find the previous completed checkpoint, i.e. the previous <START CKPT(S1,…,Sm)> that has a matching <END CKPT>
    • Redo those committed xacts that started after that previous <START CKPT> or are among the Si’s
• Ex:
    • Look for the previous <START CKPT>
    • T0 and T1 are the committed xacts that need to be redone
    • T2 and T3 have not committed and are ignored
Redo Log:
    <START CKPT(T0)>
    …
    <T0, COMMIT>
    …
    <END CKPT>
    <T1, START>
    <T1, A, 5>
    <T2, START>
    <T1, COMMIT>
    <START CKPT(T2)>
    <T2, C, 15>
    <T3, START>
    <T3, D, 20>
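Both redo cases reduce to the same steps: locate the last completed checkpoint, collect the xacts it lists plus those started after it, and redo the committed ones. The sketch below is a simplified illustration under assumed names (`recover_redo_nonquiescent`, tuple records, a dict `db`); for brevity it replays from the start of the log and skips irrelevant records, rather than jumping to the earliest relevant <START Ti> record as the slides suggest.

```python
def recover_redo_nonquiescent(log, db):
    """Redo committed xacts using the last completed checkpoint.

    Hypothetical record layout (an assumption, not from the slides):
      ("START", txn), ("UPDATE", txn, element, new_value), ("COMMIT", txn),
      ("START_CKPT", (txn, ...)), ("END_CKPT",)
    Returns the sorted list of xacts that were redone.
    """
    committed = {rec[1] for rec in log if rec[0] == "COMMIT"}
    end_positions = [i for i, rec in enumerate(log) if rec[0] == "END_CKPT"]
    if end_positions:
        # The matching <START CKPT> is the latest one before the last <END CKPT>.
        # A later <START CKPT> without an <END CKPT> (case 2) is ignored.
        ckpt_index = max(
            i for i in range(end_positions[-1]) if log[i][0] == "START_CKPT"
        )
        active_at_ckpt = set(log[ckpt_index][1])
    else:
        ckpt_index, active_at_ckpt = -1, set()   # no checkpoint completed
    started_after = {
        rec[1] for rec in log[ckpt_index + 1:] if rec[0] == "START"
    }
    to_redo = (active_at_ckpt | started_after) & committed
    for rec in log:                              # forward pass
        if rec[0] == "UPDATE" and rec[1] in to_redo:
            db[rec[2]] = rec[3]                  # write the new value again
    return sorted(to_redo)
```

Xacts that committed before the chosen <START CKPT> are never in `to_redo`, which matches the guarantee that their changes were written to disk before <END CKPT>.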
Next: Redo/Undo logging