390 likes | 546 Views
Transactional Recovery and Checkpoints. Difference . How is this different from schedule recovery? It is the details to implementing schedule recovery It is at a lower level (transactional) – Operating System level Writing out dirty data, etc. Recovery is needed for atomicity and durability.
E N D
Difference • How is this different from schedule recovery? • It is the details to implementing schedule recovery • It is at a lower level (transactional) – Operating System level • Writing out dirty data, etc.
Recovery is needed for atomicity and durability • Problems: • Reading • involves buffers, paging and LRU replacement (operating system) • Writing • when to write updates from memory to disk? • When LRU replaces (use dirty bit)? • What if crash and still in memory? • No immediate durability • No atomicity • Solution? • Have a log buffer with log entries and a log file
Log Buffer • Log buffer contains list of log entries • Log entries: Start, Commit, Write
Log Entries • In log entry keep track of: 1) Start of Transactions (S, T#) S1 2) committed T's ( C, T#) C1 3) operations - enter writes but not reads: (W, T#, data_item (or RID), before_value, after_value) (W,1,A,10,20) • why before values? • in case must UNDO • why after values? • in case must REDO
Log buffer and Log file • Log buffer stored in memory • Log file is a copy of the log buffer on disk (permanent storage) • Log buffer written to log file at intervals • Assume written when transaction commits • Log file used to perform recovery • Log buffer and log file guarantees atomicity and durability
Log files vs. Data file • 2 different concepts: 1) Log buffer entries written to log file vs. 2) Making changes to actual data on disk • DB data file is stored on disk (permanent storage) • Updated data pages (dirty pages) are cached in memory • Data file on disk not updated with dirty pages until some event (discussed later)
ACID properties For atomicity Rollback - make list of committed T's and UNDO uncommitted T's actions if written to data disk For durability Rollforward - to REDO committed T's actions if not written to data disk NOTE: can only UNDO, REDO what is in log file NOT log buffer
Example R1(A,50) W1(A,20) R2(C,100)W2(C,50)C2 R1(B,50)W1(B,8)C1 1 2 3 4 5 6 7 8 9 10 Log entries: 1. (S1) 2. no entry 3. (W,1,A,50,20) 4. (S2) 5. no entry 6. (W,2,C,100,50) 7. (C2) write log buffer to log file 8. no entry 9. (W,1,B,50,80) 10. (C1) write log buffer to log file What if crash on operation 9?
How to recover from a crash • Must use information from non-volatile (permanent) storage • If not in permanent storage, it is lost • If ever restarted, it is assumed to be later and NOT part of the recovery • Method: • For each entry in log (move backwards in log) determine if should ignore or undo • Then (move forward in log) and redo relevant entries
One way to recover from a crash T1 not committed, T2 committed – what to do? Must undo T1 and redo T2 Rollback until log file empty 1. C2 - T2 on committed list 2. (W,2,C,100,50) C2 on list, do nothing 3. (S2) T2 no longer active 4. (W,1,A,50,20) T1 not on the committed list - UNDO this update only if written to data disk (NOTE: Differs from textbook – says can always UNDO) 5. (S1) T1 no longer active
How to recover cont’d Rollforward 6. S1 - no action 7. (W,1,A,50,20) T1 uncommitted - no action 8. (S,2) no action 9. (W,2,C,100,50) Redo update – assume not on data disk yet 10. (C,2) no action 11. done Note: T1 may be restarted later, but not part of recovery
Durability Problems - WAL • Durability - commits successful only once log file written on disk • if failure and log entry not written? • can never UNDO or REDO • Possible Solutions – Write Ahead Log (WAL) • Log Buffer written to log file when a Transaction commits (which we did) • Write dirty pages when log buffer full
Dirty Pages Dirty data pages not written until commit Atomicity Problem: Be sure dirty pages of data not written to disk before log entry – OS Solution: Can modify LRU replacement to ensure data not written to disk before written to log file
Dirty Pages If updated data page not written to disk until commit (UNDO) • no UNDO processing ever • no before images needed in logs • What if lot of updates? can't keep all in memory – so write whenever full?
Dirty Pages • Additionally: Write dirty pages when full • How to ensure dirty pages not written until log buffer written to disk (REDO) • Write ahead log guarantee (WAL) log sequence number (LSN)- • every log entry keeps track of smallest LSN to log file since last write (lsn_buffmin) • keep track of updates to data pages (lsn_pgmax) • cannot write page to disk unless lsn_pgmax < lsn_buffmin
When to write to log file • Is ACID guaranteed? • Checkpoints: Provides an approach to address ACID properties • Always write to log buffer when commit • Also determines when to write to data file on disk
Checkpoints • Recovery Checkpoints used so only have to rollback to a certain point • A Checkpoint • Consistent snapshot of all of the database - values on durable disk “A Checkpoint is a database event which synchronizes the modified data blocks in memory with the datafiles on disk.”
Checkpoint CKPT – my definition • A CKPT reflects current state of database • A CKPT includes data updates from all newly committed transactions • and possibly updates from uncommitted transactions • Assume dirty data only written at CKPT
Recovery • T's short-lived (take a few seconds) therefore rollback is quick - only a few T's active that need to be UNDONE • Rollforward takes longer - many T's to REDO, keep track of start T's, etc.
Checkpoint strategies • At checkpoint: • log buffer written to log file • Updated pages written to data file on disk • Still assume log buffer always written to log file at commit • 3 different approaches for when to take checkpoints • Commit consistent • Cache consistent • Fuzzy consistent
Commit consistent 1. Commit consistent -needed when count of log events exceeds some limit Enter checkpoint state: a) no new T's start until checkpoint complete b) DB processing continues until all existing T's commit and log entries out on disk c) current log buffer written to log file, all dirty pages written to disk d) when a)-c) complete, special log entry CKPT entered (these steps are the same as an orderly shutdown of the system )
Commit consistent cont’d • To recover: • Rollback? • Rollforward? • start at CKPT, then REDO committed since CKPT • Problems • But what if some transactions are long-lived? • must wait a long-time for them to finish, with no new T's active
Cache-consistent – Oracle, Informix 2. Cache-consistent checkpoint - aim to reduce time if long transactions a) No new T's permitted to start b) existing T's cannot start any new ops c) current log buffer written to disk, all dirty data pages to disk (even it not committed) d) log entry (CKPT, list of active T's) written on log file on disk
Cache-consistent cont’d • To recover: • must rollback past CKPT • Rollback - Starting at last CKPT, keep rolling back until UNDO uncommitted in active list in CKPT • Rollforward – Starting at last CKPT, REDO all updates by committed T's
Cache-consistent cont’d • Problems: • Time to flush dirty pages to disk
Fuzzy - Informix 3. Fuzzy checkpoint aim to reduce time to perform a checkpoint makes use of 2 checkpoint events; CKPTn-1 and CKPTn a) no new T's starts - existing T's no new ops b) current log buffer written to disk with CKPTn c) set of dirty pages in buffer that accumulated since CKPTn-1 is noted background process makes sure pages written out on data file disk by next checkpoint CKPTn+1 e) Note: prior to checkpoint, remaining pages dirty between CKPT n-2 and CKPT n-1 forced to disk
Fuzzy • Rollforward? • starts with first log entry following 2nd to last checkpoint log CKPTn-1 ...... CKPTn (start at CKPTn-1) • set of dirty pages since CKPTn-1 will written to disk by CKPTn+1 (buffer flushing at time of CKPTn if not completed yet) • REDO committed transactions starting at CKPTn-1 • Rollback? • UNDO uncommitted • But has it been written to disk yet? (complicated in reality)
Important observations • Only UNDO • if data written to data disk at CKPT before transaction committed (didn’t commit before checkpoint) • Only REDO • if committed transaction’s data not written to data disk (committed after checkpoint)
Log buffer, Log file, Data file and Checkpointing • Log buffer contains list of operations (Start, Commit, Write); it is stored in memory. • Log file is a copy of the log buffer on disk (permanent storage). • Data file is stored on disk (permanent storage). Updated data pages (dirty pages) are cached in memory. Data file on disk not updated with dirty pages until CKPT. • A CKPT reflects current state of database. It includes data updates from all newly committed transactions, and even updates from uncommitted transactions (if cache or fuzzy checkpoint).
Events • At CKPT: • Updated pages written to data file on disk and log buffer written to log file. • At a COMMIT: • log buffer also written to log file, but updated data not written to disk. • UNDO: • when data written to data disk and transaction doesn’t commit. • REDO: • when transaction committed but not written to data disk yet.
Rollback to where? • Commit consistent: • To CKPT • Cache consistent: • Past CKPT until no active transactions if cache consistent • Fuzzy consistent: • To CKPTn-1 if fuzzy consistent • Which strategy does UNDO? • cache consistent • sometimes fuzzy.
Rollforward • REDO when? • only if Commit has occurred after a CKPT (so dirty data not written to disk yet). • Can only REDO operations entered in log file. • If operation in log buffer, but not written to log file yet, do not REDO.
After recovery • After recovery using CKPTs: • Data file on disk in a consistent state • What is on the data disk? • All committed transactions? • updates recorded on data disk • All updates from transactions not committed? • not recorded on data disk
Checkpoints in Oracle • The mechanism of writing modified blocks on disk in Oracle is not synchronized with the commit of the corresponding transactions. • all database changes up to the checkpoint are recorded in the data disk, making it unnecessary to apply redo log entries prior to the checkpoint. • Which consistency? • cache consistent
Checkpoints in Oracle • Oracle writes the dirty buffers to disk only on certain conditions: • Every three seconds • When a checkpoint is produced
Checkpoints in Oracle • A checkpoint is realized on five types of events: • At each switch of the redo log files • When the delay for LOG_CHECKPOINT_TIMEOUT is reached. • When the size in bytes corresponding to: (LOG_CHECKPOINT_INTERVAL* size of IO OS blocks) is written on the current redo log file. • Directly by the ALTER SYSTEM SWITCH LOGFILE command. • Directly with the ALTER SYSTEM CHECKPOINT command
Checkpoints in Oracle • During a checkpoint the following occurs: • The database writer (DBWR) writes all modified database blocks in the buffer cache back to data disk (cache consistent) • Log writer (LGWR) updates both the controlfile and the datafiles to indicate when the last checkpoint occurred (SCN).
Informix • Checkpoints in Informix • Full (cache-consistent) • Fuzzy (uses WAL) • http://publib.boulder.ibm.com/infocenter/idshelp/v10/index.jsp?topic=/com.ibm.admin.doc/admin559.htm