410 likes | 613 Views
Lecture plan. Transaction processing Concurrency control Recovery techniques. Transaction processing. A transaction is a logical unit of DB processing, consisting of one or more DB access operations Transaction boundaries may be specified implicitly or explicitly
E N D
Lecture plan • Transaction processing • Concurrency control • Recovery techniques
Transaction processing • A transaction is a logical unit of DB processing, consisting of one or more DB access operations • Transaction boundaries may be specified implicitly or explicitly • Transactions are recorded in system log, kept on disk
T1 (successful) Begin_transaction; Read_item(X); X := X - 10; Write_item(X); Read_item(Y); Y := Y + N; Write_item(Y); End_transaction; Commit(T1); T2 (unsuccessful) Begin_transaction; Read_item(X); X := X - 10; Write_item(X); Read_item(Y); [transaction fails] Abort(T2); [possible rollback] Example transactions
Transaction properties [1] • Must hold for every transaction for the DB to remain stable • ACID properties • Atomicity • A transaction should be treated as an indivisible unit • Managed by transaction recovery subsystem • Consistency preservation • A transaction must transform the database from one consistent state to another consistent state • Managed by programmers / DBMS module
Transaction properties [2] • Isolation • Transactions should execute independently of each other • Managed by concurrency control subsystem • Durability • Effects of a successful transaction must be permanently recorded in the DB • Managed by recovery subsystem
Transaction schedules [1] • Ordering of operations of all transactions • Potential multiple users • Potential multiple processors • Two operations conflict if • They belong to different transactions • They access the same item • At least one of the operations is a write_item(X)
Transaction schedules [2] • Complete schedule • All operations from all transactions present • Commit or abort operation must be last in each transaction • Any pair of operations from same transaction appear in correct order • For any pair of conflicting operations, one must occur first in schedule • Partial order of operations • Two non-conflicting operations may occur simultaneously • Committed projection of schedule • Only operations from committed transactions
Schedule criteria • Recoverability • Should not be necessary to rollback, i.e. undo write operations to DB, after commit point • Transaction only commits when all other transactions writing to common data item have committed • Other transactions writing to common data items should not abort before transaction reads • Avoidance of cascading rollback • All transactions only read items written by committed transactions • Strictness • Transactions cannot read or write items until last transaction to write item has committed or aborted
Serial and serializable schedules • Serial schedules • Operations of each transaction executed consecutively without interleaved operations • Serializable schedules • Protocol based on committed projection being equivalent to some serial schedule • Serializability guaranteed by concurrency control protocol
Schedule equivalence [1] • Result equivalent • Produces same final DB state • View equivalent • Two schedules have • Same set of transactions and operations • Equivalent conditions on same operations • Same last operation to write an item • So a schedule whose committed projection is view equivalent to some serial schedule is said to be view serializable
Schedule equivalence [2] • Conflict equivalent • Order of any two conflicting operations is the same in both schedules • So a schedule whose committed projection is conflict equivalent to some serial schedule is said to be conflict serializable
Concurrency control • Concurrency control necessary because • Lost update • Temporary update (dirty read) • Incorrect summary • Unrepeatable read
Transaction 1 Read_item(X); X := X - N; Write_item(X); Read_item(Y); Y := Y + N; Write_item(Y); Transaction 2 Read_item(X); X := X + M; Write_item(X); Lost update
Transaction 1 Read_item(X); X := X - N; Write_item(X); Read_item(Y); Transaction fails Transaction 2 Read_item(X); X := X + M; Write_item(X); Temporary update
Transaction 1 Read_item(X); X := X - N; Write_item(X); Read_item(Y); Y := Y + N; Write_item(Y); Transaction 2 Sum := 0; Read_item(A); Sum := Sum + A; Read_item(X); Sum := Sum + X; Read_item(Y); Sum := Sum + Y; Incorrect summary
Transaction 1 Read_item(Y); Read_item(X); Y := Y + X; Write_item(Y); Read_item(Z); Read_item(X); Z := Z + X; Write_item(Z); Transaction 2 Read_item(X); X := X + 1; Write_item(X); Unrepeatable read
Concurrency control techniques • Locking • Timestamp ordering • Multi-version timestamp ordering • Validation / certification (optimistic)
Locking [1] • Lock • Variable describing status of data item • Information held in a lock table • Danger of deadlock • Each transaction in a set of transactions is waiting for an item locked by another transaction in the same set
Locking [2] • Types of locks • Binary • Two possible states: locked or unlocked • Two operations: Lock_item(X), Unlock_item(X) • Shared / exclusive • Multiple states • Three operations: Read_lock(X), Write_lock(X), Unlock(X)
Binary locks • Transaction must lock data item before read_item or write_item operations • Transaction must unlock data item after finishing with it • No two transactions can access same data item concurrently
Shared / exclusive locks • Read-locked items are share-locked • Write-locked items are exclusive-locked • Two types of lock conversion: • Read_lock(X) -> write_lock(X) • Write_lock(X) -> read_lock(X)
Two-phase locking [1] • Guarantees serializability • All locking operations precede first unlock operation in the transaction • Two phases: • Expanding / growing • Shrinking
Two-phase locking [2] • Types of 2PL • Basic 2PL • Conservative (static) 2PL • Predeclaration of read- and write-sets • Transaction waits until all items available • Deadlock-free, but impractical • Strict 2PL • Write locks not released until commit / abort • Guarantees strict schedules • Not deadlock-free • Rigorous 2PL (variant of strict 2PL) • No locks released until commit / abort
Deadlock avoidance • Deadlock can be avoided by • Prevention by deadlock prevention protocols • Detection once it has happened • Conflicting transactions can be rolled back, or aborted and restarted
Deadlock prevention [1] • Conservative 2PL • Lock all data items in advance • Transaction timestamp • Older transaction has smaller timestamp value • Wait-die algorithm • If waiting transaction older than locking transaction, continue to wait • Otherwise, abort and restart later • Wound-wait algorithm • If waiting transaction older than locking transaction, locking transaction is aborted and restarted later with same timestamp • Otherwise, wait
Deadlock prevention [2] • No waiting algorithm • If transaction unable to obtain lock, it is aborted and restarted after delay • Cautious waiting algorithm • If locking transaction is not blocked, wait • Otherwise, transaction is aborted
Deadlock detection [1] • More practical than prevention if • Transactions are big • Each transaction uses many data items • Transaction load is heavy • Automatic (system) method uses time-outs • Deadlock assumed if transaction waits too long
Deadlock detection [2] • Manual detection uses wait-for graph • One node created for each transaction executing • Directed edge created for transaction waiting to lock item currently locked by another transaction • Deadlock if graph has cycle • Victim selection necessary
Starvation • One transaction cannot proceed for an indefinite amount of time • Can be solved by: • Better waiting scheme • E.g. first-come-first-served • Higher priorities for transactions that have been aborted multiple times
Timestamp ordering • Transactions ordered by timestamp • Equivalent serial schedule in timestamp order • Data items accessed by conflicting operations in serializability order • No locks, and therefore no deadlock, BUT risk of long waiting time
Multi-version timestamp ordering • Based on basic timestamp ordering • Maintain multiple copies of data items: • Keep multiple copies of current data items • Keep old values of data items on update • Appropriate version and copy of item chosen to maintain (conflict or view) serializability • Old values deleted once all transactions using that item have completed
Validation • Three stages • Transaction executed • Updates made to local copy of data • Transaction validated • Checks for serializability violations • Transaction committed or aborted • If validation OK, DB is updated • Otherwise transaction is aborted and restarted
DB recovery [1] • DB recovery necessary if failure caused by: • Transaction • System • Media • DB restored to most recent consistent state before failure by rollback mechanism
DB recovery [2] • Types of failure: • Catastrophic (loss of DB on disk) • Restore past copy of DB from archival storage • Redo operations of committed transactions from log backup • Non-catastrophic (consistency failure) • Use lists of committed and active transactions • Reverse changes by undoing inconsistent operations • May also be necessary to redo some operations from system log
Non-catastrophic failure • Two types of algorithm: • Deferred update • DB updated only when transaction commits • Immediate update • DB may be updated before transaction commits
Deferred update • Transaction commits only when: • All update operations have been recorded in log • Log has been force-written to disk • A type of NO-UNDO/REDO algorithm • Undo not needed • Any changes made to the DB result from completed, committed transactions • Redo operations perhaps necessary • Some transactions may have committed, but the changes not yet been saved to the DB
Single-user deferred update • Apply REDO to all write_item operations of committed transactions in log order • Restart active transactions
Multi-user deferred update • Apply REDO to all write_item operations of committed transactions in log order • If data item updated more than once, then only last update need be redone • Active transactions that had not committed are cancelled and must be resubmitted
Immediate update • Assume that schedules are strict • Update operations recorded in disk log at intervals by force-writing • Transaction may commit before all changes saved to DB • A type of UNDO/REDO algorithm • Undo needed • Some active transactions may already have had changes saved to DB • Redo needed • Some committed transactions may not yet have had changes saved to DB
Single-user immediate update • UNDO write_item operations of active transaction in reverse log order • REDO write_item operations from committed transactions in log order
Multi-user immediate update • UNDO write_item operations of active (uncommitted) transactions in reverse log order • REDO write_item operations from committed transactions in log order • If data item updated more than once, then only last update need be redone