280 likes | 397 Views
Concurrency control II. General Overview. Relational model - SQL Formal & commercial query languages Functional Dependencies Normalization Physical Design Indexing Query Processing and Optimization Transaction Processing and CC. Review: AC[I]D. Isolation
E N D
General Overview • Relational model - SQL • Formal & commercial query languages • Functional Dependencies • Normalization • Physical Design • Indexing • Query Processing and Optimization • Transaction Processing and CC
Review: AC[I]D • Isolation • Concurrent xctions unaware of each other • How? • We discussed locking protocols: • 2PL protocol and its variants • Graph-based locking protocols
Database Tables Pages Tuples Multiple-Granularity Locks • Hard to decide what granularity to lock (tuples vs. pages vs. tables). • Shouldn’t have to decide! • Data “containers” are nested: contains
In addition to S and X lock modes, there are three additional lock modes with multiple granularity: intention-shared (IS): indicates explicit locking at a lower level of the tree but only with shared locks. intention-exclusive (IX): indicates explicit locking at a lower level with exclusive or shared locks shared and intention-exclusive (SIX): the subtree rooted by that node is locked explicitly in shared mode and explicit locking is being done at a lower level with exclusive-mode locks. Intention Lock Modes
IS IX S X -- Ö Ö Ö Ö Ö -- IS Ö Ö Ö Ö IX Ö Ö Ö S Ö Ö Ö Ö X MGL: multiple granularity locking • Before locking an item, Xact must set “intention locks” on all its ancestors. • For unlock, go from specific to general (i.e., bottom-up). • SIX mode: Like S & IX at the same time.
Parent Child can be locked in locked in IS IX S SIX X P IS, S IS, S, IX, X, SIX [S, IS] not necessary X, IX, [SIX] none C
T1(IS) , T2(IX) T2(X) T1(S) Example R1 t1 t4 t2 t3
Transaction Ti can lock a node Q, using the following rules: (1) Follow multiple granularity comp function (2) Lock root of tree first, any mode (3) Node Q can be locked by Ti in S or IS only if parent(Q) can be locked by Ti in IX or IS (4) Node Q can be locked by Ti in X,SIX,IX only if parent(Q) locked by Ti in IX,SIX (5) Ti uses 2PL (6) Ti can unlock node Q only if none of Q’s children are locked by Ti Observe that locks are acquired in root-to-leaf order, whereas they are released in leaf-to-root order. Multiple Granularity Locking Scheme
Examples T1(IX) T1(IS) R R T1(IX) t3 T1(S) t4 t2 t1 t3 t4 t2 t1 T1(X) f4.2 f4.2 f2.2 f2.1 f4.2 f4.2 f2.2 f2.1 T1(SIX) Can T2 access object f2.2 in X mode? What locks will T2 get? R T1(IX) t3 t4 t2 t1 T1(X) f4.2 f4.2 f2.2 f2.1
IS IX S X -- Ö Ö Ö Ö Ö -- IS Ö Ö Ö Ö IX Ö Ö Ö Ö S Ö Ö Ö X Examples • T1 scans R, and updates a few tuples: • T1 gets an SIX lock on R, then repeatedly gets an S lock on tuples of R, and occasionally upgrades to X on the tuples that needs to update. • T2 uses an index to read only part of R: • T2 gets an IS lock on R, and repeatedly gets an S lock on tuples of R. • T3 reads all of R: • T3 gets an S lock on R. • OR, T3 could behave like T2; can use lock escalation to decide which.
Optimistic CC • Locking is a conservative approach in which conflicts are prevented. Disadvantages: • Lock management overhead. • Deadlock detection/resolution. • Lock contention for heavily used objects. • If conflicts are rare, we might be able to gain concurrency by not locking, and instead checking for conflicts before Xacts commit.
Execution of transaction Tiis done in three phases. 1. Read and execution phase: Tireads all values and makes copies to local variables (private workspace.) Ti writes only to temporary local variables. No locking. 2. Validation phase: Transaction Ti performs a ``validation test'' to determine if local variables can be written without violating serializability. 3. Write phase: If Ti is validated, the updates are applied to the database; otherwise, Ti is rolled back. optimistic concurrency control: transaction executes fully in the hope that all will go well during validation Validation-Based Protocol
Each transaction Ti has 3 timestamps Start(Ti) : the time when Ti started its execution Validation(Ti): the time when Ti entered its validation phase Finish(Ti) : the time when Ti finished its write phase Serializability order is based on Validation(Ti). Key idea: validation is atomic! Validation-Based Protocol (Cont.)
Validation-Based Protocol To implement validation, system keeps the following sets: • FIN = transactions that have finished phase 3 (and are all done) • VAL = transactions that have successfully finished phase 2 (validation) • For each transaction the Read and Write Sets
Example of what validation must prevent: RS(T1)={B} RS(T2)={A,B} WS(T1)={B,D} WS(T2)={C} = T1 validated T2 validated T1 start T2 start time T2 validation will fail!
Example of what validation must allow: RS(T1)={B} RS(T2)={A,B} WS(T1)={B,D} WS(T2)={C} = T1 validated T2 validated T1 start T2 start T2 start T1 finish phase 3 time
BAD: w2(D) w1(D) Another thing validation must prevent: RS(T1)={A} RS(T2)={A,B} WS(T1)={D,E} WS(T2)={C,D} T1 validated T2 validated finish T1 finish T2 time
Another thing validation must allow: RS(T1)={A} RS(T2)={A,B} WS(T1)={D,E} WS(T2)={C,D} T1 validated T2 validated finish T1 finish T1 time
Validation rules for Tj: (1) When Tj starts phase 1: IGNORE(Tj) FIN (2) at Tj Validation: if check (Tj) then [ VAL VAL U {Tj}; do write phase; FIN FIN U {Tj} ]
All transactions that either validated or finished after the start of Tj Check (Tj): For Ti VAL - IGNORE (Tj) DO IF [ WS(Ti) RS(Tj) OR Ti FIN ] THEN RETURN false; RETURN true; Is this check too restrictive ?
Improving Check(Tj) For Ti VAL - IGNORE (Tj) DO IF [ WS(Ti) RS(Tj) OR (Ti FIN AND WS(Ti) WS(Tj) )] THEN RETURN false; RETURN true;
Example: start validate finish U: RS(U)={B} WS(U)={D} W: RS(W)={A,D} WS(W)={A,C} V: RS(V)={B} WS(V)={D,E} T: RS(T)={A,B} WS(T)={A,C} U,T,V successful; W abort and roll back
Timestamp-Based Protocols • Idea: • Decide in advance ordering of xctions • Ensure concurrent schedule serializes to serial order decided • Timestamps • TS(Ti) is time Ti entered the system • Data item timestamps: • W-TS(Q): Largest timestamp of any xction that wrote Q • R-TS(Q): Largest timestamp of any xction that read Q • Timestamps -> serializability order
Timestamp CC • Idea: If action pi of Xact Ti conflicts with action qj of Xact Tj, and TS(Ti) < TS(Tj), then pi must occur before qj. Otherwise, restart violating Xact.
When Xact T wants to read Object O • If TS(T) < W-TS(O), this violates timestamp order of T w.r.t. writer of O. • So, abort T and restart it with a new, larger TS. (If restarted with same TS, T will fail again!) • If TS(T) > W-TS(O): • Allow T to read O. • Reset R-TS(O) to max(R-TS(O), TS(T)) • Change to R-TS(O) on reads must be written to disk (log)! This and restarts represent overheads. U writes O T reads O T start U start
When Xact T wants to Write Object O 1) If TS(T) < R-TS(Q), then the value of Q that T is producing was needed previously, and the system assumed that that value would never be produced. write rejected, T is rolled back and restarts. 2) If TS(T) < W-TS(Q), then T is attempting to write an obsolete value of Q. Hence, this write operation is rejected, and T is rolled back. 3) Otherwise, the write operation is executed, and W-TS(Q) is set to TS(T). Another approach in 2) is to ignore the write and continue!! Thomas Write Rule U reads Q T writes Q T start U start
Timestamp CC and Recoverability • Timestamp CC can be modified to allow only recoverable schedules: • Buffer all writes until writer commits (but update WTS(O) when the write is allowed.) • Block readers T (where TS(T) > WTS(O)) until writer of O commits. • Similar to writers holding X locks until commit, but still not quite 2PL. T1 T2 W(A) R(A) W(B) Commit • Unfortunately, unrecoverable schedules are allowed: