490 likes | 607 Views
Concurrency Control (Chapter 10.4-10.6). Concurrency Control Techniques. Protocols that guarantee serializability 1. Locking 2. Timestamps 3. Multiversion 4. Validation or certification. Locking. Locking - main technique to control concurrent execution
E N D
Concurrency Control (Chapter 10.4-10.6)
Concurrency Control Techniques • Protocols that guarantee serializability 1. Locking 2. Timestamps 3. Multiversion 4. Validation or certification
Locking • Locking - main technique to control concurrent execution • execution based on concept of locking data items • locks granted to a specific transaction for a particular data item • a lock forces mutual exclusion on data items • lock manager subsystem to keep track of and control access to locks
Binary locks • binary lock - 2 states or values • locked, unlocked • distinct lock for each data item • Suppose: • transaction T must issue a lock request before R, W • issue unlock after R, W • Too restrictive, what if want transactions to read at same time?
Multiple-mode lock • multiple-mode lock - 3 types, indivisible • R_lock (RL) share lock • W_lock (WL) exclusive lock • Unlock (UL) a lock needs 3 fields: <data, lock(R,W), # of reads>
Locking Rules for multiple-mode • Rules: 1. T must issue RL(X) or WL(X) before any R(X) 2. T must issue WL(X) before any W(X) 3. T must issue UL after R or W completed 4. If issue WL(X) and hold RL on X, must upgrade RL to WL(X) 5. If issue RL(X) and hold WL on X, must downgrade WL to RL(X) 6. T will not issue UL(X) unless hold RL or WL(X)
Lock conflicts • If lock conflict, force requesting transaction to wait for transaction holding lock, proceed with other transactions dirty write example R1(Y) R2(X) W2(Y) C2 R1(X) W1(X) C1 Will locking prevent dirty write? RL1(Y) R1(Y) UL1(Y) RL2(X) R2(X) UL2(X) WL2(Y) W2(Y) UL2(Y) C2 WL1(X) R1(X) W1(X) UL1(X) C1 • Locks do not guarantee serializability - need 2PL
Two-Phase Locking 2PL • 2PL has a: growing phase - locks acquired shrinking phase - locks released – cannot request new lock during this phase • 2PL limits concurrency • Can upgrade RL to WL - must be done in growing phase • Can downgrade WL to RL - must be done in shrinking phase
Releasing locks • Theoretically, can release locks whenever done with item, as long as don't request new locks on any other data items • Commercial systems that use 2PL assume unlock at last stage of commit • If R(X) then W(X), should request as WL(X)
Any problems? R1(A) R2(A) W1(A) C1 W2(A) C2 WL1(A) R1(A) WL2(A)* W1(A) UL1(A) C1 W2(A) C2 *T2 blocked then both commit – no lost update R1(Y) R2(X) W2(Y) C2 R1(X) W1(X) C1 RL1(Y) R1(Y) RL2(X) R2(X) WL2(Y)* WL1(X)** *T2 blocked then **T1 blocked no lost update, but DEADLOCK
Types of 2PL 1. Basic 2PL enforces serializability, but deadlocks 2. Strict 2PL does not release any locks until commits or aborts - what most commercial system assume - Not deadlock free (But easier to recover from) • Conservative 2PL requires transactions to lock all data items before begin executing - prevents deadlock. Declare readset, writeset or request in order
Strategies for deadlock in 2PL 1. No waiting - if Ti unable to obtain a lock, aborted and restarted after a time delay without checking if deadlock can occur. T's can abort and restart needlessly. • Cautious waiting - if Ti tried to lock X and already locked by Tj and if Tj is not blocked: Ti waits else Tj aborts Deadlock free - illustrate through the total ordering of blocking times • Timeouts - if Ti waits > some system defined time-out, assume Ti is deadlocked, Ti is aborted
Strategies cont’d • Deadlock detection - useful if T's rarely access the same items and each T only locks few items • construct a waits for graph (maintained by lock scheduler) • deadlock when cycle in graph • victim selection - which to abort • Livelock - if repeatedly choose the same victim to abort and restart • if wait indefinite period of time, need a fair waiting scheme – FCFS
Strategies cont’d 5. Timestamps to prevent deadlocks • Transactions can be assigned timestamps, TS(Ti) if T1 starts before T2, TS(T1) < TS(T2) if T1 starts after T2, TS(T1) > TS(T2) • older Ti has a smaller TS • Timestamp can be: a counter, or the current value of the system clock • wait-die or wound-wait strategies use TS’s
Wait-die Suppose Ti tries to lock X, and a CONFLICTING lock is already held by Tj Wait-die: (aborts Transaction requesting lock) if TS(Ti) < TS(Tj) // Ti is older then Ti waits else Ti aborts and restart with the same timestamp // Tj is older R1(Y) R2(X) W2(Y) C2 R1(X) W1(X) C1where T1<T2 RL1(Y) R1(Y) RL2(X) R2(X) WL2(Y) Abort T2 WL1(X) R1(X) W1(X) C1 T2 restarts and commits
Wound-wait Wound-wait: (aborts Transaction holding lock) if TS(Ti) < TS(Tj) // Ti is older then abort Tj // if does not finish // in the mean time restart with same // timestamp else Ti waits // Ti is younger R1(Y) R2(X) W2(Y) C2 R1(X) W1(X) C1where T1<T2 RL1(Y) R1(Y) RL2(X) R2(X) WL2(Y) Block T2 WL1(X) T2 aborted R1(X) W1(X) C1 T2 restarts and commits
Comparisons • Wait-die - older waits on younger, else younger is aborted and restarted • favors younger lock holder • Wound-wait - younger waits on older, older T requesting items held by younger, preempts younger by abort – • favors older requester • T's aborted and restarted even if not deadlocked. • Wait-die: Ti aborted and restarted in a row • Wound-wait: can be aborted even it obtain all of its locks (not true for wait-die)
Problems with serializability • Scheduling that guarantees perfect serializability can be intrusive on performance • too many transaction in wait state • if increase number of threads, can reduce the number of transactions active • CPU never fully utilized
Alternatives to serializability • Weakened forms of 2PL locking in SQL-99 - levels of isolation • Used instead of degrees of isolation • Can set the isolation level with set transaction statement - can specify R only, W only 1) read uncommitted - dirty reads 2) read committed (in DB2 cursor stability) 3) repeatable read 4) serializable
Read uncommitted • Short-term lock (guarantees R, W, is atomic) vs. long-term lock (held until commit or abort) • Read uncommitted - allow for Read only operations - no long-term locks used • Since Read only, no dirty writes, but dirty reads can occur
Read committed • Read committed - WL long term, RL short term • Can only read data that has been written by committed transactions • cursor stability - lock held on each row fetched by cursor (R short, W long) • Read and update before move to next row affects Ri(A) -> Wj(A) • Problems: repeatable read • Usually no lost update - but it can occur (see Fig. 10.12)
Repeatable read • Repeatable read - WL, RL long term predicate locking is not guaranteed • Predicate locking – lock only rows that satisfy specified condition (e.g. major =‘CS’) • therefore can have phantom updates due to inserting new rows • e.g. if branch totals in branch table, and insert while computing total
Serializability • Serializable requires RL long term on all data satisfying predicate • lock entire table
Timestamp Ordering • Timestamp Ordering • concurrency control techniques based on TS's - do not use locks • Can deadlock occur?
Timestamps • Use timestamp ordering (TO) • in 2PL schedule, serializable by being equivalent to some serial schedule allowed by locking protocols • In TO schedule, equivalent to particular order that corresponds to order of transaction TS's
Timestamps (TO) • Basic TO algorithm: • associated with each X, 2 TS values • R_TS(X) - largest TS that has successfully read X • W_TS(X) - largest TS that has successfully written X • If T is aborted, it is restarted with LATER timestamp
TO Algorithms If T issues W(X): if R_TS(X) > TS(T) or W_TS(X) > TS(T) then abort and rollback T // reject operation request else W(X) and set W_TS(X) = TS(T) If T issues R(X): if W_TS(X) > TS(T) then abort and reject operation request else //W_TS(X) ≤ TS(T) R(X), set R_TS(X) = Max(TS(T), R_TS(X)) R1(X)R2(X)W1(X)W2(X)
Example R1(X)R2(X)W1(X)W2(X) X R_TS W_TS 0 0 1 0 R1(X) 2 0 R2(X) W1(X), rollback, restart with T1 = 3 2 2 W2(X) C2 3 2 R1(X) 3 3 R1(X) C1
TO vs. 2PL • TO and 2PL guarantee serializability – • Some schedules possible under each, not allowed under the other • If T is aborted and rolled back, any value written by T also must be rolled back (Can have cascading rollback) • Schedules produced are not recoverable, does not ensure recoverable and cascade- less or strict schedules
TO vs. 2PL • Strict TO - ensures strict schedules and conflict serializability • if T issues R(X) or W(X), s.t. TS(T) > W_TS(X) • R,W operation delayed until T that wrote to value of X committed or aborted • to implement, must simulate locking of X, but doesn't cause deadlocks
Multiversion Concurrency Control • Multiversion Concurrency Control • Oracle uses multiversions to enforce levels of isolation • Useful for temporal DBs and RTDBs • Keep old values when item is updated - several versions maintained • When operation accesses item, appropriate version chosen to ensure serializability • Read older version of item instead of abort • Write new version, keep old one • View serializability not conflict serializability is ensured
Multiversion • Disadvantage - more storage • however, may keep older versions for recovery anyway
Multiversion Algorithm • If T issues: R(X), find version i of X with largest W_TS s.t. W_TS(Xi) ≤ TS(T) set R_TS(Xi) = max(TS(T), R_TS(Xi)) • If T issues: W(X), find version i of X with largest W_TS s.t. W_TS (Xi) ≤ TS(T) if TS(T) >= R_TS(Xi) create new version Xj with R_TS(Xj) = W_TS(Xj) = TS(T) else abort // TS(T) < R_TS(Xi) so must abort T1: W1(X) T2: R2(X)W2(X) T3: W3(X) T4: W4(X) W1(X)R2(X)W3(X)W2(X)W4(X)
Multiversion W1(X)R2(X)W3(X)W2(X)W4(X) Version R_TS W_TS X0 0 0 W0(X) X1 0 1 W1(X) 2 R2(X) X2 3 3 W3(X) X3 2 2 W2(X) X4 4 4 W4(X)
View Equivalence and View Serializability • Less restrictive than conflict equivalence • View equivalent if given 2 schedules, S and S‘ • same set of transactions in S and S' (same operations) • for any operation on X in S, if value X read has been written by Wj(X), same condition must hold in S' • If the operation Wk(Y) is the last operation to write to Y in S, then Wk(Y) must also be the last in S'
Multiversion • Premise • As long as each read reads a result of the same write • the W operations produce the same results • the read operations see the same view • If the final write operations are the same, the database state is the same
Multiversion • How is view equivalence less restrictive? • constraints vs. unconstrained write • constrained - any W1(X) preceded by R1(X) • unconstrained - no read before the W1(X) - independent of any previous value, also called blind write • If the write is constrained, view and conflict equivalence are the same • The difference occurs with blind writes • There is an algorithm to test for view serializability - test is NP-complete • Conflict seralizability is a subset of view seralizability
Multiversion • Several versions of X X1, X2, ... Xk • For each version Xi, keep the timestamps: • R_TS(Xi) - largest of all TS's of transactions that sucessfully read version Xi • W_TS(Xi) - TS of transaction that wrote values of version Xi • Want to read version closest to, but less than your TS • Always write a new version • Only abort when a version with W_TS ≤ TS(T) has a R_TS > TS(T)
Multiversion Is the example schedule serializable? W1(X)R2(X)W3(X)W2(X)W4(X) T1->T2->T3->(T3->T2)-> T4 View equivalent but not conflict equivalent
Example (using TO) W1(X)R2(X)W3(X)W2(X)W4(X) R_TS W_TS 0 0 0 1 W1(X) 2 R2(X) 3 W3(X) W2(X) aborts // W_TS > TS(T) 4 W4(X)
Example W0(X)R1(X)W1(X)R2(X)R3(X)W3(X)W2(X) Version R_TS W_TS X0 0 0 W0(X) 1 R1(X) X1 1 1 W1(X) 2 R2(X) 3 R3(X) X2 3 3 W3(X) W2(X) aborts, version X1 has W_TS <= TS(T2) but R_TS(X1) > T2(T2) because T3 should have read values written by T2
Example W0(X)R1(X)W3(Y)W2(X)R3(X)W3(X)W4(X)W4(Y)
Optimistic (validation) Concurrency Control Techniques (or certification) • Optimistic (validation) Concurrency Control Techniques (or certification) • In other concurrency control techniques, checking is done to the DB before operations are executed • e.g. TO, 2PL • In optimistic, checking is done after
Optimistic • Updates are applied to local copies of the data (transaction workspace) • If serializability is not violated, transactions are committed • The DB is updated from local copies • Otherwise T is aborted and restarted
Optimistic 3 phases: • R phase - R item from DB, updates to local copies • validation phase - check for serializability • W phase – if validation successful, update DB else, restart transaction
Optimistic • If little interference, T's validated successfully and optimistic protocol works well else it doesn't • This protocol (there are others) uses TS's and W-sets and R-sets
Optimistic • Validation phase • For each Ti ready to commit and in validation phase, • check for each Tj, committed or in validation phase 1) Tj completes W-phase before Ti starts to read 2) Ti starts W-phase after Tj completes W-phase and R-set of Ti has no items in common with W-set of Tj 3) Both R-set and W-set of Ti, have no items in common with W-set of Tj and Tj commits R-phase before Ti completes R-phase
Optimistic • If any 1 of the previous holds, no interference Ti validated successfully, Else Ti aborted and restarted later. • Interference may have occurred.
Is this topic relevant? • Check out the Data Concurrency and Consistency link for Oracle • http://www-rohan.sdsu.edu/doc/oracle/server803/A54643_01/toc.htm