
Concurrency Control


  1. Concurrency Control

  2. Concurrency Control Techniques • Protocols that guarantee serializability      1. Locking      2. Timestamps      3. Multiversion      4. Optimistic - validation or certification

  3. Locking

  4. Serializable? • R1(X) R4(X) R2(Y) W4(X) R2(X) W2(Y) C1 C4 C2 • R1(Y) R2(X) W2(Y) C2 R1(X) W1(X) C1
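
One common way to answer the question on this slide is a precedence-graph test for conflict serializability. The sketch below is an illustration rather than the slide author's method; the helper names `parse`, `precedence_edges`, and `is_conflict_serializable` are hypothetical, and commit operations are simply ignored.

```python
import re

def parse(schedule):
    """Turn 'R1(X) W4(X) C1' into [('R', 1, 'X'), ('W', 4, 'X')]; commits are skipped."""
    return [(op, int(t), item)
            for op, t, item in re.findall(r"([RW])(\d+)\((\w+)\)", schedule)]

def precedence_edges(schedule):
    """Edge Ti -> Tj when an operation of Ti precedes a conflicting operation of Tj."""
    ops = parse(schedule)
    edges = set()
    for i, (op1, t1, x1) in enumerate(ops):
        for op2, t2, x2 in ops[i + 1:]:
            # Conflict: different transactions, same item, at least one write.
            if t1 != t2 and x1 == x2 and "W" in (op1, op2):
                edges.add((t1, t2))
    return edges

def is_conflict_serializable(schedule):
    """True when the precedence graph has no cycle."""
    edges = precedence_edges(schedule)
    nodes = {t for edge in edges for t in edge}
    # Repeatedly remove nodes with no incoming edge; if none exists, there is a cycle.
    while nodes:
        free = {n for n in nodes if not any(v == n for _, v in edges)}
        if not free:
            return False
        nodes -= free
        edges = {(u, v) for u, v in edges if u in nodes and v in nodes}
    return True

first = "R1(X) R4(X) R2(Y) W4(X) R2(X) W2(Y) C1 C4 C2"
second = "R1(Y) R2(X) W2(Y) C2 R1(X) W1(X) C1"
print(is_conflict_serializable(first))   # edges T1->T4 and T4->T2, no cycle
print(is_conflict_serializable(second))  # edges T1->T2 and T2->T1 form a cycle
```

The first schedule is equivalent to the serial order T1, T4, T2; the second has a cycle between T1 and T2, so it is not conflict serializable.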

  5. Locking • Locking - main technique to control concurrent execution • execution based on concept of locking data items • locks granted to a specific transaction for a particular data item • a lock forces mutual exclusion on data items • lock manager subsystem to keep track of and control access to locks

  6. Binary locks • binary lock - 2 states or values • locked, unlocked • distinct lock for each data item • Suppose: • transaction T must issue a lock request before R, W • issue unlock after R, W • Too restrictive, what if want transactions to read at same time?

  7. Multiple-mode lock • multiple-mode lock - 3 operations, each indivisible • Read lock - shared lock • Write lock - exclusive lock • Unlock • A lock needs 3 fields: <data, lock(R,W), # of reads>

  8. Locking Rules for multiple-mode (theoretical) Rules: 1.  T must issue request for R or W lock before any R(X) 2.  T must issue request for W lock before any W(X) 3.  T must issue Unlock(X) after R or W completed OR once commit assume all remaining locks unlocked 4.  If T issues request for W lock(X) and holds R lock on X, must upgrade R lock(X) to W lock(X) 5.  If issue R lock(X) and hold W lock on X, must downgrade W lock(X) to R lock(X) 6.  T will not issue Unlock(X) unless hold R lock(X) or W lock(X)
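
The lock record and the rules above can be sketched as a small lock table. This is a minimal illustration under stated assumptions: the class `LockTable` and its method names are hypothetical, a refused request just returns `False` instead of queueing the transaction, and the set of holders stands in for the "# of reads" field.

```python
class LockTable:
    """Each entry mirrors the slide's <data, lock(R,W), # of reads> record."""

    def __init__(self):
        # item -> {"mode": "R" or "W", "holders": set of transaction ids}
        self.entries = {}

    def request(self, txn, item, mode):
        """Return True if the lock is granted, False if txn would have to wait."""
        entry = self.entries.get(item)
        if entry is None:
            self.entries[item] = {"mode": mode, "holders": {txn}}
            return True
        if entry["holders"] == {txn}:
            # Sole holder: upgrade R to W (rule 4) or downgrade W to R (rule 5).
            entry["mode"] = mode
            return True
        if mode == "R" and entry["mode"] == "R":
            entry["holders"].add(txn)  # shared lock: one more reader
            return True
        return False  # conflicting lock held by another transaction

    def unlock(self, txn, item):
        """Rule 6: only a transaction that holds the lock may unlock it."""
        entry = self.entries.get(item)
        if entry is None or txn not in entry["holders"]:
            raise ValueError(f"T{txn} holds no lock on {item}")
        entry["holders"].discard(txn)
        if not entry["holders"]:
            del self.entries[item]

locks = LockTable()
print(locks.request(1, "X", "R"))  # True: T1 gets a read lock
print(locks.request(2, "X", "R"))  # True: read locks are shared
print(locks.request(2, "X", "W"))  # False: T1 still reads X, no upgrade yet
locks.unlock(1, "X")
print(locks.request(2, "X", "W"))  # True: T2 is the sole holder and upgrades
```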

  9. Dirty Write example • Schedule: R1(Y) R2(X) W2(Y) C2 R1(X) W1(X) C1 • Will locking rules prevent dirty write? • T1 granted request R lock on Y, R1(Y), then Unlock • T2 granted request R lock on X, R2(X), then Unlock • T2 granted request W lock on Y, W2(Y), then Unlock, C2 • T1 granted request R lock on X, R1(X) • T1 granted upgrade to W lock on X, W1(X), then Unlock, C1 • Locking rules alone do not guarantee serializability • need 2PL

  10. Basic Two-Phase Locking 2PL • 2PL has a:        growing phase - locks acquired        shrinking phase - locks released – cannot request new lock during this phase • Advantage • Guarantees serializability • Disadvantage • 2PL limits concurrency
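
The two-phase rule can be checked mechanically for a single transaction's lock actions. The sketch below is illustrative only; the function name `is_two_phase` and the `"lock X"` / `"unlock X"` string format are assumptions made for this example.

```python
def is_two_phase(actions):
    """actions: one transaction's lock actions in order, e.g. ['lock Y', 'unlock Y'].
    Returns False as soon as a lock request follows an unlock."""
    shrinking = False
    for action in actions:
        kind = action.split()[0]
        if kind == "unlock":
            shrinking = True      # the shrinking phase has begun
        elif kind == "lock" and shrinking:
            return False          # new lock requested after a release
    return True

# T1 in the dirty write example (slide 9) unlocks Y before locking X.
print(is_two_phase(["lock Y", "unlock Y", "lock X", "unlock X"]))
# The same accesses with all locks acquired before any release.
print(is_two_phase(["lock Y", "lock X", "unlock Y", "unlock X"]))
```

The first sequence violates 2PL, which is why the schedule on slide 9 is allowed by the plain locking rules even though it is not serializable.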

  11. Requesting/Releasing locks • Can upgrade R lock to W lock - must be done in growing phase • Can downgrade W lock to R lock - must be done in shrinking phase • If R(X) then W(X),we will assume request lock as W lock(X) for the R(X) • Theoretically, can release locks whenever done with item, as long as don't request new locks on any other data items • We assume will release when transaction commits

  12. Lock conflicts • If lock conflict, force requesting transaction to block (Bi): wait for transaction holding lock, proceed with other transactions • Schedule: W1(Y) R2(Y) C1 C2 • T1 requests W_lock on Y, W1(Y) • T2 requests R_lock on Y and blocks (B2) on R2(Y) • C1 unlocks Y • R2(Y) • C2

  13. Lost update problem • Schedule: R1(A) R2(A) W1(A) C1 W2(A) C2 • With 2PL: R1(A) (B2 on R2(A)) W1(A) C1 R2(A) W2(A) C2 • T2 blocked, eventually both commit - no lost update

  14. Example R1(X)R2(X)W1(Y)W2(Z) • Should this block? • NO!! R1(X) R2(X) are not conflicting locks, make sure you understand this

  15. Any problems with Basic 2PL? • Schedule: R1(Y) R2(X) W2(Y) C2 W1(X) C1 • With 2PL: R1(Y) R2(X) (B2 on W2(Y)) (B1 on W1(X)) • T2 blocked, then T1 blocked: no lost update, but DEADLOCK • Basic 2PL enforces serializability but allows deadlocks

  16. Types of 2PL • Basic 2PL • Request locks as needed, release at end of growing phase • Strict 2PL • does not release any write locks until commits or aborts • Not deadlock free (but easier to recover from) • Strong 2PL • does not release any read locks until commits or aborts • 2PL is the most common type of concurrency control in commercial systems, but it is implemented as follows: • Strong Strict 2PL (SS2PL), a.k.a. rigorous 2PL • Not really two phases, only one phase …

  17. Strategies for deadlock in 2PL 1. Conservative 2PL – not popular • requires transactions to lock all data items before begin executing • prevents deadlock  • Declare readset, writeset • or request data items in order

  18. Strategies for deadlock in 2PL 2. No waiting - if Ti is unable to obtain a lock, abort and restart it after a time delay. T's can abort and restart needlessly. 3. Cautious waiting - if Ti tries to lock X and X is already locked by Tj: if Tj is not blocked, Ti waits; else Ti aborts. Deadlock free - total ordering of blocking times 4. Timeouts - if Ti waits > threshold, Ti is aborted

  19. Strategies cont’d 5. Deadlock detection - useful if T's rarely access the same items and each T only locks a few items • construct a waits for graph (maintained by lock scheduler) • deadlock when cycle in graph • Problems: • victim selection - which to abort • Livelock - if repeatedly choose same victim to abort/restart • if wait indefinite period of time, need a fair waiting scheme – FCFS
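
Deadlock detection on a waits-for graph comes down to finding a cycle. The sketch below is a minimal depth-first search, assuming the graph is given as a dictionary; the function name `find_cycle` is hypothetical and victim selection is left out.

```python
def find_cycle(waits_for):
    """waits_for maps a transaction to the set of transactions it is waiting on.
    Returns one cycle as a list of transactions, or None if there is no deadlock."""
    def visit(node, path):
        if node in path:
            return path[path.index(node):]   # came back to a node on the current path
        for nxt in waits_for.get(node, ()):
            cycle = visit(nxt, path + [node])
            if cycle:
                return cycle
        return None

    for start in waits_for:
        cycle = visit(start, [])
        if cycle:
            return cycle
    return None

# Slide 15: T2 is blocked on W2(Y) (Y held by T1), T1 is blocked on W1(X) (X held by T2).
print(find_cycle({"T2": {"T1"}, "T1": {"T2"}}))
# Slide 12: T2 simply waits for T1, which is not waiting on anyone.
print(find_cycle({"T2": {"T1"}}))
```

This simple search re-explores shared paths, which is acceptable for the small graphs the slide has in mind (few transactions, few locked items).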

  20. Strategies cont’d 6.   Timestamps to prevent deadlocks • Transactions can be assigned timestamps, TS(Ti)   if T1 starts before T2,  TS(T1) < TS(T2)   if T1 starts after T2, TS(T1) > TS(T2) • older Ti has a smaller TS • Timestamp can be:  a counter, or the current value of the system clock • wait-die or wound-wait strategies use TSs • Can you think of a strategy using TSs?

  21. Wait-die • Suppose Tj tries to lock X, and a CONFLICTING lock is already held by Ti • Wait-die (aborts the transaction requesting the lock): if TS(Tj) < TS(Ti) (Tj is older) then Tj waits, else (Tj is younger) Tj aborts and restarts with the same timestamp • Schedule: R1(Y) R2(X) W2(Y) C2 R1(X) W1(X) C1 where TS(T1) < TS(T2) • Result: R1(Y) R2(X) (A2 on W2(Y)) R1(X) W1(X) C1 R2(X) W2(Y) C2

  22. Wound-wait • Suppose Tj tries to lock X, and a CONFLICTING lock is already held by Ti • Wound-wait (aborts the transaction holding the lock): if TS(Tj) < TS(Ti) (Tj is older) then Ti aborts and restarts with the same timestamp, else (Tj is younger) Tj waits • Schedule: R1(Y) R2(X) W2(Y) C2 R1(X) W1(X) C1 where TS(T1) < TS(T2) • Result: R1(Y) R2(X) (B2 on W2(Y)) (A2 on R1(X)) R1(X) W1(X) C1 R2(X) W2(Y) C2
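
The two timestamp rules from slides 21 and 22 differ only in who is aborted. A minimal sketch, with hypothetical function names and string results chosen for readability:

```python
def wait_die(ts_requester, ts_holder):
    """Older requester waits; younger requester dies (aborts itself)."""
    if ts_requester < ts_holder:
        return "requester waits"
    return "requester aborts"

def wound_wait(ts_requester, ts_holder):
    """Older requester wounds (aborts) the holder; younger requester waits."""
    if ts_requester < ts_holder:
        return "holder aborts"
    return "requester waits"

TS = {"T1": 1, "T2": 2}  # T1 is older, as in the slide examples

# T2 requests a W lock on Y while T1 holds a lock on Y.
print(wait_die(TS["T2"], TS["T1"]))    # requester aborts: A2 on W2(Y)
print(wound_wait(TS["T2"], TS["T1"]))  # requester waits:  B2 on W2(Y)

# T1 then requests a lock on X while T2 holds a lock on X (wound-wait example).
print(wound_wait(TS["T1"], TS["T2"]))  # holder aborts: A2 on R1(X)
```

In both schemes the younger transaction is the one that is aborted, and it keeps its original timestamp on restart, so it eventually becomes the oldest and cannot starve.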

  23. Comparisons • Wait-die - older waits on younger, else younger aborts itself and is restarted • favors younger lock holder • Wound-wait - younger waits on older, else older preempts younger lock holder by aborting it • favors older requester • In both, T's can be aborted and restarted even if not deadlocked • Wait-die: can abort Tj and restart it many times in a row • Wound-wait: a T can be aborted even after it has obtained all of its locks (not true for wait-die, where the lock holder is never aborted)

  24. Strategies Useful? • Google using Wound-Wait in new semi-relational DB • Classic concurrency control for Real-time Transactions • Assume: T1 holds lock, T2 wants it, dl is deadline • If T1(dl) < T2(dl) (T1 has earlier deadline), T2 waits • else (T2 has earlier deadline) abort T1, T2 gets lock

  25. Problems with serializability • Scheduling that guarantees perfect serializability can be intrusive on performance • Too many transactions in wait state • If increase number of threads, can reduce the number of transactions active • CPU never fully utilized

  26. Alternatives to serializability • Weakened forms of 2PL locking in SQL levels of isolation • Used instead of degrees of isolation • Can set the isolation level with set transaction statement (can specify R only, W only)     1)  read uncommitted     2)  read committed – Default Oracle    3)  repeatable read    4)  serializable

  27. Isolation levels • Lock types used to implement • Short-term lock • guarantees R, W, is atomic • long-term lock • held until Transaction commits or aborts

  28. Read uncommitted • Read uncommitted (for read only Ts) – • no long-term locks used • allow for Read only operations • no dirty writes (since only read) • but dirty reads can occur

  29. Read committed • Read committed – (no dirty reads) • W lock long term, R lock short term • Can only read data that has been written by committed transactions • Unrepeatable reads can occur • Lost update can still occur • R1(A) R2(A) W2(A) W1(A) C2 C1 // will not allow • R1(A) R2(A) W2(A) C2 W1(A) C1 // may allow

  30. Repeatable read • Repeatable read – • W lock, R lock long term • Repeatable reads, no lost update • But, predicate locking is not guaranteed • Predicate locking – lock only rows that satisfy specified condition (e.g. major =‘CS’) • therefore can have phantoms (phantom reads) due to inserting new rows • e.g. if branch totals are kept in a branch table, and a row is inserted while computing a total

  31. Serializability • Serializable requires R, W lock long term on all data satisfying predicate • How? • lock entire table

  32. Granularity Hierarchy • How to accommodate different granularities of locks by the lock manager • If only a few data items from a table are needed, how to indicate they are locked • Use a tree • Allow data items to be of various sizes • Used with 2PL to guarantee serializability

  33. Tree • Multiple levels of nodes • Highest node is entire DB • Non-leaf node has data associated with descendants • Each node can be locked individually • If lock a node, all ancestors are also locked in appropriate mode • Locks are: • Shared or exclusive • Intention or explicit mode

  34. How to determine if a node at a lower level is locked without searching entire tree? • If a node at a lower level must be explicitly locked, then all ancestor nodes are intention locked as the tree is traversed • S shared, IS intention-shared • X exclusive, IX intention-exclusive • SIX shared with intention-exclusive • Strategy most useful for short transactions with few data items and long transactions that produce reports from a file

  35. SIX • What is SIX? • Subtree rooted in that node is S and then X at lower level • The lock owner can read and change data in the table, partition, or table space. Concurrent processes can read data in the table, partition, or table space, but not change it. Only when the lock owner changes data does it acquire page or row locks. • Does this mean I can share everything except what I want to write to?

  36. Locking protocol • Top down lock: • Use compatibility matrix on next page • T must lock root first (in any mode) • T can lock Q in S or IS only if T has parent of Q in IX or IS mode • T can lock Q in X, SIX, or IX only if T has parent of Q locked in IX or SIX mode • Bottom up unlock: • T can lock node only if not previously unlocked any node • T can unlock Q only if T has no children of Q locked
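
The top-down rules on this slide can be written as a lookup from the requested mode to the modes the transaction must already hold on the parent. The sketch below is illustrative; the names `REQUIRED_PARENT_MODES`, `may_lock`, and `lock_plan` are hypothetical, and `lock_plan` only handles explicit S or X requests.

```python
# Requested mode on node Q -> modes T must already hold on Q's parent.
REQUIRED_PARENT_MODES = {
    "IS": {"IS", "IX"},
    "S": {"IS", "IX"},
    "IX": {"IX", "SIX"},
    "SIX": {"IX", "SIX"},
    "X": {"IX", "SIX"},
}

def may_lock(mode, parent_mode, is_root=False):
    """Check the top-down rule for one request by transaction T."""
    if is_root:
        return True  # the root may be locked first, in any mode
    return parent_mode in REQUIRED_PARENT_MODES[mode]

def lock_plan(path, mode):
    """Locks T takes to explicitly lock the last node on a root-to-node path."""
    intent = "IS" if mode == "S" else "IX"
    return [(node, intent) for node in path[:-1]] + [(path[-1], mode)]

print(may_lock("S", "IS"))   # True: reading below an IS-locked parent
print(may_lock("X", "IS"))   # False: the parent must be held in IX or SIX
print(lock_plan(["db", "table", "row"], "X"))
```

Unlocking then proceeds in the opposite direction, leaf first, which matches the "bottom up unlock" rule.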

  37. Compatibility matrix • If a transaction holds a lock in one mode, the matrix shows whether a requester can be granted a lock in the specified mode
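
The matrix this slide refers to is not part of the transcript. The sketch below encodes the usual textbook compatibility matrix for the five modes; that this is the exact matrix the slide showed is an assumption, and the names `COMPATIBLE` and `can_grant` are hypothetical.

```python
# Held mode -> set of modes another transaction may still be granted on the same node.
COMPATIBLE = {
    "IS": {"IS", "IX", "S", "SIX"},
    "IX": {"IS", "IX"},
    "S": {"IS", "S"},
    "SIX": {"IS"},
    "X": set(),
}

def can_grant(requested, held_modes):
    """True if the requested mode is compatible with every mode held by other transactions."""
    return all(requested in COMPATIBLE[held] for held in held_modes)

print(can_grant("IS", ["SIX"]))      # True: intention-shared coexists with SIX
print(can_grant("IX", ["S"]))        # False: a shared lock blocks intended writes below
print(can_grant("S", ["IS", "S"]))   # True: readers share the node
print(can_grant("X", ["IS"]))        # False: exclusive is compatible with nothing
```

Read against slide 35, the `SIX` row answers the question there: other transactions can still obtain `IS` on the node and read parts of the subtree, but nothing stronger.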

  38. Serializability • Do we always have to use locking to ensure serializability?

  39. Timestamp Ordering • No - Timestamp Ordering • concurrency control techniques based on timestamps (TS) - do not use locks • Can deadlock occur?

  40. Timestamp Ordering

  41. Timestamps • Use timestamp ordering (TO) • in 2PL schedule, serializable by being equivalent to some serial schedule allowed by locking protocols • In TO schedule, serializable by being equivalent to particular order that corresponds to order of transaction TS's • This means conflicting operations must execute in order of their timestamps, e.g. R/Ws to same data item must occur in the same order as their TS

  42. Timestamps (TO) • Basic TO algorithm: • associated with each X, 2 TS values • R_TS(X) - largest TS that has successfully read X • W_TS(X) - largest TS that has successfully written X • If T is aborted, it is restarted with LATER timestamp

  43. TO Algorithms • If T issues R(X): if W_TS(X) ≤ TS(T), perform R(X) and set R_TS(X) = max(TS(T), R_TS(X)); else (W_TS(X) > TS(T)) abort T • If T issues W(X): if R_TS(X) > TS(T) or W_TS(X) > TS(T), abort T; else perform W(X) and set W_TS(X) = TS(T) • Example: R1(X) R2(X) W1(X) W2(X) C1 C2, assume TS(T1)=1 and TS(T2)=2

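  44. Lost Update Example - TO • Schedule: R1(X) R2(X) W1(X) W2(X), assume TS(T1)=1 and TS(T2)=2 • Timestamps of X, shown as (R_TS, W_TS) after each step: • initially: (0, 0) • R1(X): (1, 0) • R2(X): (2, 0) • W1(X): T1 aborts, restarts with TS(T1) = 3 • W2(X) C2: (2, 2) • R1(X): (3, 2) • W1(X) C1: (3, 3) • Serializable? Try with TO: • R1(X) R4(X) R2(Y) R2(X) W4(X) W2(Y) • R1(X) R4(X) R2(Y) W4(X) R2(X) W2(Y)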
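
The basic TO rules from slide 43 and the trace on slide 44 can be replayed with a few lines of code. This is a sketch under stated assumptions: the class `BasicTO` and its methods are hypothetical names, timestamps start at 0 for every item, and an aborted operation just returns `False` (the caller restarts the transaction with a later timestamp).

```python
class BasicTO:
    def __init__(self):
        self.r_ts = {}  # item -> largest TS that has read it
        self.w_ts = {}  # item -> largest TS that has written it

    def read(self, ts, item):
        """Return True if R(X) is allowed, False if the transaction must abort."""
        if self.w_ts.get(item, 0) > ts:
            return False
        self.r_ts[item] = max(ts, self.r_ts.get(item, 0))
        return True

    def write(self, ts, item):
        """Return True if W(X) is allowed, False if the transaction must abort."""
        if self.r_ts.get(item, 0) > ts or self.w_ts.get(item, 0) > ts:
            return False
        self.w_ts[item] = ts
        return True

    def state(self, item):
        return (self.r_ts.get(item, 0), self.w_ts.get(item, 0))

to = BasicTO()
print(to.read(1, "X"), to.state("X"))   # R1(X)
print(to.read(2, "X"), to.state("X"))   # R2(X)
print(to.write(1, "X"), to.state("X"))  # W1(X) is rejected: R_TS(X)=2 > TS(T1)=1
print(to.write(2, "X"), to.state("X"))  # W2(X)
print(to.read(3, "X"), to.state("X"))   # T1 restarted with TS=3: R1(X)
print(to.write(3, "X"), to.state("X"))  # W1(X)
```

The printed states follow the (R_TS, W_TS) column of the slide, ending at (3, 3).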

  45. TO vs. 2PL • TO and 2PL both guarantee serializability • Neither allows non-serializable schedules, but • Some schedules possible under each are not allowed under the other • Note • If T is aborted (undone, rolled back), any value written by T also must be rolled back (can have cascading rollback) • Schedules produced are not guaranteed to be recoverable: basic TO does not ensure recoverable, cascadeless, or strict schedules

  46. Multiversion Concurrency Control a.k.a Multiversion Timestamp Ordering

  47. Multiversion Concurrency Control (MVCC) • Multiversion Concurrency Control • Oracle uses multiversions to enforce levels of isolation • Useful for mobile, temporal and real-time DBs • Keep old values when item is updated - several versions maintained • When operation accesses item, appropriate version chosen to ensure serializability • Read older version of item instead of abort • Write new version, keep old one • View serializability not conflict serializability is ensured

  48. Multiversion • Disadvantage - more storage • however, may keep older versions for recovery anyway • Google keeps multiple versions for semi-relational DB

  49. MVCC Algorithm • If T issues R(X): find version i of X with largest W_TS s.t. W_TS(Xi) ≤ TS(T); set R_TS(Xi) = max(TS(T), R_TS(Xi)) • If T issues W(X): find version i of X with largest W_TS s.t. W_TS(Xi) ≤ TS(T); if TS(T) ≥ R_TS(Xi), create new version Xj with R_TS(Xj) = W_TS(Xj) = TS(T); else (TS(T) < R_TS(Xi)) abort T • Examples: • R1(X) R4(X) R2(Y) R2(X) W4(X) W2(Y) • R1(X) R2(X) W1(X) W2(X) • W1(X) R2(X) W3(X) W2(X) W4(X)
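
The multiversion rules above can be sketched in the same style. The class `MVTO`, its dictionary layout, and the assumption TS(Ti) = i are illustrative choices, not part of the slides; each item starts with one version written at timestamp 0.

```python
class MVTO:
    def __init__(self, initial):
        # item -> list of versions; each version carries its value, R_TS and W_TS
        self.versions = {x: [{"value": v, "r_ts": 0, "w_ts": 0}]
                         for x, v in initial.items()}

    def _visible(self, ts, item):
        """Version of item with the largest W_TS that is <= ts."""
        candidates = [v for v in self.versions[item] if v["w_ts"] <= ts]
        return max(candidates, key=lambda v: v["w_ts"])

    def read(self, ts, item):
        """Reads never abort: they are served from an older version if needed."""
        version = self._visible(ts, item)
        version["r_ts"] = max(ts, version["r_ts"])
        return version["value"]

    def write(self, ts, item, value):
        """Return True if a new version was created, False if T must abort."""
        version = self._visible(ts, item)
        if ts < version["r_ts"]:
            return False  # a younger transaction already read the version T would replace
        self.versions[item].append({"value": value, "r_ts": ts, "w_ts": ts})
        return True

# R1(X) R2(X) W1(X) W2(X) with TS(T1)=1, TS(T2)=2
db = MVTO({"X": "x0"})
print(db.read(1, "X"), db.read(2, "X"))
print(db.write(1, "X", "a"))  # False: T2 (younger) has read the initial version
print(db.write(2, "X", "b"))  # True: new version with W_TS = 2

# A late read is served from an old version instead of aborting.
db2 = MVTO({"X": "x0"})
db2.write(2, "X", "b")
print(db2.read(1, "X"))       # T1 still sees the initial value
```

The second example shows the main difference from basic TO, where R1(X) after W2(X) would force T1 to abort.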
