ICS 214B: Transaction Processing and Distributed Data Management
Lecture 11: Concurrency Control and Distributed Commits
Professor Chen Li
Overview
• Concurrency Control
  • Schedules and Serializability
  • Locking
  • Timestamp Control
• Deadlocks
In centralized db
Transactions T1, T2, …, Tn all access a single DB (subject to consistency constraints)

In distributed db
Transactions T1 and T2 access data spread across nodes X, Y, and Z
Concepts (similar to centralized db)
• Transaction: sequence of ri(x), wi(x) actions
• Conflicting actions: r1(A) w2(A);  w1(A) w2(A);  w1(A) r2(A)
• Schedule: represents the chronological order in which actions are executed
• Serial schedule: no interleaving of actions or transactions
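To make the serializability test concrete, here is a small illustrative Python sketch (not part of the lecture; the schedule representation and function names are assumptions made for the example). It builds the precedence graph of a schedule from its conflicting actions and reports whether the graph is acyclic, i.e., whether the schedule is conflict-serializable.

# A schedule is a list of (transaction, action, item) tuples, e.g. ("T1", "r", "A").

def conflicts(a1, a2):
    """Two actions conflict if they come from different transactions,
    touch the same item, and at least one of them is a write."""
    t1, op1, x1 = a1
    t2, op2, x2 = a2
    return t1 != t2 and x1 == x2 and "w" in (op1, op2)

def is_conflict_serializable(schedule):
    # Precedence graph: edge Ti -> Tj if some action of Ti precedes
    # and conflicts with some action of Tj in the schedule.
    edges = set()
    for i, a1 in enumerate(schedule):
        for a2 in schedule[i + 1:]:
            if conflicts(a1, a2):
                edges.add((a1[0], a2[0]))

    def reaches(src, dst, visited):
        # Depth-first search along precedence edges.
        for a, b in edges:
            if a == src:
                if b == dst:
                    return True
                if b not in visited and reaches(b, dst, visited | {b}):
                    return True
        return False

    txns = {t for t, _, _ in schedule}
    # Conflict-serializable iff no transaction can reach itself (graph has no cycle).
    return not any(reaches(t, t, {t}) for t in txns)

# r1(A) w2(A) w1(A): edges T1 -> T2 and T2 -> T1 form a cycle, so not serializable.
print(is_conflict_serializable([("T1", "r", "A"), ("T2", "w", "A"), ("T1", "w", "A")]))  # False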
Example: constraint X = Y (X stored at node 1, Y stored at node 2)
T1:                          T2:
1 (T1) a ← X                 5 (T2) c ← X
2 (T1) X ← a+100             6 (T2) X ← 2c
3 (T1) b ← Y                 7 (T2) d ← Y
4 (T1) Y ← b+100             8 (T2) Y ← 2d
(the arrows in the slide show the precedence relation among these actions)
Precedence: intra-transaction and inter-transaction
Schedule S1
(node X)                     (node Y)
1 (T1) a ← X                 3 (T1) b ← Y
2 (T1) X ← a+100             4 (T1) Y ← b+100
5 (T2) c ← X                 7 (T2) d ← Y
6 (T2) X ← 2c                8 (T2) Y ← 2d
If X = Y = 0 initially, then X = Y = 200 at the end
Enforcing Serializability
• Locking
• Timestamp Ordering Schedulers
Locking Rules in centralized db (2-phase locking)
• Well-formed transactions
• Legal schedulers
• Two-phase transactions
These rules guarantee serializable schedules
Strict 2PL
(figure: number of locks held vs. time; the count grows while the transaction runs and drops to zero only at commit)
• Hold all locks until the transaction commits
• Called "Strict 2-phase locking"
• Strict 2PL automatically avoids cascading rollbacks
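As an illustration of these rules (my own sketch, assuming exclusive locks only and in-memory data; this is not the course's code), a toy strict 2PL lock manager in Python: locks are acquired one by one while the transaction runs and are all released in a single step at commit or abort.

import threading
from collections import defaultdict

class StrictTwoPLManager:
    """Toy strict 2PL lock manager: exclusive locks only, held until commit."""

    def __init__(self):
        self._cond = threading.Condition()
        self._owner = {}                  # item -> transaction currently holding its lock
        self._held = defaultdict(set)     # transaction -> set of items it has locked

    def lock(self, txn, item):
        """Growing phase: block until the item is free (or already ours), then take it."""
        with self._cond:
            while self._owner.get(item) not in (None, txn):
                self._cond.wait()
            self._owner[item] = txn
            self._held[txn].add(item)

    def release_all(self, txn):
        """Shrinking phase happens only here, at commit/abort, which is what makes this strict 2PL."""
        with self._cond:
            for item in self._held.pop(txn, set()):
                del self._owner[item]
            self._cond.notify_all()

A transaction calls lock() for each item it touches and release_all() exactly once at the end; the deadlocks discussed later in the lecture arise when two transactions block inside lock() waiting for each other.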
Two-phase Locking in distributed db
• Just like in a centralized system
• But with multiple lock managers: scheduler 1 holds the locks for D1 at node 1, scheduler 2 holds the locks for D2 at node 2, ...
• Transaction T accesses and locks data at each node it visits, and releases all locks at the end
Replicated data
(figure: item X is replicated at node 1 and node 2; each node's scheduler keeps locks for its local copy, and both T1 and T2 access X)
Replicated data
• Simplest scheme (read all, write all)
  • If T wants to read (write) data item X, T obtains read (write) locks for X at all sites that have X
• Better scheme (read one, write all)
  • If T wants to read X, T obtains a read lock at any one site that has X
  • If T wants to write X, T obtains write locks at all sites that have X
• More sophisticated schemes are possible
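A minimal sketch of the lock-placement decision under read-one-write-all (the replica map and site names below are made up for the example):

# Illustrative sketch of the "read one, write all" rule for replicated data.
# `copies` maps each item to the set of sites holding a replica (assumed layout).

copies = {"X": {"node1", "node2"}, "Y": {"node2", "node3"}}

def sites_to_lock(item, mode):
    """Return the sites at which a transaction must obtain locks on `item`."""
    if mode == "read":
        # Read one: a shared lock at any single copy is enough.
        return {next(iter(copies[item]))}
    elif mode == "write":
        # Write all: every copy must be locked exclusively.
        return set(copies[item])
    raise ValueError("mode must be 'read' or 'write'")

print(sites_to_lock("X", "read"))   # one of node1 / node2
print(sites_to_lock("X", "write"))  # {'node1', 'node2'}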
Timestamp Ordering Schedulers
• Basic idea:
  - assign a timestamp as each transaction begins
  - if ts(T1) < ts(T2) < … < ts(Tn), then the scheduler produces a history equivalent to T1, T2, ..., Tn
TO Rule
If pi[x] and qj[x] are conflicting operations, then pi[x] is executed before qj[x] (pi[x] <S qj[x]) IFF ts(Ti) < ts(Tj)
Example: schedule S2, ts(T1) < ts(T2)
(Node X)                     (Node Y)
(T1) a ← X                   (T2) d ← Y
(T1) X ← a+100               (T2) Y ← 2d
(T2) c ← X                   (T1) b ← Y    ← reject! abort T1
(T2) X ← 2c                  (T1) Y ← b+100
At node Y, T1's read of Y arrives after the younger T2 has already written Y, so the TO rule rejects it and T1 must abort (at both nodes). Because T2 has meanwhile read the value of X written by T1 at node X, T2 must be aborted as well: a cascading rollback.
Strict T.O.
• Lock written items until it is certain that the writing transaction has been successful (avoids cascading rollbacks)
Example Revisited (with strict T.O.), ts(T1) < ts(T2)
(Node X)                     (Node Y)
(T1) a ← X                   (T2) d ← Y
(T1) X ← a+100               (T2) Y ← 2d
(T2) c ← X    ← delay        (T1) b ← Y    ← reject! abort T1
(T2) X ← 2c   ← delay
With strict T.O., T2's operations on X are delayed until T1 finishes. When T1 is rejected at node Y and aborted at both nodes, T2 has not yet seen T1's write of X, so only T1 aborts and no cascading rollback occurs.
Enforcing T.O.
For each data item X:
  MAX_R[X]: maximum timestamp of a transaction that read X
  MAX_W[X]: maximum timestamp of a transaction that wrote X
  rL[X]: number of transactions currently reading X (0, 1, 2, …)
  wL[X]: number of transactions currently writing X (0 or 1)
T.O. Scheduler - Part 1
ri[X] arrives:
IF (ts(Ti) < MAX_W[X]) THEN
  { ABORT Ti }
ELSE {
  IF (ts(Ti) > MAX_R[X]) THEN MAX_R[X] ← ts(Ti);
  IF (queue is empty AND wL[X] = 0) THEN
    { rL[X] ← rL[X] + 1; START READ OF X }
  ELSE add (r, Ti) to queue
}
T.O. Scheduler - Part 2
wi[X] arrives:
IF (ts(Ti) < MAX_W[X] OR ts(Ti) < MAX_R[X]) THEN
  { ABORT Ti }
ELSE {
  MAX_W[X] ← ts(Ti);
  IF (queue is empty AND wL[X] = 0 AND rL[X] = 0) THEN
    { wL[X] ← 1; WRITE X; // WAIT FOR Ti TO FINISH }
  ELSE add (w, Ti) to queue
}
T.O. Scheduler - Part 3
When operation o finishes (o is r or w) on X:
oL[X] ← oL[X] - 1;
NDONE ← TRUE;
WHILE NDONE DO {
  IF (queue is empty) THEN NDONE ← FALSE
  ELSE {
    Let head of queue be (q, Tj);   // smallest timestamp
    IF (q = w AND rL[X] = 0 AND wL[X] = 0) THEN
      { Remove (q, Tj); wL[X] ← 1; WRITE X; // WAIT FOR Tj TO FINISH }
    ELSE IF (q = r AND wL[X] = 0) THEN
      { Remove (q, Tj); rL[X] ← rL[X] + 1; START READ OF X }
    ELSE NDONE ← FALSE
  }
}
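For illustration only, the arrival checks of Parts 1 and 2 rewritten as a small Python class (names are my own, timestamps are assumed to be positive, and the completion step of Part 3 is omitted for brevity):

class TOItem:
    def __init__(self):
        self.max_r = 0      # MAX_R[X]: largest timestamp that has read this item
        self.max_w = 0      # MAX_W[X]: largest timestamp that has written this item
        self.r_count = 0    # rL[X]: active readers
        self.w_count = 0    # wL[X]: active writer (0 or 1)
        self.queue = []     # pending (op, ts) requests

    def read_arrives(self, ts):
        if ts < self.max_w:
            return "abort"                       # read arrives after a younger write
        self.max_r = max(self.max_r, ts)
        if not self.queue and self.w_count == 0:
            self.r_count += 1
            return "start read"
        self.queue.append(("r", ts))
        return "queued"

    def write_arrives(self, ts):
        if ts < self.max_w or ts < self.max_r:
            return "abort"                       # write arrives too late
        self.max_w = ts
        if not self.queue and self.w_count == 0 and self.r_count == 0:
            self.w_count = 1
            return "start write"
        self.queue.append(("w", ts))
        return "queued"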
Starvation possible
If a transaction is aborted, it must be retried with a new, larger timestamp
(figure: T with ts(T) = 8 tries to read X while MAX_R[X] = 10 and MAX_W[X] = 9, so T is aborted and retried with ts(T) = 11)
Theorem
If S is a schedule representing an execution by a T.O. scheduler, then S is serializable
Improvement: Thomas Write Rule
(figure: timeline with MAX_R[X] and MAX_W[X]; Ti wants to write X with MAX_R[X] < ts(Ti) < MAX_W[X])

Change in T.O. Scheduler
When wi[X] arrives:
IF (ts(Ti) < MAX_R[X]) THEN ABORT Ti
ELSE IF (ts(Ti) < MAX_W[X]) THEN
  { IGNORE THIS WRITE (tell Ti it was OK) }
ELSE { process the write as before… }
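Continuing the illustrative sketch above, the Thomas Write Rule changes only the write-arrival check: a write that is older than the latest write, but not older than the latest read, can simply be skipped.

def write_arrives_thomas(item, ts):
    if ts < item.max_r:
        return "abort"              # a younger transaction already read the old value
    if ts < item.max_w:
        return "ignore write"       # obsolete write: skip it, but report success to Ti
    return item.write_arrives(ts)   # otherwise process the write as before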
2PL vs. T.O.: Example 1
T1: r1[X] r1[Y] w1[Z]        ts(T1) < ts(T2)
T2: w2[X]
S: r1[X] w2[X] r1[Y] w1[Z]
S could be produced with T.O. but not with 2PL

2PL vs. T.O.: Example 2
T1: r1[X] r1[Y] w1[Z]        ts(T1) < ts(T2)
T2: w2[Y]
S: r1[X] w2[Y] r1[Y] w1[Z]
S could be produced with 2PL but not with T.O.
Relationship between 2PL and T.O.
(Venn diagram: the 2PL schedules and the T.O. schedules are two overlapping, incomparable subsets of the serializable schedules)
Distributed T.O. Scheduler
(figure: transaction T accesses data at node 1 and node 2; scheduler 1 keeps a timestamp cache for D1, scheduler 2 keeps one for D2)
• Each scheduler is "independent"
• At the end of the transaction, signal all schedulers involved to release all wL[X] locks
Next: Deadlocks
• If nodes use 2-phase locking, global deadlocks are possible
• Local wait-for graph (WFG) at each node has no cycles
(figure: node 1's WFG has the edge T1 → T2, node 2's WFG has the edge T2 → T1; neither local graph has a cycle)
Need to "combine" WFGs to discover the global deadlock, e.g., at a central detection node
(figure: the two local WFGs T1 → T2 and T2 → T1 are sent to the central detection node; their union contains the cycle T1 → T2 → T1)
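A small illustrative sketch (not the course's algorithm text) of what a central detection node does: take the union of the local WFGs and look for a cycle.

def global_deadlock(local_wfgs):
    """local_wfgs: list of edge sets, each edge (Ti, Tj) meaning Ti waits for Tj."""
    edges = set().union(*local_wfgs)
    succ = {}
    for a, b in edges:
        succ.setdefault(a, set()).add(b)

    def has_cycle_from(node, seen):
        # Depth-first search; revisiting a node on the current path means a cycle.
        for nxt in succ.get(node, ()):
            if nxt in seen or has_cycle_from(nxt, seen | {nxt}):
                return True
        return False

    return any(has_cycle_from(t, {t}) for t in succ)

# Node 1 sees T1 -> T2, node 2 sees T2 -> T1: no local cycle, but a global one.
print(global_deadlock([{("T1", "T2")}, {("T2", "T1")}]))   # True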
Deadlocks
• Local vs. global
• Deadlock detection
  • Waits-for graph
  • Timeouts
• Deadlock prevention
  • Wound-wait
  • Wait-die
• Covered in ICS 214A
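For reference, the two prevention rules reduce to a simple timestamp comparison; a minimal sketch (transaction objects and lock machinery omitted, names made up for the example):

# Ti (the requester) asks for a lock that Tj (the holder) currently owns.

def wait_die(ts_i, ts_j):
    # Older transactions may wait; younger ones die (abort and restart later).
    return "wait" if ts_i < ts_j else "die (abort Ti)"

def wound_wait(ts_i, ts_j):
    # Older transactions wound (abort) the holder; younger ones wait.
    return "wound (abort Tj)" if ts_i < ts_j else "wait"

print(wait_die(5, 9))     # older requester waits
print(wound_wait(5, 9))   # older requester wounds the younger holder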
Summary
• 2PL
  - the most popular
  - deadlocks possible
  - useful in distributed systems
• T.O.
  - aborts more likely
  - no deadlocks
  - useful in distributed systems
Next:
• Reliable distributed database management
• Dealing with failures
• Distributed commit algorithms
• The "two generals" problem
Reliability
• Correctness
  • Serializability
  • Atomicity
  • Persistence
• Availability
Types of failures
• Processor failures
  • Halt, delay, restart, berserk, ...
• Storage failures
  • Atomic write, transient errors, disk crash
• Communication (network) failures
  • Lost messages, out-of-order messages, partitions
Failure models
• Cannot protect against everything
  • Unlikely failures (e.g., flooding in the Sahara)
  • Failures too expensive to protect against (e.g., earthquake)
• Failures we know how to protect against (e.g., with message sequence numbers; stable storage)
Failure model
(figure: events classified as desired vs. undesired and expected vs. unexpected)
Node models (1): Fail-stop nodes
(figure: timeline in which the node runs perfectly, halts, recovers, and runs perfectly again)
• On failure, volatile memory is lost
• Stable storage is ok
Node models (2): Byzantine nodes
(figure: nodes A, B, C; a node may run perfectly, fail in an arbitrary way, and then recover)
• At any given time, at most some fraction f of the nodes have failed (typically f < 1/2 or f < 1/3)
Network models (1): Reliable network
- In-order messages
- No spontaneous messages
- Timeout TD: if no ack arrives within TD seconds, the destination is down (not merely paused)
I.e., no lost messages, except for node failures
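An illustrative sketch of the timeout rule, assuming a made-up UDP request/ack exchange (the addresses, port, and message format are not from the lecture):

import socket

TD = 5.0  # timeout in seconds

def send_and_check(dest_addr, payload):
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.settimeout(TD)
    s.sendto(payload, dest_addr)
    try:
        ack, _ = s.recvfrom(1024)
        return "destination alive, ack: " + ack.decode()
    except socket.timeout:
        # Under the fail-stop + reliable-network model, silence for TD
        # seconds means the destination is down, not merely slow.
        return "no ack within TD -> assume destination down"
    finally:
        s.close()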
Variation of reliable net: persistent messages
• If the destination is down, the net will eventually deliver the message
• Simplifies node recovery, but leads to inefficiencies
  • Just moves the problem one level lower down the stack
• Not considered here
Network models (2): Partitionable network
- In-order messages
- No spontaneous messages
- Nodes can have different views of failures
Scenarios
• Reliable network, fail-stop nodes
  • No data replication (1)
  • Data replication (2)
• Partitionable network, fail-stop nodes (3)
No Data Replication
• Reliable network, fail-stop nodes
• Basic idea: node P controls X
(figure: P is the only node on the net that holds item X)
- A single control point simplifies concurrency control and recovery
- Note the availability hit: if P is down, X is unavailable too!
"P controls X" means
- P does concurrency control for X
- P does recovery for X
Say transaction T wants to access X:
(figure: PT, a process that represents T at P's node, sends a request to the local DBMS, which manages X together with its lock manager and LOG)
Distributed commit problem
(figure: transaction T performs actions a1, a2 at one node, a3 at a second node, and a4, a5 at a third node)
• Commit must be atomic
Distributed commit problem
• Commit must be atomic
• Solution: two-phase commit (2PC)
  • Centralized 2PC
  • Distributed 2PC
  • Linear 2PC
  • Many other variants…
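As a preview of centralized 2PC (covered in detail next), a minimal Python sketch of the coordinator's logic; the participants list and its prepare/commit/abort interface are assumptions made for the example.

def two_phase_commit(participants):
    # Phase 1 (voting): ask every participant to prepare.
    votes = []
    for p in participants:
        try:
            votes.append(p.prepare())      # True = "ready to commit"
        except Exception:
            votes.append(False)            # a crash or timeout counts as a NO vote

    # Phase 2 (decision): commit only if every participant voted yes.
    if all(votes):
        for p in participants:
            p.commit()
        return "committed"
    else:
        for p in participants:
            p.abort()
        return "aborted"

The hard part, and the reason "commit must be atomic" is a problem at all, is making the phase-2 decision reach every participant despite node and network failures; the variants listed above differ in who collects the votes and how the decision is propagated.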