460 likes | 557 Views
What Happens Here (1)?. 1. CREATE TABLE pairs ( x INTEGER, y INTEGER);. 2. select * from pairs 3. insert into pairs values(1,1) 6. select * from pairs 8. COMMIT. 4. select * from pairs 5. insert into pairs values(1,2) 7. select * from pairs 9. COMMIT. T.1: R. COMMITTED.
E N D
What Happens Here (1)? 1. CREATE TABLE pairs (x INTEGER, y INTEGER); 2. select * from pairs 3. insert into pairs values(1,1) 6. select * from pairs 8. COMMIT 4. select * from pairs 5. insert into pairs values(1,2) 7. select * from pairs 9. COMMIT T.1: R. COMMITTED T.2: SERIALIZABLE
What Happens Here (2)? 1. CREATE TABLE pairs (x INTEGER, y INTEGER); 2. select * from pairs 6. select * from pairs 7. insert into pairs values(1,2) 8. COMMIT 3. insert into pairs values(1,1) 4. COMMIT 5. select * from pairs 9. select * from pairs 10. COMMIT SERIALIZABLE R. COMMITTED
What Happens Here (3)? 1. CREATE TABLE pairs (x INTEGER, y INTEGER); 2. select * from pairs 6. select * from pairs 7. insert into pairs values(1,2) 8. COMMIT 3. insert into pairs values(1,1) 4. COMMIT 5. select * from pairs 9. select * from pairs 10. COMMIT SERIALIZABLE Is it equivalent to any truly serial execution of the transactions? No Serial would make the green run before the light-blue and the select in the orange would act differently SERIALIZABLE
Transaction Management Introduction SQL Isolation Levels Determining Serialiazability Ensuring Serializability
Transactions • A transaction is a list of actions. • The actions are reads (written R(O)) and writes (written W(O)) of database objects. • The subscript is left out if clear from the context Example: T1: R(V), R(Y), W(V), W(C)
Properties of Transactions • The database allows transactions to run concurrently, while ensuring the following properties: • Atomicity:Either all the actions of the transaction are carried out or none are. • Consistency:The effect of running the transactions concurrently is the same as running them serially, one after another (* not exactly implemented *) • Isolation: Transactions are isolated from effects of other transactions running concurrently • Durability: Once a transaction completes (commits), the result of the transaction must persist in the system
Schedules • A schedule is a list of actions from a set of transactions. • Intuitively, this a specification of how transactions will be run concurrently • The order in which 2 actions of a transaction T appear in a schedule must be the same as the order in which they appear in T. • We will usually write schedules from left to right (sometimes from top to bottom)
Schedule Example • T1: R(V) W(V) • T2: R(Y) W(Y) • Which of the following are schedules of the transactions? S1 S1: S2:
Serializable Schedules • A schedule is serial if the actions of the different transactions are not interleaved; they are executed one after another • A schedule is serializableif its effect on the database is the same as that of some serial schedule • Note: We assume that there are no effects other than the effects to the database, i.e., no writing to the screen • We usually only want to allow serializable schedules to be performed. Why?
Conflicts • There are several different types of conflicts between transactions that may occur • Such conflicts may cause a schedule to not be serializable • 3 Types of conflicts • WR, RW, WW
WR Conflicts • WR Conflicts: There is a WR conflict between T1 and T2 if there is an item Y s.t. • T1 writes Y and afterwards, T2 reads Y • Note: If T1 has not committed, this is a dirty read • Find all the WR conflicts in the following schedule: T1: W(Y) T2: R(V) R(Y) W(Z) T3: W(V)
RW Conflicts • RW Conflicts: There is a RW conflict between T1 and T2 if there is an item Y s.t. • T1 reads Y and afterwards, T2 writes Y • Note: This read becomes unrepeatable • Find all the RW conflicts in the following schedule: T1: W(Y) T2: R(V) R(Y) W(Z) T3: W(V)
WW Conflicts • WW Conflicts: There is a WW conflict between T1 and T2 if there is an item Y s.t. • T1 writes Y and afterwards, T2 writes Y • Note: This write becomes overwritten • Find all the WW conflicts in the following schedule: T1: W(Y) T2: W(V)W(Y) W(Z) T3: W(V)
Swapping Actions • Summary: Actions conflict if they are from: • Different transactions • Involve the same data item • One of the actions is a write • It is easy to see that we can make as many non-conflicting swaps as we want, without changing the outcome of the schedule. serialized after swap means Bar Sidur T1: R(Y) T2: R(V) R(Y) W(Y) T1: R(Y) T2: R(V) R(Y) W(Y) T1: R(Y) T2: R(V) R(Y) W(Y)
Conflict Serializable Schedules • Two schedules are conflict equivalent if they can be turned one into the other by a sequence of non-conflicting swaps of adjacent actions • A schedule is conflict serializableif it is conflict equivalent to some serial schedule. • Conflict serializable schedules are also serializable (but not necessarily vice-versa).
Conflict Equivalent: Example Are any of the following schedules conflict equivalent? 1 and 2 – only RR swap T1: W(V) R(V) W(V) T2: R(V) T1: W(V) R(V) W(V) T2: R(V) T1: W(V) R(V) W(V) T2: R(V)
Checking for Conflict-Serializability: Precedence Graph • Given a schedule we can create a precedence graph • The graph has a node for each transaction • There is an edge from T1 to T2 if there is a conflicting action between T1 and T2 in which T1 occurs first • The schedule is conflict serializable if and only if there is no cycle in the precedence graph!! • An equivalent serial schedule is defined by a topological sort of the precedence graph
Example Which of the schedules are conflict serializable? 2,3 T1: W(V) W(V) T1 T2 (circle) T2: R(V) T1: R(V) W(V) T1 T2 (no circles) T2: R(V) T1: W(Y) T2: R(V) R(Y) W(Z) T3: W(V) T1T2T3 (no circles)
חוקים • מותר להחליף סדר בין 2 פעולות צמודות של תנועות שונות שהם ללא קונפליקט. • זוג פעולות הם ללא קונפליקט אם הם מתייחסים לפרט מידע שונה או שהם פעולות קריאה.
Conflict Serializable: Why does it work? • Lemma: Let S be a conflict-serializable schedule. Let S’ be a serial schedule derived by a topological sort of the precedence graph of S. Then S and S’ are conflict equivalent משפט: S הוא בר סידור קונפליקטים אם"ם אין מעגל בגרף הקדימויות של S. • Proof:On the blackboard.
Conflict Serializable: Why does it work? הוכחה: נניח שיש מעגל בגרף הקדימויות: נניח על דרך השלילה ש S’ הוא סדרתי ושקול קונפליקטים ל S. אזי T1 חייב להתבצע לפני T2 ב S’, כי יש ביניהם פעולה עם קונפליקט שבו T1 קודם ואסור להחליף סדר של פעולות אלה. לפי אותו עקרון נקבל ש T1 < T2 < T3 <… < Tn < T1, וקיבלנו סתירה. נניח שאין מעגל בגרף הקדימויות: אזי יש Ti שאין לו צלע נכנסת. אזי בכל הפעולות של Ti שבו יש קונפליקט עם תנועה אחרת, Ti קודם. אז ע"י החלפת זוגות צמודים של פעולות ללא קונפליקט נביא את Ti לתחילת התזמון. נוכל למחוק את Ti מהגרף ולהמשיך באותה צורה עד לקבלת תזמון סדרתי.
Conflict Serializable: Why Not? Is the following schedule conflict serializable? T1: R(V) W(V) T2: W(V) T3: W(V) T1T2T1T3, T2T3 (circle) לא בר סידור קונפליקטים כיוון שיש מעגל בגרף. הוא כן בר סידור. כיוון שהאפקט הסופי על מסד הנתונים הוא כמו ריצה סדרתית. Note that it is serializable! The writes of T2 and T3 are called blind writes
View Serializable • Two schedules S1 and S2 are view equivalent if • they involve the same set of actions of the same transactions and • if Ti reads the initial value of X in S1 then it must also read the initial value of X in S2 and • if Ti reads the value of X written by Tj in S1 then it must also read the value of X written by Tj in S2 and • For each data object X, the transaction (if any) that performs the final write on X in S1 must also perform the final write on X in S2 • A schedule is view serializableif it is view equivalent to some serializable schedule.
Example Which of schedules are view serializable? Yes. T1 T2 T3 (T3 – must write last.) T1: R(V) W(V) T2: W(V) T3: W(V) No. T1 must be before T2, since T2 writes V to the DB T1 must be after T2, since it needs to write to end writing of V. T1: R(V) W(V) T2: W(V) T3: R(V)
Serializable vs. View Serializable Is the following schedule view serializable? T1 must be first since it reads V, and it must read the last value of Y, but T2 change it. Though it is still conflict serializable. T1: R(V) R(Y) C T2: W(V) W(Y) C Note that it is serializable!
Transaction Management Introduction SQL Isolation Levels Determining Serialiazability Ensuring Serializability
Ensuring Serializability • So far, we have shown a sufficient condition that allows us to check if a schedule is serializable • We now consider the problem of ensuring serializability during runtime • Problem: We don’t know which transactions will run and which items they will access • Different strategies for ensuring serializability • Lock-based concurrency control • Multi-version concurrency control
Lock-Based Concurrency Control T1 Want to Access Y Lock Table Scheduler Y, T1
Lock-Based Concurrency Control T1 T2 Want to Access Z Lock Table Scheduler Y, T1 Z, T2
Lock-Based Concurrency Control: Intuition T1 T2 Wait Want to Access Y Lock Table Scheduler Y, T1 Z, T2
Lock Based Concurrency Control • Transactions must lock objects before using them. • A shared lock (S-lock) is acquired on Y before reading Y. Many transactions can hold a shared lock on Y. • An exclusive lock(X-lock) is acquired on Y before writing Y. A transaction can hold an exclusive lock on Y only if no other transaction holds any kind of lock on Y. • If a transaction has an exclusive lock on Y, it can also read Y • Transaction unlock the objects when they are no longer needed
Example • In the following schedule, the exclusive lock, shared lock and unlock commands are written explicitly: T1: X(B) W(B) U(B) T2: S(A) R(A) U(A) X(B) W(B) U(B)
2 Phase Locking Protocol 2 Phase Locking (2PL): • Each transaction must get an S-lock (shared lock) on an object before reading it • Each transaction must get an X-lock (exclusive lock) on an object before writing it • A transaction cannot request additional locks once it releases any lock זה מה שידאג שהמצבים יהיו שקולים קונפליקטים • Any schedule that conforms to 2PL is conflict-serializable!
2 Phase Locking Protocol 2 Phase Locking (2PL): טענה: התזמון השקול קונפליקטים הסדרתי מתקבל על ידי סידור התנועות לפי סדר שחרור המנעול הראשון שלהן. הוכחה:נניח ש Ti מבצע את השחרור הראשון מבין כל התנועות. נניח על דרך השלילה של Tj יש קונפליקט עם Ti כך ש Tj מבצע ראשון פעולה. Ti: X(Y)W(Y) Tj: w(Y)U(Y) כיוון שהנחנו ש Ti מבצע את השחרור הראשון זה מחייב ש Ti שחרר משהו לפני הכתיבה ב Tj. אולם זה בסתירה לפרוטוקול 2PL, כיוון שעל פי הפרוטוקול Ti לא יכול לבקש מנעול לאחר שחרור.
Examples • Which of the following conform to the 2PL protocol? The second one. T1: X(B) W(B) U(B) T2: S(A) R(A) U(A) X(B) W(B) U(B) T1: X(B) W(B) U(B) T2: S(A) R(A) X(B) U(A) W(B) U(B)
Problem with 2PL: Cascading Rollbacks • Consider the following schedule: T1: X(A) S(B) W(A) U(A) R(B) U(B) Abort T2: S(A)R(A)X(A) W(A) U(A) Commit • Does this schedule conform to 2PL? Yes. • What problem occurs? • Reading A in T2 after the writing A in T1, is using the changed input of A, while T1 finally aborts. • Meaning, T2 reads dirty read.
Strict 2 Phase Locking Protocol Strict 2 Phase Locking (Strict 2PL): • Each transaction must get an S-lock (shared lock) on an object before reading it • Each transaction must get an X-lock (exclusive lock) on an object before writing it • A transaction releases all locks only when the transaction is completed (i.e., at the end, when performing commit/rollback) • This protocol avoids cascading aborts
Example • Is the following schedule conflict-serializable? Can it be achieved using 2PL? using Strict 2PL? • i.e., can you add lock and unlock statements in a way that will ensure that the schedule is in 2PL or Strict 2PL? T1: R(V) R(Z) R(Y) T2: R(Y) W(V) T3: W(V) W(Z)
Example • Is the following schedule conflict-serializable? Yes הערה:אם S לא בר סידור קונפליקטים אז S לא ניתן להשגה ע"י 2PL או strict 2PL. • Can it be achieved using 2PL? using Strict 2PL? • i.e., can you add lock and unlock statements in a way that will ensure that the schedule is in 2PL or Strict 2PL? לא strict 2PL: כי T3 צריך לבצע נעילה אקסקלוסיבית על V, ולפי strict 2PL ה unlock הוא רק בסוף T3 (אחרי הכתיבה של Z). T2 צריך גם לקבל X-Lock על V. כלומר, T2 ו T3 מחזיקים X-Lock על V באותו רגע. כן 2PL: נסמן Ri(V) = Ti קרא את V, Wi(V) באופן דומה וכו'. S1(V) S1(Y) S1(Z) R1(V) U1(V) S2(Y) R2(Y) X3(V) W3(V) R1(Z) U(Z) X3(Z) U2(V) X2(V) W2(V) U2(V) R1(Y) U1(Y) W3(Z) U3(Z)
What can go wrong with Strict 2PL? • Consider the following: • T1 requests (and gets) an X-lock on A • T2 requests (and gets) an X-lock on B • T1 requests an X-lock on B. T1 is suspended to wait for this lock • T2 requests an X-lock on A. T2 is suspended to wait for this lock
Deadlocks • Deadlock: Cycle of transactions waiting for locks to be released by each other. • Two ways of dealing with deadlocks: • Deadlock detection • Deadlock prevention
Deadlock Detection • Create a waits-for graph: • Nodes are transactions • There is an edge from Ti to Tj if Ti is waiting for Tj to release a lock • Periodically check for cycles in the waits-for graph • When a cycle is found, abort one transaction in the cycle
Deadlock Detection (Continued) T1: S(A), R(A), S(B) T2: X(B),W(B) X(C) T3: S(C), R(C) X(A) T4: X(B) הערה:ה X וה Sים כאן הם בקשות לנעילה (שלאו דוקא מתקבלות) T1 T2 T3 T1 T4 T2 What will the waits-for-graph look like?
Deadlock Prevention • Assign priorities based on timestamps. Assume Ti wants a lock that Tj holds. Two policies are possible: • Wait-Die: It Ti has higher priority, Ti waits for Tj; otherwise Ti aborts • Wound-wait: If Ti has higher priority, Tj aborts; otherwise Ti waits • If a transaction re-starts, make sure it has its original timestamp • Both policies: deadlocks are prevented!