1.13k likes | 1.3k Views
CSE544 Transactions: Concurrency Control. Lectures #5-6 Thursday, January 20, 2011 Tuesday, January 25, 2011. Reading Material for Lectures 5-7. Main textbook ( Ramakrishnan and Gehrke ): Chapters 16, 17, 18 Mike Franklin’s paper More background material: Garcia-Molina, Ullman, Widom :
E N D
CSE544Transactions: Concurrency Control Lectures #5-6 Thursday, January 20, 2011 Tuesday, January 25, 2011 Dan Suciu -- 544, Winter 2011
Reading Material for Lectures 5-7 Main textbook (Ramakrishnan and Gehrke): • Chapters 16, 17, 18 Mike Franklin’s paper More background material: Garcia-Molina, Ullman, Widom: • Chapters 17.2, 17.3, 17.4 • Chapters 18.1, 18.2, 18.3, 18.8, 18.9 Dan Suciu -- 544, Winter 2011
Transactions • The problem: An application must perform several writes and reads to the database, as a unity • Solution: multiple actions of the application are bundled into one unit called Transaction Dan Suciu -- 544, Winter 2011
Turing Awards to Database Researchers • Charles Bachman 1973 for CODASYL • Edgar Codd 1981 for relational databases • Jim Gray 1998 for transactions Dan Suciu -- 544, Winter 2011
The Need for Transactions • What can go wrong ? • System crashes • Anomalies during concurrent access: three are famous Dan Suciu -- 544, Winter 2011
Crashes Client 1: UPDATE Accounts SET balance= balance - 500WHERE name= ‘Fred’ UPDATE Accounts SET balance = balance + 500WHERE name= ‘Joe’ Crash ! What’s wrong ? Dan Suciu -- 544, Winter 2011
Three Famous Anomalies • Lost update – what is it ? • Dirty read – what is it ? • Inconsistent read – what is it ? Dan Suciu -- 544, Winter 2011
The Three Famous anomalies • Lost update • Two tasks T and T’ both modify the same data • T and T’ both commit • Final state shows effects of only T, but not of T’ • Dirty read • T reads data written by T’ while T’ has not committed • What can go wrong: T’ write more data (which T has already read), or T’ aborts • Inconsistent read • One task T sees some but not all changes made by T’
1st Famous Anomaly: Lost Updates Client 1: UPDATE CustomerSET rentals= rentals + 1WHEREcname= ‘Fred’ Client 2: UPDATE CustomerSET rentals= rentals + 1WHEREcname= ‘Fred’ Two people attempt to rent two movies for Fred,from two different terminals. What happens ? Dan Suciu -- 544, Winter 2011
2nd Famous Anomaly: Dirty Reads Client 1: transfer $100 acc1 acc2 X = Account1.balance Account2.balance += 100 If (X>=100) Account1.balance −=100else { /* rollback ! */ account2.balance −= 100 println(“Denied !”) Client 2: transfer $100 acc2 acc3 Y = Account2.balance Account3.balance += 100 If (Y>=100) Account2.balance −=100else { /* rollback ! */ account3.balance −= 100 println(“Denied !”) What’s wrong ? Dan Suciu -- 544, Winter 2011
3rd Famous Anomaly: Inconsistent Read Client 1: move from gizmogadget UPDATE Products SET quantity = quantity + 5WHERE product = ‘gizmo’ UPDATE Products SET quantity = quantity - 5WHERE product = ‘gadget’ Client 2: inventory…. SELECT sum(quantity) FROM Product Dan Suciu -- 544, Winter 2011
Transactions: Definition • A transaction = one or more operations, which reflects a single real-world transition • Happens completely or not at all; all-or-nothing • Examples • Transfer money between accounts • Rent a movie; return a rented movie • Purchase a group of products • Register for a class (either waitlisted or allocated) • By using transactions, all previous problems disappear Dan Suciu -- 544, Winter 2011
Transactions in Applications May be omitted:first SQL querystarts txn START TRANSACTION [SQL statements] COMMIT or ROLLBACK (=ABORT) In ad-hoc SQL: each statement = one transaction Dan Suciu -- 544, Winter 2011
ACID Properties • Atomic • What is it ? • Consistent • What is it ? • Isolated • What is it ? • Durable • What is it ? Dan Suciu -- 544, Winter 2011
ACID Properties • Atomic • State shows either all the effects of txn, or none of them • Consistent • Txn moves from a state where integrity holds, to another where integrity holds • Isolated • Effect of txns is the same as txns running one after another (ie looks like batch mode) • Durable • Once a txn has committed, its effects remain in the database Dan Suciu -- 544, Winter 2011
Concurrency Control • Multiple concurrent transactions T1, T2, … • They read/write common elements A1, A2, … • How can we prevent unwanted interference ? The SCHEDULER is responsible for that Dan Suciu -- 544, Winter 2011
Schedules A schedule is a sequenceof interleaved actions from all transactions Dan Suciu -- 544, Winter 2011
Example Dan Suciu -- 544, Winter 2011
A Serial Schedule Dan Suciu -- 544, Winter 2011
Serializable Schedule A schedule is serializable if it is equivalent to a serial schedule Dan Suciu -- 544, Winter 2011
A Serializable Schedule This is NOT a serial schedule,but is serializable Dan Suciu -- 544, Winter 2011
A Non-Serializable Schedule Dan Suciu -- 544, Winter 2011
Serializable Schedules • The role of the scheduler is to ensure that the schedule is serializable Q: Why not run only serial schedules ? I.e. run one transactionafter the other ? Dan Suciu -- 544, Winter 2011
Serializable Schedules • The role of the scheduler is to ensure that the schedule is serializable Q: Why not run only serial schedules ? I.e. run one transactionafter the other ? A: Because of very poor throughput due to disk latency. Lesson: main memory databases may do serial schedules only Dan Suciu -- 544, Winter 2011
ASerializableSchedule Schedule is serializablebecause t=t+100 ands=s+200 commute We don’t expect the scheduler to schedule this Dan Suciu -- 544, Winter 2011
Ignoring Details • Assume worst case updates: • We never commute actions done by transactions • As a consequence, we only care about reads and writes • Transaction = sequence of R(A)’s and W(A)’s T1: r1(A); w1(A); r1(B); w1(B) T2: r2(A); w2(A); r2(B); w2(B) Dan Suciu -- 544, Winter 2011
Conflicts • Write-Read – WR • Read-Write – RW • Write-Write – WW Dan Suciu -- 544, Winter 2011
Conflicts Two actions by same transaction Ti: ri(X); wi(Y) Two writes by Ti, Tj to same element wi(X); wj(X) wi(X); rj(X) Read/write by Ti, Tj to same element ri(X); wj(X) A “conflict” means: you can’t swap the two operations Dan Suciu -- 544, Winter 2011
Conflict Serializability • A schedule is conflict serializable if it can be transformed into a serial schedule by a series of swappings of adjacent non-conflicting actions Example: r1(A); w1(A); r2(A); w2(A); r1(B); w1(B); r2(B); w2(B) r1(A); w1(A); r1(B); w1(B); r2(A); w2(A); r2(B); w2(B)
The Precedence Graph Test Is a schedule conflict-serializable ? Simple test: • Build a graph of all transactions Ti • Edge from Ti to Tj if Ti makes an action that conflicts with one of Tj and comes first • The test: if the graph has no cycles, then it is conflict serializable ! Dan Suciu -- 544, Winter 2011
Example 1 r2(A); r1(B); w2(A); r3(A); w1(B); w3(A); r2(B); w2(B) 1 2 3 Dan Suciu -- 544, Winter 2011
Example 1 r2(A); r1(B); w2(A); r3(A); w1(B); w3(A); r2(B); w2(B) B A 1 2 3 This schedule is conflict-serializable Dan Suciu -- 544, Winter 2011
Example 2 r2(A); r1(B); w2(A); r2(B); r3(A); w1(B); w3(A); w2(B) 1 2 3 Dan Suciu -- 544, Winter 2011
Example 2 r2(A); r1(B); w2(A); r2(B); r3(A); w1(B); w3(A); w2(B) B A 1 2 3 B This schedule is NOT conflict-serializable Dan Suciu -- 544, Winter 2011
View Equivalence • A serializable schedule need not be conflict serializable, even under the “worst case update” assumption w1(X); w2(X); w2(Y); w1(Y); w3(Y); Is this schedule conflict-serializable ? Dan Suciu -- 544, Winter 2011
View Equivalence • A serializable schedule need not be conflict serializable, even under the “worst case update” assumption w1(X); w2(X); w2(Y); w1(Y); w3(Y); Is this schedule conflict-serializable ? No… Dan Suciu -- 544, Winter 2011
View Equivalence • A serializable schedule need not be conflict serializable, even under the “worst case update” assumption w1(X); w2(X); w2(Y); w1(Y); w3(Y); Lost write w1(X); w1(Y); w2(X); w2(Y); w3(Y); Equivalent, but not conflict-equivalent Dan Suciu -- 544, Winter 2011
View Equivalence Lost Serializable, but not conflict serializable Dan Suciu -- 544, Winter 2011
View Equivalence Two schedules S, S’ are view equivalent if: • If T reads an initial value of A in S, then T also reads the initial value of A in S’ • If T reads a value of A written by T’ in S, then T also reads a value of A written by T’ in S’ • If T writes the final value of A in S, then it writes the final value of A in S’ Dan Suciu -- 544, Winter 2011
View-Serializability A schedule is view serializable if it is view equivalent to a serial schedule Remark: • If a schedule is conflict serializable, then it is also view serializable • But not vice versa Dan Suciu -- 544, Winter 2011
Schedules with Aborted Transactions • When a transaction aborts, the recovery manager undoes its updates • But some of its updates may have affected other transactions ! Dan Suciu -- 544, Winter 2011
Schedules with Aborted Transactions Cannot abort T1 because cannot undo T2 Dan Suciu -- 544, Winter 2011
Recoverable Schedules A schedule is recoverable if: • It is conflict-serializable, and • Whenever a transaction T commits, all transactions who have written elements read by T have already committed Dan Suciu -- 544, Winter 2011
Recoverable Schedules Nonrecoverable Recoverable
Cascading Aborts • If a transaction T aborts, then we need to abort any other transaction T’ that has read an element written by T • A schedule is said to avoid cascading aborts if whenever a transaction read an element, the transaction that has last written it has already committed. Dan Suciu -- 544, Winter 2011
Avoiding Cascading Aborts With cascading aborts Without cascading aborts
Review of Schedules Serializability Recoverability Recoverable Avoiding cascading deletes Serial Serializable Conflict serializable View serializable Dan Suciu -- 544, Winter 2011
Review Questions • What is a schedule ? • What is a serializableschedule ? • What is a conflict ? • What is a conflict-serializable schedule ? • What is a view-serializable schedule ? • What is a recoverable schedule ? • When does a schedule avoid cascading aborts ? Dan Suciu -- 544, Winter 2011
Scheduler • The scheduler is the module that schedules the transaction’s actions, ensuring serializability • Two main approaches • Pessimistic scheduler: uses locks • Optimistic scheduler: time stamps, validation Dan Suciu -- 544, Winter 2011
Pessimistic Scheduler Simple idea: • Each element has a unique lock • Each transaction must first acquire the lock before reading/writing that element • If the lock is taken by another transaction, then wait • The transaction must release the lock(s) Dan Suciu -- 544, Winter 2011