1 / 72

CSL 771: Database Implementation Transaction Processing

CSL 771: Database Implementation Transaction Processing. Maya Ramanath All material (including figures) from: Concurrency Control and Recovery in Database Systems Phil Bernstein, Vassos Hadzilacos and Nathan Goodman (http :// research.microsoft.com /en-us/people/ philbe / ccontrol.aspx ).

tayte
Download Presentation

CSL 771: Database Implementation Transaction Processing

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. CSL 771: Database ImplementationTransaction Processing Maya Ramanath All material (including figures) from: Concurrency Control and Recovery in Database Systems Phil Bernstein, VassosHadzilacos and Nathan Goodman (http://research.microsoft.com/en-us/people/philbe/ccontrol.aspx)

  2. Transactions • Interaction with the DBMS through SQL updateAirlinessetprice = price - price*0.1, status = “cheap” whereprice< 5000 A transaction is a unit of interaction

  3. ACID Properties • Atomicity • Consistency • Isolation • Durability Database system must ensure ACID properties

  4. Atomicity and Consistency • Single transaction • Execution of a transaction: “all-or-nothing” Either a transaction completes in its entirety Or it “does not even start” • As if the transaction never existed • No partial effect must be visible 2 outcomes: A transaction COMMITs or ABORTs

  5. Consistency and Isolation • Multiple transactions • Concurrent execution can cause an inconsistent database state • Each transaction executed as if isolated from the others

  6. Durability • If a transaction commits the effects are permanent • But, durability has a bigger scope • Catastrophic failures (floods, fires, earthquakes)

  7. What we will study… • Concurrency Control • Ensuring atomicity, consistency and isolation when multiple transactions are executed concurrently • Recovery • Ensuring durability and consistency in case of software/hardware failures

  8. Terminology • Data item • A tuple, table, block • Read (x) • Write (x, 5) • Start (T) • Commit (T) • Abort (T) • Active Transaction • A transaction which has neither committed nor aborted

  9. High level model Transaction 2 Transaction 1 Transaction n Transaction Manager Scheduler Disk Recovery Manager Cache Manager

  10. Recoverability (1/2) • Transaction T Aborts • T wrote some data items • T’ read items that T wrote • DBMS has to… • Undo the effects of T • Undo effects of T’ • But, T’ has already committed

  11. Recoverability (2/2) • Let T1,…,Tn be a set of transactions • Ti reads a value written by Tk, k < i • An execution of transactions is recoverable if Ti commits after all Tk commit

  12. Cascading Aborts (1/2) • Because T was aborted, T1,…, Tk also have to be aborted

  13. Cascading Aborts (2/2) • Recoverable executions do not prevent cascading aborts • How can we prevent them then ?

  14. What we learnt so far… Reading a value, committing a transaction Recoverable with cascading aborts Recoverable without cascading aborts Not recoverable

  15. Strict Schedule (1/2) • “Undo”-ing the effects of a transaction • Restore the before image of the data item Equivalent to Final value of y: 3

  16. Strict Schedule (2/2) Initial value of x: 1 Should x be restored to 1 or 3? T1 restores x to 3? T2 restores x to 2? Do not read or write a value which has been written by an active transaction until that transaction has committed or aborted

  17. The Lost Update Problem Assume x is your account balance

  18. Serializable Schedules • Serial schedule • Simply execute transactions one after the other • A serializableschedule is one which equivalent to some serial schedule

  19. Serializability Theory

  20. Serializable Schedules T1: op11, op12, op13 T2: op21, op22, op23, op24 • Serial schedule • Simply execute transactions one after the other op11, op12, op13 op21, op22, op23, op24 op11, op12, op13 op21, op22, op23, op24 • Serializable schedule • Interleave operations • Ensure end result is equivalent to some serial schedule

  21. Notation r1[x] = Transaction 1, Read (x) w1[x] = Transaction 1, Write (x) c1 = Transaction 1, Commit a1= Transaction 1, Abort r1[x], r1[y], w2[x], r2[y], c1, c2

  22. Histories (1/3) • Operations of transaction T can be represented by a partial order. r1[x] w1[z] c1 r1[y]

  23. Histories (2/3) • Conflicting operations • Of two ops operating on the same data item, if one of them is a write, then the ops conflict • An order has to be specified for conflicting operations

  24. Histories (3/3) • Complete History

  25. Serializable Histories • The goal: Ensure that the interleaving operations guarantee a serializable history. • The method • When are two histories equivalent? • When is a history serial?

  26. Equivalence of Histories (1/2) H ≅H’ if • they are defined over the same set of transactions and they have the same operations • they order conflicting operations the same way

  27. Equivalence of Histories (2/2) y Source: Concurrency Control and Recovery in Database Systems: Bernstein, Hadzilacos and Goodman

  28. Serial History • A complete history is serial if for every pair of transactions Ti and Tk, • all operations of Ti occur before Tk OR • all operations of Tk occur before Ti • A history is serializableif its committed projection is equivalent to a serial history.

  29. Serialization Graph T1 T3 T2

  30. Serializability Theorem A history H is serializable if its serialization graph SG(H) is acyclic On your own How do recoverability, strict schedules, cascading aborts fit into the big picture?

  31. Locking

  32. High level model Transaction 2 Transaction 1 Transaction n Transaction Manager Scheduler Disk Recovery Manager Cache Manager

  33. Transaction Management Transaction 1 Transaction 2 Transaction 3 . . . Transaction n • Transaction Manager • Receives Transactions • Sends operations to scheduler Read1(x) Write2(y,k) Read2(x) Commit1 Disk • Scheduler • Execute op • Reject op • Delay op

  34. Locking • Each data item x has a lock associated with it • If T wants to access x • Scheduler first acquires a lock on x • Only one transaction can hold a lock on x • T releases the lock after processing Locking is used by the scheduler to ensure serializability

  35. Notation • Read lock and write lock rl[x], wl[x] • Obtaining read and write locks rli[x], wli[x] • Lock table • Entries of the form [x, r, Ti] • Conflicting locks • pli[x], qlk[y], x = y and p,q conflict • Unlock rui[x], wui[x]

  36. Basic 2-Phase Locking (2PL) RULE 2 pli[x] cannot be released until pi[x] is completed RULE 1 Receive pi[x] is qlk[x] set such that p and q conflict? NO Acquire pli[x] RULE 3 (2 Phase Rule) Once a lock is releasedno other locks may be obtained. YES pi[x] scheduled pi[x] delayed

  37. The 2-phase rule Once a lock is releasedno other locks may be obtained. T1: r1[x] w1[y] c1 T2: w2[x] w2[y] c2 H = rl1[x] r1[x] ru1[x]wl2[x] w2[x] wl2[y] w2[y] wu2[x] wu2[y] c2wl1[y] w1[y] wu1[y] c1 T1 T2

  38. Correctness of 2PL 2PL always produces serializable histories Proof outline STEP 1: Characterize properties of the scheduler STEP 2: Prove that any history with these properties is serializable (That is, SG(H) is acyclic)

  39. Deadlocks (1/2) T1: r1[x] w1[y] c1 T2: w2[y] w2[x] c2 Scheduler rl1[x]wl2[y]r1[x]w2[y]<cannot proceed>

  40. Deadlocks (2/2) Strategies to deal with deadlocks • Timeouts • Leads to inefficiency • Detecting deadlocks • Maintain a wait-for graph, cycle indicates deadlock • Once a deadlock is detected, break the cycle by aborting a transaction • New problem: Starvation

  41. Conservative 2PL • Avoids deadlocks altogether • T declares its readset and writeset • Scheduler tries to acquire all required locks • If not all locks can be acquired, T waits in a queue • T never “starts” until all locks are acquired • Therefore, it can never be involved in a deadlock On your own Strict 2PL (2PL which ensures only strict schedules)

  42. Extra Information • Assumption: Data items are organized in a tree Can we come up with a better (more efficient) protocol?

  43. Tree Locking Protocol (1/3) RULE 2 if x is an intermediate node, and y is a parent of x, the ali[x] is possible only if ali[y] RULE 1 Receive ai[x] NO is alk[x] ? RULE 2 RULE 3 ali[x] cannot be released until ai[x] is completed YES pi[x] scheduled ai[x] delayed RULE 4 Once a lock is releasedthe same lock may not be re-obtained.

  44. Tree Locking Protocol (2/3) • Proposition: If Ti locks x before Tk, then for every v which is a descendant of x, if both Ti and Tk lock v, then Ti locks v before Tk. • Theorem: Tree Locking Protocol always produces Serializable Schedules

  45. Tree Locking Protocol (3/3) • Tree Locking Protocol avoids deadlock • Releases locks earlier than 2PL BUT • Needs to know the access pattern to be effective • Transactions should access nodes from root-to-leaf

  46. Multi-granularity Locking (1/3) • Granularity • Refers to the relative size of the data item • Attribute, tuple, table, page, file, etc. • Efficiency depends on granularity of locking • Allow transactions to lock at different granularities

  47. Multi-granularity Locking (2/3) • Lock Instance Graph • Explicit and Implicit Locks • Intention read and intention write locks • Intention locks conflict with explicit read and write locks but not with other intention locks Source: Concurrency Control and Recovery in Database Systems: Bernstein, Hadzilacos and Goodman

  48. Multi-granularity Locking (3/3) • To set rli[x] or irli[x], first hold irli[y] or iwli[y], such that y is the parent of x. • To set wli[x] or iwli[x], first hold iwli[y], such that y is the parent of x. • To schedule ri[x] (or wi[x]), Ti must hold rli[y] (or wli[y]) where y = x, or y is an ancestor of x. • To release irli[x] (or iwli[x]) no child of x can be locked by Ti

  49. The Phantom Problem • How to lock a tuple, which (currently) does not exist? T1: r1[x1], r1[x2], r1[X], c1 T2: w[x3], w[X], c2 rl1[x1], r1[x1], rl1[x2], r1[x2], wl2[x3], wl[X], w2[x3], wu2[x3,X], c2, rl1[X], ru1[x1,x2,X], c1

  50. Non-lock-based schedulers

More Related