Managing Concurrency in Systems: Serialization Theory

Explore the concepts of isolation and serialization in system transactions, avoiding anomalies and ensuring consistent outcomes. Learn the laws of concurrency control and the essentials of serializability theory.

trinkle

Presentation Transcript


  1. Isolation Concepts
     Chapter 7 in Gray and Reuter
     Adapted from slides by J. Gray & A. Reuter

  2. Why Lock?
     • Give each transaction the illusion that there are no concurrent updates
     • Hide concurrency anomalies
     • Do it automatically
     • Goal:
       • Although there is concurrency in the system, execution is equivalent to some serial execution of the system
       • Not a deterministic outcome, just a consistent transformation
     ECE 569

  3. The Essentials
     • Notation
       • Every transaction T has a read set, denoted R(T), and a write set, denoted W(T)
     • Definition
       • T1 and T2 conflict iff W(T2) ∩ (R(T1) ∪ W(T1)) ≠ ∅ or W(T1) ∩ (R(T2) ∪ W(T2)) ≠ ∅
       • If they conflict, delay one until the other finishes
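The conflict test on this slide can be sketched with Python sets. The function name `conflict` and the example read/write sets are illustrative assumptions, not part of the slides.

```python
def conflict(r1, w1, r2, w2):
    """Return True when two transactions conflict.

    r1, w1: read set and write set of T1; r2, w2: those of T2.
    Mirrors: W(T2) ∩ (R(T1) ∪ W(T1)) ≠ ∅ or W(T1) ∩ (R(T2) ∪ W(T2)) ≠ ∅
    """
    return bool(w2 & (r1 | w1)) or bool(w1 & (r2 | w2))


# T1 reads x and writes y; T2 writes x: they conflict on x.
print(conflict({"x"}, {"y"}, set(), {"x"}))        # True
# Two read-only transactions never conflict.
print(conflict({"x", "y"}, set(), {"x"}, set()))   # False
```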

  4. Laws of Concurrency Control
     • First Law of Concurrency Control: Concurrent execution should not cause application programs to malfunction.
     • Second Law of Concurrency Control: Concurrent execution should not have lower throughput or much higher response times than serial execution.

  5. Transactions and Serializability
     • The database is modeled as a set of elements representing relations, pages, tuples, or whatever.
     • Transactions are sets of operations which access these elements: {r[x] | w[x]}* {c | a}
     • A concurrent execution of several transactions is serializable if it is equivalent to a serial execution of the same transactions.
     • Two histories are equivalent if all transactions read the same values in both histories and the final states of the database are identical.

  6. Why Serialization?
     deposit(item x, int amt) {
       t = read(x);
       write(x, t + amt);
       commit();
     }
     • T1: r1[x] w1[x] c1
     transfer(item x, item y, int amt) {
       f = read(x);
       if (f < amt) abort();
       else {
         write(x, f - amt);
         t = read(y);
         write(y, t + amt);
         commit();
       }
     }
     • T2: r2[x] a2
     • T2’: r2[x] w2[x] r2[y] w2[y] c2

  7. Why Serialization?
     • 1. Lost updates - Deposit made by T1 is lost
       T1: r1[x] w1[x] c1
       T2: r2[x] w2[x] r2[y] w2[y] c2
     • 2. Dirty reads - Amount deducted from x by T2 disappears
       T1: r1[x] w1[x] c1
       T2: r2[x] w2[x] ... a2
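A lost update can be reproduced with a small simulation of deposit (T1) and transfer (T2). This is only a sketch: the interleaving r2[x] r1[x] w1[x] c1 w2[x] r2[y] w2[y] c2, the starting balances, and the amounts are assumptions chosen for illustration.

```python
def lost_update(x=100, y=0, deposit_amt=50, transfer_amt=30):
    """Run one assumed interleaving of deposit (T1) and transfer (T2)."""
    db = {"x": x, "y": y}
    f = db["x"]                    # r2[x]: transfer reads x
    t = db["x"]                    # r1[x]: deposit reads the same x
    db["x"] = t + deposit_amt      # w1[x], c1: deposit is written
    db["x"] = f - transfer_amt     # w2[x]: transfer overwrites using its stale read
    db["y"] = db["y"] + transfer_amt   # r2[y] w2[y], c2
    return db


interleaved = lost_update()
serial = {"x": 100 + 50 - 30, "y": 30}   # result of either serial order
print(interleaved)   # {'x': 70, 'y': 30}: the deposit of 50 is gone
print(serial)        # {'x': 120, 'y': 30}
```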

  8. Why Serialization?
     • Incorrect summary (similar to the unrepeatable read problem)
       print_sum(item x, item y) {
         f = read(x);
         g = read(y);
         printf("%d\n", f + g);
         commit();
       }
       T2: r2[x] w2[x] r2[y] w2[y] c2
       T3: r3[x] r3[y] c3
     • Sum is “short” by the amount being transferred.
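The incorrect summary can be checked the same way. The sketch below assumes the interleaving r2[x] w2[x] r3[x] r3[y] c3 r2[y] w2[y] c2 and made-up balances; `incorrect_summary` is a hypothetical helper name.

```python
def incorrect_summary(x=100, y=50, amt=30):
    """Return (sum printed by T3, true total) for one assumed interleaving."""
    db = {"x": x, "y": y}
    db["x"] = db["x"] - amt        # r2[x] w2[x]: transfer debits x
    printed = db["x"] + db["y"]    # r3[x] r3[y] c3: print_sum runs in the middle
    db["y"] = db["y"] + amt        # r2[y] w2[y] c2: transfer credits y
    return printed, db["x"] + db["y"]


printed, actual = incorrect_summary()
print(printed, actual)   # 120 150: the printed sum is short by the 30 in flight
```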

  9. Serializability Theory - Chapter 2 Bernstein
     • Transactions
     • Definition- A transaction Ti is a partial order with ordering relation <i where:
       1. Ti ⊆ {ri[x], wi[x] | x is a data item} ∪ {ai, ci}; and
       2. ai ∈ Ti iff ci ∉ Ti; and
       3. If t is ci or ai, then for any other operation p ∈ Ti, p <i t; and
       4. If ri[x] ∈ Ti and wi[x] ∈ Ti, then either ri[x] <i wi[x] or wi[x] <i ri[x].

  10. Transactions
      • A transaction is a partial order (Σi, <i)
        • Σi is the set of operations
        • <i is the “happened-before” relation for operations in Σi.

  11. Histories
      • Histories are used to model concurrent executions
      • Definition- Let T = {T1, T2, ..., Tn} be a set of transactions. A complete history H over T is a partial order with ordering relation <H where:
        1. H = T1 ∪ T2 ∪ ... ∪ Tn; and
        2. <H ⊇ <1 ∪ <2 ∪ ... ∪ <n; and
        3. For any two conflicting operations p, q ∈ H, p ≠ q, either p <H q or q <H p.

  12. Histories
      • Example- A history H over T = {T1, T2} [figure not included in transcript]
      • Summary
        • H contains all of the operations in T
        • <H honors all of the orderings of the transactions {T1, T2, ..., Tn}.
        • All conflicting operations are ordered.

  13. History Prefixes
      • Definition- A history H’ = (ΣH’, <H’) is a prefix of a history H = (ΣH, <H) if:
        1. ΣH’ ⊆ ΣH; and
        2. ∀ p, q ∈ ΣH’, p <H’ q iff p <H q; and
        3. ∀ p ∈ ΣH’, ∀ q ∈ ΣH, q <H p ⇒ q ∈ ΣH’
      • Example- A prefix H’ of H above.

  14. Serializability
      • Definition- The committed projection of a history H, C(H), is the history obtained by removing all operations of transactions that are not committed from H.
      • Example- C(H’)
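A minimal sketch of the committed projection, assuming a history is written as a totally ordered, space-separated string in the slides' notation (r1[x], w2[y], c1, a2). The helper names are hypothetical. The sample history is the prefix H2’ that appears in a later view-serializability example.

```python
import re


def txn_id(op):
    """Extract the transaction number from an operation such as 'w12[x]' or 'c3'."""
    return int(re.match(r"[rwca](\d+)", op).group(1))


def committed_projection(history):
    """Keep only the operations of transactions that commit in the history."""
    ops = history.split()
    committed = {txn_id(op) for op in ops if op.startswith("c")}
    return " ".join(op for op in ops if txn_id(op) in committed)


# T3 has not committed in this prefix, so r3[x] is dropped.
print(committed_projection("r1[x] w2[y] w2[x] c2 r3[x] w1[y] c1"))
# r1[x] w2[y] w2[x] c2 w1[y] c1
```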

  15. Serializability • Example- Serial histories over T = {T1, T2}

  16. Conflict Serializability
      • Definition- Two histories H and H’ are conflict equivalent (≡) if:
        1. They are defined over the same set of transactions and have the same set of operations; and
        2. For all conflicting operations pi and qj such that ai, aj ∉ H (or H’), pi <H qj iff pi <H’ qj
      • Definition- A history H is conflict serializable if C(H) is conflict equivalent to some serial history Hs.

  17. Conflict Serializability
      • Example- History H is not conflict serializable because
        • w1[x] <H w2[x] rules out equivalence with T2T1
        • w2[y] <H r1[y] rules out equivalence with T1T2

  18. Conflict Serializability • Example- History H’ is conflict serializable. It is equivalent to T1T2.

  19. Serializability Theorem
      • Definition- The serialization graph of a history H, SG(H), is a directed graph with nodes corresponding to committed transactions in H and includes all edges Ti → Tj such that pi <H qj and pi conflicts with qj.
      • Examples-
        1. SG(H)

  20. Serializability Theorem
      2. SG(H’)
      3. SG(H1), where H1 = r1[x] w2[x] c2 w1[y] c1 r3[x] w3[y] c3
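The edges of SG(H1) can be computed mechanically. The sketch below assumes a totally ordered history given as a string, treats two operations as conflicting when they come from different transactions, touch the same item, and at least one is a write, and uses a hypothetical function name `serialization_graph`.

```python
import re

OP = re.compile(r"([rw])(\d+)\[(\w+)\]")


def serialization_graph(history):
    """Return the set of edges (i, j), meaning Ti -> Tj, among committed transactions."""
    ops = history.split()
    committed = {int(o[1:]) for o in ops if o[0] == "c"}
    # Keep reads and writes of committed transactions, in history order.
    accesses = []
    for o in ops:
        m = OP.fullmatch(o)
        if m and int(m.group(2)) in committed:
            accesses.append((m.group(1), int(m.group(2)), m.group(3)))
    edges = set()
    for pos, (kind_p, i, x) in enumerate(accesses):
        for kind_q, j, y in accesses[pos + 1:]:
            if i != j and x == y and "w" in (kind_p, kind_q):
                edges.add((i, j))
    return edges


print(sorted(serialization_graph("r1[x] w2[x] c2 w1[y] c1 r3[x] w3[y] c3")))
# [(1, 2), (1, 3), (2, 3)]
```

Here r1[x] before w2[x] gives T1 → T2, w2[x] before r3[x] gives T2 → T3, and w1[y] before w3[y] gives T1 → T3.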

  21. Serializability Theorem
      • Serializability Theorem- A history H is conflict serializable iff SG(H) is acyclic.
      • Proof
        1. If SG(H) is acyclic then H is conflict serializable.
           a) Assume SG(H) is acyclic (we will show H is CSR).
           b) Let Hs be a serial history including all committed transactions in H.
           c) Order transactions in Hs such that if Ti → Tj then Ti precedes Tj in Hs, i.e., the order of transactions in Hs is a topological sort of SG(H). (One exists because SG(H) is acyclic.)

  22. Serializability Theorem (Proof Cont.)
           d) C(H) ≡ Hs
              • Let pi and qj be conflicting operations from distinct transactions in C(H) such that pi <H qj
              • Ti and Tj are committed (in C(H)), so they must be included in SG(H)
              • There must be an edge Ti → Tj in SG(H) because pi and qj conflict and pi <H qj.
              • Our construction ensures that Ti precedes Tj in Hs, and thus, pi <Hs qj

  23. Serializability Theorem (Proof Cont.)
        2. If H is conflict serializable then SG(H) is acyclic.
           a) Assume H is conflict serializable (we will show SG(H) is acyclic).
           b) Let Hs be a serial history such that C(H) ≡ Hs. (There must be one because H is conflict serializable.)
           c) Ti → Tj in SG(H) implies Ti <Hs Tj
              • Because Ti → Tj in SG(H), there must be conflicting operations pi and qj in C(H) where pi <H qj
              • Since C(H) ≡ Hs, we know that pi <Hs qj and therefore Ti <Hs Tj

  24. Serializability Theorem (Proof Cont.)
           d) SG(H) is acyclic
              • Assume for the sake of contradiction that there is a cycle T1 → T2 → ... → Tn → T1 in SG(H)
              • From the argument in (c) this implies T1 <Hs T2 <Hs ... <Hs Tn <Hs T1
              • By transitivity we get T1 <Hs T1, which is clearly impossible
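The theorem turns conflict serializability into a graph check: topologically sort SG(H) and, if that succeeds, the sort order is an equivalent serial order. The sketch below uses the standard library's `graphlib` (Python 3.9+). The function name and the edge-set input format are assumptions.

```python
from graphlib import CycleError, TopologicalSorter


def equivalent_serial_order(nodes, edges):
    """Return a serial order consistent with SG(H), or None if SG(H) has a cycle.

    nodes: committed transaction ids; edges: pairs (i, j) meaning Ti -> Tj.
    """
    sorter = TopologicalSorter({n: set() for n in nodes})
    for i, j in edges:
        sorter.add(j, i)          # Ti must precede Tj
    try:
        return list(sorter.static_order())
    except CycleError:
        return None


# SG(H1) from the earlier example: acyclic, so H1 is equivalent to T1 T2 T3.
print(equivalent_serial_order({1, 2, 3}, {(1, 2), (1, 3), (2, 3)}))   # [1, 2, 3]
# w1[x] < w2[x] and w2[y] < r1[y] give T1 -> T2 and T2 -> T1: a cycle.
print(equivalent_serial_order({1, 2}, {(1, 2), (2, 1)}))              # None
```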

  25. Properties of Histories
      • Definition- Transaction Ti reads-x-from Tj in H if:
        1. wj[x] <H ri[x]; and
        2. aj ≮H ri[x]; and
        3. ∀ wk[x] ∈ H, wj[x] <H wk[x] <H ri[x] implies ak <H ri[x].
      • Example- H = w1[x] w2[y] r1[y] w2[x] a2 r1[x]
        • T1 reads-y-from T2
        • T1 reads-x-from T1
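For a totally ordered history, the three conditions amount to: take the latest write of x before the read whose writer had not aborted by the time of the read. The following sketch assumes that reading and a string-encoded history. `reads_from` is a hypothetical helper name.

```python
import re

OP = re.compile(r"([rwca])(\d+)(?:\[(\w+)\])?")


def reads_from(history):
    """Return triples (i, x, j) meaning Ti reads-x-from Tj."""
    ops = [OP.fullmatch(o).groups() for o in history.split()]
    result = set()
    for pos, (kind, i, x) in enumerate(ops):
        if kind != "r":
            continue
        before = ops[:pos]
        aborted = {t for k, t, _ in before if k == "a"}
        # Walk backwards to the latest write of x by a not-yet-aborted transaction.
        for k, j, y in reversed(before):
            if k == "w" and y == x and j not in aborted:
                result.add((int(i), x, int(j)))
                break
    return result


print(sorted(reads_from("w1[x] w2[y] r1[y] w2[x] a2 r1[x]")))
# [(1, 'x', 1), (1, 'y', 2)]
```

The output matches the slide's example: w2[x] is skipped for r1[x] because a2 precedes the read, so T1 reads x from its own earlier write.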

  26. Properties of Histories
      • Definition- A history H is recoverable (RC) if Ti reads-from Tj (i ≠ j) and ci ∈ H implies cj <H ci
      • Example- H = r1[x] w2[y] w2[x] r3[x] w1[y] c1 w3[y] c3 c2
        • T3 reads-x-from T2 but c2 does not precede c3 in H, thus H is not RC.
      • Definition- A history H avoids cascading aborts (ACA) if Ti reads-x-from Tj (i ≠ j) implies cj <H ri[x]
      • Definition- A history H is strict (ST) if wi[x] <H oj[x] (i ≠ j) implies either:
        • ai <H oj[x]; or
        • ci <H oj[x].
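The recoverability definition can be checked directly on a totally ordered history. This is a sketch under the same string encoding used for the slides' examples, with reads-from resolved as the latest write by a transaction that had not aborted before the read. `is_recoverable` is a hypothetical name.

```python
import re

OP = re.compile(r"([rwca])(\d+)(?:\[(\w+)\])?")


def is_recoverable(history):
    """True if every committed reader commits after the transactions it read from."""
    ops = [OP.fullmatch(o).groups() for o in history.split()]
    commit_pos = {t: p for p, (k, t, _) in enumerate(ops) if k == "c"}
    for pos, (kind, i, x) in enumerate(ops):
        if kind != "r" or i not in commit_pos:
            continue              # only committed readers constrain RC
        before = ops[:pos]
        aborted = {t for k, t, _ in before if k == "a"}
        writer = next((t for k, t, y in reversed(before)
                       if k == "w" and y == x and t not in aborted), None)
        if writer is None or writer == i:
            continue
        # The writer must commit, and must do so before the reader commits.
        if writer not in commit_pos or commit_pos[writer] > commit_pos[i]:
            return False
    return True


print(is_recoverable("r1[x] w2[y] w2[x] r3[x] w1[y] c1 w3[y] c3 c2"))   # False
print(is_recoverable("r1[x] w2[y] w2[x] r3[x] w1[y] c1 w3[y] c2 c3"))   # True
```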

  27. Concepts
      • What are the three problems that non-serializable histories have? Which dependencies are involved?
      • Define what it means for a history to be conflict serializable
      • What is the serialization graph of a history?
      • What is the serializability theorem?
      • What is a recoverable history? A history that avoids cascading aborts? A strict history?

  28. View Serializability
      • Definition of serializability based on the view transactions have of the database
      • Definition- A write wi[x] is the final write of x in H if:
        • wi[x] ∈ H; and
        • ai ∉ H; and
        • ∀ wj[x] ∈ H (i ≠ j), wj[x] <H wi[x] or aj ∈ H
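For a totally ordered history, the final write of each item is the last write by a transaction that does not abort. A small sketch, with a hypothetical function name and the same string encoding as the slides' histories:

```python
import re

OP = re.compile(r"([rwca])(\d+)(?:\[(\w+)\])?")


def final_writes(history):
    """Map each data item to the transaction that performs its final write."""
    ops = [OP.fullmatch(o).groups() for o in history.split()]
    aborted = {t for k, t, _ in ops if k == "a"}
    final = {}
    for k, t, x in ops:
        if k == "w" and t not in aborted:
            final[x] = int(t)      # a later non-aborted write replaces an earlier one
    return final


print(final_writes("r1[x] w2[y] w2[x] c2 r3[x] w1[y] c1 w3[y] c3"))
# {'y': 3, 'x': 2}
```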

  29. View Serializability (Cont.)
      • Definition- Two histories H and H’ are view equivalent if:
        1. They are over the same transactions and have the same operations; and
        2. For all Ti and Tj not aborted in H, Ti reads-x-from Tj in H iff Ti reads-x-from Tj in H’; and
        3. wi[x] is the final write of x in H iff wi[x] is the final write of x in H’.
      • Definition- A history H is view serializable if for every prefix H’ of H, C(H’) is view equivalent to a serial schedule

  30. View Serializability (Cont.)
      • Example-
        H = r1[x] w2[y] w2[x] c2 r3[x] w1[y] c1 w3[y] c3
        1. H1’ = r1[x] w2[y] w2[x] c2
           • C(H1’) = w2[y] w2[x] c2
           • C(H1’) is view equivalent to a serial history (itself)
        2. H2’ = r1[x] w2[y] w2[x] c2 r3[x] w1[y] c1
           • C(H2’) = r1[x] w2[y] w2[x] c2 w1[y] c1
           • C(H2’) is not view equivalent to T1T2 (w1[y], not w2[y], is the final write of y in C(H2’))
           • C(H2’) is not view equivalent to T2T1 (T1 reads-x-from T2 in T2T1 but not in C(H2’))

  31. View Serializability (Cont.)
      • Example-
        H = r1[x] w2[y] w2[x] r3[x] w1[y] c1 w3[y] c3 c2
        2. H2’ = r1[x] w2[y] w2[x] r3[x] w1[y] c1 w3[y] c3
           • C(H2’) = r1[x] r3[x] w1[y] c1 w3[y] c3
           • C(H2’) is view equivalent to T1T3
           • T3 writes final version of y in both histories
        3. H3’ = r1[x] w2[y] w2[x] r3[x] w1[y] c1 w3[y] c3 c2
           • C(H3’) = r1[x] w2[y] w2[x] r3[x] w1[y] c1 w3[y] c3 c2
           • C(H3’) is view equivalent to T1T2T3
           • T3 reads-x-from T2 in both histories
           • T2 writes final (only) version of x in both histories
           • T3 writes final version of y in both histories

  32. Two-Phase Locking (2PL)
      • Notation
        • rli[x] - A read lock for element x granted to transaction Ti
        • wli[x] - A write lock for x granted to Ti
        • rui[x] - Release a read lock for x held by transaction Ti
        • wui[x] - Release a write lock for x held by Ti

  33. Basic 2PL
      • Protocol
        1. Before executing pi[x], a lock pli[x] must be acquired on Ti’s behalf. If another transaction Tj is holding a lock qlj[x] that conflicts with pli[x], then the operation is delayed until the lock can be set. (well-formed)
        2. The scheduler cannot release the lock pli[x] at least until the completion of pi[x] has been acknowledged.
        3. A transaction cannot acquire any new locks after it has released a lock.
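Rule 3, the two-phase rule, is easy to check on a schedule written in the lock notation from the previous slide. The sketch below is an assumption-laden illustration: schedules are space-separated strings, and `is_two_phase` is a hypothetical helper that inspects one transaction's lock and unlock steps only.

```python
def is_two_phase(schedule, txn):
    """True if transaction `txn` never acquires a lock after releasing one."""
    released = False
    for step in schedule.split():
        # Only lock (rl, wl) and unlock (ru, wu) steps matter for rule 3.
        if not step.startswith(("rl", "wl", "ru", "wu")):
            continue
        kind, rest = step[:2], step[2:]
        if int(rest.split("[")[0]) != txn:
            continue
        if kind in ("ru", "wu"):
            released = True
        elif released:
            return False           # a new lock after an unlock breaks 2PL
    return True


print(is_two_phase("rl1[x] r1[x] wl1[y] ru1[x] w1[y] wu1[y] c1", 1))   # True
print(is_two_phase("rl1[x] r1[x] ru1[x] wl1[y] w1[y] wu1[y] c1", 1))   # False
```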

  34. Basic 2PL

  35. Correctness of 2PL
      • Characteristics of 2PL Histories
        1. If oi[x] ∈ C(H) then
           a) oli[x] ∈ C(H) and oli[x] <H oi[x]; and
           b) oui[x] ∈ C(H) implies oi[x] <H oui[x]
        2. If pi[x] and qj[x] (i ≠ j) are conflicting operations on x in C(H), then either:
           a) pui[x] <H qlj[x]; or
           b) quj[x] <H pli[x]
        3. If pi[x] and qi[y] are in C(H), then pli[x] <H qui[y]

  36. 2PL ⊂ CSR
      • 2PL histories are a subset of CSR histories
      • Example (CSR, but not producible under 2PL): r1[A] w1[A] r2[A] w3[A] c3 r1[B] w1[B] c1 r2[B] c2

  37. Correctness of 2PL (Cont.)
      • Theorem- Every 2PL history is CSR
      • Proof
        I. Ti → Tj ∈ SG(H) implies there is an element x for which pui[x] <H qlj[x]
           a) The edge Ti → Tj ∈ SG(H) implies that pi <H qj.
           b) By 2 above, we know that pui[x] <H qlj[x] or quj[x] <H pli[x]
           c) Assume quj[x] <H pli[x]. By 1(a), 1(b) and transitivity, we get qj <H pi. But this contradicts I(a).

  38. Correctness of 2PL (Cont.)
        II. If T1 → T2 → ... → Tk is a path in SG(H) then pu1[x] <H qlk[y] for some x and y.
           a) Basis: T1 → Tk ∈ SG(H)
              • pu1[x] <H qlk[x] holds by the argument in I.
           b) Induction step: Assume that the hypothesis holds for path T1 → T2 → ... → Tk. Now consider path T1 → T2 → ... → Tk → Tk+1.
              • qlk[z] <H puk[y] follows from 3 above.
              • Because Tk → Tk+1 ∈ SG(H), from I we know there is a y such that puk[y] <H qlk+1[y]
              • Thus, qlk[z] <H qlk+1[y]. Combining this with the induction hypothesis (pu1[x] <H qlk[z]), we get pu1[x] <H qlk+1[y]

  39. Correctness of 2PL (Cont.)
        III. Assume that SG(H) contains a cycle T1 → T2 → ... → Tn → T1. From II, this implies pu1[x] <H ql1[y]. But this contradicts the two-phase rule (3). Therefore SG(H) is acyclic, and we must conclude that 2PL schedules are CSR.

  40. Deadlock
      T1:                    T2:
      l1(A)                  l2(B)
      r1(A)                  r2(B)
      l1(B) // T1 waits      l2(A) // T2 waits
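The situation on this slide is a cycle of waiting: T1 waits for T2 (which holds B) and T2 waits for T1 (which holds A). One common way to spot this, not described in the slides and shown here only as an assumed illustration, is to look for a cycle in a waits-for graph. `find_deadlock` and the dictionary encoding are hypothetical.

```python
def find_deadlock(waits_for):
    """Return a list of transactions forming a waits-for cycle, or None.

    waits_for maps a transaction to the transactions whose locks it is waiting on.
    """
    def visit(txn, path):
        if txn in path:
            return path[path.index(txn):]      # closed a cycle
        for holder in waits_for.get(txn, ()):
            cycle = visit(holder, path + [txn])
            if cycle:
                return cycle
        return None

    for txn in waits_for:
        cycle = visit(txn, [])
        if cycle:
            return cycle
    return None


# T1 waits for B (held by T2); T2 waits for A (held by T1).
print(find_deadlock({"T1": ["T2"], "T2": ["T1"]}))   # ['T1', 'T2']
print(find_deadlock({"T1": ["T2"]}))                 # None
```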

  41. Other Variations
      • Conservative 2PL
        • When a transaction starts, it predeclares the set of elements it will read and write
        • A transaction must acquire all of its locks before it executes any operations. If all locks cannot be acquired, any acquired locks are released.
        • Deadlock is avoided because no locks are held while requesting other locks.
      • Strict 2PL (Rigorous 2PL)
        • Release all locks only after the transaction commits
        • Rigorous histories ensure that serialization order is compatible with commit order.
      • Which avoids cascading rollback?

  42.

  43. Serializability Requirements
      • Lock everything the transaction accesses
      • Do not lock after unlock.
      • Backout may have to undo an unlock (= lock).
      • So do not release locks prior to commit

  44. Degrees of Isolation
      • SQL allows a client to trade off isolation against performance by specifying a degree of isolation.
      • 0° - Does not overwrite another transaction’s dirty data if the other transaction is 1° or better.
        • transaction gets short xlocks for writes (well-formed writes, not 2φ, no read locks)
      • 1° - No lost updates
        • transaction gets no read locks (well-formed and 2φ writes)
      • 2° - No lost updates or dirty reads
        • transaction releases read locks right after read (well-formed with respect to reads but not 2φ with respect to reads)
      • 3° - No lost updates and repeatable reads (implies serializability)
        • well-formed and 2φ

  45. Isolation Levels Theorem
      • What is the effect of some transactions employing an isolation level lower than 3°?
      • If others lock 1° or better and I obey 0°, 1°, 2° or 3°, any legal history will give me 1°, 2° or 3° isolation.
      • But the DB may be corrupted!
      • Must ensure that if I allow dirty reads, I can still produce consistent updates.

  46. Comparison of Isolation Levels
      | Issue | Degree 0 | Degree 1 | Degree 2 | Degree 3 |
      |---|---|---|---|---|
      | Common name | Chaos | Browse (read uncommitted) | Cursor stability (read committed) | Isolated, serializable, repeatable reads |
      | Protection provided | Lets others run at higher isolation | 0 and no lost updates | 1 + no dirty reads | 2 + repeatable reads |
      | Committed data | Writes visible immediately | Writes visible at eot | Same | Same |
      | Dirty data | You don't overwrite dirty data | 0 and others do not overwrite your dirty data | 0, 1, and you don't read dirty data | 0, 1, 2 and others don't produce dirty data you read |
      | Lock protocol | Set short exclusive locks on data you write | Set long exclusive locks on data you write | 1 and set short share locks on data you read | 1 and set long share locks on data you read |
      | Trans structure | Well-formed wrt writes | Well-formed wrt writes and two-phase wrt writes | Well-formed and two-phase wrt writes | Well-formed and two-phase |
      | Concurrency | Greatest: only set short write locks | Great: only wait for write locks | Medium: hold few read locks | Lowest: any data touched locked to eot |
      Rollback supported

  47. Comparison of Isolation Levels
      | Issue | Degree 0 | Degree 1 | Degree 2 | Degree 3 |
      |---|---|---|---|---|
      | Overhead | Least: short W locks | Small: only write locks | Medium: set R&W but short R | Most: set long R&W |
      | Rollback | UNDO may cascade, can't rollback | Can undo incomplete transactions | Same | Same |
      | System recovery | Dangerous, updates may be lost and violate 3° | Apply log in 1° order | Same | Same |
      | Dependencies | None | W → W | W → W, W → R | W → W, W → R, R → W |

  48. The Phantom Problem
      • Phantom records (if the locks are for each record)
        • If I try to read hair = "red" and eyes = "blue" and get not found, what gets locked? No records have been accessed, so no records get locked.
        • If I delete a record, what gets locked? (The record is gone.)
      • Predicate locks can solve this problem
      • Page locks (done right) can also solve this problem
        • lock the red hair page and the blue eye page
        • prevents others' red hair and blue eye inserts & updates
      • High-volume TP systems use esoteric locking mechanisms:
        • Key range locks to protect B-trees
        • Hole locks to protect space for uncommitted deletes

  49. Predicate Locks
      • Read and write sets can be defined by predicates (e.g. WHERE clauses in SQL statements)
      • When a transaction accesses a set for the first time:
        • Automatically capture the predicate
        • Do set intersection with predicates of others
        • Delay this transaction if it conflicts with others
      • Problems with predicate locks:
        • Set intersection = predicate satisfiability, which is NP-complete (slow)
        • Hard to capture predicates
        • Pessimistic: Jim locks eye = blue, Andreas locks hair = red
          • Predicate says conflict, but the DB may not have a blue-eyed, red-haired person.
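The pessimism in the last bullet can be seen in a toy model. The sketch assumes predicates are simple conjunctions of attribute = value tests stored as dictionaries, which sidesteps the general satisfiability problem. The function names and the sample records are made up for illustration.

```python
def predicates_overlap(p, q):
    """True if some record could satisfy both conjunctive equality predicates."""
    return all(q[attr] == value for attr, value in p.items() if attr in q)


def matches(predicate, record):
    """True if the record satisfies every attribute = value test."""
    return all(record.get(attr) == value for attr, value in predicate.items())


jim = {"eyes": "blue"}        # Jim locks eye = blue
andreas = {"hair": "red"}     # Andreas locks hair = red
people = [
    {"eyes": "blue", "hair": "black"},
    {"eyes": "brown", "hair": "red"},
]

# The predicates are jointly satisfiable, so a predicate lock reports a conflict...
print(predicates_overlap(jim, andreas))                                   # True
# ...even though no stored record is covered by both predicates.
print(any(matches(jim, r) and matches(andreas, r) for r in people))       # False
```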

  50. Granular Locks
      • Idea
        • Pick a fixed set of predicates
        • They form a lattice under and, or
        • This can be represented as a graph
        • Lock the nodes in this graph
      • Example
        • Can lock whole DB, whole file, or just one key value.
        • Size of lock is called the granule.
