1 / 124

Chapters 19 and 20: Transaction Processing and Concurrency Control

Overview of Material. Key Background Topics:Synchronization in Operating SystemsTransaction and Deadlock ConceptsPrevention, Avoidance, DetectionChapter 19 - Transaction ProcessingConcurrency ControlData Consistency ProblemsSchedules and SerializabilityChapter 20 - Concurrency ControlDiffer

cicero
Download Presentation

Chapters 19 and 20: Transaction Processing and Concurrency Control

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


    1. Chapters 19 and 20: Transaction Processing and Concurrency Control A portion of these slides are being used with the permission of Dr. Ling Lui, Associate Professor, College of Computing, Georgia Tech. Remaining slides represent new material.

    2. Overview of Material Key Background Topics: Synchronization in Operating Systems Transaction and Deadlock Concepts Prevention, Avoidance, Detection Chapter 19 - Transaction Processing Concurrency Control Data Consistency Problems Schedules and Serializability Chapter 20 - Concurrency Control Different Locking-Based Algorithms 2 Phase Protocol Deadlock and Livelock Optimistic Concurrency Control

    3. What is Synchronization? Ability of Two or More Serial Processes to Interact During Their Execution to Achieve Common Goal Recognition that “Today’s” Applications Require Multiple Interacting Processes Client/Server and Multi-Tiered Architectures Inter-Process Communication via TCP/IP Fundamental Concern: Address Concurrency Control Access to Shared Information Historically Supported in Database Systems Currently Available in Many Programming Languages

    4. Thread Synchronization Suppose X and Y are Concurrently Executing in Same Address Space What are Possibilities?

    5. Thread Synchronization Will Second Part Still Finish After Third Part? Will Second Part Now Finish Before Third Part? What Happens if Variables are Shared? This is the Database Concern - Concurrent Transactions Against Shared Tables!

    6. Databases have Transactions A Transaction is A Logic Unit of Database Processing Represents the Collection of Actions that Make Consistent Transformations of System States while Preserving System Consistency Interleaved (A and B) and Concurrent (C and D)

    7. Two Sample Transactions Transaction T1 Reads/Writes X/Y, Modifying X by Subtracting N and Y by Adding N Transaction T2 Reads X and Modifies X by Adding M Transactions can Execute Serially: T1 followed by T2 (or reverse) Interleaved: Operation by Operation

    8. Why Do we Need to Synchronize? Promote Sharing of Resources, Data, etc. Cooperating Processes Norm Not Exception Difficult to Program Solutions Handle Concurrent Behavior of Processes Multiple Processes Interacting via OS Resources Under Control of Process Manager/Scheduler Performance, Parallel Algorithms & Computations Multi-Processor Architectures (CSE228) Different OS to Handle Multiple Processors Client/Server and Multi-Tier Architectures Underlying Database Support Concurrent Transactions Shared Databases

    9. Potential Problems without Synchronization? Data Inconsistency Lost-Update Problem Impact on Correctness of Executing Software Deadlock Two Processes (Transactions) Each Hold Unique Resource (Data Item) and Want Resource (Date Item) of Other Process (Trans.) Processes Wait Forever Non-Determinacy of Computations Behavior of Computation Different for Different Executions Two Processes (Transactions) Produce Different Results When Executed More than Once on Same Data

    10. Classic Synchronization Techniques Goal: Shared Variables and Resources Two Approaches: Critical Sections Define a Segment of Code as Critical Section Once Execution Enters Code Segment it Cannot be Interrupted by Scheduler Release Point for Critical Section for Interrupts We’ll Briefly Review Semaphores Proposed in 1960s by E. Dijkstra Utilizes ADTs to Design and Implement Behavior that Guarantees Consistency and Non-Deadlock Recall your OS Course (CSE258)

    11. Critical Sections

    12. Critical Sections

    13. Critical Sections (continued) There is a Race to Execute Critical Sections Sections May Be Different Code in Different Processes: Cannot Detect With Static Analysis Results of Multiple Execution Are Not Determinate Need an OS Mechanism to Resolve Races If p1 Wins, R1 and R2 Added to Balance - Okay If p2 Wins, its Changed Balance Different from One Held by p1 which Adds/Writes Wrong Value

    14. General Problem: A Deadly Embrace T1 has A Wants B - Won’t Release A Until it Gets B T2 has B Wants A - Won’t Release B Until it Gets A T3 has C Wants A - Won’t Release B Until it Gets A What is the End Result? Deadlock!

    15. Deadlock in Databases Databases Must Control Access to Information by Multiple Concurrent Transactions (Processes) How do we Prevent Simultaneous Updates of Database by Concurrent Transactions (Processes)? Data is the Resource in Database System

    16. Addressing Deadlock Deadlock is Global Condition! Need to Analyze All Processes that Need All Resources Can’t Make Local Decision Based on Needs of One Process Four Deadlock Approaches: Prevention: Never Allow Deadlock to Occur Avoidance: System Makes Decision to Head Off Future Deadlock State Detection & Recovery: Check for Deadlock (Periodically or Sporadically), Then Recover Manual Intervention: Operator Reboot if System Seems Too Slow

    17. Prevention: A First Look Design the System So that Deadlock is Impossible Deadlock Only Occurs If All Following TRUE!! Mutual Exclusion: Allocated Data Items are Exclusive Property of Transaction Hold and Wait: Transaction Can Hold Data Items While Waiting for Another Resource Circular Waiting: T1 has A needs B, T2 has B needs C, … Tn has Z needs A No Preemption: Only Transaction Can Release Data Items or Withdraw Data Items Request All Four Necessary for Deadlock to Exist Prevention Requires Concurrency Control Manager to Violate at Least One Condition at All Times!

    18. Avoidance: A First Look Construct a Formal Model of System States Via Model, Choose a Strategy that Will Not Allow the System to Go to a Deadlock State Predictive Approach: Requires Transaction to Declare Intent re. Data Items in Advance Transaction X Needs A, B, and D Represents “Maximum Claim” on Data Items Transaction X Won’t Proceed Until all Data Items Available May Require “Long” Waits Amenable to Formal Solution/Algorithm

    19. Detection and Recovery: A First Look When Deadlock Occurs, Can we Detect and Recover? Two Phases to Algorithm Detection: Is there Deadlock? Recovery: Preempt Data Items from Transactions Detection Algorithm When is it Executed? What is its Overhead? Too Often - Wastes Data Items Too Infrequent - Blocked Transactions Don’t Do Enough Work Dominant Commercial Solution

    20. Deadlock in Databases Concurrent Access to Database Information Optimistic Concurrency Control Assume Problems Infrequent (ATM Example) Maintain Transaction Log Detect and Correct Errors in System via Log “Long-After” Their Occurrence Similar to What in OS? Deadlock Concepts? Pessimistic Concurrency Control Assume Problems will Occur (Airline Example) Require Transactions to Lock Portions of Data for Read and Write Requests Similar to What in OS? Deadlock Concepts?

    21. Prevention Necessary Conditions for Deadlock Mutual Exclusion Hold and Wait Circular Waiting No Preemption Ensure that at Least One of the Necessary Conditions is False at All Times Why Must Mutual Exclusion Hold at All Times? Some Data Items (System Catalog) Must be Exclusively Held by a Transaction How Can a Prevention Strategy be Designed to Guarantee Failure of One of Other Conditions?

    22. Hold and Wait Invalidate: Hold and Wait: Transaction Can Hold One Data Item While Waiting for Another Data Item Approach 1: Targeted to Batch Systems Transaction Must Request All Data Items it Needs Transaction Competes for All Data Items Even if Needs Only One Data Item at Time Holds Data Items “Done” With Approach 2: Targeted to Timesharing For Transaction to Acquire a Data Item Must Release All Held Data Items Reacquire All (Released &New) Data Items Needed Overhead to Reacquire Held Data Items Could Encourage Starvation

    23. Avoidance Requires a Multi-Phase Approach Construct a Model of System States Choose a Strategy that Guarantees that the System Will Not Go to a Deadlock State Service Transactions in Some Order, Not Necessarily Order Received Requires Extra Information for Each Transaction Maximum Claim - Every Data Item Each Transaction Will Ever Request Concurrency Controller Sees the Worst Case and can Allow Transitions Based on that Knowledge Goal: To Maintain “Safe” State

    24. Synopsis of Techniques Resource Allocation Policy: Detection - Very Liberal - requested Resources (Data Items) are Granted Where Possible Prevention - Conservative - Undercommits Resources (Data Items) Avoidance - Moderate - Between Detection/Prev. Different Invocation Schemes Detection - Periodically to Test for Deadlock Prevention - Request All Resources at Once, Preempt, and Order Resources Avoidance - Manipulate Transactions to Find at Least One Safe Execution Path

    25. Advantages Disadvantages Detection Never Delays Transaction Initiation Facilitates On-Line Processing Prevention Works Well for “Short” Transactions Enforceable Via Compile Time Checks Run-Time Computation Reduced Avoidance No Preemption Needed Detection Inherent Preemption Losses Prevention Inefficient Preempts Too Often Disallows Incremental Transaction Requests Avoidance Future Transaction Requirements Must be Known in Advance Transactions can be Blocked for Long Periods

    26. Transaction Processing Concepts Basic Transaction Processing Concepts What is a Transaction? Why do we need Concurrency Control in a Multi-User Environment? Atomic Transactions and ACID Properties Serial execution and Serializability Concurrency Control Techniques Locking, Timestamps, and Multiversion Optimistic Concurrency Control Transactions Provide Atomic/Reliable Execution in the Presence of Failures Correct Execution of Multiple User Accesses

    27. What is a Transaction? A Transaction is A Logic Unit of Database Processing Represents the Collection of Actions that Make Consistent Transformations of System States while Preserving System Consistency

    28. Two Sample Transactions Transaction T1 Reads/Writes X/Y, Modifying X by Subtracting N and Y by Adding N Transaction T2 Reads X and Modifies X by Adding M Their Interleaved Execution can Yield Dramatically Different Results! What are the Possibilities?

    29. Example of Transaction for Query Query: User Steve Reserves Seat on Flight 123 Transaction Comprised of Three Steps: Update the Seats Available (CAP) for Flight 123 If Steve is a New Customer, Insert Information for Customer “Steve” into the CUST Table Insert Information for the Reservation into the Flight-Customer Table FC To Record the Flight

    30. Termination of Transactions Note: Checking to See if Steve Customer is Omitted

    31. What is Concurrency Control? Single User vs. Multi-user Environment Programs May Executed in an Interleaved Fashion Concurrency Control Concurrent Execution of Transactions May Interfere with Each Other May Produce an Incorrect Overall Result Even If Each Transaction is Correct When Executed in Isolation Why is Concurrency Control Needed? The Lost Update Problem The Dirty Read Problem The Incorrect Summary Problem The Unrepeatable Read Problem

    32. The Lost Update Problem Problem: Item X has an incorrect value Since its Update by T1 is “lost” (Overwritten by T2)

    33. The Dirty Read Problem Problem: Item X read by T2 is “dirty” (incorrect) Since Due to T1 Failing before Completion (Commit), System Must Undo the Update and Change X Back to Original Value It is Created by a Trans. That Has Not Been Completed/Committed Unfortunately T2 has Read the “temporary” value of X z

    34. The Incorrect Summary Problem Problem: Inconsistent Values w.r.t. Time - X has been Changed but Y has Not - X and Y are “Correct”

    35. Summary of the Problems Lost Update Problem Two processes execute the programs that intend to update the same data item X concurrently X may end up with just one update Dirty Data Read Problem A process may write intermediate values into the database Further writes invalidate that particular value Process rollback also invalidate that value

    36. Summary of the Problems Incorrect Summary Problem Query: Calculate total checking deposits Update: Transfer $1 M from Acct 1 to Acct 2 If query reads account 1 before the update and account 2 after the update, the result is off by $1M The Unrepeatable Read Problem Consider at Transaction T T Reads Data Item X at Time t Another Transaction Y Modifies X at Time t+1 T then Read X again at Time t+2 T has Read Two Different values of X!

    37. Further Transaction and System Concepts Transaction Moves Through Many States from Begin to End From System Issue, Key Concern are Potential Abort When Can Aborts Occur? What are Issues?

    38. Further Transaction and System Concepts Aborting Active Transaction Recovery Likely Not Needed Reads/Writes on “Local” Copies Permanent Copy Not Updated

    39. Further Transaction and System Concepts Aborting Partially Committed Transaction Transaction Commits by Writing Values to DB Suppose Write A, Write B, Write C If Failure After Write A and Before Write C, Transaction Aborts and Corrective Action Needed Must “Undo” Effect of All Completed Writes

    40. Desirable Properties of Transactions (ACID) Database Consists of Set of Data Items Read(x) Gets Last Stored Value in X Write(x) Stores a New Value Into X Atomicity: A Set of R/W Operations that Either Completes Entirely or Not at All Consistency: R/W Operations take the Database from a One Consistent State to Another Consistent State Isolation: No Intermediate Values Produced by the R/W Operations will be Visible to Other Transactions Durability: Once the Transaction is Completed, and All the Updates Are Committed, then these Changes Must Never be Lost because of Subsequent Failure

    41. What is a Schedule? A Schedule S is a Sequence of R/W Operations, Which End with Commit or Abort Different Transactions May Interleave with One Another Each Transaction a Sequence of R/W Operations Two Schedules S1 and S2 are Equivalent, Denoted As S1 ? S2 , If and Only If S1 and S2 Execute the Same Set of Transactions Produce the Same Results (i.e., Both Take the DB to the Same Final State)

    42. Transactions and a Schedule Below are Transactions T1 and T2 Note that the Their Interleaved Execution Shown Below is an Example of One Possible Schedule There are Many Different Interleaves of T1 and T2

    43. Equivalent Schedules Are the Two Schedules below Equivalent? S1 and S4 are Equivalent, since They have the Same Set of Transactions and Produce the Same Results

    44. Serializability of Schedules A Serial Execution of Transactions Runs One Transaction at a Time (e.g., T1 and T2 or T2 and T1) All R/W Operations in Each Transaction Occur Consecutively in S, No Interleaving Consistency: a Serial Schedule takes a Consistent Initial DB State to a Consistent Final State A Schedule S is Called Serializable If there Exists an Equivalent Serial Schedule A Serializable Schedule also takes a Consistent Initial DB State to Another Consistent DB State An Interleaved Execution of a Set of Transactions is Considered Correct if it Produces the Same Final Result as Some Serial Execution of the Same Set of Transactions We Call such an Execution to be Serializable

    45. Example of Serializability Consider S1 and S2 for Transactions T1 and T2 If X = 10 and Y = 20 After S1 or S2 X = 7 and Y = 40

    46. Example of Serializability Consider S1 and S2 for Transactions T1 and T2 If X = 10 and Y = 20 After S1 or S2 X = 7 and Y = 40 Is S3 a Serializable Schedule?

    47. Example of Serializability Consider S1 and S2 for Transactions T1 and T2 If X = 10 and Y = 20 After S1 or S2 X = 7 and Y = 40 Is S4 a Serializable Schedule?

    48. Two Serial Schedules with Different Results Consider S1 and S2 for Transactions T1 and T2 If X = 10 and Y = 20 After S1 X = 7 and Y = 28 After S2 X = 7 and Y = 27

    49. The Serializability Theorem A Dependency Exists Between Two Transactions If: They Access the Same Data Item Consecutively in the Schedule and One of the Accesses is a Write Three Cases: T2 Depends on T1 , Denoted by T1 ?T2 T2 Executes a Read(x) after a Write(x) by T1 T2 Executes a Write(x) after a Read(x) by T1 T2 Executes a Write(x) after a Write(x) by T1 Transaction T1 Precedes Transaction T2 If: There is a Dependency Between T1 and T2, and The R/W Operation in T1 Precedes the Dependent T2 Operation in the Schedule

    50. The Serializability Theorem A Precedence Graph of a Schedule is a Graph G = <TN, DE>, where Each Node is a Single Transaction; i.e.,TN = {T1, ..., Tn} (n>1) and Each Arc (Edge) Represents a Dependency Going from the Preceding Transaction to the Other i.e., DE = {eij | eij = (Ti, Tj), Ti, Tj ??TN} Use Dependency Cases on Prior Slide The Serializability Theorem A Schedule is Serializable if and only of its Precedence Graph is Acyclic

    51. Serializability Theorem Example Consider S1 and S2 for Transactions T1 and T2 Consider the Two Precedence Graphs for S1 and S2 No Cycles in Either Graph!

    52. What are Precedence Graphs for S3 and S4? For S3 T1 ? T2 (T2 Write(X) After T1 Write(X)) T2 ? T1 (T1 Write(X) After T2 Read (X)) For S4 T1 ? T2 (T2 Read/Write(X) After T1 Write(X))

    53. Serializability Facts Serializability Emphasizes Throughput A Schedule that is Serializable is Distinct From Being Serial Serializable Executions Allow us to Enjoy the Benefits of Concurrency without Giving up Any Correctness Serial Execution Does Not Permit Interleaving of Operations from Different Transactions, and Thus can Lead to Low CPU Utilization However, we May NOT GET the Same Result

    54. Serializability Facts Testing for the Serializability of a Schedule is Difficult in Practice: Reasons: Finding a Serializable Schedule for an Arbitrary Set of Transactions is NP-hard The Interleaving of Operations From Concurrent Transactions is Determined by the OS Scheduler Dynamically at Run-time Thus, it is Practically Almost Impossible to Determine the Ordering of Operations Beforehand to Ensure Serializability

    55. Transaction Processing Issues Transaction Structure (Usually Called Transaction Model) Flat (Simple), Nested Internal Database Consistency Semantic Data Control (Integrity Enforcement) Algorithms Reliability Protocols Atomicity & Durability Local Recovery Protocols Global Commit Protocols Concurrency Control Algorithms How to Synchronize Concurrent Transaction Executions (Correctness Criterion) Intra-Transaction Consistency, Isolation

    56. Transaction Execution Who Participates in Transaction Execution?

    57. Concurrency Control Different Locking-Based Algorithms Binary Locks (Lock and Unlock) Share Read Locks and Exclusive Write Locks Write Lock Does Not Imply Read 2 Phase Protocol All Locks Must Precede All Unlocks in Trans. True for All Transactions - Schedule Serializable Concurrency Control Implementation Techniques Optimistic Concurrency Control Time-Based Access to Information Consider “When” Information Read/Written to Identify Potential or Prior Conflicts We’ll Deviate from Chapter 20 Notation

    58. Summary of CC Techniques Two-Phase Locking Most Important in Practice Used by a Majority of DBMSs Serializes in the Middle of Transactions Low Overhead Relatively Low Concurrency Timestamp-Based Based on Multiple Versions of Data Items Serializes at the Beginning of Transactions Mostly Used in Distributed DBMSs Optimistic Concurrency Control Methods Serializes at the End of Transactions Relatively High Concurrency

    59. Recalling Important Concepts Transaction: Sequence of Database Commands that Must be Executed as a Single Unit (Program) Recall SQL Update Query Equivalent to Multiple Operations Read from DB, Modify (Local Copy), Write to DB Modify Sometimes Delete and Insert Granularity: Size of Data that is Locked for an Executing DB Transaction - Wide Range Database Relation (Tuple vs. Entire Table) Attribute (Column) Meta-Data (System Catalog) Locking: Provides Means for Synchronization

    60. Transaction Example Two Possible Outcomes for T1 and T2 If T1 First, then A = 150 If T2 First, then A = 60 Is this a Problem? The Two Different Orderings of T1 and T2 Represent Alternate Serial Schedules (Non-Interleaved) Key Concept: Concurrent (Interleaved) Execution of Several DB Transactions is Correct if and only if its Effect is the Same as that Obtained by Running the Same Transactions in a Serial Order This is the Concept of Serializability!

    61. Recalling Key Definitions A Schedule for a Set of Transactions is the Order in When the Elementary Steps (Read, Lock, Assign, Commit, etc.) are Performed A Schedule is Serial if All Steps of Each Transaction Occur Consecutively A Schedule is Serializable if it is Equivalent to “Some” Serial Schedule If T1, T2 and T3 are Transactions - What are the Possible Serial Schedules? T1 T2 T3 T1 T3 T2 T2 T1 T3 Different Serial Schedules for 4 Transactions?

    62. Another Example of Serializability Is Either Schedule Serializable?

    63. Locks Lock: Variable Associated with a Data Item in DB, Describing the Status of that Item w.r.t. Possible Ops. A Means of Synchronizing the Access by Concurrent Transactions to the Database Item Managed by Lock Manager Binary Locks: Lock(x) and Unlock(x) A Transaction T Must Issue the Lock(x) before any Read(x) or Write(x) A Transaction T Must use the Unlock(x) After all Read(x)/Write(x) Operations are Completed in T System Catalog Maintains a Lock Table for All Locked Items Lock(x)(or Unlock(x)) will not be Granted if there Already Exists a Lock(x) (or Unlock(x))

    64. Database Transaction is a Sequence of Lock/Unlocks Item Locked must Eventually be Unlocked A Transaction Holds a Lock between Lock and Unlock Statements Lock/Unlock Assumes that the Value of the Item Changes For a Number of Transactions that Lock/Unlock A, we’d have: f1(f2(f3( … fn( a0)))) A Basic Lock/Unlock Model

    65. Example - Assessing Schedule Consider Three Transactions Below: T1 has f1(a) and f2(b) T2 has f3(b) and f4(c) and f5(a) T3 has f6(a) and f7 (c) Functions Represent actions that Modify Instances a, b, and c of Data Items A, B, and C, Respectively

    66. Example - Assessing Schedule Consider the Schedule with Changes to a, b, and c Is this Schedule Serializable?

    67. Is this Schedule Serializable? Focus on the Final Line - It indicates the Effective Order of Execution of Each Transaction for a, b, and c T1 has f1(a) and f2(b) T2 has f3(b) and f4(c) and f5(a) T3 has f6(a) and f7 (c) For A - Order of Transactions is T1 T2 T3 For B - T2 Must Precede T1 For C - T2 Must Precede T3 Can All Three Conditions be True w.r.t. Order?

    68. Determining Serializability in this Model Examine Schedule Based on Order in Which Various Transactions Obtain Locks Order must be Equivalent to Some Hypothetical Serial Schedule of Transactions If Orders for Different Data Items Forces Two Transactions to Appear in a Different Order (T2 Must Precede T1 and T1 Must Precede T2 ) There is a Paradox! This is Equivalent to Searching for Cycles in a Directed Graph

    69. Algorithm 1: Binary Lock Model Input: Schedule S for Transactions T1, T2 , … Tk Output: Determination if S is Serializable, and If so, an Equivalent Serial Schedule Method: Create a Directed Precedence Graph G: Let S = a1 ; a2 ; … ; an where each ai is Tj :Lock Am or Tj : Unlock Am For each ai = Tj : Unlock Am , find next ap = Ts : Lock Am (1 < p ? n) (Ts is next Trans. to lock Am), and if so, draw Arc in G from Tj to Ts Repeat Until All Unlock/Lock are Checked Review the Resulting Precedence Graph If G has Cycles - Non-Serializable If G is Acyclic - Topological Sort to Find an Equivalent Serial Schedule

    70. Precedence Graph for Prior Example Look for Unlock Lock Combos on the Same Data Item T2 Unlock B and T1 Lock B T1 Unlock A and T2 Lock A T2 Unlock C and T3 Lock C T2 Unlock A and T3 Lock A

    71. Another Example Look for Unlock Lock Combos on the Same Data Item T2 Unlock A and T3 Lock A T1 Unlock B and T2 Lock B

    72. Two-Phase Protocol Two-Phase Protocol - All Locks Must Precede All Unlocks in the Schedule for a Transaction Which of the Transactions Below are Two-Phase? Why or Why Not?

    73. Theorems Regarding Serializability Theorem 1: Algorithm 1 Correctly Determines if a Schedule S is Serializable (omit the proof). Theorem 2: If S is any Schedule of 2 Phase Transactions (i.e., all of its Transactions are 2-Phase), then S is Serializable. Proof by Contradiction. Suppose Not - they by Theorem 1, S has a Precedence Graph G with a Cycle T1 ? T2 ? T3 … ? Tp ? T1 UNL L UNL UNL L In T1 ? T2 , T1 is Unlock, so all Remaining Actions must also be Unlock, since S is 2 Phase However, in Tp ? T1 , T1 is Lock, which is a Contradiction to Fact that S is 2 Phase

    74. Problems of Binary Locks Only One Transaction Can Hold a Lock on a Given Item No Shared Reading is Allowed - Too Restrictive For Example T1 is Read Only on X - Yet Needs Full Lock T2 is Read Only on X and Y - Needs Full Locks

    75. A Read/Write Lock Model Refines the Granularity of Locking to Differentiate Between Read and Write Locks Improves Concurrent Access Rlock (Shared): If T has an Rlock A, then Any Other Transaction can Also Rlock A, but All Transactions are Forbidden from Wlock A until All Transactions with Rlock A issue Ulock A (Multiple Reads) Wlock (Exclusive): If T has Wlock A, then All Other Transactions are Forbidden to Rlock or Wlock A Until T Ulocks A (Write Implies Reading, Single Write) Two Schedules are Equivalent if: Produce Same Value for Each Data Item Each Rlock on an Item Occurs in Both Schedules at a Time When Locked Item has the Same Value

    76. Algorithm 2: Read/Write Lock Model Input: Schedule S for Transactions T1, T2 , … Tk Output: Is S Serializable? If so, Serial Schedule Method: Create a Directed Precedence Graph G: Suppose in S, Ti :Rlock A. If Tj : Wlock A is the Next Transaction to Wlock A (if it exists) then place an Arc from Ti to Tj. Repeat for all Ti’s. Note for all Rlocks before the Wlock on A! Suppose in S, Ti :Wlock A. If Tj : Wlock A is the Next Transaction to Wlock A (if it exists) then place an Arc from Ti to Tj. Further, if there exists Tm :Rlock A after Ti :Wlock A but before Tj : Wlock A, then Draw an Arc from Ti to Tm. Review the Resulting Precedence Graph If G has Cycles - Non-Serializable If G is Acyclic - Topological Sort for Serial Schedule

    77. Consider the Following Schedule What are the Dependencies Among Transactions?

    78. Consider the Following Schedule What is the Precedence Graph G?

    79. Precedence Graph What is the Resulting Precedence Graph? Is the Schedule Serializable? Why or Why Not?

    80. A Read-Only/Write-Only Lock Model Revision of the Read/Write Model for Algorithm 2 Refining Our Assumptions Assume that a Wlock on an Item Does not Mean that the Transaction First Reads the Item Contrary to First Two Models Example: Read A; Read B; C=A+B; A=A-1; Write A; Write C Reads A, B and Writes A,C (No Read on C) Reformulate Notion of Equivalent Schedules

    81. How Does This Model Differ from Alg. 2? Consider the Schedule Segment: T1 : Wlock A T1 : Ulock A T2 : Wlock A T2 : Ulock A In Algorithm 2 - T2 : Wlock A Assumes that T2 Reads the Value Written by T1 However, This Need Not be True in the New Model If Between T1 and T2, No Transaction Rlocks A, then Value Written by is T1 Lost, and T1 Does not Have to Precede T2 in a Schedule w.r.t. A

    82. Redefine Serializability Conditions on Serializability Must be Redefined in Support of the Write-Does-Not-Assume Read Model If in Schedule S, T2 Reads “A” Written by T1, then T1 Must Precede T2 in any Serial Schedule Equivalent to S Further, if there is a T3 that Writes “A”, then in any Serial Schedule Equivalent to S, T3 may either Precede T1 or Follow T2, but may not Appear Between T1 and T2 Graphically, we have:

    83. Augmentation of Precedence Graph In Support of the Write Does Not Imply Read Model, we must Augment the Precedence Graph: Add an Initial Transaction To that Writes Every Item, and a Final Transaction Tf that Reads Every Item When a Transaction T’s Output is Invisible in Tf (I.e., the Value is Lost), Then T is Referred to as a Useless Transaction Useless Transactions have no Paths from Transaction to Tf Note: Maintain Same set of Locks (Rlock, Wlock, Ulock) with Different Interpretation on Wlock

    84. Intuitive View of Algorithm 3 If T2 Reads Value of “A” Written by T1 , then T2 Must Precede in any Serial Schedule For WR Combo - Draw an Arc from T1 to T2 Now Consider a T3 that also Writes “A” T3 Must be either Before T1 or After T2 Add in a Pair of Arcs T3 to T1 and T2 to T3 of Which one Must be Chosen in the Final Precedence Graph Serializability Occurs if After Choices Made for each “T3” Pair, the Resulting Graph is Acyclic G is Referred to as a “Polygraph” with Nodes, Arcs, and Alternate Arcs

    85. Algorithm 3 Input: Schedule S for Transactions T1, T2 , … Tk Output: Is S Serializable? If so, Serial Schedule Method: Create a Directed Polygraph Graph P: 1. Augment S with Dummy To (Write Every Item) an Dummy Tf (Read Every Item) 2. Create Initial Polygraph P by Adding Nodes for To, Tf, and Each Ti Transaction , in S 3. Place an Arc from Ti to Tj Whenever Tj Reads A in Augmented S (with Dummy States) that was Last Written by Ti. Repeat this Step for all Arcs. Don’t Forget to Consider Dummy States! 4. Discover Useless Transactions - T is Useless if there is no Path from T to Tf This is the “Initialization” Phase of Algorithm 3

    86. Algorithm 3 Method: Reassess the Initial Polygraph P: 5. For Each Remaining Arc Ti to Tj (meaning that Tj Reads Item A Written by Ti ), Consider all other T ? To and T ? Tf that Also Writes A: If Ti = To and Tj = Tf then Add No Arcs If Ti = To and Tj ? Tf then Add Arc from Tj to T If Ti ? To and Tj = Tf then Add Arc from T to Ti If Ti ? To and Tj ? Tf then Add Arc Pair from T to Ti and Tj to T 6. Determine if P is Acyclic by “Choosing” One Transaction Arc for Each Pair - Make Choices Carefully 7. If Acyclic - Serializable - Perform Topological Sort without To , Tf for Equivalent Serial Schedule. Else - Not Serializable

    87. Algorithm 3 Example

    88. Resulting Polygraph - Steps 1-4 1. Add To and Tf to S, 2. Add To , Tf , T1 , T2 , T3 , T4 to Polygraph P 3. Look for Ti Write X to Tj Read X for all Items X 4. Look for Useless Transactions - No Paths from T to Tf

    89. Resulting Polygraph - Steps 1-4 1. Add To and Tf to S, 2. Add To , Tf , T1 , T2 , T3 , T4 to Polygraph P 3. Look for Ti Write X to Tj Read X for all Items X 4. For - T3 Remove Arcs Into T3

    90. Algorithm 3 Example - Step 5 - Writes on A

    91. Resulting Polygraph - Step 5 - A:WR 5. For Each Arc Ti to Tj Consider All T’s that Write X I. If Ti = To and Tj = Tf then Add No Arcs II. If Ti = To and Tj ? Tf then Add Arc from Tj to T III. If Ti ? To and Tj = Tf then Add Arc from T to Ti IV. If Ti ? To and Tj ? Tf then Add Pair from T to Ti and Tj to T Check Items A (see new arcs - case II and III)

    92. Resulting Polygraph - Step 5 - A:WR

    93. Algorithm 3 Example-Step 5 - Writes on C/D

    94. Resulting Polygraph-Step 5- C:WR & D:WR 5. For Each Arc Ti to Tj Consider All T’s that Write X I. If Ti = To and Tj = Tf then Add No Arcs II. If Ti = To and Tj ? Tf then Add Arc from Tj to T III. If Ti ? To and Tj = Tf then Add Arc from T to Ti IV. If Ti ? To and Tj ? Tf then Add Pair from T to Ti and Tj to T Do any Other Transactions Write C or Write D for the arrows labeled C:WR and D:RW Respectively?

    95. Algorithm 3 Example - Step 5 - Writes on B

    96. Resulting Polygraph - Step 5 and 6 5. For Each Arc Ti to Tj Consider All T’s that Write X I. If Ti = To and Tj = Tf then Add No Arcs II. If Ti = To and Tj ? Tf then Add Arc from Tj to T III. If Ti ? To and Tj = Tf then Add Arc from T to Ti IV. If Ti ? To and Tj ? Tf then Add Pair from T to Ti and Tj to T B (see new arcs - including alternates - dashed) For T1 to T2 and T4 - add T2 to T4 and T4 to T1 Either T4 After T2 or Before T1

    97. Resulting Polygraph - Step 5 and 6 6. Which Option of Pair of Arcs Should be Chosen? Why?

    98. Final Polygraph - Step 7 Final Graph Above - Delete Dummy States below Topological Sort Yields Order: T1 , T2 , T3 , T4

    99. Implementation Issues for CC Return to Earlier Diagram… Transaction Manager + Schedule Implement CC

    100. Implementation Issues for CC To Implement Algorithms 1 to 3, Focus on Software Infrastructure (TM and SC) Protocol (CC Model for Algorithms 1 to 3) TM/SC: Arbitrates and Controls Transaction Execution Protocol: Restrictions on the Elementary Steps of a Transaction in Order to Promote Serializability TM/SC + Protocol Comprise the Requirements/Specification of the Concurrency Control Mechanism Concurrency Control Mechanism Itself Can be Modeled as a Lock Manager with Lock Tables

    101. Implementation Issues for CC Locking Modes - In Support of Algorithms 1-3, there is Requirement to Establish Locking Modes Thus, Whatever the Locking Choices (e.g., Binary, Read/Write, etc.), there is a Table that Lists the Compatibility of the Locks w.r.t. Concurrent Behavior For Example, Tables Below Illustrates all Legal Concurrent Actions of Two Transactions R/W Locks (on left) R/W/Increment Locks (on right)

    102. Implementation Issues for CC Locking Modes - Can be Extended and Refined Based on the Level of Granularity that is Desired For Example: Retrieve-Delete-Update-Insert Questions: How Can Two Deletes/Inserts be Compatible? Will Effect of a Delete/Insert be Lost?

    103. Implementation Issues for CC Answer: Focus on Buffer Management Capability Smart Buffer Manager Tracks All Blocks at All Times If T1 loaded Block 123 at Time t, when T2 Goes to Access Block 123 at Time t+10, Buffer Manager Checks to See if Block Already in Memory Buffer Manager also has Concurrency Control!

    104. Implementation Techniques for CC Algorithms 1 to 3 as Presented are Not Directly Implementable! Don’t Integrate the CC Requirements (Protocol) with the TM/SC Typical Implementation Techniques Utilize Queueing Strategies to Impose an Ordering on Transactions: Queue for Each Transaction that Tracks the Data Items Needed by the Transaction for its Execution Queue for Each Data Item that Tracks the Locks Requested and Held by All Transactions Contain Inverse Data of One Another

    105. Examples of Queues What is the State of Each Lock? What is the State of Each Transaction? What Happens when a Transaction, T1, Completes?

    106. Examples of Queues Algorithms that Manage Queues Implement the CC Strategy Lock State is Often Maintained within Queue

    107. A Sidetrack to Recovery Basics Transactions are Liable to Fail for Many Reasons Hardware or Software Failure Deadlock Occurs Transaction Error (e.g., Divide by Zero) after Partial Execution In Either Case We May Need to Abort a Completed Transaction Due to Error in Another Transaction We Must Recover the DB to “Correct” State What do OS’s Do? Weekly Backups of File System Incremental Backups (To another Disk) Raid Arrays System and Editor Log Files

    108. Database Recovery Approaches Evolved from OS Techniques Backup Copies of Database Tape Copies (early days) and CD Copies Online (Shadow Database System or FTP) Off Site Storage of DB (Daily/Weekly) Maintenance of Journal or Log File Containing All Changes to DB Since Last “Backup” Each Journal Entry Contains Transaction ID Old/New Values of Data Item(s) Beginning/Ending Point of Transaction When Failure Occurs Redo Aborted Transactions/Rollback Completed Transactions/Undo Partially Executed Trans.

    109. Two Phase Commit Policy All Actions for a Transaction are Performed in a Workspace (in Memory) Rather than Directly on the DB Copy of the Data These Actions are Written in Journal (Including the Commit Action) Leads to Two-Phase Commit Policy Transaction Cannot Write to DB Until Committed Transaction Cannot Commit Until All Changes have been Recorded First in the Jorunal Two Phases are: Phase 1: Write Data in Journal Phase 2: Write Data in DB Failure Can Occur Anytime!

    110. Why is Two Phase Commit Important? Suppose DB Writes Occur Before Commit Assume a Transaction Aborts in the Middle of Processing Undo DB Changes Made to Actual Database Prior to Failure Relatively Straightforward and Manageable Undo Actions of Other Transactions that Read Information Written by Aborted Transaction Impossible! Undo May Require you to Propagate to Many Other Transactions, Particularly if Aborted Transaction was Long-Duration (hours) Basic Concepts of Recovery are Used to Non-Locking Optimistic CC Approach!

    111. Why Optimistic Concurrency Control? Motivate by Disadvantages of Locking Techniques Lock Maintenance Deadlock-Free Locking Protocols Limit Concurrency Secondary Memory Access Causes Locks to be Held for a Long Duration Locks Typically Held Until Transaction Completes, Which Reduces Concurrency Often Needed in “Worst” Case Only Overhead - Locking + Deadlock Detection Key Concept Write Collisions in Large Databases for “Many” Applications are Rare OCC: “Don’t Worry be Happy” Approach

    112. Basic Ideas of OCC Interference Between Transactions is Rare and Locking Incurs too Much Overhead Instead, Allow Each Transaction to Execute Freely, and Check Serializability at the end of the Transaction Win (Allow to Commit) If No Interference Occurs or There have been No Conflicts

    113. How Does OCC Work? Execute Transactions Ad-Hoc - Let them Go Uncontrolled Maintain Information of “Relevant” Actions Against DB (Often in Conjunction with Recovery/Journal) When Transactions Finish - Check to see if Everything Proceeded Satisfactorily Assumes that Probability of Transaction Interference is Quite Small Two Questions re. OCC: How Do We know Everything Went OK? How do we Recover if it Didn’t?

    114. OCC Utilizes Timestamps Timestamps are Clock Ticks used to Record the Major Milestones in the Execution of a Transaction Examples Include: Start Time of Transaction Read/Write Times for DB Items Finish Time of Transaction Commit Time of Transaction Two Important Definitions are: Read Time of an Item: Highest Time Stamp Possessed by Any Transaction that Reads the Item Write Time of an Item: Highest Time Stamp Possessed by Any Transaction that Wrote the Item

    115. How are Timestamps Used? Focus on “When” Reads and Writes Occur Transaction Cannot Read an Item if its Value was Not Written Until After the Transaction Finished its Execution Transaction T with Timestamp t1 Cannot Read an Item with a Write Time of t2 if t2 > t1 If this is the Case, T Must Abort and be Restarted Can’t Read Item if it hasn’t been Written Transaction Cannot Write an Item if that Item has its Old Value Read at a Later Time Transaction T with Timestamp t1 Cannot Write an Item with a Read Time of t2 if t2 > t1 If this is the Case, T Must Abort and be Restarted Can’t Write Item Being Read at a Later Time

    116. Algorithm 4: Optimistic CC Let T be a Transaction with Timestamp t Attempting to Perform Operation X on a Data Item I with Readtime tR and Writetime tW If (X = Read and t ? tW ) or (X = Write and t ? tR ) then Perform Operation If t > tR then set tR = t for Data Item I If t > tW then set tW = t for Data Item I If (X = Write and tR ? t < tW ) the Do Nothing since Later Write will Cancel out the Write of T If (X = Read and t < tW ) or (X = Write and t < tR ) then Abort the Operation 1st - T trying to Read Item Before it was Written 2nd - T trying to Write an Item Before it was Read

    117. Example of OCC What Happens at Each Step w.r.t. RT/WT?

    118. Example of OCC What Happens at Step 6? WT(C) =150 < RT(C)=175 Consequence - Abort T2

    119. Example of OCC Step (7) T3 can Finish, but No Effect Since 175 < 200 - Discard

    120. Summary of Example T1 Completes Successfully; T2 Aborts; T3 Completes but Doesn’t Write A

    121. Recovery Consideration Actual Write Operations of Previous Example are Phase 1 of Two-Phase Commit (Write to Journal) Commit - Phase 2 - Writes to DB Between Write to Log and Write to DB, No Other Transaction is Allowed to Read Items being Written OCC Reduces Work as Follows: One Step for Read, Two for Writes (write/commit) In Locking, we had Four Steps: Lock, Read or Write, Unlock, Commit

    122. Viewing OCC vs. Phases of Execution Read Phase: Database Information Read from Secondary Storage into Primary Memory All Writes are to Local Workspace Validate Phase: Check to see if Integrity of Data has not been Violated Write Phase: Update the DB (Secondary Storage) from Local Copies

    123. Contrasting PCC and OCC Transaction Control PCC: Control by Having Transactions Wait OCC: Control by Having Transactions Backed up Serializability PCC: Ordering of Data Items OCC: Ordering of Transactions Biggest Potential Problem PCC: Deadlock, rather Preventing it OCC: Starvation Different Applications Suited to Different Approaches Some DBMS Support Both DBA Can Configure on Application-by-Application Basis

    124. Concluding Remarks Background OS Concepts of Sharing and Synchronization Deadlock Detection, Prevention, Avoidance Chapter 19 Transaction Processing Concepts Different Problems re. Concurrency Control Deadlock, Livelock, Starvation Lost Update, Dirty Read, etc. Serial Schedule and Serializability Chapter 20 Deviated from Textbook Notation 3 Pessimistic Locking Based CC Algorithms 1 Optimistic Timestamp Based CC Algorithm Role of Recovery in CC

More Related