1.24k likes | 1.68k Views
Overview of Material. Key Background Topics:Synchronization in Operating SystemsTransaction and Deadlock ConceptsPrevention, Avoidance, DetectionChapter 19 - Transaction ProcessingConcurrency ControlData Consistency ProblemsSchedules and SerializabilityChapter 20 - Concurrency ControlDiffer
E N D
1. Chapters 19 and 20: TransactionProcessing and Concurrency Control A portion of these slides are being used with the permission of Dr. Ling Lui, Associate Professor, College of Computing, Georgia Tech.
Remaining slides represent new material.
2. Overview of Material Key Background Topics:
Synchronization in Operating Systems
Transaction and Deadlock Concepts
Prevention, Avoidance, Detection
Chapter 19 - Transaction Processing
Concurrency Control
Data Consistency Problems
Schedules and Serializability
Chapter 20 - Concurrency Control
Different Locking-Based Algorithms
2 Phase Protocol
Deadlock and Livelock
Optimistic Concurrency Control
3. What is Synchronization? Ability of Two or More Serial Processes to Interact During Their Execution to Achieve Common Goal
Recognition that “Today’s” Applications Require Multiple Interacting Processes
Client/Server and Multi-Tiered Architectures
Inter-Process Communication via TCP/IP
Fundamental Concern: Address Concurrency
Control Access to Shared Information
Historically Supported in Database Systems
Currently Available in Many Programming Languages
4. Thread Synchronization Suppose X and Y are Concurrently Executing in Same Address Space
What are Possibilities?
5. Thread Synchronization Will Second Part Still Finish After Third Part?
Will Second Part Now Finish Before Third Part?
What Happens if Variables are Shared?
This is the Database Concern - Concurrent Transactions Against Shared Tables!
6. Databases have Transactions A Transaction is
A Logic Unit of Database Processing
Represents the Collection of Actions that Make Consistent Transformations of System States while Preserving System Consistency
Interleaved (A and B) and Concurrent (C and D)
7. Two Sample Transactions Transaction T1 Reads/Writes X/Y, Modifying X by Subtracting N and Y by Adding N
Transaction T2 Reads X and Modifies X by Adding M
Transactions can Execute
Serially: T1 followed by T2 (or reverse)
Interleaved: Operation by Operation
8. Why Do we Need to Synchronize? Promote Sharing of Resources, Data, etc.
Cooperating Processes Norm Not Exception
Difficult to Program Solutions
Handle Concurrent Behavior of Processes
Multiple Processes Interacting via OS Resources
Under Control of Process Manager/Scheduler
Performance, Parallel Algorithms & Computations
Multi-Processor Architectures (CSE228)
Different OS to Handle Multiple Processors
Client/Server and Multi-Tier Architectures
Underlying Database Support
Concurrent Transactions
Shared Databases
9. Potential Problems without Synchronization? Data Inconsistency
Lost-Update Problem
Impact on Correctness of Executing Software
Deadlock
Two Processes (Transactions)
Each Hold Unique Resource (Data Item) and Want Resource (Date Item) of Other Process (Trans.)
Processes Wait Forever
Non-Determinacy of Computations
Behavior of Computation Different for Different Executions
Two Processes (Transactions) Produce Different Results When Executed More than Once on Same Data
10. Classic Synchronization Techniques Goal: Shared Variables and Resources
Two Approaches:
Critical Sections
Define a Segment of Code as Critical Section
Once Execution Enters Code Segment it Cannot be Interrupted by Scheduler
Release Point for Critical Section for Interrupts
We’ll Briefly Review
Semaphores
Proposed in 1960s by E. Dijkstra
Utilizes ADTs to Design and Implement Behavior that Guarantees Consistency and Non-Deadlock
Recall your OS Course (CSE258)
11. Critical Sections
12. Critical Sections
13. Critical Sections (continued) There is a Race to Execute Critical Sections
Sections May Be Different Code in Different Processes: Cannot Detect With Static Analysis
Results of Multiple Execution Are Not Determinate
Need an OS Mechanism to Resolve Races
If p1 Wins, R1 and R2 Added to Balance - Okay
If p2 Wins, its Changed Balance Different from One Held by p1 which Adds/Writes Wrong Value
14. General Problem: A Deadly Embrace T1 has A Wants B - Won’t Release A Until it Gets B
T2 has B Wants A - Won’t Release B Until it Gets A
T3 has C Wants A - Won’t Release B Until it Gets A
What is the End Result?
Deadlock!
15. Deadlock in Databases Databases Must Control Access to Information by Multiple Concurrent Transactions (Processes)
How do we Prevent Simultaneous Updates of Database by Concurrent Transactions (Processes)?
Data is the Resource in Database System
16. Addressing Deadlock Deadlock is Global Condition!
Need to Analyze All Processes that Need All Resources
Can’t Make Local Decision Based on Needs of One Process
Four Deadlock Approaches:
Prevention: Never Allow Deadlock to Occur
Avoidance: System Makes Decision to Head Off Future Deadlock State
Detection & Recovery: Check for Deadlock (Periodically or Sporadically), Then Recover
Manual Intervention: Operator Reboot if System Seems Too Slow
17. Prevention: A First Look Design the System So that Deadlock is Impossible
Deadlock Only Occurs If All Following TRUE!!
Mutual Exclusion: Allocated Data Items are Exclusive Property of Transaction
Hold and Wait: Transaction Can Hold Data Items While Waiting for Another Resource
Circular Waiting: T1 has A needs B, T2 has B needs C, … Tn has Z needs A
No Preemption: Only Transaction Can Release Data Items or Withdraw Data Items Request
All Four Necessary for Deadlock to Exist
Prevention Requires Concurrency Control Manager to Violate at Least One Condition at All Times!
18. Avoidance: A First Look Construct a Formal Model of System States
Via Model, Choose a Strategy that Will Not Allow the System to Go to a Deadlock State
Predictive Approach: Requires Transaction to Declare Intent re. Data Items in Advance
Transaction X Needs A, B, and D
Represents “Maximum Claim” on Data Items
Transaction X Won’t Proceed Until all Data Items Available
May Require “Long” Waits
Amenable to Formal Solution/Algorithm
19. Detection and Recovery: A First Look When Deadlock Occurs, Can we Detect and Recover?
Two Phases to Algorithm
Detection: Is there Deadlock?
Recovery: Preempt Data Items from Transactions
Detection Algorithm
When is it Executed?
What is its Overhead?
Too Often - Wastes Data Items
Too Infrequent - Blocked Transactions Don’t Do Enough Work
Dominant Commercial Solution
20. Deadlock in Databases Concurrent Access to Database Information
Optimistic Concurrency Control
Assume Problems Infrequent (ATM Example)
Maintain Transaction Log
Detect and Correct Errors in System via Log “Long-After” Their Occurrence
Similar to What in OS? Deadlock Concepts?
Pessimistic Concurrency Control
Assume Problems will Occur (Airline Example)
Require Transactions to Lock Portions of Data for Read and Write Requests
Similar to What in OS? Deadlock Concepts?
21. Prevention Necessary Conditions for Deadlock
Mutual Exclusion
Hold and Wait
Circular Waiting
No Preemption
Ensure that at Least One of the Necessary Conditions is False at All Times
Why Must Mutual Exclusion Hold at All Times?
Some Data Items (System Catalog) Must be Exclusively Held by a Transaction
How Can a Prevention Strategy be Designed to Guarantee Failure of One of Other Conditions?
22. Hold and Wait Invalidate: Hold and Wait: Transaction Can Hold One Data Item While Waiting for Another Data Item
Approach 1: Targeted to Batch Systems
Transaction Must Request All Data Items it Needs
Transaction Competes for All Data Items Even if Needs Only One Data Item at Time
Holds Data Items “Done” With
Approach 2: Targeted to Timesharing
For Transaction to Acquire a Data Item
Must Release All Held Data Items
Reacquire All (Released &New) Data Items Needed
Overhead to Reacquire Held Data Items
Could Encourage Starvation
23. Avoidance Requires a Multi-Phase Approach
Construct a Model of System States
Choose a Strategy that Guarantees that the System Will Not Go to a Deadlock State
Service Transactions in Some Order, Not Necessarily Order Received
Requires Extra Information for Each Transaction
Maximum Claim - Every Data Item Each Transaction Will Ever Request
Concurrency Controller Sees the Worst Case and can Allow Transitions Based on that Knowledge
Goal: To Maintain “Safe” State
24. Synopsis of Techniques Resource Allocation Policy:
Detection - Very Liberal - requested Resources (Data Items) are Granted Where Possible
Prevention - Conservative - Undercommits Resources (Data Items)
Avoidance - Moderate - Between Detection/Prev.
Different Invocation Schemes
Detection - Periodically to Test for Deadlock
Prevention - Request All Resources at Once, Preempt, and Order Resources
Avoidance - Manipulate Transactions to Find at Least One Safe Execution Path
25. Advantages Disadvantages Detection
Never Delays Transaction Initiation
Facilitates On-Line Processing
Prevention
Works Well for “Short” Transactions
Enforceable Via Compile Time Checks
Run-Time Computation Reduced
Avoidance
No Preemption Needed Detection
Inherent Preemption Losses
Prevention
Inefficient
Preempts Too Often
Disallows Incremental Transaction Requests
Avoidance
Future Transaction Requirements Must be Known in Advance
Transactions can be Blocked for Long Periods
26. Transaction Processing Concepts Basic Transaction Processing Concepts
What is a Transaction?
Why do we need Concurrency Control in a Multi-User Environment?
Atomic Transactions and ACID Properties
Serial execution and Serializability
Concurrency Control Techniques
Locking, Timestamps, and Multiversion
Optimistic Concurrency Control
Transactions Provide
Atomic/Reliable Execution in the Presence of Failures
Correct Execution of Multiple User Accesses
27. What is a Transaction? A Transaction is
A Logic Unit of Database Processing
Represents the Collection of Actions that Make Consistent Transformations of System States while Preserving System Consistency
28. Two Sample Transactions Transaction T1 Reads/Writes X/Y, Modifying X by Subtracting N and Y by Adding N
Transaction T2 Reads X and Modifies X by Adding M
Their Interleaved Execution can Yield Dramatically Different Results!
What are the Possibilities?
29. Example of Transaction for Query Query: User Steve Reserves Seat on Flight 123
Transaction Comprised of Three Steps:
Update the Seats Available (CAP) for Flight 123
If Steve is a New Customer, Insert Information for Customer “Steve” into the CUST Table
Insert Information for the Reservation into the Flight-Customer Table FC To Record the Flight
30. Termination of Transactions Note: Checking to See if Steve Customer is Omitted
31. What is Concurrency Control? Single User vs. Multi-user Environment
Programs May Executed in an Interleaved Fashion
Concurrency Control
Concurrent Execution of Transactions May Interfere with Each Other
May Produce an Incorrect Overall Result
Even If Each Transaction is Correct When Executed in Isolation
Why is Concurrency Control Needed?
The Lost Update Problem
The Dirty Read Problem
The Incorrect Summary Problem
The Unrepeatable Read Problem
32. The Lost Update Problem Problem: Item X has an incorrect value Since its Update by T1 is “lost” (Overwritten by T2)
33. The Dirty Read Problem Problem: Item X read by T2 is “dirty” (incorrect) Since
Due to T1 Failing before Completion (Commit), System Must Undo the Update and Change X Back to Original Value
It is Created by a Trans. That Has Not Been Completed/Committed
Unfortunately T2 has Read the “temporary” value of X
z
34. The Incorrect Summary Problem Problem: Inconsistent Values w.r.t. Time - X has been Changed but Y has Not - X and Y are “Correct”
35. Summary of the Problems Lost Update Problem
Two processes execute the programs that intend to update the same data item X concurrently
X may end up with just one update
Dirty Data Read Problem
A process may write intermediate values into the database
Further writes invalidate that particular value
Process rollback also invalidate that value
36. Summary of the Problems Incorrect Summary Problem
Query: Calculate total checking deposits
Update: Transfer $1 M from Acct 1 to Acct 2
If query reads account 1 before the update and account 2 after the update, the result is off by $1M
The Unrepeatable Read Problem
Consider at Transaction T
T Reads Data Item X at Time t
Another Transaction Y Modifies X at Time t+1
T then Read X again at Time t+2
T has Read Two Different values of X!
37. Further Transaction and System Concepts Transaction Moves Through Many States from Begin to End
From System Issue, Key Concern are Potential Abort
When Can Aborts Occur? What are Issues?
38. Further Transaction and System Concepts Aborting Active Transaction
Recovery Likely Not Needed
Reads/Writes on “Local” Copies
Permanent Copy Not Updated
39. Further Transaction and System Concepts Aborting Partially Committed Transaction
Transaction Commits by Writing Values to DB
Suppose Write A, Write B, Write C
If Failure After Write A and Before Write C, Transaction Aborts and Corrective Action Needed
Must “Undo” Effect of All Completed Writes
40. Desirable Properties of Transactions (ACID) Database Consists of Set of Data Items
Read(x) Gets Last Stored Value in X
Write(x) Stores a New Value Into X
Atomicity: A Set of R/W Operations that Either Completes Entirely or Not at All
Consistency: R/W Operations take the Database from a One Consistent State to Another Consistent State
Isolation: No Intermediate Values Produced by the R/W Operations will be Visible to Other Transactions
Durability: Once the Transaction is Completed, and All the Updates Are Committed, then these Changes Must Never be Lost because of Subsequent Failure
41. What is a Schedule? A Schedule S is a Sequence of R/W Operations, Which End with Commit or Abort
Different Transactions May Interleave with One Another
Each Transaction a Sequence of R/W Operations
Two Schedules S1 and S2 are Equivalent, Denoted As S1 ? S2 , If and Only If S1 and S2
Execute the Same Set of Transactions
Produce the Same Results (i.e., Both Take the DB to the Same Final State)
42. Transactions and a Schedule Below are Transactions T1 and T2
Note that the Their Interleaved Execution Shown Below is an Example of One Possible Schedule
There are Many Different Interleaves of T1 and T2
43. Equivalent Schedules Are the Two Schedules below Equivalent?
S1 and S4 are Equivalent, since They have the Same Set of Transactions and Produce the Same Results
44. Serializability of Schedules A Serial Execution of Transactions Runs One Transaction at a Time (e.g., T1 and T2 or T2 and T1)
All R/W Operations in Each Transaction Occur Consecutively in S, No Interleaving
Consistency: a Serial Schedule takes a Consistent Initial DB State to a Consistent Final State
A Schedule S is Called Serializable If there Exists an Equivalent Serial Schedule
A Serializable Schedule also takes a Consistent Initial DB State to Another Consistent DB State
An Interleaved Execution of a Set of Transactions is Considered Correct if it Produces the Same Final Result as Some Serial Execution of the Same Set of Transactions
We Call such an Execution to be Serializable
45. Example of Serializability Consider S1 and S2 for Transactions T1 and T2
If X = 10 and Y = 20
After S1 or S2 X = 7 and Y = 40
46. Example of Serializability Consider S1 and S2 for Transactions T1 and T2
If X = 10 and Y = 20
After S1 or S2 X = 7 and Y = 40
Is S3 a Serializable Schedule?
47. Example of Serializability Consider S1 and S2 for Transactions T1 and T2
If X = 10 and Y = 20
After S1 or S2 X = 7 and Y = 40
Is S4 a Serializable Schedule?
48. Two Serial Schedules with Different Results Consider S1 and S2 for Transactions T1 and T2
If X = 10 and Y = 20
After S1 X = 7 and Y = 28
After S2 X = 7 and Y = 27
49. The Serializability Theorem A Dependency Exists Between Two Transactions If:
They Access the Same Data Item Consecutively in the Schedule and One of the Accesses is a Write
Three Cases: T2 Depends on T1 , Denoted by T1 ?T2
T2 Executes a Read(x) after a Write(x) by T1
T2 Executes a Write(x) after a Read(x) by T1
T2 Executes a Write(x) after a Write(x) by T1
Transaction T1 Precedes Transaction T2 If:
There is a Dependency Between T1 and T2, and
The R/W Operation in T1 Precedes the Dependent T2 Operation in the Schedule
50. The Serializability Theorem A Precedence Graph of a Schedule is a Graph G = <TN, DE>, where
Each Node is a Single Transaction; i.e.,TN = {T1, ..., Tn} (n>1)
and
Each Arc (Edge) Represents a Dependency Going from the Preceding Transaction to the Other i.e., DE = {eij | eij = (Ti, Tj), Ti, Tj ??TN}
Use Dependency Cases on Prior Slide
The Serializability Theorem
A Schedule is Serializable if and only of its Precedence Graph is Acyclic
51. Serializability Theorem Example Consider S1 and S2 for Transactions T1 and T2
Consider the Two Precedence Graphs for S1 and S2
No Cycles in Either Graph!
52. What are Precedence Graphs for S3 and S4? For S3
T1 ? T2 (T2 Write(X) After T1 Write(X))
T2 ? T1 (T1 Write(X) After T2 Read (X))
For S4 T1 ? T2 (T2 Read/Write(X) After T1 Write(X))
53. Serializability Facts Serializability Emphasizes Throughput
A Schedule that is Serializable is Distinct From Being Serial
Serializable Executions Allow us to Enjoy the Benefits of Concurrency without Giving up Any Correctness
Serial Execution Does Not Permit Interleaving of Operations from Different Transactions, and Thus can Lead to Low CPU Utilization
However, we May NOT GET the Same Result
54. Serializability Facts Testing for the Serializability of a Schedule is Difficult in Practice:
Reasons:
Finding a Serializable Schedule for an Arbitrary Set of Transactions is NP-hard
The Interleaving of Operations From Concurrent Transactions is Determined by the OS Scheduler Dynamically at Run-time
Thus, it is Practically Almost Impossible to Determine the Ordering of Operations Beforehand to Ensure Serializability
55. Transaction Processing Issues Transaction Structure (Usually Called Transaction Model)
Flat (Simple), Nested
Internal Database Consistency
Semantic Data Control (Integrity Enforcement) Algorithms
Reliability Protocols
Atomicity & Durability
Local Recovery Protocols
Global Commit Protocols
Concurrency Control Algorithms
How to Synchronize Concurrent Transaction Executions (Correctness Criterion)
Intra-Transaction Consistency, Isolation
56. Transaction Execution Who Participates in Transaction Execution?
57. Concurrency Control Different Locking-Based Algorithms
Binary Locks (Lock and Unlock)
Share Read Locks and Exclusive Write Locks
Write Lock Does Not Imply Read
2 Phase Protocol
All Locks Must Precede All Unlocks in Trans.
True for All Transactions - Schedule Serializable
Concurrency Control Implementation Techniques
Optimistic Concurrency Control
Time-Based Access to Information
Consider “When” Information Read/Written to Identify Potential or Prior Conflicts
We’ll Deviate from Chapter 20 Notation
58. Summary of CC Techniques Two-Phase Locking
Most Important in Practice
Used by a Majority of DBMSs
Serializes in the Middle of Transactions
Low Overhead
Relatively Low Concurrency
Timestamp-Based
Based on Multiple Versions of Data Items
Serializes at the Beginning of Transactions
Mostly Used in Distributed DBMSs
Optimistic Concurrency Control Methods
Serializes at the End of Transactions
Relatively High Concurrency
59. Recalling Important Concepts Transaction: Sequence of Database Commands that Must be Executed as a Single Unit (Program)
Recall SQL Update Query
Equivalent to Multiple Operations
Read from DB, Modify (Local Copy), Write to DB
Modify Sometimes Delete and Insert
Granularity: Size of Data that is Locked for an Executing DB Transaction - Wide Range
Database
Relation (Tuple vs. Entire Table)
Attribute (Column)
Meta-Data (System Catalog)
Locking: Provides Means for Synchronization
60. Transaction Example Two Possible Outcomes for T1 and T2
If T1 First, then A = 150
If T2 First, then A = 60
Is this a Problem?
The Two Different Orderings of T1 and T2 Represent Alternate Serial Schedules (Non-Interleaved)
Key Concept: Concurrent (Interleaved) Execution of Several DB Transactions is Correct if and only if its Effect is the Same as that Obtained by Running the Same Transactions in a Serial Order
This is the Concept of Serializability!
61. Recalling Key Definitions A Schedule for a Set of Transactions is the Order in When the Elementary Steps (Read, Lock, Assign, Commit, etc.) are Performed
A Schedule is Serial if All Steps of Each Transaction Occur Consecutively
A Schedule is Serializable if it is Equivalent to “Some” Serial Schedule
If T1, T2 and T3 are Transactions - What are the Possible Serial Schedules?
T1 T2 T3
T1 T3 T2
T2 T1 T3
Different Serial Schedules for 4 Transactions?
62. Another Example of Serializability Is Either Schedule Serializable?
63. Locks Lock: Variable Associated with a Data Item in DB, Describing the Status of that Item w.r.t. Possible Ops.
A Means of Synchronizing the Access by Concurrent Transactions to the Database Item
Managed by Lock Manager
Binary Locks: Lock(x) and Unlock(x)
A Transaction T Must Issue the Lock(x) before any Read(x) or Write(x)
A Transaction T Must use the Unlock(x) After all Read(x)/Write(x) Operations are Completed in T
System Catalog Maintains a Lock Table for All Locked Items
Lock(x)(or Unlock(x)) will not be Granted if there Already Exists a Lock(x) (or Unlock(x))
64. Database Transaction is a Sequence of Lock/Unlocks
Item Locked must Eventually be Unlocked
A Transaction Holds a Lock between Lock and Unlock Statements
Lock/Unlock Assumes that the Value of the Item Changes
For a Number of Transactions that Lock/Unlock A, we’d have: f1(f2(f3( … fn( a0)))) A Basic Lock/Unlock Model
65. Example - Assessing Schedule Consider Three Transactions Below:
T1 has f1(a) and f2(b)
T2 has f3(b) and f4(c) and f5(a)
T3 has f6(a) and f7 (c)
Functions Represent actions that Modify Instances a, b, and c of Data Items A, B, and C, Respectively
66. Example - Assessing Schedule Consider the Schedule with Changes to a, b, and c
Is this Schedule Serializable?
67. Is this Schedule Serializable? Focus on the Final Line - It indicates the Effective Order of Execution of Each Transaction for a, b, and c
T1 has f1(a) and f2(b)
T2 has f3(b) and f4(c) and f5(a)
T3 has f6(a) and f7 (c)
For A - Order of Transactions is T1 T2 T3
For B - T2 Must Precede T1
For C - T2 Must Precede T3
Can All Three Conditions be True w.r.t. Order?
68. Determining Serializability in this Model Examine Schedule Based on Order in Which Various Transactions Obtain Locks
Order must be Equivalent to Some Hypothetical Serial Schedule of Transactions
If Orders for Different Data Items Forces Two Transactions to Appear in a Different Order(T2 Must Precede T1 and T1 Must Precede T2 )There is a Paradox!
This is Equivalent to Searching for Cycles in a Directed Graph
69. Algorithm 1: Binary Lock Model Input: Schedule S for Transactions T1, T2 , … Tk
Output: Determination if S is Serializable, and If so, an Equivalent Serial Schedule
Method: Create a Directed Precedence Graph G:
Let S = a1 ; a2 ; … ; an where each ai is Tj :Lock Am or Tj : Unlock Am
For each ai = Tj : Unlock Am , find next ap = Ts : Lock Am (1 < p ? n) (Ts is next Trans. to lock Am), and if so, draw Arc in G from Tj to Ts
Repeat Until All Unlock/Lock are Checked
Review the Resulting Precedence Graph
If G has Cycles - Non-Serializable
If G is Acyclic - Topological Sort to Find an Equivalent Serial Schedule
70. Precedence Graph for Prior Example Look for Unlock Lock Combos on the Same Data Item
T2 Unlock B and T1 Lock B
T1 Unlock A and T2 Lock A
T2 Unlock C and T3 Lock C
T2 Unlock A and T3 Lock A
71. Another Example Look for Unlock Lock Combos on the Same Data Item
T2 Unlock A and T3 Lock A
T1 Unlock B and T2 Lock B
72. Two-Phase Protocol Two-Phase Protocol - All Locks Must Precede All Unlocks in the Schedule for a Transaction
Which of the Transactions Below are Two-Phase?
Why or Why Not?
73. Theorems Regarding Serializability Theorem 1: Algorithm 1 Correctly Determines if a Schedule S is Serializable (omit the proof).
Theorem 2: If S is any Schedule of 2 Phase Transactions (i.e., all of its Transactions are 2-Phase), then S is Serializable.
Proof by Contradiction.
Suppose Not - they by Theorem 1, S has a Precedence Graph G with a Cycle
T1 ? T2 ? T3 … ? Tp ? T1 UNL L UNL UNL L
In T1 ? T2 , T1 is Unlock, so all Remaining Actions must also be Unlock, since S is 2 Phase
However, in Tp ? T1 , T1 is Lock, which is a Contradiction to Fact that S is 2 Phase
74. Problems of Binary Locks Only One Transaction Can Hold a Lock on a Given Item
No Shared Reading is Allowed - Too Restrictive
For Example
T1 is Read Only on X - Yet Needs Full Lock
T2 is Read Only on X and Y - Needs Full Locks
75. A Read/Write Lock Model Refines the Granularity of Locking to Differentiate Between Read and Write Locks
Improves Concurrent Access
Rlock (Shared): If T has an Rlock A, then Any Other Transaction can Also Rlock A, but All Transactions are Forbidden from Wlock A until All Transactions with Rlock A issue Ulock A (Multiple Reads)
Wlock (Exclusive): If T has Wlock A, then All Other Transactions are Forbidden to Rlock or Wlock A Until T Ulocks A (Write Implies Reading, Single Write)
Two Schedules are Equivalent if:
Produce Same Value for Each Data Item
Each Rlock on an Item Occurs in Both Schedules at a Time When Locked Item has the Same Value
76. Algorithm 2: Read/Write Lock Model Input: Schedule S for Transactions T1, T2 , … Tk
Output: Is S Serializable? If so, Serial Schedule
Method: Create a Directed Precedence Graph G:
Suppose in S, Ti :Rlock A. If Tj : Wlock A is the Next Transaction to Wlock A (if it exists) then place an Arc from Ti to Tj. Repeat for all Ti’s. Note for all Rlocks before the Wlock on A!
Suppose in S, Ti :Wlock A. If Tj : Wlock A is the Next Transaction to Wlock A (if it exists) then place an Arc from Ti to Tj. Further, if there exists Tm :Rlock A after Ti :Wlock A but before Tj : Wlock A, then Draw an Arc from Ti to Tm.
Review the Resulting Precedence Graph
If G has Cycles - Non-Serializable
If G is Acyclic - Topological Sort for Serial Schedule
77. Consider the Following Schedule What are the Dependencies Among Transactions?
78. Consider the Following Schedule What is the Precedence Graph G?
79. Precedence Graph What is the Resulting Precedence Graph?
Is the Schedule Serializable?
Why or Why Not?
80. A Read-Only/Write-Only Lock Model Revision of the Read/Write Model for Algorithm 2
Refining Our Assumptions
Assume that a Wlock on an Item Does not Mean that the Transaction First Reads the ItemContrary to First Two Models
Example:Read A; Read B; C=A+B; A=A-1; Write A; Write CReads A, B and Writes A,C (No Read on C)
Reformulate Notion of Equivalent Schedules
81. How Does This Model Differ from Alg. 2? Consider the Schedule Segment:T1 : Wlock A T1 : Ulock A T2 : Wlock A T2 : Ulock A
In Algorithm 2 - T2 : Wlock A Assumes that T2 Reads the Value Written by T1
However, This Need Not be True in the New Model
If Between T1 and T2, No Transaction Rlocks A, then Value Written by is T1 Lost, and T1 Does not Have to Precede T2 in a Schedule w.r.t. A
82. Redefine Serializability Conditions on Serializability Must be Redefined in Support of the Write-Does-Not-Assume Read Model
If in Schedule S, T2 Reads “A” Written by T1, then
T1 Must Precede T2 in any Serial Schedule Equivalent to S
Further, if there is a T3 that Writes “A”, then in any Serial Schedule Equivalent to S, T3 may either Precede T1 or Follow T2, but may not Appear Between T1 and T2
Graphically, we have:
83. Augmentation of Precedence Graph In Support of the Write Does Not Imply Read Model, we must Augment the Precedence Graph:
Add an Initial Transaction To that Writes Every Item, and a Final Transaction Tf that Reads Every Item
When a Transaction T’s Output is Invisible in Tf (I.e., the Value is Lost), Then T is Referred to as a Useless Transaction
Useless Transactions have no Paths from Transaction to Tf
Note: Maintain Same set of Locks (Rlock, Wlock, Ulock) with Different Interpretation on Wlock
84. Intuitive View of Algorithm 3 If T2 Reads Value of “A” Written by T1 , then T2 Must Precede in any Serial Schedule
For WR Combo - Draw an Arc from T1 to T2
Now Consider a T3 that also Writes “A”
T3 Must be either Before T1 or After T2
Add in a Pair of Arcs T3 to T1 and T2 to T3 of Which one Must be Chosen in the Final Precedence Graph
Serializability Occurs if After Choices Made for each “T3” Pair, the Resulting Graph is Acyclic
G is Referred to as a “Polygraph” with Nodes, Arcs, and Alternate Arcs
85. Algorithm 3 Input: Schedule S for Transactions T1, T2 , … Tk
Output: Is S Serializable? If so, Serial Schedule
Method: Create a Directed Polygraph Graph P:
1. Augment S with Dummy To (Write Every Item) an Dummy Tf (Read Every Item)
2. Create Initial Polygraph P by Adding Nodes for To, Tf, and Each Ti Transaction , in S
3. Place an Arc from Ti to Tj Whenever Tj Reads A in Augmented S (with Dummy States) that was Last Written by Ti. Repeat this Step for all Arcs.Don’t Forget to Consider Dummy States!
4. Discover Useless Transactions - T is Useless if there is no Path from T to Tf
This is the “Initialization” Phase of Algorithm 3
86. Algorithm 3 Method: Reassess the Initial Polygraph P:
5. For Each Remaining Arc Ti to Tj (meaning that Tj Reads Item A Written by Ti ), Consider all other T ? To and T ? Tf that Also Writes A:
If Ti = To and Tj = Tf then Add No Arcs
If Ti = To and Tj ? Tf then Add Arc from Tj to T
If Ti ? To and Tj = Tf then Add Arc from T to Ti
If Ti ? To and Tj ? Tf then Add Arc Pair from T to Ti and Tj to T
6. Determine if P is Acyclic by “Choosing” One Transaction Arc for Each Pair - Make Choices Carefully
7. If Acyclic - Serializable - Perform Topological Sort without To , Tf for Equivalent Serial Schedule. Else - Not Serializable
87. Algorithm 3 Example
88. Resulting Polygraph - Steps 1-4 1. Add To and Tf to S,
2. Add To , Tf , T1 , T2 , T3 , T4 to Polygraph P
3. Look for Ti Write X to Tj Read X for all Items X
4. Look for Useless Transactions - No Paths from T to Tf
89. Resulting Polygraph - Steps 1-4 1. Add To and Tf to S,
2. Add To , Tf , T1 , T2 , T3 , T4 to Polygraph P
3. Look for Ti Write X to Tj Read X for all Items X
4. For - T3 Remove Arcs Into T3
90. Algorithm 3 Example - Step 5 - Writes on A
91. Resulting Polygraph - Step 5 - A:WR 5. For Each Arc Ti to Tj Consider All T’s that Write X
I. If Ti = To and Tj = Tf then Add No Arcs
II. If Ti = To and Tj ? Tf then Add Arc from Tj to T
III. If Ti ? To and Tj = Tf then Add Arc from T to Ti
IV. If Ti ? To and Tj ? Tf then Add Pair from T to Ti and Tj to T
Check Items A (see new arcs - case II and III)
92. Resulting Polygraph - Step 5 - A:WR
93. Algorithm 3 Example-Step 5 - Writes on C/D
94. Resulting Polygraph-Step 5- C:WR & D:WR 5. For Each Arc Ti to Tj Consider All T’s that Write X
I. If Ti = To and Tj = Tf then Add No Arcs
II. If Ti = To and Tj ? Tf then Add Arc from Tj to T
III. If Ti ? To and Tj = Tf then Add Arc from T to Ti
IV. If Ti ? To and Tj ? Tf then Add Pair from T to Ti and Tj to T
Do any Other Transactions Write C or Write D for the arrows labeled C:WR and D:RW Respectively?
95. Algorithm 3 Example - Step 5 - Writes on B
96. Resulting Polygraph - Step 5 and 6 5. For Each Arc Ti to Tj Consider All T’s that Write X
I. If Ti = To and Tj = Tf then Add No Arcs
II. If Ti = To and Tj ? Tf then Add Arc from Tj to T
III. If Ti ? To and Tj = Tf then Add Arc from T to Ti
IV. If Ti ? To and Tj ? Tf then Add Pair from T to Ti and Tj to T
B (see new arcs - including alternates - dashed)
For T1 to T2 and T4 - add T2 to T4 and T4 to T1
Either T4 After T2 or Before T1
97. Resulting Polygraph - Step 5 and 6 6. Which Option of Pair of Arcs Should be Chosen? Why?
98. Final Polygraph - Step 7 Final Graph Above - Delete Dummy States below
Topological Sort Yields Order: T1 , T2 , T3 , T4
99. Implementation Issues for CC Return to Earlier Diagram…
Transaction Manager + Schedule Implement CC
100. Implementation Issues for CC To Implement Algorithms 1 to 3, Focus on
Software Infrastructure (TM and SC)
Protocol (CC Model for Algorithms 1 to 3)
TM/SC: Arbitrates and Controls Transaction Execution
Protocol: Restrictions on the Elementary Steps of a Transaction in Order to Promote Serializability
TM/SC + Protocol
Comprise the Requirements/Specification of the Concurrency Control Mechanism
Concurrency Control Mechanism Itself Can be Modeled as a Lock Manager with Lock Tables
101. Implementation Issues for CC Locking Modes - In Support of Algorithms 1-3, there is Requirement to Establish Locking Modes
Thus, Whatever the Locking Choices (e.g., Binary, Read/Write, etc.), there is a Table that Lists the Compatibility of the Locks w.r.t. Concurrent Behavior
For Example, Tables Below Illustrates all Legal Concurrent Actions of Two Transactions
R/W Locks (on left)
R/W/Increment Locks (on right)
102. Implementation Issues for CC Locking Modes - Can be Extended and Refined Based on the Level of Granularity that is Desired
For Example: Retrieve-Delete-Update-Insert
Questions:
How Can Two Deletes/Inserts be Compatible?
Will Effect of a Delete/Insert be Lost?
103. Implementation Issues for CC Answer:
Focus on Buffer Management Capability
Smart Buffer Manager Tracks All Blocks at All Times
If T1 loaded Block 123 at Time t, when T2 Goes to Access Block 123 at Time t+10, Buffer Manager Checks to See if Block Already in Memory
Buffer Manager also has Concurrency Control!
104. Implementation Techniques for CC Algorithms 1 to 3 as Presented are Not Directly Implementable!
Don’t Integrate the CC Requirements (Protocol) with the TM/SC
Typical Implementation Techniques Utilize Queueing Strategies to Impose an Ordering on Transactions:
Queue for Each Transaction that Tracks the Data Items Needed by the Transaction for its Execution
Queue for Each Data Item that Tracks the Locks Requested and Held by All Transactions
Contain Inverse Data of One Another
105. Examples of Queues What is the State of Each Lock?
What is the State of Each Transaction?
What Happens when a Transaction, T1, Completes?
106. Examples of Queues Algorithms that Manage Queues Implement the CC Strategy
Lock State is Often Maintained within Queue
107. A Sidetrack to Recovery Basics Transactions are Liable to Fail for Many Reasons
Hardware or Software Failure
Deadlock Occurs
Transaction Error (e.g., Divide by Zero) after Partial Execution
In Either Case
We May Need to Abort a Completed Transaction Due to Error in Another Transaction
We Must Recover the DB to “Correct” State
What do OS’s Do?
Weekly Backups of File System
Incremental Backups (To another Disk)
Raid Arrays
System and Editor Log Files
108. Database Recovery Approaches Evolved from OS Techniques
Backup Copies of Database
Tape Copies (early days) and CD Copies
Online (Shadow Database System or FTP)
Off Site Storage of DB (Daily/Weekly)
Maintenance of Journal or Log File Containing
All Changes to DB Since Last “Backup”
Each Journal Entry Contains
Transaction ID
Old/New Values of Data Item(s)
Beginning/Ending Point of Transaction
When Failure Occurs
Redo Aborted Transactions/Rollback Completed Transactions/Undo Partially Executed Trans.
109. Two Phase Commit Policy All Actions for a Transaction are Performed in a Workspace (in Memory) Rather than Directly on the DB Copy of the Data
These Actions are Written in Journal (Including the Commit Action)
Leads to Two-Phase Commit Policy
Transaction Cannot Write to DB Until Committed
Transaction Cannot Commit Until All Changes have been Recorded First in the Jorunal
Two Phases are:
Phase 1: Write Data in Journal
Phase 2: Write Data in DB
Failure Can Occur Anytime!
110. Why is Two Phase Commit Important? Suppose DB Writes Occur Before Commit
Assume a Transaction Aborts in the Middle of Processing
Undo DB Changes Made to Actual Database Prior to Failure
Relatively Straightforward and Manageable
Undo Actions of Other Transactions that Read Information Written by Aborted Transaction
Impossible! Undo May Require you to Propagate to Many Other Transactions, Particularly if Aborted Transaction was Long-Duration (hours)
Basic Concepts of Recovery are Used to Non-Locking Optimistic CC Approach!
111. Why Optimistic Concurrency Control? Motivate by Disadvantages of Locking Techniques
Lock Maintenance
Deadlock-Free Locking Protocols Limit Concurrency
Secondary Memory Access Causes Locks to be Held for a Long Duration
Locks Typically Held Until Transaction Completes, Which Reduces Concurrency
Often Needed in “Worst” Case Only
Overhead - Locking + Deadlock Detection
Key Concept
Write Collisions in Large Databases for “Many” Applications are Rare
OCC: “Don’t Worry be Happy” Approach
112. Basic Ideas of OCC Interference Between Transactions is Rare and Locking Incurs too Much Overhead
Instead, Allow Each Transaction to Execute Freely, and Check Serializability at the end of the Transaction
Win (Allow to Commit) If No Interference Occurs or There have been No Conflicts
113. How Does OCC Work? Execute Transactions Ad-Hoc - Let them Go Uncontrolled
Maintain Information of “Relevant” Actions Against DB (Often in Conjunction with Recovery/Journal)
When Transactions Finish - Check to see if Everything Proceeded Satisfactorily
Assumes that Probability of Transaction Interference is Quite Small
Two Questions re. OCC:
How Do We know Everything Went OK?
How do we Recover if it Didn’t?
114. OCC Utilizes Timestamps Timestamps are Clock Ticks used to Record the Major Milestones in the Execution of a Transaction
Examples Include:
Start Time of Transaction
Read/Write Times for DB Items
Finish Time of Transaction
Commit Time of Transaction
Two Important Definitions are:
Read Time of an Item: Highest Time Stamp Possessed by Any Transaction that Reads the Item
Write Time of an Item: Highest Time Stamp Possessed by Any Transaction that Wrote the Item
115. How are Timestamps Used? Focus on “When” Reads and Writes Occur
Transaction Cannot Read an Item if its Value was Not Written Until After the Transaction Finished its Execution
Transaction T with Timestamp t1 Cannot Read an Item with a Write Time of t2 if t2 > t1
If this is the Case, T Must Abort and be Restarted
Can’t Read Item if it hasn’t been Written
Transaction Cannot Write an Item if that Item has its Old Value Read at a Later Time
Transaction T with Timestamp t1 Cannot Write an Item with a Read Time of t2 if t2 > t1
If this is the Case, T Must Abort and be Restarted
Can’t Write Item Being Read at a Later Time
116. Algorithm 4: Optimistic CC Let T be a Transaction with Timestamp t Attempting to Perform Operation X on a Data Item I with Readtime tR and Writetime tW
If (X = Read and t ? tW ) or (X = Write and t ? tR ) then Perform Operation
If t > tR then set tR = t for Data Item I
If t > tW then set tW = t for Data Item I
If (X = Write and tR ? t < tW ) the Do Nothing since Later Write will Cancel out the Write of T
If (X = Read and t < tW ) or (X = Write and t < tR ) then Abort the Operation
1st - T trying to Read Item Before it was Written
2nd - T trying to Write an Item Before it was Read
117. Example of OCC What Happens at Each Step w.r.t. RT/WT?
118. Example of OCC What Happens at Step 6? WT(C) =150 < RT(C)=175
Consequence - Abort T2
119. Example of OCC Step (7) T3 can Finish, but No Effect Since 175 < 200 - Discard
120. Summary of Example T1 Completes Successfully; T2 Aborts; T3 Completes but Doesn’t Write A
121. Recovery Consideration Actual Write Operations of Previous Example are Phase 1 of Two-Phase Commit (Write to Journal)
Commit - Phase 2 - Writes to DB
Between Write to Log and Write to DB, No Other Transaction is Allowed to Read Items being Written
OCC Reduces Work as Follows:
One Step for Read, Two for Writes (write/commit)
In Locking, we had Four Steps:
Lock, Read or Write, Unlock, Commit
122. Viewing OCC vs. Phases of Execution Read Phase:
Database Information Read from Secondary Storage into Primary Memory
All Writes are to Local Workspace
Validate Phase:
Check to see if Integrity of Data has not been Violated
Write Phase:
Update the DB (Secondary Storage) from Local Copies
123. Contrasting PCC and OCC Transaction Control
PCC: Control by Having Transactions Wait
OCC: Control by Having Transactions Backed up
Serializability
PCC: Ordering of Data Items
OCC: Ordering of Transactions
Biggest Potential Problem
PCC: Deadlock, rather Preventing it
OCC: Starvation
Different Applications Suited to Different Approaches
Some DBMS Support Both
DBA Can Configure on Application-by-Application Basis
124. Concluding Remarks Background
OS Concepts of Sharing and Synchronization
Deadlock Detection, Prevention, Avoidance
Chapter 19
Transaction Processing Concepts
Different Problems re. Concurrency Control
Deadlock, Livelock, Starvation
Lost Update, Dirty Read, etc.
Serial Schedule and Serializability
Chapter 20
Deviated from Textbook Notation
3 Pessimistic Locking Based CC Algorithms
1 Optimistic Timestamp Based CC Algorithm
Role of Recovery in CC