460 likes | 569 Views
University of Palestine. Software Engineering Department. Advanced Database Course (ESED5204). Eng. Hanan Alyazji. Concurrency Control. Concurrency Control. Problem : in a multi-user environment, simultaneous access to data can result in interference and data loss.
E N D
University of Palestine Software Engineering Department Advanced Database Course (ESED5204) Eng. Hanan Alyazji
Concurrency Control • Problem : in a multi-user environment, simultaneous access to data can result in interference and data loss. • Solution : Concurrency Control • Multiple users access databases and use computer systems simultaneously. Example: In concurrent execution environment if T1 conflicts with T2 over a data item A, then the existing concurrency control decides if T1 or T2 should get the A and if the other transaction is rolled-back or waits.
What is Concurrent Process? • The process of managing simultaneous operations against a database so that data integrity is maintained and the operations do not interfere with each other in a multi-user environment. • Concurrency Control process: Ensure serializability of transactions in multiuser database environment. Example: • Airline reservation system. • An airline reservation system is used by hundreds of travel agents and reservation clerks concurrently.
Why Concurrent Process? • Better transaction throughput and response time. • Better utilization of resource. BUT simultaneous execution of transactions creates data integrity & consistency problems: • Lost updates • Uncommitted data • Inconsistent retrievals
1. Lost Update Problem • Two transactions accessing the same database item have their operations interleaved in a way that makes the database item incorrect. Transaction A Transaction B 1 Read data B writes over the same data written by A, without taking into account the changes made by A 1 Read data 2 Write data Write data 2
Example: T2 T1 read_item (X) X = X + M write_item (X) read_item ( X ) X = X – N write_item (X ) t i m e • Aftertermination of T2, X = X + M. T1's update to X is lost because T2 wrote over X. Generally, lost update problem is characterized by: • T2 reads X, T1 writes X, and T2 writes X.
Example: X=4 Y=8 N=2 M=3 • item X has incorrect value because its update from T1 is “lost” (overwritten) • T2 reads the value of X before T1 changes it in the database and hence the updated database value resulting from T1 is lost.
2. Uncommitted Read Problem: • Data are not committed when 2 transactions T1 & T2 are executed concurrently and the first transaction is rolled back after second transaction has already accessed uncommitted data. • An uncommitted read problem occurs when a first transaction relies on a change which has not yet been committed, which is rolled back after the second transaction has begun. • An Uncommitted Readis also called “Dirty Read”.
Dirty read • One transaction updates a database item and then the transaction fails. The updated item is accessed by another transaction before it is changed back to its original value. • Dirty Read is data written by an uncommitted transaction. Transaction A Transaction A Transaction B Transaction B 1 1 Write data Write data An Uncommitted Read (also called a Dirty Read) happens when B reads uncommitted data modified by A 1 1 Read data Read data 2 2 2 2
Example: T2 T1 read_item (X) X = X + M write_item (X) read_item ( X ) X = X – N write_item (X ) T1 fails t i m e • T2 has already read item X = X - N value, and that value is going to be altered by DBMS back to X. Generally, dirty read problem is characterized by: • T1 writes X, T2 reads X, and T1 fails.
3. Inconsistent retrievals Problem: • Inconsistent retrievals or unrepeatable read Occurs when transaction reads several values but second transaction updates some of them during execution of first. • A transaction, if it reads the same data item more than once, should always read the same value. • It occurs when a transaction T1 reads a database item A, another transaction T2 updates A and commits, and then T1 reads the altered A again.
Example: T2 T1 read_item (X) X = X + M write_item (X) read_item ( X ) read_item (X ) t i m e • Transaction T1has got two different values of X in two subsequent reads, because T2has changed it in the meantime. Generally, unrepeatable read problem is characterized by: • T1 reads X, T2 writes X, and T1 reads X.
Important Note: Having two transactions T1 & T2 if: • T2 reads X, T1 writes X, and T2 writes Lost update problem • T1 writes X, T2 reads X, and T1 fails dirty readproblem • T1 reads X, T2 writes X, and T1 reads X unrepeatable readproblem
A Question for You …… • What is the difference between: Dirty read and Unrepeatable read problem? • Answers: • The difference is: • The dirty read is a consequence of reading updates made by a transaction before it has successfully finished (and has even failed later). • The unrepeatable read is a consequence of allowing a transaction to read data that the other one is altering.
Concurrency Control Techniques • Two basic concurrency control techniques: • Locking • Timestamping • Both are conservative approaches: delay transactions in case they conflict with other transactions.
Locking • A lock is a variable associated with a data item that describes the status of the item with respect to possible operations that can be applied to it. • Locking is an operation which secures a permission to Read or a permission to Write a data item for a transaction. • Example:Lock (X): Data item X is locked in behalf of the requesting transaction. • Unlocking is an operation which removes these permissions from the data item. • Example:Unlock (X): Data item X is made available to all other transactions.
Example: T1 read_lock (Y) read_item (Y); unlock (Y); write_lock (X); read_item (X); X:=X+Y; write_item (X); unlock (X);
Locking • A transaction locks a portion of the database to prevent concurrency problems. • Before reading from a resource a transaction must acquire a read-lock and Before writing to a resource a transaction must acquire a write-lock. • Two locks modes: • Shared mode:shared lock (X). More than one transaction can apply share lock on X for reading its value but no write lock can be applied on X by any other transaction. • Exclusive mode:Write lock (X). Only one write lock on X can exist at any time and no shared lock can be applied by any other transaction on X.
Locking - Basic Rules • If transaction has read lock on an item, the item can be read but not updated. • If transaction has write lock on an item, the item can be both read and updated. • Reads cannot conflict, so multiple transactions can hold read locks simultaneously on the same item. • Write lock gives one transaction exclusive access to an item. • Need to commit a transaction before a write lock is released.
Database requires that all transactions should be well-formed. A transaction is well-formed if: • It must lock the data item before it reads or writes to it. • It must unlock the data item after it is done with it. • It must not lock an already locked data item. • It must not try to unlock a free data item.
Two-Phase Locking (2PL) • Two-Phase Locking occurs if all locking operations precede first unlock operation in the transaction. • All lock operations must precede the first unlock operation. Theorem: If all transactions in a schedule obey locking rules and two phase locking protocol, the schedule is a conflict serializable one.
Example: T1 T2 read-lock(X) read-lock(X) Read(X) Read(X) write-lock(Y) unlock(X) unlock(X) write-lock(Y) Read(Y) Read(Y) Y = Y + X Y = Y + X Write(Y) Write(Y) unlock(Y) unlock(Y) • T1 follows 2PL protocol • All of its locks are acquired before it releases any of them • T2 does not • It releases its lock on X and then goes on to later acquire a lock o Y
Exercise:Which transaction follow the 2PL protocol?? T3 read_lock (Y); read_item (Y); write_lock (X); unlock(Y); read_item (X); X:=X+Y; write_item (X); unlock (X); T1 read_lock (Y) read_item (Y); unlock (Y); write_lock (X); read_item (X); X:=X+Y; write_item (X); unlock (X); T2 read_lock (X); read_item (X); unlock (X); Write_lock (Y); read_item (Y); Y:=X+Y; write_item (Y); unlock (Y); T4 read_lock (X); read_item (X); Write_lock (Y); unlock(X) read_item (Y); Y:=X+Y; write_item (Y); unlock (Y)
Lost Update and Two Phase Locking T2 T1 write_lock(X ) //has to wait write_lock(X ) read_item (X) X = X + M write_item(X ) unlock(X ) read_lock(X ) read_item ( X ) X = X – N write_lock(X ) write_item (X ) unlock(X ) T2 can not obtain a write_lock on X since T1 holds a read lock on X and it has to wait. When T1 releases lock on X, T2 acquires a lock on X and finishes successfully. t i m e Two phase locking provides for a safe conflict serializable schedule.
Two Phase Locking: Dirty Read T2 T1 write_lock(X ) read_item ( X ) X = X – N write_item (X ) write_lock(Y ) unlock(X ) read_item ( Y ) Y = Y + Q write_item (Y ) unlock(Y ) //T1 fails before it commits write_lock(X ) read_item X X = X + M write_item(X ) unlock(X ) If T1 gets exclusive lock on X first, T2 has to wait until T1 unlocks X Two phase locking alone does not solve the dirty read problem, because T2 is allowed to read uncommitted database item X t i m e Two phase locking alone does not solve the dirty read problem.
T1 T2 Deadlocks • 2PL can lead to deadlocks • Different transactions wait for each other to release locks. • Only one way to break deadlock: abort one or more of the transactions. • Represent the waiting relationship as waiting graph • Directed edge from Ti to Tj if Ti waits for Tj T1:T2: xlock (x) write (x) xlock (y) write (y) xlock (y) xlock (x) waiting graph
T1 T2 2. Deadlock Detection • Alternative is to allow deadlocks to happen but to check for them and fix them if found. • Create a waits-for graph: • Nodes are transactions • There is an edge from Ti to Tj if Ti is waiting for Tj to release a lock. • Periodically check for cycles in the waits-for graph.
Deadlock Detection: Example: T1: S(A), S(D), S(B) T2: X(B) X(C) T3: S(D), S(C), X(A) T4: X(B) T1 T2 T4 T3
Deadlock • Two ways of dealing with deadlocks: • Deadlock prevention. • Deadlock detection. 1.Deadlock Prevention: • Assign priorities based on timestamps. Assume Ti wants a lock that Tj holds. Two policies are possible: • Wait-Die: If Ti has higher priority, Ti waits for Tj; otherwise Ti aborts • Wound-wait: If Ti has higher priority, Tj aborts; otherwise Ti waits
wait? Wait-Die Algorithm: • Transactions given a timestamp when they arrive …. ts(Ti) • Ti can only wait for Tj if ts(Ti) < ts(Tj) ...else die Example1: T1 (ts =10) T2 (ts =20) T3 (ts =25) wait wait
Example 2: T1 (ts =22) T2 (ts =20) T3 (ts =25) requests A: wait for T2 or T3? Note: ts between 20 and 25. wait(A)
One option: T1 waits just for T3, transaction holding lock. But when T2 gets lock, T1 will have to die! T1 (ts =22) T2 (ts =20) T3 (ts =25) wait(A) wait(A)
Another option: T1 only gets A lock after T2, T3 complete, so T1 waits for both T2, T3 T1 dies right away! T1 (ts =22) T2 (ts =20) T3 (ts =25) wait(A) wait(A) wait(A)
Yet another option: T1 preempts T2, so T1 only waits for T3; T2 then waits for T3 and T1 T2 dies T1 (ts =22) T2 (ts =20) T3 (ts =25) wait(A) wait(A) wait(A)
wait Wound-wait Algorithm: • Transactions given a timestamp when they arrive … ts(Ti) • Ti wounds Tj if ts(Ti) < ts(Tj) else Ti waits “Wound”: Tj rolls back and gives lock to Ti Example1: T1 (ts =25) T2 (ts =20) T3 (ts =10) wait wait
Example2: T1 (ts =15) T2 (ts =20) T3 (ts =10) requests A: wait for T2 or T3? Note: ts between 10 and 20. wait(A)
One option: T1 waits just for T3, transaction holding lock. But when T2 gets lock, T1 waits for T2 and wounds T2. T1 (ts =15) T2 (ts =20) T3 (ts =10) wait(A) wait(A)
Another option: T1 only gets A lock after T2, T3 complete, so T1 waits for both T2, T3 T2 wounded right away! T1 (ts =15) T2 (ts =20) T3 (ts =10) wait(A) wait(A) wait(A)
Yet another option: T1 preempts T2, so T1 only waits for T3; T2 then waits for T3 and T1... T2 is spared! T1 (ts =15) T2 (ts =20) T3 (ts =10) wait(A) wait(A) wait(A)
Timestamping • A unique identifier created by DBMS that indicates relative starting time of a transaction. • Transactions with smaller timestamps, get priority in the event of conflict. • A larger timestamp value indicates a younger transaction. • Timestamp based algorithm uses timestamp to serialize the execution of concurrent transactions. • Conflict is resolved by rolling back and restarting transaction. • No locks so no deadlock.
Timestamping • In order to use timestamp values for serializable scheduling of transactions, the transaction manager of a DBMS associates with each database item X two timestamp (TS) values: • Read_TS(X): The timestamp (identifier) of the youngest transaction that has read X successfully. • Write_TS(X): The timestamp (identifier) of the youngest transaction that has written X successfully. Note: If Ti starts before Tj , then TS (Ti ) < TS (Tj) (*Ti is older than Tj)
Basic timestamp ordering • When a transaction tries to read a value • Is there a younger transaction that has already written the data item? • True => Abort (older) transaction • False => Continue and update write timestamp • When a transaction tries to write a data item • Is there a younger transaction that has already read or written the same data? • True => Abort the (older) transaction • False => Continue and update read timestamp
Example 1: • When a transaction tries to read a value • Is there a younger transaction that has already written the data item? • True => Abort (older) transaction • False => Continue No problem, T2 has only read x Transactions can both commit, no concurrency problem.
When a transaction tries to write a data item • Is there a younger transaction that has already read or written the same data? • True => Abort the (older) transaction • False => Continue Example 2: No problem, T2 has only read x Problem, T2 has written x T1 will be rolled back and started again with a new timestamp