740 likes | 758 Views
Learn about consistency and isolation properties in database transactions, scheduling, serializability, and conflict-serializability. Understand how to ensure data integrity in concurrent transaction processing.
E N D
Introduction The consistency property requiresthat a transaction transforms a consistent DB state into another consistent DB state. The isolation property requires that concurrent transactions are executed as if they were executed in isolation. More specifically, concurrent transactions are executed in a way that is equivalent to executing the same transactions serially in some order.
Introduction A schedule is a sequence of actions of one or more transactions. The actions that we consider in this chapter are read and write operations in the buffer (not on disk). Need to ensure that schedules are serializable. At the same time, want to execute as many transactions as possible at the same time in order to maximize the throughput of the system and to minimize the response time.
Introduction Example T1: Read(A, t) T2: Read(A,s) t t+100 s s2 Write(A,t) Write(A,s) Read(B,t) Read(B,s) t t+100 s s2 Write(B,t) Write(B,s) Constraint: A=B
Serial Schedules A schedule is serial, if actions of different transactions are not interleaved, otherwise it is non-serial. A serial schedule executes one transaction at a time. Serial schedules can be denoted by the sequence of their transactions: e.g., (T1,T2) or (T2,T1). For a serial schedule, isolation is trivially satisfied. But the throughput of the DBS is very low, and the response times are very high.
A B 25 25 125 125 250 250 250 250 Serial Schedules Schedule A (serial) T1 T2 Read(A,t); t t+100 Write(A,t); Read(B,t); t t+100; Write(B,t); Read(A,s); s s2; Write(A,s); Read(B,s); s s2; Write(B,s); Constraint: A=B
A B 25 25 50 50 150 150 150 150 Serial Schedules Schedule B (serial) T1 T2 Read(A,s); s s2; Write(A,s); Read(B,s); s s2; Write(B,s); Read(A,t); t t+100 Write(A,t); Read(B,t); t t+100; Write(B,t); resulting DB state different from schedule A but both results satisfy A = B
Serializable Schedules A schedule S is serializable, if there is a serial schedule S’ (of the same actions) such that - for every initial DB state, and- for every semantics of the transactions, the effects of S and S’ are the same. The order of transactions in the serial schedule is undefined (T1 before T2 or T2 before T1). A serializable schedule transforms a consistent DB state into another consistent DB state.
A B 25 25 125 250 125 250 250 250 Serializable Schedules Schedule C (non-serial) T1 T2 Read(A,t); t t+100 Write(A,t); Read(A,s); s s2; Write(A,s); Read(B,t); t t+100; Write(B,t); Read(B,s); s s2; Write(B,s); • schedule equivalent to serial schedule (T1,T2) • schedule is serializable
A B 25 25 125 250 50 150 250 150 Serializable Schedules Schedule D (non-serial) T1 T2 Read(A,t); t t+100 Write(A,t); Read(A,s); s s2; Write(A,s); Read(B,s); s s2; Write(B,s); Read(B,t); t t+100; Write(B,t); • resulting DB state inconsistent with A = B • schedule is not serializable
A B 25 25 125 125 25 125 125 125 Serializable Schedules Schedule E (non-serial) T1 T2’ Read(A,t); t t+100 Write(A,t); Read(A,s); s s1; Write(A,s); Read(B,s); s s1; Write(B,s); Read(B,t); t t+100; Write(B,t); • same as schedule D, but changed semantics of T2 • resulting DB state consistent with A = B
Serializable Schedules Semantics of a transaction: “function” to be computed, defined by the transaction code. In general, it is too hard to analyze the semantics of a transaction automatically. Therefore, the scheduler ignores the semantics of the transactions and considers only the sequence of read and write operations. We assume the worst case: if there is something that T can do to make the DB state inconsistent, then T will do that.
Serializable Schedules We adopt the following notations: rT(X): transaction T reads database element X, wT(X): transaction T writes database element X. We use r1(X) or w1(X) as shorthand for rT1(X) or wT1(X), resp. An action is of the form rT(X) or wT(X). A transaction Ti is a sequence of actions with subscript i.
Serializable Schedules A schedule S of a set of transactions Trans is a sequence of actions that contains all actions of all transactions T in Trans in the same order in which they appear in the definition of T. ExampleT1=r1(A) w1(A) r1(B) w1(B)T2=r2(A) w2(A) r2(B) w2(B) S = r1(A) w1(A) r2(A) w2(A) r1(B) w1(B) r2(B) w2(B)
Conflict-Serializability Conflict-serializability is stronger than serializability, but easier to enforce. Most commercial DBMS enforce conflict-serializability. It is based on the notion of a conflict. A pair of consecutive actions in a schedule constitutes a conflict if swapping these actions may change the effect of at least one of the transactions involved.
Conflict-Serializability Most pairs of actions do not cause a conflict. ri(X) and rj(Y) never cause a conflict, even if X = Y, since they do not modify the DB state. ri(X) and wj(Y) do not cause a conflict if . wi(X) and rj(Y) do not cause a conflict if . wi(X) and wj(Y) do not cause a conflict if .
Conflict-Serializability The following three situations do cause a conflict: Actions of the same transaction, i.e. i = j. Two writes of the same database element by different transactions, i.e. wi(X) and wj(X), .Depending on the schedule, the results of either wi(X) or wj(X) survive, which may be different. A read and a write of the same database element by different transactions, i.e. ri(X) and wj(X), . ri(X) may read a different version of X.
Conflict-Serializability Any two actions of different transactions may be swapped, unless they involve the same database element and at least one of them is a write. If there is a sequence of non-conflicting swaps that transforms schedule S into a serial schedule S’, then S is serializable. Schedules S1, S2 are conflict equivalent, if S1 can be transformed into S2 by a series of swaps on non-conflicting actions.
Conflict-Serializability A schedule is conflict serializable if it is conflict equivalent to some serial schedule. ExampleS=r1(A)w1(A)r2(A)w2(A)r1(B)w1(B)r2(B)w2(B) is conflict equivalent to the serial scheduleS’=r1(A) w1(A) r1(B)w1(B) r2(A) w2(A) r2(B) w2(B) operations on critical DB elements are always first performed by T1, then by T2
Conflict-Serializability If transactions Ti and Tj contain at least two pairs of conflicting actions, then for each of these pairs the action of Ti has to be performed before that of Tj (or always Tj before Ti). Given a schedule S, Ti takes precendence over Tj, denoted by Ti <S Tj, if there are actions Ai of Ti and Aj of Tj such that- Ai is ahead of Aj in S,- both Ai and Aj involve the same database element, and at least one of them is a write.
Conflict-Serializability If Ti takes precendence over Tj, then a schedule S’ that is conflict equivalent to S must have Ai before Aj. Precedence graph: directed graph with nodesrepresenting the transactions of S, i.e. node label i for transaction Ti,edges representing precedence relationships, i.e. edge from node i to j if Ti <S Tj. Notation: P(S)
Conflict-Serializability ExampleS = w3(A) w2(C) r1(A) w1(B) r1(C) w2(A) r4(A) w4(D) P(S) 3 1 2 4 based on A based on C
Conflict-Serializability Lemma 1 S1, S2 conflict equivalent P(S1) = P(S2) ProofAssume P(S1) P(S2) Ti, Tj: Ti Tj in P(S1) and not in P(S2) S1 = …pi(A)... qj(A)… pi, qj S2 = …qj(A)…pi(A)... in conflict S1, S2 not conflict equivalent
Conflict-Serializability NoteP(S1)=P(S2) S1, S2 conflict equivalent Counter exampleS1=w1(A) r2(A) w2(B) r1(B) S2=r2(A) w1(A) r1(B) w2(B) P(S1)=P(S2)= 1 2 S1 not conflict equivalent to S2, since w1(A) and r2(A) cannot be swapped
Conflict-Serializability Theorem 2P(S) acyclic S conflict serializable Proof (i) Assume S is conflict serializable. S’: S’ is serial, S conflict equivalent to S’. P(S’) = P(S) according to Lemma 1. P(S’) is acyclic because S’ is serial. P(S) is acyclic.
Conflict-Serializability Proof (ii) Assume P(S) is acyclic. Transform S as follows: (1) Take T1 to be transaction with no incoming edges. T1 exists, since P(S) is acyclic. (2) Move all T1 actions to the front: S = ……. qj(A)…….p1(A)….. This does not create any conflicts, since there is no Tj with Tj T1. (3) We now have S’ = < T1 actions ><... rest ...>. (4) Repeat above steps to serialize rest. T1 T2 T3 T4 P(S)
Conflict-Serializability How to enforce that only conflict-serializableschedules are executed? There are two alternative approaches:- pessimistic concurrency control Lock data elements to prevent P(S) cycles from occurring.- optimistic concurrency control Detect P(S) cycles and undo participating trans- actions, if necessary.
Enforcing Serializability by Locks Before accessing a database element, a transaction requests a lock on that element in order to prevent other transactions from accessing the same database element at the “same” time. Typically, different types of locks are used for different types of access operations, but we first introduce a simplified lock protocol with only one type of lock.
Enforcing Serializability by Locks We introduce two new actions: li (X): lock database element X ui (X): unlock database element X, i.e. release lock. A locking protocol must guarantee the consistency of transactions: - A transaction can only read or write database X element if it currently holds a lock on X.- A transaction must unlock all database elements that is has locked at some later time. A consistent transaction is also called well-formed.
Enforcing Serializability by Locks A locking protocol must also guarantee the legality of schedules: At most one transaction can hold a lock on database element X at a given point of time. If there are actions li (X) followed by lj (X) in some schedule, then there must be an action ui(X) somewhere between these two actions.
Enforcing Serializability by Locks • Example • S1 = l1(A)l1(B)r1(A)w1(B)l2(B)u1(A)u1(B) r2(B)w2(B)u2(B)l3(B)r3(B)u3(B) S1 illegal, because T2 locks B before T1 has unlocked it • S2 = l1(A)r1(A)w1(B)u1(A)u1(B) l2(B)r2(B)w2(B)l3(B)r3(B)u3(B) T1 inconsistent, because T1 writes B before locking it • S3 = l1(A)r1(A)u1(A)l1(B)w1(B)u1(B) l2(B)r2(B)w2(B)u2(B)l3(B)r3(B)u3(B) schedule legal and all transactions consistent
Enforcing Serializability by Locks Schedule F Schedule F is legal, but not serializable. A B T1 T2 25 25 l1(A);Read(A) A A+100;Write(A);u1(A) 125 l2(A);Read(A) A Ax2;Write(A);u2(A) 250 l2(B);Read(B) B Bx2;Write(B);u2(B) 50 l1(B);Read(B) B B+100;Write(B);u1(B) 150 250 150
Two-Phase Locking A legal schedule of consistent transactions is not necessarily conflict-serializable. However, a legal schedule with the following locking protocol is conflict-serializable. Two-phase locking (2PL)In every transaction, all lock actions precede all unlock actions. Growing phase: acquire locks, no unlocks. Shrink phase: release locks, no locks.
# locks held by Ti time Growing Shrinking Phase Phase Two-Phase Locking Example
Two-Phase Locking Schedule G T1 T2 l1(A);Read(A) A A+100;Write(A) l1(B); u1(A) l2(A);Read(A) A Ax2;Write(A);l2(B) Read(B);B B+100 Write(B); u1(B) l2(B); u2(A);Read(B) B Bx2;Write(B);u2(B); Schedule G is serializable. delayed changed order!
Two-Phase Locking In 2PL, each transaction may be thought of as executing all of its actions when issuing the first unlock action. Thus, the order according to the first unlock action defines a conflict-equivalent serial schedule. Theorem 3(1) legality of schedule, and (2) consistency of transactions and (3) 2PL conflict-serializability.
Two-Phase Locking Lemma 4 Ti Tj in S SH(Ti) <S SH(Tj) where Shrink(Ti) = SH(Ti) = first unlock action of Ti Proof Ti Tj means that S = … pi(A) … qj(A) … and pi,qj conflict According to (1), (2): S = … pi(A) … ui(A) … lj(A) ... qj(A) … According to (3): Therefore, SH(Ti) <S SH(Tj). SH(Ti) SH(Tj)
Two-Phase Locking Proof of theorem 3 Given a schedule S. Assume P(S) has cycle T1 T2 …. Tn T1 By lemma 4: SH(T1) < SH(T2) < ... < SH(T1). Contradiction, so P(S) acyclic. By theorem 2, S is conflict serializable. 2PL allows only serializable schedules.
Two-Phase Locking Not all serializable schedules are allowed by 2PL. Example S1: w1(x) w3(x) w2(y) w1(y) The lock by T1 for y must occur after w2(y), so the unlock by T1 for x must also occur after w2(y)(according to 2PL). Because of the schedule legality, w3(x) cannot occur where shown in S1 because T1 holds the x lock at that point. However, S1 serializable (equivalent to T2, T1, T3).
Two-Phase Locking Deadlocks may happen under 2PL, when two or more transactions have got a lock and are waiting for another lock currently held by one of the other transactions. Example (T2 reversed) T1: Read(A, t) T2: Read(B,s) t t+100 s s2 Write(A,t) Write(B,s) Read(B,t) Read(A,s) t t+100 s s2 Write(B,t) Write(A,s)
Two-Phase Locking Possible schedule Deadlock cannot be avoided, but can be detected(cycle in wait graph). At least one of the participating transactions needs to be aborted by the DBMS. T1 T2 l1(A); Read(A) l2(B);Read(B) A A+100;Write(A) B Bx2;Write(B) l1(B) l2(A) delayed, wait for T1 delayed, wait for T2
Two-Phase Locking So far, we have introduced the simplest possible 2PL protocol and showed that it works. There are many approaches for improving its performance, i.e. allowing a higher degree of concurrency: - shared locks,- increment locks,- multiple granularity locks,- tree-based locks.
Shared and Exclusive Locks • In principle, several transactions can read database element A at the same time, as long as none is allowed to write A. • In order to enable more concurrency, we distinguish two different types of locks: • shared (S) lock: there can be multiple shared locks on X, permission only to read A. • exclusive (X) lock: there can be only one exclusive lock on A, permission to read and write A.
Shared and Exclusive Locks • We introduce the following lock actions for database element A and transaction i: • sl-i(A): lock A in S mode xl-i(A): lock A in X mode • u-i(A): unlock whatever modes Ti has locked A • Modify consistency of transactions as follows: • A read action ri(A) must be preceded by sl-i(A) or xl-i(A) with no intervening ui(A). • A write action ri(A) must be preceded by xl-i(A) with no intervening ui(A).
Shared and Exclusive Locks Typically, a transaction does not know its needs for locks in advance. What if transaction Ti reads and writes the same database element A? Ti will request both shared and exclusive locks on A at different times. Example Ti=... sl-1(A) … r1(A) ... xl-1(A) …w1(A) ...u(A)… If Ti knows lock needs, request X lock right away.
Shared and Exclusive Locks • Modify legality of schedules as follows: • If xl-i(A) appears in a schedule, then there cannot follow an xl-j(A) or sl-j(A),without an intervening ui(A). • If sl-i(A) appears in a schedule, then an xl-j(A) cannot followwithout an intervening ui(A). • All other consistency and legality as well as the 2PL requirements remain unchanged. • The proof of Theorem 3 still works.
Shared and Exclusive Locks A compatibility matrix is a convenient way to specify a locking protocol. Rows correspond to lock already held by another transaction, columns correspond to a lock being requested by current transaction. Lock requested S X Lock held S Yes No in mode X No No
Shared and Exclusive Locks If a transaction first reads A and later writes A, it has to upgrade its S lock to an X lock. Upgrading is a frequent source of deadlocks. T1 T2 sl-1(A) sl-2(A) r1(A) r2(A) xl-1(A) xl-2(A) w1(A)
Update Locks In order to avoid such deadlocks (as far as possible), we introduce another type of lock. An update lock ul-i(A) gives transaction i the privilege to - read database element A and to- upgrade its lock on A to an X lock. An update lock is not shared. Read locks cannot be upgraded.
Update Locks Compatibility matrix Lock requested S X U Lock held S Yes No Yes in mode X No No No U No No No Example T1 T2 ul-1(A) ul-2(A) r1(A) xl-1(A) w1(A) U is not symmetric!