Formal approaches to transaction recovery in database systems - an overview
DSL, 7 November 2006
Contents
Part 1: Concurrency control
• Page model
• Histories and schedules
• Serializability of schedules
• Herbrand semantics of schedules
• Final state serializability
• View serializability
• Conflict serializability
Contents
Part 2: Transaction recovery
• Expanded histories and schedules
• Expanded conflict serializability
• Reducibility and prefix reducibility
• Recoverability
• Avoiding cascading aborts
• Strictness
• Rigorousness
• Log recoverability
Motivating example
Initially x = y = $50, so x + y = $100.
T1: read(x)              x = $50
T1: write(x := x - 10)   x = $40
T2: read(x)
T2: read(y)
T2: print(x + y)
T1: read(y)              y = $50
T1: write(y := y + 10)   y = $60
T2 prints $90, although x + y = $100 holds both before and after T1.
Page model, actions, transactions
A database server contains a finite set D = {x, y, z, …} of data items, also called pages.
The set of actions performed by database applications includes read r(x), write w(x), commit c and abort a.
A transaction t is a finite sequence of actions of the form r(x) or w(x), written as t = p1, …, pn, where pi ∈ {r(x), w(x)} for i = 1, …, n and x ∈ D.
Example: t = r(x) r(y) w(z) r(u) w(x)
Histories and schedules
Let T = {t1, …, tn} be a set of transactions, where each ti has the form ti = (opi, <i), opi is the set of operations of ti, and <i denotes their ordering.
A history for T is a pair s = (op(s), <s) such that:
(1) op(s) consists of the union of the operations of the given transactions plus the termination operations c and a
(2) each transaction ends with either commit (c) or abort (a)
(3) all transaction orders <i are contained in the order <s
(4) the commit or abort operation always appears as the last step of a transaction
(5) every pair of operations p, q ∈ op(s) from distinct transactions that access the same data item, at least one of which is a write, is ordered in s, so that either p <s q or q <s p
Example: r1(x) r2(x) w1(x) w2(y) r3(x) w3(y) c3 w1(z) w1(y) c1 c2
A schedule is a prefix of a history.
Serializability of schedules
A history s is serial if, for any two transactions ti and tj in it, all operations of ti are ordered in s before all operations of tj, or vice versa.
We define an equivalence relation on the set S of all schedules. The equivalence relation decomposes S into equivalence classes: [S] = {[s] | s ∈ S}
All schedules in one such class are pairwise equivalent, hence any schedule in the class can be chosen as its representative.
The elements of a class for which a serial schedule can be chosen as the representative are called serializable.
Herbrand semantics of schedules
Let s be a schedule. The Herbrand semantics Hs of the steps ri(x), wi(x) ∈ op(s) is recursively defined as follows:
(1) Hs(ri(x)) := Hs(wj(x)), where wj(x) is the last write operation on x in s before ri(x)
(2) Hs(wi(x)) := fix(Hs(ri(y1)), …, Hs(ri(ym))), where ri(yj), j = 1, …, m, are all the read operations of transaction ti that occur in s before wi(x), and fix (the symbol f indexed by transaction i and data item x) is an uninterpreted m-ary function symbol
Example: s = w0(x) w0(y) c0 r1(x) r2(x) w2(x) c2 c1
Hs(w0(x)) = f0x( )
Hs(w0(y)) = f0y( )
Hs(r1(x)) = Hs(w0(x)) = f0x( )
Hs(r2(x)) = Hs(w0(x)) = f0x( )
Hs(w2(x)) = f2x(Hs(r2(x))) = f2x(f0x( ))
Herbrand universe
Let D = {x, y, z, …} be a set of data items and let op(t) denote the set of all steps of transaction t. The Herbrand universe HU for transactions ti, i = 1, …, n, is the smallest set of symbols satisfying the following conditions:
(1) f0x( ) ∈ HU for each x ∈ D, where f0x is a 0-ary function symbol (a constant)
(2) if wi(x) ∈ op(ti), |{ri(y) : y ∈ D and ri(y) <ti wi(x)}| = m, and v1, …, vm ∈ HU, then fix(v1, …, vm) ∈ HU, where fix is an m-ary function symbol
Schedule semantics
The semantics of a schedule s is the mapping H[s]: D → HU defined by H[s](x) := Hs(wi(x)), where wi(x) is the last operation in s writing x, for each x ∈ D.
Example: s = w0(x) w0(y) c0 r1(x) r2(y) w2(x) w1(y) c2 c1
H[s](x) = f2x(f0y( ))
H[s](y) = f1y(f0x( ))
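The recursive definition of Hs and H[s] can be traced mechanically. The following is a minimal sketch, not part of the original slides: the function names `parse` and `herbrand` are hypothetical, schedules are assumed to be written in the slide notation (`r1(x)`, `w2(y)`, `c1`), aborts are ignored, and an item that has not been written yet is assumed to carry the initial value f0x( ).

```python
import re

def parse(schedule):
    """Turn 'r1(x) w2(y) c1' into [('r', 1, 'x'), ('w', 2, 'y'), ('c', 1, '')]."""
    return [(a, int(t), x) for a, t, x in
            re.findall(r"([rwca])(\d+)(?:\((\w+)\))?", schedule)]

def herbrand(schedule):
    """Return H[s] as a dict: data item -> Herbrand term of its last write."""
    last_write = {}   # item -> term produced by the latest write on it
    reads = {}        # transaction -> terms it has read so far, in order
    for action, txn, item in parse(schedule):
        if action == "r":
            # Rule (1): a read yields the term of the last preceding write
            # (assumption: the initial value f0x() if nobody wrote yet).
            term = last_write.get(item, f"f0{item}()")
            reads.setdefault(txn, []).append(term)
        elif action == "w":
            # Rule (2): a write applies f_ix to everything t_i has read so far.
            args = ", ".join(reads.get(txn, []))
            last_write[item] = f"f{txn}{item}({args})"
    return last_write

print(herbrand("w0(x) w0(y) c0 r1(x) r2(y) w2(x) w1(y) c2 c1"))
```

For the schedule of the example this reproduces the two terms given on the slide, H[s](x) = f2x(f0y( )) and H[s](y) = f1y(f0x( )).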
Final state equivalence
Let s and s' be schedules. s and s' are called final state equivalent (s ≈f s') if op(s) = op(s') and H[s] = H[s'].
Example: s = r1(x) r2(y) w1(y) r3(z) w3(z) r2(x) w2(z) w1(x)
s' = r3(z) w3(z) r2(y) r2(x) w2(z) r1(x) w1(y) w1(x)
H[s](x) = Hs(w1(x)) = f1x(f0x( )) = Hs'(w1(x)) = H[s'](x)
H[s](y) = Hs(w1(y)) = f1y(f0x( )) = Hs'(w1(y)) = H[s'](y)
H[s](z) = Hs(w2(z)) = f2z(f0x( ), f0y( )) = Hs'(w2(z)) = H[s'](z)
Final state equivalence
Example: s = r1(x) r2(y) w1(y) w2(y) c1 c2
s' = r1(x) w1(y) r2(y) w2(y) c1 c2
H[s](y) = Hs(w2(y)) = f2y(f0y( ))
H[s'](y) = Hs'(w2(y)) = f2y(Hs'(r2(y))) = f2y(Hs'(w1(y))) = f2y(f1y(Hs'(r1(x)))) = f2y(f1y(f0x( )))
Since H[s](y) ≠ H[s'](y), s and s' are not final state equivalent.
Final state serializability
A history s is final state serializable if there exists a serial history s' such that s ≈f s'.
Example: s = r1(x) r2(y) w1(y) r3(z) w3(z) r2(x) w2(z) w1(x) c1 c2 c3
s ≈f t3 t2 t1
Final state equivalence of two schedules s and s' can be decided in time polynomial in the length of the two schedules.
Final state serializability of a schedule s can be decided by testing all n! serial orders of the n transactions involved in s, which takes time exponential in n.
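The n! test mentioned above can be sketched as a brute-force search over serial orders. This is an illustrative sketch only: the helper names are hypothetical, only read and write steps are compared, aborts are not handled, and unwritten items are assumed to hold the initial value f0x( ).

```python
import re
from itertools import permutations

def parse(schedule):
    """Parse slide notation such as 'r1(x) w2(y) c1' into (action, txn, item) triples."""
    return [(a, int(t), x) for a, t, x in
            re.findall(r"([rwca])(\d+)(?:\((\w+)\))?", schedule)]

def final_state(ops):
    """H[s] as a dict: item -> Herbrand term of the last write on it."""
    last_write, reads = {}, {}
    for action, txn, item in ops:
        if action == "r":
            reads.setdefault(txn, []).append(last_write.get(item, f"f0{item}()"))
        elif action == "w":
            last_write[item] = f"f{txn}{item}({', '.join(reads.get(txn, []))})"
    return last_write

def final_state_serializable(schedule):
    """Return the first serial order with the same final state, or None."""
    ops = [op for op in parse(schedule) if op[0] in "rw"]
    txns = sorted({t for _, t, _ in ops})
    target = final_state(ops)
    for order in permutations(txns):          # n! candidate serial histories
        serial = [op for t in order for op in ops if op[1] == t]
        if final_state(serial) == target:
            return order
    return None

s = "r1(x) r2(y) w1(y) r3(z) w3(z) r2(x) w2(z) w1(x) c1 c2 c3"
print(final_state_serializable(s))   # (3, 2, 1), i.e. t3 t2 t1
```

Each candidate is checked in polynomial time, so the cost is dominated by the number of permutations.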
Final state serializability
Final state serializability is still insufficient as a correctness criterion!
Example: s = r2(x) w2(x) r1(x) r1(y) r2(y) w2(y) c2 c1
s ≈f t2 t1
Schedule s is nevertheless incorrect, as it exhibits the inconsistent read anomaly.
View equivalence
Let s and s' be schedules. s and s' are called view equivalent (s ≈v s') if:
(1) op(s) = op(s')
(2) H[s] = H[s']
(3) Hs(p) = Hs'(p) for all read or write steps p
Example: s = r2(x) w2(x) r1(x) r1(y) r2(y) w2(y) c2 c1
Hs(r1(y)) = f0y( )
s' = t2 t1 = r2(x) w2(x) r2(y) w2(y) r1(x) r1(y) c2 c1
Hs'(r1(y)) = Hs'(w2(y)) = f2y(Hs'(r2(x)), Hs'(r2(y))) = f2y(f0x( ), f0y( ))
Since Hs(r1(y)) ≠ Hs'(r1(y)), s is not view equivalent to t2 t1.
View serializability
A history s is view serializable if there exists a serial history s' such that s ≈v s'.
Example: s = w1(x) r2(x) r2(y) w1(y) c2 c1
History s is final state equivalent to both possible serial orders:
t1 t2 = w1(x) w1(y) c1 r2(x) r2(y) c2
t2 t1 = r2(x) r2(y) c2 w1(x) w1(y) c1
History s is not view serializable, because both serial orders differ from s in the reads-from relation.
The problem of deciding whether a given history s is view serializable is NP-complete.
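Condition (3) of view equivalence can be checked by comparing the Herbrand term of every step, not just the final writes. The sketch below is a hypothetical brute-force test that tries every serial order; it assumes that each transaction reads or writes a given item at most once (so a step is identified by its action, transaction and item), ignores aborts, and is exponential in the number of transactions, in line with the NP-completeness result.

```python
import re
from itertools import permutations

def parse(schedule):
    """Parse slide notation such as 'r1(x) w2(y) c1' into (action, txn, item) triples."""
    return [(a, int(t), x) for a, t, x in
            re.findall(r"([rwca])(\d+)(?:\((\w+)\))?", schedule)]

def step_semantics(ops):
    """Return (Hs for every read/write step, H[s]) for a list of r/w steps."""
    last_write, reads, sem = {}, {}, {}
    for action, txn, item in ops:
        if action == "r":
            term = last_write.get(item, f"f0{item}()")   # assumed initial value
            reads.setdefault(txn, []).append(term)
        else:
            term = f"f{txn}{item}({', '.join(reads.get(txn, []))})"
            last_write[item] = term
        sem[(action, txn, item)] = term
    return sem, last_write

def view_serializable(schedule):
    """Return a serial order that is view equivalent to the schedule, or None."""
    ops = [op for op in parse(schedule) if op[0] in "rw"]
    target = step_semantics(ops)
    for order in permutations(sorted({t for _, t, _ in ops})):
        serial = [op for t in order for op in ops if op[1] == t]
        if step_semantics(serial) == target:
            return order
    return None

print(view_serializable("w1(x) r2(x) r2(y) w1(y) c2 c1"))   # None
```

For the example history no serial order matches, because r2(x) reads from t1 while r2(y) reads the initial value.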
Conflict equivalence
Let s be a schedule and t, t' ∈ trans(s). The operations p ∈ t and q ∈ t' are in conflict if they access the same data item and at least one of them is a write operation.
The conflict relation of schedule s is defined as conf(s) = {(p, q) : p and q are in conflict and p <s q}
Example: s = w1(x) r2(x) w2(y) r1(y) w1(y) c2 c1
conf(s) = {(w1(x), r2(x)), (w2(y), r1(y)), (w2(y), w1(y))}
Let s and s' be two schedules. s and s' are called conflict equivalent (s ≈c s') if:
(1) op(s) = op(s')
(2) conf(s) = conf(s')
Conflict serializability
A history s is conflict serializable if there exists a serial history s' such that s ≈c s'.
Example: s = w1(x) w2(x) w2(y) w1(y) w3(x) w3(y) c2 c3 c1
History s is view serializable: s ≈v t1 t2 t3
History s is not conflict serializable, because w1(x) <s w2(x) and w2(y) <s w1(y) order t1 and t2 in opposite ways.
[Figure with nodes t1, t2, t3]
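Conflict serializability is usually tested with a conflict graph: one node per transaction and an edge ti → tj for every pair (p, q) ∈ conf(s) with p ∈ ti and q ∈ tj. The standard result that s is conflict serializable exactly when this graph is acyclic is not stated on the slides, so the sketch below should be read as an illustration under that assumption. Function names are hypothetical and all transactions are assumed to commit.

```python
import re

def parse(schedule):
    """Parse slide notation such as 'r1(x) w2(y) c1' into (action, txn, item) triples."""
    return [(a, int(t), x) for a, t, x in
            re.findall(r"([rwca])(\d+)(?:\((\w+)\))?", schedule)]

def conflict_graph(schedule):
    """Edges (ti, tj) for each conflicting pair p <s q with p in ti, q in tj."""
    ops = [op for op in parse(schedule) if op[0] in "rw"]
    edges = set()
    for i, (a1, t1, x1) in enumerate(ops):
        for a2, t2, x2 in ops[i + 1:]:
            # Conflict: different transactions, same item, at least one write.
            if t1 != t2 and x1 == x2 and "w" in (a1, a2):
                edges.add((t1, t2))
    return edges

def is_conflict_serializable(schedule):
    """True if the conflict graph has no cycle (checked by peeling off sources)."""
    edges = conflict_graph(schedule)
    nodes = {t for edge in edges for t in edge}
    while nodes:
        sources = {n for n in nodes
                   if not any(v == n and u in nodes for u, v in edges)}
        if not sources:        # every remaining node has a predecessor: cycle
            return False
        nodes -= sources
    return True

print(is_conflict_serializable("w1(x) w2(x) w2(y) w1(y) w3(x) w3(y) c2 c3 c1"))  # False
```

The graph of the example history contains both t1 → t2 (from the writes on x) and t2 → t1 (from the writes on y), hence the cycle.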
Motivating example
T1: read(x)
T1: write(x)
T2: read(x)
T1: abort
T2: write(x)
s = r1(x) w1(x) r2(x) a1 r2(x) w2(x) c2
Expanded history
T1: read(x)
T1: write(x)
T2: read(x)
T1: write⁻¹(x)
T2: write(x)
s = r1(x) w1(x) r2(x) w1⁻¹(x) r2(x) w2(x) c2 c1
Expansion of a schedule
Let s be a schedule. The expansion of s, denoted exp(s), is defined as follows:
(1) Steps of exp(s):
(a) ti ∈ commit(s) ⇒ op(ti) ⊆ op(exp(s))
(b) ti ∈ abort(s) ⇒ (op(ti) − {ai}) ∪ {ci} ∪ {wi⁻¹(x) : wi(x) ∈ ti} ⊆ op(exp(s))
(c) ti ∈ active(s) ⇒ op(ti) ∪ {ci} ∪ {wi⁻¹(x) : wi(x) ∈ ti} ⊆ op(exp(s))
Expansion of a schedule
(2) Step ordering in exp(s):
(a) all steps from op(s) ∩ op(exp(s)) occur in exp(s) in the same order as in s
(b) all inverse steps of an aborted transaction occur in exp(s) after their original steps and before the respective commit operation
(c) all inverse steps of transactions in active(s) occur in exp(s) after the original steps of s and before their corresponding commits
(d) the ordering of inverse steps is the reverse of the ordering of the corresponding original steps
Expansion of a schedule
Example: s = w1(x) w2(x) w2(y) w1(y)
exp(s) = w1(x) w2(x) w2(y) w1(y) w1⁻¹(y) w2⁻¹(y) w2⁻¹(x) w1⁻¹(x) c2 c1
Example: s = r1(x) w1(x) r2(y) w2(y) r3(z) w3(z) r4(y) w4(y) a1 c3 r2(z) w2(z)
exp(s) = r1(x) w1(x) r2(y) w2(y) r3(z) w3(z) r4(y) w4(y) w1⁻¹(x) c1 c3 r2(z) w2(z) w2⁻¹(z) w4⁻¹(y) w2⁻¹(y) c2 c4
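The construction of exp(s) can be sketched in a few lines. This is an illustrative sketch with a hypothetical function name `expand`; inverse writes are rendered in ASCII as `w1^-1(x)`, and, since the definition does not fix the relative order of the commits appended for active transactions, they are emitted in ascending transaction number as an assumption.

```python
import re

def expand(schedule):
    """Build exp(s): replace aborts by inverse writes plus a commit, and
    terminate active transactions the same way at the end of the schedule."""
    ops = re.findall(r"([rwca])(\d+)(?:\((\w+)\))?", schedule)
    out, writes, finished = [], [], set()
    for action, txn, item in ops:
        if action == "a":
            # Undo the aborted transaction's writes in reverse order, then commit it.
            out += [f"w{t}^-1({x})" for t, x in reversed(writes) if t == txn]
            out.append(f"c{txn}")
            finished.add(txn)
        elif action == "c":
            out.append(f"c{txn}")
            finished.add(txn)
        else:
            out.append(f"{action}{txn}({item})")
            if action == "w":
                writes.append((txn, item))
    # Active transactions: undo all their writes in reverse order of the originals.
    out += [f"w{t}^-1({x})" for t, x in reversed(writes) if t not in finished]
    # Assumption: commits of active transactions in ascending transaction number.
    active = sorted({t for _, t, _ in ops} - finished, key=int)
    out += [f"c{t}" for t in active]
    return " ".join(out)

print(expand("r1(x) w1(x) r2(y) w2(y) r3(z) w3(z) r4(y) w4(y) a1 c3 r2(z) w2(z)"))
```

For the second example this yields the expansion shown on the slide. For the first example it produces the same inverse steps but ends with c1 c2 instead of c2 c1, which is an equally valid ordering of the two final commits.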
Expanded Conflict Serializability
Let s be a schedule. Then s is expanded conflict serializable if its expansion exp(s) is conflict serializable.
Example: s = r1(x) w1(x) r2(x) a1 c2
exp(s) = r1(x) w1(x) r2(x) w1⁻¹(x) c1 c2
Schedule s is not expanded conflict serializable due to the cyclic conflict w1(x) < r2(x) < w1⁻¹(x).
Example: s' = r1(x) w1(x) a1 r2(x) c2
exp(s') = r1(x) w1(x) w1⁻¹(x) c1 r2(x) c2
Schedule s' is expanded conflict serializable; it is equivalent to the serial execution t1 t2.
Expanded Conflict Serializability
Example: s = w1(x) w2(x) a2 a1
exp(s) = w1(x) w2(x) w2⁻¹(x) c2 w1⁻¹(x) c1
Schedule s is not expanded conflict serializable due to the cyclic conflict w1(x) < w2(x) < w2⁻¹(x) < w1⁻¹(x).
However, the cycle is irrelevant, as both transactions are aborted and all their effects cancel out:
exp(s) = w1(x) w2(x) w2⁻¹(x) c2 w1⁻¹(x) c1
→ w1(x) c2 w1⁻¹(x) c1
→ w1(x) w1⁻¹(x) c2 c1
→ c2 c1
Reducibility
A schedule s is reducible if its expansion exp(s) can be transformed into a serial history by finitely many applications of the following rules:
(1) Commutativity rule: if p, q ∈ op(exp(s)) such that p < q, p and q do not conflict, and there exists no step o ∈ op(exp(s)) such that p < o < q, then the order of p and q can be reversed
(2) Undo rule: if p, q ∈ op(exp(s)) are inverses of each other and there exists no step o ∈ op(exp(s)) such that p < o < q, then the pair p, q of steps can be removed from exp(s)
(3) Null rule: if p ∈ op(exp(s)) has the form p = ri(x) such that ti ∈ active(s) ∪ abort(s), then p can be removed from exp(s)
(4) Ordering rule: two commutative, unordered operations can be arbitrarily ordered
Reducibility
Example: s = r1(x) w1(x) r2(x) w2(x) a2 a1
exp(s) = r1(x) w1(x) r2(x) w2(x) w2⁻¹(x) c2 w1⁻¹(x) c1
→ r1(x) w1(x) r2(x) c2 w1⁻¹(x) c1 (undo rule)
→ r1(x) w1(x) c2 w1⁻¹(x) c1 (null rule)
→ r1(x) w1(x) w1⁻¹(x) c2 c1 (commutativity rule)
→ r1(x) c2 c1 (undo rule)
→ c2 c1 (null rule)
Reducibility
Example: s = w1(x) w2(x) c2 a1
exp(s) = w1(x) w2(x) c2 w1⁻¹(x) c1
Schedule s is not expanded conflict serializable and it is not reducible.
Prefix reducibility
A schedule s is prefix reducible if each of its prefixes is reducible.
Every expanded conflict serializable schedule is a reducible schedule.
Expanded conflict serializable schedules and prefix reducible schedules are incomparable with respect to set inclusion.
Example: s = w1(x) w2(x) c2 c1 is expanded conflict serializable but not prefix reducible; s = w1(x) w2(x) a2 a1 is not expanded conflict serializable but is prefix reducible.
Recoverability
A schedule s is recoverable if, for any two transactions ti, tj: if ti reads from tj in s and ci ∈ op(s), then cj <s ci.
Example: s = w1(x) w1(y) r2(u) w2(x) r2(y) w2(y) w3(u) c3 c2 w1(z) c1
The schedule above is not recoverable, because transaction t2 reads y from transaction t1 and t2 commits before t1.
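The recoverability condition can be checked by first deriving the reads-from relation and then comparing commit positions. The sketch below uses hypothetical helper names and one simplifying assumption: a read is taken to read from the most recent writer of the item that has not aborted before the read.

```python
import re

def parse(schedule):
    """Parse slide notation such as 'r1(x) w2(y) c1' into (action, txn, item) triples."""
    return [(a, int(t), x) for a, t, x in
            re.findall(r"([rwca])(\d+)(?:\((\w+)\))?", schedule)]

def reads_from(ops):
    """List of (reader, writer) pairs: reader reads an item last written by writer."""
    writers = {}   # item -> stack of transactions whose writes are still in effect
    pairs = []
    for action, txn, item in ops:
        if action == "w":
            writers.setdefault(item, []).append(txn)
        elif action == "a":
            # Assumption: an abort removes the transaction's writes from view.
            for stack in writers.values():
                stack[:] = [t for t in stack if t != txn]
        elif action == "r" and writers.get(item) and writers[item][-1] != txn:
            pairs.append((txn, writers[item][-1]))
    return pairs

def is_recoverable(schedule):
    """True if every committed reader commits after the transaction it read from."""
    ops = parse(schedule)
    commit_pos = {t: i for i, (a, t, _) in enumerate(ops) if a == "c"}
    for reader, writer in reads_from(ops):
        if reader in commit_pos:
            if writer not in commit_pos or commit_pos[writer] > commit_pos[reader]:
                return False
    return True

print(is_recoverable("w1(x) w1(y) r2(u) w2(x) r2(y) w2(y) w3(u) c3 c2 w1(z) c1"))  # False
```

Moving c2 behind c1 in the example makes the same schedule recoverable.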
Avoiding cascading aborts
A schedule s avoids cascading aborts if, whenever a transaction ti reads x from tj, then cj <s ri(x).
Example: s = w1(x) w1(y) r2(u) w2(x) r2(y) w2(y) w3(u) c3 c2 w1(z) c1
The schedule above does not avoid cascading aborts, because transaction t2 reads y from transaction t1 and t1 commits after r2(y).
Strictness
A schedule s is strict if, for all transactions ti involved in the schedule and for all pi(x) ∈ op(s), p ∈ {r, w}: if wj(x) <s pi(x), i ≠ j, then aj <s pi(x) or cj <s pi(x).
Example: s = w1(x) w1(y) r2(u) w2(x) w1(z) c1 r2(y) w2(y) w3(u) c3 c2
The schedule above is not strict, because transaction t2 overwrites x before transaction t1, which wrote it first, commits.
Rigorousness
A schedule s is rigorous if it is strict and, for all transactions ti, tj: if rj(x) <s wi(x), then aj <s wi(x) or cj <s wi(x).
Example: s = w1(x) w1(y) r2(u) w1(z) c1 w2(x) r2(y) w2(y) w3(u) c3 c2
The schedule above is not rigorous, because transaction t2 reads data item u and u is overwritten by transaction t3 before t2 commits.
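The three conditions for avoiding cascading aborts, strictness and rigorousness can all be checked in a single pass over a schedule. The following sketch is illustrative only: the function name `violations` and the labels "ACA", "ST" and "RG" are hypothetical, and a read is assumed to read from the most recent writer that has not aborted.

```python
import re

def parse(schedule):
    """Parse slide notation such as 'r1(x) w2(y) c1' into (action, txn, item) triples."""
    return [(a, int(t), x) for a, t, x in
            re.findall(r"([rwca])(\d+)(?:\((\w+)\))?", schedule)]

def violations(schedule):
    """Return the subset of {'ACA', 'ST', 'RG'} that the schedule violates."""
    committed, aborted = set(), set()
    writers, readers = {}, {}   # item -> transactions that wrote / read it so far
    bad = set()
    for action, txn, item in parse(schedule):
        if action == "c":
            committed.add(txn)
            continue
        if action == "a":
            aborted.add(txn)
            continue
        ended = committed | aborted
        # Strictness: every other earlier writer of the item must have terminated.
        if any(t != txn and t not in ended for t in writers.get(item, [])):
            bad |= {"ST", "RG"}          # rigorous schedules must be strict
        if action == "r":
            # ACA: the transaction read from must already be committed.
            live = [t for t in writers.get(item, []) if t not in aborted]
            if live and live[-1] != txn and live[-1] not in committed:
                bad.add("ACA")
        else:
            # Rigorousness: every other earlier reader must have terminated.
            if any(t != txn and t not in ended for t in readers.get(item, [])):
                bad.add("RG")
        (writers if action == "w" else readers).setdefault(item, []).append(txn)
    return bad

print(violations("w1(x) w1(y) r2(u) w1(z) c1 w2(x) r2(y) w2(y) w3(u) c3 c2"))  # {'RG'}
```

Applied to the three example schedules, the check reports all three properties as violated for the cascading-aborts example, strictness and rigorousness for the strictness example, and only rigorousness for the last one.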
Log recoverability
A schedule s is log recoverable if the following two properties hold:
(1) s is recoverable
(2) for all transactions ti, tj: if there is a write/write conflict of the form wi(x) < wj(x) in s, then ai < wj(x), or ci < cj if tj commits and aj < ai if ti aborts
A schedule s is prefix reducible if and only if it is log recoverable and its committed projection is conflict serializable.
References
G. Weikum and G. Vossen. Transactional Information Systems: Theory, Algorithms, and the Practice of Concurrency Control and Recovery. Morgan Kaufmann, 2002.