From Relational Calculus to Relational Algebra
Tuple relational calculus, domain relational calculus, and relational algebra
CS319 Theory of Databases
Domain Relational Calculus 1
• Reminder: predicate calculus query languages
• A query = finding values that satisfy a predicate
• Two kinds of predicate calculus language:
  • primitive objects are "tuples": tuple relational calculus
  • primitive objects are "domain values": domain relational calculus
Domain Relational Calculus 2
• Form of relational calculus expressions
• A formula is built up using the operators of FOPC from atomic clauses of three types:
  1. R(s1 s2 … sk), where R is a relation name and each si is a domain variable
  2. si θ uj, where si and uj are domain variables and θ is an arithmetic comparison operator (such as <, =, etc.)
  3. si θ c, where si is a domain variable and c is a constant
Domain Relational Calculus 3
• Semantics of the atomic clauses:
  1. R(s1 s2 … sk): "s1 s2 … sk represents a tuple in R"
  2. si θ uj: "the domain value represented by si is in relation θ to the domain value represented by uj"
  3. si θ c: "the domain value represented by si is in relation θ to the constant c"
From tuple to domain relational calculus 1
• The definition of safety is expressed in terms of constraints on the component values of tuples, so it generalises directly to domain relational calculus
• Theorem 2: For every safe tuple relational calculus expression there is an equivalent safe domain relational calculus expression.
• Formal proof omitted: it is essentially a syntactic transformation involving substitution for tuple variables
From tuple to domain relational calculus 2
• An illustrative example
• Can express the composition R ∘ S in tuple relational calculus as:
  { w | (∃u)(∃v)(R(u) ∧ S(v) ∧ φ(u,v)) }
  where φ(u,v) ≡ (u[2]=v[1] ∧ w[1]=u[1] ∧ w[2]=v[2])
• In domain relational calculus, this becomes:
  { w1 w2 | (∃u1)(∃u2)(∃v1)(∃v2)(R(u1 u2) ∧ S(v1 v2) ∧ Φ(u1,u2,v1,v2)) }
  where Φ(u1,u2,v1,v2) ≡ (u2=v1 ∧ w1=u1 ∧ w2=v2)
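The two forms of the example can be compared on a small instance. The following is a minimal sketch, assuming relations are modelled as Python sets of tuples; the sample data and the function names are illustrative and not part of the slides.

```python
# Relations are modelled as sets of tuples (an assumption of this sketch).
R = {(1, 2), (2, 3), (4, 5)}
S = {(2, 7), (3, 8), (9, 9)}

def compose_tuple_calculus(R, S):
    """Tuple form: u and v range over whole tuples; u[2]=v[1] on the slide
    becomes u[1] == v[0] with Python's 0-based indexing."""
    return {(u[0], v[1]) for u in R for v in S if u[1] == v[0]}

def compose_domain_calculus(R, S):
    """Domain form: one variable per component (u1, u2, v1, v2)."""
    return {(u1, v2) for (u1, u2) in R for (v1, v2) in S if u2 == v1}

tuple_result = compose_tuple_calculus(R, S)
domain_result = compose_domain_calculus(R, S)
print(sorted(domain_result))
```

Both functions compute the same relation; the only change is whether the bound variables name tuples or individual components, which is the substitution that Theorem 2 relies on.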
From domain relational calculus to algebra 1
• Theorem 3: For every safe expression in domain relational calculus, there is an equivalent relational algebra expression.
• Proof (sketch only)
• Use induction on the number of operators in ψ to construct an algebraic expression for { t1 t2 … tn | ψ(t1, t2, …, tn) }
• To simplify the induction, begin with two lemmas:
  • there is a relational algebra expression to represent dom(ψ)
  • there is no need to consider ∧ and ∀ as independent cases
From domain relational calculus to algebra 2
• Use induction on the number of operators in ψ to construct an algebraic expression for { t1 t2 … tn | ψ(t1, t2, …, tn) }
• By safety, it is enough to show:
  • for each subformula ω of ψ, with associated expression { t1 t2 … tm | ω(t1, t2, …, tm) },
  • there exists a relational algebra expression E whose value is
    dom(ψ)* ∩ { t1 t2 … tm | ω(t1, t2, …, tm) }
  • where dom(ψ)* = set of tuples with components in dom(ψ)
• I.e. can restrict attention to tuples in dom(ψ)*
From domain relational calculus to algebra 3
• Lemma A: If ψ is any formula in domain relational calculus, there is a relational algebra expression to represent the unary relation dom(ψ)
• Note: unary relation ≡ set of 1-tuples ≡ set
• Proof: Suppose R has arity k. Let
  D(R) ≡ Π_1(R) ∪ Π_2(R) ∪ … ∪ Π_k(R).
• dom(ψ) is the union of all the D(R) over relations R referred to in ψ, together with the set of all constants {a1, a2, …, an} referred to in ψ.
• Thus can take D as an algebraic expression:
  D = ( ⋃ D(R) ) ∪ {a1, a2, …, an}, where the union ranges over the relations R referred to in ψ
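Lemma A's construction can be sketched directly. The following assumes relations are Python sets of tuples and that columns are numbered from 1 as on the slides; the helper names `project`, `D_of` and `dom` are hypothetical.

```python
def project(rel, cols):
    """Π_cols(rel), with columns numbered from 1."""
    return {tuple(t[c - 1] for c in cols) for t in rel}

def D_of(rel, arity):
    """D(R) = Π_1(R) ∪ Π_2(R) ∪ ... ∪ Π_k(R), a unary relation."""
    result = set()
    for c in range(1, arity + 1):
        result |= project(rel, [c])
    return result

def dom(relations, constants):
    """relations: (relation, arity) pairs referred to in ψ;
    constants: the constants a1, ..., an named in ψ."""
    d = {(a,) for a in constants}
    for rel, arity in relations:
        d |= D_of(rel, arity)
    return d

# Illustrative data only.
R = {(1, 2), (2, 3)}
S = {(3, "x")}
D = dom([(R, 2), (S, 2)], constants=[0])
print(D)
```

The result is a set of 1-tuples, matching the note that a unary relation is a set of 1-tuples.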
From domain relational calculus to algebra 4
• Lemma B: If ψ is any formula in domain relational calculus, there is a formula ψ' in domain relational calculus with no occurrences of ∧ or ∀ such that
  { t1 t2 … tn | ψ(t1, t2, …, tn) } and { t1 t2 … tn | ψ'(t1, t2, …, tn) }
  are equivalent. This transformation respects safety.
• Proof: Wherever the operators ∧ and ∀ appear in ψ:
  • replace φ ∧ ρ by ¬(¬φ ∨ ¬ρ).
  • replace (∀v)(φ(v)) by ¬(∃v)(¬φ(v)).
• Need to show that safety is preserved ...
From domain relational calculus to algebra 5
• Proof of Lemma B: Wherever the operators ∧ and ∀ appear in ψ:
  • replace φ ∧ ρ by ¬(¬φ ∨ ¬ρ).
  • replace (∀v)(φ(v)) by ¬(∃v)(¬φ(v)).
• To show that safety is preserved ...
• Observe that dom(ψ) = dom(ψ'):
  • this takes care of the first safety condition.
• Note also that if (∀v)(φ(v)) is safe:
  • v ∉ dom(φ)* ⇒ φ(v) true ⇒ ¬φ(v) false
• Hence (∃v)(¬φ(v)) is also safe
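The two rewrites in Lemma B are the usual De Morgan and quantifier-duality laws. As a sanity check (not a proof, and not addressing safety), the sketch below verifies them exhaustively over a small finite domain; the domain and the names used are assumptions for illustration.

```python
from itertools import product

# φ ∧ ρ versus ¬(¬φ ∨ ¬ρ), over all truth-value combinations.
demorgan_ok = all(
    (p and q) == (not ((not p) or (not q)))
    for p, q in product([False, True], repeat=2)
)

# (∀v)(φ(v)) versus ¬(∃v)(¬φ(v)), for every predicate φ on a 3-element domain.
domain = [0, 1, 2]

def forall(pred):
    return all(pred(v) for v in domain)

def exists(pred):
    return any(pred(v) for v in domain)

quantifier_ok = True
for truth in product([False, True], repeat=len(domain)):
    # truth[v] is the truth value that this candidate φ assigns to v.
    def phi(v, truth=truth):
        return truth[v]
    if forall(phi) != (not exists(lambda v: not phi(v))):
        quantifier_ok = False

print(demorgan_ok, quantifier_ok)
```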
From domain relational calculus to algebra 6
• Proof of Theorem 3 (cont.)
• Consider the relation defined by { t1 t2 … tn | ψ(t1, t2, …, tn) }, where ψ is a safe relational calculus expression
• By the Lemmata:
  • can assume neither ∧ nor ∀ occurs in ψ
  • enough to show that for each subformula ω of ψ, with associated expression { t1 t2 … tm | ω(t1, t2, …, tm) },
    there exists a relational algebra expression E whose value is
    dom(ψ)* ∩ { t1 t2 … tm | ω(t1, t2, …, tm) }
    where dom(ψ)* = set of tuples with components in dom(ψ)
• Prove this by induction on the number of operators in ψ.
From domain relational calculus to algebra 7
• I.e. prove for all subformulae ω of ψ, and in particular for ψ itself, by induction on the number of operators N in ω.
• N=0: consider the relation defined by
  dom(ψ)* ∩ { t1 t2 … tm | ω(t1, t2, …, tm) }
  where ω is an atomic formula.
• Let D be the relational algebra expression for dom(ψ).
• Two cases:
  1. ω(ti, tj) = ti θ tj or ω(ti) = ti θ c, where θ is an arithmetic comparison operator
  2. ω(t1, t2, …, tm) = R(t_i(1) t_i(2) … t_i(k))
From domain relational calculus to algebra 8
• Proof of Theorem 3: Base of induction (cont.)
  1. ω(ti, tj) = ti θ tj or ω(ti) = ti θ c, where θ is an arithmetic comparison operator
  2. ω(t1, t2, …, tm) = R(t_i(1) t_i(2) … t_i(k))
• For case 1: use the expression E ≡ σ_{i θ j}(D × D).
• For case 2: have ω(t1, t2, …, tm) = R(t_i(1) t_i(2) … t_i(k))
• By safety, every index r, where 1 ≤ r ≤ m, must be an index i(j) for some j. Define the algebraic expression
  E ≡ Π_{j(1), j(2), …, j(m)}(σ_C(R))
  where C is the conjunction of equalities r=s over pairs (r,s) such that i(r)=i(s), and j(r) is an index such that i(j(r))=r.
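Case 1 of the base step can be sketched as a selection over D × D. The code below assumes θ is `<` and models relations as sets of tuples; the helpers `cross` and `select` are hypothetical. The treatment of the constant form ti θ c as a selection over D alone is an assumption of this sketch, since the slide gives only the two-variable expression.

```python
from itertools import product

def cross(*rels):
    """Cartesian product of relations, concatenating the component tuples."""
    return {sum(parts, ()) for parts in product(*rels)}

def select(rel, cond):
    """σ_cond(rel), with the condition given as a Python predicate on tuples."""
    return {t for t in rel if cond(t)}

D = {(1,), (2,), (3,)}          # illustrative dom(ψ)

# ω(ti, tj) = ti < tj: select from D × D, comparing column 1 with column 2.
E_pair = select(cross(D, D), lambda t: t[0] < t[1])

# ω(ti) = ti < 2 (assumed handling of the constant form): select from D.
E_const = select(D, lambda t: t[0] < 2)

print(sorted(E_pair), sorted(E_const))
```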
From domain relational calculus to algebra 9
• Illustrative example for Case 2:
• Take the domain relational calculus expression
  { t1 t2 t3 | R(t3 t2 t1 t2) }
• Consider the indices j for which i(j) = r: this defines a pattern
          j=1   j=2   j=3   j=4
  r=1:                 •
  r=2:           •           •
  r=3:     •
• A suitable expression is E ≡ Π_{3, 4, 1}(σ_{2=4}(R))
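The example expression can be checked against a direct evaluation of the calculus expression. This is a minimal sketch with made-up data, assuming relations are sets of tuples and columns are numbered from 1; `project` and `select` are hypothetical helper names.

```python
def project(rel, cols):
    """Π_cols(rel), with columns numbered from 1."""
    return {tuple(t[c - 1] for c in cols) for t in rel}

def select(rel, cond):
    """σ_cond(rel), with the condition given as a Python predicate."""
    return {t for t in rel if cond(t)}

# Illustrative 4-ary relation R.
R = {("c", "b", "a", "b"), ("z", "y", "x", "w"), ("p", "q", "r", "q")}

# Algebra: E ≡ Π_{3,4,1}(σ_{2=4}(R)).
E = project(select(R, lambda t: t[1] == t[3]), [3, 4, 1])

# Calculus: { t1 t2 t3 | R(t3 t2 t1 t2) }, evaluated over all values seen in R.
values = {x for t in R for x in t}
direct = {
    (t1, t2, t3)
    for t1 in values for t2 in values for t3 in values
    if (t3, t2, t1, t2) in R
}

print(sorted(E))
```

The tuple ("z", "y", "x", "w") is dropped by the selection because its second and fourth components differ, which is exactly the constraint imposed by the repeated variable t2.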
From domain relational calculus to algebra 10
• Proof of Thm 3: The induction step
• Three cases to consider in the induction step ...
• Assume the form of ω(t1, t2, …, tm) is
  1. φ(u1, u2, …, up) ∨ ρ(v1, v2, …, vr)
  2. ¬φ(t1, t2, …, tm)
  3. (∃t)(φ(t1, t2, …, tm, t))
From domain relational calculus to algebra 11
• Proof of Thm 3: The induction step
• Case 1: Assume the form of ω(t1, t2, …, tm) is
  φ(u1, u2, …, up) ∨ ρ(v1, v2, …, vr)
• Can assume
  • (by safety) the variables u1, u2, …, up, v1, v2, …, vr include all the variables t1, t2, …, tm
  • the variables in { u1, u2, …, up } are distinct
  • the variables in { v1, v2, …, vr } are distinct
From domain relational calculus to algebra 12
• Proof of Thm 3: The induction step for operator ∨
• An illustrative example shows the principle
• ω(t1, t2, …, tm) ≡ φ(u1, u2, …, up) ∨ ρ(v1, v2, …, vr)
  where, in the particular case of m=4, p=3, r=2:
  ω(t1, t2, t3, t4) ≡ φ(t1, t3, t4) ∨ ρ(t2, t4)
• Let F and G be relational algebra expressions for
  { t1 t2 t3 | φ(t1, t2, t3) } and { t1 t2 | ρ(t1, t2) } respectively
• Need to write down a relational algebra expression for
  dom(ψ)* ∩ { t1 t2 t3 t4 | ω(t1, t2, t3, t4) }, which is also
  dom(ψ)* ∩ { t1 t2 t3 t4 | φ(t1, t3, t4) ∨ ρ(t2, t4) }
• To do this, must use the expression D for dom(ψ)
From domain relational calculus to algebra 13
• Proof of Thm 3: The induction step for operator ∨ (cont.)
• Need a relational algebra expression for
  dom(ψ)* ∩ { t1 t2 t3 t4 | φ(t1, t3, t4) ∨ ρ(t2, t4) }
• The set of tuples t1 t2 t3 t4 satisfying φ(t1, t3, t4) is constrained so that t1 t3 t4 is a tuple in the relation defined by the algebraic expression F.
• If D is the algebraic expression for dom(ψ), then F × D defines the tuples t1 t3 t4 t2 satisfying φ(t1, t3, t4) within dom(ψ)*.
• Hence Π_{1, 4, 2, 3}(F × D) defines the tuples t1 t2 t3 t4 satisfying φ(t1, t3, t4) within dom(ψ)*.
From domain relational calculus to algebra 14
• Proof of Thm 3: The induction step for operator ∨ (cont.)
• Need a relational algebra expression for
  dom(ψ)* ∩ { t1 t2 t3 t4 | φ(t1, t3, t4) ∨ ρ(t2, t4) }
• Π_{1, 4, 2, 3}(F × D) defines the tuples t1 t2 t3 t4 satisfying φ(t1, t3, t4) in dom(ψ)*.
• Similarly G × D × D defines the tuples t2 t4 t1 t3 satisfying ρ(t2, t4) within dom(ψ)*, and Π_{3, 1, 4, 2}(G × D × D) defines the tuples t1 t2 t3 t4 satisfying ρ(t2, t4) within dom(ψ)*.
• Hence can take E ≡ Π_{1, 4, 2, 3}(F × D) ∪ Π_{3, 1, 4, 2}(G × D × D)
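The padding-and-reordering trick for ∨ can be checked on a tiny instance. The sketch below assumes relations are sets of tuples with columns numbered from 1; the contents of D, F and G are invented for illustration, and `project` and `cross` are hypothetical helpers.

```python
from itertools import product

def project(rel, cols):
    """Π_cols(rel), with columns numbered from 1."""
    return {tuple(t[c - 1] for c in cols) for t in rel}

def cross(*rels):
    """Cartesian product of relations, concatenating the component tuples."""
    return {sum(parts, ()) for parts in product(*rels)}

D = {(0,), (1,)}      # illustrative dom(ψ)
F = {(0, 1, 1)}       # triples (a, b, c) for which φ(a, b, c) holds
G = {(1, 0)}          # pairs (a, b) for which ρ(a, b) holds

# E ≡ Π_{1,4,2,3}(F × D) ∪ Π_{3,1,4,2}(G × D × D)
E = project(cross(F, D), [1, 4, 2, 3]) | project(cross(G, D, D), [3, 1, 4, 2])

# Direct evaluation of dom(ψ)* ∩ { t1 t2 t3 t4 | φ(t1, t3, t4) ∨ ρ(t2, t4) }.
vals = [d[0] for d in D]
direct = {
    (t1, t2, t3, t4)
    for t1, t2, t3, t4 in product(vals, repeat=4)
    if (t1, t3, t4) in F or (t2, t4) in G
}

print(sorted(E))
```

Each disjunct is padded with copies of D for the variables it does not mention, then projected into the common column order t1 t2 t3 t4 so that the two sides are union-compatible.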
From domain relational calculus to algebra 15
• Case 2: ω(t1, t2, …, tm) is ¬φ(t1, t2, …, tm)
• If F is an algebraic expression for
  dom(ψ)* ∩ { t1 t2 … tm | φ(t1, t2, …, tm) }
  and D is an algebraic expression for dom(ψ), then
  D × D × … × D − F   (with m copies of D)
  represents the relation
  dom(ψ)* − { t1 t2 … tm | φ(t1, t2, …, tm) }
  = dom(ψ)* − (dom(ψ)* − { t1 t2 … tm | ¬φ(t1, t2, …, tm) })
  = dom(ψ)* ∩ { t1 t2 … tm | ¬φ(t1, t2, …, tm) }
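The negation case is a set difference against the m-fold product of D. A minimal sketch with m = 2 and invented data, assuming relations are sets of tuples:

```python
from itertools import product

D = {(0,), (1,), (2,)}      # illustrative dom(ψ)
m = 2
F = {(0, 1), (1, 2)}        # dom(ψ)* ∩ { t1 t2 | φ(t1, t2) }

# D × D × ... × D with m copies of D.
D_m = {sum(parts, ()) for parts in product(D, repeat=m)}

# E represents dom(ψ)* ∩ { t1 t2 | ¬φ(t1, t2) }.
E = D_m - F

print(len(D_m), len(E))
```

Taking the complement relative to D × … × D rather than an unrestricted universe is what keeps the result finite.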
From domain relational calculus to algebra 16
• Case 3: ω(t1, t2, …, tm) is (∃t)(φ(t1, t2, …, tm, t))
• By induction, have an algebraic expression F for
  dom(ψ)* ∩ { t1 t2 … tm tm+1 | φ(t1, t2, …, tm, tm+1) }
• Since ψ is safe:
  t satisfies φ(t1, t2, …, tm, t) ⇒ t ∈ dom(ψ)*
• Hence Π_{1, 2, …, m}(F) represents the required relation:
  dom(ψ)* ∩ { t1 t2 … tm | (∃t)(φ(t1, t2, …, tm, t)) }
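The existential case amounts to projecting away the last column. A minimal sketch with m = 2 and invented data, assuming relations are sets of tuples and columns are numbered from 1; `project` is a hypothetical helper.

```python
def project(rel, cols):
    """Π_cols(rel), with columns numbered from 1."""
    return {tuple(t[c - 1] for c in cols) for t in rel}

# F: triples (t1, t2, t) for which φ(t1, t2, t) holds, already within dom(ψ)*.
F = {(0, 1, 5), (0, 1, 6), (2, 2, 5)}
m = 2

# E ≡ Π_{1,...,m}(F) represents { t1 t2 | (∃t)(φ(t1, t2, t)) } within dom(ψ)*.
E = project(F, list(range(1, m + 1)))

print(sorted(E))
```

The pair (0, 1) has two witnesses for t (5 and 6) but appears once in E, since the result is a set.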
From domain relational calculus to algebra 17
• Have proved the equivalence of relational algebra and domain / tuple relational calculus ...
• Theorems 1, 2 and 3 together prove that
  • relational algebra
  • domain relational calculus
  • tuple relational calculus
  all have the same expressive power.
• Thus a query language is complete if and only if it has the expressive power of one of these formalisms.
To follow … Mathematical foundations and features of SQL