560 likes | 711 Views
Final Revision 2. CS 157A Lecture 26. Prof. Sin-Min Lee. Review Theoretical Query Languages. Relational Algebra. SELECT ( σ ) PROJECT ( π ) UNION ( ) SET DIFFERENCE ( – ) CARTESIAN PRODUCT ( ) RENAME ( ρ ). RA: gives semantics to practical query languages
E N D
Final Revision 2 CS 157A Lecture 26 Prof. Sin-Min Lee
ReviewTheoretical Query Languages Relational Algebra • SELECT ( σ ) • PROJECT ( π ) • UNION ( ) • SET DIFFERENCE ( – ) • CARTESIAN PRODUCT ( ) • RENAME ( ρ ) • RA: gives semantics to practical query languages • Above set: minimal relational algebra • will look at some redundant (but useful!) operators today
Review Express the following query in the RA: Find the names of customers who have both accounts and loans T1 ρT1 (cname2, lno) (borrower) T2 depositor T1 T3 σcname = cname2 (T2) Result π cname (T3) Above sequence of operators (ρ, , σ) very common. Motivates additional (redundant) RA operators.
4. Outer Joins ( ) 2. Division ( ) • Redundant: Above can be expressed in terms of minimal RA • e.g. depositor borrower = • π …(σ…(depositor ρ…(borrower))) • Added as convenience • Natural Join ( ) Relational AlgebraRedundant Operators • 5. Update ( ) (we’ve already been using) 3. Generalized Projection (π)
Relation1 Relation2 Notation: depositor borrower ≡ πcname,acct_no,lno (σcname=cname2 (depositor ρt(cname2,lno) (borrower))) Natural Join Idea: combines ρ, , σ = r s
Division Relation2 Relation1 Notation: Idea: expresses “for all” queries = s Query: Find values for A in r which have corresponding B values for allB values in s r
Division ¸ ´ Another way to look at it: and 17 ¸3=5 The largest value of i such that: i´ 3 ≤ 17 Relational Division = s t The largest value of t such that: ( t s Í r ) r
Division A More Complex Example ? = t s r
Generalized Projection p e1,…,en (Relation) Notation: e1,…,en can include arithmetic expressions – not just attributes Example credit = Then… πcname, limit - balance (credit) =
loan borrower = Outer Joins Motivation: loan = borrower = = • Join result loses… • any record of Perry • any record of Hayes
loan borrower = Outer Joins borrower = loan = 1. Left Outer Join ( ) • preserves all tuples in left relation ┴ = NULL
loan borrower = Outer Joins borrower = loan = 2. Right Outer Join ( ) • preserves all tuples in right relation ┴ = NULL
loan borrower = Outer Joins borrower = loan = 3. Full Outer Join ( ) • preserves all tuples in both relations ┴ = NULL
Update Identifier Query Notation: Common Uses: 1. Deletion: r r – s e.g., account account – σbname=Perry (account) (deletes all Perry accounts) 2. Insertion: r r s e.g., branch branch {(Waltham, Boston, 7M)} (inserts new branch with bname = Waltham, bcity = Boston, assets = 7M) e.g., depositor depositor (ρtemp (cname,acct_no) (borrower)) (adds all borrowers to depositors, treating lno’s as acct_no’s) 3. Update: r πe1,…,en (r) e.g., account πbname,acct_no,bal*1.05 (account) (adds 5% interest to account balances)
Another Theoretical Query LanguageRelational Calculus Two flavors: • Tuple relational calculus (TRC) • Domain relational calculus (DRC) Logic-based query language ({x | … }, , , , , , , …) More declarative than RA RA: πlno (σamt > 1000 (loan)) Procedural • Select loan tuples with amt > 1000 • Project the result of 1 on lno TRC: {t | s loan (t [lno] = s [lno] s [amt] > 1000) } Non-procedural • No order of evaluation implied • Basis for SQL
Tuple Relational CalculusSome Queries 1. Find loans for amounts > $1200 {t | t loan t[amt] > 1200} Basic Form: {x | P(x)} • set comprehension: “the set of all x such that P(x) is true” • x: tuple variable • logic contained in predicate (P) Result • t loan • t [amt] > 1200 • (1) (2) (equivalent to σ(2) (loan))
Tuple Relational CalculusPredicates Given {x | P(x)}, what can P(x) be? 1. Simple predicate (, =, ≠, <, >, ≤, ≥) • e.g., t loan • e.g., t [amt] > 1200 2. Compound predicate (, , , ) • e.g., (t loan) t [amt] > 1200 • e.g., (t [bname] = “Downtown”) • e.g., (t [bname] = “Downtown”) t [amt] > 1200 ( º OR, º AND, º NOT)
Tuple Relational CalculusPredicates Given {x | P(x)}, what can P(x) be? 3. Quantified Predicates (, ) • (a) Existential Quantification () • t r (Q (t)) • true if there exists some tuple in r (t) such that Q(t) is true • e.g., s loan (s [lno] = “L-17”) • (b) Universal Quantification () • t r (Q (t)) • true if for all tuples in r (t), Q(t) is true • e.g., s loan (s [amt] > 100)
Tuple Relational CalculusMore Queries 2. {t | t loan s loan (s [amt] > t [amt])} A. Returns everything in loan except for (Redwood, L-23, 2000) 3. {t | t loan s loan (s [amt] >t [amt])} A. Returns Q. Express a TRC query to find the largest loan A. {t | t loan s loan (s [amt] <t [amt])} OR {t | t loan ( s loan (s [amt] > t [amt]))}
Tuple Relational CalculusProjection Queries σ : Find loans for amts > 1200 {t | t loan (t [amt] > 1200)} t | t loanindicates that t has same structure as tuples in loan π: Find loan numbers for all loans for amts > 1200 {t | s loan(t [lno] = s [lno] s [amt] > 1200)} No predicate of form: t relation t consists of attributes used in set comprehension with t (i.e., lno) Result =
Tuple Relational CalculusProjection Queries Q. Find names and cities of branches with assets > $3M. A. {t | s branch (t [branch] = s [branch] t [bcity] = s [bcity] s [assets] > 3M)} Result =
Tuple Relational CalculusJoin Queries Find the names of customers w/ loans at the Perry branch. Answer has form {t | P(t)}. Strategy for determining P(t): 1. What tables are involved? • borrower (s), loan (u) 2. What are the conditions? • (a)Projection: t [cname] = s [cname] • (b)Join: s [lno] = u [lno] • (c)Selection: u [bname] = “Perry”
Tuple Relational CalculusJoin Queries Find the names of customers w/ loans at the Perry branch. A. {t | s borrower (P(t,s))} such that: P(t,s)ºt [cname] = s [cname] u loan (Q(t,s,u)) Q(t,s,u)ºs [lno] = u [lno] u [bname] = “Perry” OR {t | s borrower ( t [cname] = s [cname] u loan (s [lno] = u [lno] u [bname] = “Perry”))} unfolded version (either is ok)
Tuple Relational CalculusJoin Queries Q. Find loan numbers of loans held at branches in Brooklyn. 1. Tables involved loan (s), branch (u) 2. Conditions (a) Projection t [lno] = s [lno] (b) Join s [bname] = u [bname] (c) Selection u [bcity] = “Brooklyn” A. {t | s loan (P(t,s))} such that: P(t,s)ºt [lno] = s [lno] u branch (Q(t,s,u)) Q(t,s,u)ºs [bname] = u [bname] u [bcity] = “Brooklyn”
Tuple Relational CalculusJoin Queries • Find the names and cities of customers having a loan • from the Perry branch 1. Tables involved borrower (s), customer (u), loan (v) 2. Conditions (c) Selection (a) Projection • t [cname] = s [cname] • t [ccity] = u [ccity] v [bname] = “Perry” (b) Join • s [cname] = u [cname] • s [lno] = v [lno] A. {t | s borrower (P(t,s))} P(t,s)ºt [cname] = s [cname] u customer (Q(t,s,u)) Q(t,s,u)ºt [ccity] = u [ccity] s [cname] = u [cname] v loan (R(t,s,u,v)) R(t,s,u,v)ºs [lno] = v [lno] v [bname] = “Perry”
P(t) Q(t) Tuple Relational CalculusImplication (Þ) Resembles if … then • p q : true if p being true always means q is also true • p q ≡ p q Example • {t | t loan P(t) Q(t) } • P(t)≡t [bname] = “Perry” • Q(t)≡t [amt] > 1000 Result =
Tuple Relational CalculusImplication (Þ) Often is used with to express “for all” queries e.g., Find names of customers who have an account at all branches located in Brooklyn Connection of all to implies Rewording of example: • Find names of customers for whom the following property hold: • For every branch, if the branch in located in Brooklyn, this implies • that the customer has an account at that branch.
Tuple Relational CalculusImplication (cont.) • Find names of customers for whom the following • property holds: For every branch, if the branch is • located in Brooklyn, this implies that the • customer has an account at that branch. A. {t | s branch (s [bcity] = “Brooklyn” P(t,s))} What is P(t,s)?
Tuple Relational CalculusImplication (cont.) • Find names of customers for whom the following • property holds: For every branch, if the branch is • located in Brooklyn, this implies that the • customer has an account at that branch. A. {t | s branch (s [bcity] = “Brooklyn” P(t,s))} 1. Tables involved branch (s), depositor (u), account (v) 2. Conditions • (c) Join (a) Implication • s [bname] = v [bname] • u [acct_no] = v [acct_no] s [bcity] = “Brooklyn” • (b) Projection • (d) Selection • - • t [cname] = u [cname]
Tuple Relational CalculusImplication (cont.) • Find names of customers for whom the following • property holds: For every branch, if the branch is • located in Brooklyn, this implies that the • customer has an account at that branch. A. {t | s branch (s [bcity] = “Brooklyn” P(t,s))} P(t,s)≡ u depositor (Q(t,s,u)) Q(t,s,u)≡ t [cname] = u [cname] v account (R(t,s,u,v)) R(t,s,u,v)≡ s [bname] = v [bname] u [acct_no] = v [acct_no]
LetDi be a domain variablec be a domain constant be a comparison operator Atoms r(D1, D2, …, Dn) Di Dj Di c Let F, F1 and F2 be formulas Formulas ( F ) not F F1 and F2 F1 or F2Let D be free* in F(D) (exists D) F(D) (forall D) F(D)* a variable is free in a formula if it is not quantified by exists or forall Domain Relational CalculusAtoms & Formulas
Domain Relational CalculusValid Expression { D1, …, Dn | F(D1, …, Dn) } is a valid DRC expression if it has only the variables appearing to the left of the vertical bar | free in F.Any other variable appearing in F must be bound. free vs. bound variables • free (global): variable is not explicitly quantified • bound (free): variable is declared explicitly through quantification and its scope is the quantified formula
Domain Relational CalculusRelational Completeness condition (r):{ R1, …, Rn | r(R1, …, Rn) and condition} ai,…,aj(r): { Ri, …, Rj | r(R1, …, Ri, …, Rj, …, Rn)} r s: { D1, …, Dn | r(D1, …, Dn) or s(D1, …, Dn) } r - s: { D1, …, Dn | r(D1, …, Dn) and not s(D1, …, Dn) } q × r : { Q1, …, Qm, R1, …, Rn | q(Q1, …, Qm) and r(R1, …, Rn) }
Tuple Relational CalculusSyntax Summary { T1, …, Tn | F(T1, …, Tn) } • F describes the properties of the data to be retrieved. • The output schema of F is given by the tuple variables T1, …, Tn that act as global variables in F.
LetT and Ti be tuple variablesaj be an attributec be a domain constant be a comparison operator Atoms r(T) Ti.am Tj.an T.ai c Let F, F1 and F2 be formulas Formulas ( F ) not F F1 and F2 F1 or F2Let T be free* in F(T) (exists T) F(T) (forall T) F(T)* a variable is free in a formula if it is not quantified by exists or forall Tuple Relational CalculusAtoms & Formulas
Tuple Relational CalculusValid Expression { T1, …, Tn | F(T1, …, Tn) } is a valid TRC expression if it has only the variables appearing to the left of the vertical bar | free in F.Any other variable appearing in F must be bound. free vs. bound variables • free (global): variable is not explicitly quantified • bound (free): variable is declared explicitly through quantification and its scope is the quantified formula
Tuple Relational CalculusRelational Completeness condition (r):{ R| r(R) and condition} ai…,aj(r): { R.ai, …, R.aj | r(R)} r s: { T | r(T) or s(T) } r - s: { T | r(T) and not s(T) } q × r : { Q, R | q(Q) and r(R) }
Introduction - Relational algebra is procedural: it specifies the procedure to be followed in order to get the answer to the query. - Relational calculus is declarative: it describes (declares) the answer to the query without specifying how to get it. - Relational calculus strongly resembles First Order Predicate Logic, or simply first order logic. - There are two variants of relational calculus: - Tuple relational calculus (TRC) - Domain relational calculus (DRC)
TUPLE RELATIONAL CALCULUS - Aquery statement in TRC is a set declaration having the form: { P first-order logic formula} - This is to be read as ‘the set of tuple variables, P, for which the specified first order logic formula is true’. - Thus a TRC query is a request (to the DBMS) to produce a set of tuples corresponding to the tuples of the relational answer in SQL. - Example Given the following query: (Q11) Find all sailors with a rating above 7. The TRC statement of this query is {S S Sailors S.rating > 7}.
SYNTAX AND SEMANTICS OF TRC • The syntax and semantics of TRC is that of first-order logic. It is stated quite precisely in the text and there is no need to repeat it here. Instead we shall examine a few query applications.
QUERY Q12 (Q12) Find the names and ages of sailors with a rating above 7. {P ∃S Sailors (S.rating > 7 P.name = S.name P.age = S.age)} Remarks 1. The fact that the tuple variable P occurs with two attributes (using the dot notation) means that solely these two attributes are required in the answer relation. 2. The symbols used are the usual first-order logic symbols: ∀: for all ∃: there exists ⋀ : and ⋁: or ¬ : not ⇒ : implies
QUERIES 1,2,7,9,14 The TRC statements for these queries are pretty well self explanatory, especially with the added English statements of how to read them.
DOMAIN RELATIONAL CALCULUS (1) - The form of a DRC query is as follows: {<X1, X2, … , Xn> logical DRC formula} signifying that the system must construct (and output) a set of all the tuples which satisfy the stated logical DRC formula in terms of the n attributes X1, X2, … ,Xn. Thus, the answer is a relational instance with attributes X1, X2, … , Xn, these attributes corresponding to those of some of the relations in the database. - Again, the approach used by the system is left unspecified. - The Syntax and the semantics of the DRC are explicitly and precisely described in the text.
DOMAIN RELATIONAL CALCULUS (2) Example: (Q11) Find all sailors with a rating above 7. { < I, N, T, A > <I, N, T, A > ∈ Sailors ⋀ T > 7 } Other queries are illustrated and described in the text with all necessary explanation.
EXPRESSIVE POWER OF ALGEBRA AND CALCULUS( 1) Safety - Certain queries stated in the relational calculus may lead to answers which contain an infinite number of tuples (or at least as many as the system can handle). Example: Consider the TRC query {S ¬(S ∈ Sailors)}. Since there is a quasi-infinite number of tuples that can be created with the attributes of sailors, the answer is (quasi)-infinite. - A query which yields a (quasi)-infinite answer is said to be unsafe, and, of course, should not be allowed by the system. - It is possible to define a safe formula in TRC (see text, section 4.4).
LabSessie 2Tuple Relational Calculus • A query in a tuple relational calculus is expressed as (set of all tuples t such that predicate P is true for t) • Query with constructs “or”, “and”, “there exists” (there exists a tuple t in relation such that predicate Q(t) is true) {t| P(t)} t r(Q(t))
Tuple Relational Calculus • Query with the construct “implies” (if P is true, then Q must be true) • Query with the construct “for all” (Q is true for all tuples t in relation r) P Q t r(Q(t))
{t| P(t)} Tuple Relational Calculus t r(Q(t)) • Relations • V(d, k): visits(drinker, kroeg) • S(k,b): servers(kroeg, bier) • L(d,b): likes(drinker, bier) • Example (one relation) • We want to have the drinkers and the pubs for visitors of the pub ‘Café’: {t|tV t[k]=‘Café’} V(d, k) S(k,b) L(d,b)
Tuple Relational Calculus • Example (one relation) • If we want only the drinker attribute rather than all the attribute of the V relation: {t| s V ( t[d] = s[d] s[k]=‘Café’)} • Example (two relations) • Fin the names of drinkers that likes Duvel {t| s V ( t[d] = s[d] u L ( s[d] = u[d] u[b]=‘Duvel’ ))} V(d, k) S(k,b) L(d,b) V(d, k) S(k,b) L(d,b)