350 likes | 658 Views
Relational Algebra. Relational Query Languages. Query = “retrieval program” Language examples: Theoretical : Relational Algebra Relational Calculus tuple relational calculus (TRC) domain relational calculus (DRC) Practical SQL (SEQUEL from System R) QUEL (Ingres) Datalog (Prolog-like)
E N D
Relational Query Languages • Query = “retrieval program” • Language examples: • Theoretical: • Relational Algebra • Relational Calculus • tuple relational calculus (TRC) • domain relational calculus (DRC) • Practical • SQL (SEQUEL from System R) • QUEL (Ingres) • Datalog (Prolog-like) • Theoretical QL’s: • give semantics to practical QL’s • key to understand query optimization in relational DBMSs
Relational Algebra • Basic operators • select ( ) • project (p ) • union ( ) • set difference ( - ) • cartesian product ( x ) • rename ( ) • The operators take one or two relations as inputs and give a new relation as a result. relation relation relational operator relation
Example Instances R1 S1 Boats S2 Schema: Boats(bid, bname, color) Sailors(sid, sname, rating, age) Reserves( sid, bid, day)
Projection • Examples: ; • Retains only attributes that are in the “projection list”. • Schemaof result: • exactly the columns in the projection list, with the same names that they had in the input relation. • Projection operator has to eliminate duplicates (How do they arise? Why remove them?) • Note: real systems typically don’t do duplicate elimination unless the user explicitly asks for it. (Why not?)
Projection S2
Selection () • Selects rows that satisfy selection condition. • Result is a relation. Schemaof result is same as that of the input relation. • Do we need to do duplicate elimination?
Selection • Notation: p(r) • p is called the selection predicate , rcan be the name of a table, or another query • Predicate: • Simple • attr1 = attr2 • Attr = constant value • (also, <, > , etc) • Complex • predicate1 AND predicate2 • predicate1 OR predicate2 • NOT (predicate)
Union and Set-Difference • All of these operations take two input relations, which must beunion-compatible: • Same number of columns (attributes). • `Corresponding’ columns have the same type. • For which, if any, is duplicate elimination required?
Union S1 S2
Set Difference S1 S2 – S1 S2
Cartesian-Product • S1 R1: Each row of S1 paired with each row of R1. Like the c.p for mathematical relations: every tuple of S1 “appended” to every tuple of R1 • Q: How many rows in the result? • Result schemahas one field per field of S1 and R1, with field names `inherited’ if possible. • May have a naming conflict: Both S1 and R1 have a field with the same name. • In this case, can use the renaming operator…
Cartesian Product Example R1 S1 R1 X S1 =
Rename ( ) • Allows us to refer to a relation by more than one name and to rename conflicting names Example: x (E) returns the expression E under the name X • If a relational-algebra expression E has arity n, then x(A1, A2, …, An)(E) returns the result of expression E under the name X, and with the attributes renamed to A1, A2, …., An. Ex. temp1(sid1,sname,rating, age, sid2, bid, day)(R1 x S1)
Compound Operator: Intersection • In addition to the 6 basic operators, there are several additional “Compound Operators” • These add no computational power to the language, but are useful shorthands. • Can be expressed solely with the basic ops. • Intersection takes two input relations, which must be union-compatible. • Q: How to express it using basic operators? R S = R (R S)
Intersection S1 S2
Compound Operator: Join • Joins are compound operators involving cross product, selection, and (sometimes) projection. • Most common type of join is a “natural join” (often just called “join”). R S conceptually is: • Compute R S • Select rows where attributes that appear in both relations have equal values • Project all unique atttributes and one copy of each of the common ones. • Note: Usually done much more efficiently than this. • Useful for putting “normalized” relations back together.
Natural Join Example R1 S1 R1 S1 =
Other Types of Joins • Condition Join (or “theta-join”): • Result schemasame as that of cross-product. • May have fewer tuples than cross-product. • Equi-join: special case: condition c contains only conjunction of equalities.
Compound Operator: Division • Useful for expressing “for all” queries like: Find sids of sailors who have reserved allboats. • For A/B attributes of B are subset of attrs of A. • May need to “project” to make this happen. • E.g., let A have 2 fields, x and y; B have only field y: A/B contains all tuples (x) such that for every y tuple in B, there is an xy tuple in A.
Examples of Division A/B B1 B2 B3 A/B2 A/B3 A/B1 A
A/B = Disqualified x values Expressing A/B Using Basic Operators • Division is not essential op; just a useful shorthand. • (Also true of joins, but joins are so common that systems implement joins specially.) • Idea: For A/B, compute all x values that are not `disqualified’ by some y value in B. • x value is disqualified if by attaching y value from B, we obtain an xy tuple that is not in A. • Disqualified x values =
Banking Example branch (branch-name, branch-city, assets) customer (customer-name, customer-street, customer-only) account (account-number, branch-name, balance) loan (loan-number, branch-name, amount) depositor (customer-name, account-number) borrower (customer-name, loan-number)
Example Queries • Find all loans of over $1200 amount >1200 (loan) • Find the loan number for each loan of an amount greater than $1200 ploan-number (amount> 1200 (loan)) loan (loan-number, branch-name, amount)
Example Queries • Find the names of all customers who have a loan, an depositor account, or both, from the bank pcustomer-name (borrower) pcustomer-name (depositor) • Find the names of all customers who have a loan and a depositor account at bank. pcustomer-name (borrower) pcustomer-name (depositor) depositor (customer-name, account-number) borrower (customer-name, loan-number)
Example Queries • Find the names of all customers who have a loan at the Perryridge branch but do not have an depositor account at any branch of the bank. pcustomer-name (branch-name = “Perryridge” (borrower loan)) loan (loan-number, branch-name, amount) depositor (customer-name, account-number) borrower (customer-name, loan-number) – pcustomer-name(depositor)
Example Queries account (account-number, branch-name, balance) Find the largest account balance • Rename account relation as d • The query is: p balance(account) - paccount.balance( account.balance < d.balance(account x rd (account)))
Example Queries account (account-number, branch-name, balance) • Find all customers who have an account from at least the “Downtown” and the Uptown” branches. • Query 1 pCN(BN=“Downtown”(depositoraccount)) pCN(BN=“Uptown”(depositoraccount)) where CN denotes customer-name and BN denotes branch-name. • Query 2 customer-name, branch-name(depositoraccount) temp(branch-name) ({(“Downtown”), (“Uptown”)}) depositor (customer-name, account-number)
Example Queries • Find all customers who have an account at all branches located in Boston.pcustomer-name, branch-name(depositoraccount) pbranch-name(branch-city = “Boston”(branch))
Extended Relational Operations • Additional Operators that extend the power of the language • Based on SQL… make the language less clean • Generalized projections • Outer Joins • Update
General Projection Notation: pe1, e2, …, en (Relation) ei: can include any arithmetic operation – not only attributes credit = Example: Then: pcname, limit – balance =
Outer Joins Motivation: loan borrower loan borrower = • Join result loses: • any record of Perry • any record of Smith
Outer Join ( ) • Left outer Join ( ) preserves all tuples in left relation loan borrower = • Right outer Join ( ) preserves all tuples in right relation
Outer Join (cont) • Full Outer Join ( ) • preserves all tuples in both relations
Update () • Deletion: r r – s • account account – s bname = Perry (account) • Insertion: r r s • branch branch {( BU, Boston, 9M)} • Update: r pe1, e2, …, en (r) • account pbname, acct_no, bal * 1.05 (account)