520 likes | 687 Views
Foundations of Database Systems. Relational Algebra Instructor: Zhijun Wang. Announcement. Project A sample format is available online Lab#1 will be given on next Saturday and Wednesday . Group 1: M409 Saturday 13:00-16:00 (Feb 23) Group 2:M402
E N D
Foundations of Database Systems Relational Algebra Instructor: Zhijun Wang DBMS and Applications
Announcement • Project A sample format is available online • Lab#1 will be given on next Saturday and Wednesday. Group 1: M409 Saturday 13:00-16:00 (Feb 23) Group 2:M402 Wednesday 18:30-21:30 (Feb27) DBMS and Applications
Relational Algebra • Relational algebra defines a set of operators that may work on relations. • Recall that relations are simply data sets. As such, relational algebra deals with set theory. • The operators in relational algebra are very similar to traditional algebra except that they apply to sets. DBMS and Applications
Relational Algebra Operators • Relational algebra provides several operators: • Projection • Selection • Union • Difference • Intersection • Product • Join DBMS and Applications
Projection Operator • A projection operation produces a second relation that is a subset of the first. • The subset is in terms of columns, not tuples • The resulting relation will contain a limited number of columns. However, every tuple will be listed. DBMS and Applications
Project Operation • Notation:A1, A2, …, Ak (r) where A1, A2 are attribute names and r is a relation name • The result is defined as the relation of k columns obtained by erasing the columns that are not listed • Duplicate rows removed from result, since relations are sets • E.g. To eliminate the branch-name attribute of accountaccount-number, balance (account) DBMS and Applications
A C A C 1 1 1 2 1 1 2 = Project Operation – Example • Relation r: A B C 10 20 30 40 1 1 1 2 • A,C (r) DBMS and Applications
Selection Operator • The selection operator is similar to the projection operator. It produces a second relation that is a subset of the first. • However, the selection operator produces a subset of tuples, not columns. • The resulting relation contains all columns, but only contains a portion of the tuples. DBMS and Applications
Select Operation • Notation: p(r) • p is called the selection predicate • Defined as: p(r) = {t | t rand p(t)} Where p is a formula in propositional calculus consisting of terms connected by : (and), (or), (not)Each term is one of: <attribute> op <attribute> or <constant> where op is one of: =, , >, . <. • Example of selection:branch-name=“Perryridge”(account) DBMS and Applications
Select Operation – Example • Relation r A B C D 1 5 12 23 7 7 3 10 • A=B ^ D > 5(r) A B C D 1 23 7 10 DBMS and Applications
Union Operator • The union operator adds tuples from one relation to another relation • A union operation will result in combined relation • This is similar to the logical operator ‘OR’ DBMS and Applications
Union Operation • Notation: r s • Defined as: r s = {t | t r or t s} • For r s to be valid. 1. r,s must have the same arity (same number of attributes) 2. The attribute domains must be compatible (e.g., 2nd column of r deals with the same type of values as does the 2nd column of s) • E.g. to find all customers with either an account or a loancustomer-name (depositor) customer-name (borrower) DBMS and Applications
Union Operator • JUNIOR and HONOR-STUDENT relations and their union: • Example of JUNIOR relation • Example HONOR-STUDENT relation • Union of JUNIOR and HONOR-STUDENT relations DBMS and Applications
Union Operation – Example A B A B • Relations r, s: 1 2 1 2 3 s r A B r s: 1 2 1 3 DBMS and Applications
Difference Operator • The difference operator produces a third relation that contains the tuples that appear in the first relation, but not the second • This is similar to a subtraction DBMS and Applications
Set Difference Operation • Notation r – s • Defined as: r – s = {t | t rand t s} • Set differences must be taken between compatible relations. • r and s must have the same arity • attribute domains of r and s must be compatible DBMS and Applications
Difference Operator JUNIOR relation HONOR-STUDENT relation JUNIOR minus HONOR-STUDENT relation DBMS and Applications
Set Difference Operation – Example A B A B • Relations r, s: 1 2 1 2 3 s r A B r – s: 1 1 DBMS and Applications
Intersection Operator • An intersection operation will produce a third relation that contains the tuples that are common to the relations involved. • This is similar to the logical operator ‘AND’ DBMS and Applications
Set-Intersection Operation • Notation: r s • Defined as: • rs ={ t | trandts } • Assume: • r, s have the same arity • attributes of r and s are compatible • Note: rs = r - (r - s) DBMS and Applications
Intersection Operator JUNIOR relation HONOR-STUDENT relation Intersection of JUNIOR and HONOR-STUDENT relations DBMS and Applications
Set-Intersection Operation - Example • Relation r, s: • r s A B A B 2 3 1 2 1 s r A B 2 DBMS and Applications
Product Operator • A product operator is a concatenation of every tuple in one relation with every tuple in a second relation • The resulting relation will have n x m tuples, where… n = the number of tuples in the first relation and m = the number of tuples in the second relation • This is similar to multiplication DBMS and Applications
Cartesian-Product Operation • Notation r x s • Defined as: r x s = {t q | t r and q s} • Assume that attributes of r(R) and s(S) are disjoint. (That is, R S = ). • If attributes of r(R) and s(S) are not disjoint, then renaming must be used. DBMS and Applications
Cartesian-Product Operation-Example C D E A B Relations r, s: 1 2 10 10 20 10 a a b b r s A B C D E r xs: 1 1 1 1 2 2 2 2 10 10 20 10 10 10 20 10 a a b b a a b b DBMS and Applications
Division Operation r s • Suited to queries that include the phrase “for all”. • Let r and s be relations on schemas R and S respectively where • R = (A1, …, Am, B1, …, Bn) • S = (B1, …, Bn) The result of r s is a relation on schema R–S = (A1, …, Am) r s = { t | t R-S(r) u s ( tu r ) } DBMS and Applications
Division Operation – Example A B B Relations r, s: 1 2 3 1 1 1 3 4 6 1 2 1 2 s A r s: r DBMS and Applications
Another Division Example D E A B C D E Relations r, s: a a a a a a a a a a b a b a b b 1 1 1 1 3 1 1 1 a b 1 1 s r r s: A B C a a DBMS and Applications
Join Operator • The join operator is a combination of the product, selection, and projection operators. There are several variations of the join operator… • Conditional join • Equijoin • Natural join • Outer join • Left outer join • Right outer join DBMS and Applications
Conditional Join Or theta-join • Notation: r c s • Here c is a condition. • Let r and s be relations on schemas R and S respectively.The result is a relation on schema R x S which satisfies condition c. DBMS and Applications
Conditional Join Example A B D E D A B E • r c s 10 11 21 32 33 2.4 3.2 4.3 5.5 5.8 30 31 42 45 48 0.4 1.2 1.3 2.5 2.8 32 32 33 33 5.5 5.5 5.8 5.8 0.4 1.2 0.4 1.4 30 31 30 31 C: A>D r s r s DBMS and Applications
Equal-Join Operation • A special case of conditional join: condition C contains only conjunction of equalities. In other words, the operator in equal join only contains = . DBMS and Applications
Natural-Join Operation • Notation: r s • Let r and s be relations on schemas R and S respectively.The result is a relation on schema R S which is obtained by considering each pair of tuples tr from r and ts from s. • If tr and ts have the same value on each of the attributes in RS, a tuplet is added to the result, where • t has the same value as tr on r • t has the same value as ts on s • Example: R = (A, B, C, D) S = (E, B, D) • Result schema = (A, B, C, D, E) • rs is defined as: r.A, r.B, r.C, r.D, s.E (r.B = s.B r.D = s.D (r x s)) DBMS and Applications
r s Natural Join Operation – Example B D E A B C D 1 3 1 2 3 a a a b b 1 2 4 1 2 a a b a b s r A B C D E 1 1 1 1 2 a a a a b DBMS and Applications
Data for Join Examples DBMS and Applications
Join Examples Equijoin Natural Join Left Outer Join DBMS and Applications
Outer Join • An extension of the join operation that avoids loss of information. • Computes the join and then adds tuples form one relation that does not match tuples in the other relation to the result of the join. • Uses null values: • null signifies that the value is unknown or does not exist • All comparisons involving null are (roughly speaking) false by definition. • Will study precise meaning of comparisons with nulls later DBMS and Applications
Outer Join – Example branch-name loan-number amount Relation loan Downtown Redwood Perryridge L-170 L-230 L-260 3000 4000 1700 Relation borrower customer-name loan-number Jones Smith Hayes L-170 L-230 L-155 DBMS and Applications
loan-number branch-name amount customer-name L-170 L-230 Downtown Redwood 3000 4000 Jones Smith Outer Join – Example • Inner Joinloan Borrower • Left Outer Join loan borrower loan-number branch-name amount customer-name L-170 L-230 L-260 Downtown Redwood Perryridge 3000 4000 1700 Jones Smith null DBMS and Applications
Outer Join – Example • Right Outer Join loanborrower loan-number branch-name amount customer-name L-170 L-230 L-155 Downtown Redwood null 3000 4000 null Jones Smith Hayes Full Outer Join loan borrower loan-number branch-name amount customer-name L-170 L-230 L-260 L-155 Downtown Redwood Perryridge null 3000 4000 1700 null Jones Smith null Hayes DBMS and Applications
Composition of Operations • Can build expressions using multiple operations • Example: A=C(r x s) • r x s • A=C(r x s) A B C D E 1 1 1 1 2 2 2 2 10 19 20 10 10 10 20 10 a a b b a a b b B C D E A 10 10 20 a a b 1 2 2 DBMS and Applications
Banking Example branch (branch-name, branch-city, assets) customer (customer-name, customer-street, customer-only) account (account-number, branch-name, balance) loan (loan-number, branch-name, amount) depositor (customer-name, account-number) borrower (customer-name, loan-number) DBMS and Applications
Example Queries • Find all loans of over $1200 amount> 1200 (loan) • Find the loan number for each loan of an amount greater than $1200 loan-number (amount> 1200 (loan)) DBMS and Applications
Example Queries • Find the names of all customers who have a borrower, depositor, or both, from the bank customer-name (borrower) customer-name (depositor) • Find the names of all customers who have a loan and an account at bank. customer-name (borrower) customer-name (depositor) DBMS and Applications
Example Queries • Find the names of all customers who have a loan at the Perryridge branch. customer-name (branch-name=“Perryridge” (borrower.loan-number = loan.loan-number(borrower x loan))) • Find the names of all customers who have a loan at the Perryridge branch but do not have an account at any branch of the bank. customer-name (branch-name = “Perryridge” (borrower.loan-number = loan.loan-number(borrower x loan)))– customer-name(depositor) DBMS and Applications
Example Queries • Find the names of all customers who have a loan at the Perryridge branch. • Query 1customer-name(branch-name = “Perryridge” (borrower.loan-number = loan.loan-number(borrower x loan))) Query 2 customer-name(loan.loan-number = borrower.loan-number( (branch-name = “Perryridge”(loan)) x borrower) ) DBMS and Applications
Example Queries Find the largest account balance • Rename account relation as d • The query is: balance(account) - account.balance (account.balance < d.balance(account x rd (account))) DBMS and Applications
Example Queries • Find all customers who have an account from at least the “Downtown” and the “Uptown” branches. • Query 1 CN(BN=“Downtown”(depositoraccount)) CN(BN=“Uptown”(depositoraccount)) where CN denotes customer-name and BN denotes branch-name. • Query 2 customer-name, branch-name(depositoraccount) temp(branch-name) ({(“Downtown”), (“Uptown”)}) DBMS and Applications
Example Queries • Find all customers who have an account at all branches located in Brooklyn city.customer-name, branch-name(depositoraccount) branch-name(branch-city = “Brooklyn”(branch)) DBMS and Applications
Summary of Relational Algebra Operators DBMS and Applications