530 likes | 768 Views
Relational Algebra. Ch 7.4-7.6. Outline. Set Theoretic operations union, intersection, difference, Cartesian product Relational operations project, select, rename, join, division. Set Theoretic Operations. Set Operators union intersection difference Cartesian Product
E N D
Relational Algebra Ch 7.4-7.6
Outline • Set Theoretic operations • union, intersection, difference, Cartesian product • Relational operations • project, select, rename, join, division
Set Theoretic Operations • Set Operators • union • intersection • difference • Cartesian Product • Set operators are binary and will only work on two relations or sets of data.
Set Theoretic Operations • Union: R ∪ S, is a relation that includes all tuples that are either in R or in S or in both (duplicates are eliminated). • Intersection: R ∩ S, is a relation that includes all tuples that are in both R and S. • Set Difference: R – S, is a relation that includes all tuples that are in R but not in S. • Union and Intersection are commutative and associative. • R ∪ S ≡ S ∪ R and R∪(S∪T) ≡ (R∪S)∪T • R ∩ S ≡ S ∩ R and R ∩(S ∩ T) ≡ (R ∩ S) ∩ T • R – S ≠ S – R (Difference is not commutative).
Set Theoretic Operators (cont’d) • Can only be used on union compatible sets • R (A1, A2, …, AN) and S(B1, B2, …, BN) are union compatible if: • degree (R) = degree (S) = N • domain (Ai) = domain (Bi) for all i
Union (⋃) • Assuming R & S are union compatible… • union: R⋃S is the set of tuples in either R or S or both. • since it is a set, there are no duplicate tuples
Set Operations (Example) Two union compatible relations Student ∪ Instructor
Example of Union Dept_phones (dcode, phone_number) R = S = R ⋃ S = ??? S ⋃ R = ???
Intersection (⋂) • Assuming that R & S are union compatible: • intersection: R ⋂ S is the set of tuples in both R and S • Note that R ⋂ S = S ⋂ R
Set Operations (Example) Student ∩ Instructor Two union compatible relations
Example: use R and S from before… • Dept_phones (dcode, phone_number) • R = S = • R ⋂ S = ??? • S ⋂ R = ???
Difference (-) • Difference: R – S is the set of tuples that appear in R but do not appear in S • Is (R – S) = (S – R) ??? • Example: R – S
Set Operations (Example) STUDENT – INSTRUCTOR Two union compatible relations INSTRUCTOR – STUDENT
Cartesian Product (X) • Sets do not have to be union compatible (usually not) • In general, the result of: R(A1, A2, …, AN) X S(B1, B2, …, BM) is Q (A1, A2, …, AN, B1, B2, …, BM) • If R has C tuples and S has D tuples, the result is C*D tuples.
Cartesian Product • Also called “cross product” • Note that union, intersection and cross product are commutative and associative
Cartesian Product Example R = Dept_phones (dcode, phone_number) S = Department (code, office) What is R X S? R = S =
Cartesian Product • Q has one tuple for each combination of tuples, one from R and one from S. • The operation applied by itself is generally meaningless unless it is followed by a selection. • It is common to use Select operation with Cartesian Product, so a special operation (Join) was developed to specify this sequence in a single operation.
Cartesian Product (cont) • Ex: Retrieve for each female employee a list of her dependents. FEMALE_EMPS ← SEX = “F” (EMPLOYEE) EMPNAMES ← PFNAME,LNAME,SSN (FEMALE_EMPS) EMP_DEPENDENTS ← EMPNAMES DEPENDENT ACTUAL_DEPENDENTS ← SSN = ESSN (EMP_DEPENDENTS) RESULT ← PFNAME,LNAME,DEPENDENT_NAME (ACTUAL_DEPENDENTS)
Relational Operations • developed specifically for relational databases • used to manipulate data in ways that set operations can’t • select • project • join • division
Selection Operation • Used to select a subset of the tuples • Selection is based on a “select condition” • The selection condition is basically a filter • Notation: σ<condition>(<Relation>)
Selection (cont’d) • To process a selection, we: • look at each tuple • see if we have a match (based on the condition) • The degree of the result is the same as the degree of the relation | σ | = | r(R) | • c1 (c2 (… (cn(R)) …)) = c1 AND c2 AND … AND cn (R)
The SELECT Operation () • Ex: Select all employees who work on department 4. DNO = 4 (EMPLOYEE) Ex: Select all employees who work on department 4 and their salaries are more than 25000 or who work on department 5 and their salaries are more than 30000. (DNO = 4 AND SALARY>25000) OR (DNO=5 AND SALARY>30000) (EMPLOYEE)
Selection Example • Faculty (fnum, name, office, salary, rank)
Selection Example (cont’d) • σ salary > 27000(Faculty) • σrank = associate(Faculty) • σ fnum = 34567(Faculty)
Selection Example (cont’d) • σ(salary>26000) and (rank=associate) (Faculty) • σ(salary<=26000) and (rank !=associate) (Faculty) • σmax(salary) (Faculty)
Selection (cont’d) • For the condition • any combination of expressions that can be resolved to a boolean value with the relation is okay • For the relation • any relational expression that resolves to a relation is okay
Back to the Dept & Phone example σcode=comp(department X dept_phone) • First, we resolve the cross product of Department and Dept_phone. • We then apply the select to the resulting relation.
Selection (cont’d) σ is commutative σc1 (σc2(R)) = σc2 (σc1(R))
Project Operation () • A select filters out rows • A project filters out columns • reduces data (columns) returned • reduces duplicate columns created by cross product (why?) • creates a new relation
Project (cont’d) • Notation: <attribute list>(Relation) • Thedegree of the result is the number of attributes in the <attribute list> of the project. • | | <= | r(R) |
The PROJECT Operation (P) • P<list1>(P<list2> (R)) = P<list1> (R) as long as <list2> contains the attributes in <list1>. • The project operation is not commutative. • Ex: List name and salary for each employee. • PLNAME,FNAME,SALARY (EMPLOYEE) • Ex: List sex and salary for each employee. PSEX,SALARY (EMPLOYEE)
Project Example Faculty (fnum, name, office, salary, rank) • name, office(Faculty) • fnum, salary(Faculty)
Rename () • Used to give a name to the resulting relation • Notation to make relational algebra easier to write and understand • We can now use the resulting relation in another relational algebra expression • Notation: <New Name> <Relational Expression>
Rename Example Faculty (fnum, name, office, salary, rank) Associatesσrank = associate(Faculty) Result name(Associates)
Join Operation • Join is a commonly used sequence of operators • Take the Cartesian product of two relations • Select only related tuples • (Possibly) eliminate duplicate columns
Join Example R = S = R1 R X S R2 σ dcode = code (R1) Result code, office, number (R2)
Join Example (cont’d) • You could do all of that, or you could do a join • Result R⋈dcode = code S
Kinds of Joins • There are 3 different kinds of joins • Theta join: A join with some condition specified • Equijoin: A join where the only comparison operator used is “=“ • Most common since most joins link together related tuples using a foreign key
Kinds of Joins (cont’d) • Natural join • Is an equijoin followed by the removal of duplicate (superfluous) column(s) • When people talk about “joins”, this is the one we most often mean. • A Natural join is denoted by (*) • Standard definition requires that the columns used to join the tables have the same name • This is a good idea b/c it makes relationships more obvious
Size of a Natural Join • if R contain nR tuples and S contain nS tuples, then the size of R ⋈<>S is between 0 and nR*nS • join selectivity = Expected size of the result nR * nS • This value is used by the optimizer to estimate the cost of the join.
Types of Joins • Left Outer Join • keep all of the tuples from the “left” relation • join with the right relation • pad the non-matching tuples with nulls • Right Outer Join • same as the left, but keep tuples from the “right” relation • Full Outer Join • same as left, but keep all tuples from both relations
Left Outer Join name phone name email • If we do a left outer join on R and S, and we match on the first column, the result is: R= S= name phone email
Right Outer Join name phone name email • If we do a right outer join on R and S, and we match on the first column, the result is: R= S= name phone email
Full Outer Join name phone name email • If we do a full outer join on R and S, and we match on the first column, the result is: R= S= name phone email
Complete Set of Relational Algebra Operators • It has been shown that {σ, , ⋃, –, X} is a complete set of operations. • Any other relational algebra operations can be expressed as a sequence of operations from this set.
Complete set (cont’d) • Intersection: R ⋂S ≡ (R ⋃ S) – ((R – S) ⋃ (S – R)) • Join: R <condition>S ≡ σ<condition>(R X S) • These other operations were added to express the power of relational algebra but are not really necessary.