350 likes | 476 Views
CS 405G: Introduction to Database Systems. Lecture 7: Relational Algebra II Instructor: Chen Qian Spring 2014. p. Review: Summary of core operators. Selection: Projection: Cross product: Union: Difference: Renaming:. σ p R π L R R X S R S R - S ρ S ( A 1 , A 2 , …) R.
E N D
CS 405G: Introduction to Database Systems Lecture 7: Relational Algebra II Instructor: Chen Qian Spring 2014 p
Review: Summary of core operators • Selection: • Projection: • Cross product: • Union: • Difference: • Renaming: σpR πLR RXS RS R- S ρS(A1, A2, …)R Chen Qian @ University of Kentucky
Review Summary of derived operators • Join: • Natural join: • Intersection: RpS RS RS • Many more • Outer join, Division, • Semijoin, anti-semijoin, … Chen Qian @ University of Kentucky
Using Join • What are the ids of Lisa’s classes? • Student(sid: string, name: string, gpa: float) • Course(cid: string, department: string) • Enrolled(sid: string, cid: string, grade: character) • An Answer: • Student_Lisa σname = “Lisa”Student • Lisa_Enrolled Student_Lisa Enrolled • Lisa’s classes πCID Lisa_Enrolled • Or: • Student_Enrolled Student Enrolled • Lisa_Enrolled σname = “Lisa” Student_Enrolled • Lisa’s classes πCID Lisa_Enrolled Chen Qian @ University of Kentucky
σname = “Lisa” πcid Join Example Chen Qian @ University of Kentucky
πCID IDs of Lisa’s classes Who’s Lisa? σname = “Lisa” Enroll Student IDs of Lisa’s Classes • πCID( (σname = “Lisa”Student) Enrolled) Chen Qian @ University of Kentucky
πSID Who’s Lisa? σname = “Lisa” πCID Lisa’s classes Enroll Student Enroll Students in Lisa’s Classes • SID of Students in Lisa’s classes • Student_Lisa σname = “Lisa”Student • Lisa_Enrolled Student_Lisa Enrolled • Lisa’s classes πCID Lisa_Enrolled • Enrollment in Lisa’s classes Lisa’s classes Enrolled • Students in Lisa’s class πSIDEnrollment in Lisa’s classes Students inLisa’s classes Chen Qian @ University of Kentucky
Tips in Relational Algebra • Use temporary variables • Use foreign keys to join tables Chen Qian @ University of Kentucky
πname πSID Student πCID Lisa’s classes Enroll Who’s Lisa? σname = “Lisa” Enroll Student An exercise • Names of students in Lisa’s classes Their names Students inLisa’s classes Chen Qian @ University of Kentucky
- All CID’s CID’s of the coursesthat Lisa IS taking πCID πCID Course σname = “Lisa” Enroll Student Set Minus (difference) Operation • CID’s of the courses that Lisa is NOT taking Chen Qian @ University of Kentucky
Enroll1(SID1, CID1,Grade1) Renaming Operation • Enrolled1(SID1, CID1,Grade1) Enrolled Chen Qian @ University of Kentucky
Example • We have the following relational schemas • Student(sid: string, name: string, gpa: float) • Course(cid: string, department: string) • Enrolled(sid: string, cid: string, grade: character) • SID’s of students who take at least two courses EnrolledEnrolled πSID (EnrolledEnrolled.SID = Enrolled.SID & Enrolled.CID¹Enrolled.CIDEnrolled) Chen Qian @ University of Kentucky
πSID1 SID1 = SID2 & CID1 ¹CID2 ρEnroll1(SID1, CID1,Grade1) ρEnroll2(SID2, CID2, Grade2) Enroll Enroll Example (cont.) Enroll1(SID1, CID1,Grade1) Enrolled Enroll2(SID2, CID2,Grade2) Enrolled πSID (Enroll1SID1 = SID2 & CID1 ¹CID2Enroll2) Expression tree syntax: Chen Qian @ University of Kentucky
How does it work? Enroll1SID1 = SID2 Enroll2 Chen Qian @ University of Kentucky
Enroll1SID1 = SID2 & CID1 ¹CID2Enroll2 Chen Qian @ University of Kentucky
Tips in Relational Algebra • A comparison is to identify a relationship Chen Qian @ University of Kentucky
- πSID πStudent1.SID Student1.GPA < Student2.GPA Student ρStudent1 ρStudent2 Student Student A trickier exercise • Who has the highest GPA? • Who has a GPA? • Who does NOT have the highest GPA? • Whose GPA is lower than somebody else’s? A deeper question:When (and why) is “-” needed? Chen Qian @ University of Kentucky
Review: Summary of core operators • Selection: • Projection: • Cross product: • Union: • Difference: • Renaming: σpR πLR RXS RS R- S ρS(A1, A2, …)R Chen Qian @ University of Kentucky
Review: Summary of derived operators • Join: • Natural join: • Intersection: RpS RS RS Chen Qian @ University of Kentucky
Review • Relational algebra • Use temporary variable • Use foreign key to join relations • A comparison is to identify a relationship Chen Qian @ University of Kentucky
Exercises of R. A. Reserves Sailors Boats Chen Qian @ University of Kentucky
Boat #103 πsname Who reserved boat #103? Sailors σbid = “103” Reserves Problem 1 Find names of sailors who’ve reserved boat #103 • Solution: Chen Qian @ University of Kentucky
Problem 2: Find names of sailors who’ve reserved a red boat πsname Red boats πSID Who reserved red boats? σcolor = “red” Sailors Boat Reserve • Information about boat color only available in Boats; so need an extra join: Names of sailors who reserved red boat Chen Qian @ University of Kentucky
πsname πSID Who reserved red boats? Sailors Red boats Reserve σcolor = “red” color = “green” Boat Problem 3: Find names of sailors who’ve reserved a red boat or a green boat • Can identify all red or green boats, then find sailors who’ve reserved one of these boats: Names of sailors who reserved red boat Chen Qian @ University of Kentucky
Add more rows to the input... RelOp Monotone operators • If some old output rows may need to be removed • Then the operator is non-monotone • Otherwise the operator is monotone • That is, old output rows always remain “correct” when more rows are added to the input • Formally, for a monotone operator op:RµR’ implies op( R ) µop( R’ ) What happens to the output? Chen Qian @ University of Kentucky
Why is “-” needed for highest GPA? • Composition of monotone operators produces a monotone query • Old output rows remain “correct” when more rows are added to the input • Highest-GPA query is non-monotone • Current highest GPA is 4.1 • Add another GPA 4.2 • Old answer is invalidated • So it must use difference! Chen Qian @ University of Kentucky
Selection: σpR Projection: πLR Cross product: RXS Join: RpS Natural join: RS Union: RUS Difference: R-S Intersection: R∩S Monotone Monotone Monotone Monotone Monotone Monotone Monotone w.r.t. R; non-monotone w.r.t S Monotone Classification of relational operators Chen Qian @ University of Kentucky
Why do we need core operator X? • Cross product • The only operator that adds columns • Difference • The only non-monotone operator • Union • The only operator that allows you to add rows? • Selection? Projection? Chen Qian @ University of Kentucky
Aggregate Functions and Operations • Aggregation function takes a collection of values and returns a single value as a result. avg: average valuemin: minimum valuemax: maximum valuesum: sum of valuescount: number of values • Aggregate operation in relational algebra G1, G2, …, GngF1( A1), F2( A2),…, Fn( An) (E) • E is any relational-algebra expression • G1, G2 …, Gn is a list of attributes on which to group (can be empty) • Each Fiis an aggregate function • Each Aiis an attribute name Chen Qian @ University of Kentucky
Aggregate Operation – Example • Relation r: A B C 7 7 3 10 sum-C gsum(c)(r) 27 Chen Qian @ University of Kentucky
Aggregate Operation – Example • Relation account grouped by branch-name: branch-name account-number balance Perryridge Perryridge Brighton Brighton Redwood A-102 A-201 A-217 A-215 A-222 400 900 750 750 700 branch-nameg sum(balance) (account) branch-name balance Perryridge Brighton Redwood 1300 1500 700 Chen Qian @ University of Kentucky
Null Values • It is possible for tuples to have a null value, denoted by null, for some of their attributes • null signifies an unknown value or that a value does not exist. • The result of any arithmetic expression involving null is null. • Aggregate functions simply ignore null values • For duplicate elimination and grouping, null is treated like any other value, and two nulls are assumed to be the same Chen Qian @ University of Kentucky
Null Values • Comparisons with null values return the special truth value unknown • If false was used instead of unknown, then not (A < 5) would not be equivalent to A >= 5 • Three-valued logic using the truth value unknown: • OR: (unknownortrue) = true, (unknownorfalse) = unknown (unknown or unknown) = unknown • AND: (true and unknown) = unknown, (false and unknown) = false, (unknown and unknown) = unknown • NOT: (not unknown) = unknown • Result of select predicate is treated as false if it evaluates to unknown Chen Qian @ University of Kentucky
Review • Expression tree • Tips in writing R.A. • Use temporary variables • Use foreign keys to join tables • A comparison is to identify a relationship • Use difference in non-monotonic results Chen Qian @ University of Kentucky
How to write answers to a R.A. problem • Go ahead to write down a single expression as long as you think it is correct. • However, the followings are recommended: • Draw an expression tree • Write down any • intermediate expressions • temporary variables • renaming operations • Because then you can get partial credits! Chen Qian @ University of Kentucky