430 likes | 562 Views
C20.0046: Database Management Systems Lecture #10. M.P. Johnson Stern School of Business, NYU Spring, 2008. Agenda. Subqueries, etc. Sets, etc. Nulls Outer joins. Operators on subqueries. Several new operators applied to (unary) selections: IN R EXISTS R UNIQUE R s > ALL R s > ANY R
E N D
C20.0046: Database Management SystemsLecture #10 M.P. Johnson Stern School of Business, NYU Spring, 2008 M.P. Johnson, DBMS, Stern/NYU, Spring 2008
Agenda • Subqueries, etc. • Sets, etc. • Nulls • Outer joins M.P. Johnson, DBMS, Stern/NYU, Spring 2008
Operators on subqueries • Several new operators applied to (unary) selections: • IN R • EXISTS R • UNIQUE R • s > ALL R • s > ANY R • x IN R • > is just an example op • Each expression can be negated with NOT M.P. Johnson, DBMS, Stern/NYU, Spring 2008
Next:ALL op Employees(name, job, divid, salary) Find which employees are paid more than all the programmers SELECT name FROM Employees WHERE salary > ALL (SELECT salary FROM Employees WHERE job='programmer') M.P. Johnson, DBMS, Stern/NYU, Spring 2008
ANY/SOME op Employees(name, job, divid, salary) Find which employees are paid more than at least one vice president SELECT name FROM Employees WHERE salary > ANY (SELECT salary FROM Employees WHERE job='VP') M.P. Johnson, DBMS, Stern/NYU, Spring 2008
ANY/SOME op Employees(name, job, divid, salary) Find which employees are paid more than at least one vice president SELECT name FROM Employees WHERE salary > SOME (SELECT salary FROM Employees WHERE job='VP') M.P. Johnson, DBMS, Stern/NYU, Spring 2008
Existential/Universal conditions Employees(name, job, divid, salary) Division(name, id, head) Find all divisions with an employee whose salary is > 100000 SELECT DISTINCT Division.name FROM Employees JOIN Division ON divid=id WHERE salary > 100000 Existential: easy! M.P. Johnson, DBMS, Stern/NYU, Spring 2008
Universal conditions Employees(name, job, divid, salary) Division(name, id, head) Find all divisions in which everyone makes > 100000 Universal: less easy M.P. Johnson, DBMS, Stern/NYU, Spring 2008
Universal conditions Employees(name, job, divid, salary) Division(name, id, head) • Idea: find divisions with some poor employee, and throw them out Find all divisions in which everyone makes > 100000 (SELECT DISTINCT name from Division) MINUS (SELECT DISTINCT Division.name FROM Employees JOIN Division ON divid=id WHERE salary > 100000) M.P. Johnson, DBMS, Stern/NYU, Spring 2008
Or, universal with IN 1. Find the other divisions: in which someone makes <= 100000: SELECT name FROM Division WHERE id IN (SELECT divid FROM Employees WHERE salary <= 100000 2. Select the divisions we didn’t find: SELECT name FROM Division WHERE id NOT IN (SELECT divid FROM Employees WHERE salary <= 100000 M.P. Johnson, DBMS, Stern/NYU, Spring 2008
Or, universal with ALL • Using <= ALL Employees(name, job, divid, salary) Division(name, id, head) Find all divisions in which everyone makes > 100000 M.P. Johnson, DBMS, Stern/NYU, Spring 2008
Next: correlated subqueries • Acc(name,bal,type…) • Q: Who has the largest balance? • Can we do this with subqueries? M.P. Johnson, DBMS, Stern/NYU, Spring 2008
Correlated Queries • Acc(name,bal,type,…) • Q: Find holder of largest account • (Later, could use MAX, but still need a subquery here…) SELECT name FROM Acc WHERE bal >= ALL (SELECT bal FROM Acc) M.P. Johnson, DBMS, Stern/NYU, Spring 2008
Correlated Queries • So far, subquery executed once; • result used for higher query • More complicated: correlated queries • “[T]he subquery… [is] evaluated many times, once for each assignment of a value to some term in the subquery that comes from a tuple variable outside the subquery” (Ullman, p286). • Q: What does this mean? • A: That subqueries refer to vars from outer queries M.P. Johnson, DBMS, Stern/NYU, Spring 2008
Correlated Queries • Acc(name,bal,type,…) • Q2: Find holder of largest account of each type SELECT name, type FROM Acc WHERE bal >= ALL (SELECT bal FROM Acc WHERE type=type) correlation M.P. Johnson, DBMS, Stern/NYU, Spring 2008
Correlated Queries • Acc(name,bal,type,…) • Q2: Find holder of largest account of each type • Note: • scope of variables • this can still be expressed as single SFW SELECT name, type FROM Acc a1 WHERE bal >= ALL (SELECT bal FROM Acc WHERE type=a1.type) correlation M.P. Johnson, DBMS, Stern/NYU, Spring 2008
New topic: R.A./SQL Set Operators • Relations are sets have set-theoretic ops • Venn diagrams • Union: R1 R2 • Example: • ActiveEmployees RetiredEmployees • Difference: R1 – R2 • Example: • AllEmployees – RetiredEmployees = ActiveEmployees • Intersection: R1 R2 • Example: • RetiredEmployees UnionizedEmployees M.P. Johnson, DBMS, Stern/NYU, Spring 2008
Set operations - example R: S: R S: M.P. Johnson, DBMS, Stern/NYU, Spring 2008
Set operations - example R: S: R S: M.P. Johnson, DBMS, Stern/NYU, Spring 2008
Set operations - example R: S: R - S: M.P. Johnson, DBMS, Stern/NYU, Spring 2008
Set ops in SQL • Orthodox SQL has set operators: • UNION, INTERSECT, EXCEPT • Oracle SQL uses MINUS rather than EXCEPT • See the Ullman page on more differences • These ops applied to queries: (SELECT name FROM Person WHERE City = 'New York') INTERSECT (SELECT custname FROM Purchase WHERE store='Kim''s') M.P. Johnson, DBMS, Stern/NYU, Spring 2008
Boat examples • Reserve(ssn,bmodel,color) • Q: Find ssns of sailors who reserved red boats or green boats SELECT DISTINCT ssn FROM reserve WHERE color = 'red' OR color = 'green' M.P. Johnson, DBMS, Stern/NYU, Spring 2008
Boat examples • Reserve(ssn,bmodel,color) • Q: Find ssns of sailors who reserved red boats and green boats SELECT DISTINCT ssn FROM reserve WHERE color = 'red' AND color = 'green' M.P. Johnson, DBMS, Stern/NYU, Spring 2008
Boat examples • Reserve(ssn,bmodel,color) • Q: Find ssns of sailors who reserved red boats and green boats SELECT DISTINCT r1.ssn FROM reserve r1, reserve r2 WHERE r1.ssn = r2.ssn AND r1.color = 'red' AND r2.color = 'green' M.P. Johnson, DBMS, Stern/NYU, Spring 2008
Boat examples • Reserve(ssn,bmodel,color) • Q: Find ssns of sailors who reserved red boats and green boats (SELECT DISTINCT ssn FROM reserve WHERE color = 'red') INTERSECT(SELECT DISTINCT ssn FROM reserve WHERE color = 'green') M.P. Johnson, DBMS, Stern/NYU, Spring 2008
Boat examples • Reserve(ssn,bmodel,color) • Q: Find ssns of sailors who reserved red boats or green boats (SELECT DISTINCT ssn FROM reserve WHERE color = 'red') UNION (SELECT DISTINCT ssn FROM reserve WHERE color = 'green') M.P. Johnson, DBMS, Stern/NYU, Spring 2008
Boat examples • Reserve(ssn,bmodel,color) • Q: Find ssns of sailors who reserved red boats but not green boats (SELECT DISTINCT ssn FROM reserve WHERE color = 'red') EXCEPT (SELECT DISTINCT ssn FROM reserve WHERE color = 'green') M.P. Johnson, DBMS, Stern/NYU, Spring 2008
Union-Compatibility • Situation: Cust1(name,address,…), Cust2(name,…) • Want: report of all customer names and addresses (if known) • Can’t do: • Both tables must have same sequence of types • Applies to all set ops (SELECT name, address FROM Cust1) UNION (SELECT name FROM Cust2) M.P. Johnson, DBMS, Stern/NYU, Spring 2008
Result(name, address) Union-Compatibility • Situation: Cust1(name,address,…), Cust2(name,…) • Want: report of all customer names and addresses (if known) • But can do: • Resulting field names taken from first table (SELECT name, address FROM Cust1) UNION (SELECT name, '(N/A)' FROM Cust2) M.P. Johnson, DBMS, Stern/NYU, Spring 2008
First Unintuitive SQLism • Looking for R (S T) • But what happens if T is empty? • See transcript of this in Oracle on sales SELECT R.A FROM R, S, T WHERE R.A=S.A OR R.A=T.A M.P. Johnson, DBMS, Stern/NYU, Spring 2008
New topic: Nulls in SQL • If we don’t have a value, can put a NULL • Null can mean several things: • Value does not exists • Value exists but is unknown • Value not applicable • But null is not the same as 0 • See Douglas Foster Wallace… M.P. Johnson, DBMS, Stern/NYU, Spring 2008
Null Values • x = NULL 4*(3-x)/7 = NULL • x = NULL x + 3 – x = NULL • x = NULL 3 + (x-x) = NULL • x = NULL x = 'Joe' is UNKNOWN • In general: no row using null fields appear in the selection test will pass the test • With one exception • Pace Boole, SQL has three boolean values: • FALSE = 0 • TRUE = 1 • UNKNOWN = 0.5 M.P. Johnson, DBMS, Stern/NYU, Spring 2008
Null values in boolean expressions • C1 AND C2 = min(C1, C2) • C1 OR C2 = max(C1, C2) • NOT C1 = 1 – C1 • height > 6 = UNKNOWN • UNKNOWN OR weight > 190 = UNKOWN • (age < 25) AND UNKNOWN = UNKNOWN SELECT * FROM Person WHERE (age < 25) AND (height > 6 OR weight > 190) E.g.age=20height=NULLweight=180 M.P. Johnson, DBMS, Stern/NYU, Spring 2008
Comparing null and non-nulls • The schema specifies whether null is allowed for each attribute • NOT NULL to forbid • Nulls are allowed by default • Unexpected behavior: • Some Persons are not included! • The “trichotomy law” does not hold! SELECT * FROM Person WHERE age < 25 OR age >= 25 M.P. Johnson, DBMS, Stern/NYU, Spring 2008
Testing for null values • Can test for NULL explicitly: • x IS NULL • x IS NOT NULL • But: • x = NULLis never true • Now it includes all Persons SELECT * FROM Person WHERE age < 25 OR age >= 25 OR age IS NULL M.P. Johnson, DBMS, Stern/NYU, Spring 2008
Null/logic review • TRUE AND UNKNOWN = ? • TRUE OR UNKNOWN = ? • UNKNOWN OR UNKNOWN = ? • X = NULL = ? M.P. Johnson, DBMS, Stern/NYU, Spring 2008
Next: Outer join • Like inner join except that dangling tuples are included, padded with nulls • Left outerjoin: dangling tuples from left are included • Nulls appear “on the right” • Right outerjoin: dangling tuples from right are included • Nulls appear “on the left” M.P. Johnson, DBMS, Stern/NYU, Spring 2008
Cross join - example MovieStar MovieExec M.P. Johnson, DBMS, Stern/NYU, Spring 2008
Outer Join - Example SELECT * FROM MovieStar LEFT OUTER JOIN MovieExec ON MovieStart.name=MovieExec.name SELECT * FROM MovieStar RIGHT OUTER JOIN MovieExec ON MovieStart.name=MovieExec.name M.P. Johnson, DBMS, Stern/NYU, Spring 2008
Outer Join - Example MovieStar MovieExec SELECT * FROM MovieStar FULL OUTER JOIN MovieExec ON MovieStart.name=MovieExec.name M.P. Johnson, DBMS, Stern/NYU, Spring 2008
New-style outer joins • Outer joins may be left, right, or full • FROM A LEFT [OUTER] JOIN B; • FROM A RIGHT [OUTER] JOIN B; • FROM A FULL [OUTER] JOIN B; • OUTER is optional • If OUTER is included, then FULL is the default • Q: How to remember left v. right? • A: It indicates the side whose rows are always included M.P. Johnson, DBMS, Stern/NYU, Spring 2008
Review • Examples from sqlzoo.net SELECTL FROM R1, …, Rn WHERE C PL(sC(R1 x … Rn) M.P. Johnson, DBMS, Stern/NYU, Spring 2008