360 likes | 526 Views
Advanced SQL. Joe Meehean. SQL Set Operations. Syntax SELECT column1, column2, … FROM table1… WHERE conditions SET_KEYWORD SELECT column1, column2, … FROM table2… WHERE conditions. Union Operation. Uses UNION keyword Combines results of two queries
E N D
Advanced SQL Joe Meehean
SQL Set Operations • Syntax SELECT column1, column2, … FROM table1… WHERE conditions SET_KEYWORD SELECT column1, column2, … FROM table2… WHERE conditions
Union Operation • Uses UNION keyword • Combines results of two queries • Result contains all rows from both queries
Union Operation • Example: “red and green part ids” • Suppliers(sid, sname, address) • Parts(pid, pname, color) • Catalog(sid, pid, cost) • SELECT PartIdFROM Parts P1WHERE P1.color = ‘red’UNIONSELECT PartIdFROM Parts P2WHERE P2.color = ‘green’
Intersection Operation • Uses the INTERSECT keyword • Combines results of two queries • Results contains only rows that are in results of both queries
Intersection Operation • Example: “pids of orange bolts” • SELECT pidFROM Parts P1WHERE P1.color = ‘orange’INTERSECTSELECT pidFROM Parts P2WHERE P2.pname LIKE ‘%bolt%’
Difference Operation • Uses the EXCEPT keyword • Combines results of two queries • Rows in the first query, but not in second
Difference Operation • Example: “pids of orange parts that are not bolts” • SELECT pidFROM Parts P1WHERE P1.color = ‘orange’EXCEPTSELECT pidFROM Parts P2WHERE P2.pname LIKE ‘%bolt%’
Union Compatible • Union, intersect, and difference require union compatible queries • results of queries must be union compatible • cannot combine queries that are not • Union compatible tables • they have the same number of columns • AND corresponding columns have the same data type (e.g., VARCHAR) • AND length
SQL Set Operations and Uniqueness • SQL set operations • UNION, INTERSECT, and EXCEPT • removes duplicates • If you want duplicates use • UNION ALL, INTERSECT ALL, and EXCEPT ALL • Number of duplicate rows in result • for m in 1st table and n in 2nd table • UNION ALL: m + n • INTERSECT ALL: min(m, n) • EXCEPT ALL: m − n
What we are going to Learn • In SQL the result of a query is a table? • Can we query the resultant table? • I.e., can we query a query? • Nested queries • query results of a query • SELECT inside of a SELECT • two types of results • two different ways to construct
2 Types of Subquery Results • Scalar • result is a single column and row • can use as a subquery to replace any single value • e.g., 5000 • e.g., students with better than average GPASELECT StudentIDFROM StudentsWHERE GPA > ( SELECT AVG(GPA) FROM Students )
2 Types of SubqueryResults • Table • result of subquery is a table • outer query uses set operations to check contents of inner query results • SELECT column1,….FROM table1WHERE columnXset_operator( SELECT columnA, …FROM tableA WHERE condition)
Nested Queries Set Operators • IN • e.g., WHERE a IN nested_query • checks whether a is in the results of the nested query • EXISTS • e.g., WHERE/HAVING EXIST nested_query • true if the nested query returned at least one row • UNIQUE • e.g., WHERE UNIQUE nested_query • true if the nested query contains only unique rows • NOT • e.g., WHERE a NOT IN nested_query • negates IN, EXISTS, and UNIQUE
Nested Queries Set Operators • ANY • e.g., WHERE a op ANY nested_query • a is a column or a constant • op is a comparison operator (<,<=,=,<>,>=,>) • returns true if a op is true for any row in the nested query • e.g., 5,000 < ANY (SELECT salary FROM employee) • returns true if any employee salary is greater than 5,000 • ALL • e.g., WHERE a opALL nested_query • returns true if a op is true for all rows in the nested query
Type I Nested Query • Nested query evaluates 1 time • Produces a table • Outer query compares its rows to this table • e.g., course descriptions of Spring offerings • without a join between Offering and Course tables • SELECT CourseDescFROM CourseWHERE CourseNo IN ( SELECT CourseNo FROM OfferingsWHERE term = ‘Spring’)
When to Use Type I Nested Query • Do not need to reference outer query at all • nested query is independent of outer query • Type I queries cannot reference outer query • Type I queries like a procedure call • takes no parameters from outer query • returns result for outer query to use
When to Use Type I Nested Query • Deleting rows related to other rows • delete with join only supported by Access • more generally done nested queries • e.g., remove failing athletes from Athletes(StudentId, TeamId) • DELETE FROM AthletesWHERE StudentNo IN ( SELECT StudentNo FROM Students WHERE Students.GPA < 2.0 )
When to Use Type I Nested Query • Simple difference problems • like EXCEPT without union compatible requirement • generally of the form “blank that are not blank” • e.g., “all the employees who are not pilots: • e.g., “all the parts not supplied by ‘Knockoff Parts’” • e.g., “all of the students who are not athletes”SELECT FirstName, LastNameFROM StudentsWHERE StudentNo NOT IN ( SELECT StudentNo FROM Athletes )
When to Use Type I Nested Query • Why can’t we use “not equals” (<>) for this • e.g., “all of the students who are not athletes”SELECT FirstName, LastNameFROM Students, AthletesWHERE Students.StudentNo <>Athletes.StudentNo • Will return all students • effectively does a cross-product • compares Students.StudentNo to allAthletes.StudentNo • each student has an athlete that has a different StudentNo
Referencing Same Table In Nested Query • Use the rename operator AS • e.g., “Employees who are not managers”SELECT E1.EmpIDFROM Employees AS E1WHERE E1.EmpID NOT IN (SELECT E2. ManagerID FROM Employees AS E2)
Type II Nested Query • References column(s) from outer query • Executed once for every row in outer query • like a nested loop • e.g., “all of the students who are not athletes”SELECT FirstName, LastNameFROM StudentsWHERE NOT EXISTS ( SELECT * FROM Athletes WHERE Students.StudentNo =Athletes.StudentNo )
When to Use Type II Nested Query • More difficult difference problems
When to Use Type II Nested Query • e.g., “Sophmores that never took a class from Phil Park”SELECT FirstName, LastNameFROM Students SWHERE Class = ‘So’ AND NOT EXISTS (SELECT * FROM (Enrollment E JOIN Offering O USING OfferNo) JOIN Faculty F USING FacultyNo WHERE S.StudentNo = E.StudentNoAND F.FirstName = ‘Phil’ AND F.LastName = ‘Park’)
When to use Nested Queries • You need to query a query • we did not cover all possible examples • difference problems are most common • division problems also require nested queries • Then decide whether to use a Type I or II • Type I is cheaper
Division • Represented by / • Combines two relations (tables), A & B • A has two attributes (x,y) • B has one attribute (y) • A/B is all x’s such that (x,y) exists in A for all y’s in B • Result has only a single attribute (x)
Division Operator • For each x value in A • consider the set of y values x maps to in A • if this set contains all y values in B • then x is in A/B • e.g. supplier names that supply all parts • find all the parts (pids) • see if an sid maps to all of those pids • see if the catalog has a supplier that has a row for every part
Division Operator • Derive A/B • think about what doesn’t belong in result • get set of all x,y mappings • remove those mappings that appear in A • what’s left must be x’s that don’t map to all y’s • remove them from the set of x’s
Division Operator • Derive A/B • example derivation on chalk board
Division in SQL • How do we do this in SQL? • Use • Type I nested query • GROUP BY • HAVING • COUNT (*)
Division in SQL • Example: “sids of suppliers who supply all parts” • SELECT sidFROM CatalogGROUP BY sidHAVING COUNT(*) = ( SELECT COUNT(*) FROM Parts)
Division in SQL • Example: “sids of suppliers who supply all of the red parts” • SELECT sidFROM Catalog C JOIN Parts P1 USING pidWHERE P1.color = ‘red’GROUP BY sidHAVING COUNT(*) = ( SELECT COUNT(*) FROM Parts P2 WHERE P2.color = ‘red’)