650 likes | 803 Views
SQL: Queries, Programming, Triggers. Chapter 5. Example Schema. We will use these table definitions in our examples. Sailors ( sid : integer , sname : string , rating : integer , age : real ) Boats ( bid : integer , bname : string , color : string )
E N D
SQL: Queries, Programming, Triggers Chapter 5
Example Schema We will use these table definitions in our examples. Sailors(sid: integer, sname: string, rating: integer, age: real) Boats(bid: integer, bname: string, color: string) Reserves(sid: integer, bid: integer, day: date) Make reservations
Example Instances R1 • We will use these instances of the Sailors and Reserves relations in our examples. • If the key for the Reserves relation contained only the attributes sid and bid, how would the semantics differ? S1 S2
Projection S2 Projection on sname and rating
Cross-Product S1 R1 R1 × S1 Each row of R1 is paired with each row of S1.
Basic SQL Query SELECT [DISTINCT] target-list FROM relation-list WHERE qualification • relation-list A list of relation names • target-list A list of attributes of relations in relation-list • qualification Comparisons (“Attrop const” or “Attr1 op Attr2,” where op is one of ˂, ˃, ≤, ≥, =, ≠ ) combined using AND, OR, and NOT. • DISTINCT is an optional keyword indicating that the answer should not contain duplicates. (Default: noteliminated)
Semantics of SQL QUERY PROCESSOR SELECT [DISTINCT] target-list FROM relation-list WHERE qualification target-list Define search space R1 × R2 × R3 × · · · relation-list qualification ˂, ˃, ≤, ≥, =, ≠ Select rows Select columns Query Result Projection
Conceptual Evaluation Strategy • Semantics of an SQL query defined in terms of the following conceptual evaluation strategy: • Compute the cross-product of relation-list. • Discard resulting tuples if they fail qualifications. • Delete attributes that are not in target-list. • If DISTINCT is specified, eliminate duplicate rows. • This strategy is probably the least efficient way to compute a query! An optimizer will find more efficient strategies to compute the same answers.
Example of Conceptual Evaluation SELECTS.sname FROM Sailors S, Reserves R WHERE S.sid=R.sid AND R.bid=103 S×R Remove Irrelevant columns Disqualified Answer
Example of Conceptual Evaluation SELECT S.sname FROM Sailors S, Reserves R WHERE S.sid=R.sid AND R.bid=103 S×R
A Note on Range Variables Really needed only if the same relation appears twice in the FROM clause. The previous query can be written in two ways: Range variable SELECT S.sname FROM Sailors S, Reserves R WHERE S.sid=R.sid AND bid=103 It is good style, however, to use range variables always! SELECT sname FROM Sailors, Reserves WHERE Sailors.sid=Reserves.sid AND bid=103 OR
Find sailors who’ve reserved at least one boat SELECT S.sid FROM Sailors S, Reserves R WHERE S.sid = R.sid • Would adding DISTINCT to this query make a difference? Remove duplicate sid • What is the effect of replacing S.sid by S.sname in the SELECT clause? Since two sailors may have the same name, some sailor may have no reservation even his/her name is in the output
Find sailors who’ve reserved at least one boat Sailors(sid, sname, rating, age) Reserves(sid, bid, day) sid has a reservation SELECT S.sid FROM Sailors S, Reserves R WHERE S.sid = R.sid Given a sailor sid sid has a reservation
Arithmetic Expressions and Strings AS and = are two ways to name fields in result LIKE is used for string matching. ‘_’ stands for any one character and ‘%’ stands for 0 or more arbitrary characters. SELECT S.age, age1 = S.age-5, 2*S.ageAS age2 FROM Sailors S WHERE S.sname LIKE ‘B_%B’ Name begins and ends with ‘B’ and contains at least three characters
Find names of sailors who’ve reserved a red or a green boat Sailors(sid, sname, rating, age) Reserves(sid, bid, day) Boats(bid, bname, color) sid has a reservation for bid color = ‘red’ OR color = ‘green’ SELECT S.name FROM Sailors S, Boats B, Reserves R WHERE S.sid=R.sid AND R.bid=B.bid AND (B.color=‘red’ ORB.color=‘green’) bid is a boat
Find names of sailors who’ve reserved a red or a green boat SELECT S.name FROM Sailors S, Boats B, Reserves R WHERE S.sid=R.sid AND R.bid=B.bid AND B.color=‘red’ UNION SELECT S.name FROM Sailors S, Boats B, Reserves R WHERE S.sid=R.sid AND R.bid=B.bid AND B.color=‘green’ Name of sailors who’ve reserved red boats • UNION: Compute the union of any two union-compatible sets of tuples (which are themselves the result of SQL queries). Name of sailors who’ve reserved green boats
EXCEPT SELECT S.name FROM Sailors S, Boats B, Reserves R WHERE S.sid=R.sid AND R.bid=B.bid AND B.color=‘red’ EXCEPT SELECT S.name FROM Sailors S, Boats B, Reserves R WHERE S.sid=R.sid AND R.bid=B.bid AND B.color=‘green’ Name of sailors who’ve reserved red boats • What do we get if we replace UNION by EXCEPT? Find the sids of all sailors who have reserved red boats but not green boats Name of sailors who’ve reserved green boats
Find names of sailors who’ve reserved a red and a green boat SELECT S.name FROM Sailors S, Boats B1, Reserves R1, Boats B2, Reserves R2 WHERE S.sid=R1.sid AND R1.bid=B1.bid The sailor reserves 1st boat AND S.sid=R2.sid AND R2.bid=B2.bid The same sailor reserves 2nd boat AND (B1.color=‘red’ AND B2.color=‘green’) The 2nd boat is green The 1st boat is red
Find names of sailors who’ve reserved a red and a green boat Result is empty ! Can’t we say: SELECT S.name FROM Sailors S, Boats B, Reserves R WHERE S.sid=R.sid AND R.bid=B.bid AND (B.color=‘red’ AND B.color=‘green’)
Find names of sailors who’ve reserved a red and a green boat SELECT S.name FROM Sailors S, Boats B, Reserves R WHERE S.sid=R.sid AND R.bid=B.bid ANDB.color=‘red’ INTERSECT SELECT S.name FROM Sailors S, Boats B, Reserves R WHERE S.sid=R.sid AND R.bid=B.bid ANDB.color=‘green’ Name of sailors who’ve reserved red boats • INTERSECT: Can be used to compute the intersection of any two union-compatible sets of tuples. • Included in the SQL/92 standard, but some systems don’t support it. Name of sailors who’ve reserved green boats
Nested Queries Find names of sailors who’ve reserved boat #103 SELECT S.sname FROM Sailors S WHERE S.sid IN (SELECT R.sid FROM Reserves R WHERE R.bid=103) • A very powerful feature of SQL: a WHERE clause can itself contain an SQL query! (Actually, so can FROM and HAVING clauses.) • To understand semantics of nested queries, think of a nested loops evaluation: For each Sailors tuple, check the qualification by computing the subquery. • To find sailors who’ve not reserved #103, use NOT IN. All “boat 103” reservations Sailor S has at least one of these reservations
Nested Queries with Correlation (1) Query: Find names of sailors who’ve reserved boat #103 EXISTStests whether a set is nonempty. SELECT S.sname FROM Sailors S WHERE EXISTS(SELECT * FROM Reserves R WHERE S.sid=R.sidAND R.bid=103) S Sailor S reserves boat 103
NOT EXIST • Use NOT EXIST to find the names of sailors who have not reserved a red boat SELECT S.sname FROM Sailors S WHERE NOT EXISTS (SELECT * FROM Reserves R WHERE R.bid=103 AND S.sid=R.sid)
Nested Queries with Correlation (2) Query: Find names of sailors who reserve boat 103 at most once. • UNIQUE returns true if no row appears more than once. (Note: returns true if answer is empty) • Can we replace “SELECT R.bid” by “ SELECT * ” ? No, A sailor may reserve boat 103 on different days; and UNIQUE would return true SELECT S.sname FROM Sailors S WHERE UNIQUE (SELECT R.bid FROM Reserves R WHERE R.bid=103 AND S.sid=R.sid)
More on Set-Comparison Operators • Also available: opANY,opALL , where op: ˃, ˂, =, ≠, ≥, ≤ • Find sailors whose rating is greater than that of some sailor called Horatio: SELECT * FROM Sailors S WHERE S.rating > ANY (SELECT S2.rating FROM Sailors S2 WHERE S2.sname=‘Horatio’) The subquery must return a row that makes the comparison true, in order for S.rating > ANY … to return true
Rewriting INTERSECT Queries Using IN Find sid’s of sailors who’ve reserved both a red and a green boat: sid of sailors who’ve reserved red boats SELECT S.sid FROM Sailors S, Boats B, Reserves R WHERE S.sid=R.sid AND R.bid=B.bid ANDB.color=‘red’ (SELECT S2.sid FROM Sailors S2, Boats B2, Reserves R2 WHERE S2.sid=R2.sid AND R2.bid=B2.bid AND B2.color=‘green’) Similarly, we can rewrite EXCEPT queries using NOT IN. sid of sailors who’ve reserved green boats AND S.sid IN
Division B A/B A X X Y Y Useful for expressing queries like: Find sailors who have reserved all boats No yellow nor green No red nor green Division
Division Operations in SQL (1) Find names of sailors who’ve reserved all boat: SELECT S.sname FROM Sailors S WHERE NOT EXIST ((SELECT B.bid FROM Boats B) EXCEPT (SELECT R.bid FROM Reserves R WHERE R.sid = S.sid )) The sailor reserved all boats All boats Boats not reserved by the sailor All boats reserved by the sailor
Division Operations in SQL (2) Find names of sailors who’ve reserved all boat: Sailor S SELECT S.sname FROM Sailors S WHERE NOT EXIST ((SELECT B.bid FROM Boats B WHERE NOT EXISTS (SELECT R.bid FROM Reserves R WHERE R.bid = B.bid AND R.sid = S.sid)) such that there is no boat B without a reservation showing Sailor S such that … there is no boat B without … a Reserves tuple showing S reserved B. Sailor S reserved boat B
Aggregate Operators Significant extension of relational algebra
Aggregate Operators Find the name of sailors with the highest rating Count the number of sailors SELECT COUNT (*) FROM Sailors S SELECTS.sname FROM Sailors S WHERES.rating= (SELECT MAX(S2.rating) FROM Sailors S2) Find the average age of sailors with a rating of 10 Find the average of the distinct ages of sailors with a rating of 10 SELECT AVG (S.age) FROM Sailors S WHERE S.rating=10 Count the number of distinct ratings of sailors called “Bob” SELECT AVG (DISTINCT S.age) FROM Sailors S WHERES.rating=10 SELECT COUNT (DISTINCTS.rating) FROM Sailors S WHERE S.sname=‘Bob’
Aggregate Operators SELECT COUNT (*) FROM Sailors S SELECTS.sname FROM Sailors S WHERES.rating= (SELECT MAX(S2.rating) FROM Sailors S2) SELECT AVG (S.age) FROM Sailors S WHERE S.rating=10 SELECT AVG (DISTINCT S.age) FROM Sailors S WHERE S.rating=10 SELECT COUNT (DISTINCTS.rating) FROM Sailors S WHERE S.sname=‘Bob’
Find name and age of the oldest sailor(s) Comparing a number with a relation is allowed here SELECT S.sname FROM Sailors S WHERE S.rating = (SELECT MAX(S2.rating) FROM Sailors S2) Allowed in SQL/92 standard, but is not supported in some systems SELECT S.sname FROM Sailors S WHERE ( SELECT MAX (S2.rating) FROM Sailors S2 ) = S.rating
Find name and age of the oldest sailor(s) Only aggregate operations allowed If the SELECT clause uses an aggregate operation, then it must use only aggregate operations unless the query contains a GROUP BY clause (aggregate value for each group – discussed later.) Illegal SELECT S.sname, MAX (S.age) FROM Sailors S
GROUP BY and HAVING (1) • So far, we’ve applied aggregate operators to all (qualifying) tuples. • Sometimes, we want to apply them to each of several groups of tuples. Relation Aggregator Qualifier 32 12 Aggregator Group 1 Relation 9 Group 2 Aggregator 11 Aggregator Group 3
GROUP BY and HAVING (2) Consider: Find the age of the youngest sailor for each rating level. /* Min(age) for multiple groups • If we know that rating values go from 1 to 10, we can write 10 queries that look like this: • In general, we don’t know how many rating levels exist, and what the rating values for these levels are ! SELECT MIN (S.age) FROM Sailors S WHERE S.rating = i For i = 1, 2, ... , 10:
Queries With GROUP BYand HAVING SELECT [DISTINCT] target-list FROM relation-list WHERE qualification GROUP BY grouping-list HAVING group-qualification MIN(Attribute) GROUP BY Output a table HAVING 12 Qualifier selecting groups SELECT FROM WHERE Aggregator Group 1 9 Aggregator Group 2 Group 3
Queries With GROUP BYand HAVING SELECT [DISTINCT] target-list FROM relation-list WHERE qualification GROUP BY grouping-list HAVING group-qualification The target-list contains: (i) attribute names,(ii) terms with aggregate operations (e.g., MIN (S.age)). • Each answer tuplebelongs to a group. • The attribute list must be a subset of grouping-list. • A group is a set of tuples that have the same value for all attributes in grouping-list.
SELECT [DISTINCT] target-list FROM relation-list WHERE qualification GROUP BY grouping-list HAVING group-qualification Conceptual Evaluation • The cross-product of relation-list is computed • Tuples that fail qualification are discarded • `Unnecessary’ fields are deleted • The remaining tuples are partitioned into groups by the value of attributes in grouping-list. • The group-qualification is then applied to eliminate some groups • One answer tuple is generated per qualifying group
Find the age of the youngest sailor with age ≥ 18, for each rating with at least 2 such sailors Input relation SELECT S.rating, MIN (S.age) FROM Sailors S WHERE S.age>= 18 GROUP BY S.rating HAVING COUNT (*) > 1 Sailors age Answer Disqualify Only one group satisfies HAVING 4rating groups Only S.rating and S.age are mentioned in SELECT
“GROUP BYand HAVING” Examples Find the age of the youngest sailor with age ≥ 18 SELECT MIN (S.age) FROM Sailors S WHERE S.age >= 18 SELECT S.rating, MIN (S.age) FROM Sailors S WHERE S.age >= 18 GROUP BY S.rating Find the age of the youngest sailor with age ≥ 18, for each rating SELECT S.rating, MIN (S.age) FROM Sailors S WHERE S.age >= 18 GROUP BY S.rating HAVING COUNT (*) > 1 Find the age of the youngest sailor with age ≥ 18, for each ratingwith at least 2 such sailors
For each red boat, find the number of reservations for this boat 3) Count the number of reservations for each red-boat group SELECT B.bid, COUNT (*) AS scount FROM Boats B, Reserves R WHERE R.bid=B.bid AND B.color=‘red’ GROUP BY B.bid 1) Find all reservations for red boats 2) Group the reservations for red boats according to bid
Number of reservation for each boat Illegal Having Clause SELECT B.bid, COUNT (*) AS scount FROM Boats B, Reserves R WHERE R.bid=B.bid GROUP BY B.bid HAVING B.color=‘red’ B.Coloris not in the grouping-list • HAVING B.color=‘red’ Illegal ! Note: HAVING clause is to select groups ! GROUP BY “bid” Output a table HAVING “red” ? ? Qualifier selecting groups Aggregator Group 1 SELECT FROM WHERE ? Aggregator Group 2 Group 3
4 1 Find the age of the youngest sailor older than 18, for each rating with at least 2 sailors 2 3 Sailors Rating 1 3 Age>18 HAVING 1 ˂ (SELECT COUNT (*) FROM Sailors S2 WHERE S.rating = S2.rating) Size>1 Group S.rating 2 WHERE S.age > 18 26 Min. age 22 4 GROUP BY S.rating
Find the age of the youngest sailor older than 18, for each rating with at least 2 sailors 4) Find youngest age for each qualified group 1) Find all sailors older than 18 SELECTS.rating, MIN (S.age) FROM Sailors S WHERE S.age > 18 GROUP BY S.rating HAVING 1 ˂ (SELECT COUNT (*) FROM Sailors S2 WHERES.rating = S2.rating) 2) Group qualified sailors according to rating 3.2) Discard groups with less than two sailors 3.1) Count the number of sailors in a group Number of sailors with this rating
Find the age of the youngest sailor older than 18, for each rating level that has at least 2 sailors SELECTS.rating, MIN (S.age) FROM Sailors S WHERES.age > 18 GROUP BY S.rating HAVING 1 ˂ (SELECT COUNT (*) FROM Sailors S2 WHERES.rating= S2.rating) • Shows HAVING clause can also contain a subquery. • We can use S.rating inside the nested subquery because it has a single value for the current group of sailors. • What if HAVING clause is replaced by “HAVING COUNT(*) >1” • Find the age of the youngest sailor older than 18, for each rating level that has at least two such sailors. If that rating group has more than 1 sailor
Find the age of the youngest sailor older than 18, for each rating level that has at least 2 sailors Counting including sailors younger than 18 SELECTS.rating, MIN (S.age) FROM Sailors S WHERES.age > 18 GROUP BY S.rating HAVING 1 ˂ (SELECT COUNT (*) FROM Sailors S2 WHERES.rating= S2.rating) “age” is notmentioned in this subquery CAUTIOUS At least 2 sailors SELECTS.rating, MIN (S.age) FROM Sailors S WHERES.age > 18 GROUP BY S.rating HAVINGCOUNT (*) › 1 Counting include only adult sailors At least 2 such sailors, i.e., older than 18
Find those ratings for which the average age is the minimum over all ratings Aggregate operations cannot be nested SELECT S.rating FROM Sailors S WHERE S.age = (SELECT MIN (AVG (S2.age)) FROM Sailors S2)
Find those ratings for which the average age is the minimum over all ratings Correct solution (in SQL/92): Find average age for each rating group SELECT Temp.rating, Temp.avgage FROM (SELECT S.rating, AVG (S.age) AS avgage FROM Sailors S GROUP BY S.rating) AS Temp WHERE Temp.avgage = (SELECT MIN (Temp.avgage) FROM Temp) Average age for some rating group Minimum over all ratings
Null Values • Field values in a tuple are sometimes • unknown(e.g., a rating has not been assigned), or • inapplicable(e.g., no spouse’s name). • SQL provides a special value null for such situations.