1 / 26

Math in SQL

Math in SQL. Aggregation Operators. Operators on sets of tuples. Significant extension of relational algebra. SUM ( [DISTINCT] A) : the sum of all (unique) values in attribute A. AVG ( [DISTINCT] A): the average of all (unique) values in attribute A . SELECT AVG ( S.age )

devaki
Download Presentation

Math in SQL

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Math in SQL

  2. Aggregation Operators • Operators on sets of tuples. • Significant extension of relational algebra. • SUM ( [DISTINCT] A): the sum of all (unique) values in attribute A. • AVG ( [DISTINCT] A): the average of all (unique) values in attribute A. SELECT AVG (S.age) FROM Sailors S; SELECT AVG ( DISTINCT S.age) FROM Sailors S WHERES.rating=10;

  3. Aggregation Operators • Operators on sets of tuples. • Significant extension of relational algebra. • MAX (A): the maximum value in attribute A. • MIN (A): the minimum value in attribute A. SELECT MAX(rating) FROM Sailors; SELECTS.sname FROM Sailors S WHERES.rating= (SELECT MAX(S2.rating) FROM Sailors S2);

  4. Aggregation Operators • Operators on sets of tuples. • Significant extension of relational algebra. • COUNT (*): the number of tuples. SELECT COUNT (*) FROM Sailors S

  5. Aggregation Operators • Operators on sets of tuples. • Significant extension of relational algebra. • COUNT ( [DISTINCT] A): the number of (unique) values in attribute A. SELECT COUNT (DISTINCTS.rating) FROM Sailors S WHERE S.sname=‘Bob’;

  6. Aggregation Operators • Find name and age of the oldest sailor(s). • The first query looks correct, but is illegal. • Thoughts as to why? • The second query is a correct and legal solution. SELECTS.sname, MAX (S.age) FROM Sailors S; SELECTS.sname, S.age FROM Sailors S WHERES.age = (SELECT MAX (S2.age) FROM Sailors S2);

  7. GROUP BY and HAVING • So far, we’ve applied aggregation operators to all (qualifying) tuples. Sometimes, we want to apply them to each of several groups of tuples. • Find the age of the youngest sailor for each rating value. • Suppose we know that rating values go from 1 to 10; we can write ten (!) queries that look like this: • But in general, we don’t know how many rating values exist, and what these rating values are. • Plus, it’s a waste of time to write so many queries For i = 1, 2, ... , 10: SELECT MIN (S.age) FROM Sailors S WHERE S.rating = i;

  8. GROUP BY and HAVING SELECT [DISTINCT] target-list FROMrelation-list WHERE qualification GROUP BYgrouping-list HAVING group-qualification • A group is a set of tuples that have the same value for all attributes grouping-list. • The target-list contains • attribute names • terms with aggregation operations. • Attribute list must be a subset of grouping-list. • Each answer tuple corresponds to a group, and output attributes must have a single value per group. Notice the notation

  9. Conceptual Evaluation • Given: SELECT S.rating, MIN(S.age) as minage FROM Sailors S WHERE S.age >= 18 GROUP BY S.rating HAVING COUNT(*) > 1 • Step 1 • The cross-product of relation-list is computed • In this instance, it’s only Sailors

  10. Conceptual Evaluation • Given: SELECT S.rating, MIN(S.age) as minage FROM Sailors S WHERE S.age >= 18 GROUP BY S.rating HAVING COUNT(*) > 1 • Step 2 • Tuples that fail qualification are discarded • ‘unnecessary’ attributes are deleted

  11. Conceptual Evaluation • Given: SELECT S.rating, MIN(S.age) as minage FROM Sailors S WHERE S.age >= 18 GROUP BY S.rating HAVING COUNT(*) > 1 • Step 3 • Remaining tuples are partitioned intogroups by the value of attributes ingrouping-list

  12. Conceptual Evaluation • Given: SELECT S.rating, MIN(S.age) as minage FROM Sailors S WHERE S.age >= 18 GROUP BY S.rating HAVING COUNT(*) > 1 • Step 4 • The group-qualification is then applied toeliminate groups that do not satisfy thiscondition.

  13. Conceptual Evaluation • Given: SELECT S.rating, MIN(S.age) as minage FROM Sailors S WHERE S.age >= 18 GROUP BY S.rating HAVING COUNT(*) > 1 • Step 5 • One answer tuple is generated per qualifying group by applying the aggregation operator.

  14. GROUP BY and HAVING • Find the age of the youngest sailor with age 18, for each rating with at least 2 such sailors. SELECT S.rating, MIN (S.age) FROM Sailors S WHERE S.age >= 18 GROUP BY S.rating HAVING COUNT (*) > 1; • Only S.rating and S.age are mentioned in the SELECT, GROUP BY or HAVING clauses; other attributes `unnecessary’. • 2nd column of result is unnamed • What to do? Answer relation

  15. GROUP BY and HAVING • For each red boat, find the number of reservations for this boat. SELECT B.bid, COUNT (*) AS scount FROM Sailors S, Boats B, Reserves R WHERE S.sid=R.sid AND R.bid=B.bid AND B.color=‘red’ GROUP BY B.bid; • Grouping over a join of three relations. • What do we get if we remove B.color=‘red’ from the WHERE clause and add a HAVING clause with this condition? • What if we drop Sailors and the condition involving S.sid?

  16. GROUP BY and HAVING • Find the age of the youngest sailor with age > 18, for each rating with at least 2 sailors (of any age). SELECT S.rating, MIN (S.age) FROM Sailors S WHERE S.age > 18 GROUP BY S.rating HAVING 1 < (SELECT COUNT (*) FROM Sailors S2 WHERE S.rating=S2.rating); • Shows HAVING clause can also contain a subquery. • What if HAVING clause is replaced by: • HAVING COUNT(*) >1

  17. GROUP BY and HAVING • Find those ratings for which the average age is the minimum over all ratings. • Aggregation operations cannot be nested! • WRONG: SELECT S.rating FROM Sailors S WHERE S.age = (SELECT MIN (AVG (S2.age)) FROM Sailors S2); • Correct solution: SELECT Temp.rating, Temp.avgage FROM (SELECT S.rating, AVG (S.age) AS avgage FROM Sailors S GROUP BY S.rating) AS Temp WHERE Temp.avgage = (SELECT MIN (Temp.avgage) FROM Temp);

  18. ORDERING & TOP/BOTTOM

  19. ORDER BY • The ORDER BY keyword is used to sort the result-set by a specified column. • The ORDER BY keyword sort the records in ascending order by default. • If you want to sort the records in a descending order, you can use the DESC keyword.

  20. TOP/BOTTOM • The TOP clause is used to specify the number of records to return. • The TOP clause can be very useful on large tables with thousands of records • Returning a large number of records can impact on performance • Can ‘sample’ the table using TOP • Not all database systems support the TOP clause or implement it in different fashion

  21. TOP/BOTTOM SQL Server SELECT TOP number|percentcolumn_name(s)FROM table_name Ex: SELECT TOP 5 * FROM Persons MySQL SELECT column_name(s) FROM table_name LIMIT number Ex: SELECT * FROM Persons LIMIT 5

  22. TOP/BOTTOM Oracle SELECT column_name(s) FROM table_name WHERE ROWNUM <= number Ex: SELECT * FROM Persons WHERE ROWNUM <=5 DB2 SELECT column_name(s) FROM table_name FETCH FIRST number ROWS ONLY Ex: SELECT * FROM Persons FETCH FIRST 5 ROWS ONLY

  23. TOP/BOTTOM • Can specify • Fixed number • SELECT TOP 10 * … • A percent • SELECT TOP 10 PERCENT * …

  24. TOP/BOTTOM • How to return the oldest 5 rentals? • How to return the newest 5 rentals?

  25. TOP/BOTTOM • How to return the 3rd newest rental?

  26. Summary • SQL was an important factor in the early acceptance of the relational model; more natural than earlier, procedural query languages. • All queries that can be expressed in relational algebra can also be formulated in SQL. • In addition, SQL has significantly more expressive power than relational algebra, in particular aggregation operations and grouping. • Many alternative ways to write a query; query optimizer looks for most efficient evaluation plan. • In practice, users need to be aware of how queries are optimized and evaluated for most efficient results.

More Related