170 likes | 358 Views
CSE 103. Students: Read Day 16 in the textbook Finish queries from class and homework from Day 10 Others: Please save your work and log off by 10:10. Midterm SIRS. On-line SIRS See the Midterm SIRS link on the Site Map Must verify enrollment with your PID Answers are anonymous
E N D
CSE 103 • Students: • Read Day 16 in the textbook • Finish queries from class and homework from Day 10 • Others: • Please save your work and log off by 10:10. http://www.cse.msu.edu/~cse103 U:/msu/course/cse/103
Midterm SIRS • On-line SIRS • See the Midterm SIRS link on the Site Map • Must verify enrollment with your PID • Answers are anonymous • Help TA and Assistant TA improve • Save nasty comments about the course for end-of-term SIRS. • Complete before 10 p.m. on Monday, February 21 http://www.cse.msu.edu/~cse103 U:/msu/course/cse/103
Function Debrief • Find the current age of everyone who starred in The Ten Commandments (1956) and is still alive today. Order the list oldest down. [13] • SELECT FirstName, LastName, YEAR(NOW()) – YEAR(BirthDate) AS Age FROM (((tbl_Movies AS M INNER JOIN tbl_Roles AS R ON M.MovieID = R.MovieID) INNER JOIN tbl_People AS P ON R.ActorID = P.PersonID) INNER JOIN tbl_Births AS B ON P.PersonID = B.PersonID) LEFT OUTER JOIN tbl_Deaths AS D ON P.PersonID = D.PersonID WHERE MovieTitle = “Ten Commandments, The” AND Year = 1956 AND D.PersonID IS NULL ORDER BY Age DESC; http://www.cse.msu.edu/~cse103 U:/msu/course/cse/103
Homework Debrief • Create a query to find all the directors who were born in a winter month, ordered by birth year. List each director’s name as one field with last name, comma, space, first name [e.g., “Spielberg, Steven”]. • SELECT DISTINCT CONCAT(LastName, ", ", FirstName) AS DirectorName, MONTH(BirthDate) AS BirthMonth, YEAR(BirthDate) AS BirthYear FROM (tbl_People AS p INNER JOIN tbl_Births AS b ON p.PersonID = b.PersonID) INNER JOIN tbl_Credits AS c ON p.PersonID = c.DirectorID WHERE MONTH(BirthDate) IN (12, 1, 2) ORDER BY BirthYear[2565] • New SQL keyword: fieldName IN (v1, v2, v3…) • Means same thing as ((fieldName = v1) OR (fieldName = v2) OR (fieldName = v3) OR … ) • Solutions to other HW problems in Day 11 notes. http://www.cse.msu.edu/~cse103 U:/msu/course/cse/103
Questions from Readings • What are aggregate functions? • What SQL keyword helps find summary statistics about records having a common property? http://www.cse.msu.edu/~cse103 U:/msu/course/cse/103
Aggregate Functions: SQL MIN, MAX, SUM, COUNT, AVG • Five of the aggregate functions are used more often than the others: • MIN, MAX (may be used on text, finds first/last alphabetically) • SUM (must be applied to numeric fields) • COUNT (only counts non-null values) • AVG (uses arithmetic mean, i.e., SUM / COUNT) • These functions can be used in SELECT statements like: • SELECT MIN(DigitalFilesize) AS Smallest FROM tbl_DigitalTracks WHERE DigitalFilesize > 0 • Use aliases to name fields • To sort by these fields in MySQL, you MUST use an alias http://www.cse.msu.edu/~cse103 U:/msu/course/cse/103
Aggregates in MySQL • Answer these basic questions using MIN, MAX, SUM, AVG, and/or COUNT • Save your answers; we'll modify these later today. • How many movies were made in 2004? (Get just the number) [1 row, 2170] • What year did Jet Li first appear in an English-language movie? (Return just the year; don't try to get the title too.) [1 row, 1991] http://www.cse.msu.edu/~cse103 U:/msu/course/cse/103
Aggregates in Access • Save these using the qry_day11_X format. • Find in one query: • the number of MP3s, the total amount of space they take up, and the average filesize[1 row - 362,1292.5, 3.57] • Find the number of tracks which list actual secondary artists. [1 row - 9] • Find the durations (return these only) of the longest and shortest tracks in the database[1 row – 00:04, 22:34] • Even if Access displays them as times, it will work correctly http://www.cse.msu.edu/~cse103 U:/msu/course/cse/103
Grouping Records:SQL GROUP BY • Aggregate queries are most useful when records are grouped • GROUP BY collects all records sharing identical values in a list of fields • Example: • SELECT AlbumID, COUNT(TrackNumber) AS NumberOfTracks FROM tbl_Tracks INNER JOIN tbl_Albums ON tbl_Albums.AlbumID = tbl_Tracks.Album GROUP BY AlbumID • Notice that we don't display the AlbumTitle. More on why in a bit. http://www.cse.msu.edu/~cse103 U:/msu/course/cse/103
More on grouping records • Can group by multiple things at once • Example: • SELECT AVG(Height) FROM tbl_Products GROUP BY CategoryID, ColorID • All products having same Category AND Color • Order in GROUP BY line does not matter (since all fields must be equal) • GROUP BY always comes after WHERE (i.e., the WHERE restrictions are applied before the GROUP BY) http://www.cse.msu.edu/~cse103 U:/msu/course/cse/103
Criteria in GROUP BY:SQL HAVING • Criteria on aggregate fields are specified using the keyword HAVING followed by a condition • Example: • SELECT AlbumID, COUNT(TrackNumber) AS NumberOfTracks FROM tbl_Tracks INNER JOIN tbl_Albums ON tbl_Albums.AlbumID = tbl_Tracks.AlbumGROUP BY AlbumIDHAVING COUNT(TrackNumber) > 12 • Criteria on AGGREGATE functions: HAVINGCriteria on NORMAL fields/functions: WHERE http://www.cse.msu.edu/~cse103 U:/msu/course/cse/103
GROUP BY gotchas • Gotcha 1: In Access and most RDBSs, all displayed fields in aggregate queries must either: • Be GROUP(ed) BY • Be aggregate values (using one of the aggregate functions) • Illegal Example: • Goal: Find tallest item in each product category • SELECT ProdDesc, MAX(Height) FROM tbl_Products GROUP BY CategoryID • We'll return to this problem on Day 12 • Gotcha 2: Always GROUP BY key fields! • e.g., try to group by name… but two different John Smiths get grouped together incorrectly. • Related to Gotcha 1 http://www.cse.msu.edu/~cse103 U:/msu/course/cse/103
More GROUP BY gotchas • Gotcha 3: Difference between WHERE and HAVING: • SELECT LastName, AVG(Height) FROM tbl_People WHERE Height >= 84 GROUP BY LastName • SELECT LastName, AVG(Height) FROM tbl_People GROUP BY LastName HAVING AVG(Height) >= 84 • WHERE applies before the aggregate is considered (e.g., take out everyone under 84, then find the average) • HAVING applies to the aggregate condition (e.g., take the average for everyone, then remove those whose AVERAGE is under 84) • Check design view CAREFULLY! http://www.cse.msu.edu/~cse103 U:/msu/course/cse/103
GROUP BY in MySQL • Remember: SELECT, FROM, WHERE, GROUP BY, HAVING, ORDER BY • How many movies of each GenreType were made in 2004? Sort the list so the most common GenreType is first [21; first row is DRA/731] • Find the earliest year Jet Li starred in movies for each of the different languages his movies have been in (use LanguageCodes), considering only languages that he has done at least 3 movies in. Show only the Year and LanguageCode. Sort the list chronologically. [3; first row is 1979/CT] http://www.cse.msu.edu/~cse103 U:/msu/course/cse/103
GROUP BY in Access • Use Aggregate functions with GROUP BY to construct the following queries (music-day8): • Find the number of Albums on each label (LabelIDs are fine) (qry_day11_AlbumLabels) [29 / 27(xtra join)] • For each artist (ID), find the number of their MP3s, the total space they take up, and the average filesize. (qry_day11_ArtistMP3s) [21 or 20 (extra join)] • Find all albums (IDs) with only one disc and with total runtime longer than 45 minutes. Sort by total runtime. (qry_day11_LongAlbums) [10 records returned] http://www.cse.msu.edu/~cse103 U:/msu/course/cse/103
Homework • Check the HOMEWORK link from today’s classwork page • Highlights: • More practice with aggregate queries • Read Day 18 on using queries as a basis for other queries http://www.cse.msu.edu/~cse103 U:/msu/course/cse/103