140 likes | 153 Views
Learn about the use of INNER and OUTER JOINs, as well as subqueries in SQL. Discover how to solve complex questions and make virtual tables using nested SELECT statements.
E N D
CSE 103 • Students:Review INNER and OUTER JOINs, Subqueries. • Others:Please save your work and log out by 10:10. http://www.cse.msu.edu/~cse103 U:/msu/course/cse/103
Day 12 Review • What kinds of questions can we solve with subqueries we couldn't otherwise? • What kinds of questions can we solve with self-JOINs? • How can we use the same table more than once in the same query? http://www.cse.msu.edu/~cse103 U:/msu/course/cse/103
Nested SELECTs in MySQL • MySQL does not (currently) allow views (saved queries) • In effect, using a saved query is the same as embedding that entire SELECT statement in the parent (calling) query • All RDBMS support this idea of using nested SELECTs (sometimes called subqueries) http://www.cse.msu.edu/~cse103 U:/msu/course/cse/103
Nested SELECT example • We can use nested queries as an alternate to JOINs: • SELECT BirthDate FROM tbl_BirthsWHERE PersonID = (SELECT PersonID FROM tbl_People WHERE FirstName = "Charlton" AND LastName = "Heston") • Same as: SELECT BirthDate FROM tbl_Births AS b INNER JOIN tbl_People AS p ON b.PersonID = p.PersonID WHERE FirstName = "Charlton" AND LastName = "Heston" • Subqueries must be completely enclosed in parantheses () http://www.cse.msu.edu/~cse103 U:/msu/course/cse/103
More on nested queries • If the subquery returns exactly one row and one field, you can test with =, or >, or <, etc. (for example finding people of "above average height") • If the subquery returns multiple rows and/or columns, it is best to use a virtual table and a JOIN (more on this later) http://www.cse.msu.edu/~cse103 U:/msu/course/cse/103
Basic nested queries exercises • Use nested queries to find these answers: • Find all people who have above average height [8946] • Find the name of the person or people who have the earliest birthday in 1970 [6] • Find all movies made in the same year as "Harold and Maude" [4106] http://www.cse.msu.edu/~cse103 U:/msu/course/cse/103
Subqueries to make virtual tables: Using SELECT in the FROM clause • Recall: music DB - we found albums with total runtimes greater than average • First had to find total runtimes, then average • Used stored queries • We need a calculation based on another calculation, so we nest queries • We have to SELECT … FROM a virtual table we create with a subquery (see next) http://www.cse.msu.edu/~cse103 U:/msu/course/cse/103
Nested queries in FROM example • Problem: find people who are older than the average age of all those in our db • Have to find ages, then average ages, then… • SELECT FirstName, LastNameFROM tbl_People AS p INNER JOIN tbl_Births AS b ON p.PersonID = b.PersonIDWHERE (YEAR(NOW()) - YEAR(BirthDate)) >(SELECT AVG(Age) FROM(SELECT YEAR(NOW()) - YEAR(BirthDate) AS Age FROM tbl_Births)AS vr_tbl_1 ) http://www.cse.msu.edu/~cse103 U:/msu/course/cse/103
Nested queries in FROM clause Example analysis • Yellow text (outermost query): selects final fields for display and restriction: ages have to be greater than average • Don't know average yet; write subqueries to find it and compare to that • Could test with say, > 50 to verify syntax • Blue text (1st subquery): selects average age for comparison w/outer query • It returns AVG(Age) but we have no field called Age • AVG() only works on a field; not on something like an entire table (even if the table has only one field, we have to specify the field) • Its FROM line has a subquery… • Green text (second subquery): selects everyone's Ages • We create calculated field called Age for use by outer query • In a FROM line, the subquery returns a (virtual) data table to be used • All tables in a FROM line need a name… • Red text: Provides the virtual table with a name, just like any alias • The alias goes outside the () around the subquery • Alias name only matters if you would need to JOIN, etc. with it http://www.cse.msu.edu/~cse103 U:/msu/course/cse/103
Nicolas Cage Example • Who has costarred with Mr. Cage? • Step 1: Find Cage's ActorID • Step 2: Use that ActorID to find the MovieIDs he's been in • Step 3: Use those MovieIDs to find the ActorIDs in those movies • Step 4: Use those ActorIDs to find the names of the other actors • Can solve using self-JOINs and/or subqueries • Mix & match as you like; Some solutions will be faster SELF-JOINs generally are very slow • Use subqueries instead of repeating a table, where possible http://www.cse.msu.edu/~cse103 U:/msu/course/cse/103
Nicolas Cage, continued • One possible solution (there are many): • SELECT p1.FirstName, p1.LastName FROM ((tbl_People AS p1 INNER JOIN tbl_Roles AS r1 ON p1.PersonID = r1.ActorID)INNER JOIN(SELECT ActorID, MovieID FROM tbl_Roles AS r2 INNER JOIN tbl_People AS p2 ON r2.ActorID = p2.PersonID WHERE p2.FirstName = 'Nicolas' AND p2.LastName = 'Cage')AS vr_tbl_1 ON r1.MovieID = vr_tbl_1.MovieIDORDER BY LastName, FirstName • Blue text: subquery to find NC's Movies • See the use of INNER JOIN with the virtual table produced by the subquery (red text) • For further thought: without restricting another firstname/lastname, make Nicolas Cage not star with himself. http://www.cse.msu.edu/~cse103 U:/msu/course/cse/103
Nested queries in FROM exercise • Find all the genres which had a higher-than-average number of movies made in 2003 (compared only to the average per genre for 2003 movies). Show the GenreName and that Number. • E.g., if the average Genre in '03 has 300 movies made, and Action has 500 while Comedy has 200, your query should return Action, 500 http://www.cse.msu.edu/~cse103 U:/msu/course/cse/103
Upcoming Bridge Task • Everyone should come; get next BT you haven't passed • This is 5th opportunity out of 12 • If you take & pass the 1.5, you have 7 more attempts to pass 3 BTs, but if you fail the 1.5, you have only 7 attempts for 4 BTs (less than 2 tries each) • Review text, Web and homework • Get help from helprooms (best if you missed entire days), office hours (good for specific questions), email (specific questions after hours) • If taking 2.0, must have C&U database from passed 1.5 • Bring any books, notes, back-up disk, etc. & a photo ID • Last BT opportunity before Spring Break! http://www.cse.msu.edu/~cse103 U:/msu/course/cse/103
Bridge Task 2.0 • Covers materials primarily from Days 10-13 • Advanced queries of all sorts • Functions (Day 10), Aggregates (Day 11), Stored queries/views (Day 12),[Nested] Subqueries (Day 13) • Also be sure you understand INNER/OUTER JOIN and Calculated fields, as many advanced problems rely on those more basic concepts • Review notes, homework, classwork • Notes are on the Web pages http://www.cse.msu.edu/~cse103 U:/msu/course/cse/103