620 likes | 632 Views
SQL, Data Storage Technologies, and Web-Data Integration. Week 4. Today’s Agenda. Review Intro to SQL Continued SELECT, GROUP BY, HAVING, DELETE, UPDATE “Advanced” SQL Joins, Functions, Locking tables, Transactions. Week 3 Review. Physical database design Column options
E N D
SQL, Data Storage Technologies, and Web-Data Integration Week 4
Today’s Agenda • Review • Intro to SQL Continued • SELECT, GROUP BY, HAVING, DELETE, UPDATE • “Advanced” SQL • Joins, Functions, Locking tables, Transactions
Week 3 Review • Physical database design • Column options • NOT NULL, DEFAULT, AUTO_INCREMENT, PRIMARY KEY • Client/Server Architecture • Connecting to SQL • command line mysql client • Introduction to SQL • SHOW, USE, CREATE, INSERT
Selecting Data • Syntax • SELECT column_name,… FROM TableName • Use a “*” in place of the column_name list to retrieve all columns • Examples: • Show me all the data stored about our donors • mysql> SELECT * FROM Donor; • What are all the names of all our donors? • mysql> SELECT name FROM Donor;
Being More Specific • Syntax • SELECT column_name,… FROM TableName WHERE statement • Example • Show me all the donors with the name “Jake Johnson” • mysql> SELECT * FROM Donor WHERE name = ‘Jake Johnson’
The LIKE comparison • Uses wildcard characters to match column data • ‘_’ represent any one character • SELECT name FROM Donor WHERE name LIKE ‘_ob’ • Matches for “Bob”, “Rob”, “Job”, etc. • ‘%’ represents any number of characters • Select name FROM Fruit WHERE name LIKE ‘%apple’ • Matches for “Pineapple” and “Apple”
More LIKE • Find all donors whose name starts with a "J”: • mysql> SELECT * FROM Donor WHERE name LIKE ‘J%’; • Use “AND” or “OR” to add multiple restrictions in your WHERE clause • Find all donors whose name starts with a “J” and have a 206 area code • mysql> SELECT * FROM Donor WHERE name LIKE “J%” AND phone_number LIKE “206%”;
Other Constraints • GROUP BY • Fun with aggregates • HAVING
Ordering Your Data • Syntax • SELECT column_name,… FROM TableName ORDER BY column_name, … • Example • List all donors in alphabetical order • mysql> SELECT * FROM Donor ORDER BY name;
More Ordering • Feel free to combine with other constraints, such as WHERE • mysql> SELECT * FROM Donor WHERE name like ‘J%’ ORDER BY name; • You can order by more than one column • SELECT * FROM Donor ORDER BY lastname, firstname • Swap the order with DESC or ASC • mysql> SELECT * FROM Donor WHERE name like ‘J%’ ORDER BY name DESC;
Grouping your data • The GROUP BY clause groups data together so that aggregate functions can be performed on it. • Very common for reports and statistics • More interesting with large sets of data
Piping SQL commands to MySQL • Sometimes we have a big file of SQL commands that we want to run. • Quit your mysql client application • mysql> quit; • Retrieve, and then upload to your dante account: • https://courses.washington.edu/wtcampus/spring/examples/sql/donation.sql • Look at the big file of SQL command • $ less donation.sql • Use a Unix “pipe” to send the file of commands to MySQL • $ /usr/local/mysql-5.0.67-linux-i686/bin/mysql –u uwnetid –p uwnetid < donation.sql • The “<“ operator takes all the lines of text from donations.sql, and sends them to MySQL
Back to Grouping • What does the Donation table look like? • mysql> DESCRIBE Donation;
Group By • Syntax • SELECT column_name,… FROM TableName GROUP BY column_name, … • Example • What is the total amount donated by each donor? • mysql> SELECT donorid, SUM(amount) FROM Donation GROUP BY donorid;
GROUP BY with other constraints • GROUP BY must come after any WHERE clause • GROUP BY must come before any LIMIT or ORDER BY clause
GROUP BY • SELECT column_name (or aggregate function), … FROM TableName WHERE clause GROUP BY column_name, … • You can GROUP BY multiple columns • Example: • How many of our donors have the same name? • SELECT fname, lname, COUNT(*) FROM Donor GROUP BY fname, lname;
HAVING clause • Syntax • SELECT FROM column_name,… FROM TableName HAVING statement • “statement” is the same set of conditionals that the WHERE clause has • So what is the difference between HAVING and WHERE?
HAVING vs. WHERE • The WHERE clause happens as MySQL is looking through its table • The HAVING clause happens on the rows returned by the WHERE clause • mysql> SELECT * FROM Donor HAVING name = ‘Jake Johnson’; • Twice as slow! First scan the Donor table for all the rows, then scan all the rows again for names matching ‘Jake Johnson’.
HAVING • Let's say we're interested in sending a letter to our top donors – those who donated more than $150. • Use the GROUP BY clause and the SUM aggregate function to get a list of the total amounts. • Adding the HAVING clause we can further restrict the results. • mysql> SELECT donorid, SUM(amount) FROM Donation GROUP BY donorid HAVING SUM(amount) > 150;
Deleting rows • Syntax • DELETE FROM TableName [whereclause] • Example • mysql> DELETE FROM Donation;
Deleting is too Easy! • Rows are hard to create, easy to destroy! • Always use a WHERE clause! • Example: • mysql> DELETE FROM Donor WHERE name = ‘Jake Johnson’; • Best to write a SELECT first
Updating data • Syntax • UPDATE TableName SET column_name = value [where_clause] • Let’s learn our lesson from delete, and always use the WHERE clause • Example: • mysql> UPDATE DONOR SET address = ‘123 Home Lane’, phone_number = ‘555-1212’ WHERE Donorid = 1;
Practice: Using the Aggregates • How many donations has each donor made? • What is the maximum donation amount made by each donor? • What are the donorIDs for the top ten donors? • Of those who donated in 2003, what are the donorIDs of the ten worst donors in 2003? • What is the total amount of donations we’ve received in 2004? • What are the donorIDs for the ten best and ten worst donors?
“Advance” SQL • Joining tables • Inner vs. Outer • Built in functions • Table locking • Transactions
Joining • Our queries on the Donation table only return the DonorID. • Typically, we want to know the Donor’s name, not their ID. • We could do two selects and collate the data • SELECT donorid, SUM(amount) FROM Donation GROUP BY donorid; • SELECT donorid, name FROM Donor; • match these up in our code • Or, we could simply do this with one query
Joining • So far we’ve seen SELECTs on a single table • How is this any better than using a Berkley DB or text file on a local computer? • Joins allow us to select information from more than one table and model the relationships in the conceptual model • We don’t want to know the donorIDs, we want to know the donor names!
Simple Joining • SELECT * FROM Table1, Table2; • Listing multiple tables after “FROM” joins those tables together • This effectively creates a new schema, with new tuples. • Donor(donorid, name, address) • Donation(donationid, amount, donorid) • DonorDonation(donorid, name, address, donationid, amount, donorid)
Cartesian (Cross) Product • New “virtual” schema: • DonorDonation(donorid, name, address, donationid, amount, donorid) • New tuples: • (1, “Bob”, “123 St.”, 8, 50.00, 1) • (2, “Sue”, “345 Pl.”, 8, 50.00, 1) • (3, “Joe”, “678 Rd.”, 8, 50.00, 1) • Every row in table A is joined with every row in table B (A x B). • mysql> SELECT * FROM Donor, Donation; • 500 Donors x 3000 Donations = 1.5 Million rows!!
Enforcing the relationship • (1, “Bob”, “123 St.”, 8, 50.00, 1) • (2, “Sue”, “456 Pl.”, 8, 50.00, 1) • (3, “Joe”, “789 Rd.”, 8, 50.00, 1) • Knowing we have a Donation of (8, 50.00, 1), we are only interested in the row where the Donor was (1, “Bob”, “123 St.”) • Solution: Use a WHERE clause just like we did before
Enforcing the Relationship • Our new, new “virtual” schema: • DonorDonation(Donor.donorid, Donor.name, Donor.address, Donation.donationid, Donation.amount, Donation.donorid) • Select just the tuples that have matching donorIDs: • mysql> SELECT * FROM Donor, Donation WHERE Donor.donorid = Donation.donorid;
Enforcing the relation • Tuples now only have data where the donorIDs match • (1, “Bob”, “123 St.”, 8, 50.00, 1) • (2, “Sue”, “456 Pl.”, 19, 175.00, 2) • (2, “Sue”, “456 Pl.”, 33, 25.00, 2) • Our Donor - Donation relationship is now successfully modeled • One Donor (i.e.: “Sue”) has one or many Donations (i.e.: 175.00, 25.00)
Refining your join • Just like a single table SELECT statement, you can refine your multiple table SELECT statement • AND, OR, GROUP BY, HAVING, ORDER BY, LIMIT • Example: What are the names of the top five donors that have donated at least $150, and how much have they donated? SELECT Donor.name, SUM(Donation.amount) FROM Donor, Donation WHERE Donor.donorid = Donation.donorid GROUP BY Donor.donorid HAVING SUM(Donation.amount) > 150 ORDER BY SUM(Donation.amount) DESC LIMIT 5;
More than two tables • SELECT * FROM Donor, Donation, Processor WHERE Donor.donorid = Donation.donorid AND Donation.processorid = Processor.processorid; • Order of the tables in not important
The Equality Test • Typically you need an equality test for each extra table you add to the FROM clause. • The equality checks are almost always between the primary key and the foreign keys of tables. (That’s why the foreign keys are there!)
Outer Joins • The joins we’ve looked at only return a Donor who has made a Donation • What if we want to know which Donors have not made any Donations? • The solution is to use an Outer Join (MySQL supports this through the Left Join command)
Outer Joins • An Outer Join will take all the rows from the Left table (or the Right, depending on the SQL/RDBMS), without requiring a match on the other table.
Outer Joins • SELECT columns FROM Table1 LEFT JOIN Table2 ON equality_test [WHERE|GROUP BY|etc.] • Example: • mysql> SELECT Donor.name, Donation.donationid FROM Donor LEFT JOIN Donation ON Donor.donorid = Donation.donorid;
Outer Joins • An outer join will Null fill any columns from Table2 where the ON statement doesn’t match. • (1, “Bob”, “123 St.”, 8, 50.00, 1) • (2, “Sue”, “456 Pl.”, 19, 175.00, 2) • (3, “Joe”, “789 Rd.”, NULL, NULL, NULL) • If a tuple from Table1 can be joined with any tuple from Table2, it will not be Null filled.
Outer Joins • What if we want to know which Donors have not made any Donations? • mysql> SELECT Donor.name, Donation.amount FROM Donor LEFT JOIN Donation ON Donor.donorid = Donation.donorid WHERE Donation.amount IS NULL;
A Lot of Typing • SELECT Donor.name, SUM(Donation.amount), Processor.name FROM Donor LEFT JOIN Donation ON Donor.donorid = Donation.donorid LEFT JOIN Processor ON Donation.processorid = Processor.processorid GROUP BY Donor.donorid
Less Typing with Aliases • You can give your Tables nicknames: • SELECT Dr.name, SUM(Dn.amount), P.name FROM Donor AS Dr LEFT JOIN Donation AS Dn ON Dr.donorid = Dn.donorid LEFT JOIN Processor AS P ON Dn.processorid = P.processorid GROUP BY Dr.donorid
Renaming Output • You can give your selected columns nicknames too • SELECT Donor.name, SUM(Donation.amount) AS total, Processor.name FROM Donor LEFT JOIN Donation ON Donor.donorid = Donation.donorid LEFT JOIN ON Processor ON Donation.processorid = Processor.processorid GROUP BY Donor.donorid ORDER BY total • You can’t always use aggregate functions in your ORDER BY, and you can’t always use them in your HAVING clause
Joining to Other Databases • Sometimes you may want to share a database with other databases • Example: You have a “Users” database that is shared between two applications, each of which has its own database. • SELECT C.name, CN.nickname FROM Users.Customer AS C, CustomerNickname AS CN WHERE C.customerid = CN.customerid;
SQL Functions • MySQL provides a lot of functions to munge the results of a query • Example, returning a date • SELECT date FROM Donation WHERE donationid=1; • 2004-10-14 15:52:08 • Not very “pretty” for a user to see
SQL Functions • Use the FORMAT_DATE() function instead! • SELECT FORMAT_DATE(date, “%m/%d/%y”) FROM Donation WHERE donationid=1; • 10/13/04 • Prettier! • SELECT FORMAT_DATE(date, “%M %D, %Y”) FROM Donation WHERE donationid=1; • October 13th, 2004
Example using functions • Return a nicely formatted list of donation dates made in the last 10 days • mysql> SELECT DATE_FORMAT(date, ‘%m/%d/%Y’) FROM Donation WHERE date > DATE_SUB(NOW(), INTERVAL 10 DAY);
Practice: Joining Tables mysql> SOURCE non-profit.sql • Who Processed the most Donations? • Which Donors have made no Donations? • Which Division received the most money? • Which Donor gave the most to Healthcare?