280 likes | 740 Views
COUNTING SQL has the ability to count and aggregate values across tuples. However since each SQL expression in the WHERE clause refers to individual tuples, counting usually is the last step. Recall that an SQL expression returns a bag of tuples.
E N D
COUNTING • SQL has the ability to count and aggregate values across tuples. • However since each SQL expression in the WHERE clause refers to individual tuples, counting usually is the last step. • Recall that an SQL expression returns a bag of tuples. SELECT count(T.studid), count(DISTINCT T.sectionid) FROM Transcript T WHERE T.semester=‘Fall’ AND T.year=2002 AND T.grade=‘A’ AND T.crscode=‘CSCI4380’ How many tuples are returned by this expression? 4380 Database Systems - Fall 2002
GROUP BY • Suppose we want to find how generous each professor is, by counting the total number of ‘A’s they give every semester in each class, and even compare with the class size. • We would like to return a relation with schema: • ProfId, SpecificClass, TotalAs, TotalAsByPercentage which lists the total number of As by the specific faculty in a specific class. 4380 Database Systems - Fall 2002
GROUP BY First, match professors with the grades. SELECT C.instructorid, C.csrcode, C.semester, C.year, C.sectionNo, T.grade FROM Transcript T, Class C WHERE C.crscode = T.crscode AND C.semester = T.semester AND C.year = T.year AND C.sectionNo = T.sectionNo AND T.grade=‘A’ The next step is to count, but the following does not give us what we wanted: SELECT count(T.studId) FROM Transcript T, Class C WHERE C.crscode = T.crscode AND C.semester = T.semester AND C.year = T.year AND C.sectionNo = T.sectionNo AND T.grade 4380 Database Systems - Fall 2002
GROUP BY Instead, first group tuples into groups for each specific instructor and course that they taught. FROM Transcript T, Class C WHERE C.crscode = T.crscode AND C.semester = T.semester AND C.year = T.year AND C.sectionNo = T.sectionNo AND T.grade=‘A’ GROUP BY C.instructorid, C.crscode, C.semester, C.year, C.sectionNo Instructor 1, 4380, Fall 2002, Section 1 Instructor 2, 4380, Fall 2002, Section 2 Instructor 3, 4380, Spring 2002, Section 1 Instructor 1, 4380, Fall 2001, Section 1 4380 Database Systems - Fall 2002
GROUP BY For each group of tuples, we can now compute the necessary aggregates. We generate a new tuple for each different group. SELECT C.instructorid, C.crscode, C.semester, C.year, C.sectionNo, count(T.studId) TotalAs FROM Transcript T, Class C WHERE C.crscode = T.crscode AND C.semester = T.semester AND C.year = T.year AND C.sectionNo = T.sectionNo AND T.grade=‘A’ GROUP BY C.instructorid, C.crscode, C.semester, C.year, C.sectionNo Note that in this grouping, we cannot compare the total number of As with the total class size. We have already eliminated the students who did not get As. We will see how to do this next class. 4380 Database Systems - Fall 2002
GROUP BY For all students, find the total number of credits hours they have completed. SELECT T.studId, sum(C.credithours) TotalHrs, count(T.crscode)/count(DISTINCT T.crscode) Repeats FROM Transcript T, Course C WHERE C.crscode = T.crscode AND UPPER(T.grade) IN (‘A’,’B’,’C’,’D’) GROUP BY T.studId Note that this query does not take into account if the student took the same course more than once. Then, the credit hours should only count once. We will see how to eliminate these tuples next class. 4380 Database Systems - Fall 2002
GROUP BY / HAVING For all students, find the total number of credits hours they have completed. But, only return the students with at least 100 total credits hours. SELECT T.studId, sum(C.credithours) TotalHrs, count(T.crscode)/count(DISTINCT T.crscode) Repeats FROM Transcript T, Course C WHERE C.crscode = T.crscode AND UPPER(T.grade) IN (‘A’,’B’,’C’,’D’,’F’) GROUP BY T.studId HAVING sum(C.credithours) >= 100 4380 Database Systems - Fall 2002
GROUP BY / HAVING • Group by creates groups, each containing a bag of tuples. • The Having clause applies to each single group. • If the group satisfies the having condition, then the whole group passes. Otherwise, the whole group is eliminated. • For each group, create a new tuple in the select statement. You can include some or all of the grouping attributes, and aggregate conditions. • Each aggregate condition is applied to each group separately. 4380 Database Systems - Fall 2002
WRONG!!! WRONG!!! Find two things that are wrong with these statements!!! SELECT T.studId, T.sectionId, sum(C.credithours) TotalHrs FROM Transcript T, Course C WHERE C.crscode = T.crscode AND UPPER(T.grade) IN (‘A’,’B’,’C’,’D’,’F’) GROUP BY T.studId HAVING C.credithours >= 4 4380 Database Systems - Fall 2002
WRONG!!! WRONG!!! Find two things that are wrong with these statements!!! SELECT T.studId, T.sectionId, sum(C.credithours) TotalHrs FROM Transcript T, Course C WHERE C.crscode = T.crscode AND UPPER(T.grade) IN (‘A’,’B’,’C’,’D’,’F’) GROUP BY T.studId HAVING C.credithours >= 4 • T.sectionId is not a unique value for each group!!! The statement is ambiguous • C.credithours is not a unique value for each group!!! The condition C.credithours >= 4 belongs in the WHERE clause. 4380 Database Systems - Fall 2002
ORDER BY • The order by statement after select is used to order the tuples in the result. It is a simple sort statement. SQL complete: (SELECT […] FROM […] WHERE […] GROUP BY […] HAVING […] ) UNION […] UNION (SELECT […] FROM […] WHERE […] GROUP BY […] HAVING […] ) ORDER BY […] 4380 Database Systems - Fall 2002
NULL VALUES • A null value usually means there is no value for a specific value. The reasons may be: • Value does not exist (yet). The grade for a course in progress does not exist. • Value is not known. We may know that a person has a phone, but we do not know the phone number. • It is not known whether a value exists or not. A student may or may not have non-campus e-mail address. 4380 Database Systems - Fall 2002
NULL VALUES • To check whether a value is null or not, a specific predicate is used: WHERE T.grade IS NULL WHERE T.grade IS NOT NULL • For regular comparison conditions and other predicates, when the compared values are null, the condition evaluates to “unknown”. • WHERE T.grade = ‘A’ evaluates to unknown if T.grade is null • WHERE T.grade=T2.grade evaluates to unknown if either T.grade or T2.grade (or both) are null. 4380 Database Systems - Fall 2002
NULL VALUES • Furthermore, we are given the following: • UNKNOWN AND TRUE = UNKNOWN • UNKNOWN OR TRUE = TRUE • UNKNOWN AND FALSE = FALSE • UNKNOWN OR FALSE = UNKNOWN • NOT (UNKNOWN) = UNKNOWN • UNKNOWN OR UNKNOWN = UNKNOWN • UNKNOWN AND UNKNOWN = UNKNOWN • Note that a tuple is returned by the WHERE clause, only if the condition evaluates to true. 4380 Database Systems - Fall 2002
NULL VALUES SELECT T.studId, T.grade FROM Transcript T WHERE T.semester=‘Fall’ AND T.year=2002 AND NOT (T.grade IN (‘A’,’B’,’C’,’D’,’F’)) Does this select all null grades? Better select WHERE T.grade IS NULL or T.grade=‘I’ 4380 Database Systems - Fall 2002
INNER JOIN Find all faculty and the classes they are teaching in Spring 2002. If they are not teaching a course, then simply return a null value next to the faculty name. First try: SELECT F.name, C.crscode, C.sectionNo FROM Faculty F, Class C WHERE C.instructorId=F.id AND C.semester=‘Spring’ AND C.year=2002 Unfortunately, this eliminates all instructors who do not teach in this semester. 4380 Database Systems - Fall 2002
INNER JOIN Second try: SELECT F.name, C.crscode, C.sectionNo FROM Faculty F, Class C WHERE C.instructorId=F.id AND C.semester=‘Spring’ AND C.year=2002 UNION SELECT F.name FROM Faculty F WHERE (check if they have not taught courses in the Spring - we’ll see how next class) Too long, and the two relations are not union compatible. 4380 Database Systems - Fall 2002
OUTER JOIN • A JOIN B, inner join selects tuples that satisfy a join condition, eliminates all tuples that do not satisfy the join condition. A is called the left operand and B is the right operand of the join operation. • A LEFT OUTER JOIN B returns all tuples in the inner join as well as the tuples in A that do not join with any tuples in in B. • A RIGHT OUTER JOIN B returns all tuples in the inner join as well as the tuples in B that do not join with any tuples in in A. • A FULL OUTER JOIN B returns all tuples in the inner join as well as the tuples from A and B that do not participate in the inner join. 4380 Database Systems - Fall 2002
OUTER JOIN PARTSUPP JOIN SUPPLY ON Supplier=ID (3 tuples) PARTSUPP LEFT OUTER JOIN SUPPLY ON Supplier=Id 4380 Database Systems - Fall 2002
INNER JOIN 4380 Database Systems - Fall 2002
OUTER JOIN 4380 Database Systems - Fall 2002
OUTER JOIN 4380 Database Systems - Fall 2002
OUTER JOIN Find all faculty and the classes they are teaching in Spring 2002. If they are not teaching a course, then simply return a null value next to the faculty name. SELECT F.name, C.crscode, C.sectionNo FROM Faculty F LEFT OUTER JOIN Class C ON F.id = C.instructorId WHERE C.semester=‘Spring’ AND C.year=2002 If the faculty is not teaching any courses, then crscode and sectionNo fields will simply be null. 4380 Database Systems - Fall 2002
CREATE VIEW • To create a view, any select statement can be used. CREATE VIEW Spring2002Teaching(Faculty, CrsCode, SectionId) AS SELECT F.name, C.crscode, C.sectionNo FROM Faculty F LEFT OUTER JOIN Class C ON F.id = C.instructorId WHERE C.semester=‘Spring’ AND C.year=2002 4380 Database Systems - Fall 2002
CREATE VIEW • When a view is created, it is saved as a definition. Views are simply saved queries. • Views can be queried in the same way as an ordinary table. • The tuples in the view are constructed from the base tables whenever a view is queried. • Find all faculty who is not teaching a course in Spring 2002: SELECT S.faculty FROM Spring2002Teaching S WHERE S.crscode is null 4380 Database Systems - Fall 2002
INSERT • To insert a new tuple: INSERT INTO faculty(Id, Name, DeptId) VALUES (10, ‘Legolas’, ‘ELF’) All unnamed attributes will be appended NULL values for this tuple. • To insert a number of tuples, use a select statement: INSERT INTO STUDENT(Id, Name) SELECT 10000+F.Id, F.Name FROM Faculty F WHERE F.deptid = ‘CS’ 4380 Database Systems - Fall 2002
DELETE • Deleting tuples that satisfy a specific condition: DELETE FROM CLASS C WHERE C.year < 1998 • Delete all tuples: DELETE FROM CLASS C 4380 Database Systems - Fall 2002
UPDATE • Update values in the tuples that satisfy the where condition: UPDATE Transcript T SET T.grade = ‘I’ WHERE T.year=2002 AND T.semester=‘Spring’ AND T.grade is null 4380 Database Systems - Fall 2002