320 likes | 404 Views
Agenda for Class - 03/04/2014. Answer questions about HW#5 and HW#6 Review query syntax. Discuss group functions and summary output with the GROUP BY statement. Introduce sub-queries. Structure of the SELECT statement. SELECT [all or distinct] FROM (table) WHERE (condition)
E N D
Agenda for Class - 03/04/2014 • Answer questions about HW#5 and HW#6 • Review query syntax. • Discuss group functions and summary output with the GROUP BY statement. • Introduce sub-queries.
Structure of the SELECT statement SELECT[all or distinct] FROM (table) WHERE(condition) GROUP BY (grouping fields) HAVING (condition) ORDER BY (sort fields) Referred to as the “SELECT LIST” Order of Actual Execution: FROM WHERE GROUP BY HAVING SELECT ORDER BY When a SELECT statement is executed, the result is referred to as a “result table”. It is a memory-based table.
Different SELECT options SELECT DISTINCT deptno FROM emp1 SELECT TOP 1 salary FROM emp1 ORDER BY salary SELECT TOP 1 salary FROM emp1 ORDER BY salary SELECT TOP 10 PERCENT salary FROM emp1 ORDER BY salary desc
What is a Group Function? • A way to summarize data and provide more meaningful and informative output from the database. Sometimes referred to as “summary queries” or “group functions.” • Group functions differ from single row SELECT statements: • A SELECT statement processes every row in the underlying table. The result table (unless a WHERE clause is used) contains one row per row in the underlying table. • A group function collects data from multiple rows and produces summarized data in the result table. There should be one row in the result table per group.
What do group functions produce? • If a group function is run on the whole table, without grouping, it generates a single row result table. • If a group function is run with grouping (the GROUP BY statement) then it generates one row per group in the result table.
Why are group functions important? • They provide information rather than data for the end-users of technology. • They help programmers understand large data sets.
Calculating Averages SELECT AVG(salary) FROM emp1; SELECT ROUND(AVG(salary),2) FROM emp1; SELECT ROUND(AVG(salary),2) FROM emp1 WHERE deptno = 10; Use the ROUND function to perform both a mathematical rounding operation and truncate the result to a set number of digits after the decimal point
Counting Rows SELECT COUNT(*) FROM emp1; SELECT COUNT(*) FROM emp1 WHERE deptno = 10; SELECT COUNT(*) FROM emp1 WHERE salary > 2000 and deptno = 10; SELECT COUNT(DISTINCT deptno) FROM emp1;
Finding Minimum and Maximum Values SELECT MIN(hiredate) FROM emp1; SELECT MAX(hiredate) FROM emp1; SELECT MIN(ename) FROM emp1; SELECT MAX(hiredate) FROM emp1 WHERE deptno = 10;
Groups with Calculations/Functions SELECT MAX(salary + ISNULL(comm,0)) FROM emp1; SELECT MAX(DATEDIFF(mm, hiredate, GETDATE()) FROM emp1;
SELECT COUNT(salary), SUM(salary), MIN(salary) FROM emp1 WHERE deptno = 10 and salary < 4000; Combining group functions
Group function issue… Combining group functions with single row values - doesn't work!! SELECT deptno, COUNT(salary), SUM(salary) FROM emp1
Creating summary output by grouping SELECT deptno, SUM(salary) FROM emp1 GROUP BY deptno; SELECT deptno, SUM(salary) FROM emp1 WHERE salary > 2000 GROUP BY deptno; Eliminates rows before the grouping occurs
Summary output with conditions SELECT deptno, SUM(salary) FROM emp1 GROUP BY deptno HAVING SUM(salary) > 6000; SELECT deptno, SUM(salary) FROM emp1 GROUP BY deptno HAVING AVG(salary) > 2000; The HAVING statement uses group functions for the condition or grouped attributes
Multi-attribute grouping SELECT deptno, job, SUM(salary), AVG(salary) FROM emp1 GROUP BY deptno, job;
A group function isn’t a condition!! SELECT MAX(salary) FROM emp1 • The statement above does identify the maximum salary from the table. • Now try to add the name of the employee to that query: SELECTename, MAX(salary) FROM emp1
GROUP BY may not work exactly as you expect… Example 1: What if you want to see the employee name of the employee who has the maximum salary? SELECT ename, MAX(salary) FROM emp1 GROUP BY ename; This code does not accomplish that goal!
Thoughts about fixing the problem… SELECT ename, max(salary) FROM emp1 WHERE salary = MAX(salary) GROUP BY ename; SELECT ename, max(salary) FROM emp1 GROUP BY ename HAVING max(salary) = salary; They don’t work!
Could use the TOP 1 SELECT... SELECT TOP 1 ename, salary, deptno FROM emp1 ORDER BY salary desc But this works only in the SQL Server environment because TOP 1 is not ANSI-Standard SQL
ANSI–Standard SQLNeeds a sub-query to work correctly Example 1 answer: SELECT ename, salary, deptno FROM emp1 WHERE salary = (SELECT MAX(salary) FROM emp1)
What is a sub-query? • A sub-query is a query embedded inside another query. • The sub-query is executed in the normal operation of the query in which it is embedded. • The sub-query will return an “answer” result table to the query in which it is embedded. • A sub-query can be placed in the SELECT list, FROM statement, WHERE clause &/or HAVING clause. • Two types of sub-queries: non-correlated and correlated.
Types of sub-queries • Non-correlated • Inner and outer queries do not exchange data. • Correlated • Inner and outer queries do exchange data.
Example 2 of a non-correlated sub-query Which employee has been employed with the organization for the longest time? What is the name and salary of that employee? Or another way to look at it: What is the name and salary of the employee who was hired first in our organization Write a query to accomplish that goal!! (Use slide #22 as your example)
Let’s expand our sample data to include a new table for department
Relational operators in a single table What if we want to see all the employees who are in deptno 20? No problem... SELECT * FROM emp1 WHERE deptno = 20
What if we want to see all employees who have any department that is in the dept1 table? Example 3 of a non-correlated sub-query: SELECT * FROM emp1 WHERE deptno IN (SELECT deptno FROM dept1)
What if we want to see all employees who don’t have a department that is in the dept1 table? Example 4 of a non-correlated sub-query: SELECT * FROM emp1 WHERE deptno NOT IN (SELECT deptno FROM dept1)
Example of the need for a correlated sub-query • Example 1 of a correlated sub-query: which employees have a higher salary than the average salary for their department? SELECT deptno, AVG(salary) FROM emp1 GROUP BY deptno; SELECT empno, ename, deptno, salary FROM emp1 ORDER BY deptno
A Correlated Sub-Query SELECT empno, ename, deptno, salary FROM emp1 outquery WHERE salary > (SELECT AVG(salary) FROM emp1 inquery WHERE outquery.deptno= inquery.deptno) Requires that the tables have aliases
What if we also want to see the average salary in the SELECT list? SELECT empno, ename, deptno, salary, (SELECT AVG(salary) FROM emp1 squery WHERE squery.deptno = outquery.deptno) FROM emp1 outquery WHERE salary > (SELECT AVG(salary) FROM emp1 inquery WHERE outquery.deptno = inquery.deptno)