340 likes | 536 Views
Structured Query Language. Basic Structure. SQL is based on set and relational operations with certain modifications and enhancements. A typical SQL query has the form: SELECT A 1 , A 2 , ..., A n FROM r 1 , r 2 , ..., r m WHERE P; A i s represent attributes
E N D
Basic Structure • SQL is based on set and relational operations with certain modifications and enhancements. • A typical SQL query has the form:SELECT A1, A2, ..., AnFROMr1, r2, ..., rmWHERE P; • Ais represent attributes • ris represent relations • P is a predicate. • This query is equivalent to the relational algebra expression. A1, A2, ..., An(P (r1 x r2 x ... x rm)) • The result of an SQL query is a relation.
The SELECT Clause • The SELECT clause corresponds to the projection operation of the relational algebra. It is used to list the attributes desired in the result of a query. Query: Retrieve the SSN values of the employees. SELECT SSN FROM EMPLOYEE; • In the “pure” relational algebra syntax, the query would be: SSN(EMPLOYEE) • An asterisk in the select clause denotes “all attributes” SELECT * FROM EMPLOYEE;
The SELECT Clause (Cont.) • SQL allows duplicates in relations as well as in query results. • To force the elimination of duplicates, insert the keyword DISTINCT after SELECT. Query: Retrieve all distinct salary of employees. SELECT DISTINCT SALARYFROM EMPLOYEE; • The keyword ALL specifies that duplicates not be removed. Query: Retrieve the salary of every employee. SELECT ALL SALARYFROM EMPLOYEE;
The SELECT Clause (Cont.) • The SELECT clause can contain arithmetic expressions involving the operation, +, –, , and /, and operating on constants or attributes of tuples. Query: Show the resulting salaries if every employee is given a 10 percent raise. SELECT FNAME, LNAME, 1.1*SALARYFROM EMPLOYEE; would return the names of all employees and the attribute SALARYis multiplied by 110%.
The WHERE Clause • The WHERE clause corresponds to the selection predicate of the relational algebra. If consists of a predicate involving attributes of the relations that appear in the FROM clause. Query: Retrieve the birth date and address of the employee(s) whose name is ‘John B. Smith’.SELECT BDATE, ADDRESSFROM EMPLOYEEWHERE FNAME = ‘John’AND MINIT = ‘B’ AND LNAME = ‘Smith’; • Comparison results can be combined using the logical connectives AND, OR, and NOT. • Comparisons can be applied to results of arithmetic expressions.
The WHERE Clause (Cont.) • SQL includes a BETWEEN comparison operator in order to simplify WHERE clauses that specify that a value be less than or equal to some value and greater than or equal to some other value. Query: Retrieve all employees in department 5 whose salary is between $30000 and $40000. SELECT*FROM EMPLOYEEWHERE (SALARY BETWEEN 30000 AND 40000) AND DNO = 5; The condition (SALARY BETWEEN 30000 AND 40000) in the above query is equivalent to the condition (SALARY >=30000 AND SALARY <= 40000).
The FROM Clause • The FROM clause corresponds to the Cartesian product operation of the relational algebra. It lists the relations to be scanned in the evaluation of the expression. Query: Find the Cartesian product EMPLOYEE X DEPARTMENT. SELECT FROM EMPLOYEE, DEPARTMENT; Query: Retrieve the name and address of all employees who work for the ‘Research’ department.SELECT FNAME, LNAME, ADDRESSFROM EMPLOYEE, DEPARTMENTWHERE DNAME=‘Research’ AND DNUMBER=DNO; The above query is similar to a SELECT-PROJECT-JOIN sequence of relational algebra where the condition DNAME=‘Research’ is a selection condition and DNUMBER=DNO is a join condition.
The RENAME Operation • The SQL allows renaming relations and attributes using the AS clause: old-name AS new-name Query: Show the resulting salaries if every employee is given a 10 percent raise. SELECT FNAME, LNAME, 1.1*SALARY AS INCREASED_SALARY FROM EMPLOYEE;
Tuple Variables • Tuple variables are defined in the FROM clause via the use of the AS clause. Query: For each employee, retrieve the employee’s first and last name and the first and last name of his or her immediate supervisor. SELECT E.FNAME, E.LNAME, S.FNAME, S.LNAME FROM EMPLOYEE AS E, EMPLOYEE AS SWHERE E.SUPERSSN=S.SSN; • We can think of E and S as two different copies of the EMPLOYEE relation. It is possible to rename the relation attributes: EMPLOYEE AS E(FN, MI, LN, SSN, BD, ADDR, SEX, SAL, SSN, DNO)
String Operations • SQL includes a string-matching operator for comparisons on character strings. Patterns are described using two special characters: • percent (%). The % character matches any substring. • underscore (_). The _ character matches any character. Query: Retrieve all employees whose address is in Houston, Texas. SELECT FNAME, LNAMEFROM EMPLOYEEWHERE ADDRESSLIKE ‘%Houston, TX%’; Query: Find all employees who were born during the 1950s. SELECT FNAME, LNAMEFROM EMPLOYEEWHERE BDATELIKE ‘195_ _ _ _ _ _ _’;
String Operations • Match the name ‘AB_CD%EF’ LIKE‘AB\_CD\%EF’ESCAPE ‘\’ • SQL supports a variety of string operations such as • concatenation (using “||”) • converting from upper to lower case (and vice versa) • finding string length, extracting substrings, etc.
Ordering the Display of Tuples Query: Retrieve a list of employees and the projects they are working on, ordered by department and within each department, ordered alphabetically by last name, first name. SELECT DISTINCT DNAME, LNAME, FNAME, PNAMEFROM DEPARTMENT, EMPLOYEE, WORKS_ON, PROJECTWHERE DNUMBER=DNOAND SSN=ESSN AND PNO=PNUMBER ORDER BY DNAME, LNAME, FNAME; • We may specify DESC for descending order or ASC for ascending order, for each attribute; ascending order is the default. • E.g. ORDER BY DNAME DESC, LNAME ASC, FNAME ASC
Set Operations • The set operations UNION, INTERSECT, and EXCEPT operate on relations and correspond to the relational algebra operations • Each of the above operations automatically eliminates duplicates; to retain all duplicates use the corresponding multiset versions UNION ALL, INTERSECT ALL and EXCEPT ALL.Suppose a tuple occurs m times in r and n times in s, then, it occurs: • m + n times in r UNION ALL s • min(m,n) times in rINTERSECT ALL s • max(0, m – n) times in rEXCEPT ALL s
Set Operations Query: Make a list of all project numbers for projects that involve an employee whose last name is ‘Smith’, either as a worker or as a manager of the department that controls the project. (SELECTDISTINCT PNUMBER FROM PROJECT, DEPARTMENT, EMPLOYEE WHERE DNUM=DNUMBER AND MGRSSN=SSN AND LNAME=‘Smith’);UNION (SELECTDISTINCTPNUMBER FROMPROJECT, WORKS_ON, EMPLOYEE WHERE PNUMBER=PNO AND ESSN=SSN AND LNAME=‘Smith’);
Aggregate Functions • These functions operate on the multiset of values of a column of a relation, and return a value. AVG: average valueMIN: minimum valueMAX: maximum valueSUM: sum of valuesCOUNT: number of values
Aggregate Functions (Cont.) Query: Find the sum of the salaries of all employees, the maximum salary, the minimum salary, and the average. SELECT SUM(SALARY), MAX(SALARY), MIN(SALARY), AVG(SALARY),FROMEMPLOYEE; Query: Retrieve the total number of employees in the research department SELECT COUNT (*)FROM EMPLOYEE, DEPARTMENT WHERE DNO=DNUMBER AND DNAME=‘Research’; Query: Count the number of distinct salary values in the database. SELECT COUNT (DISTINCT SALARY)FROM EMPLOYEE;
Aggregate Functions – GROUP BY Query: For each department, retrieve the department number, the number of employees in the department and their salary. SELECT DNO, COUNT(*), AVG(SALARY)FROM EMPLOYEEGROUP BY DNO; Note: Attributes in SELECT clause outside of aggregate functions must appear in GROUP BY list. Query: For each project retrieve the project number, the project name, and the number of employees who work on that project. SELECT PNUMBER, PNAME, COUNT(*)FROM PROJECT, WORKS_ON WHERE PNUMBER=PNOGROUP BY PNUMBER, PNAME;
Aggregate Functions – HAVING Clause Query: For each project on which more than two employees work, retrieve the project number, the project name, and the number of employees who work on the project. SELECTPNUMBER, PNAME, COUNT(*)FROMPROJECT, WORKS_ON WHERE PNUMBER=PNO GROUP BYPNUMBER, PNAME HAVING COUNT(*) > 2; Note: predicates in the HAVING clause are applied after the formation of groups whereas predicates in the WHERE clause are applied before forming groups.
NULL Values • It is possible for tuples to have a null value, denoted by NULL, for some of their attributes. • NULL signifies an unknown value or that a value does not exist. • The predicate is NULL can be used to check for null values. Query: Retrieve the names of all employees who do not have supervisors. SELECTFNAME, LNAMEFROMEMPLOYEEWHERE SUPERSSNis NULL; • The result of any arithmetic expression involving NULL is NULL • E.g. 5 + NULL returns NULL. • All aggregate operations except COUNT(*) ignore tuples with null values on the aggregated attributes.
NULL Values and Three Valued Logic • Any comparison with NULL returns UNKNOWN • E.g. 5 < NULLorNULL <> NULLor NULL = NULL • Three-valued logic using the truth value UNKNOWN: • OR: (UNKNOWN OR TRUE) = TRUE, (UNKNOWN OR FALSE) = UNKNOWN (UNKNOWNORUNKNOWN) = UNKNOWN • AND: (TRUE AND UNKNOWN) = UNKNOWN, (FALSE AND UNKNOWN) = FALSE, (UNKNOWNANDUNKNOWN) = UNKNOWN • NOT: (NOTUNKNOWN) = UNKNOWN • “P is UNKNOWN” evaluates to true if predicate P evaluates to UNKNOWN • Result of WHERE clause predicate is treated as FALSEif it evaluates to UNKNOWN.
Nested Subqueries • SQL provides a mechanism for the nesting of subqueries. • A subquery is a SELECT-FROM-WHERE expression that is nested within another query. • A common use of subqueries is to perform tests for set membership, set comparisons, and set cardinality.
Example Query Query 1: Retrieve the name and address of all employees who work for the 'Research' department. SELECT FNAME, LNAME, ADDRESS FROM EMPLOYEE WHERE DNO IN (SELECT DNUMBER FROM DEPARTMENT WHERE DNAME='Research' ); • The nested query selects the number of the 'Research' department. • The outer query select an EMPLOYEE tuple if its DNO value is in the result of either nested query. • The comparison operator IN compares a value v with a set of values V, and evaluates to TRUE if v is one of the elements in V.
Example Query Query: Find the SSN of all employees who work the same (project, hours) combination on some project that employee whose SSN=‘123456789’ works on. SELECT DISTINCTESSNFROM WORKS_ONWHERE (PNO, HOURS)IN (SELECTPNO, HOURSFROMWORKS_ON WHERE SSN=‘123456789’);
Nested Subqueries • Correlated nested queries: • If a condition in the WHERE-clause of a nested query references an attribute of a relation declared in the outer query , the two queries are said to be correlated. • The result of a correlated nested query is different for each tuple (or combination of tuples) of the relation(s) the outer query.
Example Query Query: Retrieve the name of each employee who has a dependent with the same first name as the employee. SELECT E.FNAME, E.LNAME FROM EMPLOYEE AS E WHERE E.SSN IN (SELECT ESSN FROM DEPENDENT WHERE ESSN=E.SSN AND E.FNAME=DEPENDENT_NAME); • The nested query has a different result for each tuple in the outer query.
Example Query A query written with nested SELECT... FROM... WHERE... blocks and using the = or IN comparison operators can alwaysbe expressed as a single block query. The formulation above is simply to illustrate SQL features. For example, the previous query can be written in a much simpler manner: SELECT E.FNAME, E.LNAME FROM EMPLOYEE E, DEPENDENT D WHERE E.SSN=D.ESSN AND E.FNAME=D.DEPENDENT_NAME;
0 5 6 Definition of SOME Clause • F <comp> SOME r t r s.t. (F <comp> t)Where <comp> can be: (5< SOME ) = true (read: 5 < some tuple in the relation) 0 ) = false (5< SOME 5 0 ) = true (5 = SOME 5 0 (5 SOME ) = true (since 0 5) 5 (= SOME) IN However, ( SOME) NOT IN
0 5 6 Definition of ALL Clause • F <comp> ALL r t r (F <comp> t) (5< ALL ) = false 6 ) = true (5< ALL 10 4 ) = false (5 = ALL 5 4 (5 ALL ) = true (since 5 4 and 5 6) 6 (ALL) NOT IN However, (= ALL) IN
Set Comparison Query: Return the names of employees whose salary is greater than the salary of all the employees in department 5. SELECTLNAME, FNAMEFROM EMPLOYEEWHERE SALARY > ALL (SELECT SALARYFROM EMPLOYEEWHEREDNO = 5);
Test for Empty Relations • The EXISTS construct returns the value TRUE if the argument subquery is nonempty. • EXISTS r r Ø • NOT EXISTS r r = Ø
Example Query Query: Retrieve the name of each employee who has a dependent with the first name as the employee. SELECTE.LNAME, E.FNAMEFROM EMPLOYEEAS E WHERE EXISTS(SELECT *FROM DEPENDENTWHEREE.SSN = ESSN AND E.FNAME=DEPENDENT_NAME); Query: Retrieve the name of each employee who has no dependent. SELECTE.LNAME, E.FNAMEFROM EMPLOYEEAS E WHERE NOT EXISTS(SELECT *FROM DEPENDENTWHEREE.SSN = ESSN);
Example Query Query: Retrieve the name of each employee who works on all the projects controlled by department 5. SELECT FNAME, LNAMEFROM EMPLOYEEWHERE NOT EXISTS ( (SELECT PNUMBERFROM PROJECTWHERE DNUM=5 EXCEPT (SELECT PNOFROMWORKS_ONWHERE SSN=ESSN)); • Note that X – Y = Ø X Y • Note: Cannot write this query using= ALLand its variants.
Summary of SQL Queries • A query in SQL can consist of up to six clauses, but only the first two, SELECT and FROM, are mandatory. The clauses are specified in the following order: SELECT <attribute list> FROM <table list> [WHERE <condition>] [GROUP BY <grouping attribute(s)>] [HAVING <group condition>] [ORDER BY <attribute list>] • The SELECT-clause lists the attributes or functions to be retrieved • The FROM-clause specifies all relations (or aliases) needed in the query but not those needed in nested queries • The WHERE-clause specifies the conditions for selection and join of tuples from the relations specified in the FROM-clause • GROUP BY specifies grouping attributes • HAVING specifies a condition for selection of groups • ORDER BY specifies an order for displaying the result of a query • A query is evaluated by first applying the WHERE-clause, then GROUP BY and HAVING, and finally the SELECT-clause.