690 likes | 1.42k Views
SQL. Overview Data Definition Basic Queries Set Operations Null Values Aggregate Functions Nested Subqueries Modification of the Database. History. IBM Sequel language developed as part of System R project at the IBM San Jose Research Laboratory Renamed Structured Query Language (SQL)
E N D
SQL • Overview • Data Definition • Basic Queries • Set Operations • Null Values • Aggregate Functions • Nested Subqueries • Modification of the Database
History • IBM Sequel language developed as part of System R project at the IBM San Jose Research Laboratory • Renamed Structured Query Language (SQL) • ANSI and ISO standard SQL: • SQL-86 • SQL-89 • SQL-92 • SQL:1999 (language name became Y2K compliant!) • SQL:2003…., 2008, 2011. • Not all examples here may work on your particular system.
Domain Types in SQL • char(n). • varchar(n). • int. • smallint. • numeric(p,d). • real, double precision. • float(n). • Can define your own as well.
Creating Tables • Example: create tableinstructor (IDchar(5),name varchar(20) not null,dept_name varchar(20),salarynumeric(8,2)) • insert into instructor values (‘10211’, ’Smith’, ’Biology’, 66000); • insert into instructor values (‘10211’, null, ’Biology’, 66000);
Integrity Constraints in Create Table Example: create tableinstructor (IDchar(5),name varchar(20) not null,dept_name varchar(20),salarynumeric(8,2),primary key (ID),foreign key (dept_name) references department) Department(dept_name, address,….) primary key declaration on an attribute automatically ensures not null
Drop and Alter Table Constructs • drop table • alter table • alter table r add A D • where A is the name of the attribute to be added to relation r and D is the domain of A. • All tuples in the relation are assigned null as the value for the new attribute. • alter table r drop A • where A is the name of an attribute of relation r • Dropping of attributes not supported by many databases.
Basic Query • A typical SQL query has the form:select A1, A2, ..., Anfromr1, r2, ..., rmwhere P • Ai represents an attribute • ri represents a relation • P is a predicate. • The result of an SQL query is a relation. • Case insensitive • What are the relational algebra equivalents?
The select Clause (Cont.) • SQL allows duplicates in relations as well as in query results. • To force the elimination of duplicates, insert the keyword distinctafter select. • Find the names of all departments with instructor, and remove duplicates select distinct dept_namefrom instructor • The keyword all specifies that duplicates not be removed. select alldept_namefrom instructor
The select Clause (Cont.) • An asterisk in the select clause denotes “all attributes” select *from instructor • Arithmetic expressions involving +, –, , and /, and operating on attributes. • The query: selectID, name, salary/12from instructor
The where Clause • To find all instructors in Comp. Sci. dept with salary > 80000 select namefrom instructorwhere dept_name =‘Comp. Sci.'and salary > 80000 • Can use logical connectives and, or, and not • Comparisons can be applied to results of arithmetic expressions.
The from Clause • Find the Cartesian product instructor X teaches select from instructor, teaches
Joins • For instructors who have taught courses, find their names and the course ID of the courses they taught. select name, course_idfrom instructor, teacheswhere instructor.ID = teaches.ID • Find the course ID, semester, year and title of each course offered by the Comp. Sci. department select section.course_id, semester, year, titlefrom section, coursewhere section.course_id = course.course_id anddept_name =‘Comp. Sci.’ Instructor (ID, name, dept_name, salary) Teaches (ID, course_id, section_id, semester, year)
Try Writing Some Queries in SQL • Find departments that are housed in the ‘Taylor’ building. • Department(dept_name, building, budget) • Find course sections taught in rooms that can hold at least 100 students. • Section(course_id, sec_id, semester, year, building, room#, time_slot_id) • Classroom(building, room#, capacity) • Find all advisors of students in Biology. • Student(ID, name, dept_name, tot_cred) • Advisor(S_ID, I_ID) • Instructor(ID, name, dept_name, salary)
Natural Join • Natural join matches tuples with the same values for all common attributes, and retains only one copy of each common column • select *from instructor natural join teaches; Instructor (ID, name, dept_name, salary) Teaches (ID, course_id, section_id, semester, year)
Natural Join (Cont.) • Beware of attributes with same name which get equated incorrectly • List names of instructors along with titles of courses that they teach • Instructor(ID, name, dept_name, salary) • Teaches(ID, course_id, sec_id, semester, year) • Course(course_id, title, dept_name,credits) • Incorrect version (equates course.dept_name with instructor.dept_name) • select name, titlefrom instructor natural join teaches natural join course; • Correct version • select name, titlefrom instructor natural join teaches, coursewhere teaches.course_id= course.course_id; • Another correct version • select name, titlefrom (instructor natural join teaches) join course using(course_id);
The Rename Operation • select ID, name, salary/12 as monthly_salaryfrom instructor • Find the names of all instructors who have a higher salary than some instructor in ‘Comp. Sci’. • select distinct T. namefrom instructor as T, instructor as Swhere T.salary > S.salary and S.dept_name = ‘Comp. Sci.’ • Keyword as is optional and may be omittedinstructor as T ≡ instructorT
String Operations • percent (%). The % character matches any substring. • underscore (_). The _ character matches any character. • Find names of all instructors whose name includes the substring “dar”. select namefrom instructorwherename like '%dar%' • Match the string “100 %” like ‘100 \%' escape '\’
String Operations • SELECT CONCAT(name,dept_name) FROM Instructor • SELECT UPPER(dept_name) FROM Instructor • SELECT length(dept_name) FROM Instructor • SELECT SUBSTR(dept_name,2,4) • FROM Instructor • WHERE dept_name= ’Biology'; • Answer = iolo
Ordering the Display of Tuples • List in alphabetic order the names of all instructors select distinct namefrom instructororder by name order bynamedesc order by dept_name, name
Set Operations • Find courses that ran in Fall 2009 or in Spring 2010 (selectcourse_id from section where sem = ‘Fall’and year = 2009)union(selectcourse_id from section where sem = ‘Spring’and year = 2010) • Find courses that ran in Fall 2009 and in Spring 2010 (selectcourse_id from section where sem = ‘Fall’and year = 2009)intersect(selectcourse_id from section where sem = ‘Spring’and year = 2010) • Find courses that ran in Fall 2009 but not in Spring 2010 (selectcourse_id from section where sem = ‘Fall’and year = 2009)except(selectcourse_id from section where sem = ‘Spring’and year = 2010)
Set Operations • Set operations union, intersect, and except • Each of the above operations automatically eliminates duplicates • To retain all duplicates use the corresponding multiset versions union all, intersect alland except all. • Suppose a tuple occurs m times in r and n times in s, then, it occurs: • m + n times in r union all s • min(m,n) times in rintersect all s • max(0, m – n) times in rexcept all s
Null Values • It is possible for tuples to have a null value, denoted by null, for some of their attributes • null signifies an unknown value or that a value does not exist. • The result of any arithmetic expression involving null is null • Example: 5 + null returns null • The predicate is null can be used to check for null values. • Example: Find all instructors whose salary is null. select namefrom instructorwhere salary is null
Null Values and Three Valued Logic • Any comparison with null returns unknown • Example: 5 < null or null <> null or null = null • Three-valued logic using the truth value unknown: • OR: (unknownortrue) = true, (unknownorfalse) = unknown (unknown or unknown) = unknown • AND: (true and unknown) = unknown, (false and unknown) = false, (unknown and unknown) = unknown • NOT: (not unknown) = unknown • “P is unknown”evaluates to true if predicate P evaluates to unknown • Result of where clause predicate is treated as false if it evaluates to unknown
Aggregate Functions • These functions operate on the multiset of values of a column of a relation, and return a value avg: average valuemin: minimum valuemax: maximum valuesum: sum of valuescount: number of values
Aggregate Functions (Cont.) • select avg (salary)from instructorwhere dept_name= ’Comp. Sci.’; • select count (distinct ID)from teacheswhere semester = ’Spring’and year = 2010 • select count (*)from course; • Instructor(ID, name, dept_name, salary) • Teaches (ID, course_id, section_id, semester, year) • Courses (course_id, dept_name)
Aggregate Functions – Group By • Find the average salary of instructors in each department • select dept_name, avg (salary)from instructorgroup by dept_name avg_salary
Aggregation (Cont.) • Attributes in select clause outside of aggregate functions must appear in group by list • /* erroneous query */select dept_name, ID, avg (salary)from instructorgroup by dept_name;
Aggregate Functions – Having Clause • Find the names and average salaries of all departments whose average salary is greater than 42000 select dept_name, avg (salary) from instructor group by dept_name having avg (salary) > 42000; lect dept_name, avg (salary) from instructor where age > 35 group by dept_name having avg (salary) > 42000; Note: predicates in the having clause are applied after the formation of groups whereas predicates in the where clause are applied before forming groups
Null Values and Aggregates • Total all salaries select sum (salary )from instructor • Above statement ignores null amounts • Result is null if there is no non-null amount • All aggregate operations except count(*) ignore tuples with null values on the aggregated attributes • What if collection has only null values? • count returns 0 • all other aggregates return null
Nested Subqueries • SQL provides a mechanism for the nesting of subqueries. • A subquery is a select-from-where expression that is nested within another query. • A common use of subqueries is to perform tests for set membership, set comparisons, and set cardinality.
Example Query • Find courses offered in Fall 2009 and in Spring 2010 select distinct course_id from section where semester = ’Fall’and year= 2009 and course_id in (select course_id from section where semester = ’Spring’and year= 2010); • Find courses offered in Fall 2009 but not in Spring 2010 select distinct course_id from section where semester = ’Fall’and year= 2009 and course_id not in (select course_id from section where semester = ’Spring’and year= 2010);
Example Query • Find the total number of (distinct) students who have taken course sections taught by the instructor with ID 10101 select count (distinct ID) from takes where (course_id, sec_id, semester, year) in (select course_id, sec_id, semester, year from teaches where teaches.ID= 10101); • Note: Above query can be written in a much simpler manner. The formulation above is simply to illustrate SQL features.
Set Comparison • Find names of instructors with salary greater than that of some (at least one) instructor in the Biology department. select distinct T.name from instructor as T, instructor as S where T.salary > S.salary and S.dept name = ’Biology’; • Same query using > some clause select name from instructor where salary > some (select salary from instructor where dept name = ’Biology’); Mysql: any, some, all should be possible
0 5 6 Definition of Some Clause • F <comp> some r t r such that (F <comp> t )Where <comp> can be: (5 < some ) = true (read: 5 < some tuple in the relation) 0 (5 < some ) = false 5 0 ) = true (5 = some 5 0 (5 some ) = true (since 0 5) 5 (= some) in However, ( some) not in
Example Query • Find the names of all instructors whose salary is greater than the salary of all instructors in the Biology department. select name from instructor where salary > all (select salary from instructor where dept name = ’Biology’);
0 5 6 Definition of all Clause • F <comp> all r t r (F <comp> t) (5 < all ) = false 6 ) = true (5 < all 10 4 ) = false (5 = all 5 4 ) = true (since 5 4 and 5 6) (5 all 6 (all) not in However, (= all) in
Test for Empty Relations • The exists construct returns the value true if the argument subquery is nonempty. • exists r r Ø • not exists r r = Ø
Correlation Variables • Yet another way of specifying the query “Find all courses taught in both the Fall 2009 semester and in the Spring 2010 semester” select course_idfrom section as Swhere semester = ’Fall’and year= 2009 and exists (select *from section as Twhere semester = ’Spring’and year= 2010 and S.course_id= T.course_id); • Correlated subquery • Correlation name or correlation variable
Not Exists • Find all students who have taken all courses offered in the Biology department. select distinct S.ID, S.name from student as S where not exists ( select course_id from course where dept_name = ‘Biology’ AND course_id NOT IN (select T.course_id from takes as T where S.ID = T.ID)); Nested query: (Biology courses – courses taken by student) • Note that X – Y = Ø X Y • Note: Cannot write this query using= alland its variants
Test for Absence of Duplicate Tuples • The unique construct tests whether a subquery has any duplicate tuples in its result. • Find all courses that were offered at most once in 2009 select T.course_idfrom course as Twhere unique (select R.course_idfrom section as Rwhere T.course_id= R.course_id and R.year = 2009); Mysql: ??
Subqueries in from clause • Find the average instructors’ salaries of those departments where the average salary is greater than $42,000.” select dept_name, avg_salaryfrom (select dept_name, avg (salary) as avg_salaryfrom instructorgroup by dept_name)where avg_salary > 42000; • We already know how to do this with the having clause • select dept_name, avg_salaryfrom instructorgroup by dept_name)havingavg_salary > 42000;
Scalar Subquery select dept_name, (select count(*) from instructor where department.dept_name = instructor.dept_name)as num_instructorsfrom department; Scalar: a single value, i.e., a single row and single column. Count, average, … Mysql: yes.
Outer Join • An extension of the join operation that avoids loss of information. • Computes the join and then adds tuples form one relation that does not match tuples in the other relation to the result of the join. • Uses null values.
Left Outer Join • coursenatural left outer joinprereq Course(course_id, title, dept_name, credits) Prereq(course_id, prereq_id) /* note in above that prereq_id refers to course_id in Course
Right Outer Join • coursenatural right outer joinprereq Course(course_id, title, dept_name, credits) Prereq(course_id, prereq_id) /* note in above that prereq_id refers to course_id in Course
Full Outer Join • coursenatural full outer joinprereq Course(course_id, title, dept_name, credits) Prereq(course_id, prereq_id) /* note in above that prereq_id refers to course_id in Course
Varieties of Join • R1 cross join R2: cross product • R1, R2: cross product • R1 Join R2: cross product (so you must have a Where clause to specify the join condition) • R1 Natural Join R2 (enforces equality on all identically named attributes). Also removes duplicate columns • R1 Natural Left Outer Join R2 (null values on R1 side as appropriate) • R1 Natural Right Outer Join R2 (null values on R2 side as appropriate) • R1 Inner Join R2 ON (R1.attx = R2.attM AND R1.att3 <> R2.att3) • R1 Inner Join R2 Using (att2) ….. Meaning you want equality on att2 values • R1 Left Join R2 ON (…) (null values on R1 sides as appropriate) • R1 Right Join R2 ON (….) (null values on R2 sides as appropriate) • -- apply somewhat specifically to mysql.
Views • In some cases, it is not desirable for all users to see the entire logical model (that is, all the actual relations stored in the database.) • Consider a person who needs to know an instructors name and department, but not the salary. This person should see a relation described, in SQL, by select ID, name, dept_namefrom instructor • A view provides a mechanism to hide certain data from the view of certain users. • Any relation that is not of the conceptual model but is made visible to a user as a “virtual relation” is called a view. • Instructor(ID, name, dept_name, salary)
View Definition • A view is defined using the create view statement which has the form create view v as < query expression > where <query expression> is any legal SQL expression. The view name is represented by v. • Once a view is defined, the view name can be used to refer to the virtual relation that the view generates. • View definition is not the same as creating a new relation by evaluating the query expression • Rather, a view definition causes the saving of an expression; the expression is substituted into queries using the view.
Example Views • A view of instructors without their salarycreate view faculty asselect ID, name, dept_namefrom instructor • Find all instructors in the Biology departmentselect namefrom facultywhere dept_name = ‘Biology’ • Create a view of department salary totalscreate view departments_total_salary(dept_name, total_salary) as select dept_name, sum (salary)from instructorgroup by dept_name; • Instructor(ID, name, dept_name, salary)