540 likes | 626 Views
Query Languages: How to build or interrogate a relational database. Structured Query Language (SQL). SQL. SQL is a query language for relational databases. Contains: Data Definition Language to define databases Data Manipulation Language to manipulate databases.
E N D
Query Languages: How to build or interrogate a relational database Structured Query Language (SQL)
SQL • SQL is a query language for relational databases. • Contains: • Data Definition Language to define databases • Data Manipulation Language to manipulate databases. • SQL is widely accepted and is used by most relational DBMSs. • Is being standardized.
The importance of SQL • Since SQL is used in almost all relational databases, once you know SQL you can probably construct and manipulate databases in all RDBMs. • Knowing SQL makes you a (beginning) ORACLE, Informix, SyBase, AdaBas, and so on programmer!
Functionalities of SQL • SQL provides • On-line and embedded use. • Precompilation of embedded queries. • Dynamic database definition and alteration. • Maintenance of indexes • View mechanism • Authorization mechanism • Automatic concurrency control • Logging and database recovery • Report formatting
Tables in SQL • SQL recognizes • Base Tables • real tables that physically exist in the database. There are physically stored records and possibly physically stored indexes directly corresponding to the table • Views • virtual tables that do not physically exist but look to the user as if they do
Data Definition • An SQL database consist of • Database Spaces • Base tables • Indexes • Views
Database Spaces • DBSpace is a section of physical disk. • It consists of • Base tables • Indices • Views • All can be dynamically dropped from DBSpaces. • DBSpaces allow the DB administrator to distribute data accesses over different disks.
Indexes • As we know, indexes can improve search performance. • Cost: more space needed and slower insertion. • Indexes can be defined over any combination of attributes in a base table. • Automatically maintained in SQL. • Users never directly use an index.
Views • Correspond to external schemas. • Derived from one or more base tables or views. • Computed dynamically.
Operations in SQL • For tables: • CREATE, ALTER, DROP • For indexes • CREATE, DROP • For views: • CREATE, DROP
Creating tables CREATE <table name> (<coldecl> [,<coldecl>*], [, <pkdef> [, <fkdef>*]); <coldecl> := <col><type>[NOT NULL] <type> := integer|smallint|float(p)| decimal(p,q)|char(n)| varchar(n)|long varchar| date|time
Creating tables continued <pkdef> := PRIMARY KEY (<colname> [,<colname>*] <fkdef> := FOREIGN KEY (<colname>[,<colname>*]) REFERENCES <table> [ ON DELETE <effect>]
More on creating tables <effect> := RESTRICT | CASCADE | SET NULL • What happens when the tuple in the referenced table with that value is deleted • RESTRICT: Do not delete as long as there tuples in other table with that foreign key value • CASCADE: Delete all tuples with that foreign key value • SET NULL: Set value of foreign key to NULL. (Note violates referential integrity).
Example 1 CREATE TABLE Student (sid CHAR(5) NOT NULL, sname VARCHAR(20), address VARCHAR(70), PRIMARY KEY (sid)); OR CREATE TABLE Student (sid CHAR(5) PRIMARY KEY, sname VARCHAR(20), address VARCHAR(70));
Example 2 CREATE TABLE Enrol (sid CHAR(5) NOT NULL, cid CHAR(5) NOT NULL, grade INT, PRIMARY KEY(sid, cid), FOREIGN KEY (sid) REFERENCES Student ON DELETE CASCADE FOREIGN KEY (cid) REFERENCES Course ON DELETE RESTRICT);
Altering tables I ALTER TABLE <table name> ADD {<coldecl>| <pkdef>| <fkdef>}; ALTER TABLE Enrol ADD edate DATE; • adds a new column to the table grade. For existing tuples, the value is set to NULL.
Altering tables II ALTER TABLE <table name> DROP {PRIMARY KEY| <fkname>}; • Note that care must be taken when dropping columns.
Dropping tables • Tables can be dropped at any time. • Dropping a table deletes both the definition and data. • Also, all views, indexes and foreign key definitions referring to this table are dropped. DROP TABLE <table name>;
Creating indexes CREATE [UNIQUE] INDEX <index> ON <table> (<colname> [<order>] [,<colname> [<order>]*]); <order>:= ASC | DESC • Creates an index on named columns. With UNIQUE, no two tuples can have the same values for the indexes columns. • Example: CREATE INDEX Course ON Enrol (cid);
Data manipulation • Having created the tables, indexes and views, we now need to populate the database and retrieve information from it. • In other words, we want to manipulate the data.
Retrieval SELECT [DISTINCT] <items> FROM <table> [, <table>*] [WHERE <pred>] [GROUP BY <attrs> [HAVING <pred>]] [ORDER BY <attrs> ]; • Corresponds to a JOIN-SELECT-PROJECT expression in relational algebra.
Predicates • The predicate <pred> is a condition formed by parentheses and boolean operators AND, OR and NOT. • A condition has the form • <attr><op>{<value>|<attr>} • and an operator is one of • < | =< | > | >= | = | !=
WHERE clauses • In general, WHERE clauses are constructed as in relational algebra, but with some additions • LIKE string • May contain wildcard characters %, which matches any string, and _, which matches a single character. • IN (set of values) • Tests for set membership • BETWEEN c1 AND c2
Example • Find Student IDs and grades for those students who read CS51T SELECT sid, grade FROM Enrol WHERE cid = ‘CS51T’; • Compare psid, grade (scid = ‘CS51T’(Enrol))
Example continued • We can embellish the way in which the result appears by including format strings in the SELECT • Example SELECT Student as sid, grade FROM Enrol WHERE cid = ‘CS51T’;
DISTINCT • DISTINCT is used to make sure that we do not get any duplicate values. • Example SELECT DISTINCT cid FROM Enrol WHERE grade > 70; • First, find the various course numbers that qualify and then remove duplicates.
More examples • The use of * in the SELECT returns all attributes SELECT * FROM Enrol WHERE cid = ‘CS51T’; • Find all students who obtained 60 or more for CS51T SELECT sid FROM Enrol WHERE cid = ‘CS51T’ AND grade >= 60;
Yet more examples • Find all results for either or CS51T or CS51S SELECT * FROM Enrol WHERE cid IN (‘CS51S’, ‘CS51T’); • Find results for CS courses SELECT * FROM Enrol WHERE cid LIKE ‘CS%’;
Ordering results • Get all results for CS51S and CS51T but order them by result SELECT sid, cid, grade FROM Enrol WHERE cid IN (‘CS51S’, ‘CS51T’) ORDER BY grade DESC;
Subqueries • Notice that the result of a SELECT clause is a table which can be used in another WHERE clause. • Find course titles of the courses for which 123 was registered SELECT title FROM Course WHERE cid IN (SELECT cid FROM Enrol WHERE sid = ‘123’);
Table labels • Sometimes we need to interrogate the same table twice. • We use table labels • Example: Get IDs from those students who did both CS51S and CS51T SELECT DISTINCT sid FROM Enrol as E1, Enrol as E2 WHERE E1.Sid = E2.Sid AND E1.Cid = ‘CS51S’ AND E2.Cid = ‘CS51T’;
Table labels can usually be avoided • We could formulate the same query as SELECT sid FROM Enrol WHERE cid = ‘CS51S’ AND sid IN (SELECT sid FROM Enrol WHERE cid = ‘CS51T’);
Use of ALL in WHERE clauses • Queries that look at all tuples satisfying a particular predicate. • Get the IDs of the students all of whose results are over 70. SELECT sid FROM Enrol as E1 WHERE 70 < ALL (SELECT grade FROM Enrol as E2 WHERE E1.sid = E2.sid); • Forms of ALL: < ALL, <= ALL, = ALL, >= ALL, > ALL
Union • Union allows one to union tuples from different tables. • Get Student IDs for all students whose name starts with a ‘J’ or who obtained an A for CS51T. SELECT sid FROM Student WHERE sname LIKE ‘J%’ UNION SELECT sid FROM Enrol WHERE cid = ‘CS51T’ AND grade > 70;
Intersect • Allows one to intersect • Get all IDs for students whose name begins with a ‘J’ and who obtained an A for CS51S SELECT sid FROM Student WHERE sname LIKE ‘J%’ UNION SELECT sid FROM Enrol WHERE cid = ‘CS51S’ AND grade > 70;
EXISTS and NOT EXISTS • Counterpart of ALL • Find name of students who have not obtained an A for any course SELECT sname FROM Student WHERE NOT EXISTS (SELECT * FROM Enrol WHERE sid = Student.sid AND grade > 70);
Analysis of data • In order to help do some primitive analysis of data, SQL has some built-in functions • COUNT(*) • COUNT(DISTINCT <attr>) • SUM([DISTINCT]<item>) • where <item> may be an abstraction and does not need to be a single attribute. • AVG([DISTINCT]<item>) • MAX(<item>) • MIN(<item>)
Some simple examples of data analysis in SQL • How many students are registered for at least one course SELECT COUNT(DISTINCT sid) FROM Enrol; • Find the average grade for CS51S SELECT AVG(grade) FROM Enrol WHERE cid = ‘CS51S’;
Another example • How many students were above the average for CS51T? SELECT COUNT(*) FROM Enrol WHERE grade > (SELECT AVG(grade) FROM Enrol WHERE cid = ‘CS51T’);
Yet another example • What is the name of the student who got the best mark for CS51T? SELECT sname FROM Student WHERE sid IN (SELECT sid FROM Enrol WHERE grade = (SELECT MAX(grade) FROM Enrol WHERE cid = ‘CS51T’));
GROUP BY • A relation can be partitioned into groups according to some value. Analysis can then be done on these groups. • What are the averages for the various courses? SELECT cid, AVG(grade) FROM Enrol GROUP BY cid;
HAVING • After partitioning, we can disqualify groups. • What is average results for courses with enrollment of more than 10? SELECT cid, AVG(grade) FROM Enrol GROUP BY cid HAVING COUNT(*) > 10; • COUNT is applied to each group separately.
Insertion INSERT INTO {<table>|<view>} [(<attr>] [,<attr>*])] {VALUES (<items>| <select statement>)}; • Example INSERT INTO Enrol (cid, sid, grade) VALUES (‘CS51T’, ‘123’, 67);
Insertion through a SELECT statement • For each course, get the average and insert into a RES table INSERT INTO RES (cid, average) SELECT cid, AVG(grade) FROM Enrol GROUP BY cid;
Deletion DELETE FROM <table> [<WHERE clause>]; • Example DELETE FROM Enrol WHERE cid = ‘CS51T’; • Difference between DELETE and DROP DELETE FROM Enrol; • DELETE empties the table but leaves the table and indexes.
Updating tables UPDATE <table> SET <attr> = <expr> [, <attr> = <expr>*] [<WHERE CLAUSE>]; • Example: Give everybody 10 extra marks for CS35A UPDATE Enrol SET grade = grade + 10 WHERE cid = ‘CS51T’;
Views • Views are derived tables whose definition is stored and whose content is computed. • Can be used as base table for retrieval and view definition. • Exact condition for updating an open problem. • Currently only update iff • derived form single base table • and, has rows and attributes corresponding to a unique and distinct row in base table.
Advantages of views • Views are SQL’s external schemas. They are useful • Users are immune to database growth • Users are immune to database restructuring (logical data independence) • Simplified user perception • Different views of same data for different users • Automatic security for hidden data.
Creation and deletion of views CREATE VIEW <view> [(<colname>[,<colname>*])] AS select-statement; • Example CREATE VIEW CourseAvg (Cid,Average) AS SELECT cid, Avg(grade) FROM Enrol GROUP BY cid; • Deletion DELETE VIEW <view>; DELETE VIEW CourseAvg;
The view update problem • The view CourseAvgas defined above cannot be updated, as any updates cannot be translated into the base table. • The DB administrator should decide whether a view is updatable.