480 likes | 600 Views
C20.0046: Database Management Systems Lecture #14. M.P. Johnson Stern School of Business, NYU Spring, 2005. Summary: SQL queries. Only SELECT , FROM required Can’t have HAVING without GROUP BY Can have GROUP BY without HAVING Any clauses used must appear in this order:.
E N D
C20.0046: Database Management SystemsLecture #14 M.P. Johnson Stern School of Business, NYU Spring, 2005 M.P. Johnson, DBMS, Stern/NYU, Spring 2005
Summary: SQL queries • Only SELECT, FROM required • Can’t have HAVING without GROUP BY • Can have GROUP BY without HAVING • Any clauses used must appear in this order: SELECT LFROM Rs WHERE s GROUP BY L2 HAVING s2 ORDER BY L3 M.P. Johnson, DBMS, Stern/NYU, Spring 2005
New topic: Modifications • Three kinds of modifications • Insertions • Deletions • Updates • Sometimes “update” used as a synonym for “modification” M.P. Johnson, DBMS, Stern/NYU, Spring 2005
Insertions General form: INSERT INTO R(A1,…., An) VALUES(v1,….,vn) Example: Insert a new purchase to the database: INSERT INTO Knights(name, britnatl, title) VALUES('Bill Gates', 'n', 'KBE') Missing attribute NULL (or other default value) M.P. Johnson, DBMS, Stern/NYU, Spring 2005
Insertions • If we’re sure we have all values in the right order, can just say: • Only do this if you’re sure of order in which the table fields were defined INSERT INTO R VALUES(v1,….,vn) INSERT INTO Knights VALUES('R. Giuliani', 'n', 'KBE'); INSERT INTO Knights VALUES('Bernard Kerik', 'n', 'CBE'); M.P. Johnson, DBMS, Stern/NYU, Spring 2005
Insertions • Can insert the result of a query; Scenario: • Product(name, etc.) • Purchase(buyerssn, prodName, etc.) • Maybe some purchases name missing products • add those to the Product table • Subquery replaces VALUES INSERT INTO R(As) (query) M.P. Johnson, DBMS, Stern/NYU, Spring 2005
Insertion example • Product(name, listPrice, category) • Purchase(prodName, buyerName, price) • Premise: data corruption lose some Product data • every product referred to in Purchase should exist in Product, but some are missing Product Purchase M.P. Johnson, DBMS, Stern/NYU, Spring 2005
Insertion example INSERT INTO Product(name) SELECT prodName FROM Purchase WHERE prodName NOT IN (SELECT name FROM Product) Purchase Product Product’ Q: Or do we get: A: Depends on implementation! M.P. Johnson, DBMS, Stern/NYU, Spring 2005
Deletions DELETE FROM Table WHERE condition • General form: • E.g.: • As usual, WHERE can contain subqueries • Depending on the DBMS • Q: How do you delete just one row with SQL simpliciter? • Oracle has the ROWID/ROWNUM pseudo-field… INSERT INTO Knights VALUES('R. Giuliani', 'n', 'KBE'); INSERT INTO Knights VALUES('Bernard Kerik', 'n', 'CBE'); DELETE FROM Knights WHERE name = 'Bernard Kerik'; M.P. Johnson, DBMS, Stern/NYU, Spring 2005
Updates UPDATE Product SET field1 = value1, field2 = value2 WHERE condition • General form: • Example: • As usual, WHERE can contain subqueries UPDATE Product SET price = price/2 WHERE Product.name IN (SELECT product FROM Purchase WHERE Date = DATE'Oct, 25, 1999') M.P. Johnson, DBMS, Stern/NYU, Spring 2005
New topic: Defining schemata • So far, have done queries and data manipulation • Now doing data definition • Recall data types: • INT or INTEGER (variant: SHORTINT) • FLOAT or REAL: floating-point numbers • DOUBLE PRECISION: • DECIMAL(n,d): • E.g. decimal(5,2): five decimal digits, with the decimal point two positions from the right: e.g. 123.45 • DATE and TIME • Character strings • Fixed length: CHAR(n) • Variable length: VARCHAR(n) M.P. Johnson, DBMS, Stern/NYU, Spring 2005
Creating tables CREATE TABLE Table-name ( field1 field-type, field2 field-type, … fieldn field-type ) • Form: • E.g.: No comma! CREATE TABLE People ( name VARCHAR(30), ssn CHAR(9), age INT, city VARCHAR(30), gender CHAR(1), dob DATE ) M.P. Johnson, DBMS, Stern/NYU, Spring 2005
Default Values • Specify defaults when creating table: • The default default: NULL CREATE TABLE People ( name VARCHAR(30), ssn CHAR(9), age SHORTINT DEFAULT 100, city VARCHAR(30) DEFAULT 'New York', gender BIT(1), dob DATE DEFAULT DATE '1900-01-01' ) M.P. Johnson, DBMS, Stern/NYU, Spring 2005
Deleting and modifying schemata • Delete data, indices, schema: • Delete data and indices: • Either way, exercise extreme caution! • Add or delete attributes: DROP TABLE Person TRUNCATE TABLE Person ALTER TABLE Person ADD phone CHAR(12) Q: What’s put in the new fields? ALTER TABLE Person DROP age M.P. Johnson, DBMS, Stern/NYU, Spring 2005
New topic: Indices • Veryimportant speeding up query processing • Index on field(s) = data structure that makes searches/comparisons on those fields fast • Suppose we have a relation • Person (name, age, city) • Sequential scan of the whole Person file may take a very long time SELECT * FROM Person WHERE name = 'Waksal, Sam' M.P. Johnson, DBMS, Stern/NYU, Spring 2005
Creating Indices • Syntax: • Here: • No searching by name is much faster • How much faster? • Log-time, say • Base-what? Doesn’t matter, but say 2 • If all New Yorkers, #comparisons: • 8000000 log2(8000000) ~= 23 • (i.e., 223 ~= 8000000) CREATE INDEX index-name ON R(fields) CREATE INDEX nameIndex ON Person(name) M.P. Johnson, DBMS, Stern/NYU, Spring 2005
How do indices work? • What the data structure? • Different possibilities • 1st intuition: index on field f is an ordered list of all values in the table’s f field • each item has address (“rowid”) of its row • Where do we get the ordered list? • 2nd intuition: put all f values in a BST • searching BST take log time (why?) • DBMSs actually use a variant: B+Tree • See Ullman’s book or data structures texts… M.P. Johnson, DBMS, Stern/NYU, Spring 2005
Creating Indices • Indexes can be useful in range queries too: CREATE INDEX ageIndex ON Person (age) SELECT * FROM Person WHERE age > 25 M.P. Johnson, DBMS, Stern/NYU, Spring 2005
Using indices • Indices can be created on multiple attributes: • Helps in: • And in: • But not in: CREATE INDEX doubleNdx ON Person (lname, fname) SELECT * FROM Person WHERE fname='Sam' AND lname = 'Waksal' SELECT * FROM Person WHERE lname='Waksal' Idea: our sorted list is sorted on age;city, not city;age SELECT * FROM Person WHERE fname='Sam' Q: In Movie, should index be on year;title or title;year? M.P. Johnson, DBMS, Stern/NYU, Spring 2005
The Index Selection Problem • Big Q: Why not just index all (sequences of) fields? • how does the list/B+Tree stay up to date? • We are given a workload: a set of SQL queries and their frequencies • Q is: What indices should we build to speed up the workload? • Answer: • Attributes in WHERE clauses (queries) favor an index • Attributes in INSERT/UPDATE/DELETEs discourage an index • In many DBMSs: your primary key fields get indexed automatically (why?) M.P. Johnson, DBMS, Stern/NYU, Spring 2005
New topic: Views • Stored relations physically exist and persist • Views are relations that don’t • in some texts, “table” = stored relation = “base table” • Basically names/references given to queries • maybe a relevant subset of a table • Employee(ssn, name, department, project, salary) • Payroll has access to Employee, others only to Developers CREATE VIEW Developers AS SELECT name, project FROM Employee WHERE department = 'Dev' M.P. Johnson, DBMS, Stern/NYU, Spring 2005
A Different View • Person(name, city) • Purchase(buyer, seller, product, store) • Product(name, maker, category) • We have a new virtual table: NYCview(buyer, seller, product, store) CREATE VIEW NYCview AS SELECT buyer, seller, product, store FROM Person, Purchase WHERE Person.city = 'New York' AND Person.name = Purchase.buyer M.P. Johnson, DBMS, Stern/NYU, Spring 2005
A Different View • Now we can query the view: CREATE VIEW NYCview AS SELECT buyer, seller, product, store FROM Person, Purchase WHERE Person.city = 'New York' AND Person.name = Purchase.buyer SELECT name, NYCview.store FROM NYCview, Product WHERE NYCview.product = Product.name AND Product.category = 'Camera' M.P. Johnson, DBMS, Stern/NYU, Spring 2005
What happens when we query a view? SELECT name, NYCview.store FROM NYCview, Product WHERE NYCview.product = Product.name AND Product.category = 'Camera' SELECT name, Purchase.store FROM Person, Purchase, Product WHERE Person.city = 'New York' AND Person.name = Purchase.buyer AND Purchase.poduct = Product.name AND Product.category = 'Camera' M.P. Johnson, DBMS, Stern/NYU, Spring 2005
Can rename view fields CREATE VIEW NYCview(NYCbuyer, NYCseller, prod, store) AS SELECT buyer, seller, product, store FROM Person, Purchase WHERE Person.city = 'New York' AND Person.name = Purchase.buyer M.P. Johnson, DBMS, Stern/NYU, Spring 2005
Types of Views • Views discussed here: • Used in databases • Computed only on-demand – slow at runtime • Always up to date • Sometimes talk about “materialized” views • Used in data warehouses • Pre-computed offline – fast at runtime • May have stale data • Maybe more later… M.P. Johnson, DBMS, Stern/NYU, Spring 2005
Updating Views How to insert a tuple into a table that doesn’t exist? Employee(ssn, name, department, project, salary) CREATE VIEW Developers AS SELECT name, project FROM Employee WHERE department = 'Development' If we make the following insertion: INSERT INTO Developers VALUES('Bill', ‘Word') It becomes: INSERT INTO Employee(ssn, name, dept, project, sal) VALUES(NULL, 'Bill', NULL, 'Word', NULL) M.P. Johnson, DBMS, Stern/NYU, Spring 2005
Non-Updatable Views • Person(name, city) • Purchase(buyer, seller, product, store) • How can we add the following tuple to the view? ('NYC', 'The Wiz') • We don’t know the name of the person who made the purchase • cannot set to NULL (why?) CREATE VIEW CityStore AS SELECT Person.city, Purchase.store FROM Person, Purchase WHERE Person.name = Purchase.buyer M.P. Johnson, DBMS, Stern/NYU, Spring 2005
Constraints in SQL • A constraint = a property that we’d like our database to hold • The system will enforce the constraint by taking some actions: • forbid an update • or perform compensating updates M.P. Johnson, DBMS, Stern/NYU, Spring 2005
Constraints in SQL simplest Constraints in SQL: • Keys, foreign keys • Attribute-level constraints • Tuple-level constraints • Global constraints: assertions The more complex the constraint, the harder it is to check and to enforce Mostcomplex M.P. Johnson, DBMS, Stern/NYU, Spring 2005
Keys Or: CREATE TABLE Product ( name CHAR(30) PRIMARY KEY, category VARCHAR(20) ) CREATE TABLE Product ( name CHAR(30), category VARCHAR(20) PRIMARY KEY (name) ) M.P. Johnson, DBMS, Stern/NYU, Spring 2005
Keys with Multiple Attributes CREATE TABLE Product ( name CHAR(30), category VARCHAR(20), price INT, PRIMARY KEY (name, category) ) M.P. Johnson, DBMS, Stern/NYU, Spring 2005
Other Keys • There is at most one PRIMARY KEY; there can be many UNIQUE • Primary key v. candidate keys CREATE TABLE Product ( productID CHAR(10), name CHAR(30), category VARCHAR(20), price INT, PRIMARY KEY (productID), UNIQUE (name, category) ) M.P. Johnson, DBMS, Stern/NYU, Spring 2005
Foreign Key Constraints • prodName is a foreign key to Product(name) • name should be a key in Product • Purchase ~ Product is many-one • NB: referenced field specified with parentheses, not dot Referentialintegrityin SQL CREATE TABLE Purchase ( prodName CHAR(30) REFERENCES Product(name), date DATETIME ) M.P. Johnson, DBMS, Stern/NYU, Spring 2005
Product Purchase M.P. Johnson, DBMS, Stern/NYU, Spring 2005
Foreign Key Constraints • Or: • (name, category) must be a key (primary/unique) in Product (why?) CREATE TABLE Purchase ( prodName CHAR(30), category VARCHAR(20), date DATETIME, FOREIGN KEY (prodName, category) REFERENCES Product(name, category) M.P. Johnson, DBMS, Stern/NYU, Spring 2005
What happens during updates? Types of updates: • In Purchase: insert/update • In Product: delete/update Product Purchase M.P. Johnson, DBMS, Stern/NYU, Spring 2005
What happens during updates? • SQL has three policies for maintaining referential integrity: • Reject: violating modifications (default) • Cascade: after a delete/update do a delete/update • Set-null: set foreign-key field to NULL M.P. Johnson, DBMS, Stern/NYU, Spring 2005
Constraints on Attributes and Tuples • Constraints on attributes: • NOT NULL -- obvious meaning... • CHECK condition -- any condition on row itself • Some DBMS support subqueries here, but many don’t • Constraints on tuples • CHECK condition M.P. Johnson, DBMS, Stern/NYU, Spring 2005
How is this different from aForeign-Key? CREATE TABLE Purchase ( prodName CHAR(30) CHECK (prodName IN SELECT Product.name FROM Product), date DATETIME NOT NULL ) M.P. Johnson, DBMS, Stern/NYU, Spring 2005
General Assertions • Supported in SQL standard: • Implemented/approximated in MySQL and Oracle as stored procedures • PL/SQL in Oracle CREATE ASSERTION myAssert CHECK (NOT EXISTS( SELECT Product.name FROM Product, Purchase WHERE Product.name = Purchase.prodName GROUP BY Product.name HAVING count(*) > 200) ) M.P. Johnson, DBMS, Stern/NYU, Spring 2005
Final Comments on Constraints • Can give them names, and alter later • We need to understand exactly when they are checked • We need to understand exactly what actions are taken if they fail M.P. Johnson, DBMS, Stern/NYU, Spring 2005
Example with nulls • look at emp table • get names, salaries, commissions, total salaries • What if commission is null? • nvl in Oracle, ifnull in MySQL M.P. Johnson, DBMS, Stern/NYU, Spring 2005
Live examples • Inner joins require an ON clause • Like a where clause • Arbitrary boolean expression • If always true (1=1), reduces to cross join • New compar op: BETWEEN • a between 5 and 10 a >= 5 and a <= 10 • Q: produce a list of employees with their salary grades • emp, salgrade M.P. Johnson, DBMS, Stern/NYU, Spring 2005
Live examples • Q: produce a list of employees and their bosses • What if no boss? Or no subordinate? • Joins on emp, emp man: • Comma-based • Cross • Natural • Inner • Must use INNER JOIN in MySQL • If want non-managers, do outer join… • No FULL OUTER JOIN in MySQL yet M.P. Johnson, DBMS, Stern/NYU, Spring 2005
Live examples • Q: produce list of bosses and underling-counts, for bosses with >1 underling • Just add HAVING clause… M.P. Johnson, DBMS, Stern/NYU, Spring 2005
Finally: R.A./SQL has limitations • Can easily find Alice’s direct subordinates: • But: find all of King’s underlings • Cannot compute “transitive closure” • Cannot express in R.A./SQL! • SQL is not “Turing-Complete” M.P. Johnson, DBMS, Stern/NYU, Spring 2005
Live examples • Examples from sqlzoo.net SELECT L FROM R1, …, Rn WHERE C PL(sC(R1 x … Rn) M.P. Johnson, DBMS, Stern/NYU, Spring 2005