1 / 48

C20.0046: Database Management Systems Lecture #14

C20.0046: Database Management Systems Lecture #14. M.P. Johnson Stern School of Business, NYU Spring, 2005. Summary: SQL queries. Only SELECT , FROM required Can’t have HAVING without GROUP BY Can have GROUP BY without HAVING Any clauses used must appear in this order:.

lynna
Download Presentation

C20.0046: Database Management Systems Lecture #14

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. C20.0046: Database Management SystemsLecture #14 M.P. Johnson Stern School of Business, NYU Spring, 2005 M.P. Johnson, DBMS, Stern/NYU, Spring 2005

  2. Summary: SQL queries • Only SELECT, FROM required • Can’t have HAVING without GROUP BY • Can have GROUP BY without HAVING • Any clauses used must appear in this order: SELECT LFROM Rs WHERE s GROUP BY L2 HAVING s2 ORDER BY L3 M.P. Johnson, DBMS, Stern/NYU, Spring 2005

  3. New topic: Modifications • Three kinds of modifications • Insertions • Deletions • Updates • Sometimes “update” used as a synonym for “modification” M.P. Johnson, DBMS, Stern/NYU, Spring 2005

  4. Insertions General form: INSERT INTO R(A1,…., An) VALUES(v1,….,vn) Example: Insert a new purchase to the database: INSERT INTO Knights(name, britnatl, title) VALUES('Bill Gates', 'n', 'KBE') Missing attribute NULL (or other default value) M.P. Johnson, DBMS, Stern/NYU, Spring 2005

  5. Insertions • If we’re sure we have all values in the right order, can just say: • Only do this if you’re sure of order in which the table fields were defined INSERT INTO R VALUES(v1,….,vn) INSERT INTO Knights VALUES('R. Giuliani', 'n', 'KBE'); INSERT INTO Knights VALUES('Bernard Kerik', 'n', 'CBE'); M.P. Johnson, DBMS, Stern/NYU, Spring 2005

  6. Insertions • Can insert the result of a query; Scenario: • Product(name, etc.) • Purchase(buyerssn, prodName, etc.) • Maybe some purchases name missing products •  add those to the Product table • Subquery replaces VALUES INSERT INTO R(As) (query) M.P. Johnson, DBMS, Stern/NYU, Spring 2005

  7. Insertion example • Product(name, listPrice, category) • Purchase(prodName, buyerName, price) • Premise: data corruption  lose some Product data • every product referred to in Purchase should exist in Product, but some are missing Product Purchase M.P. Johnson, DBMS, Stern/NYU, Spring 2005

  8. Insertion example INSERT INTO Product(name) SELECT prodName FROM Purchase WHERE prodName NOT IN (SELECT name FROM Product) Purchase Product Product’ Q: Or do we get: A: Depends on implementation! M.P. Johnson, DBMS, Stern/NYU, Spring 2005

  9. Deletions DELETE FROM Table WHERE condition • General form: • E.g.: • As usual, WHERE can contain subqueries • Depending on the DBMS • Q: How do you delete just one row with SQL simpliciter? • Oracle has the ROWID/ROWNUM pseudo-field… INSERT INTO Knights VALUES('R. Giuliani', 'n', 'KBE'); INSERT INTO Knights VALUES('Bernard Kerik', 'n', 'CBE'); DELETE FROM Knights WHERE name = 'Bernard Kerik'; M.P. Johnson, DBMS, Stern/NYU, Spring 2005

  10. Updates UPDATE Product SET field1 = value1, field2 = value2 WHERE condition • General form: • Example: • As usual, WHERE can contain subqueries UPDATE Product SET price = price/2 WHERE Product.name IN (SELECT product FROM Purchase WHERE Date = DATE'Oct, 25, 1999') M.P. Johnson, DBMS, Stern/NYU, Spring 2005

  11. New topic: Defining schemata • So far, have done queries and data manipulation • Now doing data definition • Recall data types: • INT or INTEGER (variant: SHORTINT) • FLOAT or REAL: floating-point numbers • DOUBLE PRECISION: • DECIMAL(n,d): • E.g. decimal(5,2): five decimal digits, with the decimal point two positions from the right: e.g. 123.45 • DATE and TIME • Character strings • Fixed length: CHAR(n) • Variable length: VARCHAR(n) M.P. Johnson, DBMS, Stern/NYU, Spring 2005

  12. Creating tables CREATE TABLE Table-name ( field1 field-type, field2 field-type, … fieldn field-type ) • Form: • E.g.: No comma! CREATE TABLE People ( name VARCHAR(30), ssn CHAR(9), age INT, city VARCHAR(30), gender CHAR(1), dob DATE ) M.P. Johnson, DBMS, Stern/NYU, Spring 2005

  13. Default Values • Specify defaults when creating table: • The default default: NULL CREATE TABLE People ( name VARCHAR(30), ssn CHAR(9), age SHORTINT DEFAULT 100, city VARCHAR(30) DEFAULT 'New York', gender BIT(1), dob DATE DEFAULT DATE '1900-01-01' ) M.P. Johnson, DBMS, Stern/NYU, Spring 2005

  14. Deleting and modifying schemata • Delete data, indices, schema: • Delete data and indices: • Either way, exercise extreme caution! • Add or delete attributes: DROP TABLE Person TRUNCATE TABLE Person ALTER TABLE Person ADD phone CHAR(12) Q: What’s put in the new fields? ALTER TABLE Person DROP age M.P. Johnson, DBMS, Stern/NYU, Spring 2005

  15. New topic: Indices • Veryimportant speeding up query processing • Index on field(s) = data structure that makes searches/comparisons on those fields fast • Suppose we have a relation • Person (name, age, city) • Sequential scan of the whole Person file may take a very long time SELECT * FROM Person WHERE name = 'Waksal, Sam' M.P. Johnson, DBMS, Stern/NYU, Spring 2005

  16. Creating Indices • Syntax: • Here: • No searching by name is much faster • How much faster? • Log-time, say • Base-what? Doesn’t matter, but say 2 • If all New Yorkers, #comparisons: • 8000000  log2(8000000) ~= 23 • (i.e., 223 ~= 8000000) CREATE INDEX index-name ON R(fields) CREATE INDEX nameIndex ON Person(name) M.P. Johnson, DBMS, Stern/NYU, Spring 2005

  17. How do indices work? • What the data structure? • Different possibilities • 1st intuition: index on field f is an ordered list of all values in the table’s f field • each item has address (“rowid”) of its row • Where do we get the ordered list? • 2nd intuition: put all f values in a BST • searching BST take log time (why?) • DBMSs actually use a variant: B+Tree • See Ullman’s book or data structures texts… M.P. Johnson, DBMS, Stern/NYU, Spring 2005

  18. Creating Indices • Indexes can be useful in range queries too: CREATE INDEX ageIndex ON Person (age) SELECT * FROM Person WHERE age > 25 M.P. Johnson, DBMS, Stern/NYU, Spring 2005

  19. Using indices • Indices can be created on multiple attributes: • Helps in: • And in: • But not in: CREATE INDEX doubleNdx ON Person (lname, fname) SELECT * FROM Person WHERE fname='Sam' AND lname = 'Waksal' SELECT * FROM Person WHERE lname='Waksal' Idea: our sorted list is sorted on age;city, not city;age SELECT * FROM Person WHERE fname='Sam' Q: In Movie, should index be on year;title or title;year? M.P. Johnson, DBMS, Stern/NYU, Spring 2005

  20. The Index Selection Problem • Big Q: Why not just index all (sequences of) fields? • how does the list/B+Tree stay up to date? • We are given a workload: a set of SQL queries and their frequencies • Q is: What indices should we build to speed up the workload? • Answer: • Attributes in WHERE clauses (queries)  favor an index • Attributes in INSERT/UPDATE/DELETEs  discourage an index • In many DBMSs: your primary key fields get indexed automatically (why?) M.P. Johnson, DBMS, Stern/NYU, Spring 2005

  21. New topic: Views • Stored relations physically exist and persist • Views are relations that don’t • in some texts, “table” = stored relation = “base table” • Basically names/references given to queries • maybe a relevant subset of a table • Employee(ssn, name, department, project, salary) • Payroll has access to Employee, others only to Developers CREATE VIEW Developers AS SELECT name, project FROM Employee WHERE department = 'Dev' M.P. Johnson, DBMS, Stern/NYU, Spring 2005

  22. A Different View • Person(name, city) • Purchase(buyer, seller, product, store) • Product(name, maker, category) • We have a new virtual table: NYCview(buyer, seller, product, store) CREATE VIEW NYCview AS SELECT buyer, seller, product, store FROM Person, Purchase WHERE Person.city = 'New York' AND Person.name = Purchase.buyer M.P. Johnson, DBMS, Stern/NYU, Spring 2005

  23. A Different View • Now we can query the view: CREATE VIEW NYCview AS SELECT buyer, seller, product, store FROM Person, Purchase WHERE Person.city = 'New York' AND Person.name = Purchase.buyer SELECT name, NYCview.store FROM NYCview, Product WHERE NYCview.product = Product.name AND Product.category = 'Camera' M.P. Johnson, DBMS, Stern/NYU, Spring 2005

  24. What happens when we query a view? SELECT name, NYCview.store FROM NYCview, Product WHERE NYCview.product = Product.name AND Product.category = 'Camera' SELECT name, Purchase.store FROM Person, Purchase, Product WHERE Person.city = 'New York' AND Person.name = Purchase.buyer AND Purchase.poduct = Product.name AND Product.category = 'Camera' M.P. Johnson, DBMS, Stern/NYU, Spring 2005

  25. Can rename view fields CREATE VIEW NYCview(NYCbuyer, NYCseller, prod, store) AS SELECT buyer, seller, product, store FROM Person, Purchase WHERE Person.city = 'New York' AND Person.name = Purchase.buyer M.P. Johnson, DBMS, Stern/NYU, Spring 2005

  26. Types of Views • Views discussed here: • Used in databases • Computed only on-demand – slow at runtime • Always up to date • Sometimes talk about “materialized” views • Used in data warehouses • Pre-computed offline – fast at runtime • May have stale data • Maybe more later… M.P. Johnson, DBMS, Stern/NYU, Spring 2005

  27. Updating Views How to insert a tuple into a table that doesn’t exist? Employee(ssn, name, department, project, salary) CREATE VIEW Developers AS SELECT name, project FROM Employee WHERE department = 'Development' If we make the following insertion: INSERT INTO Developers VALUES('Bill', ‘Word') It becomes: INSERT INTO Employee(ssn, name, dept, project, sal) VALUES(NULL, 'Bill', NULL, 'Word', NULL) M.P. Johnson, DBMS, Stern/NYU, Spring 2005

  28. Non-Updatable Views • Person(name, city) • Purchase(buyer, seller, product, store) • How can we add the following tuple to the view? ('NYC', 'The Wiz') • We don’t know the name of the person who made the purchase • cannot set to NULL (why?) CREATE VIEW CityStore AS SELECT Person.city, Purchase.store FROM Person, Purchase WHERE Person.name = Purchase.buyer M.P. Johnson, DBMS, Stern/NYU, Spring 2005

  29. Constraints in SQL • A constraint = a property that we’d like our database to hold • The system will enforce the constraint by taking some actions: • forbid an update • or perform compensating updates M.P. Johnson, DBMS, Stern/NYU, Spring 2005

  30. Constraints in SQL simplest Constraints in SQL: • Keys, foreign keys • Attribute-level constraints • Tuple-level constraints • Global constraints: assertions The more complex the constraint, the harder it is to check and to enforce Mostcomplex M.P. Johnson, DBMS, Stern/NYU, Spring 2005

  31. Keys Or: CREATE TABLE Product ( name CHAR(30) PRIMARY KEY, category VARCHAR(20) ) CREATE TABLE Product ( name CHAR(30), category VARCHAR(20) PRIMARY KEY (name) ) M.P. Johnson, DBMS, Stern/NYU, Spring 2005

  32. Keys with Multiple Attributes CREATE TABLE Product ( name CHAR(30), category VARCHAR(20), price INT, PRIMARY KEY (name, category) ) M.P. Johnson, DBMS, Stern/NYU, Spring 2005

  33. Other Keys • There is at most one PRIMARY KEY; there can be many UNIQUE • Primary key v. candidate keys CREATE TABLE Product ( productID CHAR(10), name CHAR(30), category VARCHAR(20), price INT, PRIMARY KEY (productID), UNIQUE (name, category) ) M.P. Johnson, DBMS, Stern/NYU, Spring 2005

  34. Foreign Key Constraints • prodName is a foreign key to Product(name) • name should be a key in Product • Purchase ~ Product is many-one • NB: referenced field specified with parentheses, not dot Referentialintegrityin SQL CREATE TABLE Purchase ( prodName CHAR(30) REFERENCES Product(name), date DATETIME ) M.P. Johnson, DBMS, Stern/NYU, Spring 2005

  35. Product Purchase M.P. Johnson, DBMS, Stern/NYU, Spring 2005

  36. Foreign Key Constraints • Or: • (name, category) must be a key (primary/unique) in Product (why?) CREATE TABLE Purchase ( prodName CHAR(30), category VARCHAR(20), date DATETIME, FOREIGN KEY (prodName, category) REFERENCES Product(name, category) M.P. Johnson, DBMS, Stern/NYU, Spring 2005

  37. What happens during updates? Types of updates: • In Purchase: insert/update • In Product: delete/update Product Purchase M.P. Johnson, DBMS, Stern/NYU, Spring 2005

  38. What happens during updates? • SQL has three policies for maintaining referential integrity: • Reject: violating modifications (default) • Cascade: after a delete/update do a delete/update • Set-null: set foreign-key field to NULL M.P. Johnson, DBMS, Stern/NYU, Spring 2005

  39. Constraints on Attributes and Tuples • Constraints on attributes: • NOT NULL -- obvious meaning... • CHECK condition -- any condition on row itself • Some DBMS support subqueries here, but many don’t • Constraints on tuples • CHECK condition M.P. Johnson, DBMS, Stern/NYU, Spring 2005

  40. How is this different from aForeign-Key? CREATE TABLE Purchase ( prodName CHAR(30) CHECK (prodName IN SELECT Product.name FROM Product), date DATETIME NOT NULL ) M.P. Johnson, DBMS, Stern/NYU, Spring 2005

  41. General Assertions • Supported in SQL standard: • Implemented/approximated in MySQL and Oracle as stored procedures • PL/SQL in Oracle CREATE ASSERTION myAssert CHECK (NOT EXISTS( SELECT Product.name FROM Product, Purchase WHERE Product.name = Purchase.prodName GROUP BY Product.name HAVING count(*) > 200) ) M.P. Johnson, DBMS, Stern/NYU, Spring 2005

  42. Final Comments on Constraints • Can give them names, and alter later • We need to understand exactly when they are checked • We need to understand exactly what actions are taken if they fail M.P. Johnson, DBMS, Stern/NYU, Spring 2005

  43. Example with nulls • look at emp table • get names, salaries, commissions, total salaries • What if commission is null? • nvl in Oracle, ifnull in MySQL M.P. Johnson, DBMS, Stern/NYU, Spring 2005

  44. Live examples • Inner joins require an ON clause • Like a where clause • Arbitrary boolean expression • If always true (1=1), reduces to cross join • New compar op: BETWEEN • a between 5 and 10 a >= 5 and a <= 10 • Q: produce a list of employees with their salary grades • emp, salgrade M.P. Johnson, DBMS, Stern/NYU, Spring 2005

  45. Live examples • Q: produce a list of employees and their bosses • What if no boss? Or no subordinate? • Joins on emp, emp man: • Comma-based • Cross • Natural • Inner • Must use INNER JOIN in MySQL • If want non-managers, do outer join… • No FULL OUTER JOIN in MySQL yet M.P. Johnson, DBMS, Stern/NYU, Spring 2005

  46. Live examples • Q: produce list of bosses and underling-counts, for bosses with >1 underling • Just add HAVING clause… M.P. Johnson, DBMS, Stern/NYU, Spring 2005

  47. Finally: R.A./SQL has limitations • Can easily find Alice’s direct subordinates: • But: find all of King’s underlings • Cannot compute “transitive closure” • Cannot express in R.A./SQL! • SQL is not “Turing-Complete” M.P. Johnson, DBMS, Stern/NYU, Spring 2005

  48. Live examples • Examples from sqlzoo.net SELECT L FROM R1, …, Rn WHERE C PL(sC(R1 x … Rn) M.P. Johnson, DBMS, Stern/NYU, Spring 2005

More Related