• 510 likes • 678 Views
Chapter 3: Basic SQL. -- Introduction to database principles Maoying Wu ( bi203.sjtu@gmail.com ) March 11, 2013. Outline. Data Definition Language (DDL) Basic Query Structure Set Operations Aggregate functions Null values Nested subqueries Complex queries Views
E N D
Chapter 3: Basic SQL -- Introduction to database principles Maoying Wu (bi203.sjtu@gmail.com) March 11, 2013
Outline • Data Definition Language (DDL) • Basic Query Structure • Set Operations • Aggregate functions • Null values • Nested subqueries • Complex queries • Views • Modification of the Database • Join Operations
Data definition language (DDL) • The schema for each relation (关系模式) • The domain of values associated with each attribute (列类型) • Integrity constraints (完整性约束) • The set of indices to be maintained for each relation (索引) • Security and authorization information for each relation (权限) • The physical storage structure for each relation (存储引擎)
Data Types in SQL • char(n): Fixed length character string, with user-specified length n • varchar(n): Variable length character string with user-specified maximum length n • int: Integer (a finite subset of the integers that is machine-dependent) • smallint: Small integer (a machine-dependent subset of the integer domain type) • numeric(p, d): Fixed point number, with user-specified precision of p digits, with d digits to the right of decimal point • real, double precision: floating point and double-precision floating point number, with machine-dependent precision • float(n): Floating point number, with user-specified precision of at least n digits • To get more details, see the help page for data types in MySQL
Create table CREATE TABLE r(A1 D1, A2 D2, …, AnDn, [integrity-constraint 1], … [integrity-constraint k]); • r is the name of the relation • each Ai is an attribute name in the schema for relation r • Di is the data type for the domain for attribute Ai
Integrity constraints • NOT NULL • PRIMARY KEY (A1, …, Ak)
DROP TABLE DROP TABLE r; • Delete all information for relation r from the database
ALTER TABLE ALTER TABLE r ADD AkDk; • Add attributes to an existing relation • All tuples in the relation are assigned null as the values for the added attribute ALTER TABLE r DROP A; • Drop attributes of a relation • not supported by many DBMS
Basic Queries • SQL is based on set and relational operations with certain modifications and enhancements • A typical SQL query has the form: SELECT A1, A2, …, An FROM r1, r2, …, rm WHERE P; • This query is equivalent to the relational algebra expression:
The SELECT Clause • The SELECT clause lists the attributes for the result of a query • Projection operation in relational algebra • Keyword DISTINCT can be used to eliminate the duplicate records • SELECT DISTINCT loan_numberFROM loan; • Keyword ALL can be used to preserve all the duplicate records • SELECT ALL loan_numberFROM loan;
SELECT Clause (Cont.) • “*”means “all attributes”: • SELECT * FROM loan; • Arithmetic expressions (+,-,*,/) can be involved in the expression following SELECT • SELECTloan_number, amount*100 FROM loan;
The WHERE Clause • WHERE clause specifies conditions the result must satisfy • corresponding to the selection predicate of the relation algebra • Logical operations like AND, OR and NOT can be used to combine multiple conditional expressions SELECTloan_number FROM loan WHEREbranch_name=‘Shanghai’ AND amount>1200;
WHERE Clause • BETWEEN can be used to specified the range SELECTloan_number FROM loan WHERE amount BETWEEN 90000 AND 100000; SELECTloan_number FROM loan WHERE (amount > 90000) AND (amount > 100000);
FROM Clause • FROM specifies the relations involved in the query • corresponding to the Cartesian product operations of the relational algebra SELECT * FROM borrower, loan; • Find the name, loan_number and loan amount of all customers having a loan at Shanghai branch: SELECTcustomer_name, borrower.loan_number, amount FROM borrower, loan WHEREborrower.loan_number = loan.loan_number and branch_name=‘Shanghai’;
AS Clause • Synopsis: old_name AS new_name • corresponding to the renaming operation in relational algebra: SELECTcustomer_name, borrower.loan_numberASloan_id, amount FROM borrower, loan WHEREborrower.loan_number = loan.loan_number; SELECT customer_name, T.loan_number, S.amount FROM borrower AS T, loan AS S WHERE T.loan_number = S.loan_number;
string operations • SQL includes a string-matching operator for comparisons on character string: LIKE • %: matches any string • _: matches any character • Find the names of all customers whose street includes the substring “Main”: SELECTcustomer_name FROM customer WHEREcustomer_streetLIKE ‘%MAIN%’; • If a percentage sign is included in a string, escape character can be used: LIKE ‘MAIN\%’; NOTE: Many more functions can be used in string operations.
ORDER BY: Ordering the results SELECT DISTINCT customer_name FROM borrower, loan WHEREborrower.loan_number = loan.loan_numberANDbranch_name=‘Shanghai’ ORDER BY customer_name; • The default ordering rule is ascending, but we can also use ‘DESC’ to specify the descending order: ORDER BY customer_nameDESC;
Set Operations • Set operations UNION, INTERSECT and EXCEPT correspond to the relational algebra ᴜ,∩,and -. • In MySQL, NO INTERSECTand EXCEPT but we can use IN and NOT IN as an alternative.
Set operations • Find all customers with a loan, an account, or both: (SELECTcustomer_nameFROM depositor) UNION (SELECTcustomer_nameFROM borrower); • Find all customers who have both loan and account (SELECTcustomer_nameFROM depositor) INTERSECT (SELECTcustomer_nameFROM borrower); • Find all customers who have an account but no loan (SELECTcustomer_nameFROM depositor) INTERSECT (SELECTcustomer_nameFROM borrower);
Aggregate functions • AVG: average value • MIN: minimum value • MAX: maximum value • SUM: sum of values • COUNT: number of values
Aggregate functions: examples • Find the average account balance at the Shanghai branch: SELECTavg(balance) FROM account WHEREbranch_name = ‘Shanghai’; • Find the number of tuples in the customer relation SELECTcount(*) FROM customer; • Find the number of depositors in the bank SELECTcount(distinct customer_name) FROM depositor;
Aggregate: GROUP BY • Find the number of depositors for each branch: SELECTbranch_name, count(distinct customer_name) FROM depositor, account WHEREdepositor.account_number=account.account_number GROUP BY branch_name; • NOTE:The attribute names after the GROUP BY should appear after SELECT, while outside the aggregate functions.
HAVINGQualifying Results by Categories • Find the names of all branches where the average account balance is more than 1200 SELECTbranch_name, avg(balance) FROM account GROUP BY branch_name HAVINGavg(balance) > 1200; • Distinctions between HAVING and WHERE • Predicates in HAVING are applied after forming GROUPS, while predicates in WHERE are applied before forming GROUPS.
NULL values • NULL signifies an unknown value or a value that does not exist. • The predicate IS NULL can be used to check for NULL values. SELECTloan_number FROM loan WHERE amount IS NULL; • The result of any arithmetic expression involving NULL is NULL • 1+NULL returns NULL • Many aggregate functions ignore NULL values
NULL (cont.) • Any comparison with NULL returns UNKNOWN • LOGIC involving UNKNOWN: • OR • (UNKNOWN OR true) = true, • (UNKNOWN OR false) = UNKNOWN, • (UNKNOWN OR UNKNOWN) = UNKNOWN • AND • (true AND UNKNOWN) = UNKNOWN, • (false AND UNKNOWN) = false, • (UNKNOWN AND UNKNOWN) = UNKNOWN • NOT • (NOT UNKNOWN) = UNKNOWN • Any UNKNOWN in WHERE is treated as false • IS UNKNOWN is used to evaluate if the expression is UNKNOWN.
NULL value in aggregate • Get the total all loan amounts: SELECTSUM(amount) FROM loan; • Above statement ignores all NULL amounts • Result is NULL is all is NULL. • All aggregate operation except count(*) ignore tuples with NULL values on the aggregated attributes.
Nested subqueries • SQL provides a mechanism for the nesting of subqueries • A subquery is a SELECT-FROM-WHERE expression that is nested within another query • A common use of subqueries is to perform tests for set membership, set comparison, and set cardinality
Example subqueries • Find all customers who have both an account and a loan at the bank. SELECT distinct customer_name FROM borrower WHEREcustomer_namein (SELECT customer_name FROM depositor); • Find all customers who have a loan at the bank but do not have an account at the bank SELECT distinct customer_name FROM borrower WHEREcustomer_nameNOT IN (SELECT customer_name FROM depositor);
Example subqueries • Find all customers who have both an account and a loan at the Shanghai branch SELECT distinct customer_name FROM borrower, loan WHEREborrower.loan_number = loan.loan_number AND branch_name = ‘Shanghai' and (branch_name, customer_name ) IN (SELECTbranch_name, customer_nameFROM depositor, account WHEREdepositor.account_number = account.account_number ); • NOTE: The formulation above is simply to illustrate SQL features.
Set comparisons • SOME (至少存在其中之一) • Find all branches that have more assets than some branch located in Beijing SELECTdistinct T.branch_name FROM branch as T, branch as S WHERET.assets > S.assets and S.branch_city = 'Beijing' ; • Same query SELECTbranch_nameFROM branch WHERE assets > SOME(SELECT assets FROM branch WHEREbranch_city = 'Beijing') ;
Set comparisons • ALL (所有的都) • Find all branches that have more assets than all branch located in Beijing SELECTbranch_name FROM branch WHERE assets > ALL (SELECT assets FROM branch WHEREbranch_city = 'Beijing') ;
Test for empty results • EXISTS returns true if the argument subquery is non-empty • EXISTS(select-clause) • NOT EXISTS(select-clause)
Example query • Find all customers who have an account at all branches located in Beijing. SELECT DISTINCT S.customer_name FROM depositor as S WHERE NOT EXISTS ( (SELECTbranch_nameFROM branch WHEREbranch_city = 'Beijing') and branch_nameNOT IN(SELECTR.branch_nameFROM depositor as T, account as R WHERET.account_number = R.account_number and S.customer_name = T.customer_name )); • Note that
WITH Clause • The WITH clause provides a way of defining a temporary view whose definition is available only to the query in which the WITH clause occurs. • Find all accounts with the maximum balance WITHmax_balance (value) AS SELECT max(balance) FROM account SELECTaccount_number FROM account, max_balance WHEREaccount.balance = max_balance.value;
Views (视图) • Consider a person who needs to know a customer’s name, loan number and branch name, but has no need to see the loan amount. This person should see a relation described, in SQL, by (SELECTcustomer_name, borrower.loan_number, branch_name FROM borrower, loan WHEREborrower.loan_number = loan.loan_number ) • A viewprovides a mechanism to hide certain data from the view of certain users. • Any relation that is not of the conceptual model but is made visible to a user as a “virtual relation” is called a view
Views: Definition • A view is defined using the CREATE VIEW statement which has the form CREATE VIEW v AS <query expression> • <query expression> is any legal SQL expression. • Once a view is defined, the view name can be used to refer to the virtual relation that the view generates. • When a view is created, the query expression is stored in the database; the expression is substituted into queries using the view.
Example using view • A view consisting of branches and their customers: CREATEVIEW all_customerAS (select branch_name, customer_name from depositor, account where depositor.account_number = account.account_number ) union (select branch_name, customer_name from borrower, loan where borrower.loan_number = loan.loan_number ); • Find all customers of the Shanghai branch from view select customer_name from all_customer where branch_name = ‘Shanghai' ;
DELETE • Delete all accounts at the Guangzhou branch DELETE FROM account WHERE branch_name = ‘Guangzhou’; • Delete all accounts at every branch located in Shanghai. DELETE FROM account WHERE branch_name in (SELECT branch_name FROM branch WHERE branch_city=‘Shanghai’);
INSERT • Add a new account INSERTINTO account VALUES (‘A-9732’,’Minhang’, 12500); INSERT INTO account(branch_name, balance, account_number) VAlUES (‘Minhang’, 12500, ‘A-9732’); INSERT INTO account VAlUES (‘A-777’, ‘Minhang’, NULL);
INSERT: Example • Gifts for customers INSERT INTO account select loan_number, branch_name, 200 from loan where branch_name = ‘Jiaoda‘; INSERT INTO depositor select customer_name, loan_number from loan, borrower where branch_name = ‘Jiaoda' and loan.account_number = borrower.account_number;
UPDATE • Increase all accounts with balances over 10,000 by 6%, all other accounts receive 5% increase UPDATE account SET balance = balance * 1.06 WHERE balance > 10000; UPDATE account SET balance = balance * 1.05 WHERE balance <= 10000; • The order is important • Better using the CASE…WHEN statement
UPDATE (Cont.) • CASE WHEN…THEN statement UPDATE account SET balance = CASE WHEN balance <= 10000 THEN balance*1.05 ELSE balance*1.06 END;
JOIN • INNER JOIN • LEFT OUTERJOIN • RIGHT OUTER JOIN • FULL OUTER JOIN
Relations: borrower and loan • loan • borrower • borrower information missing for L-260; • Loan information missing for L-155
JOIN: Examples (1) • loan inner join borroweron loan.loan_number = borrower.loan_number • loan left outer join borrower on loan.loan_number = borrower.loan_number;
JOIN: Examples (2) • loan natural inner join borrower • loan natural right outer join borrower
JOIN: Examples (3) • loan full outer join borrower using (loan_number) • Find all customers who have either an account or a loan (but not both) at the bank: SELECTcustomer_name FROM (depositor natural full outer join borrower) WHEREaccount_number is NULL or loan_number is NULL;