290 likes | 449 Views
Information technology in business and society. Session 16 – SQL Sean J. taylor. Administrativia. Assignment 3: New drop for any updates related to AdSense only Database tutorial led by Varun : Tuesday 3/27 12:30pm-1:45 Assignment 4: Posted on the web due Friday 3/30
E N D
Information technology in business and society Session 16 – SQL Sean J. taylor
Administrativia • Assignment 3: New drop for any updates related to AdSense only • Database tutorial led by Varun:Tuesday 3/27 12:30pm-1:45 • Assignment 4: Posted on the web due Friday 3/30 • Midterm: end of class
Administrativia II • Groups:Please fill out your forms by SUNDAY 3/25 • 2-way feedback:1. Please fill out your surveys!2. I will send you a brief summary of your current grade.
Learning objectives • Be able to query single tables using SQL. • Be able to perform 1:1, 1:M joins to query relational information. • Be able to compute aggregates information using where and having clauses. • Be able to perform N:M joins using a join table.
Review: Relational Databases • Information is stored in tables • Each table contains information about a real word “entity” (e.g., a book, a customer) • Each table contains fields (e.g., BookName, Author, Price) • Each row of the table contains a unique identifier, a.k.a. primary key (e.g., ISBN)
Review: Normalization • A technique for designing relational database tables to minimize duplication of information and to safeguard the database against certain types of logical or structural problems, namely data anomalies. • For example, when multiple instances of a given piece of information occur in a table, the possibility exists that these instances will not be kept consistent when the data within the table is updated, leading to a loss of data integrity. Prevents insertion, deletion and update anomalies. • Rule of thumb: If you find yourself typing the same information in a field again and again, then the design is bad • You will encounter anomalies in the future
SQL: Introduction • What is SQL? • Review: Database and DBMS (chapter 1) • SQL is a standard command language use by relational DBMS to perform database operations • Some facts about SQL • SQL 92 is the most commonly supported version • English-like (not programming) • Case insensitive • Venders have different implementations
SQL Statements • DDL - data definition language • Defining and modifying data structures (metadata): database, tables, views, etc. • DML - data manipulation language • Manipulating user data: insert, update, search, etc. • DCL - data control language • Control data access: permissions, etc.
SELECT Statement • SELECT statement retrieves data from database (query) • The result is usually another table (but not necessarily a relation) • We will learn • Defining selection criteria • Sorting • Calculation • Grouping
SELECT Statement Syntax SELECT Column(s) or other expressions FROM Table(s) [WHERE …] [ORDER BY ColumnName]
Select Columns • Syntax • SELECT * (or a list of columns) FROM TableName • Wild card: * • Example • SELECT * FROM Book • SELECT BookName, PriceFROM Books
Select Rows • Use WHERE clause to specify selection criteria • Example • SELECT * FROM Book WHERE Price= 29.99 • SELECT BookName, PriceFROM Books WHERE Price< 20 • 2.1 Comparison Operators • “=“, “>”, “<“, “>=“, “<=“, “<>”
Data Types in Comparison • Text, String, Memo, etc. – '…' • BookName='database' • BookName>'2008' //alphabetical order • Number, integer, decimal, currency, etc. • Price > 20.99 • Data/Time • Access 2007: >#08/30/2008# //after the date • SQL Server: <'08/30/2008' //before the date
More Comparison Operators • IN (value list) • SELECT * FROM Book WHERE PriceIN (19.99, 29.99, 39.99) • BETWEEN min AND max • SELECT * FROM Book WHERE PriceBETWEEN 9.99 AND 19.99 • IS NULL • SELECT * FROM Book WHERE AuthorIS NULL
String Pattern Match • Fuzzy query using LIKE • _ (underscore): single character wildcard • ? in Access 2007 • % (percentage): multiple character wildcard • * in Access 2007 • Example • SELECT * FROM Book WHERE BookNameLIKE '*information systems*'
NOT • Reversal criteria • NOT (expression) • Examples: • NOT Price > 20 • NOT BookNameLIKE '*information*' • NOT IS NULL or IS NOT NULL
Compound Conditions • Use logical operators to connect multiple conditions • AND: an intersection of the data sets (higher precedence) • OR: a union of the data sets • Best practice: use parentheses () to explicitly mark comparison order • Examples • SELECT * FROM Book WHERE Price<= 19.99 AND Price>= 9.99 • SELECT * FROM Book WHERE PubDate=#10/1/2003# OR PubDate=#10/1/2004# • SELECT * from Book WHERE (Publisher = 'Que' OR Publisher = 'Alpha') AND Price= 29.99
Sorting • Syntax • ORDER BY Column(s) [ASC/DESC] • Examples SELECT * FROM Books ORDER BY PubDate • Multiple columns SELECT * FROM Book ORDER BY Publisher DESC, PubDate
Column Based Calculation • Column based calculation • Calculated (derived) columns are not designed directly into the table • Using +, -, *, / with columns and numbers • Example • SELECT BookName, Price, Price/numberofPagesAS PricePerPage FROM Books • SELECT BookName, Price, Price* 0.1 AS Discount FROM Book WHERE Price* 0.1 >= 15
A Complete Query Example SELECT ISBN, BookName, Price, Publisher FROM Book WHERE BookNamelike '*Information Systems*' AND PubDate > #1/1/2002# AND Price< 100 ORDER BY Price
Row Based Calculation • Using aggregate functions based on rows • MIN (minimum) • MAX (maximum) • COUNT (The number of) • AVG (Average) • SUM (Sum of) • Example • SELECT COUNT(*) FROM Book • SELECT AVG(Price) FROM Book WHERE Publisher = 'Prentice Hall'
Grouping • GROUP BY: doing math with groups • SELECT COUNT(*) FROM Book WHERE Publisher = 'Prentice Hall';SELECT COUNT(*) FROM Book WHERE Publisher = ‘Springer';… • Or: SELECT Publisher, COUNT(*) FROM Book GROUP BY Publisher
Group By … Having • Use “Having” clause to filter aggregation result SELECT Publisher, COUNT(*) FROM Book GROUP BY Publisher Having Count(*) > 2 • Use “where” clause to filter records to be aggregated SELECT Publisher, COUNT(*) as total FROM Book Where Price < 100 GROUP BY Publisher Having Count(*) > 10 Order by Count(*)
Uniqueness • Using the keyword “DISTINCT” • Example: • SELECT DISTINCT Publisher FROM Book
Table Join • Querying data from multiple tables • The query result consists of columns from more than one table • How do rows match? • Specifying matching/joining criteria: usually by the pair of primary key/foreign key, or candidate key/foreign key
Table Join Example ISBN is a Foreign Key here SELECT BookName, DateFROM Book, Order Where Book.ISBN = Order.ISBN Order By Book.ISBN, Order.Date Use “table.column” format to avoid ambiguity Joining/matching criteria: very important, don’t forget!
Joining More Tables SELECT Book.BookName, Order.Date, Customer.LastName FROM Book, Order, Customer WHERE Books.ISBN = Order.ISBNand Order.CustomerID= Customer.CustomerID ORDER BY Books.ISBN, Rating
Next Class:Data Mining • Read “Diamond in the Data Mine” • Work on A4 • Groups form
Midterm Review process • Consult the answer key. • Photocopy the page(s) of your exam that you wish to dispute. • Write why you think you deserve points. • Submit to my mailbox on the 8th floor by Thursday 3/29.