450 likes | 627 Views
Information Resources Management. March 13, 2001. Agenda. Administrivia Normalization Homework #7 Mid-Term #2. Administrivia. Homework #4 Homework #5 Homework #6 Quiz 2 Mid-Term #1 Keys Mid-Term Grades. Regrade Requests HW 5 & 6. Create Database Enter query(s) as submitted
E N D
Information Resources Management March 13, 2001
Agenda • Administrivia • Normalization • Homework #7 • Mid-Term #2
Administrivia • Homework #4 • Homework #5 • Homework #6 • Quiz 2 • Mid-Term #1 Keys • Mid-Term Grades
Regrade Requests HW 5 & 6 • Create Database • Enter query(s) as submitted • Submit to me • Database (electronic) • Graded homework (paper) • Reserve the right to change test data and reexecute query
Normalization • Why & What • 1st Normal Form • 2nd Normal Form • 3rd Normal Form • Boyce-Codd Normal Form • 4th Normal Form
Normalization - Why • Eliminate anomalies • Avoid duplication • Increase flexibility and stability • Reduce maintenance
Normalization - What?!? • Analysis of functional dependencies between attributes • Building several smaller tables from larger ones • Decomposing relations with anomalies to produce smaller, well-structured relations • Reducing complexity & increasing stability
Normalization - What (2) • Series of Steps • Recipe for constructing a “good” physical model of a database from a logical model • Applied to all existing tables, including ones produced by earlier normalization steps
Example Sales (Order#, Date, CustID, Name, Address, City, State, Zip, {Product#, ProductDesc, Price, QuantityOrdered}, Subtotal, Tax, S&H, Total) • What are the problems with using a single table for all order information?
Problems • Implementing Repeating Groups • Duplication of Data (customer name & address) • Unnecessary Data (subtotal, total, tax) • Others Normalization is a process to eliminate these problems.
1st Normal Form • Eliminate Repeating Groups • 1st Normal Form has no repeating groups • Create definition with all other attributes, remove the repeat {}, and change the primary key to include the “key” for the repeating group.
Example Sales (Order#, Date, CustID, Name, Address, City, State, Zip, Product#, ProductDesc, Price, QuantityOrdered, Subtotal, Tax, S&H, Total) • Why is this better?
1st NF Improvements • Implementation is possible • Querying is possible
2nd Normal Form • Remove all partial functional dependencies • 2nd Normal Form has no partial functional dependencies and is in 1st Normal Form • Partial dependencies get their own tables -- original table gets a foreign key
Partial Functional Dependencies • An attribute is only dependent on part of the primary key • must be composite key • single attribute key is 2nd NF • Functional dependencies can be specified explicitly but usually come from the E-R model, user specifications, and common sense key non-key attributes
Example - Functional Dependencies Order# Date, CustID, Name, Address, City, State, Zip, Subtotal, Tax, S&H, Total Order#, Product# ProductDesc, Price, QuantityOrdered CustID Name, Address, City, State, Zip Product# ProductDesc, Price Which are partial functional dependencies?
Example Sales (Order#, Date, CustID, Name, Address, City, State, Zip, Subtotal, Tax, S&H, Total) OrderLine (Order#, Product#, ProductDesc, Price, QuantityOrdered) • Is this 2nd NF?
Example Sales (Order#, Date, CustID, Name, Address, City, State, Zip, Subtotal, Tax, S&H, Total) OrderLine (Order#, Product#, QuantityOrdered) Product (Product#, ProductDesc, Price) • Is this 2nd NF? Why is this better than 1st NF?
2nd NF Improvements • Elimination of Duplicate Data • No Loss
3rd Normal Form • Eliminate transitive functional dependencies • 3rd Normal Form has no transitive depencencies and is in 2nd Normal Form • Transitive dependencies get their own tables -- original table gets a foreign key
Transitive Functional Dependencies • Attribute is dependent on another, non-key attribute or attributes • Attribute is the result of a calculation CustID ® Name, Address, City, State, Zip
Example Sales (Order#, Date, CustID, Subtotal, Tax, S&H, Total) OrderLine (Order#, Product#, QuantityOrdered) Product (Product#, ProductDesc, Price) Customer (CustID, Name, Address, City, State, Zip) • Is this 3rd NF? Why is this better than 2nd NF?
Example Sales (Order#, Date, CustID) OrderLine (Order#, Product#, QuantityOrdered) Product (Product#, ProductDesc, Price) Customer (CustID, Name, Address, City, State, Zip) • Is this 3rd NF? Why is this better than 2nd NF?
3rd NF Improvements • Elimination of Duplicate Data • No Loss • Data is Well-grouped
Beyond 3rd Normal Form • Assume we also want to track information about products, builders, and finishes • The following are the functional dependencies: • Product, Finish ® Builder • Builder ® Finish
Beyond 3rd Normal Form ProdFinish (Product#, {Finish, Builder}) becomes ProdFinish (Product#, Finish, Builder) Is this 3rd NF?
What’s wrong with 3rd NF? • Product, Finish ® Builder • Builder ® Finish
What’s Wrong with 3rd NF? What happens when: 1. Vera is replaced by Vern? 2. Vera is rehired to work with Oak? 3. Product #3 in pine is discontinued?
What’s Wrong with 3rd NF? • Problems 1. Multiple changes need to be made 2. Can’t assign a builder without a product 3. Lose information that Marv works in Pine
Problem & Solution Problem: • Builder ® Finish • Builder is not a key Solution: • Boyce-Codd Normal Form
Boyce-Codd Normal Form (BCNF) • Every determinant in a relation (LHS of the FD’s) is a candidate key and 3rd NF • Make determinant part of the key and that which is dependent on it an attribute and renormalize
Example ProductFinish (Product#, Builder, Finish) Is this BCNF? Hint: Is it 3rd NF?
Example ProductFinish (Product#, Builder) Builder (Builder, Finish) Is there anything wrong with this?
Example ProductBuilder (Product#, Builder) Builder (Builder, Finish) Normalization often results in the need to rename tables so the table name matches the actual contents.
Beyond BCNF • Normalization with separate repeating groups can result in other anomalies CustService (State, {SalesPerson}, {Delivery})
Beyond BCNF CustService (State, SalesPerson, Delivery) Is this BCNF?
Beyond BCNF • Everything is in the key -- must be BCNF • Still problems with duplication • Multivalued Dependencies
Multivalued Dependency • At least three attributes (A, B, C) • A ® B and A ® C • B and C are independent of each other (they really shouldn’t be in the same table)
4th Normal Form • No multivalued dependencies and BCNF • Create separate tables for each separate functional dependency
Example SalesForce (State, SalesPerson) Delivery (State, Delivery)
Beyond 4th Normal Form • 5th Normal Form • Project-Join Normal Form • Domain Key Normal Form (DKNF)
User View 1st NF 2nd NF 3rd NF BCNF 4th NF Remove partial functional dependencies Remove repeating groups Remove remaining functional dependency anomalies Remove transitive functional dependencies Remove multivalued dependencies
In-Class Exercises • Identify the current normal form • If not 4th NF, transform to 4th NF
Homework #7 • Normalization • Database schema from HW #3 • Earlier due date - post key?
Mid-Term #2 • Next week, 3/20 • Topics • Converting an E-R Diagram to a physical database schema • Normalizing that schema (3NF) • SQL • Identification of BCNF, 4NF problems