360 likes | 569 Views
Lecture 3 :Database Analysis and Design (III) Normalization. ITN 170 MySQL Database Programming. Section objectives. Define normalization and explain its benefits. Place tables in Third Normal Form. Explain how conceptual data modeling rules ensure normalized tables. Normalize Tables.
E N D
Lecture 3 :Database Analysis and Design (III)Normalization ITN 170 MySQL Database Programming ITN 170 - Table Normalization
Section objectives • Define normalization and explain its benefits. • Place tables in Third Normal Form. • Explain how conceptual data modeling rules ensure normalized tables. ITN 170 - Table Normalization
Normalize Tables Categorize tables according to their degree of normalization. Normal Form Rule Description The table must be expressed as a set of unordered, two-dimensional tables. The table cannot contain repeating groups. First Normal Form (1NF) The table must be in 1NF. Every non-key column must be dependent on all parts of the primary key. Second Normal Form (2NF) Third Normal Form (3NF) The table must be 2NF. No non-key column may be functionally dependent on another non-key column. ITN 170 - Table Normalization
Normalize Tables “Each non-primary key value MUST be dependent on the key, the whole key, and nothing but the key.” • Normalization minimizes data redundancy. Un-normalized data is redundant. • Data redundancy causes integrity problems. Update and delete transactions may not be consistently applied to all copies of the data causing inconsistencies in the data. • Normalization helps identify missing entities, relationships, and tables. Why normalize tables? ITN 170 - Table Normalization
Normalize Tables In addition to the three normal forms we mentioned, there are some more higher normal forms such as Boyce-Codd normal form, fourth normal form, and fifth normal form. However, they are not widely used in database designs. In general, third normal form is accepted goal for a database design that eliminates redundancy. ITN 170 - Table Normalization
This an Excel table. Looks nice, isn’t it? ITN 170 - Table Normalization
ORDER_ID DATE CUST_ID CUST_NAME STATE ITEM NUM ITEM DESCRIP QUANTITY PRICE 2301 6/23 101 Volleyrite IL 3786 net 3 35.00 4011 racket 6 65.00 9132 3-pack 8 4.75 2302 6/25 107 Herman’s WI 5794 6-pack 4 5.00 2303 6/26 110 We-R-Sports MI 4011 racket 2 65.00 3141 cover 2 10.00 Recognize Un-normalized Data Let’s simplify the Excel Table as follows: Consider the following set of data. The Excel Table is un-normalized. Un-normalized data does not comply with any of the rules of normalization we just mentioned. Three variable length records are shown – one for each ORDER_ID, i.e. 2301, 2302, and 2303. Why is this data un-normalized? ITN 170 - Table Normalization
[Answer]The table contains a repeating group of ITEM NUM, ITEM DESCRIPTION, QUANTITY, and PRICE. Recognize Un-normalized Data ORDER_ID DATE CUST_ID CUST_NAME STATE ITEM NUM ITEM DESCRIP QUANTITY PRICE 2301 6/23 101 Volleyrite IL 3786 net 3 35.00 4011 racket 6 65.00 9132 3-pack 8 4.75 2302 6/25 107 Herman’s WI 5794 6-pack 4 5.00 2303 6/26 110 We-R-Sports MI 4011 racket 2 65.00 3141 cover 2 10.00 Remember, First Normal Form prohibits repeating groups. ITN 170 - Table Normalization
First Normal Form Remove any repeating groups: • Fill the identical data in the empty spaces of the base table (temporarily makes the base table to be a non-repeated group) • Remove the repeating group from the base table • Create a new table with the PK column from the base table and the repeating group. ITN 170 - Table Normalization
ORDER_ID DATE CUST_ID CUST_NAME STATE ITEM NUM ITEM DESCRIP QUANTITY PRICE 2301 6/23 101 Volleyrite IL 3786 net 3 35.00 2301 6/23 101 Volleyrite IL 4011 racket 6 65.00 9132 3-pack 8 4.75 2301 6/23 101 Volleyrite IL 2302 6/25 107 Herman’s WI 5794 6-pack 4 5.00 2303 6/26 110 We-R-Sports MI 4011 racket 2 65.00 3141 cover 2 10.00 2303 6/26 110 We-R-Sports MI First Normal Form (continued) Fill the identical data in the base table to temporarily avoid the repeating group in the base table, for a table with repeating groups is illegal for violating database definition. ITN 170 - Table Normalization
ORDER_ID DATE CUST_ID CUST_NAME ITEM NUM ITEM DESCRIP QUANTITY PRICE STATE 2301 6/23 101 Volleyrite IL 3786 net 3 35.00 2301 6/23 101 Volleyrite IL 4011 racket 6 65.00 9132 3-pack 8 4.75 2301 6/23 101 Volleyrite IL 2302 6/25 107 Herman’s WI 5794 6-pack 4 5.00 2303 6/26 110 We-R-Sports MI 4011 racket 2 65.00 3141 cover 2 10.00 2303 6/26 110 We-R-Sports MI First Normal Form (continued) Remove the repeating group of ITEM NUM, ITEM DESCRIPTION, QUANTITY, and PRICE from the following table to a new table. The PK of the remaining table is ORDER ID. Create a new ORDER_ITEM table with ORDER ID and the repeating group. ITN 170 - Table Normalization
ORDER_ID ITEM NUM ITEM DESCRIP QUANTITY PRICE 2301 3786 net 3 35.00 2301 4011 racket 6 65.00 9132 3-pack 8 4.75 2301 2302 5794 6-pack 4 5.00 2303 4011 racket 2 65.00 2303 3141 cover 2 10.00 First Normal Form (continued) Therefore, we normalize the “Big” table into two “Small” relational tables (ORDER table and ORDER_ITEMtable). ORDER DATE STATE CUST_ID CUST_NAME ORDER_ID 2301 6/23 101 Volleyrite IL 2302 6/25 107 Herman’s WI 2303 6/26 110 We-R-Sports MI ORDER_ITEM ITN 170 - Table Normalization
Second Normal Form Remove any non-key columns that are not dependent upon the table’s entire primary key. • Determine which non-key columns are not dependent upon the table’s entire primary key. • Remove those columns from the base table. • Create second table with those columns and the columns(s) from the PK that they are dependent upon. ITN 170 - Table Normalization
DATE STATE CUST_ID CUST_NAME ORDER_ID 2301 6/23 101 Volleyrite IL 2302 6/25 107 Herman’s WI 2303 6/26 110 We-R-Sports MI Second Normal Form (continued) Put the ORDER table in 2NF. Remember, for a Second Normal Form table. The table must be first in 1NF. Then, every non-key column must be dependent on all parts of the primary key. Is this in 2NF? ITN 170 - Table Normalization
DATE STATE CUST_ID CUST_NAME ORDER_ID 2301 6/23 101 Volleyrite IL 2302 6/25 107 Herman’s WI 2303 6/26 110 We-R-Sports MI Second Normal Form (continued) • The ORDER table is already in 2NF. Any value of ORDER_ID uniquely determines a single value of each column. Therefore, all columns are dependent on the PK ORDER_ID. [Answer] ITN 170 - Table Normalization
ORDER_ID ITEM NUM ITEM DESCRIP QUANTITY PRICE 2301 3786 net 3 35.00 2301 4011 racket 6 65.00 9132 3-pack 8 4.75 2301 2302 5794 6-pack 4 5.00 2303 4011 racket 2 65.00 2303 3141 cover 2 10.00 Second Normal Form (continued) Remove any non-key columns that are not dependent upon the table’s entire primary key. Put the ORDER_ITEM table in 2NF. So, what about this? Still. Remember, for a Second Normal Form table. The table must be first in 1NF. Then, every non-key column must be dependent on all parts of the primary key. ITN 170 - Table Normalization
ITEM NUM ITEM DESCRIP PRICE 3786 net 35.00 4011 racket 65.00 9132 3-pack 4.75 5794 6-pack 5.00 4011 racket 65.00 3141 cover 10.00 Second Normal Form (continued) ORDER_ID QUANTITY 2301 3 2301 6 • No. The ORDER_ITEM table is not in 2NF since PRICE and ITEM DESCRIPTION are dependent upon ITEM NUM, but not dependent upon ORDER ID. 8 2301 2302 4 2303 2 2303 2 [Answer] Now. How to convert to 2NF? ITN 170 - Table Normalization
ITEM NUM ITEM DESCRIP PRICE 3786 net 35.00 ??? 4011 racket 65.00 9132 3-pack 4.75 5794 6-pack 5.00 ORDER_ID ITEM NUM QUANTITY 3141 cover 10.00 2301 3786 3 2301 4011 6 9132 8 2301 2302 5794 4 2303 4011 2 2303 3141 2 Second Normal Form (continued) • To convert the table to 2NF, remove any partially dependent columns. Create an ITEM table with those columns and the column from part of PK columns that they are dependent upon. ITEM ORDER_ITEM ITN 170 - Table Normalization
Third Normal Form Remove any columns that are dependent upon another non-key column. • Determine which columns are dependent upon another non-key column. • Remove those columns from the base table. • Create a second table with those columns and the non-key column that they are dependent upon. ITN 170 - Table Normalization
DATE STATE CUST_ID CUST_NAME ORDER_ID 2301 6/23 101 Volleyrite IL 2302 6/25 107 Herman’s WI 2303 6/26 110 We-R-Sports MI Third Normal Form (continued) ORDER table is already in 2NF as mentioned. Put the ORDER table in 3NF. Is this in 3NF? Remember. Remove any columns that are dependent upon another non-key column ITN 170 - Table Normalization
DATE STATE CUST_ID CUST_NAME ORDER_ID 2301 6/23 101 Volleyrite IL 2302 6/25 107 Herman’s WI 2303 6/26 110 We-R-Sports MI Third Normal Form (continued) • CUSTOMER NAME and STATE are dependent upon CUSTOMER ID. Since you know that CUSTOMER ID is not the PK. Therefore, the ORDER table is not in 3NF. [Answer] ITN 170 - Table Normalization
DATE CUST_ID ORDER_ID STATE CUST_ID CUST_NAME 2301 6/23 101 101 Volleyrite IL 2302 6/25 107 107 Herman’s WI 2303 6/26 110 110 We-R-Sports MI Third Normal Form (continued) • Move the dependent non-key columns with the non-key column they depend upon into a new CUSTOMER table. ORDER CUSTOMER Note: A table is in Third Normal Form if no non-key column is functionally dependent upon another non- key column ITN 170 - Table Normalization
Third Normal Form (continued) No non-key column can be functionally dependent upon another non-key column. Example Consider theORDER_ITEMtable as follows: ORDER_ID ITEM NUM QUANTITY 2301 3786 3 Is this in 3NF? 2301 4011 6 9132 8 2301 2302 5794 4 2303 4011 2 2303 3141 2 ITN 170 - Table Normalization
Third Normal Form (continued) Consider theORDER_ITEMtable as follows: ORDER_ID ITEM NUM QUANTITY • All non-key attributes are dependent on the key, the whole key, and nothing but the key. Therefore, the ORDER_ITEMtable is in 3NF. 2301 3786 3 2301 4011 6 9132 8 2301 2302 5794 4 2303 4011 2 2303 3141 2 [Answer] ITN 170 - Table Normalization
Third Normal Form (continued) No non-key column can be functionally dependent upon another non-key column. Example Consider theITEMtable as follows: ITEM NUM ITEM DESCRIP PRICE What about this? 3786 net 35.00 4011 racket 65.00 9132 3-pack 4.75 5794 6-pack 5.00 3141 cover 10.00 ITN 170 - Table Normalization
ITEM NUM ITEM DESCRIP PRICE 3786 net 35.00 4011 racket 65.00 9132 3-pack 4.75 5794 6-pack 5.00 3141 cover 10.00 Third Normal Form (continued) Consider theITEMtable as follows: • All non-key attributes are dependent on the key, the whole key, and nothing but the key. Therefore, the ITEMtable is in 3NF. [Answer] ITN 170 - Table Normalization
Normalization During data Modeling Ensure a 3NF table design by following the rules of data modeling. • A table must contain no repeating groups [First Normal Form Rule] Example CLIENT #* identifier * date contacted Is this entity CLIENT in 1NF? If not, how could it be converted to 1NF? ITN 170 - Table Normalization
Normalization During data Modeling • [Answer] • The attribute date contacted has multiple values, therefore the entity CLIENT is not in 1NF. • Create an additional entity CONTACT with a M:1 relationship to CLIENT. Create an additional entity and 1:M relationship to ensure 1NF. CLIENT #* identifier CONTACT #* date contacted o location o result for the subject of ITN 170 - Table Normalization
Normalization During data Modeling Validate attribute dependence upon its entity’s entire UID. • Every non-key column must be dependent upon all parts of the primary key. [Second Normal Form Rule] [Corresponding Data Modeling Rule] • An attribute must be dependent upon it entity’s entire unique identifier. ITN 170 - Table Normalization
ACCOUNT #* number o balance o date opened o bank location BANK #* number * name managed by the manager of Normalization During data Modeling Example Are all of the attribute in the E-R diagram dependent upon their entity’s UID? ITN 170 - Table Normalization
Normalization During data Modeling • [Answer] • The attribute bank location is not dependent upon the UID of ACCOUNT. It is dependent upon the UID of BANK. • Move the attribute and place it where it depends upon the UID of it’s entity. ACCOUNT #* number o balance o date opened BANK #* number * name o bank location managed by the manager of ITN 170 - Table Normalization
Normalization During data Modeling Validate attribute placement to ensure a normalized table design. • No non-key column can be functionally dependent upon another non-key column. [Third Normal Form Rule] [Corresponding Data Modeling Rule] • No non-UID attribute can be dependent upon another non-UID attribute. ITN 170 - Table Normalization
Normalization During data Modeling Are any of the non-UID attributes for this entity dependent upon another non-UID attributes? Example ORDER #* id * date of order * customer id * customer name * state ITN 170 - Table Normalization
Normalization During data Modeling • [Answer] • The attributes customer name and state are dependent upon the customer id. • Create another entity called CUSTOMER with a UID of customer id, and place the attributes accordingly. CUSTOMER #* id * name * state ORDER #* id * date of order for the submitter of ITN 170 - Table Normalization