880 likes | 1.04k Views
B. Information Technology (Hons.) CMPB245: Database Design. Normalization Pt. 1. Objectives. Explain the purpose of normalization and the problems associated with redundant information Identify the types of update anomalies Explain the concept of functional dependency.
E N D
B. Information Technology (Hons.)CMPB245:Database Design Normalization Pt. 1
Objectives • Explain the purpose of normalization and the problems associated with redundant information • Identify the types of update anomalies • Explain the concept of functional dependency
Normalization:The Purpose Objective 1
Relational Data Analysis • Edgar Codd of IBM, proposed the application of mathematical set theory and algebra to the organization of data • Relational Data Analysis • Application of mathematical principles to the storage and manipulation of data, thereby reducing redundancy and increasing flexibility
Relational Data Analysis • Prior to relational model • Data tended to be stored in a relatively ad-hoc fashion • File structures frequently mirrored the paper documents • Data duplication hence data redundancy was rife, leading to problems with maintenance and flexibility
NormalizationPurpose • To develop a logical data model, we must • create an accurate representation of the data, its relationship and constraints • This is achieved by • creating a set of relations using a technique called Normalization
NormalizationDefinitions • A technique for producing a set of rela-tions with desirable properties, based on the data requirements of the enterprise • A process of producing tables which con-form to specified standards, known as Normal Forms
NormalizationDefinitions • A formal method that identifies relations based on their primary key and the functional dependencies among their attributes • A process of identifying and regrouping attributes so that data duplication and anomalies are avoided
NormalizationInformation Reduncancy • One of the aims of database design is to group attributes into relations to minimize information redundancy • This also reduces the storage space • What sort of redundancy encountered and how can we avoid it?
Staff_No SName Position Salary Branch_No BAddress Tel_No SL21 John White Manager 30000 B5 22 Deer Rd., London 0171-8861212 SG37 Ann Beech Sen. Assist. 12000 B3 163 Main St., Glasgow 0141-3392178 SG14 David Ford Deputy 18000 B3 163 Main St., Glasgow 0141-3392178 SA9 Mary Howe Assistant 9000 B7 16 Argyll St., Aberdeen 01224-67125 SG5 Susan Brand Manager 24000 B3 163 Main St., Glasgow 0141-3392178 SL41 Julie Lee Assistant 9000 B5 22 Deer Rd., London 0171-8861212 NormalizationInformation Reduncancy • Consider the Staff_Branch relation which stores data about staffs working at a parti-cular branch STAFF_BRANCH
STAFF_BRANCH Staff_No SName Position Salary Branch_No BAddress Tel_No SL21 John White Manager 30000 B5 22 Deer Rd., London 0171-8861212 SG37 Ann Beech Sen. Assist. 12000 B3 163 Main St., Glasgow 0141-3392178 SG14 David Ford Deputy 18000 B3 163 Main St., Glasgow 0141-3392178 SA9 Mary Howe Assistant 9000 B7 16 Argyll St., Aberdeen 01224-67125 SG5 Susan Brand Manager 24000 B3 163 Main St., Glasgow 0141-3392178 SL41 Julie Lee Assistant 9000 B5 22 Deer Rd., London 0171-8861212 NormalizationInformation Reduncancy • The details of a branch is repeated for every member of staff located at that branch
Staff_No SName Position Salary Branch_No SL21 John White Manager 30000 B5 SG37 Ann Beech Sen. Assist. 12000 B3 SG14 David Ford Deputy 18000 B3 SA9 Mary Howe Assistant 9000 B7 SG5 Susan Brand Manager 24000 B3 SL41 Julie Lee Assistant 9000 B5 Branch_No BAddress Tel_No B5 22 Deer Rd., London 0171-8861212 B3 163 Main St., Glasgow 0141-3392178 B7 16 Argyll St., Aberdeen 01224-67125 STAFF BRANCH
Update Anomalies:The Types Objective 2
NormalizationUpdate Anomalies • Refers to the inconsistencies of updating (adding, modifying and deleting) informa-tion • Can be classified as • Insertion anomalies • Deletion anomalies • Modification anomalies
Staff_No SName Position Salary Branch_No BAddress Tel_No SL21 John White Manager 30000 B5 22 Deer Rd., London 0171-8861212 SG37 Ann Beech Sen. Assist. 12000 B3 163 Main St., Glasgow 0141-3392178 SG14 David Ford Deputy 18000 B3 163 Main St., Glasgow 0141-3392178 SA9 Mary Howe Assistant 9000 B7 16 Argyll St., Aberdeen 01224-67125 SG5 Susan Brand Manager 24000 B3 163 Main St., Glasgow 0141-3392178 Update AnomaliesInsertion Anomalies (Case 1) • To insert details of new members of staff (in Staff_Branch relation) STAFF_BRANCH
Staff_No SName Position Salary Branch_No BAddress Tel_No SL21 John White Manager 30000 B5 22 Deer Rd., London 0171-8861212 SG37 Ann Beech Sen. Assist. 12000 B3 163 Main St., Glasgow 0141-3392178 SG14 David Ford Deputy 18000 B3 163 Main St., Glasgow 0141-3392178 SA9 Mary Howe Assistant 9000 B7 16 Argyll St., Aberdeen 01224-67125 SG5 Susan Brand Manager 24000 B3 163 Main St., Glasgow 0141-3392178 SL41 Julie Lee Assistant 9000 B5 22 Dear Rd., London 0171-8861222 STAFF_BRANCH Update AnomaliesInsertion Anomalies (Case 1) • Correct details of branch at which the staff are to be located, must be entered
STAFF_BRANCH Staff_No SName Position Salary Branch_No BAddress Tel_No SL21 John White Manager 30000 B5 22 Deer Rd., London 0171-8861212 SG37 Ann Beech Sen. Assist. 12000 B3 163 Main St., Glasgow 0141-3392178 SG14 David Ford Deputy 18000 B3 163 Main St., Glasgow 0141-3392178 SA9 Mary Howe Assistant 9000 B7 16 Argyll St., Aberdeen 01224-67125 SG5 Susan Brand Manager 24000 B3 163 Main St., Glasgow 0141-3392178 Update AnomaliesInsertion Anomalies (Case 2) • To insert details of a new branch that has no staff allocated
STAFF_BRANCH Staff_No SName Position Salary Branch_No BAddress Tel_No SL21 John White Manager 30000 B5 22 Deer Rd., London 0171-8861212 SG37 Ann Beech Sen. Assist. 12000 B3 163 Main St., Glasgow 0141-3392178 SG14 David Ford Deputy 18000 B3 163 Main St., Glasgow 0141-3392178 SA9 Mary Howe Assistant 9000 B7 16 Argyll St., Aberdeen 01224-67125 SG5 Susan Brand Manager 24000 B3 163 Main St., Glasgow 0141-3392178 B5 22 Deer Rd., London 0171-8861212 Update AnomaliesInsertion Anomalies (Case 2) • Nulls must be entered into the attributes for staff, e.g. Staff_No
STAFF_BRANCH Staff_No SName Position Salary Branch_No BAddress Tel_No SL21 John White Manager 30000 B5 22 Deer Rd., London 0171-8861212 SG37 Ann Beech Sen. Assist. 12000 B3 163 Main St., Glasgow 0141-3392178 SG14 David Ford Deputy 18000 B3 163 Main St., Glasgow 0141-3392178 SA9 Mary Howe Assistant 9000 B7 16 Argyll St., Aberdeen 01224-67125 SG5 Susan Brand Manager 24000 B3 163 Main St., Glasgow 0141-3392178 B5 22 Deer Rd., London 0171-8861212 Update AnomaliesInsertion Anomalies (Case 2) • Since Staff_No is the primary key, entering nulls violates entity integrity!
Staff_No SName Position Salary Branch_No BAddress Tel_No SL21 John White Manager 30000 B5 22 Deer Rd., London 0171-8861212 SG37 Ann Beech Sen. Assist. 12000 B3 163 Main St., Glasgow 0141-3392178 SG14 David Ford Deputy 18000 B3 163 Main St., Glasgow 0141-3392178 SL41 Julie Lee Assistant 9000 B5 22 Deer Rd., London 0171-8861212 Update AnomaliesDeletion Anomalies • If a staff is deleted from the Staff_Branch relation that represents the last member of staff at a branch, information about that branch is also lost! STAFF_BRANCH
STAFF_BRANCH Staff_No SName Position Salary Branch_No BAddress Tel_No SL21 John White Manager 30000 B5 22 Deer Rd., London 0171-8861212 SG37 Ann Beech Sen. Assist. 12000 B3 163 Main St., Glasgow 0141-3392178 SG14 David Ford Deputy 18000 B3 163 Main St., Glasgow 0141-3392178 SL41 Julie Lee Assistant 9000 B5 22 Deer Rd., London 0171-8861212 Update AnomaliesModification Anomalies • To change the value of one of the attri-butes of a particular branch, e.g. Tel_ No. • The rows for all staffs located at that branch must be updated as well
Functional Dependency: The Concept Objective 3
Functional DependenciesDefinition • Describes the relationship between attributes in a relation • e.g. if A and B are attributes of a relation, B is functionally dependent on A, if each value of A is associated with exactly one value of B B is functionally B A dependent on A Attribute Attribute
B is functionally B A dependent on A Attribute Attribute Functional DependenciesDeterminant • The determinant of a functional dependency refers to the attribute or group of attributes on the left-hand side of the arrow • A is the determinant of B
Staff_No SName Position Salary Branch_No BAddress Tel_No SL21 John White Manager 30000 B5 22 Deer Rd., London 0171-8861212 SG37 Ann Beech Sen. Assist. 12000 B3 163 Main St., Glasgow 0141-3392178 SG14 David Ford Deputy 18000 B3 163 Main St., Glasgow 0141-3392178 SA9 Mary Howe Assistant 9000 B7 16 Argyll St., Aberdeen 01224-67125 SG5 Susan Brand Manager 24000 B3 163 Main St., Glasgow 0141-3392178 SL41 Julie Lee Assistant 9000 B5 22 Deer Rd., London 0171-8861212 Functional DependenciesExample • Consider the attributes Staff_No and Position of the Staff_ Branch relation STAFF_BRANCH
STAFF_BRANCH Staff_No SName Position Salary Branch_No BAddress Tel_No SL21 John White Manager 30000 B5 22 Deer Rd., London 0171-8861212 SG37 Ann Beech Sen. Assist. 12000 B3 163 Main St., Glasgow 0141-3392178 SG14 David Ford Deputy 18000 B3 163 Main St., Glasgow 0141-3392178 SA9 Mary Howe Assistant 9000 B7 16 Argyll St., Aberdeen 01224-67125 SG5 Susan Brand Manager 24000 B3 163 Main St., Glasgow 0141-3392178 SL41 Julie Lee Assistant 9000 B5 22 Deer Rd., London 0171-8861212 Functional DependenciesExample • For a specific Staff_No, e.g., SL21, we can determine the Position of that staff as Manager 23
STAFF_BRANCH Staff_No SName Position Salary Branch_No BAddress Tel_No SL21 John White Manager 30000 B5 22 Deer Rd., London 0171-8861212 SG37 Ann Beech Sen. Assist. 12000 B3 163 Main St., Glasgow 0141-3392178 SG14 David Ford Deputy 18000 B3 163 Main St., Glasgow 0141-3392178 SA9 Mary Howe Assistant 9000 B7 16 Argyll St., Aberdeen 01224-67125 SG5 Susan Brand Manager 24000 B3 163 Main St., Glasgow 0141-3392178 SL41 Julie Lee Assistant 9000 B5 22 Deer Rd., London 0171-8861212 Functional DependenciesExample • Thus, Position is functionally dependent on Staff_No
STAFF_BRANCH Staff_No SName Position Salary Branch_No BAddress Tel_No SL21 John White Manager 30000 B5 22 Deer Rd., London 0171-8861212 SG37 Ann Beech Sen. Assist. 12000 B3 163 Main St., Glasgow 0141-3392178 SG14 David Ford Deputy 18000 B3 163 Main St., Glasgow 0141-3392178 SA9 Mary Howe Assistant 9000 B7 16 Argyll St., Aberdeen 01224-67125 SG5 Susan Brand Manager 24000 B3 163 Main St., Glasgow 0141-3392178 SL41 Julie Lee Assistant 9000 B5 22 Deer Rd., London 0171-8861212 Functional DependenciesExample • The opposite is not true because Staff_No is not functionally dependent on Position
STAFF_BRANCH Staff_No SName Position Salary Branch_No BAddress Tel_No SL21 John White Manager 30000 B5 22 Deer Rd., London 0171-8861212 SG37 Ann Beech Sen. Assist. 12000 B3 163 Main St., Glasgow 0141-3392178 SG14 David Ford Deputy 18000 B3 163 Main St., Glasgow 0141-3392178 SA9 Mary Howe Assistant 9000 B7 16 Argyll St., Aberdeen 01224-67125 SG5 Susan Brand Manager 24000 B3 163 Main St., Glasgow 0141-3392178 SL41 Julie Lee Assistant 9000 B5 22 Deer Rd., London 0171-8861212 Functional DependenciesExample • There is another member of staff which holds the position of a Manager
Functional DependenciesExample Position is functionally Staff_No Position dependent on Staff_No Staff_No SL21 Manager Determinant of X X Staff_No is not functionally Position Staff_No dependent on Position Staff_No SL21 Manager Staff_No SG5 25
Functional DependenciesIdentification • How to identify functional dependencies? • Analyze each attribute • e.g. Staff_No • Check to see if other attributes in the relation are functionally dependent on it • e.g. see pg. 194, Example 6.2 • Analyze all other attributes
Functional DependenciesIdentification • Group all attributes which are functio-nally dependent on a particular attribute, e.g. Staff_No SName SAddress Position Salary Branch_No Tel_No BAddress Branch_No BAddress
Functional DependenciesExercise • A manufacturing company distributes its products to its customers via distribution depot which keep stocks of its products • Preliminary studies indicate that the following are sample data of orders • Identify all functional dependencies
P1235 203 XYZ Ent. 2 Jln. P/2E 8252211 D-012 Bangi T1209 G107 O876 TUBE GASKET O-RING 500 300 432 189.50 280.90 130.70 Sample Data RELATION NAME CUSTOMER ORDER Customer Order No. Customer No. Customer Name Customer Address Customer Tel. No. Depot No. ATTRIBUTES Depot Name Product No. Product Name Product Quantity Product Price
Customer Order No. Customer No. Customer Name Customer Address Customer Tel. No. Depot No. Depot Name Product No. Product Name Product Quantity Customer No. Customer Name Customer Address Customer Tel. No. Depot No. Depot Name Product No. Product Name Product Price
Unnormalized Form(UNF) Objective 1
Unnormalized Form (UNF) • A table that contains one or more repeating groups • Transfer the information from the source into a table format with rows and columns • The table is in Unnormalized Form (UNF) • See pg. 201
First Normal Form(1NF) Sub-Objective 2a
First Normal Form(1NF) • A relation in which the intersection of each row and column contains one and only one value • To transform the UNF to 1NF we identify and remove repeating groups within the table
First Normal Form(1NF) • Repeating group • A group of attributes that occurs with multiple values for a single occurrence of the nominated key attribute(s) for that table • The nominated key attribute(s) refers to the attribute(s) that uniquely identify each row within the unnormalized table
First Normal Form(1NF) • Two ways to remove repeating group: • First approach • Enter appropriate data in the empty columns of rows containing the repeating data • The resulting table (relation) contains atomic values at the intersection of each row and column
First Normal Form(1NF) • Second approach • Select an attribute or attributes as a key for the unnormalized table • Remove the repeating groups by placing the repeating data, along with a copy of the original key attribute(s) to a separate relation • Identify primary keys for the new relation
Unnormalized FormUNF Example • Examine the DreamHome Customer Rental Details form on pg. 216, Fig. 6.6 • The information in the forms are transformed into a table format (Unnormalized Form) • The key attribute is identified as Customer _No.
Customer_No CName Property_No PAddress RentStart RentFinish Rent Owner_No ONameCustomer_No CName Property_No PAddress RentStart RentFinish Rent Owner_No OName CR76 John Kay PG4 6 Lawrence Street, 1-Jul-93 31-Aug-95 350 CO40 Tina Glasgow PG16 5 Novar Drive, 1-Sep-95 1-Sep-96 450 CO93 Tony Glasgow CR56 Aline Stew PG4 6 Lawrence Street, 1-Sep-92 10-Jun-93 350 CO40 Tina Glasgow PG36 2 Manor Road, 10-Oct-93 1-Dec-94 375 CO93 Tony Glasgow PG16 5 Novar Drive, 1-Jan-95 10-Aug-95 450 CO93 Tony Unnormalized FormUNF Example • Transform the information into a table for-mat (Unnormalized Form) CUSTOMER_RENTAL TABLE
CUSTOMER_RENTAL TABLE Customer_No CName Property_No PAddress RentStart RentFinish Rent Owner_No OName CR76 John Kay PG4 6 Lawrence Street, 1-Jul-93 31-Aug-95 350 CO40 Tina Glasgow PG16 5 Novar Drive, 1-Sep-95 1-Sep-96 450 CO93 Tony Glasgow CR56 Aline Stew PG4 6 Lawrence Street, 1-Sep-92 10-Jun-93 350 CO40 Tina Glasgow PG36 2 Manor Road, 10-Oct-93 1-Dec-94 375 CO93 Tony Glasgow PG16 5 Novar Drive, 1-Jan-95 10-Aug-95 450 CO93 Tony Unnormalized FormUNF Example • Then, identify the repeating group • (Property_No, PAddress, RentStart, RentFinish, Rent, Owner_No., OName)
Customer_No CName Property_No PAddress RentStart RentFinish Rent Owner_No ONameCustomer_No CName Property_No PAddress RentStart RentFinish Rent Owner_No OName CR76 John Kay PG4 6 Lawrence Street, 1-Jul-93 31-Aug-95 350 CO40 Tina Glasgow PG16 5 Novar Drive, 1-Sep-95 1-Sep-96 450 CO93 Tony Glasgow CR56 Aline Stew PG4 6 Lawrence Street, 1-Sep-92 10-Jun-93 350 CO40 Tina Glasgow PG36 2 Manor Road, 10-Oct-93 1-Dec-94 375 CO93 Tony Glasgow PG16 5 Novar Drive, 1-Jan-95 10-Aug-95 450 CO93 Tony Unnormalized FormUNF Example • These attributes repeat for each Customer_ No • e.g. there are two values for Property_No (PG4 and PG16) for customer John Kay
First Normal Form (1NF) Example: First Approach • Remove the repeating group • By entering the appropriate customer data in each row • The primary key is then identified • The primary key is the composite key (Customer_No, Property_No) • The relation is in 1NF • There is a single value at the intersection of each row and column
Customer_No Property_No CName PAddress RentStart RentFinish Rent Owner_No ONameCustomer_No Property_No CName PAddress RentStart RentFinish Rent Owner_No OName CR76 PG4 John Kay 6 Lawrence Street, 1-Jul-93 31-Aug-95 350 CO40 Tina Glasgow CR76 PG16 John Kay 5 Novar Drive, 1-Sep-95 1-Sep-96 450 CO93 Tony Glasgow CR56 PG4 Aline Stew 6 Lawrence Street, 1-Sep-92 10-Jun-93 350 CO40 Tina Glasgow CR56 PG36 Aline Stew 2 Manor Road, 10-Oct-93 1-Dec-94 375 CO93 Tony Glasgow CR56 PG16 Aline Stew 5 Novar Drive, 1-Jan-95 10-Aug-95 450 CO93 Tony First Normal Form (1NF) Example: First Approach CUSTOMER_RENTAL RELATION
First Normal Form (1NF) Example: Second Approach • Remove the repeating group to a separate relation • Add a copy of the original key attribute (Customer _No) • Identify a primary key for each relation Customer (Customer_No, CName) Prop_Rental_Owner (Customer_No, Property_No, Paddress, RentStart, RentFinish,Rent, Owner_No, OName)
First Normal Form (1NF) Example: Second Approach Unnormalized Form (UNF) First Normal Form (1NF) Customer_No CName Property_No PAddress RentStart RentFinish Rent Owner_No OName Which of these attributes are repeating? Remove and group repeating items in a separate relation