330 likes | 517 Views
BIS 360 – Lecture Eight. Ch. 12: Database Design and Normalization. Objectives. Where are we? ER Diagram Transformation Why DB Normalization? Database Anomalies Normalization Theory 1NF - 3NF. 1. Database Design (Ch. 12) 2. Form and Report Design (Ch. 13) 3. Interface Design (Ch. 14).
E N D
BIS 360 – Lecture Eight Ch. 12: Database Design and Normalization
Objectives • Where are we? • ER Diagram Transformation • Why DB Normalization? • Database Anomalies • Normalization Theory • 1NF - 3NF
1. Database Design (Ch. 12) 2. Form and Report Design (Ch. 13) 3. Interface Design (Ch. 14) Project ID and Selection Project Initiation & Planning Analysis Logical Design Maintenance Physical Design Implementation Where we are now
E-R Model DFD Model Information Reports Bills Charts... Database Applications Why we design a database
DB Design – Where we start • From the Conceptual Data Model (ERD) • examine the diagram for accuracy – entities, attributes,relationships, and cardinalities • Transform this Conceptual Data Model into Logical Data Model (i.e., Relational Data Model) • a set of flat tables (relations)
ER Diagram Transformation Step 1: For each regular entity type E, create a relation R that includes all the simple attributes. A unique identifier of E will be a primary key of R E EMPLOYEE PK: EmpID R
ER Diagram Transformation Step 2: For each weak entity type W with identifying entity type E, create a relation Rw with all the attributes of W and also create a relation Re for E as explained in Step 1. Then, include the primary key of Re as a foreign key of Rw. A combination of the partial identifier of W and the unique identifier of E will be a primary key of Rw. E W EMPLOYEE DEPENDENT PK: EmpID Re PK: EmpID + DpdSSN FK: EmpID Rw
ER Diagram Transformation Step 3: When there is a 1:1 relationship between entity type S and T, create relations Rs and Rt as explained in step 1. Include a primary key of S as a foreign key of T. S T is managed by EMPLOYEE DEPARTMENT manages PK: EmpID Rs PK: DeptID FK: EmpID Rt
ER Diagram Transformation Step 4: When there is a 1:M relationship between entity types S and T, create relations Rs and Rt as explained in step 1. If T is an entity type at the MANY-side, include a primary key of Rs as a foreign key of Rt. S T works for DEPARTMENT EMPLOYEE hires PK: DeptID Rs PK: EmpID FK: DeptID Rt
ER Diagram Transformation Step 5: When there is a M:N relationship P between entity types S and T and there is no property associated with this relationship P, create relations Rs and Rt as explained in step 1. Also create a relation Rp to represent the relationship and include primary keys of Rs and Rt as foreign keys of Rp. A combination of primary keys of Rs and Rt will be a primary key of Rp. S T WAREHOUSE PRODUCT PK: WhID PK: ProdID Rs Rt PK: WhID + ProdID FK: WhID, ProdID Rp
ER Diagram Transformation Step 6: When there is a M:N relationship P between entity types S and T and there are some properties associated with this relationship P, create relations Rs, Rt, and Rp as explained in step 5. All properties associated with the relationship P will be the non-key attributes of Rp. SchID SSN Name Attends Name SCHOOL STUDENT Type Phone ZIP ZIP Date Degree STUDENT ( SSN , Name , Phone , Zip ) SCHOOL ( SchID , Name , Type , Zip ) STU_SCH ( SSN , SchID , Date , Degree )
CID Name Phone . . . ZIP Complex ERD TransformationExample of (M:N) Relationship Same customer may order same product many times Date ProdID Desc. Order CUSTOMER PRODUCT U_Price . . . Units Qty ORDER (CID , ProdID , Date, Units ) Q: Are you happy with the above design?
CID Name Phone . . . ZIP Complex ERD TransformationExample of (M:N) Relationship Same customer may order same product many times Date ProdID Desc. Order CUSTOMER PRODUCT U_Price . . . Units Qty ORDER(OrderID, CID , ProdID , Date , Units ) Create a unique primary key – OrderID
Why DB Normalization? • To avoid database processing errors (anomalies) • To verify the relations derived from the ER diagram – each derived relation would be at least in 3rd normal form (3NF)
Database Anomalies Anomalies -- Data errors occurred during or after the processing of data Three types of anomalies • Insertion anomaly - the difficulty in adding new data due to the poor design of a relation • Deletion Anomaly - unintentional data loss due to the deletion of some data • Update Anomaly- data become inconsistent after some data were updated
Insertion Anomaly EMPLOYEE-PROJECT Insert new employees
Insertion Anomaly EMPLOYEE-PROJECT Insert new projects
Deletion Anomaly EMPLOYEE-PROJECT Delete project A Delete employee # 1
Update Anomaly EMPLOYEE-PROJECT Update employees # 1
Normalization TheoryBasic Concept - Functional Dependency Functional Dependency (FD):A relationship between attributes of an entity. FD is the foundation of Normalization. Notation:ab - value of a uniquely determines the value of b • a is a “determinant” • a functionally determines b • b is functionally dependent on a
Name Phone DOB SSN EmpID Normalization TheoryFunctional Dependency - Examples Interpretation: either SSN or EmpID can uniquely determine his/her Name, Phone, and DOB, but not the reverse! Both SSN and EmpID are candidate keys You can choose one of them as a PK EMPLOYEE ( SSN , EmpID , Name , Phone , DOB ) or EMPLOYEE ( SSN , EmpID , Name , Phone , DOB )
# of doors Color Type VIN Normalization TheoryFunctional Dependency - Examples Interpretation: VIN can uniquely determine a vehicle’s # of doors, Color, and Type, but not the reverse! VIN is the only candidate key and it is used as a PK VEHICLE ( VIN , # of doors , Color , Type )
Database NormalizationWhere is the beef? • Reality is not that simple ! • All candidate keys, including PK, are the determinant • But, determinant may not be the candidate key Q: What is the difference between a candidate key and a determinant? A: They are similar, but not the same - Scope
Database NormalizationWhere is the beef? Call # CourseID Title Classroom • Call # is a candidate key and is used as a PK • Call # is also a determinant of CourseID, Title, and Classroom • CourseID is a determinant of Title Q: Should we put these four attributes on the same table (relation)? A: No !! We need database normalization
Database NormalizationBasic Ideas Unnormalized 1st NF (1NF) 2nd NF (2NF) 3rd NF (3NF)
Database NormalizationNormal Form Definitions • A relation is in its first normal form (1NF) if it does not contain repeating groups. • A relation is in its second normal form (2NF) if every non-primary key attribute is fully dependent on the (whole) primary key. • A relation is in its third normal form (3NF) if it has no transitive dependency between non-key attributes.
Repeating groups Major Phone Major SID Name Sex DOB Phone Major Normalization First Normal Form (1NF) 1NF:A relation is in itsfirst normal form(1NF) if it does not contain repeating groups STUDENT ( SID , Name , Sex , DOB ) STU_PHONE ( SID , Phone ) STU_MAJOR ( SID , Major )
Second Normal Form(2NF) 2NF:A relation is in itssecond normal form(2NF) if it is in 1NF and every non- primary key attribute is fully functionally dependent on the (whole) primary key JOB_HIST ( EmpID , Name , Position , SDate , Dept ) • 123, Jim, Line crew, 01/01/96, Factory • 123, Jim, Supervisor, 01/01/99, Factory • 211, John, Sales Rep, 09/01/94, MKT • 211, John, Sales Manager, 01/01/98, MKT • 235, Joe, Accountant, 07/01/96, Acct A combination of EmpID and SDate is the only candidate key and is used as a PK (EmpID, SDate)Name , Position , Dept EmpIDName , Dept Q: Do we see any partially functional dependency?
Normalization Second Normal Form(2NF) JOB_HIST ( EmpID , Name , Position , SDate , Dept ) JOB_HIST ( EmpID , SDate , Position ) EMPLOYEE ( EmpID , Name , Dept )
Comments on 2NF Verification You don’t need to worry about whether a relation is in its 2NF if its PK includes only one attribute (Why?) Because Partially functional dependency only occurs when the PK is a composite (compound) key
Third Normal Form(3NF) 3NF:A relation is in itsthird normal form(3NF) if it is in 2NF and there is no transitive dependency between non-key attributes in the relation Transitive dependency: If a b , and b c , then there is a transitive dependency between a and c EMPLOYEE ( EmpID , Name , Phone , Office , Street , City , State , Zip ) EmpIDName , Phone , Office , Street , City , State , Zip PhoneOffice ZipCity , State Q: Do you see the transitive dependency?
Normalization Third Normal Form(3NF) EMPLOYEE ( EmpID , Name , Phone , Office , Street , City , State , Zip ) EMPLOYEE ( EmpID , Name , Phone , Street , Zip ) PHONE ( Phone , Office ) ZIP ( Zip , City , State )