1 / 26

RELATIONAL TABLE NORMALIZATION

RELATIONAL TABLE NORMALIZATION. Key Concepts. Guidelines for Primary Keys Deletion anomaly Update anomaly Insertion anomaly Functional dependency Transitive dependency. Key Concepts (cont’d). Multivalued dependency First normal form (1NF) Second normal form (2NF)

hope
Download Presentation

RELATIONAL TABLE NORMALIZATION

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. RELATIONAL TABLE NORMALIZATION

  2. Key Concepts • Guidelines for Primary Keys • Deletion anomaly • Update anomaly • Insertion anomaly • Functional dependency • Transitive dependency

  3. Key Concepts (cont’d) • Multivalued dependency • First normal form (1NF) • Second normal form (2NF) • Third normal form (3NF) • Fourth normal form (4NF) • Domain key normal form (DKNF)

  4. Guidelines for Primary Keys • Guideline 1 • The domain of the primary key should be large enough to accommodate the identification of unique rows for the next 100 years • Guideline 2 • Primary keys should be a unique random collection of alphabetic, numeric or alphanumeric characters

  5. Guidelines for Primary Keys (cont’d) • Guideline 3 • Avoid using smart keys. Primary keys should not contain “fact giving” data. If these facts are necessary, they should be entity attributes • Guideline 4 • Use the suffix ID in constructing primary key names (CUST_ID, Vendor_ID, etc.)

  6. Data Anomalies • Definition of “anomaly”: • Deviation or departure from the normal or common order, form, or rule. • An item that is peculiar, irregular, abnormal, or difficult to classify

  7. Deletion Anomaly • Occurs when the removal of a record results in a lost of important information • For example, if all the information about a customer is contained in the ORDER table, deleting an order also deletes customer information • See Recycled Tractor problem for example

  8. Update Anomaly • Occurs when multiple record changes for a single attribute are necessary when a change to only one record in a database should be necessary. • Example: an evaluator at Recycled Tractor changes his/her cell phone number

  9. Insertion Anomaly • Occurs when there does not appear to be any reasonable place to assign attributes and attribute values to records in a database • Two types of insertion anomalies: • Type 1: Adding new attributes to a record • Type 2: Updating only part of a record

  10. Insertion Anomaly (cont’d) • Type 1 example: • Adding Recycled Tractor evaluator’s home address and phone number to the database

  11. Insertion Anomaly (cont’d) • Type 2 example: • Essence of the Insertion Anomaly problem: when to enter values into the database • Assign the new Recycled Tractor evaluator to a new dummy lead • Or, add new evaluator to all records in LEAD database – can result in lots of null values

  12. Eliminating Data Anomalies • Normalization facilitates the removal of data anomalies • Basic rule of normalization: • The attribute values in a relational table should be functionally dependent on the primary key value

  13. Eliminating Data Anomalies (cont’d) • Corollaries to the basic rule: • No repeating groups are allowed in relational tables • A relational table cannot have attributes involved in a transitive dependency with the primary key

  14. Eliminating Data Anomalies (cont’d) • The different types of dependencies are critical to understanding and executing the normalization process • One of the primary responsibilities of the database designer is to formalize data relationships by identifying the dependencies among the attributes

  15. Functional Dependency • A functionally dependent relationship exists between two attributes when one attribute value implies or determines the value for the other attribute • Example: the value LEAD_NAME determines value of LEAD_BANK in the Recycled Tractor problem • A functional dependency can be reciprocal • Social Security # and Name of person

  16. Transitive Dependency (TD) • Occurs when a nonkey attribute value is functionally dependent on another nonkey attribute value that is not a candidate key

  17. Transitive Dependency (cont’d) • Example: • EMPLOYEE (EMPLOYEE_ID, CATEGORY, HOURLY_RATE) • If JOB_CATEGORY = SUPERVISOR • Then HOURLY_RATE is $25.00 per hour • if JOB_CATEGORY = WELDER • Then HOURLY_RATE is $18.00 per hour • HOURLY_RATE is dependent on JOB_CATEGORY

  18. Multi-Valued Dependency (MVD) • Results from having multiple values for a particular attribute • Three types of MVD’s • Simple • Independent • Transitive

  19. Simple MVD • Similar to 1:N cardinality (one to many) • Most common type of MVD • Examples • A student can register for many courses • LEAD_ID functionally determines many values for TRACTOR_ID

  20. Independent and Transitive MVD’s • Both types involve three or more attributes • Usually eliminated by first three normal forms

  21. First Normal Form (1NF) • A relational table is in first normal form if no attributes form repeating groups • Repeating group attributes are removed by creating another table • In the Recycled Tractor problem, tractor attributes are removed from LEAD and placed in the TRACTOR table, and EVALUATOR attributes are placed in the EAVLUATOR table

  22. Second Normal Form (2NF) • A relational table is in second normal form when all nonkey attributes are functionally dependent on the primary key • Only tables with concatenated (composite) keys will a problem in meeting the 2NF requirement • Does our new EVALUATOR table meet 2NF requirements?

  23. Third Normal Form (3NF) • A relational table is in third normal form when • it is in second normal form • no attribute has a transitive dependency involving nonkey attributes • In the Recycled Tractor problem, TRACKER_PHONE# is functionally dependent on TRACKER_NAME, which is functionally dependent on LEAD_ID • Boyce-Codd normal form adds requirement that all attribute determinants are also candidate keys

  24. Fourth Normal Form (4NF) • A relational table is in fourth normal form when all multivalued dependencies have been removed • In most situations, normalizing tables to third normal form removes multivalued dependencies

  25. Domain-Key Normal Form ( DKNF) • DKNF is a philosophy that focuses on developing themes for tables • A student table contains attributes describing students • A relational table is in DKNF if every constraint on the table or file is the result of defining primary keys for a relational table and defining domains for the attributes • Examples of data constraints: • Edit rules for attributes • Relationships of attributes • Functional and multi-valued dependencies

  26. Comments on Normalization • The benefits of additional levels of normalization decrease rapidly after tables have been put in 3NF • The instances where higher-level normalization strategies are necessary are considered rare and theoretical

More Related