140 likes | 300 Views
Normalization—Topics. Functional Dependency Candidate Keys Normalization 1 st Normal Form 2 nd Normal Form 3 rd Normal Form Boyce-Codd, 4 th & 5 th Normal Forms. Functional Dependency.
E N D
Normalization—Topics • Functional Dependency • Candidate Keys • Normalization • 1st Normal Form • 2nd Normal Form • 3rd Normal Form • Boyce-Codd, 4th & 5th Normal Forms
Functional Dependency • Functional dependency refers to the relationship between nonkey attributes and the primary key (or other attributes) of an entity • Attribute A is functionally dependent on Attribute B if, having specified a value for Attribute B, there is only one true value for Attribute A • E.g., if we identify a student by specifying their SSN there is only one current value of last name • Note that time dependency may come into play but we will handle that as with any time-dependent data
Functional Dependency (cont.) • An attribute may be functionally dependent on a composite set of attributes • E.g., the name of a course is functionallydependent on the course identified by Department Code and Course Number
Functional Dependency (cont.) • Note that we don't define functional dependency to mean that there is only one possible value of the functionally dependent attribute • A student can change their name or address • A product can get a new price • Identifying functional dependencies is a critical step in the normalization that follows • We must achieve a structure in which each attribute in an entity is functionally dependent on only the PK and on the whole PK
Candidate Keys • We use the term "Candidate Key" to indicate that an entity may have multiple attributes or sets of composite attributes that could be primary keys (uniquely identify each record) • What are the candidate keys in the entity above? • One candidate key is always selected as the PK • Other(s) become alternate keys (AK) • Uniqueness must be enforced • May not be null (usually) • More on alternate keys later
Normalization • Normalization is the process of distributing attributes among entities to achieve an efficient DB structure • Ensures entity satisfies six properties of relations(earlier lecture) • Ensure functional dependency on PK • A fully normalized database has tremendous storage and retrieval advantages over a poorly designed DB • We will look (later) at selectively denormalizing entities to achieve performance efficiencies
Normalization (cont.) • Normalization consists of six steps • 1st Normal Form (1NF) • 2nd Normal Form (2NF) • 3rd Normal Form (3NF) • Boyce-Codd NormalForm (BCNF) • 4th Normal Form (4NF) • 5th Normal Form (5NF)
Normalization (cont.) • Normalization is applied to entities • An entity is said to be in 3NF, for example, if it has been checked for compliance with the rules of 3NF and any necessary corrections applied • Each normalization step is applied in order • You do not attempt to put an entity in 2NF until it has been checked and corrected to be in 1NF • Etc. • Corrections to normalization problems almost always result in the creation of new entities and relationships
First Normal Form • "Place entities in 1NF by removing any repeating attributes" • Repeating attribute are multivalued attributes in another guise and are problems for the same reason • The definition doesn't tell you what to do with them • Handle as with multivalued attributes • Create new entity • PK is PK of original entity plusthe repeated attribute
Second Normal Form • "Place entities in 2NF by placing them in 1NF and then ensuring that each nonkey attribute is fully functionally dependent on the whole primary key" • Remove the offending attributes to a newentity • PK is the portion of the PK on whichthe problem attribute is dependent • 2NF problems cannot exist in an entity with a single-attribute PK
Third Normal Form • "Place entities in 3NF by placing them in 2NF and removing any transitive dependencies" • A 'transitive dependency' is an attribute that is functionally dependent on another nonkey attribute (or set ofattributes) which, in turn, is functionally dependent on the PK • The problem attribute(s) aredependent on the PK but only because the intermediary attribute is dependent
Third Normal Form (cont.) • If the value of the PK changes (i.e., we move to a new record) the value of the dependent attributes (as well as all other attributes) could change • The problem arises because changing the value of the nonkey intermediary attribute also forces a change to the dependent attributes • May also be a problem with redundant storage
Third Normal Form (cont.) • Place entities in 3NF by removing the offending attributes to a new entity • The intermediate attribute(s) will become the PK of the new entity • New PK will remain in original entity as a foreign key • Attributes with transitive dependency will be removed to the new entity
Other Normal Forms • Boyce-Codd, 4NF, and 5NF deal with unusual circumstances • Boyce-Codd NF requires that each attribute be functionally dependent on any candidate key, not just the selected PK • 4NF requires that independently multivalued composite PK attributes be removed to two new parent entities • 5NF requires the removal of pairwise cyclic dependencies within composite primary keys with three or more component attributes