180 likes | 277 Views
Normal Forms (Part 1). Steven Le ~ CS157B. Normalization is a systematic way of ensuring that a database structure is suitable for general-purpose querying and free of certain undesirable characteristics—insertion, update, and deletion anomalies—that could lead to a loss of data integrity.
E N D
Normal Forms (Part 1) Steven Le ~ CS157B
Normalization is a systematic way of ensuring that a database structure is suitable for general-purpose querying and free of certain undesirable characteristics—insertion, update, and deletion anomalies—that could lead to a loss of data integrity.
In English: • Normalization is a well-defined way to structure a database so that weird things don’t happen to your data when you use it
Normal Forms • Edgar Codd, the inventor of the relational model, also introduced the concept of Normalization and Normal Forms (NF) • Normal Forms are criteria for determining a table's degree of vulnerability to logical inconsistencies and anomalies
Edgar Codd • Created First Normal Form (1NF) in 1970 • Created Second Normal Form (2NF) and Third Normal Form (3NF) in 1971 • With Raymond Boyce, created Boyce-Codd Normal Form (BCNF) in 1974
There also exists other Normal Forms invented by other people. These include 4NF, 5NF, and 6NF • We will be looking at 1NF, 2NF, and 3NF
Normal Form Properties • The higher the Normal Form a table has, the less vulnerable to inconsistencies and anomalies it is • A table always meets the requirements of its Highest Normal Form (HNF), as well as all the requirements of all lower Normal Forms • A table does not meet the requirements of Normal Forms higher than its HNF
Achieving a Normal Form is NOT step-wise i.e. 1NF -> 2NF -> 3NF • Rather, you design the table to be around 3NF on the first try. This is usually the minimum for a table to be considered “Normalized”. Once there, it should be easier to achieve higher Normal Forms
Vocab Overview • Superkey • Candidate Key • Primary Key • Prime Attribute • Dependency
Superkey • a set of attributes of a relation variable for which it holds that in all relations assigned to that variable there are no two distinct tuples (rows) that have the same values for the attributes in this set
Candidate/Primary Key • Candidate Key: Superkey with no proper subsets that are also superkeys • Primary Key: Candidate Key with no NULL values • Prime Attribute: Attribute part of the Candidate Key
Dependency • X → A: The attribute set “X” determines set “A” • Trivial Functional Dependency: X → Y is called trivial if Y is a subset of X
First Normal Form • A table is said to be in First Normal Form if it faithfully represents a relation • No repeating groups
Second Normal Form • A table is 2NF if given any candidate key and any attribute that is not a constituent of a candidate key, the non-key attribute depends upon the whole of the candidate key rather than just a part of it
Third Normal Form • A table is in 3NF if for each of its functional dependencies X → A, at least one of the following conditions holds: • X contains A (that is, X → A is trivial functional dependency), or • X is a superkey, or • A is a prime attribute (i.e., A is contained within a candidate key)
End • Applause