160 likes | 324 Views
Normalization. Database Normalization. Database normalization is the process of removing redundant data from your tables in to improve storage efficiency, data integrity, and scalability.
E N D
Database Normalization • Database normalization is the process of removing redundant data from your tables in to improve storage efficiency, data integrity, and scalability. • In the relational model, methods exist for quantifying how efficient a database is. These classifications are called normal forms (or NF), and there are algorithms for converting a given database between them. • Normalization generally involves splitting existing tables into multiple ones, which must be re-joined or linked each time a query is issued.
History • Edgar F. Codd first proposed the process of normalization and what came to be known as the 1st normal form in his paper A Relational Model of Data for Large Shared Data Banks Codd stated: “There is, in fact, a very simple elimination procedure which we shall call normalization. Through decomposition nonsimple domains are replaced by ‘domains whose elements are atomic (nondecomposable) values.’”
Normal Form • Edgar F. Codd originally established three normal forms: 1NF, 2NF and 3NF. There are now others that are generally accepted, but 3NF is widely considered to be sufficient for most applications.
Table 1 problems • This table is not very efficient with storage. • This design does not protect data integrity. • Third, this table does not scale well.
First Normal Form • 1NF requires NO Repeating Groups • This greatly simplifies searching and management • In Table 1 we have more than one author field, • Also, our subject field contains more than one piece of information.
First Normal Table • Table 2
We now have two rows for a single book but.. • 2NF requires Eliminating Redundant Data • A better solution would be to separate the data into separate tables- an Author table and a Subject table to store our information, removing that information from the Book table:
Subject Table Author Table Book Table
Each table has a primary key, A primary key value must be non-null and unique within the table . • We implement the book – author and book – subject many to many relationships using intersection tables.
Relationships Book_Author Table Book_SubjectTable
Second Normal Form 2NF also says Eliminate Redundancy of data in vertical columns. • We implement the book – publisher one to many relationship as parent and child tables.
2NF Table Publisher Table Book Table
2NF • The other requirement for Second Normal Form is that all non-key attributes must depend on the key. • If we have a composite (multi-attribute) key then All non-key attributes must depend on the entire key.
Third Normal Form • Third normal form (3NF) requires that there are no functional dependencies of non-key attributes on something other than a candidate key. • A table is in 3NF if all of the non-primary key attributes are mutually independent • That is, there are NO transitive dependencies