270 likes | 394 Views
Normalization of Databases. Farrokh Alemi Ph.D. Francesco Loaiza Ph.D. J.D. Vikas Arya Updated by Janusz Wojtusiak, Fall 2009. Objectives of Design. Increase efficiency Reduce redundancy Reduce missing data entries Allow users access to data without knowing its location
E N D
Normalization of Databases Farrokh Alemi Ph.D. Francesco Loaiza Ph.D. J.D. Vikas Arya Updated by Janusz Wojtusiak, Fall 2009
Objectives of Design • Increase efficiency • Reduce redundancy • Reduce missing data entries • Allow users access to data without knowing its location • Remove database anomalies. Normalization of Databases
Database Anomalies • Update anomaly • Insertion anomaly • Deletion anomaly Source: wikipedia.org Normalization of Databases
Design Principles • Each Table should correspond to a single Entity or a single Relationship • Rows in the Table should correspond to individual occurrences of that Entity or Relationship • The Primary Key should uniquely identify individual occurrences of the Entity or Relationship Normalization of Databases
Design Principles • Each Table should correspond to a single Entity or a single Relationship • Rows in the Table should correspond to individual occurrences of that Entity or Relationship • The Primary Key should uniquely identify individual occurrences of the Entity or Relationship Normalization of Databases
Design Principles • Each Table should correspond to a single Entity or a single Relationship • Rows in the Table should correspond to individual occurrences of that Entity or Relationship • The Primary Key should uniquely identify individual occurrences of the Entity or Relationship Normalization of Databases
Principles of Design (Continued) • Non-key fields should be facts about the occurrence identified by the Primary Key • Each fact should be represented only once in the database Normalization of Databases
Principles of Design (Continued) • Non-key fields should be facts about the occurrence identified by the Primary Key • Each fact should be represented only once in the database Normalization of Databases
Normalization • Normalization is the process of applying principles of design to data structures so that they conform to out expectations • This lecture covers three formal rules for designing databases • These rules correspond to three, so called, Normal forms Normalization of Databases
Design Flaw Normalization of Databases
First Normal Form • A Table is in first Normal form (1NF) if and only if all fields contain only atomic values and there are no repeating fields within a row • No Composite Fields • A street address is an example of a non-atomic field • No Repeating Groups • If names are listed in two columns under a household, then the field is repeated • There is a primary key. Normalization of Databases
First Normal Form? ID 1 2 3 4 How about primary key? Normalization of Databases
Another Example • Consider a table with invoices of a company Is this table in the first normal form? Normalization of Databases
Example cont. Is this table in the first normal form? Normalization of Databases
Functional Dependency • An Attribute Y is Functionally Dependent on an Attribute X, if a Value for X Determines a Unique Value for Y • X may be a Set of Attributes • Notation: X Y (read X determines Y) Normalization of Databases
Functional Dependency Example Employee Number Employee Name Is it also true that? Employee Name Employee Number Normalization of Databases
Full Functional Dependency • An Attribute Y may be Determined by a Set of Attributes A,B,C (ABC Y) • Let X is a Set of Attributes Such That X Y. If there is no subset Z of X so that Z Y then Y is Fully Functionally Dependent on X Normalization of Databases
Example for Full Functional Dependence Employee Number, Dept Employee Name But Employee Number Employee Name So Employee Name is not Fully Functionally Dependent on Employee Number and Department Normalization of Databases
Second Normal Form • A Table is in Second Normal Form if and only if • it is in the first normal form • all informational fields (facts) are fully functionally dependent on the primary key. Normalization of Databases
Violation of Second Normal Form Invoice PK PK Why the above table is not in the second normal form? Normalization of Databases
Solution to Second Normal Form Invoice InvoiceItem Wait!!! Normalization of Databases
Solution to Second Normal Form Invoice InvoiceItem Item Normalization of Databases
Third Normal Form • A Table is in Third Normal form if and only if • It is in second normal form • there are no combinations of strictly informational fields (not primary key fields) that determine the value of another fields Normalization of Databases
Violation of the Third Normal Form Invoice InvoiceItem Item Normalization of Databases
Violation of the Third Normal Form Invoice InvoiceItem Item Customer Normalization of Databases
Normalization • There are more normal forms • The most commonly used are the three normal forms covered in this lecture • Sometimes it is desired to use denormalization that is converting tables into lower normal forms, for example to perform data analysis. Normalization of Databases
Rules for Good Design • Each table should correspond to a single entity • Each row should correspond to occurrences of the entity • Facts in the table should describe the primary key • Each fact should be represented only once in the database • No composite or repeating fields should be used • No combination of facts should determine the value of another • The primary key should uniquely identify the entity • All facts should be fully functionally dependent on the primary key. Normalization of Databases