180 likes | 308 Views
Normalization. Dr. Mario Guimaraes. Data Normalization. Primarily a tool to validate and improve a logical design so that it satisfies certain constraints that avoid unnecessary duplication of data.
E N D
Normalization Dr. Mario Guimaraes
Data Normalization • Primarily a tool to validate and improve a logical design so that it satisfies certain constraints that avoid unnecessary duplication of data. • The process of decomposing relations with anomalies to produce smaller, well-structured relations. • Primary Objective: Reduce Redundancy,Reduce nulls, • Improve “modify” activities: • insert, • update, • delete, • but not read • Price: degraded query, display, reporting
Functional Dependency and Keys • Functional Dependency: The value of one attribute (the determinant) determines the value of another attribute. • Candidate Key: Each non-key field is functionally dependent on every candidate key.
Normalization – most used • Four most commonly used normal forms are first (1NF), second (2NF) and third (3NF) normal forms, and Boyce–Codd normal form (BCNF). • Based on functional dependencies among the attributes of a relation. • A relation can be normalized to a specific form to prevent possible occurrence of update anomalies.
First Normal Form • No multi-valued attributes. • Every attribute value is atomic. • Why are the following tables not in 1NF Employee (ssn, Name, Salary, Address, ListOfSkills) Department (Did, Dname, ssn)
Second Normal Form • 1NF and every non-key attribute is fully functionally dependent on the primary key. • Every non-key attribute must be defined by the entire key, not by only part of the key. • No partial functional dependencies. Assuming that we have a composite PK (LicensePlate, OwnerSSN) for the Vechicle Table below, why is the table not in 2NF ? Vehicle (LicensePlate, Brand, Model, PurchasePrice, Year, OwnerSSN, OwnerName)
Third Normal Form & BCNF • 2NF and no transitive dependencies (or no functional dependency between non-key attributes = BCNF) Why are the following tables not in 3NF or BCNF ? • Why is Employee [ssn, name, salary, did, dname] • Customer
3NF & BCNF • It is very rare for a Table to be in 3NF and not be in BCNF (violation of BCNF). • Given a Relation R with attributes A, B and C where A and B are together the composite PK, IF A, B -> C and C -> B THEN R is in 3NF and is not in BCNF Example: Student, course -> Instructor Instructor -> Course
Steps in Normalization • 1NF: a table, without multivalued attributes • if not, then decompose • 2NF: 1NF and every non-key attribute is fully functionally dependent on the primary key • if not, then decompose • 3NF: 2NF and no transitive dependencies • if not, then decompose • GENERAL: • Each table should describe a single theme • Modification anomalies are minimized Hint: THE KEY, THE WHOLE KEY AND NOTHING BUT THE KEY
Normalized Tables Must: • Two important properties of decomposition: - Lossless-join property enables us to find any instance of original relation from corresponding instances in the smaller relations. - Dependency preservation property enables us to enforce a constraint on original relation by enforcing some constraint on each of the smaller relations.
4NF • Dependency between attributes (for example, A, B, and C) in a relation, such that for each value of A there is a set of values for B and a set of values for C. However, set of values for B and C are independent of each other.
Matching • 1) A Weak Entity _____ a) skills • 2) A Derived Attribute _____ b) ssn • 3) A Composite Attribute _____ c) Monthly Sales • 4) A Multi-Valued Attribute_____ d) full name • 5) Primary Key _______ e) Section of a Course
Which tables are in 3NF ? Employee (ssn, Name, Salary, pid) (obs.: employee can only work in one project) Employee (ssn, Name, Salary, pid) (obs.: employee can work in many projects) Vehicle (OwnerSSN, OwnerName, OwnerAddress, LicensePlate, Brand, Model, PurchasePrice, Year) (obs.: an owner can have several cars) Project (pid, pname, did) (pid determines did) Project (pid, pname, ssn) (ssn determines pid)
Normalize to 3NF • ClientName -> Address, EmpNo -> Name, Service -> AmountDue • ClientName and Date -> Service, ClientName and Date -> EmpNo
End of Lecture End Of Today’s Lecture.