990 likes | 1k Views
Chapter 8: Relational Database Design Normalization in Databases. Chapter 8: Relational Database Design. Features of Good Relational Design Atomic Domains and First Normal Form (1NF) Second Normal Form (2NF) Third Normal Form (3NF). Combine Schemas?.
E N D
Chapter 8: Relational Database DesignNormalization in Databases
Chapter 8: Relational Database Design • Features of Good Relational Design • Atomic Domains and First Normal Form (1NF) • Second Normal Form (2NF) • Third Normal Form (3NF)
Combine Schemas? • Suppose we combine instructor and department into inst_dept • (No connection to relationship set inst_dept) • Result is possible repetition of information
A Combined Schema Without Repetition • Consider combining relations • sec_class(sec_id, building, room_number) and • section(course_id, sec_id, semester, year) into one relation • section(course_id, sec_id, semester, year, building, room_number) • No repetition in this case
What About Smaller Schemas? • Suppose we had started with inst_dept. How would we know to split up (decompose) it into instructor and department? .
What About Smaller Schemas? • Write a rule “if there were a schema (dept_name, building, budget), then dept_name would be a candidate key” • Denote as a functional dependency: dept_namebuilding, budget • In inst_dept, because dept_name is not a candidate key, the building and budget of a department may have to be repeated. • This indicates the need to decompose inst_dept • Not all decompositions are good. Suppose we decomposeemployee(ID, name, street, city, salary) into employee1 (ID, name) employee2 (name, street, city, salary) • The next slide shows how we lose information -- we cannot reconstruct the original employee relation -- and so, this is a lossy decomposition.
Example of Lossless-Join Decomposition • Lossless join decomposition • Decomposition of R = (A, B, C) R1 = (A, B) R2 = (B, C) A B C A B B C 1 2 A B 1 2 1 2 A B r B,C(r) A,B(r) A B C A (r) B (r) 1 2 A B
Normal Forms • 1NF • 2NF • 3NF • Other… • Not covered.
Normal Forms: Review • Unnormalized – There are multivalued attributes or repeating groups • 1 NF – No multivalued attributes or repeating groups. • 2 NF – 1 NF plus no partial dependencies • 3 NF – 2 NF plus no transitive dependencies
First Normal Form • Domain is atomic if its elements are considered to be indivisible units • Examples of non-atomic domains: • Set of names, composite attributes • Identification numbers like CS101 that can be broken up into parts • A relational schema R is in first normal form if the domains of all attributes of R are atomic • Non-atomic values complicate storage and encourage redundant (repeated) storage of data • Example: Set of accounts stored with each customer, and set of owners stored with each account
First Normal Form (Cont’d) • Atomicity is a property of how the elements of the domain are used. • Example: Strings would normally be considered indivisible • Suppose that students are given roll numbers which are strings of the form CS0012 or EE1127 • If the first two characters are extracted to find the department, the domain of roll numbers is not atomic. • Doing so is a bad idea: leads to encoding of information in application program rather than in the database.
Example 1: Table Not Violating 1NF It violates other normal forms, though.
Example 2: Table Not Violating 1NF It violates other normal forms, though.
Types of Normalization for 1NF • Properties • each field contains the smallest meaningful value • the table does not contain • repeating groups of fields or, • repeating data within the same field • Remedies • Create a separate field/table for each set of related data. • Identify each set of related data with a primary key
Tables Violating First Normal Form Really Bad Set-up! Better, but still flawed!
Second Normal Form – 2NF • Usually used in tables with a multiple-field primary key (composite key) • Each non-key field relates to the entire primary key • Any field that does not relate to the primary key is placed in a separate table • MAIN POINT – • Eliminate redundant data in a table • Create separate tables for sets of values that apply to multiple records
Table Violating 2NF Where is the problem?
Tables Conforming to 2NF PART_STOCK TABLE ∞ WAREHOUSE TABLE 1
Third Normal Form – 3NF • Usually used in tables with a single- field primary key • Records do not depend on anything other than a table's primary key • Each non-key field is a fact about the key • Values in a record that are not part of that record's key do not belong in the table. • Rule of Thumb • In general, any time the contents of a group of fields may apply to more than a single record in the table, consider placing those fields in a separate table.
Table Violating 3NF EMPLOYEE_DEPARTMENT TABLE The underlying problem is the transitive dependency to which the DeptName attribute is subject. DeptName actually depends on WORKDEPT, which in turn depends on the key EmpNO.
Tables Conforming to Third Normal Form EMPLOYEE TABLE ∞ DEPARTMENT TABLE 1
A Note on 2NF • A table may have multiple candidate key. • A functional dependency on part of any candidate key is a violation of 2NF. • It is necessary to establish that no non-prime attributes have part-key dependencies on any of these candidate keys.
Example Candidate Key PK Example taken from Wikipedia: http://en.wikipedia.org/wiki/Second_normal_form
Example Electric Toothbrush Manufacturers Electric Toothbrush Models
Example 1 • Un-normalized Table:
Table in First Normal Form • No Repeating Fields • Data in Smallest Parts
Is table in 2NF? • What is the key?
Is table in 2NF? • What is the key? • What do we notice? • Advisor fields depend on Student#
Tables in Second Normal Form • Redundant Data Eliminated Table: Registration Table: Students
Tables Registration in 2NF • Who about the Students? Table: Registration Table: Students What is the candidate key for Students?
Table: Advisors • Tables in 2NF. Table: Registration Table: Students
Advisors Advisor# AdvFirstName AdvLastName Adv-Room Students Student# Advisor# Registration Student# Class# Relationships for Example 1
Example 2 • Un-normalized Table:
Table in First Normal Form What is the candidate key?
Tables in Second Normal Form Table: Employees and Projects Table: Employees Are they in 3NF? The underlying problem is the transitive dependency to which the Dept Name attribute is subject. Dept Name actually depends on Dept Code, which in turn depends on the key EmpID.
Tables in Third Normal Form Table: Employees_and_Projects Table: Employees Table: Departments
Departments DeptCode DeptName Employees EmpID FirstName LastName DeptCode Employees_and_Projects EmpID ProjectNumber TimeonProject Relationships for Example 2
Example 3 • Un-normalized Table:
Table in First Normal FormFields contain smallest meaningful values
Table in First Normal FormNo more repeated fields What is the candidate key? Is the table in 2NF?
Second/Third Normal FormRemove Repeated Data From Table Step 1
Tables in Second Normal Form Removed Repeated Data From Table Step 2 We look for the transitive dependency.
Tables in Second Normal Form How about 3NF? Step 3 We look the transitive dependency. If I know Dept, then I know ManagerID and Sector. If I know EmpID then I know Dept.
Tables in Third Normal Form Employees Table Manager Table Dependents Table Department Table