1.18k likes | 1.2k Views
Chapter 14 Basics of Functional Dependencies and Normalization for Relational Databases Chapter 15 Relational Database Design Algorithms and Further Dependencies. Functional Dependency Armstrong’s Axioms Closure of FDs Closure of attribute(X) – Find Keys Minimal set of FDs Normalization
E N D
Chapter 14Basics of Functional Dependencies and Normalization for Relational DatabasesChapter 15Relational Database Design Algorithms and Further Dependencies
Functional Dependency • Armstrong’s Axioms • Closure of FDs • Closure of attribute(X) – Find Keys • Minimal set of FDs • Normalization • Lossless Join Decomposition
Outline • 1 Informal Design Guidelines for Relational Databases • 1.1Semantics of the Relation Attributes • 1.2 Redundant Information in Tuples and Update Anomalies • 1.3 Null Values in Tuples • 1.4 Spurious Tuples • 2 Functional Dependencies (FDs) • 2.1 Definition of FD • 2.2 Inference Rules for FDs • 2.3 Equivalence of Sets of FDs • 2.4 Minimal Sets of FDs
Outline 3 Normal Forms Based on Primary Keys 3.1 Normalization of Relations 3.2 Practical Use of Normal Forms 3.3 Definitions of Keys and Attributes Participating in Keys 3.4 First Normal Form 3.5 Second Normal Form 3.6 Third Normal Form 4 General Normal Form Definitions (For Multiple Keys) 5 BCNF (Boyce-Codd Normal Form)
What is relational database design? • The grouping of attributes to form "good" relation schemas • Two levels of relation schemas • The logical "user view" level • The storage "base relation" level • Design is concerned mainly with base relations • What are the criteria for "good" base relations?
1.2 Redundant Information in Tuples and Update Anomalies • Information is stored redundantly • Wastes storage • Causes problems with update anomalies • Insertion anomalies • Deletion anomalies • Modification anomalies
Functional Dependency: Attribute B is functionally dependent on attribute A if at each point in time, each value of A has only one value of B associated with it.
Examples: • SSN ENAME • PNUMBER{PNAME, PLOCATION} • {SSN, PNUMBER}HOURS
functional dependency A functional dependency (X Y) between two sets of attributes X and Y that are subsets of R specifies a constraint on the possible tuples that can form a relation state r of R. The constraint is that, for any two tuples t1 and t2 in r that have t1[X] = t2[X], we must also have t1[Y] = t2[Y].
--X is a candidate keyof R—this implies that XYfor any subset of attributes Y of R --If X Yin R, this does not say whether ornot Y Xin R.
Inference Rules: F = {SSN {ENAME, BDATE, ADDRESS, DNUMBER}, DNUMBER {DNAME,DMGRSSN}} Can infer additional FDs such as: SSN {DNAME, DMGRSSN}, SSN SSN, DNUMBER DNAME
Inference rules (Armstrong’s Axioms) • Given a set of FDs F, we can infer additional FDs that hold whenever the FDs in F hold • Armstrong's inference rules: • IR1. (Reflexive) If Y subset-of X, then X -> Y • IR2. (Augmentation) If X -> Y, then XZ -> YZ • (Notation: XZ stands for X U Z) • IR3. (Transitive) If X -> Y and Y -> Z, then X -> Z • IR1, IR2, IR3 form a sound and complete set of inference rules • These are rules hold and all other rules that hold can be deduced from these
Inference rules (Armstrong’s Axioms) -continued IR4 (decomposition, or projective rule): {X YZ} |= X Y. IR5 (union, or additive rule): {X Y, X Z} |= X YZ. IR6 (pseudotransitive rule) {X Y, WY Z } |= WX Z.
Armstrong’s Axioms • IR1 – Reflexive rule X Y, then X Y ssn, name ssn • IR2 – Augmentation rule X Y |= XZ YZ ssn name |= ssn, dob name,dob
IR3 – Transitive rule X Y, Y Z |= X Z Ssn phone# Phone# State Ssn State
PROOF OF IR4 (USING IR1 THROUGH IR3) IR4 (decomposition, or projective, rule): {X YZ} |= X Y. • X YZ (given). • YZ Y (using IR1 and knowing that YZ Y). • X Y (using IR3 on 1 and 2).
PROOF OF IR5 (USING IR1 THROUGH IR3) IR5 (union, or additive, rule) {X Y, X Z} |= X YZ. XY (given). XZ (given). X XY (using IR2 on 1 by augmenting with X; notice that XX = X). XY YZ (using IR2 on 2 by augmenting with Y). X YZ (using IR3 on 3 and 4).
PROOF OF IR6 (USING IR1 THROUGH IR3) IR6 (pseudotransitive rule) {X Y, WY Z } |= WX Z. X Y (given). WY Z (given). WX WY (using IR2 on 1 by augmenting with W). WX Z (using IR3 on 3 and 2).
Exercise a) {w y, x z} |= {wx y} {w y, x z} |= {wx y} w y Given wx xy Aug(X) xy y Reflexive wx y Transitive STUID DOB C# C_TITLE STUID, C# DOB
Closure of FDs (F+) Set of FDs that could be derived from FDs using Armstrong’s Axioms ssn name ssn dob => Ssn name, dob
closure sets with respect to F (Example) F = { SSNENAME, NUMBER{NAME,PLOCATION}, {SSN, PNUMBER}HOURS } { SSN }+ = { SSN, ENAME } { PNUMBER }+ = { PNUMBER, PNAME, PLOCATION } { SSN, PNUMBER }+ = { SSN, PNUMBER, ENAME, PNAME, PLOCATION, HOURS }
Inference Rules for FDs (3) • Closure of a set F of FDs is the set F+ of all FDs that can be inferred from F • Closure of a set of attributes X with respect to F is the set X+ of all attributes that are functionally determined by X • X+ can be calculated by repeatedly applying IR1, IR2, IR3 using the FDs in F
Closure of attribute X (X+) • Closure of a set of attributes X with respect to F is the set X+ of all attributes that are functionally determined by X Algorithm 16.1Determining X+, the closure of X under F X+ := X Repeat old X+ := X+; for each functional dependency Y Z in F do if X+ Y then X+ := X+ U Z; Until ( X+ = old X+ ); p 310
ExampleR(A,B,C,D,M,F,G) FDs: A B B CD C M D FG A+ = A = AB (A B) = ABCD (B CD) = ABCDM (C M) = ABCDFGM (DFG) A IS A KEY OF THIS RELATION
R(STUID,F,M,L,STREET,CITY,STATE,ZIP,C#,CNAME,GRADE) STUID F,M,L,STREET, CITY,STATE,ZIP C# CNAME ZIP CITY, STATE STUID,C# GRADE (STUID)+ = STUID = STUID, F,M,L,STREET, CITY,STATE,ZIP (C#)+ = C# = C#, CNAME (STUID,C#)+ = STUID,C# =STUID,C#, F,M,L,STREET, CITY,STATE,ZIP = STUID,C#, F,M,L,STREET, CITY,STATE,ZIP ,CNAME = STUID,C#, F,M,L,STREET, CITY,STATE,ZIP ,CNAME,GRADE
Algorithm 16.2 (a): Finding a Key K for R Given a set F of Functional Dependencies Input: A universal relation R and a set of functional dependencies F on the attributes of R. 1.Set K := R; 2.For each attribute A in K { Compute (K - A)+ with respect to F; If (K - A)+ contains all the attributes in R, then set K := K - {A}; }
Find key of R(ABCDFGMN) FDs: AC B B CD C MN D FG Is (AC) a candidate key? (AC)+ = ABCDFGMN A+ = A C+ = CMN (AC)+ = AC = ABC (AC B) = ABCD (B CD) = ABCDMN (C MN) = ABCDMNFG (D FG)
Find key of R(ABCDFGMN) FDs: AC B B CD C MN D FG A C Is (AC) a candidate key? (AC)+ = ABCDFGMN A+ = AC = ABC (AC B) = ABCD (B CD) = ABCDMN (C MN) = ABCDMNFG (D FG)
Find key of R(ABCDFGMN) FDs: A B B CD C MN D FG ABCDFGMN ACDFGMN A B AFGMN A B, B CD AFG A B, B C, C MN A A B, B D, D FG
Find key of R(ABCDFGMN) FDs: AC B B CD C MN D FG A C Is (AC) a candidate key? (AC)+ = ABCDFGMN A+ = C+ =
Example: Patient(Patient#, Pat_Name, Pat_Adr, Med_Exam_Description, Doctor_Number, Med_Exam#, Date_Test, Test_Result, Doctor_Name) • Key:
Example: Hotel_Bill(Guest_Id, Room#, Room_Rate, Date_Checkin, Total_Amount, Guest_Fname, Guest_Lname, InvoiceNumber, Guest_Address) Key:
16.1.3 Minimal Sets of Functional Dependencies F is minimal if: • Every dependency in F has a single attribute for its right-hand side. • We cannot replace any dependency X A in F with a dependency Y A, where Y is a proper subset of X, and still have a set of dependencies that is equivalent to F. • We cannot remove any dependency from F and still have a set of dependencies that is equivalent to F.
Algorithm 16.2 Finding a minimal cover F for a set of functional dependencies E (p550) Input: a set of functional dependencies E. 1. Set F := E. 2. Replace each functional dependency X {A1, A2,…, An} in F by the n functional dependencies XA1, XA2, . . ., XAn . 3. For each functional dependency XA in F for each attribute B that is an element of X if ((F - {XA}) D {(X - {B}) A}) is equivalent to F, then replace XA with (X - {B}) A in F. 4. For each remaining functional dependency XA in F if (F - {XA}) is equivalent to F, then remove XA from F.
Example 1) The set of functional dependencies F of relation R(ABCDEFGHIJKLM) is A A,B,C,D,E B C,D,E,F C E,H D,E F D F,G K L L K a) Find a minimal cover of the set F. b) Find a candidate key for the relation R.
Normalization: The process of decomposing complex data structures into simple relations according to a set of dependency rules. – reduce anomalies (update, insert, delete) first proposed by Codd (1972) Assumption: all non-key fields will be updated frequently - tend to penalize retrieval
Problems with update anomalies • Insertion anomalies • Deletion anomalies • Modification anomalies
EXAMPLE OF AN UPDATE ANOMALY • Consider the relation: • EMP_PROJ(Emp#, Proj#, Ename, Pname, No_hours) • Update Anomaly: • Changing the name of project number P1 from “Billing” to “Customer-Accounting” may cause this update to be made for all 100 employees working on project P1.
EXAMPLE OF AN INSERT ANOMALY • Consider the relation: • EMP_PROJ(Emp#, Proj#, Ename, Pname, No_hours) • Insert Anomaly: • Cannot insert a project unless an employee is assigned to it. • Conversely • Cannot insert an employee unless a he/she is assigned to a project.
EXAMPLE OF AN DELETE ANOMALY • Consider the relation: • EMP_PROJ(Emp#, Proj#, Ename, Pname, No_hours) • Delete Anomaly: • When a project is deleted, it will result in deleting all the employees who work on that project. • Alternately, if an employee is the sole employee on a project, deleting that employee would result in deleting the corresponding project.
15.3 Normalization of Relations (1) • Normalization: • The process of decomposing unsatisfactory "bad" relations by breaking up their attributes into smaller relations • Normal form: • Condition using keys and FDs of a relation to certify whether a relation schema is in a particular normal form
Normalization of Relations (2) • 2NF, 3NF, BCNF • based on keys and FDs of a relation schema • 4NF • based on keys, multi-valued dependencies : MVDs; 5NF based on keys, join dependencies : JDs (Chapter 11) • Additional properties may be needed to ensure a good relational design (lossless join, dependency preservation; Chapter 11)
Definitions of Keys and Attributes Participating in Keys (2) • If a relation schema has more than one key, each is called a candidate key. • One of the candidate keys is arbitrarily designated to be the primary key, and the others are called secondary keys. • A Prime attribute must be a member of some candidate key • A Nonprime attribute is not a prime attribute—that is, it is not a member of any candidate key.
Example – Unnormalized table Product_Order(Cust_Num, Cust_Fname, Cust_Lname, Cust_Phone, Product_Num, Product_Name, Product_Price, Company_Produced_Product, Product_Size, Order_Num, Order_Total_Amount, Order_Date, Qty_Order_Per_Product, Sub_Total_Per_Product) Functional Dependencies: Cust_Num → Cust_Phone, Cust_Lname, Cust_Fname Product_Num → Product_Name, Product_Price, Company_Produced_Product, Product_Size Order_Num → Order_Total_Amount, Order_Date, Cust_Num Order_Num, Product_Num → Qty_Order_Per_Product, Sub_Total_Per_Product Key: Order_Num, Product_Num
Problems: • Anomalies • Insert • Cannot insert a product unless a customer order a product is assigned to it. • Cannot insert an customer unless a he/she places an order. • Update • Changing the name or price of a product may cause this update to be made for all orders that customers placed • Delete • If a customer is the sole customer on a product, deleting that order information would result in deleting the corresponding product information.