1 / 42

Fundamentals of Database Systems Fourth Edition El Masri & Navathe

University Of Palestine. Fundamentals of Database Systems Fourth Edition El Masri & Navathe. Chapter 10 Functional Dependencies and Normalization for Relational Databases. Database Management CH 10. University Of Palestine. Functional Dependencies

Download Presentation

Fundamentals of Database Systems Fourth Edition El Masri & Navathe

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. University Of Palestine Fundamentals ofDatabase SystemsFourth EditionEl Masri & Navathe Chapter 10 Functional Dependencies and Normalization for Relational Databases

  2. Database Management CH 10 University Of Palestine • Functional Dependencies • Functional dependencies (FDs) are used to specify formal measures of the "goodness" of relational designs. • FDs and keys are used to define normal forms for relations • FDs are constraints that are derived from the meaning and interrelationships of the data attributes.

  3. Database Management CH 10 University Of Palestine • Functional Dependencies • A set of attributes X functionally determines a set of attributes Y if the value of X determines a unique value for Y • XY • X  Y holds if whenever two tuples have the same value for X, they must have the same value for Y. • X  Y in R specifies a constraint on all relation instances r(R).

  4. Database Management CH 10 University Of Palestine • Examples of FD constraints • social security number determines employee name • SSN  ENAME • project number determines project name and location • PNUMBER  {PNAME, PLOCATION} • City name determines ZIP code • City ZIP • Employee’s ssn and project number determines the hours per week that the employee works on the project • {Level, Experience_Years}  Salary

  5. Database Management CH 10 University Of Palestine • Examples of FD constraints • If K is a key of R, then K functionally determines all attributes in R • (since we never have two distinct tuples with t1[K]=t2[K]) • So: • Keys are Special Cases of FDs

  6. Database Management CH 10 University Of Palestine • Inference Rules for FDs • Given a set of FDs F, we can infer additional FDs that hold whenever the FDs in F hold. • Armstrong's inference rules: • IR1.(Reflexive) • If Y ⊇X, then X -> Y • IR2.(Augmentation) • If X  Y, then XZ  YZ • (Notation: XZ stands for X U Z) • IR3.(Transitive) • If X  Y and Y  Z, then X -> Z

  7. Database Management CH 10 University Of Palestine • Inference Rules for FDs (Cont.) • IR1, IR2, IR3 form a sound and complete set of inference rules • Some additional inference rules that are useful: • (Decomposition) • If X YZ, then X Y and X Z • (Union) • If X  Y and X  Z, then X  YZ • (Pseudo transitivity) • If X  Y and WY  Z, then WX  Z • The last three inference rules, as well as any other inference rules, can be deduced from IR1, IR2, and IR3 (completeness property)

  8. Database Management CH 10 University Of Palestine • F Closure • Closure of a set F of FDs is the set F+of all FDs that can be inferred from F. • F+ =F ∪{all FDs that can be logically imply from F using the inference rules} • So F ≡ F+

  9. Database Management CH 10 University Of Palestine • F Closure • Example: • R(A,B,C,D,E,G,H) • F= { BCD, AC, EGA, CEH } • Find F+ ? • F+ = { BCD, AC, EGA, CEH , • BC, BD, • EG, EA, • CE, CH, • BE, • BH, • …. • …. }

  10. Database Management CH 10 University Of Palestine • Attributes Closure • Closure of a set of attributes X with respect to F is the set X+ of all attributes that are functionally determined by X • Given a relation schema R and a set of FDs F that hold on R, • Let X ⊇R ( X is a set of attributes that found in R) • X+ = X ∪ { all attributes that can be derived from X using F}

  11. Database Management CH 10 University Of Palestine • Attributes Closure • Example: • R(A,B,C,D,E,G,H) • F= { BCD, AC, EGA, CEH } • A+ = ACEHG • (CE)+ = CEHGA • B+ = BCDEHGA • Note: Here B Give me all other attributes, So B is a key for R. • So We can Use X+ to find Keys for any relation

  12. Database Management CH 10 University Of Palestine • Minimal Cover of F • Also called Canonical Cover of F and is denoted by Fc • Every set of FDs has an equivalent Fc. • There can be several equivalent minimal sets.

  13. Database Management CH 10 University Of Palestine • Minimal Cover of F • Compute Fc from F • Let Fc F. • Repeat the following steps until no more reduction can be made on Fc • Remove all redundant attributes from X in XY • Ex. AB, ACB (C is redundant) • Remove all redundant attributes from Y in XY • Ex. ABC, BC (C is redundant) • Replace each pair of FDs of the form • XY1, XY2 • By XY1Y2 • Ex. AB, AC • Then ABC

  14. Database Management CH 10 University Of Palestine • Minimal Cover of F • Example • F= { AC, • ACD, • EAD, • EH } • Remove C from ACD • Then Convert AC, AD to ACD • Then Convert EAD, EH to EADH • Remove D from EADH • Then Fc is { ADC, EAH} Find Fc?

  15. Database Management CH 10 University Of Palestine • Problems caused by Bad Database Design • Wastes storage • Causes problems with update anomalies • Insertion anomalies • Deletion anomalies • Modification anomalies

  16. Database Management CH 10 University Of Palestine • EXAMPLE OF AN UPDATE ANOMALY • Consider the relation: • EMP_PROJ(Emp#, Proj#, Ename, Pname, No_hours) • Update Anomaly: • Changing the name of project number P1 from “Billing” to “Customer-Accounting” may cause this update to be made for all 100 employees working on project P1.

  17. Database Management CH 10 University Of Palestine • EXAMPLE OF AN INSERT ANOMALY • Consider the relation: • EMP_PROJ(Emp#, Proj#, Ename, Pname, No_hours) • Insert Anomaly: • Cannot insert a project unless an employee is assigned to it. • Conversely • Cannot insert an employee unless a he/she is assigned to a project.

  18. Database Management CH 10 University Of Palestine • EXAMPLE OF AN DELETE ANOMALY • Consider the relation: • EMP_PROJ(Emp#, Proj#, Ename, Pname, No_hours) • Delete Anomaly: • When a project is deleted, it will result in deleting all the employees who work on that project. • Alternately, if an employee is the sole employee on a project, deleting that employee would result in deleting the corresponding project.

  19. Database Management CH 10 University Of Palestine • Normalization • Dependency Preserving:- • Suppose that the relation R is decomposed into R1,R2, and the set of FDs F that holds on R ar distributed on R1,R2 then: • We can say that the decomposition is dependency preserving iff (F1UF2) ≡F • Otherwise the decomposition is not Dependency Preserving.

  20. Database Management CH 10 University Of Palestine • Normalization (Cont.) • Lossless-join decomposition:- • Suppose that the relation R is decomposed into R1,R2, and the set of FDs F that holds on R ar distributed on R1,R2, If the common attribute between R1 and R2 as a key in either R1 or R2 then: • We can say that the decomposition is Lossless-Join • Otherwise the decomposition is Lossy Join

  21. Database Management CH 10 University Of Palestine • Normalization (Cont.) • Example • Suppose that we have the following relation std_info: • and the following set of FDs that hold on sd_info • F={Sname Major,Phone, • Cname Instructor, • Sname, Cname Score} • Then The Decomposition : • F1={ Sname  Major, Department} F2={Cname instructor • Sname, Cname Score} • This Decomposition is Dependency Preserving and Lossless Join

  22. Database Management CH 10 University Of Palestine • Normalization (Cont.) • Normalization: • The process of decomposing unsatisfactory "bad" relations by breaking up their attributes into smaller relations • Normal form: • Condition using keys and FDs of a relation to certify whether a relation schema is in a particular normal form

  23. Database Management CH 10 University Of Palestine • Practical Use of Normal Forms • Normalization is carried out in practice so that the resulting designs are of high quality and meet the desirable properties • The practical utility of these normal forms becomes questionable when the constraints on which they are based are hard to understand or to detect • The database designers need not normalize to the highest possible normal form • (usually up to 3NF, BCNF or 4NF)

  24. Database Management CH 10 University Of Palestine • Definitions of Keys and Attributes Participating in Keys (1) • A superkeyof a relation schema R = {A1, A2, ...., An} is a set of attributes S subset-of R with the property that no two tuples t1 and t2 in any legal relation state r of R will have t1[S] = t2[S] • A key K is a superkeywith the additional property that removal of any attribute from K will cause K not to be a superkey any more.

  25. Database Management CH 10 University Of Palestine • Definitions of Keys and Attributes Participating in Keys (1) • If a relation schema has more than one key, each is called a candidate key. • One of the candidate keys is arbitrarily designated to be the primary key, and the others are called secondary keys. • A Prime attribute must be a member of some candidate key • A Nonprime attribute is not a prime attribute—that is, it is not a member of any candidate key.

  26. Database Management CH 10 University Of Palestine • First Normal Form • If a relation schema has more than one key, each is called a candidate key. • One of the candidate keys is arbitrarily designated to be the primary key, and the others are called secondary keys. • A Prime attribute must be a member of some candidate key • A Nonprime attribute is not a prime attribute—that is, it is not a member of any candidate key.

  27. Database Management CH 10 University Of Palestine • First Normal Form • Disallows • composite attributes • multivalued attributes • nested relations; attributes whose values for an individual tuple are non-atomic • Considered to be part of the definition of relation

  28. Database Management CH 10 University Of Palestine

  29. Database Management CH 10 University Of Palestine • BCNF (Boyce-Codd Normal Form) • The strongest Normal Form for the relation • Given arelation R and aset of FDs F hold on R, R is in BCNF (i.e. R is a good relation) iff for each FD in the form α -> β in F : α must be a super key. • Each normal form is strictly stronger than the previous one: • Every 2NF relation is in 1NF • Every 3NF relation is in 2NF • Every BCNF relation is in 3NF • There exist relations that are in 3NF but not in BCNF • The goal is to have each relation in BCNF (or 3NF)

  30. Database Management CH 10 University Of Palestine BCNF (Boyce-Codd Normal Form) stdinfo F={ sname  major,phone cname  instructor sname,cname  score } • Each FD on R must satisfy the condition to be in BCNF, otherwise we must divide the relation R into tow relation R1,R2 as follows: • R1(α, β) , R2(R- β) that satisfying the tow conditions • The decomposition is dependency preserving • The decomposition is loosless-join

  31. Database Management CH 10 University Of Palestine • BCNF (Boyce-Codd Normal Form) • Stdinforelation is not in BCNF so we must devide the relation as follows: • R1(sname, major, phone), R2(cname, sname,instructor,score) • The second step is ditrbuting the FDs to R1,R2 • F1={snamemajor,phone}, F2={cnameinstructor • cname,snamescore} • The third step we check the decomposition for dependency preserving and lossless-join conditions. R2 R1 F1 F2

  32. Database Management CH 10 University Of Palestine • BCNF (Boyce-Codd Normal Form) • This decomposition is dependency preserving because (F1UF2)=FD and loosless-join because the common attribute which is Sname is a key for R1. • The forth step is repeating checking for BCNF for the resulting relation R1,R2. • For F1 On R1, α which is • Sname is akey for R1, So • R1 Is in BCNF R1 F1

  33. Database Management CH 10 University Of Palestine BCNF (Boyce-Codd Normal Form) • For R2: • Sname, cname score • α is akeyfor R2 • Cname instructor • α is not akeyfor R2 so,, • We must devide R2 into tow relations R2.1,R2.2 R2 F2 F2={ Sname,Cname->score, Cname->instructor} R2.2 R2.1 F2.1 F2.1 R2.1,R2.2 are both in BCNF Form so finally we have 3 relations which are : R1, R2.1, R2.2 which ar all in BCNF

  34. Database Management CH 10 University Of Palestine Summary of BCNF Checking Check Each FD for the condition α is a key. If any FD fails then we divide the relation into tow relation R1(α , β) , R2(R- β(. Distribute the original FD on R1, R2. Check the decomposition for dependency preserving and loosless-join. Repeat the checking for every resulting relation till having all the relations in BCNF Form.

  35. Database Management CH 10 University Of Palestine BCNF: another example R FD FD={Bname,SnameBnakerName , BankerNameBname,Office#} Now we must check each FD for the BCNF condition: Bname,CnameBankerName : √ ( Bname,Sname is a key) BankerNameBname,Office# : X ( BankerName is not A key) So we must divide R into tow relations R1,R2 R1 R2 F2={ Ø } F1={ BankerName->Bname,Office# }

  36. Database Management CH 10 University Of Palestine • BCNF: another example (Cont.) • Hence BankerName is akey For R1 so the last decomposition is loosless-join, But it is Not dependency Preserving because F1UF2 ≠ FD. • So here we stop and don’t continue with BCNF, we try by 3NF OR 2NF. • Notes: • BCNF has avery restricted condition so not always we have a good decomposition . • BCNF sometimes looses FDs.

  37. Database Management CH 10 University Of Palestine • Third Normal Form 3NF • Given arelation R and aset of FDs F hold on R, R is in 3NF iff for each FD in the form α -> β in F at least one of the following conditions satisfies: • α must be a key for R OR • Each attribute in β is Prime attribute. • FD={Bname,SnameBnakerName, • BankerNameBname,Office#} R FD

  38. Database Management CH 10 University Of Palestine • 3NF (Cont.) • The last relation is not in 3NF Because for the second FD: α is not a key and Office# is not a prime attribute. • So we must divide R into two relations as follows: • R1(α , all non prime attributes in β ) • R2(R- non prime attributes in β ) R1 R2 F1 F2

  39. Database Management CH 10 University Of Palestine • 3NF (Cont.) • F1UF2 = FD So the decomposition is dependency preserving. • BankerName is a key for R1 so The decomposition is loosless join • R1,R2 are in 3NF so finally we have two relations R1,R2 • Important Notes: • 3NF Never ever loose FD. • If R is in BCNF then surely it will be in 3NF • Butt R may be in 3NF but Not in BCNF. R1 R2 F1 F2

  40. Database Management CH 10 University Of Palestine • Second Normal Form 2NF • Given arelation R and aset of FDs F hold on R, R is in 3NF iff for each FD in the form α -> β in F at least one of the following conditions satisfies: • α must be a key for R OR • Each attribute in β is Prime attribute. • α is not a subset of any key

  41. Database Management CH 10 University Of Palestine 2NF: Example FD={ BC->ADE E->G } For F1 BC is a key For F2 E is not a subset of any key So R is in 2NF R FD

  42. Database Management CH 10 University Of Palestine 2NF: Other Example R FD FD={ BC->ADE D->B E->G } F1: BC is akey F2: E is not a subset of any key F3: B is prime attribute So R is in 2NF The decomposition when the test fails is similar to 3NF

More Related