1 / 52

ALAK ROY. Assistant Professor Dept. of CSE NIT Agartala

CSE-503 :: Database Management System. Normalization. 2. National Institute of Technology Agartala Aug-Dec,2010. ALAK ROY. Assistant Professor Dept. of CSE NIT Agartala. Outline. Functional dependencies Normalization. Normalization – use Functional Dependencies.

lexi
Download Presentation

ALAK ROY. Assistant Professor Dept. of CSE NIT Agartala

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. CSE-503 :: Database Management System Normalization 2 National Institute of Technology Agartala Aug-Dec,2010 ALAK ROY. Assistant Professor Dept. of CSE NIT Agartala

  2. Outline • Functional dependencies • Normalization

  3. Normalization – use Functional Dependencies Product Support Coverage ProductNumber,Date -> Time, EmployeeID, PhoneNumber EmployeeID, Date, Time -> ProductNumber PhoneNumber, Date, Time -> EmployeeID, ProductNumber EmployeeID, Date -> PhoneNumber Date -> PayRate PayRate -> Withholding

  4. ProductNumber Date Time EmployeeID PhoneNumber PayRate Withholding Illustrate FDs Product Coverage Support ProductNumber,Date -> Time, EmployeeID, PhoneNumber EmployeeID, Date, Time -> ProductNumber PhoneNumber, Date, Time -> EmployeeID, ProductNumber EmployeeID, Date -> PhoneNumber Date -> PayRate PayRate -> Withholding

  5. Decompose

  6. Functional Dependencies • A functional dependency is a constraint between two sets of attributes in a relational database. • If X and Y are two sets of attributes in the same relation T, then X  Y means that X functionally determines Y so that • the values of the attributes in X uniquely determine the values of the attributes in Y • for any two tuples t1 and t2 in T, t1[X] = t2[X] implies that t1[Y] = t2[Y] • if two tuples in T agree in their X column(s), then their Y column(s) should also be the same.

  7. FD and Keys • Key constraint is a special kind of functional dependency • K is a superkey for relation schema R if and only if K → R • K is a candidate key for R if and only if • K → R, and • for no α ⊂ K, α → R • Key is on LHS, all attributes are on RHS • ROLL  ROLL, Name, Address • For a key, no two rows share the same values, thus by default, when ever a tuple agrees on LHS it agrees on the RHS.

  8. WELCOME 2ND DAY OF NORMALIZATION

  9. Armstrong’s Axioms of FDs • Reflexivity: If X Y then X  Y (trivial FD) • Name, Address  Name • Augmentation: If X  Y then X Z YZ • If Town  Zip then Town, Name  Zip, Name • Transitivity: If X  Y and Y  Z then X  Z But keep in mind 2 other rules that are useful: • Union: If X → Y and X → Z, then X → YZ • Decomposition: • If X → YZ, then X → Y and X → Z

  10. Soundness • Axioms are sound: • If an expression f: X Ycan be derived from a set of FDs F using the axioms, then f is a FD. We say F entailsf. Completeness • Axioms are complete: • If F entails f , then f can be derived from F using the axioms • As a result, to determine if F entails f, use the axioms in all possible ways to generate F+(the set of possible FD’s is finite so this can be done) and see if f is in F+

  11. Functional Dependency Closure (F+) • Set F of Functional Dependencies (given) Relation: • EmpProj: SSN, Pnumber, Hours, Ename, Pname, Plocation • FDs F: • {SSN → Ename} • Pnumber →{Pname, Plocation} • {SSN, Pnumber} →Hours} • Closures: • {SSN}+ = {SSN, Ename} • {Pnumber}+ = {Pnumber, Pname, Plocation} • F+ • {SSN, Pnumber}+ = {SSN, Pnumber, Ename, Pname, Plocation, Hours}

  12. Generating F+ F AB C AB BCD A D AB BD AB BCDE AB CDE D E BCD  BCDE union decomp aug trans aug Thus, AB BD, AB  BCD, AB  BCDE, and AB  CDE are all elements of F+

  13. Attribute Closure • Calculating attribute closure is a more efficient way of checking entailment • The attribute closure of a set of attributes, X, with respect to a set of functional dependencies, F, (denoted X+F) is the set of all attributes, A, such that X  A • X +F1 is not necessarily the same asX +F2 • Checking entailment: Given a set of FDs, F, then X  Y if and only if X+F  Y (by union & decomposition rule)

  14. Computation of Attribute Closure Example Problem: Compute the attribute closure of AB with respect to the set of FDs : AB  C (a) A  D (b) D  E (c) AC  B (d) Solution: Initially closure = {AB} Using (a) closure = {ABC} Using (b) closure = {ABCD} Using (c) closure = {ABCDE}

  15. Computation of Attribute Closure X+F closure := X; --since X  X+F repeat old := closure; if there is an FD Z  V in F such that Z  closure thenclosure := closure V untilold = closure -- If T  closure then X  T is entailed by F

  16. Example - Computing Attribute Closure X XF+ A {A, D, E} AB {A, B, C, D, E} (Hence AB is a key) B {B} D {D, E} F: AB  C A  D D  E AC  B Is AB  E a FD? Yes Is D C a FD? No Result: XF+ allows us to determine FDs entailed by F of the form X  Y

  17. Normalization

  18. The goal is to remove redundancy based on dependencies

  19. Normal Forms • Each normal form is a set of conditions on a schema that guarantees certain properties (relating to redundancy and update anomalies) • The two commonly used normal forms are • third normal form (3NF) and • Boyce-Codd normal form (BCNF)

  20. Levels of Normalization 1 NF 2 NF 3 NF BCNF

  21. Normal Forms • Considerations: • Relational design by analysis • Normal forms are based on functional dependencies (FDs) • Intuitive, perhaps, but identifying a strictly controlled procedure allows a programmatic process • Should consider 2 additional properties • Lossless join (nonadditive join property) • required • Dependency preservation property • use when possible

  22. Lossless Joins and Dependency Preservation • If relation R and FDs F hold over R, then decomposing R into R1 and R2 is lossless if the closure of F contains either: • FD R1 ∩ R2 -> R1 or • FD R1 ∩ R2 -> R2 • If the closure of the attributes in R1, independent of those attributes in R2, unioned with the closure of attributes of R2, independent of those attributes in R1, are equivalent to the closure F, then dependency is preserved

  23. First Normal Form (1NF) • A relational schema R is in first normal form if the domains of all attributes of R are atomic (Atomicity) • Domain is atomic if its elements are considered to be indivisible units • Non-atomic values complicate storage and encourage redundant (repeated) storage of data • Requirements: • 1NF disallows multivalued attributes, or composite attributes, or their combinations, by requiring only single atomic (indivisible) values in the domain of an attribute

  24. Business Rules Example 1NF • Staffing hours (S) are on a per project activity (activities within projects) basis - AN • Managers (PM) and their departments (D) are assigned to projects (PN) • A department is assigned to a project managers • A project manager is assigned to projects

  25. Prime Attribute: An attribute of relation schema R is called a prime attribute of R if it is a member of some candidate key of R Non Prime Attribute: An attribute of relation schema R is called a non prime attribute of R if it is not a member of any candidate key.

  26. Second Normal Form (2NF) • 2NF based on the concept of full functional dependency. • A functional dependency is full functional dependency if removal of any attribute A from X means that the dependency does not hold anymore; i.e. for any attribute AX, (X-{A}) does not functionally determine Y. • If FD X→Y, removal of A eliminates the FD • Partial functional dependency : • For any attribute AX, (X-{A})→Y • If A can be removed and FD remains, X→Y is a partial functional dependency (a violation of 2NF)

  27. Partial functional dependency (a violation of 2NF) {SSN,PNUMBERHOURS is a full dependency. {SSN,PNUMBER} ENAME is partial because ENO  ENAME holds

  28. Second Normal Form (2NF) • Definition of 2NF: • A relational schema R is in 2NF if every nonprime attribute A in R is fully functionally dependent on the primary key of R.

  29. 2NF 1NF Staffing is on a per project activity ( and activities within projects) basis Managers and their departments are assigned to projects {PN,AN→h} PN →{PM,D}

  30. WELCOME 3rd DAY OF NORMALIZATION

  31. Third Normal Form (3NF) • Definition of 3NF: • A relational schema R is in 3NF if it satisfies 2NFand no nonprime attribute of R is transitively dependent on the primary key. • 3NF is based on the concept of transitive dependency of nonprime attributes on another nonprime attribute. {X→Y,Y→Z}⊨X→Z • Transitive dependencies - is a 3NF violations • LHS of FD should be superkey, or RHS is a prime attribute.

  32. 3NF A department is assigned to a project manager A project manager is assigned to projects PN→PM →D 2NF

  33. 3 NF BCNF Boyce-Codd Normal Form (BCNF) • BCNF is a simpler form of 3NF that is more restrictive. • Every relationship in BCNF is also in 3NF; however 3NF is not necessarily in BCNF. • Definition of BCNF: • A relation schema R is in BCNF if whenever a nontrivial functional dependencies X→A holds in R, then X is a superkey of R. • LHS of a FD should be superkey • Note: • Each attribute is identified by nothing but the key • Sometimes too restrictive, may not be dependency-preserving with regard to closure

  34. BCNF KEY A1 A1 A1 …

  35. New example: LOT 1NF 2NF

  36. New example: LOT 3NF

  37. Let there exist a new FD5; • FD5 violates BCNF in LOTS1A because AREA is not a superkey of LOTS1A • FD5 satisfy 3NF in LOTS1A because COUNTRY_NAME is a prime attribute, but this condition does not exist in definition of BCNF • So decompose LOTS1A in to LOTS1AX and LOTS1AY FD5

  38. 3NF to BCNF

  39. Other dependences and normal forms (not commonly used) • Multivalued dependences (4NF) • If X and Y are subsets of attributes of relation schema R: • MVD X↠Y independent of the values in other attributes • R is in 4NF if for every MVD X↠Y that holds over R, one of the following is true: • YX or XY = R (trivial MVD), or • X is a superkey

  40. MVD An employee can be assigned to any project and, within those projects, to any activities, but the assignments are consistent for that employee A project or activity can have any number of employees assigned to it EN→P EN→A

More Related