1 / 37

Schema Refinement and Normal Forms Chapter 19

Schema Refinement and Normal Forms Chapter 19. The Evils of Redundancy. Redundancy causes several problems associated with relational schemas: Redundant storage Insert anomalies Delete anomalies Update anomalies What are the problems in the following schema?

brownjose
Download Presentation

Schema Refinement and Normal Forms Chapter 19

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Schema Refinement and Normal FormsChapter 19

  2. The Evils of Redundancy • Redundancycauses several problems associated with relational schemas: • Redundant storage • Insert anomalies • Delete anomalies • Update anomalies • What are the problems in the following schema? Hourly_Emps (ssn, name, lot, rating, hourly_wages, hours_worked)

  3. Example • Hourly_Emps (ssn, name, lot, rating, hourly_wages, hours_worked) • Abbreviate an attribute name to a single letter • Refer to a relation schema by a string of letters, one per attribute. • In the example, hourly_wages is determined by rating. This IC is an example of a functional dependency (RW). • What does it lead to?

  4. Example • Problems: • Redundant storage • Update anomaly: Can we change W in just the 1st tuple of SNLRWH? • Solutions to the anomalies?

  5. Example • Problems: • Insertion anomaly: What if we want to insert an employee and don’t know the hourly wage for his rating? • Solution? • Null values • What if the s is unknow?

  6. Example • Problems: • Deletion anomaly: If we delete all employees with rating 5, we lose the information about the wage for rating 5! • Solutions?

  7. Fighting the Evils of Redundancy • Redundancyis at the root of evil - true or not? • Integrity constraints, in particularfunctional dependencies, can be used to identify schemas with such problems and to suggest refinements. • RW • Decomposition(replace SNLRWH with SNLRH and RW) Wages Hourly_Emps2

  8. Fighting the Evils of Redundancy • RW • Decomposition(replace SNLRWH with SNLRH and RW) • Do we solve all this problems? • Redundant storage • Insert anomalies • Delete anomalies • Update anomalies Hourly_Emps2 Wages

  9. Fighting the Evils of Redundancy • Main refinement technique: decomposition (replacing ABCD with, say, AB and BCD, or ACD and ABD). • Problems related to decomposition: • Is there reason to decompose a relation? • Normal forms help us to decide whether or not to decompose a relation in further. • How? • What problems (if any) does the decomposition cause? • Lossless join decomposition • Dependency-preservation decomposition

  10. Properties of decomposition • Lossless join decomposition • Enable us to recover any instance of the decomposed relation from corresponding instances of the smaller relations. • Dependency-preservation decomposition • Enable us to enforce any constraints on the original relation by simple enforcing some constraints on each of smaller relations. • Key constrains Wages Hourly_Emps2

  11. Good? • List N, W, H • How?

  12. Good DB designer • Normal forms and what problems they do or do not alleviate? • Decomposition and what problems with decompositions. • Questions: • Is a relation in a given normal form? • Is a decomposition dependency preserving?

  13. Functional Dependencies (FDs) • A functional dependencyX Y holds over relation R if, for every allowable instance r of R: • t1 r, t2 r, t1.X =t2.X implies t1.Y =t2.Y • i.e., given two tuples in r, if the X values agree, then the Y values must also agree. (X and Y are sets of attributes.)

  14. Functional Dependencies (FDs) • An FD is a statement about all allowable relations. • Must be identified based on semantics of application. • Given some allowable instance r1 of R, we can check if it violates some FD f, but we cannot tell if f holds over R! • K is a candidate key for R means that K R • However, KR does not require K to be minimal!

  15. FDs- example FD: ABC Question: Do all tuples in the relation satisfy the Functional Dependents ABC Can we add a tuple <a1,b1,c2,d1> to the relation? Why?

  16. exercise • List all the functional dependencies that this relation instance satisfies. • Z → Y, X → Y, and XZ → Y

  17. Example: Constraints on Entity Set • Consider relation obtained from Hourly_Emps: • Hourly_Emps (ssn, name, lot, rating, hrly_wages, hrs_worked) • Notation: We will denote this relation schema by listing the attributes: SNLRWH • This is really the set of attributes {S,N,L,R,W,H}. • Sometimes, we will refer to all attributes of a relation by using the relation name. (e.g., Hourly_Emps for SNLRWH) • Some FDs on Hourly_Emps: • ssn is the key: SSNLRWH • rating determines hrly_wages: RW

  18. Wages Example (Contd.) Hourly_Emps2 • Problems due to RW : • Update anomaly: Can we change W in just the 1st tuple of SNLRWH? • Insertion anomaly: What if we want to insert an employee and don’t know the hourly wage for his rating? • Deletion anomaly: If we delete all employees with rating 5, we lose the information about the wage for rating 5! Will 2 smaller tables be better?

  19. Reasoning About FDs • Given some FDs, we can usually infer additional FDs: • For example: • Workers(ssn,name,lot,did,since) • Ssndid holds, • did lot holds, implies ssnlot • An FD f is implied bya set of FDs F if f holds whenever all FDs in F hold. • F+ = closure of F is the set of all FDs that are implied by a given set F of FDs.

  20. Closure of a set of FDs • Infer or compute the closure of a given set of F of FDs • Following three rules (Armstrong’s Axioms) • Armstrong’s Axioms (X, Y, Z are sets of attributes): • Reflexivity: If XY, then YX • Augmentation: If XY, then XZYZ for any Z • Transitivity: If XY and YZ, then XZ • These are sound and completeinference rules for FDs!

  21. Soundness and Completeness • Completeness: given F, the rules allows us to determine all dependencies in • Soundness: we cannot generate FDs not in • Addition rules: • Union: if XY, XZ then XYZ • Decomposition: if XYZ, then XY and XZ

  22. Soundness and Completeness • R=(A,B,C) and F={AB, BC} • Reflexivity: if Y is subset of X, then XY • Transitivity : if XY and YZ then XZ • AC • Augmentation : if XY then XZYZ for any Z • ACBC • ABBC • ABAC • ABC F+ ={A B, BC, AC, ACBC,ABAC. …}

  23. Reasoning About FDs • FD is a statement about the world as we understand it • Ssndid, did lot implies ssnlot • Is it correct to state that “Since X functionally determines Y, if we know X, we know Y.” Or “if XY, then X identifies Y”

  24. Reasoning About FDs (Contd.) • Couple of additional rules (that follow from AA): • Union: If X  Y and X Z, then XYZ • Decomposition: If XYZ, then XY and XZ • Example: Contracts(cid,sid,jid,did,pid,qty,value), and: • C is the key: CCSJDPQV • Project purchases each part using single contract: JPC • Dept purchases at most one part from a supplier: SDP

  25. Reasoning About FDs (Contd.) • C is the key: CCSJDPQV • Project purchases each part using single contract: JPC • Dept purchases at most one part from a supplier: SDP • JPC, CCSJDPQV imply JPCSJDPQV • SDP implies SDJJP • SDJJP, JPCSJDPQV imply SDJCSJDPQV

  26. Exercise • Consider a relation R with five attributes ABCDE. You are given the following dependencies: A→B, BC→E, and ED→A. • List all keys for R. • CDE, ACD, BCD

  27. Normal Forms • Returning to the issue of schema refinement, the first question to ask is whether any refinement is needed! • If a relation is in a certain normal form(BCNF, 3NF etc.), it is known that certain kinds of problems are avoided/minimized. This can be used to help us decide whether decomposing the relation will help.

  28. Normalization • Often performed as a series of tests on a relation to determine whether it satisfies or violates the requirements of a given normal form. • Four most commonly used normal forms are first (1NF), second (2NF), third (3NF) and Boyce-Codd (BCNF) normal forms. • Based on functional dependencies among the attributes of a relation. • A relation can be normalized to a specific form to prevent the possible occurrence of update anomalies

  29. First Norm Forms(1NF) • There are no repeating or duplication fields. • Each cell contains only a single value. • Each record is unique. • Identified by primary key

  30. Second Normal Forms (2NF) • All non-key fields depend on all components of the primary key. • Guaranteed when primary key is a single field.

  31. Third Normal Forms (3NF) • No non-key field depends on another. • All non-key fields depend only on primary key.

  32. R(ABCD)

More Related