340 likes | 713 Views
Schema Refinement. Chapter 19 (part 1) Functional Dependencies. The Evils of Redundancy. Redundancy is at the root of several problems associated with relational schemas: redundant storage, insert/delete/update anomalies Main refinement technique: decomposition
E N D
Schema Refinement Chapter 19 (part 1) Functional Dependencies
The Evils of Redundancy • Redundancyis at the root of several problems associated with relational schemas: • redundant storage, • insert/delete/update anomalies • Main refinement technique: • decomposition • Example: replacing ABCD with, say, AB and BCD, • Functional dependeny constraints utilized to identify schemas with such problems and to suggest refinements.
Insert Anomaly Student Question : How do we insert a professor who has no students? Insert Anomaly: We are not able to insert “valid” value/(s)
Delete Anomaly Student Question : Can we delete a student that is the only student of a professor ? Delete Anomaly: We are not able to perform a delete without losing some “valid” information.
Update Anomaly Student Question : How do we update the name of a professor? Update Anomaly: To update a value, we have to update multiple rows. Update anomalies are due to redundancy.
Functional Dependencies (FDs) • A functional dependencyX Y holds over relation R if, for every allowable instance r of R: t1 r, t2 r, (t1) = (t2) implies (t1) = (t2) • Given two tuples in r, if the X values agree, then the Y values must also agree.
FD Example Student • Suppose we have FD sName address • for any two rows in the Student relation with the same value for sName, the value for address must be the same • i.e., there is a function from sName to address
Keys + Functional Dependencies • Assume K is a candidate key for R • What does this imply about FD between K and R? • It means that K R ! • Does K R require K to be minimal ? • No. Any superkey of R also functionally implies all attributes of R.
Example: Constraints on Entity Set • Consider relation obtained from Hourly_Emps: • Hourly_Emps (ssn, name, lot, rating, hrly_wages, hrs_worked) • Notation: • We denote relation schema by its attributes: SNLRWH • This is really the set of attributes {S,N,L,R,W,H}. • Some FDs on Hourly_Emps: • ssn is the key: S SNLRWH • rating determines hrly_wages: R W
Problems Caused by FD • Problems due to Example FD : • rating determines hrly_wages: R W
Example rating (R) determines hrly_wages (W) • Problems due to R W : • Update anomaly: Can we change W in just the 1st tuple of SNLRWH? • Insertion anomaly: What if we want to insert an employee and don’t know the hourly wage for his rating? • Deletion anomaly: If we delete all employees with rating 5, we lose the information about the wage for rating 5! Hourly_Emps
Wages Same Example Hourly_Emps2 Solution : 2 smaller tables instead of one big one ! • Problems due to R W ! Hourly_Emps
Wages Same Example Hourly_Emps2 • Problems due to R W ! Will 2 smaller tables be better than one big one? • Update anomaly: Can we change W in just the 1st tuple of SNLRWH? • Insertion anomaly: What if we want to insert an employee and don’t know the hourly wage for his rating? • Deletion anomaly: If we delete all employees with rating 5, we lose information about wage for rating 5?
Reasoning About FDs • Given some FDs, we can usually infer additional FDs: • ssn did, did lot implies ssn lot
Properties of FDs • Consider A, B, C, Z are sets of attributes Armstrong’s Axioms: • Reflexive (also trivial FD): if A B, then A B • Transitive: if A B, and B C, then A C • Augmentation: if A B, then AZ BZ These are sound and completeinference rules for FDs! Additional rules (that follow from AA): • Union: if A B, A C, then A BC • Decomposition: if A BC, then A B, A C
Closure of FDs • An FD f is implied bya set of FDs F if f holds whenever all FDs in F hold. • = closure of F is set of all FDs that are implied by F. • Computing closure of a set of FDs can be expensive. • Size of closure is exponential in # attrs!
Reasoning About FDs (Contd.) • Instead of computing full closure F+ of a set of FDs • Too expensive • Typically, we just need to know if a given FD X Y is in closure of a set of FDs F. • Algorithm for efficient check: • Compute attribute closureof X (denoted ) wrt F: • Set of all attributes A such that X A is in • There is a linear time algorithm to compute this. • Check if Y is in X+ . If yes, then X Y in F+.
Algorithm for Attribute Closure • Computing the closure of set of attributes {A1, A2, …, An}, denoted {A1, A2, …, An}+ • Let X = {A1, A2, …, An} • If there exists a FD B1, B2, …, Bm C, such that every Bi X, then X = X C • Repeat step 2 until no more attributes can be added. • Output X+ = {A1, A2, …, An}+
Another Example : Inferring FDs • Consider R (A, B, C, D, E) • with FDs F = { A B, B C, CD E } • does A E hold ? • Rephrase as : • Is A E in the closure F+ ? • Equivalently, is E in A+ ? • Let us compute {A}+ • {A}+ = {A, B, C} • Conclude : E is not in A+, therefore A E is false
Recap: So Far. • Functional Dependencies : Relationships across Attributes of Relations • Redundancy : Arises due to certain relationships (FDs) holding. • So far : Reasoning with FDs. • Approach : Establish certain “normal forms” with respect to dependencies