180 likes | 290 Views
Functional Dependencies (FD). Given: relation schema R(A1, …, An), and X and Y be subsets of (A1, … An). FD : X Y means X functionally determines Y e.g. A 1 A 2 …A n B 1 B 2 …B m Significance: Key:
E N D
Functional Dependencies (FD) • Given: relation schema R(A1, …, An), and X and Y be subsets of (A1, … An). FD : X Y means X functionally determines Y e.g. A1A2…An B1B2…Bm • Significance: • Key: • Entity set R (A1, …, An), and X is a subset of A1, …, An that form a key for R, then may assert: X Y where Y is any subset of {A1, …, An} • R(A1, …, An) is a many-one relationship from entity set E1 to entity set E2, and among the Ai’s are attributes that form a key X for E1 and a key Y for E2, then may assert: X Y
Keys of Relations K is a key for relation R if: • K all attributes of R • For no proper subset of K is (1) true (i.e. a key must be minimal) • If K at least satisfies (1), then K is a superkey Example: {title, year, starName) forms a key for the Movie relation Superkeys: A set of attributes that contains a key is called a superkey
FD Rules • Splitting / Combining rule • Trivial Dependencies • Armstrong’s Axioms: • Reflexivity • Augmentatioin • Transitivity • Computing closure of attributes • Finding all implied FD’s ( Ref. section: 3.5.7)
Given vs. Derived FD’s • Given FD’s: stated initially for a relation. FD’s These are known to hold for a relation R. • Derived FD’s are FD’s that follow logically from the given FD’s (using the inference rules) or by the closure algorithm of attributes. • Basis : any set of given FD’s from which all FD’s for a relation can be inferred. • Minimal basis: if no proper subset of the FD’s in a basis can also derive the complete set of FD’s. • Example: 3.22 – pp. 98
Closure of attributes • Let {A1,A2,…,An} is a set of attributes, S is a set of FD’s • Closure of {A1,A2,…,An} under the dependencies in S = set of attributes B such that every relation that satisfies all the dependencies in set S also satisfies A1A2…An B. I.e.B is the set of attributes functionally determined by {A1,A2,…,An} • Notation: {A1,A2,…,An}+ • Allow trivial dependencies, so A1A2…An are always in{A1,A2,…,An}+
Closure Algorithm Let X be {A1,A2,…,An}+ • X = {A1,A2,…,An} // initialize X • Search for some FD B1B2…Bm C where B1B2…Bm X, but C is not in X. If found, add C to X • Repeat step 2 until no more attributes can be added to X. • The resulting X is the correct value of {A1,A2,…,An}+
Attributes Closure Example. 3.19 pp. 93 Given: R(A, B, C, D, E, F) and FD’s AB C BC AD D E CF B To find: {A, B}+ Procedure: • Let X = {A, B} • Add C to X // AB C, X = {A, B, C} • Add D to X // BC AD, X = {A, B, C, D} • Add E to X // D E, X = {A, B, C, D, E} • End, {A, B}+ = {A, B, C, D, E}
To infer if a FD follows a set of FD in a relation • AB D follows from dependencies in example 3.19? • Compute {A, B}+ , if D ends up in the closure, then AB D follows from the dependencies • Since {A, B}+ = {A, B, C, D, E}, AB D follows. • D A follows from dependencies in example 3.19? • Compute {D}+ , if A ends up in the closure, then D A follows from the dependencies • But {D}+ = {D, E}, D A does NOT follows.
Attributes Closures and Keys To test if A1,A2,…,An is a key for a relation R: • Compute {A1,A2,…,An}+ • Check if {A1,A2,…,An}+ is the set of all attributes of R • If yes, then check no subset of A1,A2,…,An, say S, such that S+ is the set of all attributes of R • If A1,A2,…,An only satisfies 1 but not 2, then A1,A2,…,An is a superkey
Finding all implied FD’s • Motivation: Given relation R with FDs F. When projecting R to form new relation S, want to know what FD’s hold in S • Method: compute all FD’s that • Follow from F • Involve only attributes of S • Problem: no. of FDs may be large (could be exponential in the number of attributes of S) • So, make simplifications in computing attribute closure: • Ignore empty set and set of all attributes (trivial FD’s) • Drop XYA if XA holds. • If closure of some set X is all attributes, then ignore supersets of X.
Algorithm to find all implied FD’s • For each set of attributes X compute X+. Start with closures of singleton set, and then move onto doubleton if necessary. • Add X A for each A inX+ - X • Ignore “obvious” dependencies that follow from others as described in the simplification guidelines.
Finding all implied FD’s Example • Given: R(ABCD), and FD’s F: AB C, C D, D A • What FD’s follow from F? • A+ = A; B+ = B // nothing • C+ = ACD // add C A • D+ = DA // nothing new • (AB)+ = ABCD // add AB D; skip all supersets of AB • (BC)+ = ABCD // nothing new; skip all supersets of BC • (BD)+ = ABCD // add BD C; skip all supersets of BD • (AC)+ = ACD ; (AD)+ = (AD); (CD)+ = ACD // nothing new • (ACD)+ = ACD // nothing new; • All other sets contain AB, BC, or BD, so skip • the only interesting FD’s that follow from F are : C A, AB D, BD C
Normalization • Purpose: process to eliminate redundancy in relations due to functional or multi-valued dependencies. • Normal forms: • Boyce-Codd Normal Form (BCNF) • Third Normal Form (3NF) • Fourth Normal Form (4NF)
Boyce-Codd Normal Form (BCNF) • Definition: A relation R is in BCNF iff whenever there is a nontrivial dependency A1A2…An B holds for R, {A1,A2,…,An} must be a superkey for R • Alternative definition after applying the combining rule to all FD’s with a common L.S.: A relation R is in BCNF iff whenever nontrivial dependency A1A2…An B1B2…Bm holds for R, {A1,A2,…,An} must be a superkey for R • Guarantees no redundancy, prevents update, deletion, and insertion anomalies • E.g. title year length filmType studioName // violates BCNF because key of Movie is {title, year, starName}, so Movie is not in BCNF
Decomposition to BCNF Given: relation R, FD’s F. Decomposition strategy: • look for a non-trivial FD A1A2…An B1B2…Bm that violates BCNF. (Heuristic: add to the RS as many attributes as are functionally determined by A1A2…An) • Compute {A1,A2,…,An}+ // all the attributes involved in the violating dependency • Decompose R into • {A1,A2,…,An}+ , and • (R - {A1,A2,…,An}+) {A1,A2,…,An} // LS + all attributes not involved in the violating dependency Others B’s A’s
BCNF Decomposition Example 3.24 pp 104 • Relation: Movie(title, year, length, filmType, studioName, starName) • Key: {title, year, starName} • FD’s: title year length filmType studioName is a BCNF violation, so Movie not in BCNF • Decomposition: Schema 1: {title, year,length, filmType, studioName} Schema 2: {title, year, starName} • To obtain the new relations, project the schemas onto Movie • To recover information (I.e. Movie) from the new relations: natural join the new relations. Does not lose information.
Additional points about BCNF • A relation may have more than 1 keys • Need some key in the LHS of any nontrivial FD • Only nontrivial FD’s are potential BCNF violation candidates • Any 2-attributes relations is in BCNF • Decomposition must be based on FD that holds in the relation, otherwise can’t recover the original relation from the new relations. • Eliminates all redundancies • Some decomposition may not preserve the FD’s (e.g. 3.32 pp 114)
Third Normal Form (3NF) • Definition: A relation R is in 3NF if: whenever there is a nontrivial dependency A1A2…An B holds for R, either {A1,A2,…,An} is a superkey for R, or B is a member of some key (i.e. B is prime). • BCNF with a relaxed condition • Some redundancy might be left in 3NF if resulting relations are not in BCNF • Preserves all the FD’s • Process similar to BCNF with the added condition • To recover information (I.e. Movie) from the new relations: natural join the new relations. Does not lose information.