280 likes | 412 Views
Functional Dependencies. Review. Relational data model (schema & instances) Physical and conceptual data independence ACID properties of transactions Integrity constraint Key constraint Foreign key constraint Inclusion dependency
E N D
Review • Relational data model (schema & instances) • Physical and conceptual data independence • ACID properties of transactions • Integrity constraint • Key constraint • Foreign key constraint • Inclusion dependency • SQL (DDL) – create tables, views, domains, intra and inter-relational constraints, assertions, etc.
Review: ER Modeling • ER Model – domains, attributes, entities, roles, relationships • Translating from ER to relational model • Key and participating constraints • Type hierarchies (disjoint & covering)
Review: Relational Algebra & SQL • Relational algebra: select, project, set ops, cross product, joins (diff types), division • SQL (query sublanguage): algebra to SQL • SELECT—FROM—WHERE—GROUP BY—HAVING—ORDER BY • Query evaluation strategy • Correlated, nested queries, division • Views in SQL • SQL (DML) – insert, delete, update
Review: TRC, DRC, FDs • Tuple relational calculus • Domain relational calculus • Bound and free variables • Relational algebra, SQL, TRC and DRC • Types of anomalies: update, deletion, insertion • Functional dependencies • Armstrong’s axioms • Entailment: attribute closure
Functional Dependencies • A functional dependency is a constraint between two sets of attributes in a relational database. • If X and Y are two sets of attributes in the same relation T, then X Y means that X functionally determines Y so that • the values of the attributes in X uniquely determine the values of the attributes in Y • for any two tuples t1 and t2 in T, t1[X] = t2[X] implies that t1[Y] = t2[Y] • if two tuples in T agree in their X column(s), then their Y column(s) should also be the same.
Functional Dependencies • Dependencies for this relation: • A B • A D • B,C E,F • Do they all hold in this instance of the relation R? • Functional dependencies are specified by the database programmer based on the intended meaning of the attributes.
Functional Dependencies • What are the functional dependencies in: COMPANIES(company_name, company_address, date_founded, owner_name, owner_title, #shares ) • company_name company_address • company_name date_founded • company_name, owner_id owner_title • company_name, owner_id #shares • company_name, owner_title owner_id • owner_id owner_name
FD and Keys • Key constraint is a special kind of functional dependency • Key is on LHS, all attributes are on RHS • SSN SSN, Name, Address • For a key, no two rows share the same values, thus by default, when ever a tuple agrees on LHS it agrees on the RHS.
Armstrong’s Axioms of FDs • Reflexivity: If Y X then X Y (trivial FD) • Name, Address Name • Augmentation: If X Y then X Z YZ • If Town Zip then Town, Name Zip, Name • Transitivity: If X Y and Y Z then X Z
Other derived rules • Union: If X Y and X Z, then X YZ • X YX (augment), YX YZ (augment) • thusX YZ (transitive) • Decomposition:If X YZ, then X Y and X Z • YZ Y (reflexive), thus X Y (transitive) • Pseudotransitivity:If X Y and WY Z, then XW Z • Accumulation rule: If X YZ and Z W, then X YZW
Derivation of Accumulation Ruleusing the Axioms • Accumulation rule: • If X YZ and Z W, then X YZW • Proof: from Z W, augment with YZ to get • YZZ YZW or YZ YZW • By transitivity, we get X YZW
Mathematical Properties of FDs • Definition: If F is a set of FDs on schema R and f is another FD on R, then F entails f if every instance r of R that satisfies F also satisfies f • Ex: F = {A B, B C} and f is A C • If Streetaddr Town and Town Zip then Streetaddr Zip • Definition: The closure of F, denoted F+, is the set of all FDs entailed by F • Definition: F and G are equivalent if F entails G and G entails F
Soundness • Axioms are sound: If an expression f: X Y can be derived from a set of FDs F using the axioms, then f is a FD. We say F entailsf. • Example: Given X Y and X Z then • Union rule: we can take the union of FDs that have the same LHS X XY Augmentation YX YZ Augmentation X YZ Transitivity
Completeness • Axioms are complete: If F entails f , then f can be derived from F using the axioms • As a result, to determine if F entails f, use the axioms in all possible ways to generate F+(the set of possible FD’s is finite so this can be done) and see if f is in F+
Generating F+ F AB C AB BCD A D AB BD AB BCDE AB CDE D E BCD BCDE union decomp aug trans aug Thus, AB BD, AB BCD, AB BCDE, and AB CDE are all elements of F+
Attribute Closure • Calculating attribute closure is a more efficient way of checking entailment • The attribute closure of a set of attributes, X, with respect to a set of functional dependencies, F, (denoted X+F) is the set of all attributes, A, such that X A • X +F1 is not necessarily the same asX +F2 • Checking entailment: Given a set of FDs, F, then X Y if and only if X+F Y (by union & decomposition rule)
Example - Computing Attribute Closure X XF+ A {A, D, E} AB {A, B, C, D, E} (Hence AB is a key) B {B} D {D, E} F: AB C A D D E AC B Is AB E a FD? Yes Is D C a FD? No Result: XF+ allows us to determine FDs entailed by F of the form X Y
Computation of Attribute Closure X+F closure := X; --since X X+F repeat old := closure; if there is an FD Z V in F such that Z closure thenclosure := closure V untilold = closure -- If T closure then X T is entailed by F
Computation of Attribute Closure Example Problem: Compute the attribute closure of AB with respect to the set of FDs : AB C (a) A D (b) D E (c) AC B (d) Solution: Initially closure = {AB} Using (a) closure = {ABC} Using (b) closure = {ABCD} Using (c) closure = {ABCDE}
Normal Forms • Each normal form is a set of conditions on a schema that guarantees certain properties (relating to redundancy and update anomalies) • The two commonly used normal forms are third normal form (3NF) and Boyce-Codd normal form (BCNF)
BCNF • Definition: A relation schema R is in BCNF if for every FD X Y associated with R either • Y X(i.e., the FD is trivial) or • X is a superkey of R • Example: Person1(SSN, Name, Address) • The only FD is SSN Name, Address • Since SSN is a key, Person1 is in BCNF
BCNF - Examples • Person (SSN, Name, Address, Hobby) • The FD SSN Name, Address does not satisfy requirements of BCNF (since the key is (SSN, Hobby)) • HasAccount (AccountNumber, ClientId, OfficeId) • The FD AcctNum OfficeId does not satisfy BCNF requirements (since keys are (ClientId, OfficeId) and (AcctNum, ClientId))
Redundancy • Suppose R has a FD A B. If an instance has 2 rows with same value in A, they must also have same value in B (=> redundancy) • If A is a superkey, there cannot be two rows with same value of A • Hence, BCNF eliminates redundancy SSN Name, Address SSN Name Address Hobby 1111 Joe 123 Main stamps 1111 Joe 123 Main coins
Third Normal Form • A relational schema R is in 3NF if for every FD X A associated with R either: • A X(i.e., the FD is trivial) or • X is a superkey of R or • A is part of some key • 3NF weaker than BCNF (every schema that is in BCNF is also in 3NF)
3NF Example • HasAccount (AcctNum, ClientId, OfficeId) • ClientId, OfficeId AcctNum • OK since LHS contains a key • AcctNum OfficeId • OK since RHS is part of a key • HasAccount is in 3NF but it might still contain redundant information
3NF Example ClientId OfficeId AcctNum 1111 Stony Brook 28315 2222 Stony Brook 28315 3333 Stony Brook 28315 • HasAccount might store redundant data: 3NF: OfficeId part of key FD: AcctNum OfficeId Decompose to eliminate redundancy (not always possible): ClientId AcctNum 1111 28315 2222 28315 3333 28315 BCNF (only trivial FDs) OfficeId AcctNum Stony Brook 28315 BCNF: AcctNum is key FD: AcctNum OfficeId
3NF (Non) Example • Person (SSN, Name, Address, Hobby) • (SSN, Hobby) is the only key. • SSN Name violates 3NF conditions since Name is not part of a key and SSN is not a superkey