430 likes | 561 Views
C20.0046: Database Management Systems Lecture #5. Matthew P. Johnson Stern School of Business, NYU Spring, 2004. Agenda. Last time: relational model Homework 1 is up, due next Tuesday This time: Functional dependencies Keys and superkeys in terms of FDs Finding keys for relations
E N D
C20.0046: Database Management SystemsLecture #5 Matthew P. Johnson Stern School of Business, NYU Spring, 2004 M.P. Johnson, DBMS, Stern/NYU, Sp2004
Agenda • Last time: relational model • Homework 1 is up, due next Tuesday • This time: • Functional dependencies • Keys and superkeys in terms of FDs • Finding keys for relations • Rules for combining FDs • Next time: anomalies & normalization M.P. Johnson, DBMS, Stern/NYU, Sp2004
Review: converting inheritance title year length stars Movies isa isa Weapon Voices Lion King Murder-Mysteries Cartoons Component M.P. Johnson, DBMS, Stern/NYU, Sp2004
Review: converting inheritance • OO-style: • Nulls-style: 1. Plain movies 3. Murder mysteries 2. Cartoons 4. Cartoon murder-mysteries M.P. Johnson, DBMS, Stern/NYU, Sp2004
Root entity set: Movies(title, year, length) Title Year length Star Wars 1980 120 Roger Rabbit 1990 115 Scream 1988 110 Lion King 1993 130 Subclass: MurderMysteries(title, year, murderWeapon) Title Year murderWeapon Subclass: Cartoons(title, year) Scream 1988 Knife Title Year R. Rabbit 1990 Knife Roger Rabbit 1990 Lion King 1993 Review: converting inheritance • E/R-style M.P. Johnson, DBMS, Stern/NYU, Sp2004
E/R-style & quasi-redundancy • Name and year of Roger Rabbit were listed in three different rows (in different tables) • Suppose title changes (“Roger” “Roget”) • must change all three places • Q: Is this redundancy? • A: No! • name and year are independent • multiple movies may have same name • Real redundancy reqs. depenency • two rows agree on SSN must agree on rest • conflicting hair colors in these rows is an error • two rows agree on movie title may still disagree • conflicting years may be correct – or may not be • Better: introduce “movie-id” key att M.P. Johnson, DBMS, Stern/NYU, Sp2004
Combined isa/weak example • Exercise 3.3.1 • Convert from E/R to R, by E/R, OO and nulls chair name number room givenBy Depts courses isa Lab-courses Computer-allocation M.P. Johnson, DBMS, Stern/NYU, Sp2004
Next topic: Functional dependencies (3.4) • FDs are constraints • part of the schema • can’t tell from particular relation instances • FD may hold for some instances “accidentally” • Finding all FDs is part of DB design • Used in normalization M.P. Johnson, DBMS, Stern/NYU, Sp2004
then they must also agree on the attributes B1, B2, …, Bm Functional dependencies • Definition: • Notation: • Read: Ai functionally determines Bj If two tuples agree on the attributes A1, A2, …, An A1, A2, …, An B1, B2, …, Bm M.P. Johnson, DBMS, Stern/NYU, Sp2004
Typical Examples of FDs • Product • name price, manufacturer • Person • ssn name, age • father’s/husband’s-name last-name • zipcode state • phone state (notwithstanding inter-state area codes) • Company • name stockprice, president • symbol name • name symbol M.P. Johnson, DBMS, Stern/NYU, Sp2004
A1 ... Am B1 ... Bm t1 t2 Functional dependencies • To check A B, erase all other columns; for each rows t1, t2 • i.e., check if remaining relation is many-one • no “divergences” • i.e., if AB is a well-defined function • thus, functional dependency if t1, t2 agree here then t1, t2 agree here M.P. Johnson, DBMS, Stern/NYU, Sp2004
FDs Example Product(name, category, color, department, price) Consider these FDs: namecolor categorydepartment color, categoryprice What do they say ? M.P. Johnson, DBMS, Stern/NYU, Sp2004
FDs Example • FDs are constraints: • On some instances they hold • On others they don’t namecolor categorydepartment color, categoryprice Does this instance satisfy all the FDs ? M.P. Johnson, DBMS, Stern/NYU, Sp2004
FDs Example namecolor categorydepartment color, categoryprice What about this one ? M.P. Johnson, DBMS, Stern/NYU, Sp2004
Recognizing FDs • Q: Is PositionPhone an FD here? • A: It is for this instance, but no, presumably not in general • Others FDs? • EmpID Name, Phone, Position • but Phone Position EmpID Name Phone Position E0045 Smith 1234 Clerk E1847 John 9876 Salesrep E1111 Smith 9876 Salesrep E9999 Mary 1234 Lawyer M.P. Johnson, DBMS, Stern/NYU, Sp2004
Keys of relations • {A1A2A3…An} is a key for relation R if • A1A2A3…Anfunctionally determine all other attributes • Usual notation: A1A2A3…An B1B2…Bk • rels = sets distinct rows can’t agree on all Ai • A1A2A3…An is minimal • No proper subset of A1A2A3…An functionally determines all other attributes of R • Primary key: chosen if there are several possible keys M.P. Johnson, DBMS, Stern/NYU, Sp2004
Keys example • Relation: Student(Name, Address, DoB, Email, Credits) • Which (/why) of the following are keys? • Name, Address • Name, SSN • Email, SSN • Email • SSN • NB: minimal != smallest M.P. Johnson, DBMS, Stern/NYU, Sp2004
Superkeys • A set of attributes that contains a key • Satisfies first condition: • functionally determines every other attribute in the relation • Might not satisfy the second condition: minimality • may be possible to peel away some attributes from the superkey • keys are superkeys • key are special case of superkey • superkey set is superset of key set • name;ssn is a superkey but not a key M.P. Johnson, DBMS, Stern/NYU, Sp2004
Discovering keys for relations • Relation entity set • Key of relation = (minimized) key of entity set • Relation binary relationship • Many-many: union of keys of both entity sets • Many(M)-one(O): only key of M (Why?) • One-one: key of either entity set (but not both!) M.P. Johnson, DBMS, Stern/NYU, Sp2004
Example – entity sets • Key of entity set = (minimized) key of relation • Student(Name, Address, DoB, SSN, Email, Credits) SSN Name Student Email Address Credits DoB M.P. Johnson, DBMS, Stern/NYU, Sp2004
Example – many-many • Many-many key: union of both ES keys Student Enrolls Course SSN Credits Name CourseID Enrolls(SSN,CourseID) M.P. Johnson, DBMS, Stern/NYU, Sp2004
Example – many-one • Key of many ES but not of one ES • keys from both would be non-minimal Course MeetsIn Room CourseID Name Capacity RoomNo MeetsIn(CourseID,RoomNo) M.P. Johnson, DBMS, Stern/NYU, Sp2004
Example – one-one • Keys of both ESs included in relation • Key is key of either ES (but not both!) Husbands Married Wives SSN Name Name SSN Married(HSSN, WSSN) or Married(HSSN, WSSN) M.P. Johnson, DBMS, Stern/NYU, Sp2004
Course Enrolls Student CourseID Name SSN Section Name RoomNo Capacity Discovering keys: multiway • Multiway relationships: • Multiple ways – may not be obvious • R:F,G,HE is many-one E’s key is included • but not part of key • Recall that relship atts are implicitly many-one Enrolls(CourseID,SSN,RoomNo) M.P. Johnson, DBMS, Stern/NYU, Sp2004
Example • Exercise 3.4.2 • Relation relating particles in a box to locations and velocities InPosition(id,x,y,z,vx,vy,vz) • Q: What FDs hold? • Q: What are the keys? M.P. Johnson, DBMS, Stern/NYU, Sp2004
Combining FDs (3.5) If some FDs are satisfied, thenothers are satisfied too namecolor categorydepartment color, categoryprice If all these FDs are true: name, categoryprice Then this FD also holds: Why? M.P. Johnson, DBMS, Stern/NYU, Sp2004
Rules for FDs • Reasoning about FDs: given a set of FDs, infer other FDs – useful • E.g. A B, B C A C • Definitions: for FD-sets S and T • T follows from S if all relation-instances satisfying S also satisfy T. • S and T are equivalent if the sets of relation-instances satisfying S and T are the same. • I.e., S and T are equivalent if S follows from T, and T follows from S. M.P. Johnson, DBMS, Stern/NYU, Sp2004
A1 ... Am B1 ... Bm t1 t2 Splitting & combining FDs Splitting rule: A1, A2, …, An B1 A1, A2, …, An B2 . . . . . A1, A2, …, An Bm A1A2…An B1B2…Bm Note: doesn’t apply to the left side Combining rule: Q: Does it apply to the left side? M.P. Johnson, DBMS, Stern/NYU, Sp2004
Reflexive rule: trivial FDs A1, A2, …, An Ai with i in 1..n is a trivial FD • FD A1A2…An B1B2…Bk may be • Trivial: Bs are a subset of As • Nontrivial: >=1 of the Bs is not among the As • Completely nontrivial: none of the Bs is among the As • Trivial elimination rule: • Eliminate common attributes from Bs, to get an equivalent completely nontrivial FD t t’ M.P. Johnson, DBMS, Stern/NYU, Sp2004
Transitive rule If A1, A2, …, An B1, B2, …, Bm and B1, B2, …, Bm C1, C2, …, Cp A1, A2, …, An C1, C2, …, Cp then t t’ M.P. Johnson, DBMS, Stern/NYU, Sp2004
Augmentation rule A1, A2, …, An B If then A1, A2, …, An , C B, C, for any C t t’ M.P. Johnson, DBMS, Stern/NYU, Sp2004
Rules summary • A B AC B (by definition) • Separation/Combination • Reflexive • Augmentation • Transitivity • Last 3 called Armstrong’s Axioms • Complete: entire closure follows from these • Sound: no other FDs follow from these M.P. Johnson, DBMS, Stern/NYU, Sp2004
Inferring FDs example 1. namecolor 2. categorydepartment 3. color, categoryprice Start from the following FDs: Infer the following FDs: Reflexive rule Transitivity(4,1) Reflexive rule combine(5,6) or Aug(1) Transitivity(3,7) M.P. Johnson, DBMS, Stern/NYU, Sp2004
Problem: infer ALL FDs Given a set of FDs, infer all possible FDs How to proceed? • Try all possible FDs, apply all rules • E.g. R(A, B, C, D): how many FDs are possible? • Drop trivial FDs, drop augmented FDs • Still way too many • Better: use the Closure Algorithm (next) M.P. Johnson, DBMS, Stern/NYU, Sp2004
Closure of a set of Attributes Given a set of attributes A1, …, An The closure, {A1, …, An}+ = {B in Atts: A1, …, An B} Example: namecolor categorydepartment color, categoryprice Closures: {name}+ = {name, color} {name, category}+ = {name, category, color, department, price} {color}+ = {color} M.P. Johnson, DBMS, Stern/NYU, Sp2004
Closure Algorithm Start with X={A1, …, An}. Repeat: if B1, …, Bn C is a FD and B1, …, Bn are all in X then add C to X. until X didn’t change Example: namecolor categorydepartment color, categoryprice {name, category}+ = {name, category, color, department, price} M.P. Johnson, DBMS, Stern/NYU, Sp2004
Example In class: A, B C A, D E B D A, F B R(A,B,C,D,E,F) Compute {A,B}+ X = {A, B, } Compute {A, F}+ X = {A, F, } M.P. Johnson, DBMS, Stern/NYU, Sp2004
More definitions • Given FDs versus Derived FDs • Given FDs: stated initially for the relation • Derived FDs: those that are inferred using the rules or through closure • Basis: A set of FDs from which we can infer all other FDs for the relation • Minimal basis: a minimal set of FDs (Can’t discard any of them) • No proper subset of the minimal basis can derive the complete set of FDs M.P. Johnson, DBMS, Stern/NYU, Sp2004
Problem: find all FDs • Given a database instance • Find all FD’s satisfied by that instance • Useful if we don’t get enough information from our users: need to reverse engineer a data instance • Q: How long does this take? • A: Some time for each subset of atts • Q: How many subsets? • powerset • exponential time in worst-case • But can often be smarter… M.P. Johnson, DBMS, Stern/NYU, Sp2004
Using Closure to Infer ALL FDs Example: A, B CA, D B B D Step 1: Compute X+, for every X: A+ = A, B+ = BD, C+ = C, D+ = D AB+ = ABCD, AC+ = AC, AD+ = ABCD ABC+ = ABD+ = ACD+ = ABCD (no need to compute– why?) BCD+ = BCD, ABCD+ = ABCD Step 2: Enumerate all FD’s X Y, s.t. Y X+ and XY = : AB CD, ADBC, ABC D, ABD C, ACD B M.P. Johnson, DBMS, Stern/NYU, Sp2004
Example • R(A,B,C) • Each of three determines other two • Q: What are the FDs? • Closure of singleton sets • Closure of doubletons • Q: What are the keys? • Q: What are the minimal bases? M.P. Johnson, DBMS, Stern/NYU, Sp2004
Computing Keys • Rephrasing of definition of key: • X is a key if • X+ = all attributes • X is minimal Note: there can be many minimal keys ! • Example: R(A,B,C), ABC, BCAMinimal keys: AB and BC M.P. Johnson, DBMS, Stern/NYU, Sp2004
Examples of Keys • Product(name, price, category, color) name, category price category color Keys are: {name, category} • Enrollment(student, address, course, room, time) student address room, time course student, course room, time Keys are: [in class] M.P. Johnson, DBMS, Stern/NYU, Sp2004