1 / 43

C20.0046: Database Management Systems Lecture #5

C20.0046: Database Management Systems Lecture #5. Matthew P. Johnson Stern School of Business, NYU Spring, 2004. Agenda. Last time: relational model Homework 1 is up, due next Tuesday This time: Functional dependencies Keys and superkeys in terms of FDs Finding keys for relations

yagil
Download Presentation

C20.0046: Database Management Systems Lecture #5

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. C20.0046: Database Management SystemsLecture #5 Matthew P. Johnson Stern School of Business, NYU Spring, 2004 M.P. Johnson, DBMS, Stern/NYU, Sp2004

  2. Agenda • Last time: relational model • Homework 1 is up, due next Tuesday • This time: • Functional dependencies • Keys and superkeys in terms of FDs • Finding keys for relations • Rules for combining FDs • Next time: anomalies & normalization M.P. Johnson, DBMS, Stern/NYU, Sp2004

  3. Review: converting inheritance title year length stars Movies isa isa Weapon Voices Lion King Murder-Mysteries Cartoons Component M.P. Johnson, DBMS, Stern/NYU, Sp2004

  4. Review: converting inheritance • OO-style: • Nulls-style: 1. Plain movies 3. Murder mysteries 2. Cartoons 4. Cartoon murder-mysteries M.P. Johnson, DBMS, Stern/NYU, Sp2004

  5. Root entity set: Movies(title, year, length) Title Year length Star Wars 1980 120 Roger Rabbit 1990 115 Scream 1988 110 Lion King 1993 130 Subclass: MurderMysteries(title, year, murderWeapon) Title Year murderWeapon Subclass: Cartoons(title, year) Scream 1988 Knife Title Year R. Rabbit 1990 Knife Roger Rabbit 1990 Lion King 1993 Review: converting inheritance • E/R-style M.P. Johnson, DBMS, Stern/NYU, Sp2004

  6. E/R-style & quasi-redundancy • Name and year of Roger Rabbit were listed in three different rows (in different tables) • Suppose title changes (“Roger”  “Roget”) •  must change all three places • Q: Is this redundancy? • A: No! • name and year are independent • multiple movies may have same name • Real redundancy reqs. depenency • two rows agree on SSN  must agree on rest • conflicting hair colors in these rows is an error • two rows agree on movie title  may still disagree • conflicting years may be correct – or may not be • Better: introduce “movie-id” key att M.P. Johnson, DBMS, Stern/NYU, Sp2004

  7. Combined isa/weak example • Exercise 3.3.1 • Convert from E/R to R, by E/R, OO and nulls chair name number room givenBy Depts courses isa Lab-courses Computer-allocation M.P. Johnson, DBMS, Stern/NYU, Sp2004

  8. Next topic: Functional dependencies (3.4) • FDs are constraints • part of the schema • can’t tell from particular relation instances • FD may hold for some instances “accidentally” • Finding all FDs is part of DB design • Used in normalization M.P. Johnson, DBMS, Stern/NYU, Sp2004

  9. then they must also agree on the attributes B1, B2, …, Bm Functional dependencies • Definition: • Notation: • Read: Ai functionally determines Bj If two tuples agree on the attributes A1, A2, …, An A1, A2, …, An B1, B2, …, Bm M.P. Johnson, DBMS, Stern/NYU, Sp2004

  10. Typical Examples of FDs • Product • name  price, manufacturer • Person • ssn  name, age • father’s/husband’s-name  last-name • zipcode  state • phone  state (notwithstanding inter-state area codes) • Company • name  stockprice, president • symbol  name • name  symbol M.P. Johnson, DBMS, Stern/NYU, Sp2004

  11. A1 ... Am B1 ... Bm t1 t2 Functional dependencies • To check A  B, erase all other columns; for each rows t1, t2 • i.e., check if remaining relation is many-one • no “divergences” • i.e., if AB is a well-defined function • thus, functional dependency if t1, t2 agree here then t1, t2 agree here M.P. Johnson, DBMS, Stern/NYU, Sp2004

  12. FDs Example Product(name, category, color, department, price) Consider these FDs: namecolor categorydepartment color, categoryprice What do they say ? M.P. Johnson, DBMS, Stern/NYU, Sp2004

  13. FDs Example • FDs are constraints: • On some instances they hold • On others they don’t namecolor categorydepartment color, categoryprice Does this instance satisfy all the FDs ? M.P. Johnson, DBMS, Stern/NYU, Sp2004

  14. FDs Example namecolor categorydepartment color, categoryprice What about this one ? M.P. Johnson, DBMS, Stern/NYU, Sp2004

  15. Recognizing FDs • Q: Is PositionPhone an FD here? • A: It is for this instance, but no, presumably not in general • Others FDs? • EmpID  Name, Phone, Position • but Phone  Position EmpID Name Phone Position E0045 Smith 1234 Clerk E1847 John 9876 Salesrep E1111 Smith 9876 Salesrep E9999 Mary 1234 Lawyer M.P. Johnson, DBMS, Stern/NYU, Sp2004

  16. Keys of relations • {A1A2A3…An} is a key for relation R if • A1A2A3…Anfunctionally determine all other attributes • Usual notation: A1A2A3…An B1B2…Bk • rels = sets  distinct rows can’t agree on all Ai • A1A2A3…An is minimal • No proper subset of A1A2A3…An functionally determines all other attributes of R • Primary key: chosen if there are several possible keys M.P. Johnson, DBMS, Stern/NYU, Sp2004

  17. Keys example • Relation: Student(Name, Address, DoB, Email, Credits) • Which (/why) of the following are keys? • Name, Address • Name, SSN • Email, SSN • Email • SSN • NB: minimal != smallest M.P. Johnson, DBMS, Stern/NYU, Sp2004

  18. Superkeys • A set of attributes that contains a key • Satisfies first condition: • functionally determines every other attribute in the relation • Might not satisfy the second condition: minimality • may be possible to peel away some attributes from the superkey • keys are superkeys • key are special case of superkey • superkey set is superset of key set • name;ssn is a superkey but not a key M.P. Johnson, DBMS, Stern/NYU, Sp2004

  19. Discovering keys for relations • Relation  entity set • Key of relation = (minimized) key of entity set • Relation  binary relationship • Many-many: union of keys of both entity sets • Many(M)-one(O): only key of M (Why?) • One-one: key of either entity set (but not both!) M.P. Johnson, DBMS, Stern/NYU, Sp2004

  20. Example – entity sets • Key of entity set = (minimized) key of relation • Student(Name, Address, DoB, SSN, Email, Credits) SSN Name Student Email Address Credits DoB M.P. Johnson, DBMS, Stern/NYU, Sp2004

  21. Example – many-many • Many-many key: union of both ES keys Student Enrolls Course SSN Credits Name CourseID Enrolls(SSN,CourseID) M.P. Johnson, DBMS, Stern/NYU, Sp2004

  22. Example – many-one • Key of many ES but not of one ES • keys from both would be non-minimal Course MeetsIn Room CourseID Name Capacity RoomNo MeetsIn(CourseID,RoomNo) M.P. Johnson, DBMS, Stern/NYU, Sp2004

  23. Example – one-one • Keys of both ESs included in relation • Key is key of either ES (but not both!) Husbands Married Wives SSN Name Name SSN Married(HSSN, WSSN) or Married(HSSN, WSSN) M.P. Johnson, DBMS, Stern/NYU, Sp2004

  24. Course Enrolls Student CourseID Name SSN Section Name RoomNo Capacity Discovering keys: multiway • Multiway relationships: • Multiple ways – may not be obvious • R:F,G,HE is many-one  E’s key is included • but not part of key • Recall that relship atts are implicitly many-one Enrolls(CourseID,SSN,RoomNo) M.P. Johnson, DBMS, Stern/NYU, Sp2004

  25. Example • Exercise 3.4.2 • Relation relating particles in a box to locations and velocities InPosition(id,x,y,z,vx,vy,vz) • Q: What FDs hold? • Q: What are the keys? M.P. Johnson, DBMS, Stern/NYU, Sp2004

  26. Combining FDs (3.5) If some FDs are satisfied, thenothers are satisfied too namecolor categorydepartment color, categoryprice If all these FDs are true: name, categoryprice Then this FD also holds: Why? M.P. Johnson, DBMS, Stern/NYU, Sp2004

  27. Rules for FDs • Reasoning about FDs: given a set of FDs, infer other FDs – useful • E.g. A  B, B  C  A  C • Definitions: for FD-sets S and T • T follows from S if all relation-instances satisfying S also satisfy T. • S and T are equivalent if the sets of relation-instances satisfying S and T are the same. • I.e., S and T are equivalent if S follows from T, and T follows from S. M.P. Johnson, DBMS, Stern/NYU, Sp2004

  28. A1 ... Am B1 ... Bm t1 t2 Splitting & combining FDs Splitting rule: A1, A2, …, An B1 A1, A2, …, An B2 . . . . . A1, A2, …, An Bm A1A2…An B1B2…Bm Note: doesn’t apply to the left side Combining rule: Q: Does it apply to the left side? M.P. Johnson, DBMS, Stern/NYU, Sp2004

  29. Reflexive rule: trivial FDs A1, A2, …, An Ai with i in 1..n is a trivial FD • FD A1A2…An B1B2…Bk may be • Trivial: Bs are a subset of As • Nontrivial: >=1 of the Bs is not among the As • Completely nontrivial: none of the Bs is among the As • Trivial elimination rule: • Eliminate common attributes from Bs, to get an equivalent completely nontrivial FD t t’ M.P. Johnson, DBMS, Stern/NYU, Sp2004

  30. Transitive rule If A1, A2, …, An B1, B2, …, Bm and B1, B2, …, Bm  C1, C2, …, Cp A1, A2, …, An C1, C2, …, Cp then t t’ M.P. Johnson, DBMS, Stern/NYU, Sp2004

  31. Augmentation rule A1, A2, …, An B If then A1, A2, …, An , C  B, C, for any C t t’ M.P. Johnson, DBMS, Stern/NYU, Sp2004

  32. Rules summary • A  B  AC  B (by definition) • Separation/Combination • Reflexive • Augmentation • Transitivity • Last 3 called Armstrong’s Axioms • Complete: entire closure follows from these • Sound: no other FDs follow from these M.P. Johnson, DBMS, Stern/NYU, Sp2004

  33. Inferring FDs example 1. namecolor 2. categorydepartment 3. color, categoryprice Start from the following FDs: Infer the following FDs: Reflexive rule Transitivity(4,1) Reflexive rule combine(5,6) or Aug(1) Transitivity(3,7) M.P. Johnson, DBMS, Stern/NYU, Sp2004

  34. Problem: infer ALL FDs Given a set of FDs, infer all possible FDs How to proceed? • Try all possible FDs, apply all rules • E.g. R(A, B, C, D): how many FDs are possible? • Drop trivial FDs, drop augmented FDs • Still way too many • Better: use the Closure Algorithm (next) M.P. Johnson, DBMS, Stern/NYU, Sp2004

  35. Closure of a set of Attributes Given a set of attributes A1, …, An The closure, {A1, …, An}+ = {B in Atts: A1, …, An B} Example: namecolor categorydepartment color, categoryprice Closures: {name}+ = {name, color} {name, category}+ = {name, category, color, department, price} {color}+ = {color} M.P. Johnson, DBMS, Stern/NYU, Sp2004

  36. Closure Algorithm Start with X={A1, …, An}. Repeat: if B1, …, Bn C is a FD and B1, …, Bn are all in X then add C to X. until X didn’t change Example: namecolor categorydepartment color, categoryprice {name, category}+ = {name, category, color, department, price} M.P. Johnson, DBMS, Stern/NYU, Sp2004

  37. Example In class: A, B  C A, D  E B  D A, F  B R(A,B,C,D,E,F) Compute {A,B}+ X = {A, B, } Compute {A, F}+ X = {A, F, } M.P. Johnson, DBMS, Stern/NYU, Sp2004

  38. More definitions • Given FDs versus Derived FDs • Given FDs: stated initially for the relation • Derived FDs: those that are inferred using the rules or through closure • Basis: A set of FDs from which we can infer all other FDs for the relation • Minimal basis: a minimal set of FDs (Can’t discard any of them) • No proper subset of the minimal basis can derive the complete set of FDs M.P. Johnson, DBMS, Stern/NYU, Sp2004

  39. Problem: find all FDs • Given a database instance • Find all FD’s satisfied by that instance • Useful if we don’t get enough information from our users: need to reverse engineer a data instance • Q: How long does this take? • A: Some time for each subset of atts • Q: How many subsets? • powerset •  exponential time in worst-case • But can often be smarter… M.P. Johnson, DBMS, Stern/NYU, Sp2004

  40. Using Closure to Infer ALL FDs Example: A, B  CA, D  B B  D Step 1: Compute X+, for every X: A+ = A, B+ = BD, C+ = C, D+ = D AB+ = ABCD, AC+ = AC, AD+ = ABCD ABC+ = ABD+ = ACD+ = ABCD (no need to compute– why?) BCD+ = BCD, ABCD+ = ABCD Step 2: Enumerate all FD’s X  Y, s.t. Y X+ and XY = : AB  CD, ADBC, ABC  D, ABD  C, ACD  B M.P. Johnson, DBMS, Stern/NYU, Sp2004

  41. Example • R(A,B,C) • Each of three determines other two • Q: What are the FDs? • Closure of singleton sets • Closure of doubletons • Q: What are the keys? • Q: What are the minimal bases? M.P. Johnson, DBMS, Stern/NYU, Sp2004

  42. Computing Keys • Rephrasing of definition of key: • X is a key if • X+ = all attributes • X is minimal Note: there can be many minimal keys ! • Example: R(A,B,C), ABC, BCAMinimal keys: AB and BC M.P. Johnson, DBMS, Stern/NYU, Sp2004

  43. Examples of Keys • Product(name, price, category, color) name, category  price category  color Keys are: {name, category} • Enrollment(student, address, course, room, time) student  address room, time  course student, course  room, time Keys are: [in class] M.P. Johnson, DBMS, Stern/NYU, Sp2004

More Related