1 / 45

C20.0046: Database Management Systems Lecture #8

C20.0046: Database Management Systems Lecture #8. Matthew P. Johnson Stern School of Business, NYU Spring, 2004. Agenda. Last time: Normalization This time: 4NF Relational Algebra Pep talk OHs today, drop-ins (80809). Normalization Review. Q: What’s required for BCNF?

faunia
Download Presentation

C20.0046: Database Management Systems Lecture #8

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. C20.0046: Database Management SystemsLecture #8 Matthew P. Johnson Stern School of Business, NYU Spring, 2004 M.P. Johnson, DBMS, Stern/NYU, Sp2004

  2. Agenda • Last time: Normalization • This time: • 4NF • Relational Algebra • Pep talk • OHs today, drop-ins (80809) M.P. Johnson, DBMS, Stern/NYU, Sp2004

  3. Normalization Review • Q: What’s required for BCNF? • Q: What are the two types of violations? • Q: What’s the loophole for 3NF? • Q: How do we fix a non-BCNF relation? M.P. Johnson, DBMS, Stern/NYU, Sp2004

  4. Normalization Review • Q: If AsBs violates BCNF, what do we do? • Q: In this case, could the decomposition be lossy? • Q: How do we combine two relations? • Q: Can BCNF decomp. lose FDs? • Q: Can 3NF decomp. lose FDs? M.P. Johnson, DBMS, Stern/NYU, Sp2004

  5. Name SSN Jobs Streets Citys Michael 123 Mayor 111 East 60th Street New York Michael 123 Mayor 222 Brompton Road London Michael 123 CEO 111 East 60th Street New York Michael 123 CEO 222 Brompton Road London Hilary 456 Senator 333 Some Street Chappaqua Hilary 456 Senator 444 Embassy Row Washington Hilary 456 First Lady 333 Some Street Chappaqua Hilary 456 First Lady 444 Embassy Row Washington Hilary 789 Lawyer 333 Some Street Chappaqua Hilary 789 Lawyer 444 Embassy Row Washington New topic: MVDs (3.7) • Consider this relation • People ~ their jobs ~ their residences • Person-address/city: many-many • Person-job: many-many • Address/city-job: independent M.P. Johnson, DBMS, Stern/NYU, Sp2004

  6. Redundancy in BCNF • Lots of redundancy! • Key? All fields • None determined by others! • Non-trivial FDs? None! •  In BCNF? Yes! • Now what? • New concept, leading to another normal form: • Multivalued dependencies M.P. Johnson, DBMS, Stern/NYU, Sp2004

  7. MVD definition • As  Bs if, when As are held fixed values in Bs are independent of values in rest • More precisely: if t1 and t3 agree on As, we then can find t2 such that t2, t2, t3 agree on As t2, t1 agree of Bs t2, t3 agree on Cs t1 | | | | t2 | | | | t3 M.P. Johnson, DBMS, Stern/NYU, Sp2004

  8. Name Streets Citys Jobs t1 Hilary 333 Some Street Chappaqua Senator t3 Hilary 444 Embassy Row Washington Lawyer t2 Hilary 333 Some Street Chappaqua Lawyer MVD example • Claim: name  streets,cities • If true: can pick arbitrary t1, t3 and find a t2 • We pick: first and last of Hilary’s tuples: • Now: if true, can find another Hilary row with street/address of t1 and job of t3 M.P. Johnson, DBMS, Stern/NYU, Sp2004

  9. MVD example • Now: if true, can find another Hilary row with street/address of t1 and job of t3 • Sure enough: t2 t2 M.P. Johnson, DBMS, Stern/NYU, Sp2004

  10. MVD rules • No splitting rule: • In the example, name  streets,cities • Do we have name  streets? • No: 444 Embassy Row doesn’t go with Chappaqua • NB: City doesn’t determine street – could have >1 house • But city, street aren’t independent t1 t3 M.P. Johnson, DBMS, Stern/NYU, Sp2004

  11. MVD rules • Trivial dependencies: • As  Bs iff As  BsAi • Transitive rule: • As  Bs, Bs  Cs  As  Cs • Complementation rule: • As  Bs  As  rest • Intuition: if each value in Bs is assoc’ed w/each value in rest, then each value of rest is assoc’ed w/each value in Bs M.P. Johnson, DBMS, Stern/NYU, Sp2004

  12. MVDs and FDs • MVD is a generalization of FD • Every FD is an MVD • Pf: Suppose As  Bs Pick t1, t3 that agree on As. Must find a t2. Let t2 be t3. Then 1) t2 agrees on As with both 2) t2 agrees on Bs with t1 (why?) 3) t2 agrees on rest with t3 (why?) QED M.P. Johnson, DBMS, Stern/NYU, Sp2004

  13. Fourth Normal Form • 4NF: like BCNF, but with MVDs not FDs • An MVD As  Bs is nontrivial if • No Bs are As • Some attributes left over (why?) • 4NF: for every nontrivial MVD As  Bs, As is a superkey • In example name  streets,cities, but name isn’t a superkey M.P. Johnson, DBMS, Stern/NYU, Sp2004

  14. Decomposition to 4NF • Again, analogous to BCNF • If we can find As  Bs for R where As isn’t a superkey, replace R with R1(As,Bs) and R2(As,rest) • Running example: name  streets,cities •  People(name,streets,cities,jobs) becomes Residences(name,street,city) and Employment(name,job) M.P. Johnson, DBMS, Stern/NYU, Sp2004

  15. 4NF: another construal • In nontrivial As  Bs, As must be superkey • After df of 4NF, text says: “That is, … every nontrivial MVD is really a FD with a superkey on the left” (p123). • We know: FDs are* MVDs but not vice versa • So: Why does this follow? Is it true? • Yes. As is a superkey  As  everything •  As  Bs  the MVD is an FD • Two kinds of MVDs: FDs and “true” MVDs • 4NF eliminates exactly the true ones * The typo swapping these was fixed. M.P. Johnson, DBMS, Stern/NYU, Sp2004

  16. Summary of normal forms M.P. Johnson, DBMS, Stern/NYU, Sp2004

  17. Combined isa/weak example • Exercise 3.3.1 • Convert from E/R to R, by E/R, OO and nulls chair name number room givenBy Depts courses isa Lab-courses Computer-allocation M.P. Johnson, DBMS, Stern/NYU, Sp2004

  18. Next topic: relational algebra (5.1-2) • Set operations: union, intersection, difference • Projection, selection • Cartesian Product • Joins: natural joins, theta joins • Combining operations to form queries • Dependent and independent operations M.P. Johnson, DBMS, Stern/NYU, Sp2004

  19. What is relational algebra? • An algebra for relations • “High-school” algebra is an algebra for numbers • Formalism for constructing expressions • Operations • Operands: Variables, Constants, expressions • Expressions: • Vars & constants • Operators applied to expressions M.P. Johnson, DBMS, Stern/NYU, Sp2004

  20. Why do we care about relational algebra? • Why construct expressions on relations? • The exprs are the form questions about the take • The relations these exprs cash out to are the answers to our questions • First proof of RDBMS/RA concept: System R (1979) • Modern implementation of RA: SQL M.P. Johnson, DBMS, Stern/NYU, Sp2004

  21. Relation operators • Five basic operators: • Union:  • Intersection: • Difference: - • Selection: s • Projection: P • Cartesian Product:  • Derived/auxiliary operators: • Intersection, complement • Joins (natural, equijoin, theta join, semijoin) • Renaming: r M.P. Johnson, DBMS, Stern/NYU, Sp2004

  22. Operators • Relations are sets  have set-theoretic ops • Venn diagrams • Union: R1  R2 • Example: • ActiveEmployees  RetiredEmployees • Difference: R1 – R2 • Example: • AllEmployees – RetiredEmployees = ActiveEmployees M.P. Johnson, DBMS, Stern/NYU, Sp2004

  23. Set operations - example R: S: R  S: M.P. Johnson, DBMS, Stern/NYU, Sp2004

  24. Set operations - example R: S: R - S: M.P. Johnson, DBMS, Stern/NYU, Sp2004

  25. Operators • Intersection: R1  R2 • Example: • UnionizedEmployees  RetiredEmployees • Intersection can be derived from  and – • R1  R2 = R1 – (R1 – R2) • R1  R2 = -(-R1  -R2) (allowed?) M.P. Johnson, DBMS, Stern/NYU, Sp2004

  26. Set operations - example R: S: R  S: M.P. Johnson, DBMS, Stern/NYU, Sp2004

  27. Operators • Selection • Selects all tuples satisfying a condition • Notation: sc(R) • Examples • ssalary > 100000(Employee) • sname = “Smith”(Employee) • The condition c can have • comparison ops:=, <, , >,, <> • boolean ops: and, or M.P. Johnson, DBMS, Stern/NYU, Sp2004

  28. Theater N’hood Title Angelica Village City of God Angelica Village Fog of War Theater N’hood Title Film Forum Village City of God Angelica Village City of God Angelica Village Fog of War Selection example • Select the movies at Angelica: • sTheater=“Angelica”(Showings) M.P. Johnson, DBMS, Stern/NYU, Sp2004

  29. Operators • Projection: op we used for decomposition • Eliminates columns, then removes duplicates • Notation: PA1,…,An(R) M.P. Johnson, DBMS, Stern/NYU, Sp2004

  30. Operators • Cartesian Product • Cross product • Each tuple in R1 combines w/each tuple in R2 • Notation: R1 R2 • If R1, R2 fields overlap, include both and disambiguate: R1.A, R2.A • Fairly rare in practice • used to express joins • Q: Where does the name come from? • Q: If R1 has n1 rows and R2 has n2, how large is R1 x R2? M.P. Johnson, DBMS, Stern/NYU, Sp2004

  31. Cartesian product example Hillary-addresses Hillary-jobs Hillary-addresses x Hillary-jobs M.P. Johnson, DBMS, Stern/NYU, Sp2004

  32. Operators • Natural join: our join up to now • But always merging shared attributes • Notation: R1 ⋈ R2 • Meaning: R1⋈ R2 = Pevery att once(sshared atts =(R1 R2)) • I.e., first compute the cross product R1 x R2 • Next, select the rows in which shared fields agree • Finally, project onto the union of R1 and R2’s fields (remove duplicates) M.P. Johnson, DBMS, Stern/NYU, Sp2004

  33. Natural join example Addresses Jobs Addresses ⋈ Jobs M.P. Johnson, DBMS, Stern/NYU, Sp2004

  34. Natural Join • R S • R ⋈ S= ? • Unpaired tuples called dangling M.P. Johnson, DBMS, Stern/NYU, Sp2004

  35. Natural Join • Given the schemas R(A, B, C, D), S(A, C, E), what is the schema of R ⋈ S ? • Given R(A, B, C), S(D, E), what is R ⋈ S? • Given R(A, B), S(A, B), what is R ⋈ S? M.P. Johnson, DBMS, Stern/NYU, Sp2004

  36. Theta Join • Like natural join, but • includes only rows that satisfy arbitrary condition • Does not project away shared attributes • R1⋈q R2 = sq(R1 R2) • Here q can be any condition • If condition is always satisfies, then theta join becomes natural join M.P. Johnson, DBMS, Stern/NYU, Sp2004

  37. Theta-join example U V U V A<D M.P. Johnson, DBMS, Stern/NYU, Sp2004

  38. Equijoin • A theta join where q is an equality • R1 ⋈A=B R2 = sA=B(R1  R2) • s = lower-case sigma • Example: • Employee ⋈SSN=SSN Dependents • Most useful join in practice M.P. Johnson, DBMS, Stern/NYU, Sp2004

  39. Semijoin • R ⋉ S = P{atts of R}(R ⋈ S) • Q: What does this mean? • Natural join of R and S; • Then project onto R’s atts • A: The rows of R for which >1 row in S agree on shared atts M.P. Johnson, DBMS, Stern/NYU, Sp2004

  40. Semijoin example Dependents Employee network Employee ⋉ Dependents = {employees who have dependents} M.P. Johnson, DBMS, Stern/NYU, Sp2004

  41. Renaming • Changes the schema, not the instance • Notation: rB1,…,Bn(R) • r is spelled “rho”, pronounced “row” • Example: • Employee(ssn,name) • rE2(social, name)(Employee) • Or just: rE(Employee) M.P. Johnson, DBMS, Stern/NYU, Sp2004

  42. Complex RA Expressions • Q: How long was Star Wars (1977)? • Strategy: find the row with Star Wars; then project the length field M.P. Johnson, DBMS, Stern/NYU, Sp2004

  43. Combining operations • Schema: Movies (Title, year, length, filmType, studioName) • Query: select titles and years of movies by Fox that are at least 100 minutes long. M.P. Johnson, DBMS, Stern/NYU, Sp2004

  44. Complex RA Expressions • Reps(ssn, name, etc.) • Clients(ssn, name, rssn) • Q: Find George’s client names • PClients.name(sReps.name=George(sReps.ssn=rssn( Reps x Clients))) • Or: PClients.name(sReps.name=George and Reps.ssn=rssn(Reps x Clients)) • Or: PClients.name(sReps.name=George(Reps x Clients) sReps.ssn=rssn(Reps x Clients)) M.P. Johnson, DBMS, Stern/NYU, Sp2004

  45. For next time • Finish chapter 5 • Come to office hours! M.P. Johnson, DBMS, Stern/NYU, Sp2004

More Related