1 / 20

Anomalies—Design Problems

Anomalies—Design Problems. Redundancy Update Anomalies (modification, insertion, deletion). HasReservationFor. Redundancy: e.g. data value computable from other data values & constraints—e.g., Kennedy

eros
Download Presentation

Anomalies—Design Problems

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Anomalies—Design Problems • Redundancy • Update Anomalies (modification, insertion, deletion) HasReservationFor Redundancy: e.g. data value computable from other data values & constraints—e.g., Kennedy Modification Anomaly: e.g., change Kennedy room to have 3 beds instead of 2—must update all redundant values consistently. Insertion Anomaly: e.g., add a new room (the Gold room)—necessarily yields a null. Deletion Anomaly: e.g., G4 cancels reservation—the fact that the Green room is room 5 is lost. Guest RoomNr RName NrBeds G1 1 Kennedy 2 G2 1 Kennedy 2 G3 1 Kennedy 2 G4 5 Green 1 G5 3 Carter 2 G5 2 Nixon 2 G5 4 Blue 1 RoomNr  RName, NrBeds RName  RoomNr, NrBeds

  2. Decomposition Can Resolve Anomalies Guest RoomNr G1 1 G2 1 G3 1 G4 5 G5 3 G5 2 G5 4 RoomNr RName NrBeds 1 Kennedy 2 2 Nixon 2 3 Carter 2 4 Blue 1 5 Green 1 RoomNr  RName, NrBeds RName  RoomNr, NrBeds • Modification Anomaly—2 beds to 3 in Kennedy room, only one change • Insertion Anomaly—Adding Gold room, no null values • Deletion Anomaly—G4 cancels, room 5 is still there • Redundancy—None

  3. Decomposition in General Relation R decomposed into {X1, X2, …, Xn} - X1 R, X2 R, …, Xn R and X1U X2 U … U Xn = R - r1(X1) = πX1R, …, rn(Xn) = πXnR - e.g. R = ABCDE OK: {AB, BCD, AE} NOT OK: {ABCD, DEF} NOT OK: {AB, BCD} Cannot just decompose arbitrarily - May lose information - May not remove redundancy & update anomalies

  4. Lossy Decomposition Guest RoomNr G1 1 G2 1 G3 1 G4 5 G5 3 G5 2 G5 4 RName NrBeds Kennedy 2 Nixon 2 Carter 2 Blue 1 Green 1 Applying |X| = full cross product = original e.g. Asserts (for example): <G1, 1, Blue, 1>, which is false

  5. Overlapping Attributes is Not Enough Guest NrBeds G1 2 G2 2 G3 2 G4 1 G5 2 G5 1 RoomNr RName NrBeds 1 Kennedy 2 2 Nixon 2 3 Carter 2 4 Blue 1 5 Green 1 Applying |X| = Natural Join = original e.g. Asserts (for example): <G1, 3, Carter, 2>, but G1 does not have a reservation for room 3 If overlapping attributes include a superkey of one (or both) of the decomposed relations, the join is lossless. e.g., see original decomposition

  6. Boyce-Codd Normal Form (BCNF) A condition that guarantees no redundancy and no update anomalies (wrt the functional dependencies). Let U be a set of attributes and let F be a set of FDs over U. Let R U. R is in Boyce-Codd Normal Form (BCNF) if for every nontrivial FD XY F+ such that XY R, X is a superkey of R. • Example: for U = Guest RoomNr RName NrBeds • and F = {RoomNr  RName, NrBeds; RName  RoomNr, NrBeds} • Guest RoomNr RName NrBeds is not in BCNF • Guest RoomNr & RoomNr RName NrBeds are both in BCNF

  7. Boyce-Codd Normal Form (BCNF) A condition that guarantees no redundancy and no update anomalies (wrt the functional dependencies). Let U be a set of attributes and let F be a set of FDs over U. Let R U. If for every nontrivial FD XY F+ that applies to R, X is a superkey of R, then R is in BCNF. BCNF tells us both: 1. When we need to decompose 2. How to decompose

  8. How do we decompose? • Answer: Make the attributes XY of the offending FD a schema and remove the Y attributes from the rest, R–Y (this assumes X ∩ Y = Ø; if not, XY–X will be an offending FD) • Since (R–Y) ∩ XY = X when X ∩ Y = Ø, if X is a superkey in XY, we have the condition we need for the decomposition to be lossless. • We are notguaranteed that either R–Y or XY is in BCNF, so we may have to repeat the process. • Hint: To arrive at the end quickly, make Y have as many attributes as possible (but still maintain X ∩ Y = Ø ) • When do we decompose? • Answer: When there is a nontrivial FD, XY  F+ such that XY R and X is not a superkey

  9. Example Consider: Reservations(Guest, RoomNr, RName, NrBeds) With FDs: RoomNrRName, NrBeds RNameRoomNr, NrBeds • Are there any FDs (implied or given) that violate BCNF? • RoomNrRName, NrBeds violates BCNF • RoomNr is not a superkey because RoomNrGuest • Split the relation in two (XY, and R–Y): • 1. XY: The violating FD: {RoomNr, RName, NrBeds} • 2. R–Y: {Guest, RoomNr} And thus minimal keys: Guest RoomNr and Guest RName

  10. Example (cont.) • Consider the new relational schemas: • RoomNr, RName, NrBeds • F+= { RoomNrRName, NrBeds, RNameRoomNr, NrBeds, … NrBedsNrBeds, RoomNr, NrBedsRName} • Keys: RoomNr and RName • All FDs satisfy BCNF • Guest, RoomNr • Any two-attribute relation R(AB) is in BCNF • Proof: • Case1: There are no nontrivial FDs  only nontrivial FDs can violate BCNF. • Case2: AB holds, but BA does not. A is the key, and the only nontrivial FDs are AB (& AAB); no BCNF violation. • Case 3: BA holds, but AB does not. Symmetric to Case 2. • Case 4: Both AB and BA hold. A and B are both keys; no BCNF violation.

  11. Example—Multiple Decompositions F = { AB  C, A  C, C  DE, D  EF } • R = ABCDEF is not in BCNF • e.g., C+ = CDEF so CDEF violates BCNF since CDEF applies and is non-trivial and since C is not a superkey • R–Y = ABC and XY = CDEF • ABC is not in BCNF • e.g., A+ = ACDEF so AC and A is not a superkey • R–Y = AB and XY = AC -- both are in BCNF (2 attributes) • CDEF is not in BCNF • e.g., D+ = DEF so DEF and D is not a superkey • R–Y = CD and XY = DEF -- both are in BCNF • We end up with {AB, AC, CD, DEF}

  12. Dependency Preserving If we decompose R into X1, X2, … then if enforcing the FDs that hold on X1, X2, … is sufficient to guarantee that all FDs on R hold, the decomposition is dependency preserving. • Not Dependency Preserving: F = {ABC, CA} • ABC not in BCNF • {AC, BC} in BCNF • {CA}  F (Since AB+ = AB, ABC is not implied by {CA}.) • Dependency Preserving: F = {ABC, AC} • ABC not in BCNF • {AC, AB} in BCNF • {AC} F (Since AB+ = ABC, ABC is implied by {AC}.)

  13. Is Our Previous Example Dependency Preserving? F = { AB  C, A  C, C  DE, D  EF } • We ended up with {AB, AC, CD, DEF} • {AC, CD, DEF}  F • since AC, ABC holds by augmentation & projection • since CD and DEF, CDE holds by accumulation & projection • {AB, AC, CD, DEF} is dependency preserving

  14. 3NF • BCNF can guarantee no redundancy or update anomalies, but cannot guarantee dependency preserving. • 3NF relaxes the requirements and can guarantee dependency preserving at the cost of not always removing all redundancy. • Let U be a set of attributes and let F be a set of FDs over U. Let R U. R is in Third Normal Form (3NF) if for every nontrivial FD XY  F+ such that XY R, X is a superkey of R or each attribute in Y–X appears in some candidate key of R.

  15. 3NF Examples • ABC {ABC, CA} (the earlier problematic example) • In 3NF (but not in BCNF) • C is not a superkey, but A is in a candidate key • ABC {AB, BC} • Not in 3NF • B is not a superkey, and C is not in a candidate key • ABCD {ABC, BD} • Not in 3NF • B is not a superkey, and D is not in a candidate key

  16. 3NF Motivation PickyGuest makes reservations only for one room Room PickyGuest Date {Room, Date  PickyGuest; PickyGuest  Room} • {Room, Date, PickyGuest}—Not in BCNF: PickyGuest+ {Room, Date, PickyGuest} • So Decompose to: {Date, PickyGuest}, {PickyGuest, Room} • Both in BCNF—no redundancy or update anomolies. • BUT can’t directly enforce: Room,Date  PickyGuest • Options: • Don’t decompose and live with redundancy (i.e., if pickyGuest changes favorite room, must change all pickyGuest reservations) • Decompose and write code to join the relations to check violations whenever there is a modification or insert • Always consider the business implications when making this decision. Maybe you really won’t want to keep the offending constraint (or even have code that checks it).

  17. 3 3 3 3 2 2 2 1 1 Basic Patterns of Violations A B C D E G • BCNF: None of these is OK • 3NF: OK i.e., GA, A is part of a candidate key • 2NF: & OK • 1NF: & & OK • 1NF requires atomic values atomic • i.e., the DB cannot address subatomic elements Address: “12 Maple, Boston, MA” vs. StreetNr, City, State “I do solemnly swear to make every attribute functionally depend on the key, the whole key, and nothing but the key.”

  18. URA: Universal Relation Assumption • A “hidden” assumption about FD theory and normalization—the URA must hold. • URA • Informal: All relationships over any set of attributes of the database have the same meaning (semantic equivalence). Further, there are no nulls and no dangling tuples. • Formal: For any valid database instance, r1(R1), …, rn(Rn), Ri(r1 || … || rn) = ri for 1  i  n.

  19. Semantics & Semantic Equivalence q = Room, Name(r || s) ?? Case 1: q is name of guest staying in room. Room  Guest and Guest  Name implies Room  Name. But the meaning of the implied relationship set, Room  Name, is the composition of the meanings of the relationship sets, Room  Guest and Guest  Name, which may or may not be the meaning of the relationship set q. Case 2: q is name of room.

  20. FD Theory & Normalizationwith and without URA • With URA (Case 1) • q = Room,Name(r || s) • Room  Guest and Guest  Name and Guest is not a key • {Room, Guest, Name} is not in BCNF. • Decompose to: {Room, Guest} and {Guest, Name} • The DB has these two schemas. • Another way to see this is to note that Guest  Name is redundant • Delete it. • Map the two remaining relationships to schemas. • Without URA (Case 2) • q Room,Name(r || s) • Here Guest  Name is not redundant • Keep it. • Map the three relationships to three schemas.

More Related