200 likes | 385 Views
Anomalies—Design Problems. Redundancy Update Anomalies (modification, insertion, deletion). HasReservationFor. Redundancy: e.g. data value computable from other data values & constraints—e.g., Kennedy
E N D
Anomalies—Design Problems • Redundancy • Update Anomalies (modification, insertion, deletion) HasReservationFor Redundancy: e.g. data value computable from other data values & constraints—e.g., Kennedy Modification Anomaly: e.g., change Kennedy room to have 3 beds instead of 2—must update all redundant values consistently. Insertion Anomaly: e.g., add a new room (the Gold room)—necessarily yields a null. Deletion Anomaly: e.g., G4 cancels reservation—the fact that the Green room is room 5 is lost. Guest RoomNr RName NrBeds G1 1 Kennedy 2 G2 1 Kennedy 2 G3 1 Kennedy 2 G4 5 Green 1 G5 3 Carter 2 G5 2 Nixon 2 G5 4 Blue 1 RoomNr RName, NrBeds RName RoomNr, NrBeds
Decomposition Can Resolve Anomalies Guest RoomNr G1 1 G2 1 G3 1 G4 5 G5 3 G5 2 G5 4 RoomNr RName NrBeds 1 Kennedy 2 2 Nixon 2 3 Carter 2 4 Blue 1 5 Green 1 RoomNr RName, NrBeds RName RoomNr, NrBeds • Modification Anomaly—2 beds to 3 in Kennedy room, only one change • Insertion Anomaly—Adding Gold room, no null values • Deletion Anomaly—G4 cancels, room 5 is still there • Redundancy—None
Decomposition in General Relation R decomposed into {X1, X2, …, Xn} - X1 R, X2 R, …, Xn R and X1U X2 U … U Xn = R - r1(X1) = πX1R, …, rn(Xn) = πXnR - e.g. R = ABCDE OK: {AB, BCD, AE} NOT OK: {ABCD, DEF} NOT OK: {AB, BCD} Cannot just decompose arbitrarily - May lose information - May not remove redundancy & update anomalies
Lossy Decomposition Guest RoomNr G1 1 G2 1 G3 1 G4 5 G5 3 G5 2 G5 4 RName NrBeds Kennedy 2 Nixon 2 Carter 2 Blue 1 Green 1 Applying |X| = full cross product = original e.g. Asserts (for example): <G1, 1, Blue, 1>, which is false
Overlapping Attributes is Not Enough Guest NrBeds G1 2 G2 2 G3 2 G4 1 G5 2 G5 1 RoomNr RName NrBeds 1 Kennedy 2 2 Nixon 2 3 Carter 2 4 Blue 1 5 Green 1 Applying |X| = Natural Join = original e.g. Asserts (for example): <G1, 3, Carter, 2>, but G1 does not have a reservation for room 3 If overlapping attributes include a superkey of one (or both) of the decomposed relations, the join is lossless. e.g., see original decomposition
Boyce-Codd Normal Form (BCNF) A condition that guarantees no redundancy and no update anomalies (wrt the functional dependencies). Let U be a set of attributes and let F be a set of FDs over U. Let R U. R is in Boyce-Codd Normal Form (BCNF) if for every nontrivial FD XY F+ such that XY R, X is a superkey of R. • Example: for U = Guest RoomNr RName NrBeds • and F = {RoomNr RName, NrBeds; RName RoomNr, NrBeds} • Guest RoomNr RName NrBeds is not in BCNF • Guest RoomNr & RoomNr RName NrBeds are both in BCNF
Boyce-Codd Normal Form (BCNF) A condition that guarantees no redundancy and no update anomalies (wrt the functional dependencies). Let U be a set of attributes and let F be a set of FDs over U. Let R U. If for every nontrivial FD XY F+ that applies to R, X is a superkey of R, then R is in BCNF. BCNF tells us both: 1. When we need to decompose 2. How to decompose
How do we decompose? • Answer: Make the attributes XY of the offending FD a schema and remove the Y attributes from the rest, R–Y (this assumes X ∩ Y = Ø; if not, XY–X will be an offending FD) • Since (R–Y) ∩ XY = X when X ∩ Y = Ø, if X is a superkey in XY, we have the condition we need for the decomposition to be lossless. • We are notguaranteed that either R–Y or XY is in BCNF, so we may have to repeat the process. • Hint: To arrive at the end quickly, make Y have as many attributes as possible (but still maintain X ∩ Y = Ø ) • When do we decompose? • Answer: When there is a nontrivial FD, XY F+ such that XY R and X is not a superkey
Example Consider: Reservations(Guest, RoomNr, RName, NrBeds) With FDs: RoomNrRName, NrBeds RNameRoomNr, NrBeds • Are there any FDs (implied or given) that violate BCNF? • RoomNrRName, NrBeds violates BCNF • RoomNr is not a superkey because RoomNrGuest • Split the relation in two (XY, and R–Y): • 1. XY: The violating FD: {RoomNr, RName, NrBeds} • 2. R–Y: {Guest, RoomNr} And thus minimal keys: Guest RoomNr and Guest RName
Example (cont.) • Consider the new relational schemas: • RoomNr, RName, NrBeds • F+= { RoomNrRName, NrBeds, RNameRoomNr, NrBeds, … NrBedsNrBeds, RoomNr, NrBedsRName} • Keys: RoomNr and RName • All FDs satisfy BCNF • Guest, RoomNr • Any two-attribute relation R(AB) is in BCNF • Proof: • Case1: There are no nontrivial FDs only nontrivial FDs can violate BCNF. • Case2: AB holds, but BA does not. A is the key, and the only nontrivial FDs are AB (& AAB); no BCNF violation. • Case 3: BA holds, but AB does not. Symmetric to Case 2. • Case 4: Both AB and BA hold. A and B are both keys; no BCNF violation.
Example—Multiple Decompositions F = { AB C, A C, C DE, D EF } • R = ABCDEF is not in BCNF • e.g., C+ = CDEF so CDEF violates BCNF since CDEF applies and is non-trivial and since C is not a superkey • R–Y = ABC and XY = CDEF • ABC is not in BCNF • e.g., A+ = ACDEF so AC and A is not a superkey • R–Y = AB and XY = AC -- both are in BCNF (2 attributes) • CDEF is not in BCNF • e.g., D+ = DEF so DEF and D is not a superkey • R–Y = CD and XY = DEF -- both are in BCNF • We end up with {AB, AC, CD, DEF}
Dependency Preserving If we decompose R into X1, X2, … then if enforcing the FDs that hold on X1, X2, … is sufficient to guarantee that all FDs on R hold, the decomposition is dependency preserving. • Not Dependency Preserving: F = {ABC, CA} • ABC not in BCNF • {AC, BC} in BCNF • {CA} F (Since AB+ = AB, ABC is not implied by {CA}.) • Dependency Preserving: F = {ABC, AC} • ABC not in BCNF • {AC, AB} in BCNF • {AC} F (Since AB+ = ABC, ABC is implied by {AC}.)
Is Our Previous Example Dependency Preserving? F = { AB C, A C, C DE, D EF } • We ended up with {AB, AC, CD, DEF} • {AC, CD, DEF} F • since AC, ABC holds by augmentation & projection • since CD and DEF, CDE holds by accumulation & projection • {AB, AC, CD, DEF} is dependency preserving
3NF • BCNF can guarantee no redundancy or update anomalies, but cannot guarantee dependency preserving. • 3NF relaxes the requirements and can guarantee dependency preserving at the cost of not always removing all redundancy. • Let U be a set of attributes and let F be a set of FDs over U. Let R U. R is in Third Normal Form (3NF) if for every nontrivial FD XY F+ such that XY R, X is a superkey of R or each attribute in Y–X appears in some candidate key of R.
3NF Examples • ABC {ABC, CA} (the earlier problematic example) • In 3NF (but not in BCNF) • C is not a superkey, but A is in a candidate key • ABC {AB, BC} • Not in 3NF • B is not a superkey, and C is not in a candidate key • ABCD {ABC, BD} • Not in 3NF • B is not a superkey, and D is not in a candidate key
3NF Motivation PickyGuest makes reservations only for one room Room PickyGuest Date {Room, Date PickyGuest; PickyGuest Room} • {Room, Date, PickyGuest}—Not in BCNF: PickyGuest+ {Room, Date, PickyGuest} • So Decompose to: {Date, PickyGuest}, {PickyGuest, Room} • Both in BCNF—no redundancy or update anomolies. • BUT can’t directly enforce: Room,Date PickyGuest • Options: • Don’t decompose and live with redundancy (i.e., if pickyGuest changes favorite room, must change all pickyGuest reservations) • Decompose and write code to join the relations to check violations whenever there is a modification or insert • Always consider the business implications when making this decision. Maybe you really won’t want to keep the offending constraint (or even have code that checks it).
3 3 3 3 2 2 2 1 1 Basic Patterns of Violations A B C D E G • BCNF: None of these is OK • 3NF: OK i.e., GA, A is part of a candidate key • 2NF: & OK • 1NF: & & OK • 1NF requires atomic values atomic • i.e., the DB cannot address subatomic elements Address: “12 Maple, Boston, MA” vs. StreetNr, City, State “I do solemnly swear to make every attribute functionally depend on the key, the whole key, and nothing but the key.”
URA: Universal Relation Assumption • A “hidden” assumption about FD theory and normalization—the URA must hold. • URA • Informal: All relationships over any set of attributes of the database have the same meaning (semantic equivalence). Further, there are no nulls and no dangling tuples. • Formal: For any valid database instance, r1(R1), …, rn(Rn), Ri(r1 || … || rn) = ri for 1 i n.
Semantics & Semantic Equivalence q = Room, Name(r || s) ?? Case 1: q is name of guest staying in room. Room Guest and Guest Name implies Room Name. But the meaning of the implied relationship set, Room Name, is the composition of the meanings of the relationship sets, Room Guest and Guest Name, which may or may not be the meaning of the relationship set q. Case 2: q is name of room.
FD Theory & Normalizationwith and without URA • With URA (Case 1) • q = Room,Name(r || s) • Room Guest and Guest Name and Guest is not a key • {Room, Guest, Name} is not in BCNF. • Decompose to: {Room, Guest} and {Guest, Name} • The DB has these two schemas. • Another way to see this is to note that Guest Name is redundant • Delete it. • Map the two remaining relationships to schemas. • Without URA (Case 2) • q Room,Name(r || s) • Here Guest Name is not redundant • Keep it. • Map the three relationships to three schemas.