220 likes | 287 Views
Understand the concept of normalization in databases and the significance of 1NF, 2NF, and 3NF. Learn about the levels of normalization, anomalies in 1NF and 2NF, and how to resolve them. Explore examples and solutions to ensure data integrity.
E N D
Normalization 1NF, 2NF, 3NF
Introduction • In Kp.88 we have the suppliers and parts database • S {S#, SNAME, STATUS, CITY} • P {P#, PNAME, COLOR, WEIGHT, CITY} • SP {S#, P#, QTY} • What happens if the design is changed in some way like • Supplier’s CITY is inserted in SP SCP • See Kp. 408 Fig.11.1
Introduction (cont.1) • In Fig. 11.1(kp.408), the sample value for relvar SCP • There are many redundant information • S1’s city London: 6 times • S4’s city London: 3 times • What’ll be remained if update on S1’s city happens incorrectly due to the redundancy? • Some S1’s city is London • Some S1’s city is Amsterdam • …… good design principle “One fact in one place” = avoiding redundancy
Introduction (cont.2) • Normalization • Concerns about how a given relation containing certain undesirable properties can be converted to a more desirable form • Normal forms • If a relation satisfies a certain specified set of constraints
Normal forms • Levels of normalization • Universe of relvars • Normalized and unnormalized • 1NF relvars • Normalized • 2NF relvars • 3NF relvars • BCNF relvars • 4NF relvars • 5NF relvars • See Kp. 411 Fig.11.2 • The normalization procedure is reversible • Ex) 2NF 3NF • No information is lost More highly normalized
Nonloss Decomposition • Nonloss decomposition satisfies the following two properties • Breaking down a relvar does not lose information • Reversibility • The original relvar is equal to the join of decomposed relvars (join of its projections) • Correct further normalization has to satisfy this property!
Example of nonloss decomposition • S{S#, STATUS, CITY} is decomposed into two ways • SST{S#, STATUS}, SC{S#, CITY} nonloss • SST{S#, STATUS}, STC{STATUS. CITY} lossy
FD diagrams • Pictorial representation of FDs • Ex) FD diagrams for relvars S, SP, and P
First Normal Form • A relvar is in 1NF • Iff every tuple contains exactly one value for each attribute in every legal value of the relvar • Ex) also see fig.11.6 (kp.418) FIRST{S#, STATUS, CITY, P#, QTY} PRIMARY KEY {S#, P#}
Anomalies in 1NF • Let’s think about anomalies due to the FD S#CITY • Insert • We cannot just insert a supplier’s city unless the supplier must supply at least one part • Ex) insertion of <S5, , Athens, , > primary key (S#, p#) value becomes null : not allowed
Anomalies in 1NF (cont.1) • Delete • If we delete a sole tuple for a particular supplier, we lose • not only his shipment • but also his city. • Ex) <S3, 10, Paris, P2, 200>: 4th from the bottom in fig.11.6 (kp.418)
Anomalies in 1NF (cont.2) • Update • If we update the city value for a particular supplier, we may have to update many tuples • Ex) <S1, London> <S1, Amsterdam>: • We have to update 6 tuples • May cause inconsistency if we miss updating any tuple
Solution for the anomalies of 1NF • Decompose relvar FIRST{S#, STATUS, CITY, P#, QTY} into 2 relvars • SECOND{S#, STATUS, CITY} • SP{S#, P#, QTY} • See fig. 11.8 (kp. 420)
Solution for the anomalies of 1NF(cont.) • New relvars • SECOND{S#, STATUS, CITY} • SP{S#, P#, QTY} • See fig. 11.8 (kp. 420) and check • Can we insert <S5, , Athens>? • Can we delete <S3, p2, 200> without deleting S3’s information in SECOND? • Can we update S1’s city in one tuple only?
Second Normal Form • A relvar is in 2NF iff • 1NF and • Every nonkey attribute is irreducibly dependent on the primary key • Ex) • FIRST{S#, STATUS, CITY, P#, QTY} • Not 2NF because of FDs S#CITY, S#STATUS • S# is reduced from the primary key {S#, P#} • SECOND{S#, STATUS, CITY}, SP{S#, P#, QTY} • 2NF
Problem of 2NF • Lack of mutual independence among its nonkey attributes • Ex) in SECOND{S#, STATUS, CITY}, SP{S#, P#, QTY}, • we still have an FD CITYSTATUS • Because S#CITY, CITYSTATUS, • S#STATUS : transitive dependency
Anomalies of SECOND • SECOND{S#, STATUS, CITY}, • FD S#STATUS, S#CITY, CITYSTATUS • Insert • We cannot insert the fact only that a particular city has a particular status (CITY STATUS) • We can insert the fact only when a supplier is actually in the city • Delete • If we delete a tuple in SECOND, we may delete the STATUS information of the CITY • Ex) if we delete <S5, 30, Athens> in Fig.11.8 (kp.420), we lose STATUS information of Athens also.
Anomalies of SECOND (cont.) • SECOND{S#, STATUS, CITY}, • FD S#STATUS, S#CITY, CITYSTATUS • Update • If we update the STATUS value for a particular CITY, we may have to update many tuples • Ex) <London, 20> <London, 30>: • We have to update 2 tuples • May cause inconsistency if we miss updating any tuple
Solution for Anomalies of SECOND • Decompose • SECOND{S#, STATUS, CITY}, • FD S#STATUS, S#CITY, CITYSTATUS • Into SC{S#, CITY}, CS{CITY, STATUS} • The effect of the decomposition is to eliminate the transitive dependencies • See fig.11.10 (kp. 423) for example tables
Third Normal Form • A relvar is 3NF iff • 2NF • Every nonkey attribute is nontransitively dependent on the primary key • In other words, no mutual dependency • Ex) SC, and CS: 3NF