110 likes | 207 Views
CIS 218 Transforming a Data Model into a Relational Design. Normalization Representing relationships Table Definition. Normalization. Normalization is a process of analyzing a relation to ensure that it is well formed
E N D
CIS 218Transforming a Data Model into a Relational Design Normalization Representing relationships Table Definition
Normalization • Normalization is a process of analyzing a relation to ensure that it is well formed • If a relation is normalized, rows can be inserted, deleted, or modified without problems • Most problems are solved by breaking an existing table into two or more tables
The Rules of One • A database will be normalized if it follows the Rules of One: • One theme per table • One item per row • One attribute per column • One value per attribute
One theme per Table Student sid name gender ethnicity advisorID advisorName extension This table stores data about two things: Students and Advisors Student sid name gender ethnicity Advisor advisorID advisorName extension
One theme per table Pet petID name breed min_weight max_weight avg_life_expectancy Breed breed min_weight max_weight avg_life_expectancy Pet petID name
One item per row Item Item# Room Days Times Instructor Item Item# Instructor Item_Detail Item# Room Days Times One item# can have mulitple rooms, days, and times
One Attribute per Column name actually represents two attributes: firstName and last Name Student sid name gender ethnicity advisorID Student sid firstName lastName gender ethnicity advisorID
One Value per attribute If a sid can have more than one ethnicity, then we transform this relation Into two: Student sid firstName lastName gender ethnicity Student sid firstName lastName gender Ethnicity ethnic_code ethnicity
One value per attribute Pet petID name breed gender treatment price Pet petID name breed gender Treatment treatment price
Denormalization • Complexity vs. modification problems • Normalizing relations may significantly increase the complexity of the data structure • Denormalized relations may be preferred • CUSTOMER and ZIP • It is inefficient to read from two different tables to get a customer’s address • Modification problems are infrequent, because zip codes rarely change • Deletion problems are not usually an issue