1 / 45

Improving the Quality of Database Designs

Improving the Quality of Database Designs. (Adapted from David Kroenke, Dabase Processing ). Improving the Quality of Database Designs. Minimizing Redundancy in Database Avoiding Anomalies Function Dependency Normal Forms First Normal Form Second Normal Form Third Normal Form

jalen
Download Presentation

Improving the Quality of Database Designs

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Improving the Quality of Database Designs (Adapted from David Kroenke, Dabase Processing)

  2. Improving the Quality of Database Designs • Minimizing Redundancy in Database • Avoiding Anomalies • Function Dependency • Normal Forms • First Normal Form • Second Normal Form • Third Normal Form • Exercise Problems

  3. Minimizing Redundancy in DB • Redundancy • Wastes space • Wastes time • Causes Anomalies (incorrect data)

  4. Avoiding Anomalies • Causes • Update Anomaly • Insertion Anomaly • Deletion Anomaly

  5. Back to UA Back to DA DVD Table Back to IA

  6. Update Anomaly • Situation in whichUpdate in one record requires update in another record. • E.g.Suppose for dvdID #150 (Elizabeth),length is changed to 100.If length values in devID #157 and #125 are not changed also, we have anomalies.To DVD

  7. Insertion Anomaly • Situation in whichAdding a record results in an inconsistency • Suppose another copy of The 39 Steps is added to the table. If its values of genre, length, and rating are not the same as those dvdID #120, we have an anomaly.To DVD

  8. Deletion Anomaly • Situation in whichDeleting one record results in unintended loss of data • Suppose dvdID #172 is removed. Then all data items regarding studio DEF and its country (Poland) —will be lost.To DVD

  9. Functional Dependence • Definition • Given: A and B are attributes of relation (table) R Then B is functionally dependent on A if and only if each value in A has associated with it exactly one value of B in R. • A  B ( A determines B) • I.e., any 2 rows with same value for A will have the same value for B

  10. Functional Dependence (1) • DVD (title, publisher, length, director, pubAddress) • publisher  pubAddress (yes) • title  length (no) • title, publisher  length (yes) Back to 2NF

  11. Functional Dependence (2) • Books (bkID, ISBN, title, author, pubAddress) • ISBN  title (yes) • ISBN  author (yes) • bkID  title (yes) • bkID  author (yes) • bkID  pubAddress (yes) • title, publisher  length (yes) • A primary key determine each nonkey attribute

  12. First Normal Form (1NF) • A relation (table) is in 1NF if • Each row is unique (with primary key) • All attributes are atomic

  13. Second Normal Form (2NF) • A relation (table) is in second normal form if • All nonkey attributes are dependent on all of the key. (This means that the relation is not in 2NF if any nonkey attribute is dependent on only part of the key.) • E.g., in DVD, length is dependent only on title, but not on publisher. To FD1

  14. Back to Problems 2NF? (No) StudentdActivities

  15. Problems • Note • Key: stdID + activities • Attribute fee is dependent only on activities (partial key). • Problems • There are obvious redundancies. • If student 175 is removed, fee($50) for Squash is deleted. • A new activity—say Surfing—cannot be entered until a student is entered To 2NF

  16. Solution • Remove the attribute that is dependent only on part of the key and form a new table • Create a link between the new and the original tables using a foreign key • Note: if a relation (table) is 1NF and the primary key consists of a single attribute, the relation is automatically 2NF.

  17. Solution Activities Fees

  18. Third Normal Form (3NF) • A relation is in 3NF if • It is in 2NF and • There are no transitive dependencies. (I.e., every nonkey attribute is dependent only on the primary key.) • Table satisfying 3NF (in common terms) • Should have a field that uniquely identifies each record • Each field in the table should describe the subject that the table represents

  19. 3NF? (No) StudentHousing Back to Problems

  20. Transitive Dependence • stdID building (I.e., building is dependent on stdID) • building  fee (I.e., fee is dependent on building) • Thus,stdID  building  fee

  21. Problems • StdHousing is in 2NF, but • Redundant data will introduce modification anomaly • Removing stdID 150 deletes fee value for Ingersoll • Fee for a new building—say Barrett—cannot be recorded until a new stdID is enteredTo 3NF

  22. Solution • Remove data that is not dependent on primary key and form new relation • Create a relationship between the new and the original tables using foreign key

  23. Solution StudentResidence ResidenceFee

  24. Try This (Customers Table) Back to Problem

  25. Problem • Note that • custNum  ZIP ZIP  city, stateI.e., custNum  ZIP  city, state • Transitive dependence results in redundancy and modification, insertion, & deletion anomalies. To Customers

  26. Solution

  27. Summary • Examine the attributes of an entity and ask the following questions. If the answer is any “Yes,” an attribute probably belong to another entity. • Does an attribute or attributes describe an entity other than the current one? • Does an attribute of the entity depend (functionally dependent) on only part of the primary key? • Does an attribute depend on something other than the primary key?

  28. Company Database Customers custIdcustNamecustAddresscustCitycustStatecustZipcustPhonecustFaxorderNumorderQuantityorderDate Employees empIdempLastNameempFirstNameempMiddleNameempAddressempCityempStateempZipempPhoneempPagerempPositionempPositionDescripempDateHireempPayRateempDateLastRaise empIdempLastNameempFirstNameempMiddleNameempAddressempCityempStateempZipempPhoneempPagerempPositionempPositionDescripempDateHireempPayRateempDateLastRaise custIdcustNamecustAddresscustCitycustStatecustZipcustPhonecustFaxorderNumorderQuantityorderDateprodIdprodDescripprodCost Products prodIdprodDescripprodCost

  29. Company Database (2) Employees empIdempLastNameempFirstNameempMiddleNameempAddressempCityempStateempZipempPhoneempPager Employees empIdempLastNameempFirstNameempMiddleNameempAddressempCityempStateempZipempPhoneempPagerempPositionempDateHireempPayRateempDateLastRaise EmployeePaysempIdempPositionempPositionDescripempDateHireempPayRateempDateLastRaise

  30. Company Database Customers custIdcustNamecustAddresscustCitycustStatecustZipcustPhonecustFax Customers custIdcustNamecustAddresscustCitycustStatecustZipcustPhonecustFaxorderNumorderQuantityorderDate Orders custIdorderNumorderQuantityorderDate

  31. Quiz • Normalization is the process of grouping data into logically related data into tables to reduce redundancy. (T/F) • Having no duplicate or redundant data in a database, and having everything in the database normalized, is always the best way to go. (T/F) • If data is in the third normal form, it is automatically in the first and second normal forms. (T/F) • What is the major advantage of denormalized database versus a normalized database? • What are some major disadvantages of unnormalized database?

  32. Customers custIdcustNamecustAddresscustcitycustStatecustZipcustPhonecustFax EmployeePays empPayIdempDateHireempPayRateempDateLastRaise Employees empIdempLastNameempFirstNameempMiddleNameempAddressempCityempStateempZipempPhoneempPager Positions os_idpositionposition_descrip Orders orderNumorderQuantityorderDate Exercise:What Type of Relationships Do the Tables Have?

  33. Exercise:Normalize the following data. Take the following data and normalize it. Keep in mind that, in a real DB, there would be many more items than what is given here. Employees: Angela Smith, secretary, RR 1 Box 73, Greensburg, IN, 47890, $9.50/hour, started Jan. 22, 1996, SSN is 323149669 Jack Lee Nelson, salesman, 3334 N. Main St., Brownsburg, IN, 45687, 317-852-9901, $35,000.00/year, data started 10/28/95, SSN is 312567342 Customers: Robert’s Games & Things, 5612 Lafayette Rd., Indianapolis, IN, 46224, 317-291-7888, customer ID is 432A Reed’s Dairy Bar, 4556 W 10th St., Indianapolis, IN, 46245, 317-271-9823, customer ID is 117A CustomerOrders:Customer ID is 117A, date of last order is 2/20/1997, product ordered was napkins, and product ID is 661

  34. Tables Customers Employees Orders SsnlastNamefirstNamestreetcitystatezipphoneNumsalaryhourlyRatestartDateposition customerIDnamestreetcitystatezipphoneNum orderIDcustomerIDproductIDproductDescripdateOrdered

  35. Normalization Case Study • A database named “Movie Rentals” keeps track of which customer checked which movies.

  36. Rentals Table • 1NF • Each cell value is atomic • No repeating data fields • Is the table in 1NF?

  37. To Satisfy 1NF Requirements (1) Lets simplify the primary key

  38. To Satisfy 1NF Requirements (2) Make “movieTitle” and “category” atomic. (Note that fullName has also been split.)

  39. To Satisfy 1NF Requirements (3) Eliminate “repeating” phone data. Customers Phones Rentals

  40. To Satisfy 2NF Requirements (1) • 2NF • Is in 1NF • Each nonkey field value depends on the entire primary key

  41. To Satisfy 2NF Requirements (2) • Note that 2NF is a concern only with composite primary key.

  42. To Satisfy 3NF Requirements (1) • For 3NF • In 2NF • All nonkey field values depend only on primary key (i.e., no transitive dependency)

  43. To Satisfy 3NF Requirements (2) • In Customers table • Cust → nameFirst → title. I.e., tile depends on nameFirst

  44. To Satisfy 3NF Requirements (3) Titles Customers

  45. Titles Customers 1 N 1 1 Rentals N Phones N

More Related