Normalization

rhulsey

Presentation Transcript


  1. Normalization: Normalization is a formal method that applies a series of tests to help the database designer identify the optimal grouping of attributes for each relation in a relational schema. It is applied to individual relations so that the database can be brought to a specific normal form, which prevents update anomalies from occurring.

  2. Data Redundancy and Update Anomalies: The main purpose of database design is to find the grouping of attributes that minimizes data redundancy, which in turn saves storage space. Data redundancy leads to update anomalies, which are classified into three types: insertion anomalies, deletion anomalies, and modification anomalies.

  3. Insertion Anomalies, Deletion Anomalies, Modification Anomalies

  4. Update anomalies in the Class_Info relation. Insertion anomalies: To insert the details of a new student into the Class_Info relation, we must also include the details of a lecturer and a subject in order to avoid null values. Deletion anomalies: If we delete a lecturer from the Class_Info relation, the details of the students and subjects stored in the same rows are also lost from the database. Modification anomalies: If we want to change the value of one of the attributes of a particular student in the Class_Info relation, we must update all rows associated with that student. If the modification is not carried out on all the appropriate rows, the database becomes inconsistent.

  5. Insertion Anomaly (Class_Info): Inserting new records may cause data redundancy and null values in some fields.

  8. Deletion Anomaly (Class_Info): Deleting a record may cause the loss of other necessary data stored in the same row.

  9. Modification Anomaly (Class_Info): If we want to change the value of one of the attributes of a particular entity in the relation, we must update all rows that relate to this entity. If the modification is not carried out on all the appropriate rows, the database becomes inconsistent.
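A minimal Python sketch of the modification anomaly described above. The attribute names follow the Class_Info attributes used in these slides; the row values, the helper `names_for`, and the partial update are invented for illustration only.

```python
# Hypothetical Class_Info instance: student data is repeated in every row
# that mentions the student (values are illustrative, not from the slides).
class_info = [
    {"SID": "S1", "Sname": "Ann", "Subject": "DB", "LID": "L1", "Lname": "Lee"},
    {"SID": "S1", "Sname": "Ann", "Subject": "OS", "LID": "L2", "Lname": "Kim"},
    {"SID": "S2", "Sname": "Bob", "Subject": "DB", "LID": "L1", "Lname": "Lee"},
]


def names_for(rows, sid):
    """Collect every name stored for the given student id."""
    return {row["Sname"] for row in rows if row["SID"] == sid}


# Faulty update: only the first matching row is changed.
for row in class_info:
    if row["SID"] == "S1":
        row["Sname"] = "Anna"
        break

# The same student now has two different names: the relation is inconsistent.
inconsistent = len(names_for(class_info, "S1")) > 1
print(names_for(class_info, "S1"), inconsistent)
```

If student data were stored once in its own relation, the update would touch a single row and this inconsistency could not arise.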

  10. To resolve update anomalies, a relation must be normalized: the normalization process removes the existing data redundancy.

  11. Functional Dependency: One of the main concepts associated with normalization is functional dependency, which describes the relationship between attributes in a relation. For example, if A and B are attributes (or sets of attributes) of relation R, B is functionally dependent on A (denoted A → B) if each value of A is associated with exactly one value of B. The notation A → B can be read as: B is functionally dependent on A, A determines B, or B depends on A.

  12. Functional Dependencies (definition): Suppose that A and B are attributes. We say that B is functionally dependent on A (denoted A → B) if each value of A is associated with exactly one value of B. (A and B may each consist of one or more attributes.) The notation A → B means that B is functionally dependent on A, that A functionally determines B, or that B depends on A.

  13. If the functional dependency α → β holds on schema R, then in any legal relation r, for all pairs of tuples t1 and t2 in r such that t1[α] = t2[α], it is also the case that t1[β] = t2[β]. Given a relation r, attribute y of r is functionally dependent on attribute x if and only if, whenever two tuples of r agree on their x-value, they also agree on their y-value. In other words, for every pair of tuples in r that have the same value for α, the DBMS must guarantee that they have the same value for β: if α → β holds on R and t1[α] = t2[α], then t1[β] = t2[β].

  14. B is functionally dependent on A: A → B. When a functional dependency exists, the attribute or group of attributes on the left-hand side of the arrow is called the determinant. Position is functionally dependent on Staff_No: Staff_No → Position (SL21 → System Engineer). Staff_No is not functionally dependent on Position: Position does not determine Staff_No (System Engineer is associated with both SL21 and SG5).
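The definition of functional dependency can be checked mechanically on a relation instance. The sketch below is an illustration, not part of the original slides: `fd_holds` is a hypothetical helper, and the two rows are the Staff_No/Position values from the example above. Note that a check on one instance can only refute a dependency; whether it holds on the schema is a statement about all legal instances.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if lhs -> rhs holds in rows (a list of dicts).

    Two rows that agree on every lhs attribute must agree on every rhs attribute.
    """
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # Remember the first rhs value seen for this lhs value; any later
        # row with the same lhs value must carry the same rhs value.
        if seen.setdefault(key, value) != value:
            return False
    return True


staff = [
    {"Staff_No": "SL21", "Position": "System Engineer"},
    {"Staff_No": "SG5", "Position": "System Engineer"},
]

print(fd_holds(staff, ["Staff_No"], ["Position"]))  # True
print(fd_holds(staff, ["Position"], ["Staff_No"]))  # False
```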

  15. (LID, Subject, SID) → Lname, Salary, Dept, Credit, Sname, GPA; LID → Lname, Salary, Dept; Subject → Credit; SID → Sname, GPA

  16. Using functional dependencies to decompose a relation into the Lecturer, Student, and Subject relations.

  17. Normalization is a formal method that applies a series of tests to help the database designer identify the optimal grouping of attributes for each relation in a relational schema. The normal forms are: Unnormalized Form, 1st Normal Form, 2nd Normal Form, 3rd Normal Form, and Boyce-Codd Normal Form. Normalization is applied to individual relations so that the database can be brought to a specific normal form, which prevents update anomalies from occurring. The process of normalization identifies relations based on their primary key (or candidate keys, in the case of BCNF) and the functional dependencies among their attributes.

  18. Relationships of Normal Forms (figure: the normal forms up to the higher normal forms)

  19. Case Study: The DreamHome company manages property on behalf of the owners, and as part of this service, the company takes care of the property’s rental. To simplify this example, we assume that a customer rents a given property only once, and cannot rent more than one property at any one time. Unnormalized form (UNF): A table that contains one or more repeating groups (here, the Customer_Rental relation). A repeating group is an attribute or group of attributes within a table that occurs with multiple values for a single occurrence of the key attribute(s) for that table. The term key refers to the attribute(s) that uniquely identify each row within the unnormalized table.

  20. Case Study (continued): To convert the unnormalized form to 1NF, remove the repeating groups so that the data fit the relational data model (data are conceptually structured in the form of a table). The Customer_Rental relation in this example contains customers CR76 (John Kay) and CR56 (Aline Stewart).

  21. First normal form (1NF): A relation in which the intersection of each row and column contains one and only one value (Customer_Rental relation). For the relational data model, it is important to recognize that only first normal form (1NF) is critical in creating appropriate relations. All the subsequent normal forms are optional. However, to avoid update anomalies, it is recommended that we proceed to at least 3NF.
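Removing a repeating group can be sketched in Python as flattening a nested structure. This is an illustration under stated assumptions: the customers CR76 and CR56 come from the slides, while the property numbers `P1` to `P3`, the `Rentals` field name, and the helper `to_1nf` are hypothetical.

```python
# Unnormalized table: each customer row carries a repeating group of rentals.
# Property numbers are placeholders, not values from the slides.
unnormalized = [
    {
        "Customer_No": "CR76",
        "CName": "John Kay",
        "Rentals": [{"Property_No": "P1"}, {"Property_No": "P2"}],
    },
    {
        "Customer_No": "CR56",
        "CName": "Aline Stewart",
        "Rentals": [{"Property_No": "P3"}],
    },
]


def to_1nf(rows, group):
    """Emit one flat row per element of the repeating group."""
    flat = []
    for row in rows:
        base = {k: v for k, v in row.items() if k != group}
        for item in row[group]:
            # Copy the non-repeating attributes into every flattened row.
            flat.append({**base, **item})
    return flat


customer_rental = to_1nf(unnormalized, "Rentals")
for row in customer_rental:
    print(row)
```

Every cell of `customer_rental` now holds a single value, at the cost of repeating the customer name in each row, which is the redundancy that 2NF and 3NF then remove.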

  22. Functional dependencies of the Customer_Rental relation: fd1: Customer_No, Property_No → RentStart, RentFinish (primary key); fd2: Customer_No → CName (partial dependency); fd3: Property_No → PAddress, Rent, Owner_No, OName (partial dependency); fd4: Owner_No → OName (transitive dependency); fd5: Customer_No, RentStart → Property_No, PAddress, RentFinish, Rent, Owner_No, OName (candidate key); fd6: Property_No, RentStart → Customer_No, CName, RentFinish (candidate key)

  23. Dependency diagram for Customer_Rental (Customer_No, Property_No, CName, PAddress, RentStart, RentFinish, Rent, Owner_No, OName): fd1 (primary key), fd2 (partial dependency), fd3 (partial dependency), fd4 (transitive dependency), fd5 (candidate key), fd6 (candidate key)

  24. Second Normal Form (2NF): A relation that is in first normal form and in which every non-primary-key attribute is fully functionally dependent on the primary key. Full functional dependency: If A and B are attributes of a relation, B is fully functionally dependent on A if B is functionally dependent on A, but not on any proper subset of A. If B is a non-key attribute that is functionally dependent on only part of the primary key, we say that B is partially dependent on A. A partial dependency must be removed by moving the attributes involved into a new table, so that each non-key attribute is fully dependent on the primary key of its table. Dependency diagram for Customer_Rental (Customer_No, Property_No, CName, PAddress, RentStart, RentFinish, Rent, Owner_No, OName, Addr): fd1 (primary key), fd2 (partial dependency), fd3 (partial dependency)

  25. Relations after removing the partial dependencies: Customer (Customer_No, CName); Rental (Customer_No, Property_No, RentStart, RentFinish); Property_Owner (Property_No, PAddress, Rent, Owner_No, OName, address). 2NF applies to relations with composite keys, that is, relations whose primary key is composed of two or more attributes. A relation with a single-attribute primary key is automatically in at least 2NF.
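Partial dependencies can be found by computing attribute closures of the proper subsets of the primary key. The sketch below uses fd1 to fd4 of the Customer_Rental relation as listed in the slides; the function names `closure` and `partial_dependencies` are hypothetical, and the closure routine is the standard textbook algorithm rather than anything shown in the presentation.

```python
from itertools import combinations

# fd1..fd4 of Customer_Rental, written as (determinant, dependents).
FDS = [
    ({"Customer_No", "Property_No"}, {"RentStart", "RentFinish"}),    # fd1
    ({"Customer_No"}, {"CName"}),                                     # fd2
    ({"Property_No"}, {"PAddress", "Rent", "Owner_No", "OName"}),     # fd3
    ({"Owner_No"}, {"OName"}),                                        # fd4
]


def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def partial_dependencies(key, fds):
    """Map each proper subset of the key to the non-key attributes it determines."""
    found = {}
    for size in range(1, len(key)):
        for subset in combinations(sorted(key), size):
            dependents = closure(subset, fds) - set(key)
            if dependents:
                found[subset] = dependents
    return found


partials = partial_dependencies({"Customer_No", "Property_No"}, FDS)
for part, attrs in partials.items():
    print(part, "->", sorted(attrs))
```

The two subsets reported correspond to the Customer and Property_Owner relations split off above; RentStart and RentFinish depend on the whole key and stay in Rental.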

  26. Transitive dependency: Of the relations Customer (Customer_No, CName), Rental (Customer_No, Property_No, RentStart, RentFinish), and Property_Owner (Property_No, PAddress, Rent, Owner_No, OName, address), the Property_Owner relation contains a transitive dependency.

  27. Transitive dependency: A condition where A, B, and C are attributes of a relation such that if A → B and B → C, then C is transitively dependent on A via B (provided that A is not functionally dependent on B or C). Third Normal Form (3NF): A relation that is in first and second normal form, and in which no non-primary-key attribute is transitively dependent on the primary key. Customer (Customer_No, CName) and Rental (Customer_No, Property_No, RentStart, RentFinish) already satisfy this; Property_Owner (Property_No, PAddress, Rent, Owner_No, OName) is split into the Property-for-Rent relation and the Owner relation.

  28. Customer_Rental relation in 3NF: Customer (Customer_No, CName); Rental (Customer_No, Property_No, RentStart, RentFinish); Property (Property_No, PAddress, Rent, Owner_No); Owner (Owner_No, OName, address)
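One way to convince yourself that splitting Property_Owner into Property and Owner loses no information is to project a sample instance and join the projections back together. The sketch below is illustrative only: the rows are invented, the owner address attribute is omitted for brevity, and `project` and `natural_join` are hypothetical helpers. Passing on one instance does not prove losslessness in general; that follows from Owner_No being the key of Owner.

```python
def project(rows, attrs):
    """Project rows onto attrs, dropping duplicates and preserving order."""
    result = []
    for row in rows:
        projected = {a: row[a] for a in attrs}
        if projected not in result:
            result.append(projected)
    return result


def natural_join(left, right):
    """Join rows that agree on all attributes the two sides share."""
    joined = []
    for l in left:
        for r in right:
            common = set(l) & set(r)
            if all(l[a] == r[a] for a in common):
                joined.append({**l, **r})
    return joined


# Hypothetical Property_Owner instance (values are placeholders).
property_owner = [
    {"Property_No": "P1", "PAddress": "Addr 1", "Rent": 350, "Owner_No": "O1", "OName": "Owner A"},
    {"Property_No": "P2", "PAddress": "Addr 2", "Rent": 450, "Owner_No": "O1", "OName": "Owner A"},
    {"Property_No": "P3", "PAddress": "Addr 3", "Rent": 375, "Owner_No": "O2", "OName": "Owner B"},
]

prop = project(property_owner, ["Property_No", "PAddress", "Rent", "Owner_No"])
owner = project(property_owner, ["Owner_No", "OName"])

print(len(owner))                                   # 2
print(natural_join(prop, owner) == property_owner)  # True
```

Each owner name is now stored once in `owner`, so renaming an owner is a single-row update.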

  29. Decomposition summary: 1NF: Customer_Rental. 2NF: Customer, Rental, Property_Owner. 3NF: Customer, Rental, Property_for_Rent, Owner.

  30. From 3NF to Boyce-Codd Normal Form (BCNF): BCNF is based on functional dependencies that take into account all candidate keys in a relation. For a relation with only one candidate key, 3NF and BCNF are equivalent. The difference between 3NF and BCNF is that, for a functional dependency A → B, 3NF allows this dependency in a relation if B is a primary-key attribute and A is not a candidate key, whereas BCNF insists that for this dependency to remain in a relation, A must be a candidate key. Therefore, BCNF is a stronger form of 3NF, such that every relation in BCNF is also in 3NF. Boyce-Codd normal form (BCNF): A relation is in BCNF if and only if every determinant is a candidate key. Violation of BCNF is quite rare, since it can only happen under specific conditions: the relation contains two (or more) composite candidate keys, and those candidate keys overlap, that is, share at least one attribute in common.

  31. Case Study: In this example, the Client_Interview relation is presented. It contains details of the arrangements for interviews of clients by members of staff of the DreamHome company. The members of staff involved in interviewing clients are allocated to a specific room on the day of the interview. However, a room may be allocated to several members of staff as required throughout a working day. A client is interviewed only once on a given date, but may be requested to attend further interviews at later dates. This relation has three candidate keys: (Client_No, Interview_Date), (Staff_No, Interview_Date, Interview_Time), and (Room_No, Interview_Date, Interview_Time). Therefore the Client_Interview relation has three composite candidate keys, which overlap by sharing the common attribute Interview_Date. We select (Client_No, Interview_Date) to act as the primary key for this relation.

  32. Client_Interview (Client_No, Interview_Date, Interview_Time, Staff_No, Room_No). The Client_Interview relation has the following functional dependencies: fd1: Client_No, Interview_Date → Interview_Time, Staff_No, Room_No (primary key); fd2: Staff_No, Interview_Date, Interview_Time → Client_No (candidate key); fd3: Room_No, Interview_Date, Interview_Time → Staff_No, Client_No (candidate key); fd4: Staff_No, Interview_Date → Room_No

  33. BCNF relations: Interview (Client_No, Interview_Date, Interview_Time, Staff_No); Staff_Room (Staff_No, Interview_Date, Room_No)
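The BCNF test ("every determinant is a candidate key") can be sketched for Client_Interview by checking whether the closure of each determinant covers the whole relation. The functional dependencies are the four listed in the slides; `closure` and `bcnf_violations` are hypothetical helper names, and the check assumes no trivial dependencies are present in the input.

```python
ATTRS = {"Client_No", "Interview_Date", "Interview_Time", "Staff_No", "Room_No"}

FDS = {
    "fd1": ({"Client_No", "Interview_Date"}, {"Interview_Time", "Staff_No", "Room_No"}),
    "fd2": ({"Staff_No", "Interview_Date", "Interview_Time"}, {"Client_No"}),
    "fd3": ({"Room_No", "Interview_Date", "Interview_Time"}, {"Staff_No", "Client_No"}),
    "fd4": ({"Staff_No", "Interview_Date"}, {"Room_No"}),
}


def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(attrs, fds):
    """Names of FDs whose determinant does not determine the whole relation."""
    return [
        name
        for name, (lhs, _rhs) in fds.items()
        if closure(lhs, fds.values()) != attrs
    ]


print(bcnf_violations(ATTRS, FDS))  # ['fd4']
```

Only fd4 is reported: (Staff_No, Interview_Date) determines Room_No but is not a candidate key, which is why Room_No moves to the separate Staff_Room relation.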

  34. Review of Normalization (1NF to BCNF): The DreamHome company manages property on behalf of the owners, and as part of this service the company undertakes regular inspections of the property by members of staff. When staff are required to undertake these inspections, they are allocated a company car for use on the day of the inspections. However, a car may be allocated to several members of staff, as required throughout the working day. A member of staff may inspect several properties on a given date, but a property is only inspected once on a given date. Property_Inspection relation: Property_Inspection (Property_No, PAddress, IDate, ITime, Comments, Staff_No, SName, OName)

  35. 1NF: Property_Inspection (Property_No, IDate, ITime, PAddress, Comments, Staff_No, SName, OName). Dependency diagram: FD1 (primary key), FD2 (partial dependency), FD3 (transitive dependency), FD4, FD5 (candidate key), FD6 (candidate key)

  36. 2NF: Remove the partial dependency FD2 (the primary key is given by FD1) by decomposing the relation into the Property relation and the Property_Inspection relation.

  37. Property (Property_No, PAddress). Remaining dependencies in the Property_Inspection relation: FD1 (primary key), FD3 (transitive dependency), FD4, FD5 (candidate key), FD6 (candidate key)

  38. 3NF: Remove the transitive dependency by decomposing the relation, giving the Property, Staff, and Property_Inspection relations.

  39. BCNF: Remove the remaining anomalies from functional dependencies. The Staff and Property relations are unchanged, and the Property_Inspection relation is decomposed into Staff_Car (Staff_No, IDate, Car_Reg) and Inspection (Property_No, IDate, ITime, Comments, Staff_No).

  40. From BCNF to Fourth Normal Form (4NF): Although BCNF removes any anomalies due to functional dependencies, further research led to the identification of another type of dependency, called a multi-valued dependency (MVD), which can cause similar design problems for relations in terms of data redundancy. The following table (the Lect_Sub_Research relation) is in BCNF but still suffers from update anomalies.

  41. Multi-valued dependency (MVD): Represents a dependency between attributes (for example, A, B, and C) in a relation, such that for each value of A there is a set of values for B and a set of values for C. However, the sets of values for B and C are independent of each other. This is written A ↠ B and A ↠ C; in the example, Lecturer ↠ Subject and Lecturer ↠ Research. The Lec_Sub_Research relation is therefore decomposed into the Lec_Sub relation and the Lec_Research relation.
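For a three-attribute relation R(A, B, C), the multi-valued dependency A ↠ B holds exactly when, for each value of A, the stored (B, C) pairs form the full cross product of that value's B set and C set. The sketch below checks this on a sample instance; the lecturer, subject, and research values are invented for illustration, and `mvd_holds` is a hypothetical helper that only covers this three-attribute case.

```python
from itertools import product


def mvd_holds(rows, a, b, c):
    """Check a ->> b (and hence a ->> c) in a relation with attributes a, b, c."""
    groups = {}
    for row in rows:
        groups.setdefault(row[a], set()).add((row[b], row[c]))
    for pairs in groups.values():
        b_values = {pair[0] for pair in pairs}
        c_values = {pair[1] for pair in pairs}
        # Independence of b and c means every combination must be present.
        if pairs != set(product(b_values, c_values)):
            return False
    return True


# Hypothetical Lec_Sub_Research instance: one lecturer, two subjects,
# two research areas, so four rows are needed to keep them independent.
lec_sub_research = [
    {"Lecturer": "L1", "Subject": "Databases", "Research": "Data Mining"},
    {"Lecturer": "L1", "Subject": "Databases", "Research": "Networks"},
    {"Lecturer": "L1", "Subject": "Algorithms", "Research": "Data Mining"},
    {"Lecturer": "L1", "Subject": "Algorithms", "Research": "Networks"},
]

print(mvd_holds(lec_sub_research, "Lecturer", "Subject", "Research"))       # True
print(mvd_holds(lec_sub_research[:3], "Lecturer", "Subject", "Research"))   # False
```

The four rows for a single lecturer show the redundancy the MVD causes; after decomposition, Lec_Sub and Lec_Research would need only two rows each.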

  42. Summary of the normalization process: Unnormalized form (UNF) → remove repeating groups → First normal form (1NF) → remove partial dependencies → Second normal form (2NF) → remove transitive dependencies → Third normal form (3NF) → remove remaining anomalies from functional dependencies → Boyce-Codd normal form (BCNF) → remove multi-valued dependencies → Fourth normal form (4NF)