1 / 85

Database Management Systems (DBMS)

Database Management Systems (DBMS). DeSiaMore www.desiamore.com/ifm Bachelor of Computer Science and BSc IT Year 1. 1. Data Normalization. Outline Introduction Keys Relationships First Normal Form (INF) Second Normal Form (2NF) Third Normal Form (3NF) Boyce-Codd Normal Form (BCNF)

jalen
Download Presentation

Database Management Systems (DBMS)

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Database Management Systems (DBMS) DeSiaMore www.desiamore.com/ifm Bachelor of Computer Science and BSc IT Year 1 www.desiamore.com/ifm 1

  2. Data Normalization • Outline • Introduction • Keys • Relationships • First Normal Form (INF) • Second Normal Form (2NF) • Third Normal Form (3NF) • Boyce-Codd Normal Form (BCNF) • Fourth Normal Form (4NF) • Firth Normal Form (5NF) • Domain Key Normal Form (DKNF) • Denormalization • summary www.desiamore.com/ifm

  3. Data Normalization… • Normalization is the process of building database structures to store data. • Normalization is a formal process of developing data structures in a manner that eliminates redundancy and promotes integrity. • Normalization is an essential part of database design www.desiamore.com/ifm

  4. Data Normalization… • Purpose of Normalization • Minimize redundancy in data • Remove insert, Delete and Update anomaly during database activities • Reduce the need to reorganise data when it is modified or enhanced www.desiamore.com/ifm

  5. Data Normalization… • Keys • A key uniquely identifies a row in a table • There are two types of keys • Intelligent Keys • Non-intelligent Keys • An intelligent key is based upon data values such as date, a last name or combination of values • A non-intelligent key is completely arbitrary, having no function or meaning other than identification of the row www.desiamore.com/ifm

  6. Data Normalization… • Keys .. • A primary key is a column in the table whose purpose is to uniquely identify records from the same table • A foreign key is a column in a table that uniquely identifies the records from a different table • Primary and foreign keys therefore have a symbiotic relationship, in which the foreign key is used to store values of another table’s primary key to describe the relationship between data from different tables www.desiamore.com/ifm

  7. Data Normalization… • Relationships • There are three possible relationships between related entities or tables, these are • One-to-one • One-to-many • Many-to-many • Relational databases are designed to work most effectively with one-to-many relationships between tables and expressed using primary and foreign keys www.desiamore.com/ifm

  8. Data Normalization… • Relationships… • One-to-one relationships are rare, because data elements related in this way are normally placed in the same table • When a one-to-one relationship is desired, it can be expressed simply as a one-to-many relationship using primary and foreign keys www.desiamore.com/ifm

  9. Data Normalization… • Relationships… • One-to-many relationships are most common. • A primary key is the ‘one’ side of the relationship and the foreign is the ‘many’ side of the relationship • A primary key value is unique in its own table • Each pair of primary and foreign columns is a one-to-many relationship • The data values stored in these columns determine which rows from the table containing the foreign key column are associated with an individual row in the table containing the primary key column • Defining primary and foreign key columns in the database promotes data integrity www.desiamore.com/ifm

  10. Data Normalization… • Relationships… • Many-to-many relationships are problematic and cannot be adequately expressed directly in a relational database. • Many-to-many relationships are usually expressed using intersection tables. An intersection table contains two or more foreign keys, relating the primary key values of two or more tables to each other. • The role of intersection table is to convert the many-to-many relationship into two one-to-many relationships that can be easily handled by database www.desiamore.com/ifm

  11. Data Normalization… • Definition of Functional Dependence (FD) www.desiamore.com/ifm

  12. Data Normalization… • Suppose there is a relation R, an attribute X is said to be functionally dependent on attribute Y of R, if for a given value of Y, there exists one and only one value of X at any one time • Example, in figure above, the attributes, Aqualification, Aname, and Astatus of relation Author (A) functionally depend on A#, since there exists {for a given A#} one and only one Aname, Aqualification and Astatus www.desiamore.com/ifm

  13. Data Normalization… Similarly A.A# A.Aqualification Or A.A# A.(Aname, Aqualication, Astatues) • Symbolically, it is represented as; • A.A# A.Aname (i.e. Aname functionally depends on A# in relation A) www.desiamore.com/ifm

  14. Data Normalization… • Normal Forms • 1NF • 2 NF • 3NF • BCNF • 4NF • 5NF • Domain Key Normal Form (DKNF) • Denormalization www.desiamore.com/ifm

  15. Data Normalization…. • First Normal Form (1NF) • First normal form is a relation in which intersection of each row and column contains one and only one value. • The unnormalized table (a table that contains one or more repeating groups) can be transformed to first normal form by identifying and removing the repeating groups • A repeating group is a set of columns that store similar information that repeats in the same table. • Generally, the multi-valued attributes ( called repeating groups ) should be removed i.e. elimination of repeating groups www.desiamore.com/ifm

  16. Data Normalization…. • 1NF… • Consider the following table, CREATE TABLE CONTACTS (CONTACT_ID INTEGER NOT NULL, L_NAME VARCHAR(20) NOT NULL, F_NAME VARCHAR (20), CONTACT_DATE1 DATE, CONTACT_DESC1 VARCHAR(50), CONTACT_DATE2 DATE, CONTACT_DESC2 VARCHAR(50)); www.desiamore.com/ifm

  17. Data Normalization… • 1NF… • The above structure contains a repeating group of the date and description of two conversations. • To eliminate the repeating group, the group is removed to another table , which then related to the parent table. • The primary key of the parent table (CONTACT_ID) is stored in the second table • The primary and foreign key relationships are defined to ensure that conversations do not relate to a contact are not recorded. www.desiamore.com/ifm

  18. Data Normalization… • 1NF… www.desiamore.com/ifm

  19. Data Normalization… • 1NF…. • The new arrangement of the tables shown in figure 1 can be created using the DDL statements for creating the table as follows, CREATE TABLE CONTACTS (CONTACT_ID INTEGER NOT NULL, L_NAME VARCHAR(20) NOT NULL, F_NAME VARCHAR(20)); CTEATE TABLE COVERSATION (CONTACT_ID INTEGER NOT NULL, CONTACT_DATE DATE, CONTACT_DESC VARCHAR(50)); www.desiamore.com/ifm

  20. Data Normalization… • 1NF… ALTER TABLE CONTACTS ADD CONSTRAINT CONTACT_ID_PK PRIMARY KEY (CONTACT_ID); ALTER TABLE CONVERSATION ADD CONSTRAINT CONTACT_ID_FK FOREIGN KEY (CONTACT_ID) REFERNCES CONTACTS (CONTACT_ID); • Summary, every table should have a primary key, and each set of repeating groups should appear in its own table. • These criteria must be satisfied to achieve first normal form www.desiamore.com/ifm

  21. Data Normalization… • Second Normal Form (2NF) • A relation is in second normal form if it is in first normal form and • Every non-key attribute is fully and functionally dependent on the primary key. • In generally, no non-key attribute is functionally dependent on the primary key www.desiamore.com/ifm

  22. Data Normalization… • 2NF… • A relation in the first normal form will be in the second normal form if one of the following conditions is satisfied • The primary key consists of only one attribute ( field or column) • No non-key attributes exist in the relation, or all attributes in the relation are components of the primary key • Every non-key attribute is functionally dependent on the full set of primary key attributes www.desiamore.com/ifm

  23. Data Normalization… • 2NF… • Example, consider a relation, –EMPLOYEE with the attributes Emp_ID, Department, F_Name, L_Name, Salary and Date-of-birth. • Assume that the primary key for this relation is the composite key Emp_ID + Department, www.desiamore.com/ifm

  24. Data Normalization… • 2NF… • Example… • In this case the non-key attributes such as F_Name, L_Name, Salary and Date-of-birth are functionally dependent on part of the primary key (Emp_ID) • A partial functional dependency is a functional dependency in which one or more non-key attributes (i.e. F_Name, L_Name, Salary or Date-of-birth) are functionally dependent on part (but not all) of the primary key www.desiamore.com/ifm

  25. Data Normalization… • 2NF… • Example.. • The partial functional dependency in the EMPLOYEE table creates a redundancy in that relation, this results in anomalies when the table is updated • Redundant data –data that is unnecessarily expressed multiple times Or that depends only on part of a multi-value key, or when a column’s value is dependent on the valueof the one column in the table, but not another, it is considered redundant www.desiamore.com/ifm

  26. Data Normalization… • 2NF… • Example, CREATE TABLE EMPLOYEE (EMP_NO INTEGER NOT NULL, L_NAME VARCHAR(20) NOT NULL, F_NAME VARCHAR(20), DEPT_CODE INTEGER, DESCRIPTION VARCHAR(50)); This table contains redundant data, description –depends only on the DEPT_CODE and doesn’t vary based on the value of the EMP_NO –primary key of the table www.desiamore.com/ifm

  27. Data Normalization… • 2NF… • Example.. • To eliminate the redundancy the department code and description can be in different table CREATE TABLE EMPLOYEE (EMP_NO INTEGER NOT NULL, L_NAME VARCHAR(20) NOT NULL, F_NAME VARCHAR(20), DEPT_CODE INTEGER); www.desiamore.com/ifm

  28. Data Normalization… CREATE TABLE DEPARTMENT (DEPT_CODE INTEGER NOT NULL, DESCRIPTION VARCHAR(50) NOT NULL); This is shown below, www.desiamore.com/ifm

  29. Data Normalization… www.desiamore.com/ifm

  30. Data Normalization… • Third Normal Form (3NF) • A relation is in third normal form if it is in second normal form and • No transitive dependencies exist • A transitive dependency in a relation is a functional dependency between two or more non-key attributes www.desiamore.com/ifm

  31. Data Normalization… • 3NF… • The columns in each table should be a group of columns in which the data in each column contributes to the description of each row in the table • For a given row with a unique key, each column appearing in that row should contribute to the description of that row www.desiamore.com/ifm

  32. Data Normalization… • 3NF… • Example, the columns F_NAME and L_NAME contribute to describing a specific contact using the primary key CONTACT_ID. • But the COMPANY_NAME and COMPANY_LOCATION do not contribute to describing the record with a given CONTACT_ID • Since it identifies an individual and not a company www.desiamore.com/ifm

  33. Data Normalization… • 3NF… • Example… CREATE TABLE CONTACTS (CONTACT_ID INTEGER NOT NULL, L_NAME VARCHAR(20) NOT NULL, F_NAME VARCHAR(20), COMPANY_NAME VARCHAR(20), COMPANY_LOCATION VARCHAR(50)); www.desiamore.com/ifm

  34. Data Normalization… • 3NF… • The above relation, CONTACT_ID –primary key therefore all the remaining attributes are functionally dependent on this attribute • Transitive dependency –COMPANY_LOCATION is dependent on COMPANY_NAME and COMPANY_NAME is functionally dependent on CONTACT_ID www.desiamore.com/ifm

  35. Data Normalization… • 3NF… • As a result of the transitive dependency, there are update anomalies in the CONTACTS table as follows • Insertion anomaly –a new company cannot be inserted to the CONTACTS table until a contact person has been assigned to that company www.desiamore.com/ifm

  36. Data Normalization… • 3NF… • Deletion anomaly –if a company that has only one contact person is deleted from the table, we will lose the information about that company, as the company information is associated with that person • Modification anomaly –if a company changes its location then we will have to make the change in all the records where the company name appears www.desiamore.com/ifm

  37. Data Normalization… • 3NF… • The transitive dependency can be removed by decomposing the above table into two tables as shown below. www.desiamore.com/ifm

  38. Data Normalization… CREATE TABLE CONTACTS (CONTACT_ID INTEGER NOT NULL, L_NAME VARCHAR(20) NOT NULL, F_NAME VARCHAR(20), COMPANY_ID INTEGER); ___________________________________________ CREATE TABLE COMPANY (COMPANY_ID INTEGER NOT NULL, COMPANY_LOCATION VARCHAR(50)); www.desiamore.com/ifm

  39. Data Normalization… www.desiamore.com/ifm

  40. Data Normalization, Review Any Questions www.desiamore.com/ifm

  41. Database Management Systems (DBMS) Abdallah Seleman dullextz@gmail.com Block D,55 Consultation Hours Tuesday & Wednesday 10:00 AM-12:00PM Bachelor of Computer Science and BSc IT Year 1 www.desiamore.com/ifm 06 September 2014 AS(CIT, IFM) DBMS 41

  42. Data Normalization… • Boyce –Codd Normal Form (BCNF) • Database relations are designed –they have neither partial dependencies nor transitive dependencies • These types of dependencies result in update anomalies • A functional dependency describes the relationship between attributes in a relation www.desiamore.com/ifm

  43. Data Normalization… • For example, if ‘A’ and ‘B’ are attributes in relation R, then ‘B’ is functionally dependent on ‘A’ ( denoted as A B) • This works if each value of ‘A’ is associated with exactly one value of ‘B’ • Example, in CONTACTS table, L_NAME, F_NAME and COMPANY_ID are functionally dependent on CONTACT_ID www.desiamore.com/ifm

  44. Data Normalization… • These dependencies can be shown as follows; • CONTACT_ID L_NAME • CONTACT_ID F_NAME • CONTACT_ID COMPANY_ID • CONTACT_ID {L_NAME, F_NAME, COMPANY_ID} • {L_NAME, F_NAME, COMPANY_ID} CONTACT_ID www.desiamore.com/ifm

  45. Data Normalization… • The left-hand side and the right-hand side of a functional dependency are sometimes called –determinant and dependent respectively but both determinant and dependent are attributes • when the set contains more than one attribute, the braces are used to enclose them as shown below www.desiamore.com/ifm

  46. Data Normalization… • A functional dependency A B is full functional dependency if removal of any attribute from ‘A’ results in the dependency not being sustained and any more • A functional dependency A B is partially dependency if there is some attribute that can be removed from ‘A’ and dependency still holds www.desiamore.com/ifm

  47. Data Normalization… • For example, {L_NAME, F_NAME, COMPANY_ID} CONTACT_ID • Each value of {L_NAME, F_NAME and COMPANY_ID } is associated with a single value of CONTACT_ID • However, it is not full functionally dependent on a subset of {L_NAME, F_NAME, COMPANY_ID} namely {L_NAME and F_NAME} www.desiamore.com/ifm

  48. Data Normalization… • BCNF… • In generally, we can say that, Boyce-Codd Normal Form (BCNF) is based on functional dependencies that take into account all candidate keys in a relation • But, for a relation with only one candidate key, third normal form and BCNF are equivalent www.desiamore.com/ifm

  49. Data Normalization… • BCNF… • A relation is in BCNF if and only if every determinant is a candidate key. • To test whether a relation is in BCNF we identify all the determinants and make sure that they are candidate keys • A determinant is an attribute or a group of attribute on which some other attribute is full functionally dependent www.desiamore.com/ifm

  50. Data Normalization… • BCNF… • The difference between 3NF and BCNF is that, • For a functional dependency A B, the 3NF allows this dependency in a relation if ‘B’ is a primary key attribute and ‘A’ is not a candidate key • Whereas BCNF, a dependency to remain in a relation, ‘A’ must be a candidate key, this implies that BCNF is a stronger form of 3NF www.desiamore.com/ifm

More Related