270 likes | 440 Views
Database Management. Relational Databases and Data Normalization By: Prof. Thomas G. Re Nassau Community College. Tables in a relational DB. Tables can be created in a relational database where the columns (Fields) of the database are the labels on the top of the table.
E N D
Database Management Relational Databases and Data Normalization By: Prof. Thomas G. Re Nassau Community College
Tables in a relational DB • Tables can be created in a relational database where the columns (Fields) of the database are the labels on the top of the table. • The Records of the database are the rows within the table. • For Example lets look at the customer table below.
Tables Continued • From the previous example you can see that the headings at the top of the table describes the data in each column. • Each record will be one row with in the table (excluding the table headings starting with Bill) . • What is important in database design is that you pick a Field that Uniquely identifies one row of data. • From the previous example this can be very difficult.
Key fields in a DB • Primary Keys are Key fields that uniquely identifies one row of data in a relational DB table. • That is you must find one column that uniquely identifies one record in the table. • Can we find a Column with the previous example that Uniquely identifies a row? • Of course not Since. • First, Middle, Last Names are not Unique • Addresses are not Unique • City, State and Zip codes are also not Unique
Primary Keys Continued • A Surrogate Key can be created in order to provide a Primary key for the Customer table. • In most cases a Database Management Systems provides a means for doing this with an auto number. • An Auto number is an auto generated number provided by the database giving a Unique number for each record entered into the database. • This number is not related to the data within the table but provides a way to Uniquely identify each record within the table.
Primary Keys Continued • For example. Our customer table may now look like the following. • As you can see with the addition of the primary key field of CustID, we now have a column that Uniquely identifies each row in the table. • That is each customer as a Unique Customer ID.
Primary Keys Continued • Some tables will have multiple columns, when combined that will Uniquely identify a row of data. • These Columns together make up the primary key. • This type of primary key is called a composite key and is illustrated in the table bellow.
Primary Keys Continued • From the previous example, both the CustID and Phone Number Columns become the Unique identifier for each row of data. • You can see this with the first two entrees where the CustID may be the same, but the Phone numbers are different. • You can also see this with the last two entries where the CustID ‘s are different but the Phone number is the same (They may live in the same house).
Choices for Primary keys • Primary keys must be Unique so the choices of what should be a primary key may be tricky. • Poor choices for primary keys include • Phone Numbers – from the previous example you can have multiple customers with the same phone number. • Date of Birth • Names • Addresses • Social Security Numbers – because of privacy issues and laws that may be on the books, you are better off not using it.
Designing a Database • When designing a database, you must know the rules of the business in order to design your DB. • The rules of the business may dictate the columns you will be creating in your relational database tables. • Often examining old forms and witnessing how transactions are made within the business will help determine the Fields that will be required when collecting the data. • In planning the tables you must write them out and start organizing the columns in to tables that relate to each other.
Database design continued • Once you have collected each of the fields and organized them into tables, you examine each of the tables to start the normalization process. • If you notice, you have repeated groupings of data in the table above.
First Normal form • First Normal form of a table is to remove all repeating groups of data (data redundancy) and place them into separate tables • With the previous example we can split the data into two tables and call them • A customer table • An Orders table • By removing the repeating groups from the table you now have two tables in First Normal Form.
First Normal Form Continued • You would have two tables that would look like the following. • As you can see the repeating groups have been removed and the redundant data for customers have been eliminated
Second Normal Form • For second normal form a table must be • In First Normal Form • All non key fields must relate to the primary key The Customer table we created is also in Second Normal Form. • Notice that all columns have data that relate to a specific customer.
Second Normal Form Continued • If we look at the Orders table we have • Repeating groups have been removed, but we still have columns that are not related to the key field for Orders. • Product ID is not related to Order • Quantity sold is not related • Price etc. • These non related fields should be then removed and placed into a separate table
Second Normal Form Continued • What we have that is related is the • CustID • Date Purchased • The other data should be in a separate tabled labeled Order Details The order table would then look like
Second Normal Form Continued • Your order details would look like • Here we further normalized the tables so that all data related to the Orders table is in the Orders table and all the data that is in the Order Details table is in the order details table.
Third Normal Form • In order to be in Third Normal Form you must first be in Second Normal Form. • In Third normal form you look to remove hidden dependencies. • That is all Columns are related to the primary key and not just part of the primary key (this applies especially to composite keys). • Both the Orders table and Customer Info Table are also in Third Normal Form.
Third Normal Form Continued Lets look at the Order Details table • The Primary key of this table is the Product ID and OrderID. • As you can see, there are some columns that are not related to the OrderID and Product ID. Namely there are columns that are related to the Product ID Only.
Third Normal Form Continued • Therefore we need to create a separate tabled called Products which will hold all information that related to the product. Our tables would now look like the following • Notice that the Subtotal and the Total have also been removed • This is because you should remove all calculated values from a DB in order to save on storage space and search time of the data Calculations can be when retrieving the data.
Fourth Normal Form • In order to be in Fourth Normal Form you must first be in Third Normal Form. • In addition you need to ensure that all Columns that are in the table are related to the key and nothing but the primary key. • That is further hidden dependencies may exist due to Business rules. • For example with Employees. • You may have a table that displays an Employees specialty and tools they can use.
Fourth Normal Form Continued • We can have Employee relate to Specialty • We can have Employee relate to Tools used • The hidden dependency here becomes that Specialty and Tools are not related to each other but separately to the each employee and therefore should be placed into a separate table.