Learn the importance of normalization in database organization, avoid data redundancy issues, and create efficient relationships between tables for improved performance. Examples included for better understanding.
LSP 121 Normalization Queries (contd)
* Normalization • Normalization is the process of organizing your data efficiently • Normalizing your database can drastically improve its performance • This becomes especially important with very large databases • Normalization techniques include: • Avoiding data redundancy (i.e. avoiding repetition of data) • Ensuring proper data dependencies • See example…
Normalization • Let’s create a database for a car club • What if one person owns multiple cars? (One owner can have many cars, so this is a 1:M relationship) • Table: Members: If we kept everything in this one table, we would have to repeat all of the person’s info (MemberID, Name, Phone, etc.) for EVERY car they own • This is called data redundancy and is very BAD! • Instead, let’s create a separate table just for the cars. We’ll call this table Cars • We can still connect each car with its owner by including the MemberID field from the Members table as a field in the Cars table • This is called establishing a “relationship” between the tables
* Data Redundancy • A major no-no in database design. • Can lead to all kinds of problems down the road. • For example, suppose you had a database in which you store a person’s phone number in three different tables. Now suppose you updated a member’s phone number in one table but forgot to do it in another. This is a very common mistake and can really mess things up. • You should try to store each field in one place only. • However, you can “associate” fields between two tables. • E.g. In the ‘Cars’ table, you can refer to the MemberID from the ‘Members’ table.
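The update problem described above can be sketched outside Access. This is a minimal illustration using Python's built-in sqlite3 module as a stand-in database; the `OwnerPhone` column, the `CarID` key, and the sample values are assumptions made up for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Redundant design (hypothetical): the same phone number is stored in two tables.
conn.execute("CREATE TABLE Members (MemberID INTEGER PRIMARY KEY, Name TEXT, Phone TEXT)")
conn.execute(
    "CREATE TABLE Cars (CarID INTEGER PRIMARY KEY, MemberID INTEGER, Make TEXT, OwnerPhone TEXT)"
)
conn.execute("INSERT INTO Members VALUES (1, 'Smith', '555-5555')")
conn.execute("INSERT INTO Cars VALUES (1, 1, 'Ford', '555-5555')")

# The phone number is updated in one table, but the other copy is forgotten.
conn.execute("UPDATE Members SET Phone = '555-0000' WHERE MemberID = 1")

member_phone = conn.execute("SELECT Phone FROM Members WHERE MemberID = 1").fetchone()[0]
car_phone = conn.execute("SELECT OwnerPhone FROM Cars WHERE MemberID = 1").fetchone()[0]
print(member_phone, car_phone)  # the two copies now disagree
```

If the phone number lived only in Members, there would be a single copy to update and nothing to fall out of sync.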
Normalization Example
Car Club (original table): Member ID (primary key), Member Name, Member Address, Member City, Member State, Member Zip, Member Phone, Dues Paid?, National Member?, Model of Car, Make of Car, Year of Car
Note: If a member has more than one car, the first 9 fields will be repeated multiple times. This is pointless and dangerous!
MemberInfo: Member ID (primary key), Name, Address, City, State, Zip, Phone, Dues Paid?, National Member?
Cars: Model of Car, Make of Car, Year of Car, Member ID (not a primary key here!)
Relationship: Member ID is the primary key in the MemberInfo table and a “foreign key” in the Cars table. This is a 1:M (“one-to-many”) relationship.
Why isn’t MemberID a primary key in the Cars table?
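The two-table design can be sketched in SQL. The example below uses Python's built-in sqlite3 module rather than Access, keeps only a few of the member fields for brevity, and adds a hypothetical `CarID` key that is not in the slides; the sample member and cars are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE MemberInfo (
    MemberID INTEGER PRIMARY KEY,
    Name     TEXT,
    Phone    TEXT
);
CREATE TABLE Cars (
    CarID    INTEGER PRIMARY KEY,   -- hypothetical key, not in the slides
    Make     TEXT,
    Model    TEXT,
    Year     INTEGER,
    MemberID INTEGER REFERENCES MemberInfo(MemberID)  -- foreign key
);
INSERT INTO MemberInfo VALUES (1, 'Smith', '555-5555');
INSERT INTO Cars VALUES (1, 'Ford', 'Mustang', 1967, 1);
INSERT INTO Cars VALUES (2, 'Chevrolet', 'Corvette', 1972, 1);
""")

# Join each car back to its owner through MemberID.
rows = conn.execute("""
    SELECT m.Name, c.Make, c.Model
    FROM MemberInfo AS m
    JOIN Cars AS c ON c.MemberID = m.MemberID
    ORDER BY c.CarID
""").fetchall()
print(rows)
```

MemberID 1 appears twice in Cars, once per car. A primary key must be unique in its table, so MemberID cannot be the primary key there.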
Data Redundancy • Data Redundancy: A bad idea! • Key Point: Do everything you can to avoid repeating data. • Database people do everything they can to avoid data redundancy. • For example, each time you add a new car for a user, you should not have to repeat all of the user’s personal info (name, address, phone, etc.) all over again. • Instead, place the user’s personal info in one table (e.g. “Members”) and have a separate table for cars owned by the members of the club. • We then create a relationship between the two tables.
Another Normalization Example
Before
Student ClassRecords: Student ID (primary key), Name, Address, City, State, Zip, Phone, Major, Minor, Degree Sought, Class Name, Grade, Number Credits
The first 10 fields are repeated for each course
After
Student Info: Student ID, Name, Address, City, State, Zip, Phone, Major, Minor, Degree Sought
Courses: Class Name, Grade, Number Credits, Student ID
• StudentID is a primary key in the StudentInfo table and becomes a “foreign key” in the Courses table.
• There is a 1:M (one-to-many) relationship between students and courses. That is, one student can have many courses.
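The before/after split can also be sketched in plain Python. The records below are invented sample data with only a few of the fields, and the variable names are assumptions; the point is just to show the repeated student fields collapsing into one row per student.

```python
# Flat "before" records: the student fields repeat for every course (sample data is made up).
flat = [
    {"StudentID": 1, "Name": "Ana", "Major": "Math", "ClassName": "LSP 121", "Grade": "A", "Credits": 4},
    {"StudentID": 1, "Name": "Ana", "Major": "Math", "ClassName": "MAT 150", "Grade": "B", "Credits": 4},
    {"StudentID": 2, "Name": "Ben", "Major": "Art", "ClassName": "LSP 121", "Grade": "B", "Credits": 4},
]

student_fields = ("StudentID", "Name", "Major")
course_fields = ("ClassName", "Grade", "Credits", "StudentID")

# StudentInfo: one row per student, keyed by StudentID (the primary key).
student_info = {}
for row in flat:
    student_info[row["StudentID"]] = {f: row[f] for f in student_fields}

# Courses: one row per course taken, with StudentID kept as the foreign key.
courses = [{f: row[f] for f in course_fields} for row in flat]

print(len(student_info), len(courses))
```

Three flat rows become two student rows and three course rows, so Ana's name and major are stored once instead of twice.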
Practice Example
Imagine a table of your customers and all of your sales to them. Would you change the design of your database? If so, how many tables would you want in this case? Primary keys? Foreign keys?
Original table: Customer ID, Customer Last Name, Customer Phone, Customer Address, Customer City, Customer State, Customer Zip, Sales Transaction Date, Sales Amount, Item, Clearance Item?
Practice Example
Table: Customers: CustomerID (primary key), LastName, Phone, Address, City, State, Zip
Table: Sales: InvoiceID (primary key), CustomerID (foreign key), SalesTransactionDate, SalesAmount, Item, ClearanceItem?
Relationships • You can create a relationship between any two tables by hand • Click Tools > Relationships • Add the two tables to the view, click on one of the fields (e.g. StudentID), drag it over to the other table’s identical field (StudentID), and release • Check Enforce Referential Integrity (you don’t want child records without parents)
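To see what enforcing referential integrity means in practice, here is a rough equivalent outside Access, using Python's built-in sqlite3 module. SQLite only checks foreign keys when `PRAGMA foreign_keys` is switched on; the sample student and the missing student 99 are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when this is on
conn.execute("CREATE TABLE StudentInfo (StudentID INTEGER PRIMARY KEY, Name TEXT)")
conn.execute("""
    CREATE TABLE Courses (
        ClassName TEXT,
        Grade     TEXT,
        StudentID INTEGER NOT NULL REFERENCES StudentInfo(StudentID)
    )
""")
conn.execute("INSERT INTO StudentInfo VALUES (1, 'Ana')")
conn.execute("INSERT INTO Courses VALUES ('LSP 121', 'A', 1)")  # parent exists: accepted

try:
    # There is no student 99, so this course record would be a child without a parent.
    conn.execute("INSERT INTO Courses VALUES ('LSP 121', 'B', 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(orphan_rejected)
```

The database refuses the second course record because its StudentID does not match any row in StudentInfo.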
Relationships • Access automatically creates a relationship between the two tables if: • you create two tables • the first table has a primary key • you carry that primary key over to the second table as a foreign key • the primary key and the foreign key are spelled the same and have the same type
Data
First table (CustomerID, LastName, Phone, City):
• 1 Smith 555-5555 Palos Heights
• 2 Chen 666-6666 LaGrange
• 3 Wilson 777-7777 Chicago
Second table (SalesTransactionDate, SalesAmount, Item, ClearanceItem?, CustomerID):
• 3/3/09 20.45 Shirt N 1
• 3/3/09 5.99 Scarf N 1
• 3/4/09 29.99 Jeans Y 3
• Important: Note the last column in the second table references the primary key (e.g. customerID) from the first table.
• Let’s do the first part of today’s activity.
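The sample data above can be loaded into two related tables and joined. This sketch uses Python's built-in sqlite3 module instead of Access, leaves out the address fields, and makes two assumptions: the dates are read as March 2009 and stored as ISO text, and InvoiceID values are assigned automatically.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers (
    CustomerID INTEGER PRIMARY KEY,
    LastName   TEXT,
    Phone      TEXT,
    City       TEXT
);
CREATE TABLE Sales (
    InvoiceID            INTEGER PRIMARY KEY,  -- assigned automatically here
    SalesTransactionDate TEXT,
    SalesAmount          REAL,
    Item                 TEXT,
    ClearanceItem        TEXT,
    CustomerID           INTEGER REFERENCES Customers(CustomerID)
);
INSERT INTO Customers VALUES
    (1, 'Smith', '555-5555', 'Palos Heights'),
    (2, 'Chen', '666-6666', 'LaGrange'),
    (3, 'Wilson', '777-7777', 'Chicago');
INSERT INTO Sales (SalesTransactionDate, SalesAmount, Item, ClearanceItem, CustomerID) VALUES
    ('2009-03-03', 20.45, 'Shirt', 'N', 1),
    ('2009-03-03', 5.99, 'Scarf', 'N', 1),
    ('2009-03-04', 29.99, 'Jeans', 'Y', 3);
""")

# Follow the foreign key from each sale back to its customer.
rows = conn.execute("""
    SELECT c.LastName, s.Item, s.SalesAmount
    FROM Customers AS c
    JOIN Sales AS s ON s.CustomerID = c.CustomerID
    ORDER BY s.InvoiceID
""").fetchall()
print(rows)
```

Chen has no sales, so no row for Chen appears in the joined result.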
Simple Queries - contd • To create an Access query, don’t use the query wizard. Instead, create the query in Design view • Let’s see how Access does it • Copy the Pets database from the Basic Information page to your desktop (or My Documents) • Then open the Pets database • If you don’t see any tables, queries, etc. on the left, click on the down-arrow and choose ‘All Access Objects’
* Query on Dates • You can query based on dates, but only if the data was stored as date/time • E.g. to search for dates after Jan 1 2004, you would type: >1/1/2004 • In a query, dates should be entered with # before and after the date. Note that dates can be written in many different formats, e.g. #1/1/2004#, #January 1, 2004#, #1-Jan-2004# • Access typically puts these in FOR you • Different databases have different ways of dealing with dates.
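As one example of how another database handles dates, here is a sketch using SQLite through Python's built-in sqlite3 module. SQLite has no # delimiters; a common convention, assumed here, is to store dates as ISO `YYYY-MM-DD` text, which sorts and compares in date order. The `BirthDate` column and the rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Pets (Name TEXT, BirthDate TEXT)")  # hypothetical table
conn.executemany("INSERT INTO Pets VALUES (?, ?)", [
    ("Rex", "2003-06-15"),
    ("Tweety", "2004-01-01"),
    ("Slinky", "2005-09-30"),
])

# Equivalent of the Access criterion >#1/1/2004#: strictly after Jan 1 2004.
after_2004 = conn.execute(
    "SELECT Name FROM Pets WHERE BirthDate > ? ORDER BY BirthDate", ("2004-01-01",)
).fetchall()
print(after_2004)
```

Tweety, born exactly on Jan 1 2004, is excluded because the comparison is "after", not "on or after".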
** Queries – Using ‘OR’ • In Access, put each criterion in a different row • If you put criteria in the same row, the query will work as an AND query (discussed next) • Customers with City=“Chicago” OR first name = “Jack”
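Behind the design grid, an OR query corresponds to an OR in the WHERE clause of a SQL statement. The sketch below uses Python's built-in sqlite3 module as a stand-in for Access; the table layout and the rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (FirstName TEXT, City TEXT)")
conn.executemany("INSERT INTO Customers VALUES (?, ?)", [
    ("Jack", "Nashville"),
    ("Maria", "Chicago"),
    ("Jack", "Chicago"),
    ("Lee", "Denver"),
])

# A row is returned if EITHER condition is true.
rows = conn.execute(
    "SELECT FirstName, City FROM Customers "
    "WHERE City = 'Chicago' OR FirstName = 'Jack' ORDER BY rowid"
).fetchall()
print(rows)
```

Only Lee in Denver fails both conditions and is left out.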
Another way of using ‘OR’ • If you are looking for different values within a single field (e.g. State), you can simply type the word ‘OR’ between each value: • E.g. You can look for records in the state of Illinois or Tennessee or Ohio by saying “IL” OR “TN” OR “OH” • E.g. Show all pets that are birds or snakes:
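In SQL terms, several OR values on a single field can be written with IN. The sketch below runs the birds-or-snakes query with Python's built-in sqlite3 module rather than Access; the pet names are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Pets (Name TEXT, Type TEXT)")
conn.executemany("INSERT INTO Pets VALUES (?, ?)", [
    ("Rex", "Dog"),
    ("Tweety", "Bird"),
    ("Slinky", "Snake"),
    ("Tom", "Cat"),
])

# IN is shorthand for Type = 'Bird' OR Type = 'Snake'.
rows = conn.execute(
    "SELECT Name FROM Pets WHERE Type IN ('Bird', 'Snake') ORDER BY Name"
).fetchall()
print(rows)
```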
** Queries – Using ‘AND’ • Logical AND - you can make multiple entries in the query boxes. • E.g. In the Type field enter “Dog” and in the Color field enter “Brown” • In Access, putting criteria in the same row is how you accomplish an AND effect • Recall that this is different from an OR query, where the different values must be on separate rows • E.g. Show all brown dogs:
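The brown-dogs query corresponds to an AND in a SQL WHERE clause. This is a sketch with Python's built-in sqlite3 module standing in for Access; the `Color` values and pet names are sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Pets (Name TEXT, Type TEXT, Color TEXT)")
conn.executemany("INSERT INTO Pets VALUES (?, ?, ?)", [
    ("Rex", "Dog", "Brown"),
    ("Spot", "Dog", "White"),
    ("Coco", "Cat", "Brown"),
    ("Buddy", "Dog", "Brown"),
])

# A row is returned only if BOTH conditions are true.
rows = conn.execute(
    "SELECT Name FROM Pets WHERE Type = 'Dog' AND Color = 'Brown' ORDER BY Name"
).fetchall()
print(rows)
```

Spot is a dog but not brown, and Coco is brown but not a dog, so neither is returned.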
** Queries – Using ‘AND’ • In Access, putting criteria in the SAME row is a way of accomplishing an AND effect • This example shows customers with city = “Chicago” AND first name = “Jack”
AND Queries contd • Logical AND - You can also use an AND in one field. • For example, in the Size field you can enter >=3 AND <=9 • Possible operators include =, <>, <, >, <=, >=
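An AND within one field describes a range. Sketched with Python's built-in sqlite3 module (not Access), the criterion >=3 AND <=9 looks like this; the sizes are made-up sample values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Pets (Name TEXT, Size INTEGER)")
conn.executemany("INSERT INTO Pets VALUES (?, ?)", [
    ("Rex", 2), ("Tom", 3), ("Coco", 6), ("Spot", 9), ("Max", 12),
])

# Both comparisons apply to the same field, so only sizes 3 through 9 pass.
in_range = conn.execute(
    "SELECT Name FROM Pets WHERE Size >= 3 AND Size <= 9 ORDER BY Size"
).fetchall()

# BETWEEN is inclusive on both ends, so it selects the same rows.
between = conn.execute(
    "SELECT Name FROM Pets WHERE Size BETWEEN 3 AND 9 ORDER BY Size"
).fetchall()
print(in_range, in_range == between)
```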
Queries That Calculate • When performing a query, you can aggregate data from a series of records • E.g. Find all customers born before 1970 and calculate their average sales. • You can do various basic statistical calculations such as: Count, Sum, Avg, Max, Min, Standard deviation, etc. • Certain calculations can be performed only on certain datatypes. • E.g. You cannot calculate the average of FirstName
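The born-before-1970 example could look like the following in SQL. This is a sketch using Python's built-in sqlite3 module, not Access; the `BirthYear` and `Sales` columns and the three customers are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (Name TEXT, BirthYear INTEGER, Sales REAL)")
conn.executemany("INSERT INTO Customers VALUES (?, ?, ?)", [
    ("Ann", 1965, 100.0),
    ("Bob", 1980, 300.0),
    ("Cy", 1950, 200.0),
])

# Filter first (born before 1970), then aggregate the remaining rows.
count, avg_sales = conn.execute(
    "SELECT COUNT(*), AVG(Sales) FROM Customers WHERE BirthYear < 1970"
).fetchone()
print(count, avg_sales)
```

Bob is filtered out before the average is taken, so the result covers only Ann and Cy.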
Viewing Totals • To see these values, you need to click on Design > ‘Totals’ • Totals can be found in the ‘Show/Hide’ box • You will now see an additional row called ‘Total’ in your query design view • Under Total, the ‘Group By’ setting is important…
Example • Say you have a database for a vet that treats dogs. Each dog treated has an entry including ID, weight, and height • Suppose you want to find the average weight and height of all pets • * IMPORTANT: Note that under ‘Total’, we change ‘Group By’ to ‘Count’ for Type (it will show how many pets there were) and to ‘Avg’ for Weight (shows the average)
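Setting the Total row to Count and Avg corresponds to aggregate functions in SQL. The sketch below uses Python's built-in sqlite3 module in place of Access; the table layout and the three pets are sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Pets (ID INTEGER PRIMARY KEY, Type TEXT, Weight REAL, Height REAL)"
)
conn.executemany("INSERT INTO Pets VALUES (?, ?, ?, ?)", [
    (1, "Dog", 30.0, 20.0),
    (2, "Dog", 50.0, 24.0),
    (3, "Cat", 10.0, 10.0),
])

# Count on Type gives the number of pets; Avg on Weight and Height gives the averages.
pet_count, avg_weight, avg_height = conn.execute(
    "SELECT COUNT(Type), AVG(Weight), AVG(Height) FROM Pets"
).fetchone()
print(pet_count, avg_weight, avg_height)
```

With no grouping and no filter, the whole table collapses into a single summary row.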
Example • What if you want to find the average height and weight for all dogs?
Example • What if you want to find the minimum and maximum weight for all dogs? Note that you have to include the Weight field twice.
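Restricting the totals to dogs, and asking for both a minimum and a maximum, might look like this in SQL. Again this is a sketch with Python's built-in sqlite3 module and invented rows. Weight appears twice in the SELECT list, which mirrors including the Weight field twice in the Access grid.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Pets (ID INTEGER PRIMARY KEY, Type TEXT, Weight REAL, Height REAL)"
)
conn.executemany("INSERT INTO Pets VALUES (?, ?, ?, ?)", [
    (1, "Dog", 30.0, 20.0),
    (2, "Dog", 50.0, 24.0),
    (3, "Cat", 10.0, 10.0),
])

# The WHERE clause keeps only dogs before any totals are calculated.
avg_height, avg_weight, min_weight, max_weight = conn.execute("""
    SELECT AVG(Height), AVG(Weight), MIN(Weight), MAX(Weight)
    FROM Pets
    WHERE Type = 'Dog'
""").fetchone()
print(avg_height, avg_weight, min_weight, max_weight)
```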
More Examples • You can also perform totals on groups of records. • For example, suppose you want to count how many different types of pets the vet has on record • This example is not as intuitive as the others. For example, you would not know that the ‘Type’ field should not be grouped. • Learning to skillfully query a database takes some reading and practice.
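Totals on groups of records correspond to GROUP BY in SQL. The sketch below, using Python's built-in sqlite3 module with made-up pets, shows two readings of the counting question: how many pets there are of each type, and how many distinct types there are. Which one the slide intends is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Pets (Name TEXT, Type TEXT)")
conn.executemany("INSERT INTO Pets VALUES (?, ?)", [
    ("Rex", "Dog"), ("Buddy", "Dog"), ("Tweety", "Bird"), ("Tom", "Cat"),
])

# One summary row per type: the count is taken within each group.
per_type = conn.execute(
    "SELECT Type, COUNT(*) FROM Pets GROUP BY Type ORDER BY Type"
).fetchall()

# A single number: how many different types appear at all.
distinct_types = conn.execute("SELECT COUNT(DISTINCT Type) FROM Pets").fetchone()[0]
print(per_type, distinct_types)
```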
Further Examples? • Experiment with more queries