340 likes | 451 Views
RELATIONSHIPS AND SIMPLE QUERIES. Database management systems. Redundancy. A central aim of Database Management Systems is to avoid redundancy – having the same data in more than one place (duplication), which causes: wastage of disk space wastage of user time
E N D
RELATIONSHIPS AND SIMPLE QUERIES Database management systems
Redundancy • A central aim of Database Management Systems is to avoid redundancy – having the same data in more than one place (duplication), which causes: • wastage of disk space • wastage of user time • difficulties in updating correctly • hence the possibility of inconsistent data • decreased operational speed
Referential Integrity (1) • Referential (adj) -From Reference (n) - The state of being related or referred: with reference to; in reference to. • Integrity (n) -The state of being unimpaired; soundness. -The quality or condition of being whole or undivided; completeness. • Thus, Referential Integrity– a property of a database meaning that things that refer to or are connected to one another are sound, reliable, unimpaired – i.e. trustworthy as data sources.
Referential Integrity (2) Within a RDMS this means that: • The data contained in a primary key field used in a table relationship must be matched in that same field in the secondary table. • In other words, data entered in a field that is used to link two tables must match from one table to another. • E.g. If a course record contained reference to a student number which did not exist in the main students table, this would be a violation of referential integrity.
Referential Integrity (3) • In order to fulfil referential integrity, if we delete a student record from the master student table, then we should also delete all records related to that student. • Similarly, if we added a new record referencing a student, referential integrity requires that the student number reflects an existing record in the master students table. • Put differently, Referential Integrity makes sure there are no orphan records (records that don’t relate or connect to anything else)
Verification and Validation • Validation • Ensures that any data entered into a field is of a valid data type. • Verification • Ensures that any data entered is accurate. • Data can be invalid, but accurate • E.g. • Name: Fred • Date of Birth: Twenty-second of February, nineteen eighty one. • Data can be valid, but inaccurate • E.g. • Chemical Symbol: I • Chemical Name: Iron • Think of an ATM. Your debit card identifies a valid account, while your pin verifies you are the owner of the card. • Validation is easily performed by a DBMS. Verification is significantly more difficult.
Database Validation • Checking for violations of the rules set up when you create a table • All the details in the Field Properties section of the Design View are rules for data entry (e.g. Length, Format, Default Value, Required etc.) • Protection - you cannot enter the wrong type of data • E.g. Cannot enter “Light Blue” as your date of birth. • In a nutshell: • Referential Integrity keeps you from deleting data that is needed in another table and referring to data that doesn’t exist. • Could also be considered Limited Verification. • Validation keeps the database clean and ensures that any data entered is (at least) valid.
Some Verification Rules • The legal data type • The legal length • Is the field required? • Is there a default value? • Has the primary key been set?
Relationship Types • One-to-Many / Many-to-One • The primary key in one table can be referenced multiple times in another table. • One car registration may have multiple parking tickets. On the contrary, one parking ticket can only be linked to one car registration. • Many-to-Many • A common relationship type which, while typically easy to cope with, can add significant complexity. • One course can have many students enrolled in it. One student may be enrolled in many courses. • The best solution is to create an intermediate table with the student and course as foreign keys, keeping their details in separate tables, while storing details of enrolment in the intermediate table. • One-to-One • Quite rare in practice, but not impossible. • In many cases, it is acceptable to combine both tables into a single table. This is probably not the case below however. • One husband can have only one wife. One wife may also only have one husband. • In cases like this, you can use the method described to solve Many-to-Many relationships.
∞ 1 One Winery can produce many Wines Relationships: one-to-many (1) • The most common The primary key in one table is a foreign key in another
Relationships: one-to-many (2) There is ONE Winery Code for each winery in the Wineries table But there are many wines from a winery, so in the Wines table the Winery code will be repeated once for every wine from that winery
What relationships prevent • Why is referential integrity important? • You cannot enter a value in the foreign key field of the related table that does not exist in the primary key field of the primary table. • You cannot delete a record from a primary table without deleting related records in a related table. • Solved through Cascading Deletes • You cannot change a primary key value in the primary table, without changing the foreign key values in related records. • Solved through Cascading Updates
Implementing Relationships:How Microsoft Access manages relationships (1) Select Relationships from the Database Tools Tab
Implementing Relationships: How Microsoft Access manages relationships (2) You will be presented with an Add Tables dialog. Choose the tables you want to involve in the creation of the relationship and click on Add. Close when you are finished.
Implementing Relationships: How Microsoft Access manages relationships (3) To create a relationship between two tables, click on the primary key of one table that occurs as a foreign key in the other table (in this case ID and CourseID) and drag and drop the primary key right onto the foreign key in the other table.
Implementing Relationships: How Microsoft Access manages relationships (4) Referential Integrity can be enforced by selecting Enforce Referential Integrity, and then selecting whether to cascade updates and/or deletes. Clicking on Cascade Update ensures that if a change is made in the Primary Key field, other tables that field is linked to will be updated. Clicking on Delete cascade ensures that that if a record is deleted in the Primary Key field, Access will delete all the records that are linked to that field. If this option is not selected you will not be allowed to delete a record that is linked to other records.
Implementing Relationships: How Microsoft Access manages relationships (5) Specifying Join Type tells access how to treat the records of the two tables in queries.
Implementing Relationships: How Microsoft Access manages relationships (6) • A line (or “Join”) will appear joining the two tables in Query Design view – this represents the permanent relationship you have created between the two tables. • It is a one-to-many relationship - one course can be taken by many Students, but each student can only take one course. • Obviously, this isn’t very practical – it is however just an example.
Table Lookup • You can also create one-to-many relationships between tables within the Tables Design View, by selecting Lookup Wizard from the field Data Types selection box. • The Lookup Wizard “looks up” the fields data type and values from another table, and creates a One-to-Many relationship with that table. • It is similar to the Lookup function in Excel.
Lookup Example – Months of the Year (1) • Imagine you had a field intended to capture (for what ever reason) a specific month of the year: January, February etc. • In order to use the Lookup Wizard to solve this problem, we first need a table with the names of the months in it. • Note: We need the ID field to ensure the Months are in the correct order. If we used the months name as a primary key, Access would sort them sequentially, which is undesirable.
Lookup Example (2) • In the Design View of the table which will look up these values: • Navigate to the field which will look up the months of the year. • Select Lookup Wizard in the Data Type selection box of that field.
Lookup Example (3) • In the Lookup Wizard, you can either Look up values in a table or query, or type the data in manually. • We will be looking months up from a table for the purpose of example, but manual entry is typically ideal for unchanging options (like month, gender, etc.) • In the next window, select the table or query you wish to get data from.In this case it is MonthOfYear.
Lookup Example (4) • We want to lookup the ID of the field (because its the Primary Key and that’s what we want to store), but we also want to get the Months actual name. • We sort the fields by ID so months are in the order we expect.
Lookup Example (5) • Verify that the right data was selected. • Note that the primary key has been automatically hidden, so you will only see the month names. However, the ID field will be what is actually stored in the lookup field. • After you name the field, you can select the Month in your lookup field.
Database Normalisation • Normalization is a systematic way of ensuring that a database structure is: • suitable for general-purpose querying • free of certain undesirable characteristics—insertion, update, and deletion anomalies—that could lead to a loss of data integrity. • Introduced by Edgar F. Codd (the inventor of the relational model) in 1970. • A Database is often considered normalised once it is in Third Normal Form (3NF).
Normalisation continued • Un-normalised databases (0NF): • Cannot be queried efficiently, if at all. • May result in inconsistent data when one field is updated, but another isn’t (Update Anomaly) • When an employees email address is stored in multiple tables/records, and only some are properly updated. • May prevent a record from being inserted (Insertion Anomaly) • i.e. If License-plate numbers were used as Primary Keys in an employee table, but some employees did not have cars. • May result in a loss of important data when deleting records (Deletion Anomaly). • i.e. If Purchasing Records were stored within the employee table, and we deleted an employee record, we would lose the record of the purchase as well.
First Normal Form (1NF) • A database is in 1NF if it contains no repeating groups. • E.g. • When one field can contain more than one value, resulting in the need for duplicate fields. • How many fields do you add? Repeating Fields
Converting to 1NF • Create 2 separate tables. 1 ∞ • NOTE: Employee Skills Table needs a Composite key because: • An employee might have multiple skills • A skill might be known by multiple employees • Thus neither Employee ID nor Skill ID is guaranteed to be unique over multiple records. • This can be avoided by introducing an artificial primary key, such as an Autonumber.
Second Normal Form (2NF) • A Table is in 2NF when: • It is in 1NF • It has no attributes which are only partially dependent on the primary key. • Note that when a 1NF table has no composite candidate keys (candidate keys consisting of more than one attribute), the table is automatically in 2NF. Description is only Partially Dependent on the Composite Primary Key. It is dependent on Skill ID, but unrelated to Employee ID. 1 ∞
Converting to 2NF NOTE: This is the best way to deal with Many-To-Many relationships. Just create an intermediate table with 1-To-Many relationships on either side. • Create another table. 1 ∞ 1 ∞ 1 ∞ Now Skill Description is fully dependent on the primary key.
Third Normal Form (3NF) • A Table is in 3NF if: • It is in 2NF • It has no Transitive Dependencies • i.e. No attribute is dependent on another attribute which is not itself a primary key. • NOTE: because the table is already in 2NF, no field can be partially dependent on a primary key, so it is either fully dependent on the primary key, or not dependent on it. 1 ∞ 1 ∞ Department Name is dependent on Department ID, which is not the primary key of the Employee Table.
Converting to 3NF • Surprise! Create another table. 1 ∞ 1 ∞ 1 ∞ 1 ∞ ∞ 1 Department Name is dependent on the primary key.
Action Replay 1 ∞ ∞ 1 1 ∞ Partial Dependency Transitive Dependency Repeating Groups 0NF 1NF 2NF 3NF
Worth Noting • A table which has repeating groups is un-normalised, regardless of whether it has partial or transitive dependencies. (why?) • You do not have to start with un-normalised tables. Create them in 3rd normal form (try paper – it helps). • An un-normalised or partially-normalised database is NOT wrong – sometimes there are good reasons not to normalise (e.g. avoiding unnecessary complication)