360 likes | 444 Views
Database Modelling. Lecture 4 (b): Database Integrity, Keys & Constraints Akhtar Ali. Learning Objectives. To consider new Assessment Regulation To consider Referential Integrity & Foreign Keys To consider Referential Integrity Constraints in SQL
E N D
Database Modelling Lecture 4 (b): Database Integrity, Keys & Constraints Akhtar Ali
Learning Objectives • To consider new Assessment Regulation • To consider Referential Integrity & Foreign Keys • To consider Referential Integrity Constraints in SQL • To consider Ad Hoc Constraints in principle • To consider Ad Hoc Constraints in SQL • To consider other aspects of integrity constraints in SQL.
New Assessment Regulation • The new regulation means that students MUST ATTEMPT ALL ASSESSMENT on their module. They cannot pass the module if they do not attempt all parts of it. This means even if they have 80% overall, if they have missed one part worth 5% of the module, they will not be able to pass the module. This will come into effect for all assessments this academic year 2009-2010.
Purpose of Referential Integrity • To ensure that two different relations are consistent with each other. • Example: Consider the previous CAR relation. We know that it does not make sense for an owner in CAR not to be an employee in EMPLOYEE.
Consistent Cross-referencing So every CAR ‘Owner’ attribute value must appear in the EMPLOYEE ‘ENo’ attribute. So the set of CAR ‘Owner’ attribute values must be a subset of the EMPLOYEE ‘ENo’ values. Consistency between a DB’s relations is often required.
Definition of Referential Integrity The values in a referencing attribute must be the same set or a (proper) subset of the values in the referenced attribute. Thus both attributes must be drawn from the same underlying data type.
Full Definition • Referential integrity can be generalised to apply between two corresponding sets of attributes, where the sets may contain more than one attribute. • Hence the full definition is :- • The set of n-tuples in the referencing set of n-attributes must be the same set or a (proper) subset of the n-tuples in the referencedset of n-attributes. • Thus both sets of n-attributes must be based on the same set of underlying data types.
Foreign Keys The referencing (set of) attribute(s) is called a Foreign Key. The Foreign Key gets its name because traditionally the referenced (set of) attribute(s) is always a candidate key. In SQL, the referenced(set of) attribute(s) must be a primary or alternate key. However this is not logically necessary, and in principle a Foreign Key may reference any (set of) attribute(s) with the same underlying data type(s).
Key Overlap A foreign key can occur within a candidate key, or overlap with it. Example: Foreign Key within a Candidate Key
Self-Referential Integrity A foreign key can reference an attribute(s) in the same relation as itself. Example A manager must also be an employee
Supervise Relation • Usually the foreign key is in a different relation to the referenced candidate key, • This is not a logical necessity, as the above example demonstrates. • The SUPERVISE relation shows which employees each manager supervises. • However each manager is an employee, and will be supervised by their own manager. • Therefore ‘Mger’ is a foreign key that references the candidate key ‘Emp’.
Handling the Top • A hierarchical management organisation is assumed here. • The managing director (with ‘Emp’ value ‘E1’) is at the root of the management hierarchy and is not supervised by anyone. • How is this absence of his/her manager, i.e. this missing value, to be dealt with ? Either: • Use a ‘special value’ to represent the missing manager. • Use null; the null would be in the foreign key not the primary/candidate key. • State that the managing director’s manager is him/herself.
Further Considerations • The link between foreign and candidate keys is asymmetric. • It is the foreign key that is dependent on the candidate key for its values, not vice versa. • A foreign key can, and often does, have values that are replicated in more than one tuple. • The set of foreign key values must be a subset of the candidate key’s set of values.
Missing Foreign Key Values • Is it essential that the foreign key always have a candidate key value, or is it permissible for it to be missing ? • For CAR and EMPLOYEE relations, not every employee may be a car owner. • Indeed a foreign key value can be missing, unless it is part of a candidate key. • In some cases though e.g. STUDENT:COURSE • would expect every student to be on a course.
SQL Referential Integrity • Referential integrity constraints can be named in the same way as primary/alternate key constraints. • Foreign Keys are allowed to be NULLs (unless additionally specified not to). • A foreign key can be declared • for a single attribute in the same sub-statement in which it is assigned its data type, or • for one or more attributes in a separate sub-statement at the end of a Create Table statement.
SQL Foreign Key Assignment 1 1. Assign a single attribute to be a foreign key in the same sub-statement in which it is assigned its data type. After the data type, append:
SQL Foreign Key Assignment 2 2. Assign one or more attributes to a foreign key in a separate sub-statement at the end of a Create Table statement
Example of an SQL Foreign Key (1) Give the relation CAR a foreign key, attribute Owner, referencing attribute ENo of EMPLOYEE. Both are acceptable in Oracle.
Example of an SQL Foreign Key (3) Ensure that the first row entered is the manager that manages him/herself ! (I.e. the root of the hierarchical tree). This is because self-referentiality applies. It prevents any row being put into the table whose employee does not have a manager.
Why Ad Hoc Constraints? • These are other constraints on values held in a relation. • They need to be applied to ensure the database holds accurate data. • They usually fall into two classes : • To ensure physical reality applies. • Example: Weights are always positive. • To ensure Business Rules are kept - this reflects the way the organisation works. • Example: It is company policy to only buy cars with an engine capacity of less than 2 litres.
The Nature of Ad Hoc Constraints • In general, an integrity constraint is a limitation on the permissible values that a DB relation can hold. • It could be expressed as:IF a tuple appears in a DB relationTHEN that tuple satisfies a certain condition. • The condition must always be a binary condition, that evaluates to true or false • Attribute type, candidate key, and referential integrity constraints are special cases.
Ad Hoc Constraints in SQL • They are applied using the Check option. • The format isCheck (condition) • A whole variety of conditions can be inserted. • As with other constraints, ad hoc constraints (Check constraints) • can be part of an attribute sub-statement or • in a separate sub-statement at the end of the Create Table statement.
Types of conditions • The condition used in the Check must be a binary condition. • The conditions are typically comparisons of attribute values. • However there is no limit to what may be written in a condition. • Conditions may include calculations. • They may also consist of several sub-conditions • linked together by logical ANDs and ORs; • and logical NOT may also be used.
Example Check Conditions (1) Constraining the values in a relation about products. The check that a price is not zero or a negative may be considered common sense (or a check to avoid corruption?) rather than explicitly a business rule or physical reality check. The Value constraint uses 2 attributes : therefore it cannot appear in the sub-statement of either attribute but must appear in its own sub-statement at the end.
Example Check Conditions (2) Put more complex constraints on the product relation instead. In the revised example above, an upper limit has also been put on weights. Price is now checked to see if it falls into one of three price ranges.
Multiple Integrity Constraints Although not illustrated yet, it is possible and sometimes desirable in SQL to assign more than one integrity constraint in an attribute sub-statement. Example : consider a different version of the PRODUCT relation. • In the above example, the Weight attribute is assigned 4 integrity constraints in one sub-statement : • its data type, • that it cannot be null, • that it must have a positive weight, • 4. that its value must occur in the Wts attribute of relation LIST_WEIGHTS (presumably in effect a list of permissible weights).
Limitations of Ad Hoc Constraints • Suppose the relation EMPLOYEE were to contain an attribute that holds the ages of employees • and there are age limits on who can be employed to meet legal requirements. • To reflect this, the following ad hoc constraint could be added to the Age attribute : Check ( Age >= 16 And Age <= 65 ) • However, it may be legal and normal to employ some school children and retired people. • So constraint may prevent addition of data that is valid under some circumstances.
Limitations of Integrity Constraints • Suppose the following tuple were to appear in the EMPLOYEE relation : (‘E2’, ‘Fenwick’, ‘M’, 40000) • Assume this tuple satisfies all the integrity constraints, which are properly specified. • If the company has no employee called ‘Fenwick’, then the DB still has false data in it, despite all the integrity constraints. • No integrity constraint can possibly check all the names that are permissible in the DB.
Additional Validation • The integrity constraints form a limited approximation to the real world constraints that would ideally apply. • Additional validation is often required, typically provided by • human intervention or • some specially designed input system such as stored procedures. • The Closed World Assumption applies to DBs. Any data in the DB must be true; any not in it must be false.
SQL Default Values • It is possible to specify that an attribute has a default value. This means that :IF a tuple is to be put into a relationAND IF no value is specified for that attribute in the tupleTHEN the attribute is given the default value in that tuple. • Default values are not integrity constraints • but they must satisfy the integrity constraints. • SQL syntax for defaults is: Defaultdefault_value
Example Default Value Self evidently, the default value must be chosen so that it does satisfy all the integrity constraints! Example : Create Table EMPLOYEE ( ENo Char(2), Ename Varchar(30), M_S Char(1) Default ‘S’ Check( M-S In (‘S’, ‘M’, ‘W’, ‘D’)) ) ;
SQL Alter Table • It can be useful, or even necessary, to alter relations some time after they have been created. • SQL provides the Alter Table command for this. • Alter Table provides the following ways of changing a relation. • Add or drop an attribute; • Add or drop an existing integrity constraint; • Set or modify a default value for existing attribute.
Restrict/Cascade • When dropping an attribute or constraint • there are two options in the statement: Restrict and Cascade. • If Restrict is chosen, which is the default, then it will not be dropped if it is referenced by anything else; thus an alternate key attribute will not be dropped if it is referenced by a foreign key. • If Cascade is chosen, then anything referencing the dropped item will be dropped as well.
Example Alter Table Statements DROP TABLE EMP CASCADE CONSTRAINTS; CREATE TABLE EMP ( EMP_NO CHAR(2) CONSTRAINT PKEY_EMP PRIMARY KEY, EMP_NAME CHAR(10), SALARY INTEGER CONSTRAINT MIN_SAL CHECK (SALARY >= 6000) ); ALTER TABLE EMP ADD MARITAL_STATUS CHAR(1) DEFAULT '?' CONSTRAINT MAR_STATS CHECK (MARITAL_STATUS IN ('S', 'M', 'W', 'D', '?')); ALTER TABLE EMP MODIFY MARITAL_STATUS DEFAULT NULL ; /* cannot drop a default, but can set it to null */ ALTER TABLE EMP ADD CONSTRAINT SENSIBLE_SAL CHECK (Salary > 5000 AND Salary < 50000);
Side effects of DROP TABLE • In Oracle 10g / 11g, any table dropped can be restored: • It creates a copy of the dropped table in a recycle bin, • e.g., DROP TABLE EMP; will create a table in the user’s DB: • BIN$HCoDqI8ZSEe9zuFgS6irDw==$0 • To restore a table use FLASHBACK • e.g., FLASHBACK TABLE EMP TO BEFORE DROP; • You cannot specifically drop the BIN... table. • But you can use PURGE RECYCLEBIN; to clear the bin. • To drop a table altogether use: PURGE • e.g., DROP TABLE EMP PURGE;