490 likes | 633 Views
Chapter 9. Integrity. Topics in this Chapter. Predicates and Propositions Internal vs. External Predicates Correctness vs. Consistency A Constraint Classification Scheme Keys Triggers SQL Facilities. Integrity. Correctness Currency Completeness Validity--does it make sense?
E N D
Chapter 9 Integrity
Topics in this Chapter • Predicates and Propositions • Internal vs. External Predicates • Correctness vs. Consistency • A Constraint Classification Scheme • Keys • Triggers • SQL Facilities
Integrity • Correctness • Currency • Completeness • Validity--does it make sense? • Negative weights, for example Database specific rules, business rules
Integrity • Originally focused on keys, the theory of integrity has evolved to focus on constraints in general • An integrity constraint is a boolean expression associated with a database that is required to evaluate at all times to true • An integrity constraint can regarded as a formal expression of a business rule • Constraints may be a priori or a posteriori
A priori vs a posteriori constraints • Type implies an a priori constraint • Since every attribute of every relvar is of some type, the collection of types is a priori for the relvar • Business rule constraints – of the sort that are represented by uniqueness and value constraints, are a posteriori, that is, a result of decisions after the fact
Business Rules--a posteriori Constraints • Every supplier status value is in the range 1 to 100 inclusive. • Every supplier in London has status 20. • If there are any parts at all, at least one of them is blue. • No two distinct suppliers have the same supplier number. • Every shipment involves an existing supplier. • No supplier with status less than 20 supplies any part in a quantity greater than 500.
Constraint Example CONSTRAINT SC1 FORALL SX ( SX.STATUS > 1 AND SX.STATUS < 100 ) ; Every supplier status value is in the range 1 to 100 inclusive. Represents a business decision about acceptable values, as opposed to the type of STATUS (be it INTEGER or STATUS), which represents the set of possible values Must evaluate to TRUE
Constraint Example CONSTRAINT SC1 FORALL SX ( SX.STATUS > 1 AND SX.STATUS < 100 ) ; CONSTRAINT SC1 IS_EMPTY ( S WHERE STATUS < 1 OR STATUS > 100) ; CONSTRAINT SC1 NOT EXISTS SX ( SX.STATUS < 1 OR SX.STATUS > 100) ; Equivalent formulations (in Tutorial D)
Predicates and Propositions • An expression is a predicate • Its variables are parameters to the predicate • When we instantiate the variables, we are passing arguments to the predicate, and so turning it into: • A proposition, which is either true or false • A constraint is an expression, and therefore a predicate, which is checked by passing it arguments, and testing the proposition
The Golden Rule • A relvar predicate is the conjunction (logical AND) of all constraints associated with any of its components • Golden Rule: No update operation must ever assign to any relvar a value that causes its relvar predicate to evaluate to false • A database predicate is the conjunction of all predicates of its relvars • A database predicate is also golden
Checking the Constraints • Constraints should be checked before attempting any insert or update or delete • This is equally true from an implementation perspective and from the model • To do otherwise is inefficient, and violates the Golden Rule
Internal vs. External Predicates –The Closed World Assumption • Internal predicates are those understood and enforced by the system • External predicates are those understood and implemented by the user • Internal predicates should reflect external predicates • If an otherwise valid tuple does not appear in a relvar, its corresponding proposition is false: The Closed World Assumption
Correctness vs. Consistency • The Closed World Assumption is logically valid, but is unenforceable by the system • External predicates are not understood by the system; therefore the system can enforce consistency, but not truth • The external predicate for a relvar is its intended interpretation • Thus a database may be populated by valid but false propositions
A Constraint Classification Scheme • Constraints can apply to a database, a relvar, an attribute, or a type • Type constraints check format and values immediately • Attribute constraints are inherited from those of the declared type • Relvar and database constraints inherit constraints from attributes and add business rule constraints in addition
Business Rules--a posteriori Constraints • Every supplier status value is in the range 1 to 100 inclusive. • Every supplier in London has status 20. • If there are any parts at all, at least one of them is blue. • No two distinct suppliers have the same supplier number. • Every shipment involves an existing supplier. • No supplier with status less than 20 supplies any part in a quantity greater than 500. type relvar database
S +------+-------+--------+--------+ | snum | sname | status | city | +------+-------+--------+--------+ | S1 | Smith | 20 | London | | S2 | Jones | 10 | Paris | | S3 | Blake | 30 | Paris | | S4 | Clark | 20 | London | | S5 | Adams | 30 | Athens | +------+-------+--------+--------+ SP +------+------+------+ | snum | pnum | qty | +------+------+------+ | S1 | P1 | 300 | | S1 | P2 | 200 | | S1 | P3 | 400 | | S1 | P4 | 200 | | S1 | P5 | 100 | | S1 | P6 | 100 | | S2 | P1 | 300 | | S2 | P2 | 400 | | S3 | P2 | 200 | | S4 | P2 | 200 | | S4 | P4 | 300 | | S4 | P5 | 400 | +------+------+------+ P +------+-------+-------+--------+--------+ | pnum | pname | color | weight | city | +------+-------+-------+--------+--------+ | P1 | Nut | Red | 12.0 | London | | P2 | Bolt | Green | 17.0 | Paris | | P3 | Screw | Blue | 17.0 | Rome | | P4 | Screw | Red | 14.0 | London | | P5 | Cam | Blue | 12.0 | Paris | | P6 | Cog | Red | 19.0 | London | +------+-------+-------+--------+--------+ The suppliers and parts database (sample values)
Transition Constraints • Transition constraints can apply to a database or a relvar, but not an attribute, or a type • Transition constraints constrain certain actions, for example forbidding an update to change a status from “married” to “never married”
Keys – Topics • Candidate Keys • Superkeys • Primary Keys • Alternate Keys • Foreign Keys • Referential Integrity • Referential Actions
Keys – Candidate Keys • Let K be a set of attributes of relvar R. Then K is a candidate key for R if and only if it has both of the following properties: • Uniqueness: No legal value of R ever contains two distinct tuples with the same value for K • Irreducibility: No proper subset of K has the uniqueness property
Keys – Candidate Keys A candidate key must possess: a) Uniqueness: No two tuples in a legal relation will have the same value for that key. b) Irreducibility: (If composite) No proper subset of the key has the uniqueness property.
Keys – Candidate Keys Every relation has at least one candidate key. A relation may be “all key.” A superset of a candidate key is a superkey. A superkey has the uniqueness property but not necessarily the irreducibility property.
Keys – Superkeys • If SK is a superkey for relvar R and A is an attribute of R then SK implies A • A superkey has the uniqueness property: no two tuples will have the same value • But it does not have the irreducibility property: it can contain a subset that has the uniqueness property • A superkey can contain subset superkeys; a candidate key cannot
Keys – Primary Keys and Alternate Keys • If a relvar has two or more candidate keys, one must be chosen to be the primary key • The others are then designated as alternate keys • This choice is logically arbitrary • Logically, candidate keys are of paramount importance; choosing the primary key is ancillary
Example Assume the relation S has no two suppliers with the same name (supplier name is unique). Then: STATUS is not a candidate key. Not unique. Same for City. S# is a candidate key. Unique and irreducible. SNAME is a candidate key. Unique and irreducible. The pair S#, SNAME is not a candidate key. Unique but not irreducible. The pair S#, SNAME is a superkey for S. If S# is chosen as the primary key, then SNAME becomes an alternate key. And vice versa.
Keys – Foreign Keys • Loosely, a foreign key is a set of attributes of some relvar R2 whose values are required to match values of some candidate key of some relvar R1 • R1 and R2 may be the same relvar, in the case of a recursive constraint, for example employee and manager • Since relvars can be both referenced and referencing, the database contains referential paths connecting relvars in chains
Referential Diagrams S SP P There is a foreign key in SP that refers to S. There is a foreign key in SP that refers to P. S# P# S SP P R1 R2 R3 There is a referential path from R1 to R3
Keys – Foreign Keys – Referential Integrity • The database must not contain any unmatched foreign key values • Originally foreign keys were defined in terms of the primary key in the referenced relvar, but this qualification is superfluous, although perhaps desirable in practice • The relationship in one directional; it is not a requirement that every referenced candidate key value appear in the foreign keys of the referrer
S +------+-------+--------+--------+ | snum | sname | status | city | +------+-------+--------+--------+ | S1 | Smith | 20 | London | | S2 | Jones | 10 | Paris | | S3 | Blake | 30 | Paris | | S4 | Clark | 20 | London | | S5 | Adams | 30 | Athens | +------+-------+--------+--------+ SP +------+------+------+ | snum | pnum | qty | +------+------+------+ | S1 | P1 | 300 | | S1 | P2 | 200 | | S1 | P3 | 400 | | S1 | P4 | 200 | | S1 | P5 | 100 | | S1 | P6 | 100 | | S2 | P1 | 300 | | S2 | P2 | 400 | | S3 | P2 | 200 | | S4 | P2 | 200 | | S4 | P4 | 300 | | S4 | P5 | 400 | +------+------+------+ If we were to delete the “S4” row from S, SP would have these unmatched foreign key values. “dangling references”
So, in addition to “business rules”--integrity rules that are specific to a particular database-- There are two “general rules” that apply to all databases
Referential Integrity Rule: The database must not contain any unmatched foreign key values. Entity Integrity Rule: No component of the primary key of a base relation is allowed to accept nulls.
Keys – Foreign Keys – Referential Actions • DELETE may not violate the referential integrity constraint • RESTRICT limits the action of the DELETE to just those tuples that do not have referring tuples in another relvar • CASCADE broadcasts the DELETE to include any tuples that reference the affected tuples • UPDATE requires similar behavior
SQL Facilities • SQL does not support type constraints, nor attribute constraints, nor relvar constraints, nor database constraints • SQL supports base table constraints, which are a superset of a subset of relvar and database constraints • SQL supports, for base tables: PRIMARY KEY, UNIQUE, FOREIGN KEY, CHECK, ASSERTION
SQL Facilities Base Table Constraints Candidate keys: A candidate key definition takes the form UNIQUE ( <column name commalist> ) or the form PRIMARY KEY ( <column name commalist> )
SQL Facilities Base Table Constraints Foreign keys: A foreign key definition takes the form FOREIGN KEY ( <column name commalist> ) REFERENCES <base table name> [ ( <column name commalist> ) ] [ ON DELETE <referential action> ] [ ON UPDATE <referential action> ]
SQL Facilities Base Table Constraints Check constraints: An SQL "check constraint definition" takes the form CHECK ( <conditional expression> )
CREATE TABLE SP ( S# SNUMBERS NOT NULL, P# PNUMBERS NOT NULL, QTY QUANTITY NOT NULL, DOCK DOCKS NOT NULL, CLERK CLERKS NOT NULL, UNIQUE ( DOCK, CLERK ), PRIMARY KEY ( S#, P# ), FOREIGN KEY ( S# ) REFERENCES S ON DELETE CASCADE ON UPDATE CASCADE, FOREIGN KEY ( P# ) REFERENCES P ON DELETE CASCADE ON UPDATE CASCADE, CHECK ( QTY > 0 AND QTY < 5001 ) ) ;
SQL Facilities General Constraints (Assertions) Syntax: CREATE ASSERTION <constraint name> CHECK ( <conditional expression> ) ; DROP ASSERTION <constraint name> ;
SQL Facilities Examples: Every part has a positive weight: CREATE ASSERTION IC18 CHECK (NOT EXISTS ( SELECT * FROM P WHERE NOT (P.WEIGHT > 0 ) ) ) ; Every supplier has status at least five: CREATE ASSERTION IC13 CHECK ( ( SELECT MIN ( S.STATUS ) FROM S ) > 4 ) ;
All red parts must be stored in London: CREATE ASSERTION IC99 CHECK ( NOT EXISTS ( SELECT * FROM P WHERE P.COLOR = ‘Red’ AND P.CITY <> ‘London’ ) ) ; No shipment has a total weight (part weight times shipment quantity) greater than 20,000: CREATE ASSERTION IC46 CHECK (NOT EXISTS ( SELECT * FROM P, SP WHERE P.P# = SP.P# AND ( P.WEIGHT * SP.QTY ) > 20000 ) ) ;
No supplier with status less than 20 can supply any part in a quantity greater than 500: CREATE ASSERTION IC95 CHECK ( NOT EXISTS ( SELECT * FROM S, SP WHERE S.S# = SP.S# AND S.STATUS < 20 AND SP.QTY > 500 ) ) ;
Deferred Checking • SQL constraint verification can be DEFERRABLE or NOT DEFERRABLE • NOT DEFERRABLE means the check will be immediate • DEFERRABLE offers the option of SET IMMEDIATE
Triggers • Triggers were used to implement constraints in the days when the database software didn’t • They are used more now for auditing, or to carry out corresponding actions beyond those handled via referential integrity • They should be avoided if possible, because they quickly become hard to manage: triggers can become chained, and ultimately may become recursive unintentionally
SQL Trigger Syntax an SQL update operation CREATE TRIGGER <trigger name> <BEFORE or AFTER> <event> ON <base table name> [REFERENCING <naming commalist>] [FOR EACH <row or statement>] [WHEN <bool exp>] <action> ; any single (possibly compound) SQL statement
Key Constraints • The general formulation for a relvar constraint is as follows: If a certain tuple appears in a certain relvar then that tuple satisfies a certain condition • This applies equally to value constraints, uniqueness constraints and key constraints • More generally, if certain tuples appear in certain relvars then those tuples satisfy a certain condition
Key Constraint Formal Definition –Beginning • FORALL x# S#, xn NAME, xt INTEGER, xc CHAR, Y# S#, yn NAME, yt INTEGER, yc CHAR (IF { S# x#, SNAME xn, STATUS xt, CITY xc} S AND { S#, y#, SNAME yn, STATUS yt, CITY yc} S
Key Constraint Formal Definition –Conclusion • THEN (IF x# = y# THEN xn = yn, AND xt = yt, AND xc = yc ) ) • This expresses that S# is a superkey, and possibly a candidate key