110 likes | 129 Views
This lecture explores the importance of data integrity in Geo-Databases, including methods for defining integrity constraints, enforcing referential integrity, and using the CHECK clause. It also discusses the role of assertions and triggers in ensuring data integrity.
E N D
Geo-Databases: lecture 6Data Integrity Prof. Dr. Thomas H. Kolbe Institute for Geodesy and Geoinformation Science Technische Universität Berlin Credits: This material is mostly an english translation of the course module no. 8 (‘Geo-Datenbanksysteme‘) of the open e-content platform www.geoinformation.net.
Motivation • “Invalid“ states must be avoided: • A student is assigned the mark -6 (in Germany from 1 to 6). • A lecture is assigned a nonexistent lecturer. • A lecturer is neither assigned a lecture nor is he listed in the table “sabbatical semester“. real world real world‘ mini-world mini-world‘ • Integrity constraints describe valid states of the system. Their compliance is controlled by the DBMS.
Data Integrity – Methods presented so far • Already mentioned methods for the definition of integrity constraints are: • Specification of the value domain for each column:ZIP NUMERIC(5,0) • A postal code (ZIP number) is allowed a maximum length of 5 digits (in Germany). • Prohibitionof NULL values:Name VARCHAR(30) NOT NULL • Guarantees that no tuple will have a NULL entry in “Name“(or vice-versa: every tuple must have a “Name“ value different from NULL) • Primary key: • Guarantees that there are no two tuples with identical key attributes • No tuple is allowed the NULL value in a primary key attribute
foreign key foreign key integrity violation! Referential Integrity - Motivation The constraint “Every lecture is held by a lecturer.“ is formally soecified wrt. the database as follows : For each tuple within Vorlesungen (lectures) there is a tuple within professors, such that Vorlesungen.PersNr = Professoren.PersNr In general: referential integrity Referential integrity cannot be guaranteed by the the previously introduced integrity constraints.
Referential integrity in SQL (1) We need language constructs that can be used to introduce primary/foreign key relations to the system. • When to check? • At the end of a data manipulation operation (default) • At the end of a transaction • Principle: • The database is in a state of referential integrity before the manipulation • Changes are only applied, if they are “valid“ • The database is in a state of referential integrity after the manipulation
foreign key foreign key foreign key Referential integrity in SQL (2) • Default strategy: Rejection of invalid changes • 2nd strategy: cascading changes …analogous ON UPDATE CASCADE for cascading UPDATE
Referential integrity in SQL (3) • 3. strategy: insertion of a default value • Similar syntax to ON DELETE / UPDATE CASCADE: • ON DELETE / UPDATE SET NULL: …on deletion / updating the foreign key is set to NULL • ON DELETE / UPDATE SET DEFAULT: …on deletion / updating the foreign key is set to a default value
CHECK Clause • Allows additional constraints on the level of attributes and tables. • Example: Graduates must have studied for at least 8 semesters • CREATE TABLE Graduates ( ...,Semester INTEGER CHECK Semester >= 8) • CHECK conditions can be as complex as the WHERE conditions: • Only Professors are allowed to hold an exam. • CREATE TABLE Exams( Name VARCHAR(30) NOT NULL,...CHECK (Name IN (SELECT Name FROM Professoren)) • Attention! The CHECK constraint is carried out only in case of data • manipulations of the respective table (no checking if “Professoren“ is changed)!
Outlook • If an integrity constraint is not only to be checked on the change of a single table, it has to be formulated on the database schema level. In order to ensure integrity propagation of changes might become necessary. • Assertions • CREATE ASSERTION <name> CHECK <condition> • <condition> like in the WHERE clause • Check is performed on changes to any of the tables specified in <condition> • Resolving of change propagations to ensure integrity (Trigger) • User-defined procedures that are launched automatically if a certain condition is fulfilled • Since SQL:1999 standardised • For more details, see literature
References • Overview: • Hector Garcia-Molina, Jeffrey D. Ullman, Database Systems: The Complete Book, Prentice Hall, 2002 • Alfons Kemper, André Eickler, Datenbanksysteme - Eine Einführung, Oldenbourg Verlag, München, 1996 • Jim Melton, Alan R. Simon, SQL 1999: Understanding Relational Language Components, Morgan Kaufmann Publishers, 2001 • Gottfried Vossen, Datenbankmodelle; Datenbanksprachen und Datenbankmanagement-Systeme, Oldenbourg Verlag, München, 1999