400 likes | 734 Views
Basic Concepts of Relational Database. Chapter 5. Instructor: Churee Techawut. CS (204)321 Database System I. Outlines. 1) Relational Database 2) Relational Data Model 3) Relational Database Constraints and Relational Database Schemas 4) Update Operations on Relations
E N D
Basic Concepts of Relational Database Chapter 5 Instructor: Churee Techawut CS (204)321 Database System I
Outlines 1) Relational Database 2) Relational Data Model 3) Relational Database Constraints and Relational Database Schemas 4) Update Operations on Relations 5) Relational Algebra
Relational Database • Definition (by C.J. Date) “ A relational database is a database that is perceived by its users as a collection of tables. It is not just a database in which the data is physically stored as tables. ” Note: Relation is a mathematical term for a table.
Relational Data Model • Concepts • The relational model of data is based on the concept of a Relation. • A relation is a mathematical concept based on the ideas of sets. • The strength of the relational approach to data management comes from the formal foundation provided by the theory of relations. • We review the essentials of the relational approach in this chapter.
Relational Data Model • What is “Relation” ? RELATION: A table of values • A relation may be thought of as a set of rows. • A relation may alternately be thought of as a set of columns. • Each row represents a fact that corresponds to a real-world entity or relationship. • Each row has a value of an item or set of items that uniquely identifies that row in the table. • Sometimes row-ids or sequential numbers are assigned to identify the rows in the table. • Each column typically is called by its column name or column header or attribute name.
Relational Data Model • Relation properties • A relation is a named, 2-dimentional table of data. • Not all tables qualify as relation. • Every relation has a unique name. • Each column has a unique name. • No 2 rows are identical. • Ordering of rows is not significant. • Ordering of columns is not significant. • All attribute values are atomic. (There always exists precisely one value, never a set of values) • Column values are of the same kind.
Relational Data Model • Relation schema R (A1, A2, .....An) R is the name of relation. Degree of relation is the numbers of attributes of its relation schema. Relation schema R is defined over attributes A1, A2, .....An • For Example: • CUSTOMER (Cust-id, Cust-name, Address, Phone#) CUSTOMER is a relation defined over the four attributes Cust-id, Cust-name, Address, Phone#, each of which has a domain or a set of valid values. Degree of relation is 4.
Relational Data Model • Domain “ A domain D is a set of atomic values, all of the same type. ” All values in a column come from the same domain. • For Example: • Cust-id : The set of valid 6-digit numbers. • Cust-name : The set of customer names. • Address : The set of home address where customers live. • Phone# : The set of 10-digit phone numbers valid in Thailand.
Relational Data Model • Domain (Continued) • A domain has a logical definition. e.g., “USA_phone_numbers” are the set of 10 digit phone numbers valid in the U.S. • A domain may have a data-type or a format defined for it. e.g., The USA_phone_numbers may have a format: (ddd)-ddd-dddd where each d is a decimal digit. e.g., Dates have various formats such as monthname, date, year or yyyy-mm-dd, or dd mm,yyyy etc. • An attribute designates the role played by the domain. e.g., The domain Date may be used to define attributes “Invoice-date” and “Payment-date”.
Relational Data Model • Domain (Continued) • Each attribute in the model should be assigned domain information which includes: - Data type - Length - Data format (e.g., Date format is dd/mm/yy) - Range - Constraints (special restrictions on allowable values) - Null support - Default value (if any)
Relational Data Model • Relation instance • A relation instance, r , of the relation schema R (A1, A2, .....An), denoted by r(R) is a mathematical relation of degree n on the domains dom(A1), dom(A2),…, dom(An), which is the subset of the Cartesian Product of the domains that define R. R is also called the intension of a relation. r is also called the extension of a relation.
Relational Data Model • Tuple • A tuple is an ordered set of values. • Each value is derived from an appropriate domain. • Each row in the CUSTOMER table may be referred to as a tuple in the table and would consist of four values. • <632895, "John Smith", "101 Main St. Atlanta, GA 30332", "(404) 894-2000">is a tuple belonging to the CUSTOMER relation. • A relation may be regarded as a set of tuples (rows). • Columns in a table are also called attributes of the relation.
Relational Data Model • Tuple (Continued) We refer to component values of a tuple t by t[Ai] = vi (the value of attribute Ai for tuple t). Similarly, t[Au, Av, ..., Aw] refers to the subtuple of t containing the values of attributes Au, Av, ..., Aw, respectively.
Relational Data Model • Let S1 = {0,1} • Let S2 = {a,b,c} • Let • Then for example: is one possible ‘state’ or ‘population’ or ‘extension’ r of the relation R, defined over domains S1 and S2. It has three tuples.
Relational Data Model • Terminology
Relational Data Model • Example
Relational Data Model • Key fields • Keys are special fields that serve two main purposes: - Primary keys are unique identifiers of the relation.Examples can use SSN as a primary key. This is how we can guarantee that all rows are unique. - Foreign key is a column or columns whose values are the same as a primary key of another table. • Keys can be simple (a single field) or composite (more than one field) • Keys usually are used as indexes to speed up the response to user queries.
Relational Constraints • What are relational constraints? “ Constraints are conditions that must hold on all valid relation instances. ” (Elmasri&Navathe, 2000) “ Restrictions on data that can be specified on a relational database schema. ” (Date, 2000) • There are three main types of constraints: • Key constraints • Entity integrity constraints • Referential integrity constraints
Relational Constraints • Key constraints • Superkey of R: A set of attributes SK of R such that no two tuples in any valid relation instance r(R) will have the same value for SK. That is, for any distinct tuples t1 and t2 in r(R), t1[SK] t2[SK]. • Key of R: A "minimal" superkey; that is, a superkey K such that removal of any attribute from K results in a set of attributes that is not a superkey. • Example: The CAR relation schema: • CAR(State, Reg#, SerialNo, Make, Model, Year) • has two keys Key1 = {State, Reg#}, Key2 = {SerialNo}, which are also superkeys. {SerialNo, Make} is a superkey but not a key.
Relational Constraints • Key constraints (Continued) • The CAR relation with two candidate keys: LicenseNumber and EngineSerialNumber.
Relational Constraints • Key constraints(Continued) • A relation schema may have more than 1 key. Each of the keys is called a candidate key. One of the candidate keys are designated as the primary key of the relation. • If a relation has several candidate keys, one is chosen arbitrarily to be the primary key. The primary key attributes are underlined.
Relational Constraints • Entity integrity constraints • Relational Database Schema: A set S of relation schemas that belong to the same database. S is the name of the database. S = {R1, R2, ..., Rn} • Entity Integrity: The primary key attributesPK of each relation schema R in Scannot have null values in any tuple of r(R). This is because primary key values are used to identify the individual tuples. • t[PK] null for any tuple t in r(R) • Note: Other attributes of R may be similarly constrained to disallow null values, even though they are not members of the primary key.
Relational Constraints • Entity integrity constraints(Continued) • A Null is created by making no entry at all, so a null denotes the absence of a value. • A null can have any of the following meanings: • The value does not exist • The value exists, but it is not known. • The value is unknown, or it is not applicable.
Relational Constraints • Referential integrity constraints • A constraint involving two relations (the previous constraints involve a single relation). • Used to specify a relationship among tuples in two relations: the referencing relation and the referenced relation. • Tuples in the referencing relationR1 have attributes FK (called foreign key attributes) that reference the primary key attributes PK of the referenced relationR2. A tuple t1in R1 is said to reference a tuple t2 in R2 if t1[FK] = t2[PK]. • A referential integrity constraint can be displayed in a relational database schema as a directed arc from R1. FK to R2.
Relational Constraints • Referential integrity constraints (Continued) • Statement of the constraint • The value in the foreign key column (or columns) FK of the the referencing relation R1 can be either: • (1) a value of an existing primary key value of the corresponding primary key PK in the referenced relation R2, or.. • (2) a null. • In case (2), the FK in R1 should not be a part of its own primary key.
Referential integrity constraints displayed on the COMPANY relational database schema Relational Constraints
Relational Constraints • Summary of relational constraints Key constraints (columns) Entity integrity constraints (rows) Referential integrity constraints (between tables)
Update Operations on Relations • UPDATE operations consist of: INSERT a tuple DELETE a tuple MODIFY a tuple • Integrity constraints should not be violated by the update operations. • Several update operations may have to be grouped together. • Updates may propagate to cause other updates automatically. This may be necessary to maintain integrity constraints.
Update Operations on Relations • In case of integrity violation, several actions can be taken: • Cancel the operation that causes the violation (REJECT option) • Perform the operation but inform the user of the violation • Trigger additional updates so the violation is corrected (CASCADE option, SET NULL option) • Execute a user-specified error-correction routine e
Update Operations on Relations • Insert operation • Insert <‘Cecilia’, ‘F’, ‘Kolonsky’, null, ‘1960-04-05’, ‘6357 Windy Lane, Katy, TX’, F, 28000, null, 4> into EMPLOYEE. This insertion violates the entity integrity constraint (null for the primary key SSN), so it is rejected. • Insert <‘Cecilia’, ‘F’, ‘Kolonsky’, ‘677678989’, ‘1960-04-05’, ‘6357 Windy Lane, Katy, TX’, F, 28000, null, 4> into EMPLOYEE. This insertion satisfies all constraints, so it is acceptable.
Update Operations on Relations • Delete operation • Delete the WORKS_ON tuple with ESSN = ‘999887777’ and PNO = 10. This deletion is acceptable. • Delete the EMPLOYEE tuple with SSN = ‘999887777’ This deletion is not acceptable, because tuples in WORKS_ON refer to this tuple. Hence, if the tuple is deleted, referential integrity violations will result.
Update Operations on Relations • Modify/Update operation • Update the SALARY of the EMPLOYEE tuple with SSN = ‘999887777’ to 28000. This update is acceptable. • Update the SSN of the EMPLOYEE tuple with SSN = ‘999887777’ to ‘987654321’. This update is unacceptable, because it violates primary key and referential integrity constraints.
Relational Algebra • Relational algebra is a set of operations to manipulate data and enable the user to specify basic retrieval requests. • The result of a retrieval is a new relation, which may have been formed from one or more relations. • Three groups of relational algebra operations: • Unary relational operations • Binary relational operations • Set theoretic operations
Relational Algebra • Unary relational operations: • The SELECT operation • The PROJECT operation