220 likes | 245 Views
Learn about Relational Model proposed by E.F. Codd in 1970, data independence, normalization, set-oriented manipulation language, key terminology, mathematical relations, database schemas, attributes, keys, and integrity constraints.
E N D
Relational Model (CB Chapter 4) CPSC 356 Database Ellen Walker Hiram College
Relational Model • Proposed by E.F. Codd, 1970 • Goals • Data independence (physical vs. conceptual) • Normalization (avoiding redundancy) • Set-oriented data manipulation language (relational algebra) • Examples of Relational DBMS: • DBMS System R (IBM), Ingres (UC Berkeley) • Oracle (all versions), Access, RDB, DB2 …
Terminology • A relation (instance) is a table with columns and rows. • (don’t confuse with relationships in ER models) • An attribute is a named column of a table. • (corresponds to simple attribute in ER) • A domain is the set of allowable values for an attribute • A tuple is a row of a table.
More Terminology • The degree or arity of a relation is its number of attributes (columns) • The cardinality of a relation is its number of tuples (rows) • A relational database is a collection of normalized relations with distinct names. • We’ll get to normalization later!
Example Relation (Phone Book) • Attributes: First, last, dept, email, etc. • Domain for Title: {Prof, Inst, Assoc, Asst} • Tuple: (Obie,Slotterbeck,CS,Obie,5275,Prof) • Degree = 6, Cardinality = 4
Order Doesn’t Matter! • This is the same phone book relation.
Sets and Subsets • A set is an unordered collection of unique elements Set S = {1,2,3} “1 is an element of S” {a,b,c} = {a,c,b} • A subset of a set is another set whose elements all come from the original set. {a,b} is a subset of {a,c,b} {1,2,3} is a subset of {1,2,3} {1,2,4} is not a subset of {1,2,3} {} (the empty set) is a subset of every set!
Mathematical Relations • Cartesian product: a set of ordered pairs, where each contains one element from each original set {1,2,3} x {a, b} = {(1,a), (1,b), (2,a), (2,b), (3,a), (3,b)} • Relation: any subset of a Cartesian product • R1 = {(1,a),(2,b),(3,a)} • R2 = {(1,a), (1,b), (2,b), (3,b)} • R3 = { }
Bigger Relations • First, take the cross product of n sets • If n = 3, get triples; if n=4, get quadruples • For arbitrary n, get “n-tuples” • Now, take a subset of that big cross product • Several n-tuples • Arrange them in rows, so the commas line up • Looks familiar… • Domains of attributes are sets from the cross product.
Database Schemas • A schema describes a database, but not the data • Relation schema: set of attribute and domain name pairs (First: char-string) (Last: char-string) (Dept: {CS,Math,Comm,Econ,Mgmt, etc.}) (Phone: 4-digit-string) (email: char-string) (Title: {Inst,Asst,Assoc,Prof}) • Relational Database Schema: set of relational schemas, each with a distinct name and integrity constraints (later)
Properties of (Database) Relations • Every relation has a unique name • Every attribute has a unique name • Attribute’s values are all from its domain • Because relations are sets… • No duplicate tuples • Order of tuples doesn’t matter (theoretically) • Every cell contains a single value (single-valued attributes)
Single-valued Attributes • “Oldest-child” is an ok attribute • “Children” is not, because it is likely to be multi-valued • Break up composite attributes into multiple attributes • Name --> First-name, Middle-init, Last-name • Use “relationships” in the ER sense for multivalued attributes.
Keys • Superkey: set of attributes that uniquely identifies a tuple. • (The whole tuple is always a superkey - why?) • Candidate Key: a superkey that doesn’t have a proper subset that is also a superkey • Primary Key: the candidate key that is selected by the designer (often generated) • Foreign Key: Set of attributes in one relation that matches the Primary key of another.
Properties of a Candidate Key • Uniqueness • The set of attribute values in the candidate key uniquely identifies a tuple • Irreducibility • If one attribute is removed from the candidate key, then uniqueness no longer holds
NULL • NULL is considered a member of every domain • NULL means: • Value is currently unknown • Value is not applicable for this entry (e.g. “spouse” of a single person) • Don’t confuse NULL with • Zero • Blank or empty string
Relational Schema “Shorthand” • List the relation name first • Attribute names in parentheses following the relation name • Primary key attribute(s) underlined • Example • Branch (BranchNo, street, city, postcode) • Staff (StaffNo, fName, position, sex, DOB, salary, BranchNo) • Arrow from foreign key to its referent
Integrity Constraints • Rules that apply to all instances of any database • If these rules are violated, the instance is not a legal database (it doesn’t “make sense”) • Some come from business rules of enterprise • “Students can take only 1 course in 3-week term” • Others from the schema design • “InstructorID cannot be NULL and must match the ID of a listed faculty member”
Key Constraint • Specifies which attributes compose a candidate key (or primary key) for the relation • Example: • {first, last} and {email} are both keys of the Phone Book relation • Generally, only the primary key is specified • In practice, an auto-generated id number is used as the key
Basic Integrity Constraints • Entity Integrity: No attribute of a primary key can be NULL for any tuple. • This constraint is implicit in the KEY constraint in SQL • Referential Integrity: If a relation has a foreign key, either the foreign key is NULL, or the foreign key matches the primary key of an existing tuple in another relation. • Attribute names do not have to be the same
Maintaining Referential Integrity • Adding a new tuple • Add with NULL foreign keys • Add tuple in the other relation (with primary key matching the foreign key) first • Deleting a tuple • Find all tuples in other relations whose foreign keys match this one • Either delete those tuples, or change their foreign keys to NULL • Modifying a tuple • If a foreign key is changed, make sure that the new value points to an existing tuple or is NULL
Summary of Constraints • Domain constraint • Each attribute’s values come from its domain • Key constraint • Every relation has a primary key • Entity integrity constraint • No NULL primary key attributes • Referential integrity constraint • Foreign keys are NULL or refer to existing tuples’ primary keys • Semantic constraints • Any additional constraints that must be satisfied based on the client’s needs
Views • “Relations” that appear to exist, but actually combine attributes from multiple base (implemented) relations. • When a base table changes, the view changes & vice versa • Useful for • Security: limit which attributes a user sees • Customization • Simplification