530 likes | 625 Views
Housekeeping – Jan 25, 2013. Today:. Quick review of the various relational algebra concepts using small example tables Monday: work through exercises based on our running example of the transit scheme (I will finish off our first draft of the table scheme today)
E N D
Today: • Quick review of the various relational algebra concepts using small example tables • Monday: work through exercises based on our running example of the transit scheme (I will finish off our first draft of the table scheme today) • Goal: see if we can retrieve the information needed to support our scenarios
Quiz Confusion (a.k.a. don’t try to remember how you wrote a quiz on less than 4 hours sleep) • Q6. What is the role of a foreign key in a table • Links two or more tables together • Uniquely identifies each record in the table for which it is the foreign key • Uniquely identifies each record in the table from which it originates • A and C
Key selection • A candidate key is any set of one or more columns whose combined values are unique among all occurrences (i.e., tuples or rows). Since a null value is not guaranteed to be unique, no component of a candidate key is allowed to be null. • There can be any number of candidate keys in a table • The primary key of any table is any candidate key of that table which the database designer arbitrarily designates as "primary". The primary key may be selected for convenience, comprehension, performance, or any other reasons. • The alternate keys of any table are simply those candidate keys which are not currently selected as the primary key. • A foreign key is a set of one or more columns in any table (not necessarily a candidate key, let alone the primary key, of that table) which may hold the value(s) found in the primary key** column(s) of some other table. • ** see slide 7
Entity Integrity Constraint • No primary key value can be null. • Need to be able to identify individual tuples in the relation – if allowed to have a null value for the primary key, may not be able to distinguish between two or more tuples
Referential Integrity Constraint • Specified between two relations • Used to maintain consistency among tuples of the two relations • A tuple in one relation that refers to another relation must refer to an EXISTING TUPLE in that relation • Foreign keys specify a referential integrity constraint between the two relation • A set of attributes FK in relation schema R1 is a foreign key of R1 if: • The attributes in FK have the same domain as the primary key attributes PK of another relation R2 (FK references R2) • A value of FK in a tuple t1 of R1 either occurs as a value of PK for some tuple T=t2 in R2 or is null. • E.g. if dept # is the foreign key of employee, an employee must either be assigned a value of the primary key in Dept or the value can be Null in employee if no department is currently assigned
Does a foreign key have to link to a primary key? • Slightly controversial. Remember that there could be several candidate keys from which a single primary key was chosen. • To ensure that each record in the referencing table references exactly one record in the referenced table, the referenced column(s) in the referenced table must have a primary key constraint or have both unique and not-null constraints. Having a unique index is not sufficient. • Under the hood: When you specify a primary key constraint for a table, the Database Engine enforces data uniqueness by automatically creating a unique index for the primary key columns. This index also permits fast access to data when the primary key is used in queries. If a primary key constraint is defined on more than one column, values may be duplicated within one column, but each combination of values from all the columns in the primary key constraint definition must be unique. • More info on foreign key constraints (but remember, this is DB specific): http://msdn.microsoft.com/en-us/library/ms175464(v=sql.105).aspx
Relational Algebra CSCI 2141 W 2013
What is Relational Algebra? • Relational Algebra is formal description of how relational database operates. • It is a procedural query language, i.e. user must define both “how” and “what” to retrieve. • It consists of a set of operators that consume either one or two relations as input. An operator produces one relation as its output.
Introduction to Relational Algebra • Introduced by E. F. Codd in 1970. • Codd proposed such an algebra as a basis for database query languages.
Relational Query Languages • Formal: relational algebra, relational calculus, Datalog • Actual: SQL, Quel, Query-by-Example (QBE) • In ALL languages, a query is executed over a set of relations, get single relation as the result
Terminology • Relation - a set of tuples. • Tuple - a collection of attributes which describe some real world entity. • Attribute - a real world role played by a named domain. • Domain - a set of atomic values. • Set - a mathematical definition for a collection of objects which contains no duplicates.
Algebra Operations • Unary Operations - operate on one relation. These include select, project and rename operators. • Binary Operations - operate on pairs of relations. These include union, set difference, division, cartesian product, equality join, natural join, join and semi-join operators.
Select Operator • The Select operator selects tuples that satisfies a predicate; e.g. retrieve the employees whose salary is 30,000 бSalary = 30,000 (Employee) • Conditions in Selection: Simple Condition: (attribute)(comparison)(attribute) (attribute)(comparison)(constant) Comparison: =,≠,≤,≥,<,> Condition: combination of simple conditions with AND, OR, NOT
Select Operator Example Person бAge≥34(Person) бAge=Weight(Person)
Select Operator Example Person бAge≥34(Person) бAge=Weight(Person)
Project Operator • Project (∏) retrieves a column. Duplication is not permitted. • e.g., name of employees: ∏name(Employee) e.g., name of employees earning more than 80,000: ∏name(бSalary>80,000(Employee))
Project Operator Example Employee ∏name(Employee)
Project Operator Example Employee ∏name(Employee)
Project Operator Example:∏name(бSalary>80,000(Employee)) бSalary>80,000(Employee) Employee ∏name(бSalary>80,000(Employee))
Project Operator Example бSalary>80,000(Employee) Employee ∏name(бSalary>80,000(Employee))
Cartesian Product • In mathematics, it is a set of all pairs of elements (x, y) that can be constructed from given sets, X and Y, such that x belongs to X and y to Y. • It defines a relation that is the concatenation of every tuple of relation R with every tuple of relation S.
Cartesian Product Note: • Relation schema is the union of schemas for R and S • Resulting schema may be ambiguous • Use R.A or S.A to disambiguate an attribute that occurs in both schemas
Cartesian Product Example City Person Person X City
Cartesian Product Example City Person Person X City
Example R S A B B C D 1 2 2 5 6 x 3 4 4 7 8 9 10 11
Example R S A B B C D A R.B S.B C D 1 2 2 5 6 1 2 2 5 6 x 3 4 4 7 8 1 2 4 7 8 9 10 11 1 2 9 10 11 3 4 2 5 6 3 4 4 7 8 3 4 9 10 11
Rename Operator • In relational algebra, a rename is a unary operation written as ρ a / b(R) where: a and b are attribute names R is a relation • The result is identical to R except that the b field in all tuples is renamed to an a field. • Example, rename operator changes the name of its input table to its subscript, • ρ EmployeeName / Name (Employee) • Changes the Name field of the Employee table to EmployeeName
Rename Operator Example ρEmployeeName / Name(Employee) Employee
Union Operator • The union operation is denoted U as in set theory. It returns the union (set union) of two compatible relations. • For a union operation r U s to be legal, we require that, r and s must have the same number of attributes. The domains of the corresponding attributes must be the same. • As in all set operations, duplicates are eliminated.
Union Operator Example Professor Student Student U Professor
Intersection Operator • Denoted as . For relations R and S, intersection is R S. • Defines a relation consisting of the set of all tuples that are in both R and S. • R and S must be union-compatible.
Intersection Operator Example Professor Student Student Professor
Set Difference Operator • For relations R and S, • Set difference R - S, defines a relation consisting of the tuples that are in relation R, but not in S. • Set difference S – R, defines a relation consisting of the tuples that are in relation S, but not in R. Note: R-S S-R!
Set Difference Operator Example Professor Student Professor - Student Student - Professor
Set Difference Operator Example Professor Student Professor - Student Student - Professor
Try R S, R S, R - S R name address gender birthdate Carrie Fisher 123 Maple St., Hollywood F 9/9/99 Mark Hamil 456 Oak Rd., Brentwood M 8/8/88 S name address gender birthdate Carrie Fisher 123 Maple St., Hollywood F 9/9/99 Harrison Ford 789 Palm Dr., Beverly Hills M 7/7/77
Sample Operations R S name address gender birthdate Carrie Fisher 123 Maple St., Hollywood F 9/9/99 R S name address gender birthdate Carrie Fisher 123 Maple St., Hollywood F 9/9/99 Harrison Ford 789 Palm Dr., Beverly Hills M 7/7/77 Mark Hamil 456 Oak Rd., Brentwood M 8/8/88 R - S name address gender birthdate Mark Hamil 456 Oak Rd., Brentwood M 8/8/88
Natural Join Operator Natural join is a dyadic operator that is written as R lXl S where R and S are relations. The result of the natural join is the set of all combinations of tuples in R and S that are equal on their common attribute names. If r and s from r(R) and s(S) are successfully paired, result is called a joined tuple
Natural Join Example For an example, consider the tables Employee and Dept and their natural join: Employee Employee lXl Dept Dept
Example R S • Resulting schema has attributes from R, either R or S (i.e., joining attribute(s)), and S • Tuples that fail to pair with any tuple of the other relation are called dangling tuples A B B C D 1 2 2 5 6 join 3 4 4 7 8 9 10 11
Example R S A B B C D A B C D 1 2 2 5 6 1 2 5 6 join 3 4 4 7 8 3 4 7 8 9 10 11
Join Operations Theta Join (binary) • R joinC S, where C is an arbitrary join condition • Step 1: take the product of R and S • Step 2: Select from the product only those tuples that satisfy condition C • As with the product operation, the schema for the result is the union of the schemas of R and S
Example U V A B C B C D 1 2 3 2 3 4 joinA<D AND U.BV.B 6 7 8 2 3 5 9 7 8 7 8 10
Example U V A B C B C D 1 2 3 2 3 4 joinA<D AND U.BV.B 6 7 8 2 3 5 9 7 8 7 8 10 A U.B U.C V.B V.C D 1 2 3 7 8 10
Additional slides w/ other types of joins For more information of the various relational algebra operations, see: http://en.wikipedia.org/wiki/Relational_algebra
Semijoin Operator The semijoin is joining similar to the natural join and written as R⋉ S where R and S are relations. The result of the semijoin is only the set of all tuples in R for which there is a tuple in S that is equal on their common attribute names.
Semijoin Example For an example consider the tables Employee and Dept and their semi join: Employee Employee ⋉ Dept Dept
Outerjoin Operator Left outer join The left outer join is written as R =X S where R and S are relations. The result of the left outer join is the set of all combinations of tuples in R and S that are equal on their common attribute names, in addition to tuples in R that have no matching tuples in S. Right outer join The right outer join is written as R X= S where R and S are relations. The result of the right outer join is the set of all combinations of tuples in R and S that are equal on their common attribute names, in addition to tuples in S that have no matching tuples in R.
Left Outerjoin Example For an example consider the tables Employee and Dept and their left outer join: Employee Employee =X Dept Dept