710 likes | 1.46k Views
Lecture 3. CS157B. RELATIONAL ALGEBRA and Tuple Calculus. Prof. Sin-Min LEE Department of Computer Science. Codd’s Relational Algebra. A set of mathematical operators that compose, modify, and combine tuples within different relations
E N D
Lecture 3 CS157B RELATIONAL ALGEBRA and Tuple Calculus Prof. Sin-Min LEE Department of Computer Science
Codd’s Relational Algebra • A set of mathematical operators that compose, modify, and combine tuples within different relations • Relational algebra operations operate on relations and produce relations (“closure”) f: Relation Relation f: Relation x Relation Relation
A Set of Logical Operations: The Relational Algebra • Six basic operations: • Projection (R) • Selection (R) • Union R1UR2 • Difference R1 – R2 • Product R1X R2 • (Rename) ->b (R) • And some other useful ones: • Join R1 ⋈ R2 • Semijoin R1 ⊲ R2 • Intersection R1Å R2 • Division R1¥ R2
Data Instance for Operator Examples STUDENT COURSE Takes PROFESSOR Teaches
Rename, ab • The rename operator can be expressed several ways: • The book has a very odd definition that’s not algebraic • An alternate definition: ab(x) Takes the relation with schema Returns a relation with the attribute list • Rename isn’t all that useful, except if you join a relation with itself Why would it be useful here?
Deriving Intersection Intersection: as with set operations, derivable from difference A Å B ≡(A [ B) – (A – B) – (B – A) ≡ A – (A – B) A-B B-A A B
Division • A somewhat messy operation that can be expressed in terms of the operations we have already defined • Used to express queries such as “The fid's of faculty who have taught all subjects” • Paraphrased: “The fid’s of professors for which there does not exist a subject that they haven’t taught”
Division Using Our Existing Operators • All possible teaching assignments: Allpairs: • NotTaught, all (fid,subj) pairs for which professor fidhas not taught subj: • Answer is all faculty not in NotTaught: fid,subj (PROFESSOR xsubj(COURSE)) Allpairs - fid,subj(Teaches ⋈ COURSE) fid(PROFESSOR) - fid(NotTaught) ´ fid(PROFESSOR) - fid( fid,subj (PROFESSOR xsubj(COURSE)) - fid,subj(Teaches ⋈ COURSE))
Division: R1¸ R2 • Requirement: schema(R1) ÷schema(R2) • Result schema: schema(R1) – schema(R2) • “Professors who have taught all courses”: • What about “Courses that have been taught by all faculty”? fid (fid,subj(Teaches ⋈ COURSE) ¸subj(COURSE))
DIVISION - The division operator is used for queries which involve the ‘all’ qualifier such as “Find the names of sailors who have reserved all boats”. - The division operator is a bit tricky to explain, and perhaps best approached through examples as will be done here. • Cartesian Product (R1 ×R2) combines two relations by concatenating their tuples together, evaluating all possible combinations. If the name of a column is identical for two relations, this ambiguity is resolved by attaching the name of each relation to a column. e.g., Emp × Dept • (SS#, name, age, salary, Emp.dno, Dept.dno, dname, floor, mgrSS#) • If t(Emp) and t(Dept) is the cardinality of the Employee and Dept relations respectively, then the cardinality of Emp × Dept is: t(Emp) × t(Dept)
DIVISION Interpretation of the division operation A/B: - Divide the attributes of A into 2 sets: A1 and A2. - Divide the attributes of B into 2 sets: B2 and B3. - Where the sets A2 and B2 have the same attributes. - For each set of values in B2: - Search in A2 for the sets of rows (having the same A1 values) whose A2 values (taken together) form a set which is the same as the set of B2’s. - For all the set of rows in A which satisfy the above search, pick out their A1 values and put them in the answer.
DIVISION Example: Find the names of sailors who have reserved all boats: (1) A = sid,bid(Reserves). A1 = sid(Reserves) A2 = bid(Reserves) (2) B2 = bid(Boats) B3 is the rest of B. Thus, B2 ={101, 102, 103, 104} (3) Find the rows of A such that their A.sid is the same and their combined A.bid is the set B2. Thus we find A1 = {22} (4) Get the set of A2 corresponding to A1: A2 = {Dustin}
FORMAL DEFINITION OF DIVISION The formal definition of division is as follows: A/B = x(A) - x((x(A) B) – A)
CARTESIAN PRODUCT (Cont…) • Example: Emp table: Dept table: SS# Name age salary dno dno dname floor mgrSS#
CARTESIAN PRODUCT (Cont…) SS# Name age salary Emp.dno Dept.dno dname floor mgrSS# • Cartesian product of Emp and Dept: Emp × Dept:
CARTESIAN PRODUCT • Example: retrieve the name of employees that work in the toy department:
CARTESIAN PRODUCT • Example: retrieve the name of employees that work in the toy department: • ∏name(бEmp.dno=Dept.dno(Emp × бdname=‘toy’(Dept)))
CARTESIAN PRODUCT (Cont…) • ∏name(бdname=‘toy’(бEmp.dno=Dept.dno(Emp × Dept))) SS# Name age salary Emp.dno Dept.dno dname floor mgrSS#
CARTESIAN PRODUCT (Cont…) • ∏name(бdname=‘toy’(б Emp.dno=Dept.dno(Emp × Dept))) SS# Name age salary Emp.dno Dept.dno dname floor mgrSS#
CARTESIAN PRODUCT (Cont…) • ∏name(бdname=‘toy’(б Emp.dno=Dept.dno(Emp × Dept))) SS# Name age salary Emp.dno Dept.dno dname floor mgrSS#
CARTESIAN PRODUCT (Cont…) • ∏name(бdname=‘toy’(б Emp.dno=Dept.dno(Emp × Dept))) Name
EQUALITY JOIN, NATURAL JOIN, JOIN, SEMI-JOIN • Equality join connects tuples from two relations that match on certain attributes. The specified joining columns are kept in the resulting relation. • ∏name(бdname=‘toy’(Emp Dept))) • Natural join connects tuples from two relations that match on the specified common attributes • ∏name(бdname=‘toy’(Emp Dept))) • How is an equality join between Emp and Dept using dno different than a natural join between Emp and Dept using dno? • Equality join: SS#, name, age, salary, Emp.dno, Dept.dno, … • Natural join: SS#, name, age, salary, dno, dname, … • Join is similar to equality join using different comparison operators • A S op = {=, ≠, ≤, ≥, <, >} att op att (dno) (dno)
EXAMPLE JOIN • Equality Join, (Emp Dept))) Dept EMP (dno)
EXAMPLE JOIN • Natural Join, (Emp Dept))) Dept EMP (dno)
EXAMPLE JOIN • Join, (Emp ρx(Emp)))) Dept EMP Salary > 5 * salary
EQUALITY JOIN, NATURAL JOIN, JOIN, SEMI-JOIN (Cont…) • Example: retrieve the name of employees who earn more than Joe: • ∏name(Emp (sal>x.sal)бname=‘Joe’(ρ x(Emp))) • Semi-Join selects the columns of one relation that joins with another. It is equivalent to a join followed by a projection: • Emp (dno)Dept ≡∏SS#, name, age, salary, dno(Emp Dept)
JOIN OPERATORS Condition Joins: - Defined as a cross-product followed by a selection: R ⋈cS = σc(R S) (⋈ is called the bow-tie) where c is the condition. - Example: Given the sample relational instances S1 and R1 The condition join S ⋈S1.sid<R1.sid R1 yields
JOIN OPERATORS Condition Joins: - Defined as a cross-product followed by a selection: R ⋈cS = σc(R S) (⋈ is called the bow-tie) where c is the condition. - Example: Given the sample relational instances S1 and R1 The condition join S ⋈S1.sid<R1.sid R1 yields
Equijoin: Special case of the condition join where the join condition consists solely of equalities between two fields in R and S connected by the logical AND operator (∧). Example: Given the two sample relational instances S1 and R1 The operator S1 R.sid=Ssid R1 yields
Natural Join - Special case of equijoin where equalities are implicitly specified on all fields having the same name in R and S. - The condition c is now left out, so that the “bow tie” operator by itself signifies a natural join. - N. B. If the two relations have no attributes in common, the natural join is simply the cross-product.
Selection: TR[R] • Suppose we have (e’), where e’ is another RA expression that translates as: TR[e’]= {<x1,x2, …, xn>| p} • Then the translation of c(e’) is {<x1,x2, …, xn>| p’}where ’ is obtained from by replacing each attribute with the corresponding variable • Example: TR[#1=#2 #4>2.5R] (if R has arity 4) is {<x1,x2, x3, x4>| < x1,x2, x3, x4> R x1=x2 x4>2.5}