930 likes | 1.11k Views
The Relational Algebra and Relational Calculus. Chapter 6. Chapter Outline. Relational Algebra Unary Relational Operations Relational Algebra Operations From Set Theory Binary Relational Operations Additional Relational Operations Examples of Queries in Relational Algebra
E N D
Chapter Outline • Relational Algebra • Unary Relational Operations • Relational Algebra Operations From Set Theory • Binary Relational Operations • Additional Relational Operations • Examples of Queries in Relational Algebra • Relating SQL and Relational Algebra • Relational Calculus (will be skipped) • Tuple Relational Calculus • Relating Tuple Relational Calculus to SQL • Domain Relational Calculus
Relational Algebra Overview • Relational algebra is the basic set of operations for the theoretical relational model presented in Chapter 3 • SQL (chapters 4 and 5) is the standard language for practicalrelational databases • Relational algebra operations are used in DBMS query optimization • SQL is converted to relational operations for optimization and processing (see chapter 19)
Relational Algebra Overview (cont.) • Input to a relational algebra operation is one or more relations • These operations can specify basic retrieval requests (or queries) • The result of an operation is a new relation, derived from the input relations • This property makes the algebra “closed” (all objects in relational algebra are relations)
Relational Algebra Overview (cont.) • A query will typically require multiple operations • Result of one operation can be further manipulated using additional operations of the same algebra • A sequence of relational algebra operations forms a relational algebra expression • Final result is a relation that represents the result of a database query (or retrieval request)
Relational Algebra Overview (cont.) • Relational Algebra consists of several groups of operations • Unary Relational Operations – one input table • SELECT (symbol: (sigma)) • PROJECT (symbol: (pi)) • RENAME (symbol: (rho)) • Relational Algebra Operations From Set Theory • UNION ( ), INTERSECTION ( ), DIFFERENCE (or MINUS, – ) • CARTESIAN PRODUCT ( x ) • Binary Relational Operations – two input tables • JOIN (several variations of JOIN exist) • DIVISION • Additional Relational Operations • OUTER JOINS, OUTER UNION • AGGREGATE FUNCTIONS (These compute summary of information: for example, SUM, COUNT, AVG, MIN, MAX) • Other operations
Database State for COMPANY • Examples in slides refer to the COMPANY database schema (Figure 3.7 (shown on next slide)) • Examples of results of relational operations refer to the database state in Figure 3.6 – shown after schema
Unary Relational Operations: SELECT • The SELECT operation (denoted by (sigma)) selects a subset of the tuples from a relation based on a selection condition. • The selection condition acts as a filter • Keeps only those tuples that satisfy the qualifying condition • Tuples satisfying the condition are selected whereas the other tuples are discarded (filtered out) • Important Note: The SELECT operation is different from the SELECT-clause of SQL (presented in Chapters 3 and 4) • Examples: • Select the EMPLOYEE tuples whose department number is 4: DNO = 4 (EMPLOYEE) • Select the employee tuples whose salary is greater than $30,000: SALARY > 30,000 (EMPLOYEE)
SELECT operation (cont.) • In general, the select operation is denoted by <selection condition> (R) where • the symbol (sigma) is used to denote the select operator • the selection condition is a Boolean (conditional) expression specified on the attributes of relation R • tuples that make the condition true are selected • appear in the result of the operation • tuples that make the condition false are filtered out • discarded from the result of the operation
SELECT operation (cont.) • SELECT Operation Properties • SELECT operation <selection condition>(R) produces a relation S that has the same schema (same attributes) as R • SELECT is commutative: • <condition1>(< condition2> (R)) = <condition2> (< condition1> (R)) • Because of commutativity property, a cascade (sequence) of SELECT operations may be applied in any order: • <cond1>(<cond2> (<cond3> (R)) = <cond2> (<cond3> (<cond1> ( R))) • A cascade of SELECT operations may be replaced by a single selection with a conjunction of all the conditions: • <cond1>(< cond2> (<cond3>(R)) = <cond1> AND < cond2> AND < cond3>(R))) • The number of tuples in the result of a SELECT is less than (or equal to) the number of tuples in the input relation R
Unary Relational Operations: PROJECT • PROJECT Operation is denoted by (pi symbol) • Keeps certain columns (attributes) from a relation and discards the other columns. • PROJECT creates a vertical partitioning • The list of specified columns (attributes) is kept in each tuple • The other attributes in each tuple are discarded • Example: List each employee’s first and last name and salary: LNAME, FNAME,SALARY(EMPLOYEE)
PROJECT operation (cont.) • General form of project operation is: <attribute list> (R) • (pi) is the symbol used to represent the project operation • <attribute list> is the desired list of attributes from relation R. • The operation removes any duplicate tuples • This is because the result of the project operation must be a set of tuples • Mathematical sets do not allow duplicate elements.
PROJECT operation (cont.) • PROJECT Operation Properties • Number of tuples in result of projection <list>(R) less or equal to the number of tuples in R • If the list of projected attributes includes a key of R, then number of tuples in result of PROJECT is equal to the number of tuples in R • PROJECT is not commutative • <list1> (<list2> (R) ) = <list1> (R) as long as <list2> contains the attributes in <list1>
Example • Next slide (Figure 6.1) shows the result of some SELECT and PROJECT operations when applied to the database state in Figure 3.6
Relational Algebra Expressions • To apply several relational algebra operations • Either we write the operations as a single relational algebra expression by nesting the operations in parentheses, or • We apply one operation at a time and create intermediate result relations. • In the latter case, we must give names to the relations that hold the intermediate results.
Single expression versus sequence of relational operations (Example) • To retrieve the first name, last name, and salary of all employees who work in department number 5, we must apply a select and a project operation • Single relational algebra expression: • FNAME, LNAME, SALARY(DNO=5(EMPLOYEE)) • OR sequence of operations, giving a name to each intermediate relation: • DEP5_EMPS DNO=5(EMPLOYEE) • RESULT FNAME, LNAME, SALARY (DEP5_EMPS)
Relational SELECT and PROJECT Operations in SQL • The SELECT-clause of SQL lists the projection attributes, and the WHERE_clause includes the selection conditions • Example: The relational algebra expression: • FNAME, LNAME, SALARY(DNO=5(EMPLOYEE)) • Corresponds to the SQL query: SELECT FNAME, LNAME, SALARY FROM EMPLOYEE WHERE DNO=5 ;
Unary Relational Operations: RENAME • The RENAME operator is denoted by (rho symbol) • Used to rename the attributes of a relation or the relation name or both • Useful when a query requires multiple operations • Necessary in some cases (see JOIN operation later)
RENAME operation (cont.) • The general RENAME operation can be expressed by any of the following forms: • S (B1, B2, …, Bn )(R) changes both: • the relation name to S, and • the column (attribute) names to B1, B1, …, Bn • S(R) changes: • the relation name only to S • (B1, B2, …, Bn )(R) changes: • the column (attribute) names only to B1, B1, …..Bn
RENAME operation (cont.) • For convenience, we also use shorthand for renaming an intermediate relation: • If we write: • R(FN,LN,SAL) FNAME,LNAME,SALARY (DEP5_EMPS) • R will rename FNAME to FN, LNAME to LN, and SALARY to SAL • If we write: • R(F,M,L,S,B,A,SX,SAL,SU,D)EMPLOYEE • Then EMPLOYEE is renamed to R and the 10 attributes of EMPLOYEE are renamed to F, M, L, S, B, A, SX, SAL, SU, D, respectively
Example • Next slide (Figure 6.2) shows the result of the following two expressions when applied to the database state in Figure 3.6 • Single relational algebra expression (Fig 6.2(a)): FN, LN, SAL(FNAME, LNAME, SALARY(DNO=5(EMPLOYEE))) • Sequence of operations, giving a name to each intermediate relation (Fig 6.2(b)): • TEMP DNO=5(EMPLOYEE) • R(First_name, Last_name, Salary) FNAME, LNAME, SALARY (TEMP)
Relational RENAME Operation in SQL • In SQL, RENAME is achieved by using AS • Example: The relational algebra expression: FN, LN, SAL(FNAME, LNAME, SALARY(DNO=5(EMPLOYEE))) • Corresponds to the SQL query: SELECT FNAME AS FN, LNAME AS LN, SALARY AS SAL FROM EMPLOYEE WHERE DNO=5 ;
Relational Algebra Operations fromSet Theory: UNION • UNION Operation • Binary operation, denoted by • The result of R S, is a relation that includes all tuples that are either in R or in S or in both R and S • Duplicate tuples are eliminated • The two operand relations R and S must be “type compatible” (or UNION compatible) • R and S must have same number of attributes • Each pair of corresponding attributes must be type compatible (have same or compatible domains)
UNION operation (cont.) • Example: • To retrieve the social security numbers of all employees who either work in department 5 (RESULT1 below) or directly supervise an employee who works in department 5 (RESULT2 below) • We can use the UNION operation as follows: DEP5_EMPS DNO=5 (EMPLOYEE) RESULT1 SSN(DEP5_EMPS) RESULT2(SSN) SUPERSSN(DEP5_EMPS) RESULT RESULT1 RESULT2 • The union operation produces the tuples that are in either RESULT1 or RESULT2 or both (see Fig 6.3)
Type Compatibility • Type Compatibility of operands is required for the binary set operation UNION , (also for INTERSECTION , and SET DIFFERENCE –, see next slides) • R1(A1, A2, ..., An) and R2(B1, B2, ..., Bn) are type compatible if: • they have the same number of attributes, and • the domains of corresponding attributes are type compatible (i.e. dom(Ai)=dom(Bi) for i=1, 2, ..., n). • The resulting relation for R1R2 (also for R1R2, or R1–R2, see next slides) has the same attribute names as the first operand relation R1 (by convention)
Relational Algebra Operations from Set Theory: INTERSECTION • INTERSECTION is denoted by • The result of the operation R S, is a relation that includes all tuples that are in both R and S • The attribute names in the result will be the same as the attribute names in R • The two operand relations R and S must be “type compatible”
Relational Algebra Operations from Set Theory: SET DIFFERENCE • SET DIFFERENCE (also called MINUS or EXCEPT) is denoted by – • The result of R – S, is a relation that includes all tuples that are in R but not in S • The attribute names in the result will be the same as the attribute names in R • The two operand relations R and S must be “type compatible”
Examples, Set Operations in SQL • Figure 6.4 (next slide) shows some examples of set operations on relations • SQL has UNION, INTERSECT, EXCEPT operations (see Chapter 4) • These operations work with sets of tuples (duplicate tuples are eliminated) • In addition, SQL has UNION ALL, INTERSECT ALL, EXCEPT ALL for multisets (duplicates are allowed)
Some properties of UNION, INTERSECT, and DIFFERENCE • Both union and intersection are commutative operations; that is • R S = S R, and R S = S R • Both union and intersection can be treated as n-ary operations applicable to any number of relations as both are associative operations; that is • R (S T) = (R S) T • (R S) T = R (S T) • The minus operation is not commutative; that is, in general • R – S ≠ S – R
Relational Algebra Operations from Set Theory: CARTESIAN PRODUCT • CARTESIAN (or CROSS) PRODUCT Operation • Different from the other three set operations • Used to combine tuples from two relations in a combinatorial fashion. • Denoted by R(A1, A2, . . ., An) x S(B1, B2, . . ., Bm) • Result is a relation Q with degree n + m attributes: • Q(A1, A2, . . ., An, B1, B2, . . ., Bm), in that order. • The resulting relation state has one tuple for each combination of tuples—one from R and one from S. • Hence, if R has nR tuples (denoted as |R| = nR ), and S has nS tuples, then R x S will have nR * nS tuples. • R and S do NOT have to be "type compatible”
CARTESIAN PRODUCT operation (cont.) • Generally, CROSS PRODUCT is not a meaningful operation • Can become meaningful when followed by other operations • Example (not meaningful): • FEMALE_EMPS SEX=’F’(EMPLOYEE) • EMPNAMES FNAME, LNAME, SSN (FEMALE_EMPS) • EMP_DEPENDENTS EMPNAMES x DEPENDENT • EMP_DEPENDENTS will contain every combination of tuples from EMPNAMES and DEPENDENT • whether or not the tuples are actually related
CARTESIAN PRODUCT operation (cont.) • To keep only combinations where the DEPENDENT is related to the EMPLOYEE by the condition ESSN=SSN, we add a SELECT operation • Example (meaningful, see Figure 6.5, next two slides): • FEMALE_EMPS SEX=’F’(EMPLOYEE) • EMPNAMES FNAME, LNAME, SSN (FEMALE_EMPS) • EMP_DEPENDENTS EMPNAMES x DEPENDENT • ACTUAL_DEPS SSN=ESSN(EMP_DEPENDENTS) • RESULT FNAME, LNAME, DEPENDENT_NAME (ACTUAL_DEPS) • RESULT will now contain the name of female employees and their dependents
CARTESIAN PRODUCT Operation in SQL • By leaving out the WHERE-clause (Chapter 4) SELECT *FROM EMPLOYEE, DEPARTMENT ; • OR, by using CROSS JOIN in joined tables (Chapter 5) SELECT * FROM (DEPARTMENT CROSS JOIN EMPLOYEE) ; • In both examples, every department-employee record combination appears in the result (whether or not the employee works for the department)
Binary Relational Operations:JOIN • The JOIN Operation • Denoted by • Combine related tuples from two relations into single “longer” tuples • General join condition of the form <condition> AND <condition> AND...AND <condition> • Example:
Binary Relational Operations:JOIN (cont’d.) • THETA JOIN • Each <condition> of the form Ai θ Bj • Ai is an attribute of R • Bj is an attribute of S • Ai and Bj have the same domain • θ (theta) is one of the comparison operators: • {=, <, ≤, >, ≥, ≠}
Variations of JOIN: EQUIJOIN • EQUIJOIN Operation is the most common use of join; involves join conditions with equality comparisons only (Ai = Bj) • The result of an EQUIJOIN will always have one or more pairs of attributes (whose names need not be identical) that have identical values in every tuple. • The JOIN in the previous example was an EQUIJOIN; every tuple in the result will have SSN=MGRSSN