Chapter 7

kalb

Presentation Transcript


  1. Chapter 7 Relational Algebra

  2. Topics in this Chapter • Closure Revisited • The Original Algebra: Syntax and Semantics • What is the Algebra For? • Further Points and Additional Operators • Grouping and Ungrouping

  3. Relational Algebra • The relational algebra is a collection of operators that take relations as their operands and return a relation as their result • Eight operators, in two groups of four • Union, intersect, difference, Cartesian product • Restrict, project, join, divide • The set of possible relational operators is essentially unlimited • The operators are “read only”

  4. Fig. 7.1 The original eight operators (overview)

  5. Closure Revisited • The output from any relational operator is another relation: the closure property • Relational expressions can be nested (analogously to arithmetic expressions) • Every relation has a heading and a body; relational algebra must address both • Attribute type inference must be supported • RENAME changes the name of an attribute without changing its type or content

  6. RENAME
     S RENAME CITY AS SCITY
     +------+-------+--------+--------+
     | S#   | SNAME | STATUS | SCITY  |
     +------+-------+--------+--------+
     | S1   | Smith |     20 | London |
     | S2   | Jones |     10 | Paris  |
     | S3   | Blake |     30 | Paris  |
     | S4   | Clark |     20 | London |
     | S5   | Adams |     30 | Athens |
     +------+-------+--------+--------+

  7. RENAME
     expression: (S RENAME CITY AS SCITY)
     value:
     +------+-------+--------+--------+
     | S#   | SNAME | STATUS | SCITY  |
     +------+-------+--------+--------+
     | S1   | Smith |     20 | London |
     | S2   | Jones |     10 | Paris  |
     | S3   | Blake |     30 | Paris  |
     | S4   | Clark |     20 | London |
     | S5   | Adams |     30 | Athens |
     +------+-------+--------+--------+
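
The RENAME slides above can be sketched in Python. This is only an illustration under stated assumptions: a relation is modelled as a frozenset of tuples, each tuple being a frozenset of (attribute, value) pairs, and the helper names `relation` and `rename` are hypothetical, not part of Tutorial D. Only two of the five suppliers are used as sample data.

```python
def relation(*rows):
    """Build a relation: a set of tuples, each a set of (attribute, value) pairs."""
    return frozenset(frozenset(row.items()) for row in rows)


def rename(r, old, new):
    """Return a relation like r with attribute `old` called `new`; values are untouched."""
    return frozenset(
        frozenset((new if attr == old else attr, value) for attr, value in t)
        for t in r
    )


# Hypothetical sample data: two of the suppliers from the slide.
S = relation(
    {"S#": "S1", "SNAME": "Smith", "STATUS": 20, "CITY": "London"},
    {"S#": "S2", "SNAME": "Jones", "STATUS": 10, "CITY": "Paris"},
)

# S RENAME CITY AS SCITY
renamed = rename(S, "CITY", "SCITY")
for t in sorted(renamed, key=lambda t: dict(t)["S#"]):
    print(dict(t))
```

Because `rename` builds a new value, `S` itself still has a CITY attribute afterwards, which mirrors the "read only" nature of the operators.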

  8. The Syntax of the Original Algebra The algebra exists because of the nature and definition of relations. The algebra is independent of its description. Note that the Date text defines the algebra using words rather than symbols. Most texts use symbols to describe the syntax of the algebra. Some who prefer that say that it “looks more scientific.” More about the symbols later.

  9. The Syntax of the Original Algebra
     BNF grammar for the relational algebra:
     • ::= means “is defined as”
     • < > indicate category names
     • | means “or”
     • […] indicates something optional
     • Upper-case words such as WHERE are elements of the language
     • { and } are symbols in the language, not BNF
     • “commalist” is used for repetition

  10. The Syntax of the Original Algebra • Each operator takes one or more relations as operands and returns a relation • The result is a new relation value derived from the operands; the operands themselves are not changed • Generically: <relation expression> ::= RELATION { <tuple expression commalist> }

  11. The Syntax of the Original Algebra – General Format
      <relation expression> ::= RELATION { <tuple expression commalist> }
                              | <relvar name>
                              | <relation operator invocation>
                              | <with expression>
                              | <introduced name>
                              | ( <relation expression> )

  12. <relation operator invocation> ::= <project> | <nonproject>
      <project> ::= <relation expression> { [ ALL BUT ] <attribute name commalist> }
        (The <relation expression> must not be a <nonproject>)
      <nonproject> ::= <rename> | <union> | <intersect> | <minus> | <times> | <where> | <join> | <divide>

  13. <rename> ::= <relation expression> RENAME <renaming commalist>
        (The <relation expression> must not be a <nonproject>)
      <union> ::= <relation expression> UNION <relation expression>
        (The <relation expression>s must not be <nonproject>s, except either or both can be another <union>)

  14. <intersect> ::= <relation expression> INTERSECT <relation expression>
        (The <relation expression>s must not be <nonproject>s, except either or both can be another <intersect>)
      <minus> ::= <relation expression> MINUS <relation expression>
        (The <relation expression>s must not be <nonproject>s)
      <times> ::= <relation expression> TIMES <relation expression>
        (The <relation expression>s must not be <nonproject>s, except either or both can be another <times>)

  15. <where> ::= <relation expression> WHERE <boolean expression>
        (The <relation expression> must not be a <nonproject>)
      <join> ::= <relation expression> JOIN <relation expression>
        (The <relation expression>s must not be <nonproject>s, except either or both can be another <join>)

  16. <divide> ::= <relation expression> DIVIDEBY <relation expression> PER <per>
        (The <relation expression>s must not be <nonproject>s)
      <per> ::= <relation expression> | ( <relation expression>, <relation expression> )
        (The <relation expression>s must not be <nonproject>s)
      <with expression> ::= WITH <name intro commalist> : <expression>
      <name intro> ::= <expression> AS <introduced name>

  17. Semantics of the Original Algebra –Union • Union operates on two sets and returns a set that contains all elements belonging to either • Both sets must be of the same type - formerly known as union compatibility • Relations cannot have duplicate tuples; we say loosely that UNION “eliminates duplicates”

  18. Semantics of the Original Algebra –Intersect and Difference • Intersect operates on two sets and returns a set that contains all tuples belonging to both • Difference operates on two sets and returns a set containing all tuples occurring in the first but not in the second, using MINUS; the order of the operands matters • For both Intersect and Difference, the sets operated upon must be of the same type - formerly known as union compatibility
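
A minimal Python sketch of UNION, INTERSECT and MINUS as described on the two slides above. The representation (frozensets of (attribute, value) pairs) and the helper names are assumptions made for illustration; the same-type check is simplified to comparing attribute names and is skipped when an operand is empty. The city values are sample data.

```python
def relation(*rows):
    """Build a relation: a set of tuples, each a set of (attribute, value) pairs."""
    return frozenset(frozenset(row.items()) for row in rows)


def heading(r):
    """Attribute names of a non-empty relation (simplification: types are ignored)."""
    return frozenset(attr for attr, _ in next(iter(r)))


def check_same_type(a, b):
    """Reject operands with different headings (the old 'union compatibility')."""
    if a and b and heading(a) != heading(b):
        raise TypeError("operands must be of the same relation type")


def union(a, b):
    check_same_type(a, b)
    return a | b  # a set cannot hold duplicates, so they are "eliminated"


def intersect(a, b):
    check_same_type(a, b)
    return a & b


def minus(a, b):
    check_same_type(a, b)
    return a - b  # tuples in a that are not in b; operand order matters


# Hypothetical sample data: supplier cities and part cities.
A = relation({"CITY": "London"}, {"CITY": "Paris"}, {"CITY": "Athens"})
B = relation({"CITY": "London"}, {"CITY": "Paris"}, {"CITY": "Oslo"})

print(len(union(A, B)), len(intersect(A, B)), len(minus(A, B)))
```

Swapping the operands of `minus` gives a different result, which is why difference is neither commutative nor associative.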

  19. Semantics of the Original Algebra –Cartesian Product • A Cartesian Product is the set of all ordered pairs such that in each pair, the first element comes from the first set, and the second element comes from the second set • However, since the result of a relational operator is a relation, the result of each pair is a single tuple containing all the elements of both of the source tuples • Uses keyword TIMES
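
The relational Cartesian product above can be sketched as follows. The representation and the names `relation` and `times` are illustrative assumptions; the check that the operands share no attribute names is a simplification of the rule that the result must be a well-formed relation.

```python
def relation(*rows):
    """Build a relation: a set of tuples, each a set of (attribute, value) pairs."""
    return frozenset(frozenset(row.items()) for row in rows)


def times(a, b):
    """Relational product: every tuple of a merged with every tuple of b."""
    for t1 in a:
        for t2 in b:
            if dict(t1).keys() & dict(t2).keys():
                raise ValueError("TIMES operands must have no attribute names in common")
    # Each result tuple is a single tuple holding the attributes of both
    # source tuples, not an ordered pair of tuples.
    return frozenset(t1 | t2 for t1 in a for t2 in b)


# Hypothetical sample data.
suppliers = relation({"S#": "S1"}, {"S#": "S2"})
parts = relation({"P#": "P1"}, {"P#": "P2"}, {"P#": "P3"})

product = times(suppliers, parts)
print(len(product))
```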

  20. Semantics of the Original Algebra –Restrict • Yields a horizontal subset – a/k/a “SELECT” • a WHERE p • p is called the restriction condition • p is a predicate, and returns boolean • If it can be evaluated by examining a single tuple it is simple; otherwise it is nonsimple
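
As a rough Python illustration of restriction with a simple condition (one that can be evaluated by looking at a single tuple), assuming the same set-of-pairs representation; `relation` and `restrict` are hypothetical helper names and the rows are sample data:

```python
def relation(*rows):
    """Build a relation: a set of tuples, each a set of (attribute, value) pairs."""
    return frozenset(frozenset(row.items()) for row in rows)


def restrict(r, predicate):
    """a WHERE p: keep the tuples for which the predicate returns True."""
    return frozenset(t for t in r if predicate(dict(t)))


# Hypothetical sample data.
S = relation(
    {"S#": "S1", "CITY": "London"},
    {"S#": "S2", "CITY": "Paris"},
    {"S#": "S4", "CITY": "London"},
)

# S WHERE CITY = 'London'
london = restrict(S, lambda t: t["CITY"] == "London")
print(sorted(dict(t)["S#"] for t in london))
```

The result has the same heading as the operand; only the number of tuples changes, hence "horizontal subset".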

  21. Semantics of the Original Algebra –Project • Yields a vertical subset • The general form is a commalist of attributes to be kept in the result • Every tuple of the operand contributes to the result, but tuples that become identical once attributes are dropped appear only once • An alternative specification is to name the attributes to be excluded: • P { ALL BUT WEIGHT }
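
A small Python sketch of projection, including the ALL BUT form. It assumes the set-of-pairs representation; `project` and `project_all_but` are hypothetical names and the part rows are sample data.

```python
def relation(*rows):
    """Build a relation: a set of tuples, each a set of (attribute, value) pairs."""
    return frozenset(frozenset(row.items()) for row in rows)


def project(r, attrs):
    """r { attrs }: keep only the named attributes."""
    return frozenset(frozenset((a, v) for a, v in t if a in attrs) for t in r)


def project_all_but(r, attrs):
    """r { ALL BUT attrs }: drop the named attributes, keep the rest."""
    return frozenset(frozenset((a, v) for a, v in t if a not in attrs) for t in r)


# Hypothetical sample data.
P = relation(
    {"P#": "P1", "COLOR": "Red", "WEIGHT": 12},
    {"P#": "P2", "COLOR": "Green", "WEIGHT": 17},
    {"P#": "P3", "COLOR": "Blue", "WEIGHT": 17},
    {"P#": "P4", "COLOR": "Red", "WEIGHT": 14},
)

colors = project(P, {"COLOR"})             # P { COLOR }
no_weight = project_all_but(P, {"WEIGHT"})  # P { ALL BUT WEIGHT }
print(len(colors), len(no_weight))
```

Four parts yield only three colors because the two Red tuples collapse into one; the result is a set, so no explicit duplicate removal step is needed.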

  22. Semantics of the Original Algebra –Join – Natural Join • When unqualified, join means “natural join” • For two relations with at least one attribute in common, the join operator returns one tuple for each pair of tuples that agree on the common attributes; each common attribute appears once in the result • Attributes that are not common are retained from both source relations • If the relations have no attributes in common, the result is a Cartesian product • If all attributes are common, the result is an Intersect
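
The natural join and its two degenerate cases can be sketched in Python. The representation and the helper names `relation` and `join` are assumptions for illustration, and the data is made up.

```python
def relation(*rows):
    """Build a relation: a set of tuples, each a set of (attribute, value) pairs."""
    return frozenset(frozenset(row.items()) for row in rows)


def join(a, b):
    """Natural join: merge tuple pairs that agree on all common attributes."""
    result = set()
    for t1 in a:
        d1 = dict(t1)
        for t2 in b:
            d2 = dict(t2)
            common = d1.keys() & d2.keys()
            if all(d1[k] == d2[k] for k in common):
                # Common attributes appear once; the others are carried over.
                result.add(frozenset({**d1, **d2}.items()))
    return frozenset(result)


# Hypothetical sample data.
S = relation({"S#": "S1", "CITY": "London"}, {"S#": "S2", "CITY": "Paris"})
SP = relation({"S#": "S1", "P#": "P1"}, {"S#": "S1", "P#": "P2"}, {"S#": "S2", "P#": "P1"})

s_join_sp = join(S, SP)  # one common attribute, S#

# No common attributes: behaves like TIMES.
as_product = join(relation({"S#": "S1"}), relation({"P#": "P1"}, {"P#": "P2"}))

# All attributes common: behaves like INTERSECT.
X = relation({"CITY": "London"}, {"CITY": "Paris"})
Y = relation({"CITY": "Paris"}, {"CITY": "Oslo"})
as_intersect = join(X, Y)

print(len(s_join_sp), len(as_product), len(as_intersect))
```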

  23. Semantics of the Original Algebra –Join – Theta Join • Used to join relations on a comparison between attribute values that need not be equality • Given relations a and b, with attribute X in a and attribute Y in b, this can be expressed as follows: • (a TIMES b) WHERE X theta Y • When theta is = (an equijoin), the result can be turned into that of the natural join by projecting away one of the two equal attributes and renaming the kept one
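
A sketch of the theta join as literally defined above, a product followed by a restriction. The function name `theta_join`, the attribute names SCITY and PCITY, and the rows are illustrative assumptions; the comparison is passed in as a function from Python's standard `operator` module.

```python
import operator


def relation(*rows):
    """Build a relation: a set of tuples, each a set of (attribute, value) pairs."""
    return frozenset(frozenset(row.items()) for row in rows)


def theta_join(a, b, x, theta, y):
    """(a TIMES b) WHERE x theta y; a and b must have no common attribute names."""
    product = frozenset(t1 | t2 for t1 in a for t2 in b)
    return frozenset(t for t in product if theta(dict(t)[x], dict(t)[y]))


# Hypothetical sample data, with CITY already renamed on each side.
S = relation({"S#": "S1", "SCITY": "London"}, {"S#": "S2", "SCITY": "Paris"})
P = relation({"P#": "P1", "PCITY": "London"}, {"P#": "P3", "PCITY": "Oslo"})

# Supplier city follows part city in alphabetical order.
greater = theta_join(S, P, "SCITY", operator.gt, "PCITY")

# With equality the result still carries both SCITY and PCITY.
equal = theta_join(S, P, "SCITY", operator.eq, "PCITY")

print(len(greater), len(equal))
```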

  24. Semantics of the Original Algebra –Divide • Used to “divide one relation into another” • Small Divide uses one relation expression as divisor, Great Divide uses two • For small divide: • a DIVIDEBY b PER c • where a is the dividend, b is the divisor, and c is the mediator • Used to determine which tuples of a are related, through c, to every tuple of b

  25. Semantics of the Original Algebra –Divide - Example • Let S be a relation of suppliers, P one of parts, and SP the mediator • S JOIN ( S {S#} DIVIDEBY P {P#} PER SP {S#, P#} ) • Returns only those suppliers who supply all parts
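
The small divide can be sketched in a few lines of Python. This assumes the dividend and divisor have no attribute names in common and that the mediator's heading is the union of theirs; the name `divide` and the rows are hypothetical.

```python
def relation(*rows):
    """Build a relation: a set of tuples, each a set of (attribute, value) pairs."""
    return frozenset(frozenset(row.items()) for row in rows)


def divide(a, b, c):
    """a DIVIDEBY b PER c: tuples of a paired in c with every tuple of b."""
    return frozenset(t for t in a if all((t | u) in c for u in b))


# Hypothetical sample data: S {S#}, P {P#} and SP {S#, P#}.
S_ids = relation({"S#": "S1"}, {"S#": "S2"}, {"S#": "S3"})
P_ids = relation({"P#": "P1"}, {"P#": "P2"})
SP = relation(
    {"S#": "S1", "P#": "P1"},
    {"S#": "S1", "P#": "P2"},
    {"S#": "S2", "P#": "P1"},
)

supplies_all = divide(S_ids, P_ids, SP)
print(sorted(dict(t)["S#"] for t in supplies_all))
```

Only S1 appears with both parts in SP, so only S1 survives. With an empty divisor every tuple of the dividend qualifies, since "every tuple of b" is then vacuously satisfied.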

  26. Examples
      Get supplier names for suppliers who supply part P2.
      In SQL:
        SELECT SNAME FROM S
        WHERE S# IN (SELECT S# FROM SP WHERE P# = 'P2');
      In relational algebra:
        ( ( SP JOIN S ) WHERE P# = P# ('P2') ) { SNAME }

  27. Get supplier names for suppliers who supply at least one red part.
        SELECT SNAME FROM S
        WHERE S# IN (SELECT S# FROM SP
                     WHERE P# IN (SELECT P# FROM P WHERE COLOR = 'RED') );
        ( ( ( P WHERE COLOR = COLOR ('RED') ) { P# } JOIN SP ) { S# } JOIN S ) { SNAME }

  28. Get supplier names for suppliers who do not supply part P2.
        SELECT SNAME FROM S
        WHERE NOT EXISTS ( SELECT S# FROM SP WHERE S# = S.S# AND P# = 'P2' );
        ( ( S { S# } MINUS ( SP WHERE P# = 'P2' ) { S# } ) JOIN S ) { SNAME }

  29. Get all pairs of supplier numbers where the two suppliers are located in the same city.
        SELECT FIRST.S#, SECOND.S# FROM S FIRST, S SECOND
        WHERE FIRST.CITY = SECOND.CITY AND FIRST.S# < SECOND.S#;
        ( ( ( S RENAME S# AS FIRSTS# ) { FIRSTS#, CITY }
            JOIN ( S RENAME S# AS SECONDS# ) { SECONDS#, CITY } )
          WHERE FIRSTS# < SECONDS# ) { FIRSTS#, SECONDS# }

  33. Fig. 7.8 Division Examples

  34. The “Symbolic” Form
      Names of suppliers located in Paris:
        π SNAME ( σ CITY = 'Paris' ( S ) )
        ( S WHERE CITY = CITY ('Paris') ) { SNAME }
      Names of suppliers of part 'P2':
        π SNAME ( σ P# = 'P2' ( S ⋈ SP ) )
        ( ( S JOIN SP ) WHERE P# = 'P2' ) { SNAME }

  35. What is the Algebra for? • The purpose of the algebra is to allow the writing of relational expressions • Applications of the algebra: defining the scope of retrieval and update, integrity constraints, derived relvars, stability requirements, and security constraints • An implemented language can be said to be relationally complete if it is at least as powerful as the algebra

  36. The Original Algebra • Many operators are associative: Union, intersect, times, join, but not minus • Many operators are commutative: Union, intersect, times, join, but not minus • Join, union, intersect were originally defined as dyadic, but are now seen to operate on any number of relations, including DEE and DUM

  37. Additional Relational Operators • Semijoin returns the tuples of one relation that have a counterpart in another (suppliers who supply a specific part, for example) • Semidifference is similar but returns the tuples with no counterpart (suppliers who do not supply a particular part, e.g.) • Extend adds an attribute dynamically, but does not alter the underlying relvar • Summarize performs vertical or attribute-wise computations
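
Extend and Summarize from the list above can be sketched as follows. Everything here is an assumption made for illustration: the set-of-pairs representation, the function names and signatures, the attribute names GMWT and TOTQTY, the pounds-to-grams factor of 454, and the sample rows. The summarize sketch takes a separate "per" relation and an aggregate function such as `sum`.

```python
def relation(*rows):
    """Build a relation: a set of tuples, each a set of (attribute, value) pairs."""
    return frozenset(frozenset(row.items()) for row in rows)


def extend(r, new_attr, compute):
    """Add an attribute computed from each tuple; r itself is not modified."""
    return frozenset(t | {(new_attr, compute(dict(t)))} for t in r)


def summarize(a, per, new_attr, aggregate, attr):
    """For each tuple of `per`, aggregate `attr` over the matching tuples of a."""
    result = set()
    for p in per:
        values = [dict(t)[attr] for t in a if p <= t]  # p's pairs all occur in t
        result.add(p | {(new_attr, aggregate(values))})
    return frozenset(result)


# Hypothetical sample data.
P = relation({"P#": "P1", "WEIGHT": 12}, {"P#": "P2", "WEIGHT": 17})
SP = relation(
    {"S#": "S1", "P#": "P1", "QTY": 300},
    {"S#": "S1", "P#": "P2", "QTY": 200},
    {"S#": "S2", "P#": "P1", "QTY": 300},
)
S_ids = relation({"S#": "S1"}, {"S#": "S2"}, {"S#": "S3"})

extended = extend(P, "GMWT", lambda t: t["WEIGHT"] * 454)
totals = summarize(SP, S_ids, "TOTQTY", sum, "QTY")
print(len(extended), len(totals))
```

Because the per relation lists S3 even though S3 has no shipments, the summary contains a tuple for S3 with a total of 0 rather than omitting it.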

  38. Semijoin A SEMIJOIN B is equivalent to: (A JOIN B) { X, Y }, where X and Y stand for the attributes of A. That is, the JOIN of A and B projected over the attributes of A: the tuples of A that have “counterparts” in B.
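
A Python sketch of semijoin, with semidifference defined in terms of it. The helper names (`matches`, `semijoin`, `semidifference`) and the rows are illustrative assumptions. Rather than joining and then projecting, the sketch keeps each tuple of A that agrees with some tuple of B on their common attributes, which yields the same set of tuples.

```python
def relation(*rows):
    """Build a relation: a set of tuples, each a set of (attribute, value) pairs."""
    return frozenset(frozenset(row.items()) for row in rows)


def matches(t1, t2):
    """True if two tuples agree on every attribute name they share."""
    d1, d2 = dict(t1), dict(t2)
    return all(d1[k] == d2[k] for k in d1.keys() & d2.keys())


def semijoin(a, b):
    """Tuples of a that have at least one counterpart in b."""
    return frozenset(t for t in a if any(matches(t, u) for u in b))


def semidifference(a, b):
    """Tuples of a that have no counterpart in b."""
    return a - semijoin(a, b)


# Hypothetical sample data: suppliers, and shipments already restricted to part P2.
S = relation(
    {"S#": "S1", "SNAME": "Smith"},
    {"S#": "S2", "SNAME": "Jones"},
    {"S#": "S3", "SNAME": "Blake"},
)
SP_P2 = relation({"S#": "S1", "P#": "P2"}, {"S#": "S2", "P#": "P2"})

supply_p2 = semijoin(S, SP_P2)
do_not_supply_p2 = semidifference(S, SP_P2)
print(len(supply_p2), len(do_not_supply_p2))
```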

  39. Grouping… • Required because relations can have attributes that are themselves relations • Provides a map between such relations and “flat” relations • SP GROUP {P#, QTY} AS PQ • Will return quantities of parts by supplier, which is the unnamed co-conspirator

  40. …and Ungrouping • Returns the original relation • In the example, the original SP relation • If you group, you can always ungroup, but the converse is not necessarily true • This occurs when the relations being ungrouped were not validly grouped in the first place
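
Grouping and ungrouping can be sketched in Python, assuming the set-of-pairs representation in which an attribute value may itself be a relation. The names `group` and `ungroup`, their signatures, and the rows are hypothetical. The last part builds a relation that was not produced by grouping, to show that ungrouping and then regrouping need not give it back.

```python
def relation(*rows):
    """Build a relation: a set of tuples, each a set of (attribute, value) pairs."""
    return frozenset(frozenset(row.items()) for row in rows)


def group(r, attrs, new_attr):
    """r GROUP { attrs } AS new_attr: fold attrs into a relation-valued attribute."""
    buckets = {}
    for t in r:
        key = frozenset((a, v) for a, v in t if a not in attrs)
        inner = frozenset((a, v) for a, v in t if a in attrs)
        buckets.setdefault(key, set()).add(inner)
    return frozenset(key | {(new_attr, frozenset(inner))} for key, inner in buckets.items())


def ungroup(r, rva):
    """r UNGROUP rva: flatten the relation-valued attribute back into plain tuples."""
    result = set()
    for t in r:
        d = dict(t)
        nested = d.pop(rva)
        rest = frozenset(d.items())
        for inner in nested:
            result.add(rest | inner)
    return frozenset(result)


# Hypothetical sample data.
SP = relation(
    {"S#": "S1", "P#": "P1", "QTY": 300},
    {"S#": "S1", "P#": "P2", "QTY": 200},
    {"S#": "S2", "P#": "P1", "QTY": 300},
)

grouped = group(SP, {"P#", "QTY"}, "PQ")  # SP GROUP {P#, QTY} AS PQ
round_trip = ungroup(grouped, "PQ")

# Not the result of any grouping: S1 appears twice with different PQ values.
odd = relation(
    {"S#": "S1", "PQ": relation({"P#": "P1", "QTY": 300})},
    {"S#": "S1", "PQ": relation({"P#": "P2", "QTY": 200})},
)
regrouped = group(ungroup(odd, "PQ"), {"P#", "QTY"}, "PQ")
print(len(grouped), round_trip == SP, regrouped == odd)
```

Grouping SP gives one tuple per supplier, and ungrouping restores SP exactly. The relation `odd` ungroups without trouble, but regrouping merges its two S1 tuples into one, so the original is not recovered.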
