1 / 34

Lecture 4: Relational algebra

Learn about the core relational algebra, how to write queries using relational algebra, and the power of relational algebra. Explore formal relational query languages and understand their role in SQL query processing.

kcalvin
Download Presentation

Lecture 4: Relational algebra

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Lecture 4:Relational algebra www.cl.cam.ac.uk/Teaching/current/Databases/

  2. Today’s lecture • What’s the (core) relational algebra? • How can we write queries using the relational algebra? • How powerful is the relational algebra?

  3. Relational query languages • Query languages allow the manipulation and retrieval of data from a database • The relational model supports simple, powerful query languages • Strong formal foundation • Allows for much (provably correct) optimisation • NOTE: Query languages are not (necessarily) programming languages

  4. Formal relational query languages • Two formal query languages • Relational Algebra • Simple ‘operational’ model, useful for expressing execution plans • RelationalCalculus • Logical model (‘declarative’), useful for theoretical results • Both languages were introduced by Codd in a series of papers • They have equivalent expressive power They are the key to understanding SQL query processing!

  5. Preliminaries • A query is applied to relation instances, and the result of a query is also a relation instance • Schema of relations are fixed (cf. types) • The query will then execute over any valid instance • The schema of the result can also be determined

  6. Example relation instances • A database of boats, sailors, and reservations S2 R1 S1 B1

  7. Core relational algebra • Five basic operator classes: • Selection • Selects a subset of rows • Projection • Picking certain columns • Renaming • Renaming attributes • Set theoretic operations • The familiar operations: union, intersection, difference, … • Products and joins • Combining relations in useful ways

  8. Selection • Selects rows that satisfy a condition, written R1 = c(R2) • where c is a condition involving the attributes of R2, e.g. rating>8(S2) returns the relation instance

  9. Selection cont. • Note: • The schema of the result is exactly the same as the schema of the input relation instance • There are no duplicates in the resulting relation instance (why?) • The resulting relation instance can be used as the input for another relational algebra operator, e.g. sname=“Julia”(rating>8(S2))

  10. Projection Deletes fields that are not in the projection list R1=A(R2) where A is a list of attributes from the schema of R2, e.g. sname,rating(S2) returns the relation instance

  11. Projection cont. • Note: • Projection operator has to eliminate duplicates (why?) • Aside: Real systems don’t normally perform duplicate elimination unless the user explicitly asks for it (why not?)

  12. Renaming R1= A:=B(R2) • Returns a relation instance identical to R2 except that field A is renamed B • For example, sname:=nom(S1)

  13. Familiar set operations • We have the familiar set-theoretic operators, e.g. , , - • There is a restriction on their input relation instances: they must be union compatible • Same number of fields • Same field names and domains • E.g. S1S2 is valid, but S1R1 is not

  14. Cartesian products AB • Concatenate every row of A with every row of B • What do we do if A and B have some field names in common? • Several choices, but we’ll simply assume that the resulting duplicate field names will have the suffix 1 and 2

  15. Note! Example S1R1

  16. Theta join • Theoretically, it is a derived operator R1Vc R2@c(R1R2) • E.g., S1 Vsid.1<=sid.2R1

  17. Theta join cont. • The result schema is the same as for a cross-product • Sometimes this operator is called a conditional join • Most commonly the condition is an equality on field names, e.g. S1 Vsid.1=sid.2R1

  18. Equi- and natural join • Equi-join is a special case of theta join where the condition is equality of field names, e.g. S1 Vsid R1 • Natural join is an equi-join on all common fields where the duplicate fields are removed. It is written simply A V B

  19. Natural join cont. • Note that the common fields appear only once in the resulting relation instance • This operator appears very frequently in real-life queries • It is always implemented directly by the query engine (why?)

  20. Division • Not a primitive operator, but useful to express queries such as Find sailors who have reserved all the boats • Consider the simple case, where relation A has fields x and y, and relation B has field y • A/B is the set of xs (sailors) such that for every y (boat) in B, there is a row (x,y) in A

  21. Division cont. • Can you code this up in the relational algebra?

  22. Division cont. • Can you code this up in the relational algebra? x’s that are disqualified: x((x(A)  B) – A) Thus: x(A)-x((x(A)  B) – A)

  23. Example 1 Find names of sailors who’ve reserved boat 103 Solution 1:sname(bid=103(Reserves) V Sailors) Solution 2: sname(bid=103(Reserves V Sailors)) Which is more efficient? Query optimisation

  24. Example 2 Find names of sailors who’ve reserved a red boat

  25. Example 2 Find names of sailors who’ve reserved a red boat sname(colour=“red”(Boats) V Reserves V Sailors) Better: sname(sid(bid(colour=“red”(Boats)) V Reserves) V Sailors)

  26. Example 3 Find sailors who’ve reserved a red or a green boat

  27. Example 3 Find sailors who’ve reserved a red or a green boat let T = colour=“red”colour=“green”(Boats) in sname(T V Reserves V Sailors)

  28. Example 4 Find sailors who’ve reserved a red and a green boat

  29. Example 4 Find sailors who’ve reserved a red and a green boat NOTE: Can’t just trivially modify last solution! let T1 = sid (colour=“red”(Boats) V Reserves) T2 = sid (colour=“green”(Boats) V Reserves) in sname((T1  T2) V Sailors)

  30. Example 5 Find the names of sailors who’ve reserved at least two boats let T = sid.1:=sid (sid.1,sname,bid (Sailors V Reserves)) in sname.1 (sid.1=sid.2bid.1bid.2(T  T))

  31. Example 6 Find the names of sailors who’ve reserved all boats let T = sid,bid (Reserves) / bid (Boats) in sname(T V Sailors)

  32. Computational limitations • Suppose we have a relation SequelOf of movies and their immediate sequels • We want to compute the relation ‘isFollowedBy’ …

  33. Computational limitations • We could compute fst,thd(movie:=fst,sequel:=snd(SequelOf) V movie:=snd,sequel:=thd(SequelOf)) • This provides us with sequels-of-sequels • We could write three joins to get sequels-of- sequels-of-sequels and union the results • What about Friday the 13th (9 sequels)?  • In general we need to be able to write an arbitrarily large union… • The relational algebra needs to be extended to handle these sorts of queries

  34. Summary You should now understand: • The core relational algebra • Operations and semantics • Union compatibility • Computational limitations of the relational algebra Next lecture: Relational calculus

More Related