400 likes | 776 Views
Algebraic Laws for Improving Query Plans. Dhivyakrishnan CS 257 ID: 105. Agenda. Basis of Relational Algebra Commutative and Associative Laws Laws Involving Selection Pushing Selections Laws Involving Projection Laws About Joins and Products Laws Involving Duplicate Elimination
E N D
Algebraic Laws for Improving Query Plans Dhivyakrishnan CS 257 ID: 105
Agenda • Basis of Relational Algebra • Commutative and Associative Laws • Laws Involving Selection • Pushing Selections • Laws Involving Projection • Laws About Joins and Products • Laws Involving Duplicate Elimination • Laws Involving Grouping and Aggregation
Basics of Relational Algebra • Relational Model consists of the elements: relations, which are made up of attributes. • A relation is a set of attributes with values for each attribute such that: • 1) Each attribute value must be a single value only (atomic). • 2) All values for a given attribute must be of the same type • (domain). • 3) Each attribute name must be unique. • 4) The order of attributes is insignificant • 5) No two rows (tuples) in a relation can be identical. • 6) The order of the rows (tuples) is insignificant.
Basics of Relational Algebra • Relational Algebra is a collection of operations on Relations. • Two main collections of relational operators: • 1) Set theory operators: • Union, Intersection, Difference and Cartesian product. • 2) Specific Relational Operators: • Selection, Projection, Join, Division • Relational operations is performed on • 1) Set : a relation without duplicate values • 2) Bag : a relation with duplicate values also called as multiset
Example : Database Schema • Consider the schema Movie • Movie (Title : string ,Year : integer ,length : integer, • inColor : boolean,studioName : string,producerC# : integer ) • A sample table is created below Figure 1: The relation Movie
Projection • Projection operator is used to produce from a relation R, a new relation that has only some of R’s columns. • The projection is denoted by π • The value of expression πA1,A2,A3….An ( R ) is a relation that has only the columns for attributes A1,A2….An of R. • The following example clearly explains the concept of projection.
Projection • Projection operator is used to produce from a relation R, a new relation that has only some of R’s columns. • The projection is denoted by π • The value of expression πA1,A2,A3….An ( R ) is a relation that has only the columns for attributes A1,A2….An of R. • The following example clearly explains the concept of projection.
Projection • We can project the relation in figure 1 onto the first three attributes with the expression. • πtitle,year,length (Movie) • The resulting relation is as follows Figure 2 : Result of projection
Selection • Selection applied to relation R , produces a new relation with subset of R’s tuples. • The tuples in the resulting relation are those that satisfy some condition C that involves the attribute of R. • Selection is denoted by σ • The following example clearly illustrates the concept of selection.
Selection • Suppose we want set of tuples in the relation Movie from figure 1 represents the movies that are at least 100 minutes long which is denoted by • σlength >= 100 (Movie) • The resulting relation is shown as follows Figure 3 : Result of selection
Natural Join • Generally ,we want to take product of two relation. For that we need to join them by pairing only those tuples that match in some way. • The simplest sort of match is the natural join of two relation R and S denoted by R S. • In the following example, relation R and S is defined and the result of natural join R S is also given.
Natural Join Relation R Relation S • The natural join of the relation R and S is shown below Figure 4 : Result of natural join
Commutative and Associative Laws • Commutative Laws states that the order of arguments does not affect the end result. • Operators + and * are associative operators. • Operator – is not commutative : a –b <> b -a • The algebraic form of the commutative law for addition and multiplication is as follows: • a +b = b + a • a * b = b * a
Commutative and Associative Laws • Associative Law states that we may group two uses of the operator either from the left or right. • Operators + and * are associative operators. • Operator – is associative:(a –b)-c <> a-(b-c) • The algebraic form of associative law is as follows • (a + b ) + c = a + ( b + c) • (a * b ) * c = a * ( b * c)
Commutative and Associative Laws • Several operations of relational algebra are both associative and commutative • R S = S R ; ( R S) T = R (S T) • where is join operator. R and S are relations. • Similarly operators like U (set union) , ∩ (intersection), (cartesian product) are both commutative and associative. • Note that these laws hold for both sets and bags
Laws Involving Selection • Selections reduce the size of relations. • To make efficient query, the selection must be moved down the tree without the changing what the expression does. • When the condition for the selection is complex, it is connected by AND or OR which helps to break the condition into its constituent parts.
Laws Involving Selection • The first two laws for selection σ are the splitting laws, • σc1 AND c2 (R) = σc1(σc2(R)) • σc1 OR c2 (R) = (σc1(R)) Us (σc2(R)) • The second law for OR works only if the relation R is the set .If R is a bag, then the set union U will eliminate the duplicates incorrectly.
Laws Involving Selection • The next family of laws involving σ allow us to push selection through binary operators : product ,union intersection, difference and join. • For a union ,selection must be pushed through both arguments. • For a difference ,the selection must be pushed to the firs argument and optionally through second argument. • For other operators it is only required that the selection be pushed to one argument.
Pushing Selections • Pushing Selection down the expression tree( i.e replacing the left side of one of the rules by the right side )is one of the best method to optimize query. • An example for Pushing Selection is illustrated as follows
Pushing Selections • Suppose we have relations • StarsIn(title ,year , starName) • Movie(title ,year , length ,inColor , studioName, producerC#) • We Define a view Movies1996 as • CREATE VIEW Movie1996 AS • SELECT * FROM MOVIE • WHERE year = 1996;
Pushing Selections • The query to find out which stars worked in which studios in 1996 • SELECT starName ,studioName • FROM Movie1996 NATURAL JOIN StarsIn • The view is Movie1996 is defined by • σyear = 1996 (Movie)
Pushing Selections • An expression tree for the above query and view is constructed as follows π starName ,studioName σ year = 1996 StarsIn Movie Figure 5: Logical Query Plan constructed from the definition of a query and view
Pushing Selections • Since year is attribute of Movie and StarsIn,we may push selection down to both children of join node.As result ,we reduce the size of the relation StarsIn before we join it with the movies of 1996 π starName ,studioName σ year = 1996 σ year = 1996 Movie StarsIn Fig 6 : Improving the query plan by moving selection up and down the tree
Laws Involving Projection • Projection, like selection can be pushed down through many other operators. • Pushing Projection usually involves introducing a new projection somewhere below an existing projection. • Projection differs from selection in the aspect that projection reduces the length of the tuples whereas selection reduces the number of the tuples.
Laws Involving Projection SELECT starName FROM StarsIn WHERE year = 1996 π starName σ movieYear = 1996 StarsIn Figure 7 : Logical query plan for the above query
Laws Involving Projection • From the above figure(fig 7),we can introduce a another projection below selection (σmovieYear =1996 ) onto the attributes. • 1) starName ,because that attribute is needed in the result. • 2) movieYear ,because that attribute is needed for the selection condition. • An expression tree for second projection is drawn as follows
Laws Involving Projection π starName σ movieYear = 1996 π starName, movieYear Lower Projection StarsIn • Please Note the following before doing the above projection: • If StarsIn were not a stored relation, but a relation that was constructed by another operation, such as join, then the above plan make sense . • If StarsIn is a stored relation, the lower projection in above figure could waste lot of time, especially if there were an index on movieYear
Laws Involving Duplicate Elimination • The operator δ , which eliminates duplicates from a bag can be pushed through only some of the operators. • Moving δ down the tree reduces the size of intermediate relation and may therefore be beneficial. • In some cases, we can move δ to a position where it can be eliminated because it is applied to a relation that does not have any duplicates
Laws Involving Duplicate Elimination • δ( R ) = R if R has no duplicates • Important cases of such a relation R include • A stored relation with a declared primary key • A relation that is the result of a γ operation ,since grouping creates a relation with no duplicates
Laws Involving Duplicate Elimination • Several Laws that push δ through other operators are • δ(R X S) = δ( R ) X δ( S ) • δ( σc( R)) = σc(δ ( R ). • Similarly join operator can also push δ • δ cannot be moved across the operators like U , - , π.
Laws Involving Grouping and Aggregation • While using grouping and aggregation ,the applicability of many transformation depends on the details of the aggregation used. • Due to the above ,we cannot state laws in generality. • One exception is the law below that γ absorbs δ • δ(γL(R)) = γL ( R )
Laws Involving Grouping and Aggregation • Another general rule is that we may project useless attributes prior to applying γ operation • γL ( R ) = γL(πM (R ) where M is the list containing at least all those attributes of R that are mentioned in L.
Laws Involving Grouping and Aggregation • Suppose we have the relation • MovieStar(name ,addr ,gender ,birthdate) • StarsIn(movieTitle ,movieYear ,starName) • Consider the query below • Select movieYear ,MAX(birthDate) • FROM MovieStar ,StarsIn • WHERE name = starName • GROUP BY movieYear
Laws Involving Grouping and Aggregation • The expression tree from the above query is depicted as follows γ movieYear ,MAX(birthDate) σ name = starName MovieStar StarsIn
Laws Involving Grouping and Aggregation • The FROM list is expressed by a product and the WHERE clause by a selection above it. • The grouping and aggregation are expressed by the γ. • Some transformation could be applied to the above figure(Fig 8). • 1) Combine the selection and product into an equijoin. • 2) Generate a δ below the γ ,since the γ is duplicate- impervious. • 3) Generate a π between the γ and the introduced δ to project onto movieYear and birthDate ,the only attributes relevant to the γ.
Laws Involving Grouping and Aggregation γ movieYear ,MAX(birthDate) π movieYear ,birthDate δ name = starName MovieStar StarsIn • From the above tree, still we can push the δ below the and introduce π below .The new query plan is depicted as follows
Laws Involving Grouping and Aggregation γ movieYear ,MAX(birthDate) π movieYear ,birthDate name = starName δ δ π birthDate,name π birthDate,name MovieStar StarsIn Figure 9 : Resulting query