270 likes | 481 Views
CS 255: Database System Principles slides: From Parse Trees to Logical Query Plans. By:- Arunesh Joshi Id:-006538558. Agenda. Conversion to Relational Algebra. Removing Sub queries From Conditions. Improving the Logical Query Plan. Grouping Associative/Commutative Operators. . Parsing.
E N D
CS 255: Database System Principlesslides: From Parse Trees to Logical Query Plans By:- Arunesh Joshi Id:-006538558
Agenda • Conversion to Relational Algebra. • Removing Sub queries From Conditions. • Improving the Logical Query Plan. • Grouping Associative/Commutative Operators.
Parsing • Goal is to convert a text string containing a query into a parse tree data structure: • leaves form the text string (broken into lexical elements) • internal nodes are syntactic categories • Uses standard algorithmic techniques from compilers • given a grammar for the language (e.g., SQL), process the string and build the tree
Example: SQL query SELECT title FROM StarsIn WHERE starName IN ( SELECT name FROM MovieStar WHERE birthdate LIKE ‘%1960’ ); (Find the movies with stars born in 1960) Assume we have a simplified grammar for SQL.
SELECT <SelList> FROM <FromList> WHERE <Condition> <Attribute> <RelName> <Attribute> LIKE <Pattern> nameMovieStar birthDate‘%1960’ Example: Parse Tree <Query> <SFW> SELECT <SelList> FROM <FromList> WHERE <Condition> <Attribute> <RelName> <Tuple> IN <Query> titleStarsIn <Attribute> ( <Query> ) starName <SFW>
The Preprocessor • It replaces each reference to a view with a parse (sub)-tree that describes the view (i.e., a query) • It does semantic checking: • are relations and views mentioned in the schema? • are attributes mentioned in the current scope? • are attribute types correct?
Convert Parse Tree to Relational Algebra • The complete algorithm depends on specific grammar, which determines forms of the parse trees • Here is a flavor of the approach
Conversion • Suppose there are no subqueries. • SELECT att-list FROM rel-list WHERE cond is converted into PROJatt-list(SELECTcond(PRODUCT(rel-list))), or att-list(cond( X (rel-list)))
SELECT movieTitle FROM StarsIn, MovieStar WHERE starName = name AND birthdate LIKE '%1960'; <Query> <SFW> SELECT <SelList> FROM <FromList> WHERE <Condition> <Attribute> <RelName> , <FromList> AND <Condition> movieTitleStarsIn <RelName> <Attribute> LIKE <Pattern> MovieStarbirthdate'%1960' <Condition> <Attribute> = <Attribute> starName name
Equivalent Algebraic Expression Tree movieTitle starname = name AND birthdate LIKE '%1960' X StarsIn MovieStar
Handling Subqueries • Recall the (equivalent) query: SELECT title FROM StarsIn WHERE starName IN ( SELECT name FROM MovieStar WHERE birthdate LIKE ‘%1960’ ); • Use an intermediate format called two-argument selection
Example: Two-Argument Selection title StarsIn <condition> <tuple> IN name <attribute> birthdate LIKE ‘%1960’ starName MovieStar
Converting Two-Argument Selection • To continue the conversion, we need rules for replacing two-argument selection with a relational algebra expression • Different rules depending on the nature of the sub query • Here is shown an example for IN operator and uncorrelated query (sub query computes a relation independent of the tuple being tested)
Rules for IN C R <Condition> X R t IN S S C is the condition that equates attributes in t with corresponding attributes in S
Example: Logical Query Plan title starName=name StarsIn name birthdate LIKE ‘%1960’ MovieStar
What if Subquery is Correlated? • Example is when subquery refers to the current tuple of the outer scope that is being tested • More complicated to deal with, since subquery cannot be translated in isolation • Need to incorporate external attributes in the translation • Some details are in textbook
Improving the Logical Query Plan • There are numerous algebraic laws concerning relational algebra operations • By applying them to a logical query plan judiciously, we can get an equivalent query plan that can be executed more efficiently • Next we'll survey some of these laws
Example: Improved Logical Query Plan title starName=name StarsIn name birthdate LIKE ‘%1960’ MovieStar
Associative and Commutative Operations • product • natural join • set and bag union • set and bag intersection • associative: (A op B) op C = A op (B op C) • commutative: A op B = B op A
Laws Involving Selection • Selections usually reduce the size of the relation • Usually good to do selections early, i.e., "push them down the tree" • Also can be helpful to break up a complex selection into parts
Selection Splitting • C1 AND C2 (R) = C1 ( C2 (R)) • C1 OR C2 (R) = (C1 (R)) Uset (C2 (R)) if R is a set • C1 ( C2 (R)) = C2 ( C1 (R))
Selection and Binary Operators • Must push selection to both arguments: • C (R U S) = C (R) U C (S) • Must push to first arg, optional for 2nd: • C (R - S) = C (R) - S • C (R - S) = C (R) - C (S) • Push to at least one arg with all attributes mentioned in C: • product, natural join, theta join, intersection • e.g., C (R X S) = C (R) X S, if R has all the atts in C
Pushing Selection Up the Tree • Suppose we have relations • StarsIn(title,year,starName) • Movie(title,year,len,inColor,studioName) • and a view • CREATE VIEW MoviesOf1996 AS SELECT * FROM Movie WHERE year = 1996; • and the query • SELECT starName, studioName FROM MoviesOf1996 NATURAL JOIN StarsIn;
Remember the rule C(R S) = C(R) S ? The Straightforward Tree starName,studioName year=1996 StarsIn Movie
starName,studioName starName,studioName starName,studioName year=1996 year=1996 year=1996 year=1996 StarsIn StarsIn Movie StarsIn Movie push selection up tree push selection down tree Movie The Improved Logical Query Plan
Grouping Assoc/Comm Operators • Groups together adjacent joins, adjacent unions, and adjacent intersections as siblings in the tree • Sets up the logical QP for future optimization when physical QP is constructed: determine best order for doing a sequence of joins (or unions or intersections) U D E F U D E F U A B C A B C