300 likes | 420 Views
The Query Compiler 16.1 Parsing and Preprocessing. Meghna Jain(205) Dr. T. Y. Lin. Presentation Outline 16.1 Parsing and Preprocessing 16.1.1 Syntax Analysis and Parse Tree 16.1.2 A Grammar for Simple Subset of SQL 16.1.3 The Preprocessor
E N D
The Query Compiler 16.1 Parsing and Preprocessing Meghna Jain(205)Dr. T. Y. Lin
Presentation Outline 16.1 Parsing and Preprocessing 16.1.1 Syntax Analysis and Parse Tree 16.1.2 A Grammar for Simple Subset of SQL 16.1.3 The Preprocessor 16.1.4 Processing Queries Involving Views
Query compilation is divided into three steps • 1. Parsing: Parse SQL query into parser tree. • 2. Logical query plan: Transforms parse tree into expression tree of relational algebra. • 3.Physical query plan: Transforms logical query plan into physical query plan. • . Operation performed • . Order of operation • . Algorithm used • . The way in which stored data is obtained and passed from one • operation to another.
Query Parser Preprocessor Logical Query plan generator Query rewrite Preferred logical query plan Form a query to a logical query plan
Syntax Analysis and Parse Tree Parser takes the sql query and convert it to parse tree. Nodes of parse tree: 1. Atoms: known as Lexical elements such as key words, constants, parentheses, operators, and other schema elements. 2. Syntactic categories: Subparts that plays a similar role in a query as <Query> , <Condition>
Grammar for Simple Subset of SQL <Query> ::= <SFW> <Query> ::= (<Query>) <SFW> ::= SELECT <SelList> FROM <FromList> WHERE <Condition> <SelList> ::= <Attribute>,<SelList> <SelList> ::= <Attribute> <FromList> ::= <Relation>, <FromList> <FromList> ::= <Relation> <Condition> ::= <Condition> AND <Condition> <Condition> ::= <Tuple> IN <Query> <Condition> ::= <Attribute> = <Attribute> <Condition> ::= <Attribute> LIKE <Pattern> <Tuple> ::= <Attribute> Atoms(constants), <syntactic categories>(variable), ::= (can be expressed/defined as)
Query and Parse T ree StarsIn(title,year,starName) MovieStar(name,address,gender,birthdate) Query: Give titles of movies that have at least one star born in 1960 SELECT title FROM StarsIn WHERE starName IN ( SELECT name FROM MovieStar WHERE birthdate LIKE '%1960%' );
Another query equivalent SELECT title FROM StarsIn, MovieStar WHERE starName = name AND birthdate LIKE '%1960%' ;
Parse Tree <Query> <SFW> SELECT <SelList> FROM <FromList> WHERE <Condition> <Attribute> <RelName> , <FromList> AND title StarsIn <RelName> MovieStar <Query> <Condition><Condition> <Attribute> = <Attribute> <Attribute> LIKE <Pattern> starName name birthdate ‘%1960’
The Preprocessor Functions of Preprocessor . If a relation used in the query is virtual view then each use of this relation in the form-list must replace by parser tree that describe the view. . It is also responsible for semantic checking 1. Checks relation uses : Every relation mentioned in FROM- clause must be a relation or a view in current schema. 2. Check and resolve attribute uses: Every attribute mentioned in SELECT or WHERE clause must be an attribute of same relation in the current scope. 3. Check types: All attributes must be of a type appropriate to their uses.
StarsIn(title,year,starName) MovieStar(name,address,gender,birthdate) Query: Give titles of movies that have at least one star born in 1960 SELECT title FROM StarsIn WHERE starName IN ( SELECT name FROM MovieStar WHERE birthdate LIKE '%1960%' );
Preprocessing Queries Involving Views When an operand in a query is a virtual view, the preprocessor needs to replace the operand by a piece of parse tree that represents how the view is constructed from base table. Base Table: Movies( title, year, length, genre, studioname, producerC#) View definition : CREATE VIEW ParamountMovies AS SELECT title, year FROM movies WHERE studioName = 'Paramount'; Example based on view: SELECT title FROM ParamountMovies WHERE year = 1979;
16.2 ALGEBRAIC LAWS FOR IMPROVING QUERY PLANS Ramya Karri ID: 206
Optimizing the Logical Query Plan • The translation rules converting a parse tree to a logical query tree do not always produce the best logical query tree. • It is often possible to optimize the logical query tree by applying relational algebra laws to convert the original tree into a more efficient logical query tree. • Optimizing a logical query tree using relational algebra laws is called heuristic optimization
Relational Algebra Laws These laws often involve the properties of: • commutativity - operator can be applied to operands independent of order. • E.g. A + B = B + A - The “+” operator is commutative. • associativity - operator is independent of operand grouping. • E.g. A + (B + C) = (A + B) + C - The “+” operator is associative.
Associative and Commutative Operators • The relational algebra operators of cross-product (×), join (⋈), union, and intersection are all associative and commutative. Associative (R X S) X T = S X (R X T) (R ⋈ S) ⋈ T= S ⋈ (R ⋈ T) (R S) T = S (R T) (R ∩ S) ∩ T = S ∩ (R ∩ T) Commutative R X S = S X R R ⋈ S = S ⋈ R R S = S R R ∩ S = S ∩ R
Laws Involving Selection • Complex selections involving AND or OR can be broken into two or more selections: (splitting laws) σC1ANDC2 (R) = σC1( σC2 (R)) σC1ORC2 (R) = ( σC1 (R) ) S ( σC2 (R) ) • Example • R={a,a,b,b,b,c} • p1 satisfied by a,b, p2 satisfied by b,c • σp1vp2 (R) = {a,a,b,b,b,c} • σp1(R) = {a,a,b,b,b} • σp2(R) = {b,b,b,c} • σp1 (R) U σp2 (R) = {a,a,b,b,b,c}
Laws Involving Selection (Contd..) • Selection is pushed through both arguments for union: σC(R S) = σC(R) σC(S) • Selection is pushed to the first argument and optionally the second for difference: σC(R - S) = σC(R) - S σC(R - S) = σC(R) - σC(S)
Laws Involving Selection (Contd..) • All other operators require selection to be pushed to only one of the arguments. • For joins, may not be able to push selection to both if argument does not have attributes selection requires. σC(R × S) = σC(R) × S σC(R ∩ S) = σC(R) ∩ S σC(R ⋈ S) = σC(R) ⋈ S σC(R ⋈D S) = σC(R) ⋈D S
Laws Involving Selection (Contd..) • Example • Consider relations R(a,b) and S(b,c) and the expression • σ (a=1 OR a=3) AND b<c (R ⋈S) • σ a=1 OR a=3(σ b<c (R ⋈S)) • σ a=1 OR a=3(R ⋈ σ b<c (S)) • σ a=1 OR a=3(R) ⋈ σ b<c (S)
Laws Involving Projection • Like selections, it is also possible to push projections down the logical query tree. However, the performance gained is less than selections because projections just reduce the number of attributes instead of reducing the number of tuples.
Laws Involving Projection • Laws for pushing projections with joins: πL(R × S) = πL(πM(R) × πN(S)) πL(R ⋈ S) = πL((πM(R) ⋈πN(S)) πL(R ⋈D S) = πL((πM(R) ⋈DπN(S))
Laws Involving Projection • Laws for pushing projections with set operations. • Projection can be performed entirely before union. πL(R UB S) = πL(R) UBπL(S) • Projection can be pushed below selection as long as we also keep all attributes needed for the selection (M = L attr(C)). πL ( σC (R)) = πL( σC (πM(R)))
Laws Involving Join • We have previously seen these important rules about joins: • Joins are commutative and associative. • Selection can be distributed into joins. • Projection can be distributed into joins.
Laws Involving Duplicate Elimination • The duplicate elimination operator (δ) can be pushed through many operators. • R has two copies of tuples t, S has one copy of t, • δ (RUS)=one copy of t • δ (R) U δ (S)=two copies of t
Laws Involving Duplicate Elimination • Laws for pushing duplicate elimination operator (δ): δ(R × S) = δ(R) × δ(S) δ(R S) = δ(R) δ(S) δ(R D S) = δ(R) Dδ(S) δ( σC(R) = σC(δ(R))
Laws Involving Duplicate Elimination • The duplicate elimination operator (δ) can also be pushed through bag intersection, but not across union, difference, or projection in general. δ(R ∩ S) = δ(R) ∩δ(S)
Laws Involving Grouping • The grouping operator (γ) laws depend on the aggregate operators used. • There is one general rule, however, that grouping subsumes duplicate elimination: δ(γL(R)) = γL(R) • The reason is that some aggregate functions are unaffected by duplicates (MIN and MAX) while other functions are (SUM, COUNT, and AVG).