CS 277: Database System Implementation

Presentation Transcript


  1. CS 277: Database System Implementation. Notes 6: Query Processing. Arthur Keller

  2. Focus: Relational System • Others? Query Processing: Q → Query Plan

  3. Example: Select B,D From R,S Where R.A = “c” ∧ S.E = 2 ∧ R.C = S.C

  4. Answer:

       B  D
       2  x

     R:                 S:
       A  B  C            C   D  E
       a  1  10           10  x  2
       b  1  20           20  y  2
       c  2  10           30  z  2
       d  2  35           40  x  1
       e  3  45           50  y  3

  5. How do we execute the query? One idea: - Do Cartesian product - Select tuples - Do projection

  6. R × S:

       R.A  R.B  R.C  S.C  S.D  S.E
       a    1    10   10   x    2
       a    1    10   20   y    2
       .
       .
       c    2    10   10   x    2    ← Bingo! Got one...
       .
       .

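  7. Relational Algebra - can be used to describe plans... Ex: Plan I, as a tree (top to bottom): $\pi_{B,D}$, then $\sigma_{R.A=\text{“c”} \wedge S.E=2 \wedge R.C=S.C}$, then $\times$ with inputs R and S. OR: $\pi_{B,D}[\sigma_{R.A=\text{“c”} \wedge S.E=2 \wedge R.C=S.C}(R \times S)]$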
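
Plan I can be sketched directly in Python on the sample relations from the example slide. This is only an illustration, not how a real engine evaluates the plan; the tuple layout (positions 0, 1, 2 for A, B, C and for C, D, E) and the function name `plan_one` are assumptions made for this sketch.

```python
from itertools import product

# Sample relations from the example slide: R(A, B, C) and S(C, D, E).
R = [("a", 1, 10), ("b", 1, 20), ("c", 2, 10), ("d", 2, 35), ("e", 3, 45)]
S = [(10, "x", 2), (20, "y", 2), (30, "z", 2), (40, "x", 1), (50, "y", 3)]

def plan_one(r_rows, s_rows):
    """Plan I: Cartesian product, then selection, then projection."""
    cross = product(r_rows, s_rows)  # R x S: every pair of tuples
    selected = [
        (r, s)
        for r, s in cross
        # R.A = "c" AND S.E = 2 AND R.C = S.C
        if r[0] == "c" and s[2] == 2 and r[2] == s[0]
    ]
    return [(r[1], s[1]) for r, s in selected]  # project onto B, D

result_plan_one = plan_one(R, S)
print(result_plan_one)
```

The product enumerates all 25 pairs before the selection discards all but one of them, which is the inefficiency the later plans avoid.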

  8. Another idea: Plan II, as a tree (top to bottom): $\pi_{B,D}$, then the natural join $\bowtie$ of $\sigma_{R.A=\text{“c”}}(R)$ and $\sigma_{S.E=2}(S)$

  9. R:              $\sigma(R)$:        $\sigma(S)$:        S:
       A  B  C         A  B  C         C   D  E        C   D  E
       a  1  10        c  2  10        10  x  2        10  x  2
       b  1  20                        20  y  2        20  y  2
       c  2  10                        30  z  2        30  z  2
       d  2  35                                        40  x  1
       e  3  45                                        50  y  3
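
A minimal sketch of Plan II on the same sample data: select on each relation first, then join the much smaller intermediate results. The function name `plan_two` and the positional tuple layout are assumptions for illustration.

```python
# Sample relations from the example slide: R(A, B, C) and S(C, D, E).
R = [("a", 1, 10), ("b", 1, 20), ("c", 2, 10), ("d", 2, 35), ("e", 3, 45)]
S = [(10, "x", 2), (20, "y", 2), (30, "z", 2), (40, "x", 1), (50, "y", 3)]

def plan_two(r_rows, s_rows):
    """Plan II: push both selections below the natural join."""
    sel_r = [r for r in r_rows if r[0] == "c"]   # sigma_{R.A = "c"}(R)
    sel_s = [s for s in s_rows if s[2] == 2]     # sigma_{S.E = 2}(S)
    # Natural join on the shared attribute C, then project onto B, D.
    answer = [(r[1], s[1]) for r in sel_r for s in sel_s if r[2] == s[0]]
    return sel_r, sel_s, answer

sel_r, sel_s, result_plan_two = plan_two(R, S)
print(sel_r, sel_s, result_plan_two)
```

Here the join compares only 1 × 3 candidate pairs instead of the 25 pairs that the Cartesian product generates.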

  10. Plan III: Use R.A and S.C Indexes
      (1) Use R.A index to select R tuples with R.A = “c”
      (2) For each R.C value found, use S.C index to find matching tuples
      (3) Eliminate S tuples with S.E ≠ 2
      (4) Join matching R,S tuples, project B,D attributes and place in result

  11. [Figure: Plan III traced on R(A,B,C) and S(C,D,E), with index I1 on R.A and index I2 on S.C. I1 lookup for A = “c” finds <c,2,10>; I2 lookup for C = 10 finds <10,x,2>; check E = 2? Output: <2,x>. Next tuple: <c,7,15>.]
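
The index-based Plan III can be imitated with in-memory dictionaries standing in for the indexes I1 (on R.A) and I2 (on S.C). This is a sketch under that assumption; real indexes return pointers to disk blocks, and the helper names `build_index` and `plan_three` are made up for illustration.

```python
from collections import defaultdict

# Sample relations from the example slide: R(A, B, C) and S(C, D, E).
R = [("a", 1, 10), ("b", 1, 20), ("c", 2, 10), ("d", 2, 35), ("e", 3, 45)]
S = [(10, "x", 2), (20, "y", 2), (30, "z", 2), (40, "x", 1), (50, "y", 3)]

def build_index(rows, position):
    """Map each value at `position` to the list of rows holding it."""
    index = defaultdict(list)
    for row in rows:
        index[row[position]].append(row)
    return index

I1 = build_index(R, 0)  # stand-in for the index on R.A
I2 = build_index(S, 0)  # stand-in for the index on S.C

def plan_three(a_value, e_value):
    out = []
    for r in I1.get(a_value, []):       # (1) R tuples with R.A = a_value
        for s in I2.get(r[2], []):      # (2) S tuples with S.C = R.C
            if s[2] == e_value:         # (3) drop S tuples with S.E != e_value
                out.append((r[1], s[1]))  # (4) join and project B, D
    return out

result_plan_three = plan_three("c", 2)
print(result_plan_three)
```

Only the tuples reachable through the two lookups are touched; neither relation is scanned in full.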

  12. Overview of Query Optimization

  13. SQL query → parse → parse tree → convert → logical query plan → apply laws → “improved” l.q.p. → estimate result sizes (using statistics) → l.q.p. + sizes → consider physical plans → {P1,P2,…} → estimate costs → {(P1,C1),(P2,C2),...} → pick best → Pi → execute → answer

  14. Example: SQL query

        SELECT title
        FROM StarsIn
        WHERE starName IN (
            SELECT name
            FROM MovieStar
            WHERE birthdate LIKE '%1960'
        );

      (Find the movies with stars born in 1960)

  15. Example: Parse Tree

      <Query>
        <SFW>
          SELECT <SelList>
            <Attribute>
              title
          FROM <FromList>
            <RelName>
              StarsIn
          WHERE <Condition>
            <Tuple>
              <Attribute>
                starName
            IN
            <Query>
              ( <Query> )
                <SFW>
                  SELECT <SelList>
                    <Attribute>
                      name
                  FROM <FromList>
                    <RelName>
                      MovieStar
                  WHERE <Condition>
                    <Attribute>
                      birthDate
                    LIKE
                    <Pattern>
                      '%1960'

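  16. Example: Generating Relational Algebra. Tree (top to bottom): $\pi_{\text{title}}$; then a two-argument $\sigma$ whose inputs are StarsIn and <condition>; <condition> is <tuple> IN $\pi_{\text{name}}$; <tuple> is <attribute> starName; $\pi_{\text{name}}$ is applied to $\sigma_{\text{birthdate LIKE '\%1960'}}(\text{MovieStar})$. Fig. 16.14: An expression using a two-argument $\sigma$, midway between a parse tree and relational algebra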

  17. Example: Logical Query Plan. $\pi_{\text{title}}\{\sigma_{\text{starName}=\text{name}}[\text{StarsIn} \times \pi_{\text{name}}(\sigma_{\text{birthdate LIKE '\%1960'}}(\text{MovieStar}))]\}$ Fig. 16.16: Applying the rule for IN conditions
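
The rewrite of the IN condition into a selection over a product can be tried out with Python's built-in `sqlite3` module. The rows below are invented purely for illustration (the slides give no data for StarsIn or MovieStar), and the `DISTINCT` in the second query is an assumption that stands in for the set-valued $\pi_{\text{name}}$.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE StarsIn (title TEXT, starName TEXT);
    CREATE TABLE MovieStar (name TEXT, birthdate TEXT);
    -- Hypothetical rows, made up for this sketch.
    INSERT INTO StarsIn VALUES
        ('Movie A', 'Star One'), ('Movie B', 'Star Two'), ('Movie C', 'Star One');
    INSERT INTO MovieStar VALUES
        ('Star One', '03/15/1960'), ('Star Two', '07/04/1975');
""")

# Original form: IN with a subquery.
nested = conn.execute("""
    SELECT title FROM StarsIn
    WHERE starName IN (SELECT name FROM MovieStar WHERE birthdate LIKE '%1960')
    ORDER BY title
""").fetchall()

# Rewritten form: product of StarsIn with the projected subquery, then a selection.
rewritten = conn.execute("""
    SELECT title
    FROM StarsIn,
         (SELECT DISTINCT name FROM MovieStar WHERE birthdate LIKE '%1960') AS M
    WHERE starName = M.name
    ORDER BY title
""").fetchall()

print(nested, rewritten)
conn.close()
```

Both queries return the same titles on this data; the second form exposes a product followed by a selection, which later rewriting can turn into a join.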

  18. Example: Improved Logical Query Plan. $\pi_{\text{title}}[\text{StarsIn} \bowtie_{\text{starName}=\text{name}} \pi_{\text{name}}(\sigma_{\text{birthdate LIKE '\%1960'}}(\text{MovieStar}))]$ Question: Push project to StarsIn? Fig. 16.21: The effect of query rewriting.

  19. Example: Estimate Result Sizes. [Figure: plan tree joining StarsIn with $\pi[\sigma(\text{MovieStar})]$; the intermediate results are marked “Need expected size”.]

  20. Example: One Physical Plan. Hash join (parameters: join order, memory size, project attributes, ...) over a SEQ scan of StarsIn and an index scan of MovieStar (parameters: select condition, ...)

  21. Example: Estimate costs. L.Q.P. → physical plans P1, P2, …, Pn with costs C1, C2, …, Cn. Pick best!
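
As a toy version of "estimate costs, pick best", the sketch below scores Plans I, II and III from the opening example and picks the cheapest. The cost measure (number of tuples or tuple pairs examined) is an assumption chosen for simplicity; it is not the disk-I/O cost model the textbook develops, and the function names are hypothetical.

```python
# Sample relations from the example slide: R(A, B, C) and S(C, D, E).
R = [("a", 1, 10), ("b", 1, 20), ("c", 2, 10), ("d", 2, 35), ("e", 3, 45)]
S = [(10, "x", 2), (20, "y", 2), (30, "z", 2), (40, "x", 1), (50, "y", 3)]

def cost_plan_one(r, s):
    # Plan I examines every pair in R x S.
    return len(r) * len(s)

def cost_plan_two(r, s):
    # Plan II scans both relations once, then compares the selected tuples pairwise.
    sel_r = [t for t in r if t[0] == "c"]
    sel_s = [t for t in s if t[2] == 2]
    return len(r) + len(s) + len(sel_r) * len(sel_s)

def cost_plan_three(r, s):
    # Plan III touches only the tuples reached through the two index lookups.
    matching_r = [t for t in r if t[0] == "c"]
    matching_s = [u for t in matching_r for u in s if u[0] == t[2]]
    return len(matching_r) + len(matching_s)

estimators = {
    "Plan I": cost_plan_one,
    "Plan II": cost_plan_two,
    "Plan III": cost_plan_three,
}
costs = {name: estimate(R, S) for name, estimate in estimators.items()}
best = min(costs, key=costs.get)
print(costs, best)
```

A real optimizer estimates these numbers from statistics instead of running the selections, but the final step is the same: compare the (plan, cost) pairs and keep the minimum.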

  22. Textbook outline
      Chapter 15
      15.1 Physical operators - Scan, sort, …
      15.2-15.9 Implementing operators + estimating their cost

  23. Chapter 16
      16.1 Parsing
      16.2 Algebraic laws
      16.3 Parse tree -> logical query plan
      16.4 Estimating result sizes
      16.5-16.7 Cost based optimization

  24. Reading textbook - Chapters 15, 16
      Optional: 15.7-15.9, 16.6-16.7
      Optional: duplicate elimination operator; grouping, aggregation operators

  25. Query Optimization - In class order
      • Relational algebra level
      • Detailed query plan level
        • Estimate Costs
          • without indexes
          • with indexes
        • Generate and compare plans

  26. Relational algebra optimization • Transformation rules (preserve equivalence) • What are good transformations?

  27. Rules: Natural joins & cross products & union. $R \bowtie S = S \bowtie R$; $(R \bowtie S) \bowtie T = R \bowtie (S \bowtie T)$

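  28. Note: • Carry attribute names in results, so order is not important • Can also write as trees, e.g.: the tree for $(R \bowtie S) \bowtie T$ joins the subtree $R \bowtie S$ with T, and the tree for $R \bowtie (S \bowtie T)$ joins R with the subtree $S \bowtie T$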

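  29. Rules: Natural joins & cross products & union
      $R \bowtie S = S \bowtie R$
      $(R \bowtie S) \bowtie T = R \bowtie (S \bowtie T)$
      $R \times S = S \times R$
      $(R \times S) \times T = R \times (S \times T)$
      $R \cup S = S \cup R$
      $R \cup (S \cup T) = (R \cup S) \cup T$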
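
The commutative and associative laws for the natural join can be checked on small data. In this sketch each tuple is a `frozenset` of (attribute, value) pairs, so attribute names travel with the values and column order does not matter, as the note on the previous slide says. R and S come from the opening example; relation T(E, F) and the helper names are invented for illustration.

```python
def relation(attrs, rows):
    """Build a relation as a set of tuples, each a frozenset of (attr, value) pairs."""
    return {frozenset(zip(attrs, row)) for row in rows}

def natural_join(r, s):
    """Join tuples that agree on every attribute the two relations share."""
    out = set()
    for t in r:
        for u in s:
            dt, du = dict(t), dict(u)
            if all(dt[a] == du[a] for a in dt.keys() & du.keys()):
                out.add(frozenset({**dt, **du}.items()))
    return out

R = relation(("A", "B", "C"),
             [("a", 1, 10), ("b", 1, 20), ("c", 2, 10), ("d", 2, 35), ("e", 3, 45)])
S = relation(("C", "D", "E"),
             [(10, "x", 2), (20, "y", 2), (30, "z", 2), (40, "x", 1), (50, "y", 3)])
T = relation(("E", "F"), [(2, "p"), (3, "q")])  # hypothetical third relation

commutative = natural_join(R, S) == natural_join(S, R)
associative = (natural_join(natural_join(R, S), T)
               == natural_join(R, natural_join(S, T)))
print(commutative, associative)
```

A check on one data set is not a proof, but it shows what the laws promise: an optimizer may reorder joins without changing the answer.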

  30. Rules: Selects
      $\sigma_{p_1 \wedge p_2}(R) = \sigma_{p_1}[\sigma_{p_2}(R)]$
      $\sigma_{p_1 \vee p_2}(R) = [\sigma_{p_1}(R)] \cup [\sigma_{p_2}(R)]$

  31. Bags vs. Sets. R = {a,a,b,b,b,c}, S = {b,b,c,c,d}, $R \cup S$ = ? • Option 1, SUM: $R \cup S$ = {a,a,b,b,b,b,b,c,c,c,d} • Option 2, MAX: $R \cup S$ = {a,a,b,b,b,c,c,d}

  32. Option 2 (MAX) makes this rule work: $\sigma_{p_1 \vee p_2}(R) = \sigma_{p_1}(R) \cup \sigma_{p_2}(R)$. Example: R = {a,a,b,b,b,c}; P1 satisfied by a,b; P2 satisfied by b,c. $\sigma_{p_1 \vee p_2}(R)$ = {a,a,b,b,b,c}; $\sigma_{p_1}(R)$ = {a,a,b,b,b}; $\sigma_{p_2}(R)$ = {b,b,b,c}; $\sigma_{p_1}(R) \cup \sigma_{p_2}(R)$ = {a,a,b,b,b,c}
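
Python's `collections.Counter` happens to provide both bag unions: `+` adds multiplicities (SUM) and `|` takes the larger multiplicity (MAX). The sketch below replays the two slides' examples with it; the `select` helper is a made-up stand-in for $\sigma$ on bags.

```python
from collections import Counter

R = Counter("aabbbc")   # the bag {a,a,b,b,b,c}
S = Counter("bbccd")    # the bag {b,b,c,c,d}

union_sum = R + S       # Option 1: SUM of multiplicities
union_max = R | S       # Option 2: MAX of multiplicities

def select(bag, pred):
    """Bag selection: keep each qualifying element with its multiplicity."""
    return Counter({x: n for x, n in bag.items() if pred(x)})

def p1(x):
    return x in "ab"    # P1 satisfied by a, b

def p2(x):
    return x in "bc"    # P2 satisfied by b, c

lhs = select(R, lambda x: p1(x) or p2(x))      # sigma_{p1 OR p2}(R)
rhs_max = select(R, p1) | select(R, p2)        # union under MAX
rhs_sum = select(R, p1) + select(R, p2)        # union under SUM

print("".join(sorted(union_sum.elements())))
print("".join(sorted(union_max.elements())))
print(lhs == rhs_max, lhs == rhs_sum)
```

Under SUM the element b, which satisfies both predicates, is counted twice on the right-hand side, which is why the rule fails for that option.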

  33. “Sum” option makes more sense: Senators (……), Reps (……). T1 = $\pi_{\text{yr,state}}$ Senators; T2 = $\pi_{\text{yr,state}}$ Reps

      T1:              T2:
        Yr  State        Yr  State
        97  CA           99  CA
        99  CA           99  CA
        98  AZ           98  CA

      Union?

  34. Executive Decision -> Use “SUM” option for bag unions -> Some rules cannot be used for bags

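  35. Rules: Project. Let: X = set of attributes, Y = set of attributes, XY = X U Y. $\pi_{xy}(R) = \pi_x[\pi_y(R)]$ does not hold in general: the right side keeps only the X attributes, and is defined only when X is contained in Y. The valid cascade is $\pi_x(R) = \pi_x[\pi_{xy}(R)]$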

  36. Rules: $\sigma$ + $\bowtie$ combined. Let p = predicate with only R attribs, q = predicate with only S attribs, m = predicate with only R,S attribs. $\sigma_p(R \bowtie S) = [\sigma_p(R)] \bowtie S$; $\sigma_q(R \bowtie S) = R \bowtie [\sigma_q(S)]$

  37. Rules: $\sigma$ + $\bowtie$ combined (continued). Some rules can be derived: $\sigma_{p \wedge q}(R \bowtie S)$ = ? $\sigma_{p \wedge q \wedge m}(R \bowtie S)$ = ? $\sigma_{p \vee q}(R \bowtie S)$ = ?

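  38. Do one:
      $\sigma_{p \wedge q}(R \bowtie S) = [\sigma_p(R)] \bowtie [\sigma_q(S)]$
      $\sigma_{p \wedge q \wedge m}(R \bowtie S) = \sigma_m[(\sigma_p R) \bowtie (\sigma_q S)]$
      $\sigma_{p \vee q}(R \bowtie S) = [(\sigma_p R) \bowtie S] \cup [R \bowtie (\sigma_q S)]$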

  39. --> Derivation for first one: $\sigma_{p \wedge q}(R \bowtie S) = \sigma_p[\sigma_q(R \bowtie S)] = \sigma_p[R \bowtie \sigma_q(S)] = [\sigma_p(R)] \bowtie [\sigma_q(S)]$
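
Each step of this derivation can be checked on the sample relations from the opening example. The sketch assumes rows stored as dictionaries; the predicate `p` (B = 1) is invented for illustration, while `q` (E = 2) is the condition from the running query.

```python
# Sample relations from the example slide, as rows keyed by attribute name.
R = [dict(A=a, B=b, C=c) for a, b, c in
     [("a", 1, 10), ("b", 1, 20), ("c", 2, 10), ("d", 2, 35), ("e", 3, 45)]]
S = [dict(C=c, D=d, E=e) for c, d, e in
     [(10, "x", 2), (20, "y", 2), (30, "z", 2), (40, "x", 1), (50, "y", 3)]]

def natural_join(r, s):
    # R and S share only attribute C, so the natural join matches on C.
    return [{**t, **u} for t in r for u in s if t["C"] == u["C"]]

def select(pred, rows):
    return [row for row in rows if pred(row)]

def p(row):
    return row["B"] == 1   # hypothetical predicate with only R attributes

def q(row):
    return row["E"] == 2   # predicate with only S attributes

# sigma_{p AND q}(R join S): select after the join.
late = select(lambda row: p(row) and q(row), natural_join(R, S))
# sigma_p[R join sigma_q(S)]: q pushed down, p still on top.
step = select(p, natural_join(R, select(q, S)))
# [sigma_p(R)] join [sigma_q(S)]: both selections pushed down.
early = natural_join(select(p, R), select(q, S))

print(late == step == early, len(late))
```

All three forms give the same rows, but the last one feeds the join two tuples from R instead of five.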

  40. Rules: $\pi$, $\sigma$ combined. Let x = subset of R attributes, z = attributes in predicate P (subset of R attributes). $\pi_x[\sigma_p(R)] = \pi_x\{\sigma_p[\pi_{xz}(R)]\}$

  41. Rules: $\pi$, $\bowtie$ combined. Let x = subset of R attributes, y = subset of S attributes, z = intersection of R,S attributes. $\pi_{xy}(R \bowtie S) = \pi_{xy}\{[\pi_{xz}(R)] \bowtie [\pi_{yz}(S)]\}$
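
The projection rule can be tried on the sample relations with x = {B}, y = {D} and z = {C}. This is a set-semantics sketch: tuples are `frozenset`s of (attribute, value) pairs, and the helper names `relation`, `project` and `natural_join` are assumptions for illustration.

```python
def relation(attrs, rows):
    """Build a relation as a set of frozensets of (attr, value) pairs."""
    return {frozenset(zip(attrs, row)) for row in rows}

def project(attrs, rel):
    """Keep only the named attributes of every tuple (set semantics)."""
    return {frozenset((a, v) for a, v in t if a in attrs) for t in rel}

def natural_join(r, s):
    out = set()
    for t in r:
        for u in s:
            dt, du = dict(t), dict(u)
            if all(dt[a] == du[a] for a in dt.keys() & du.keys()):
                out.add(frozenset({**dt, **du}.items()))
    return out

R = relation(("A", "B", "C"),
             [("a", 1, 10), ("b", 1, 20), ("c", 2, 10), ("d", 2, 35), ("e", 3, 45)])
S = relation(("C", "D", "E"),
             [(10, "x", 2), (20, "y", 2), (30, "z", 2), (40, "x", 1), (50, "y", 3)])

x, y, z = {"B"}, {"D"}, {"C"}   # z is the join attribute shared by R and S

lhs = project(x | y, natural_join(R, S))
rhs = project(x | y, natural_join(project(x | z, R), project(y | z, S)))
print(lhs == rhs, len(lhs))
```

The join attribute C must survive the inner projections; dropping it would turn the natural join into a cross product.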

  42. $\pi_{xy}\{\sigma_p(R \bowtie S)\} = \pi_{xy}\{\sigma_p[\pi_{xz'}(R) \bowtie \pi_{yz'}(S)]\}$, where $z' = z \cup \{\text{attributes used in } P\}$

  43. Rules for $\sigma$, $\pi$ combined with $\times$ similar... e.g., $\sigma_p(R \times S)$ = ?

  44. Rules: $\sigma$, $\cup$ combined: $\sigma_p(R \cup S) = \sigma_p(R) \cup \sigma_p(S)$; $\sigma_p(R - S) = \sigma_p(R) - S = \sigma_p(R) - \sigma_p(S)$
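
A quick check of these rules with Python sets. The two relations are single-column sets of integers and the predicate "is even" is made up for the sketch; nothing here comes from the slides beyond the rules themselves.

```python
# Hypothetical single-attribute relations, as plain sets.
R = {1, 2, 3, 4, 5, 6}
S = {4, 5, 6, 7, 8}

def p(x):
    return x % 2 == 0   # hypothetical selection predicate

def select(rel):
    return {x for x in rel if p(x)}

union_rule = select(R | S) == select(R) | select(S)
difference_rule = select(R - S) == select(R) - S == select(R) - select(S)
print(union_rule, difference_rule, select(R - S))
```

For the difference, the selection only has to be applied to R; applying it to S as well is optional, which gives the optimizer a choice.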

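  45. Which are “good” transformations?
      $\sigma_{p_1 \wedge p_2}(R) \rightarrow \sigma_{p_1}[\sigma_{p_2}(R)]$
      $\sigma_p(R \bowtie S) \rightarrow [\sigma_p(R)] \bowtie S$
      $R \bowtie S \rightarrow S \bowtie R$
      $\pi_x[\sigma_p(R)] \rightarrow \pi_x\{\sigma_p[\pi_{xz}(R)]\}$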

  46. Conventional wisdom: do projections early. Example: R(A,B,C,D,E), x = {E}, P: (A=3) ∧ (B=“cat”). $\pi_x\{\sigma_p(R)\}$ vs. $\pi_E\{\sigma_p\{\pi_{ABE}(R)\}\}$

  47. But what if we have A, B indexes? Look up A = 3 in the A index and B = “cat” in the B index, then intersect pointers to get pointers to matching tuples
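
Pointer intersection can be sketched with dictionaries of row positions standing in for the two indexes. The rows of R(A,B,C,D,E) below are invented for illustration, and a list position plays the role of a tuple pointer.

```python
from collections import defaultdict

# Hypothetical rows of R(A, B, C, D, E); the list position acts as the pointer.
rows = [
    (3, "cat", 1, 1, "e0"),
    (3, "dog", 2, 2, "e1"),
    (5, "cat", 3, 3, "e2"),
    (3, "cat", 4, 4, "e3"),
]

def build_index(rows, position):
    """Map each attribute value to the set of pointers of rows holding it."""
    index = defaultdict(set)
    for pointer, row in enumerate(rows):
        index[row[position]].add(pointer)
    return index

index_a = build_index(rows, 0)   # stand-in for the index on A
index_b = build_index(rows, 1)   # stand-in for the index on B

# Intersect the two pointer sets before fetching any tuple.
pointers = index_a.get(3, set()) & index_b.get("cat", set())
result = sorted(rows[ptr][4] for ptr in pointers)   # project E from matching tuples
print(sorted(pointers), result)
```

With both indexes available, projecting onto A, B, E first would only add work, since the full tuples are fetched once, and only for the pointers in the intersection.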

  48. Bottom line: • No transformation is always good • Usually good: early selections

  49. In textbook: more transformations • Eliminate common sub-expressions • Other operations: duplicate elimination

  50. Outline - Query Processing
      • Relational algebra level
        • transformations
        • good transformations
      • Detailed query plan level
        • estimate costs
        • generate and compare plans
