1 / 52

LR ~ Bottom-Up ~ Shift-Reduce - concluded Dotted Items Building the Handles DFA

CSE P501 – Compilers. LR ~ Bottom-Up ~ Shift-Reduce - concluded Dotted Items Building the Handles DFA Building Action & Goto Tables Building the Handles DFA: Theory Nullable , FIRST & FOLLOW SLR, LALR, LR Next. Recap: LR State Machine. Problem: when to reduce? when to shift?

camdyn
Download Presentation

LR ~ Bottom-Up ~ Shift-Reduce - concluded Dotted Items Building the Handles DFA

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. CSE P501 – Compilers • LR ~ Bottom-Up ~ Shift-Reduce - concluded • Dotted Items • Building the Handles DFA • Building Action & Goto Tables • Building the Handles DFA: Theory • Nullable, FIRST & FOLLOW • SLR, LALR, LR • Next Jim Hogg - UW - CSE - P501

  2. Recap: LR State Machine • Problem: when to reduce? when to shift? • Clairvoyance - foretelling the future - tricky algorithm! • Backtrack and try another path - slow and cumbersome • DFA for prefixes - great solution, but still magical • Action & Goto tables - encoding of 3, so still magical • Idea: devise algorithm for 3&4 - DFA to recognize handles • Language generated by a CFG is generally not regular (cannot be generated by regex, as we saw earlier) • But language of handles for a CFG is regular • So use DFA to recognize those handles • if (DFA accepts) then REDUCE else SHIFT Jim Hogg - UW - CSE - P501

  3. Recap: Prefixes, Handles, Items • S is start symbol of a grammar G • If S =>*  then  is a sentential form of G •  is a viable prefix of G if there is some derivation: S =>*rm Aw =>*rm w and  is a prefix of  • A viable prefix is a prefix of a sentential form in a rightmost derivation • The occurrence of  in w is a handle of w Jim Hogg - UW - CSE - P501

  4. Dotted Items An itemis a production containing a  at some position in its RHS • eg: [A   X Y ] [A  X Y ] [A  X Y ] The  shows how much of the token stream we have seen so far Jim Hogg - UW - CSE - P501

  5. Example Grammar 0. Z  S $ ($ denotes end-of-file) 1. S  ( L ) 2. S  x 3. L  S 4. L  L , S • We have added a production Z with the original start symbol followed by end of file marker, $ • What language does this grammar generate? Jim Hogg - UW - CSE - P501

  6. Handles DFA - Opening Skirmish 0. Z  S$ 1. S  (L ) 2. S  x 3. L  S 4. L  L , S Initially • Stack is empty • Input is the right hand side of Z (ie, S$) • Initial configuration is [Z  S$] • As before  marks what we've seen so far - initially, nothing • But, since  is just before S, we might expect to next see anything that can be derived from S. So: • just before ( or x Jim Hogg - UW - CSE - P501

  7. Initial Items State 0. Z  S$ 1. S  ( L ) 2. S  x 3. L  S 4. L  L , S • A state is just a set of items • Start: an initial set of items • Completion (or closure): additional productions whose LHS appears immediately to the right of the dot in some item already in the state Z  S $ S  ( L ) S  x start item, or core Completion, or closure, of core Jim Hogg - UW - CSE - P501

  8. Shift Action on x 0. Z  S$ 1. S  ( L ) 2. S  x 3. L  S 4. L  L , S • If we shift and see x, we will transition to a state containing the item S  x  • So, add a new state with that item • In this case, closure of [S  x] adds nothing • If we hit this state during parse, we will reduce - no alternative! Z  S $ S  ( L ) S  x x S  x  Jim Hogg - UW - CSE - P501

  9. Shift Action on ( 0. Z  S$ 1. S  ( L ) 2. S  x 3. L  S 4. L  L , S • If, instead, we shift and see ( we obtain item S  (  L ) • Its closure adds another 4 items, as shown S  (L ) L  L , S L  S S  ( L ) S  x Z  S $ S  ( L ) S  x ( Jim Hogg - UW - CSE - P501

  10. Goto Actions 0. Z  S$ 1. S  ( L ) 2. S  x 3. L  S 4. L  L , S • If we instead shift S, we pop the RHS from the stack exposing the first state. Add a Goto transition on S for this. Z  S $ S  ( L ) S  x S Z  S $ Jim Hogg - UW - CSE - P501

  11. Building the Handles DFA 0. S’ S $ 1. S  ( L ) 2. S  x 3. L  S 4. L  L , S S’  S $ S  ( L ) S  x 1 Jim Hogg - UW - CSE - P501

  12. Build Handles DFA – Step 2 0. S’ S $ 1. S  ( L ) 2. S  x 3. L  S 4. L  L , S S’  S $ S  ( L ) S  x 1 S 4 S’ S  $ Jim Hogg - UW - CSE - P501

  13. Build Handles DFA – Step 3 0. S’ S $ 1. S  ( L ) 2. S  x 3. L  S 4. L  L , S 2 S’  S $ S  ( L ) S  x x 1 S x  S 4 S’ S  $ Jim Hogg - UW - CSE - P501

  14. Build Handles DFA – Step 4 0. S’ S $ 1. S  ( L ) 2. S  x 3. L  S 4. L  L , S 2 S’  S $ S  ( L ) S  x x 1 S x  3 ( S S (  L ) L   S L   L , S S   ( L ) S   x 4 S’ S  $ Jim Hogg - UW - CSE - P501

  15. Build Handles DFA – Step 5 0. S’ S $ 1. S  ( L ) 2. S  x 3. L  S 4. L  L , S 2 S’  S $ S  ( L ) S  x x 1 S x  x ( S S (  L ) L   S L   L , S S   ( L ) S   x 4 S’ S  $ 3 Jim Hogg - UW - CSE - P501

  16. Build Handles DFA – Step 6 0. S’ S $ 1. S  ( L ) 2. S  x 3. L  S 4. L  L , S 2 S’  S $ S  ( L ) S  x x 1 S x  x ( S S (  L ) L   S L   L , S S   ( L ) S   x 4 S’ S  $ 3 S L  S  7 Jim Hogg - UW - CSE - P501

  17. Build Handles DFA – Step 7 0. S’ S $ 1. S  ( L ) 2. S  x 3. L  S 4. L  L , S 2 S’  S $ S  ( L ) S  x x 1 S x  x ( S ( S (  L ) L   S L   L , S S   ( L ) S   x 4 S’ S  $ 3 S L  S  7 Jim Hogg - UW - CSE - P501

  18. Build Handles DFA – Step 8 0. S’ S $ 1. S  ( L ) 2. S  x 3. L  S 4. L  L , S 2 S’  S $ S  ( L ) S  x x 1 S x  x ( S ( S (  L ) L   S L   L , S S   ( L ) S   x 4 S’ S  $ L S ( L  ) L  L  , S 5 3 S L  S  7 Jim Hogg - UW - CSE - P501

  19. Build Handles DFA – Step 9 0. S’ S $ 1. S  ( L ) 2. S  x 3. L  S 4. L  L , S 2 S’  S $ S  ( L ) S  x x 1 S x  x ( S ( S (  L ) L   S L   L , S S   ( L ) S   x 4 S’ S  $ L S ( L  ) L  L  , S 5 3 ) S S ( L )  6 L  S  7 Jim Hogg - UW - CSE - P501

  20. Build Handles DFA – Step 10 0. S’ S $ 1. S  ( L ) 2. S  x 3. L  S 4. L  L , S 2 S’  S $ S  ( L ) S  x 8 x 1 S x  L  L ,  S S   ( L ) S   x x ( S ( , S (  L ) L   S L   L , S S   ( L ) S   x 4 S’ S  $ L S ( L  ) L  L  , S 5 3 ) S S ( L )  6 L  S  7 Jim Hogg - UW - CSE - P501

  21. Build Handles DFA – Step 11 0. S’ S $ 1. S  ( L ) 2. S  x 3. L  S 4. L  L , S 2 S’  S $ S  ( L ) S  x 8 x 1 S x  L  L ,  S S   ( L ) S   x 9 S L  L , S  x ( S ( , S (  L ) L   S L   L , S S   ( L ) S   x 4 S’ S  $ L S ( L  ) L  L  , S 5 3 ) S S ( L )  6 L  S  7 Jim Hogg - UW - CSE - P501

  22. Build Handles DFA – Step 12 0. S’ S $ 1. S  ( L ) 2. S  x 3. L  S 4. L  L , S 2 S’  S $ S  ( L ) S  x 8 x 1 S x  x L  L ,  S S   ( L ) S   x 9 S L  L , S  x ( S ( , S (  L ) L   S L   L , S S   ( L ) S   x 4 S’ S  $ L S ( L  ) L  L  , S 5 3 ) S S ( L )  6 L  S  7 Jim Hogg - UW - CSE - P501

  23. Build Handles DFA – Step 13 0. S’ S $ 1. S  ( L ) 2. S  x 3. L  S 4. L  L , S 2 S’  S $ S  ( L ) S  x 8 x 1 S x  x L  L ,  S S   ( L ) S   x 9 S L  L , S  x ( ( S ( , S (  L ) L   S L   L , S S   ( L ) S   x 4 S’ S  $ L S ( L  ) L  L  , S 5 3 ) S S ( L )  6 L  S  7 Jim Hogg - UW - CSE - P501

  24. Building Action & Goto Tables 2 S’  S $ S  ( L ) S  x 8 x 1 S x  x L  L ,  S S   ( L ) S   x 9 S L  L , S  x ( ( S ( , S (  L ) L   S L   L , S S   ( L ) S   x S ( L  ) L  L  , S 5 4 S’ S  $ L ) 3 S ( L )  6 S 0. S’ S $ 1. S  ( L ) 2. S  x 3. L  S 4. L  L , S L  S  7

  25. Building the Handles DFA: Theory • Closure(S) • S is a state in the Handles-DFA • Closure adds all items implied by items already in S • Goto (S, A) • S is a state in the Handles-DFA (set-of-items) • A is a grammar symbol (single non-terminal) • Goto moves the dot past symbol A in all appropriate items in S Jim Hogg - UW - CSE - P501

  26. Closure Algorithm fun Closure(S) = // S is a state in the Handles-DFA repeat foreachitem = [AB] in S do// B  N foreachproduction B do S = [B] enddo enddo until S does not change return S // update-in-place endfun Adds all the items to S that it should have: given B we are about to see a token that starts B; equally well, a token that starts any production of B Jim Hogg - UW - CSE - P501

  27. Goto Algorithm funGoto(S, X) = // S is node in DFA = set-of-Items // X is a terminal or non-terminal new = {} foreachitem = [AX] in S do new = [AX] enddo return Closure(new) endfun • How to find new, or existing, DFA state we reach on starting in state S and reading X. • We view S as a set-of-items Jim Hogg - UW - CSE - P501

  28. LR(0) Construction • Our LR Parse Table and Algorithm is based only on what we can see on the parse stack; so far, we ignore lookahead • So our grammar is restricted to LR(0) • Augment grammar with extra start production S’  S $ • Let N be the set of states (Nodes in DFA) • Let E be the set of edges • Initialize N to Closure( [S’  S $] ) • Initialize E to empty Jim Hogg - UW - CSE - P501

  29. LR(0) Construction Algorithm repeat foreachstate S in DFA do foreachitem [A X] in S do new = Goto (S, X) S = new E = I x new enddo enddo until E and S do not change View S (state in DFA) as a set-of-items. So S = new For symbol $, don’t compute Goto(I, $); instead, make this an accept action. Jim Hogg - UW - CSE - P501

  30. Building the Parse Tables (1) foreachedge = I x J do if X  T then Action[I, X] = sj // shift and goto state j if X  N then Goto[I, X] = gj // goto state j enddo Key T = set of Terminals N = set of NonTerminals Jim Hogg - UW - CSE - P501

  31. Building the Parse Tables (2) foreachstate S do foreachitem = [S'  S$] do Action[I, $] = accept enddo foreachitem = [A  ] do Action[I, *] = rj, where j is production number for A   enddo enddo Jim Hogg - UW - CSE - P501

  32. Where have we reached? We have built LR(0) DFA and Parser Tables (Action & Goto) • No lookahead yet • Variations of LR parsers add lookahead info, but basic idea of states, closures, and edges remains the same Jim Hogg - UW - CSE - P501

  33. A Grammar that is not LR(0) • We cannot parse the grammar below using LR(0), because it produces a Shift-Reduce conflict: • S  E $ • E  T + E • E  T • T  x (This grammar generates strings: x, x + x, x + x + x, etc) Jim Hogg - UW - CSE - P501

  34. 0. S E $ 1. E T + E 2. E T 3. T x LR(0) Parser for 2 1 E S  E $ S   E $ E  T + E E  T T  x 3 T E  T + E E  T  x + T 5 4 x T  x  E  T + E E  T + E E  T T  x 6 • SR conflict in State 3 • So grammar is not LR(0) E E  T + E  Jim Hogg - UW - CSE - P501

  35. SLR Parsers • Idea: Use information about what can follow a non-terminal to decide if we should perform a reduction • Easiest form is SLR – "Simple LR" • We need to compute FOLLOW(A) – the set of symbols that can immediately follow A in any possible derivation • ie, terminal t is in FOLLOW(A) if any derivation contains At • To compute this, we need to compute FIRST() for strings  that can follow A Jim Hogg - UW - CSE - P501

  36. Nullable, FIRST& FOLLOW • Nullable(A) is true if A can derive the empty string • ie, we can find A =>*  • FIRST(A)= set of Terminals that can begin strings derived from A • FIRST() = set of Terminals that can begin strings derived from  • FOLLOW(A) is the set of terminals that can immediately follow A • Recall standard notation: • A  N // Non-terminal •   (N  T)* // String of Non-terminals and Terminals Jim Hogg - UW - CSE - P501

  37. Computing FIRST: Intuition • To calculate FIRST(): • if  = A then FIRST() = FIRST(A), right? • Yes, but what if A has an epsilon production: A   • Then we need to union-in FOLLOW(A) (in this case, equal to FIRST()) Jim Hogg - UW - CSE - P501

  38. Example Z  d Z  X Y Z Y  ε Y  c X  Y X  a Jim Hogg - UW - CSE - P501

  39. Example Nullable? FIRST FOLLOW Z  d Z  X Y Z Y  ε Y  c X  Y X  a Iterate thru productions, until tables stop changing Jim Hogg - UW - CSE - P501

  40. Nullable, FIRST & FOLLOW: Intuition • To compute, start by: •  A  N, FIRST(A) = FOLLOW(A) = {} •  A  N, Nullable(A) = false •  a  T, FIRST(a) = {a} • Rules: • if A a then FIRST(A) = a • if A Y1 and Y1 not Nullable then FIRST(A) = FIRST(Y1) • if A Y1Y2 and Y1 is Nullable then FIRST(A) = FIRST(Y2) • if A Y1Y2Y3 and Y1,Y2 both Nullable then FIRST(A) = FIRST(Y3) • if A Y1..Yn and all Yi are Nullable then FIRST(A) =  a  T // Terminal A  N // Non-terminal Y  (N  T)// Terminal or Non-terminal   (N  T)* // String of Terminals and Non-terminals Jim Hogg - UW - CSE - P501

  41. Computing Nullable Nullable(A) is true if A can derive the empty string. ie, we can find A =>*  foreachA  N doNullable(A) = false enddo repeat foreachA  Y1..Yk in P do if Y1..Yk are all NullablethenNullable(A) = true enddo untilNullable does not change Nullable :: N  Bool • Intuition: • Suppose A  XY then we can derive A =>*  only if both X =>*  AND Y =>*  Jim Hogg - UW - CSE - P501

  42. Computing FIRST FIRST :: N  {T} foreach A  N do FIRST(A) = {} enddo repeat foreachA  Y1..Yk in P do FIRST(A) = FIRST(Y1) foreachiin [1..k-1] do ifNullable(Yi) then FIRST(A) = FIRST(Yi+1) else break enddo if all Y1..Yk are NullablethenFIRST(A) =  enddo until FIRST does not change Jim Hogg - UW - CSE - P501

  43. Computing FOLLOW FOLLOW :: N  {T} foreachA  N do FOLLOW(A) = {} enddo repeat foreachA  Y1..Yk in P do foreachiin [1..k-1] do if Yi  N then FOLLOW(Yi) = FIRST(Yi+1) enddo enddo until FOLLOW does not change • Intuition: • We cannot tell anything about FOLLOW(A) by looking at A's productions • But, given A  Y1..Yk we can infer FOLLOW(Yi). • Eg: AaBcdEFg => FOLLOW(B) = c; FOLLOW(E) = FIRST(F); etc Jim Hogg - UW - CSE - P501

  44. SLR Construction • Identical to LR(0) states. But refine the reduce actions • Algorithm: foreachstate S in DFA do foreachitem [A] in S do reduce by A onlyiflookahead  FOLLOW(A) enddo enddo Jim Hogg - UW - CSE - P501

  45. 0. S E $ 1. E T + E 2. E T 3. T x LR(0) Parser for 1 2 E S  E $ S   E $ E  T + E E  T T  x 3 T E  T + E E  T  x + T 5 4 x T  x  E  T + E E  T + E E  T T  x By taking account of lookahead, we can eliminate 5 reduce actions 6 E E  T + E  Jim Hogg - UW - CSE - P501

  46. On To LR(1) • Many practical grammars are SLR • LR(1) is even more powerful • Similar construction for LR(1), but notion of an item is more complex, incorporating lookahead information Jim Hogg - UW - CSE - P501

  47. LR(1) Items • An LR(1) item [A , a] is • A grammar production (A  ) • A right hand side position (the dot) • A lookahead symbol (a) • Idea: This item indicates that  is the top of the stack and the next input is derivable from a • Full construction: see Cooper&Torczon Jim Hogg - UW - CSE - P501

  48. LR(1) Tradeoffs • LR(1) • Pro: extremely precise; largest set of grammars / languages • Con: potentially very large parse tables with many states Jim Hogg - UW - CSE - P501

  49. LALR(1) • Variation of LR(1), but merge any two states that differ only in lookahead • Example: these two would be merged [A ::= x , a] [A ::= x , b] Jim Hogg - UW - CSE - P501

  50. LALR(1) vs LR(1) • LALR(1) tables can have many fewer states than LR(1) • LALR(1) may have reduce conflicts where LR(1) would not (but in practice this doesn’t happen often) • Most practical bottom-up parser tools are LALR(1)(egyacc, bison, CUP) Jim Hogg - UW - CSE - P501

More Related