1 / 78

LR Parsing Build Bottom-Up Parse Tree Handles Writing a Shift-Reduce Parser

CSE P501 – Compilers. LR Parsing Build Bottom-Up Parse Tree Handles Writing a Shift-Reduce Parser ACTION & GOTO Parse Tables Dotted Items SR & RR conflicts Next. LR Parsing. ‘Middle End’. Back End. Target. Source. Front End. chars. IR. IR. Scan. Select Instructions. Optimize.

nami
Download Presentation

LR Parsing Build Bottom-Up Parse Tree Handles Writing a Shift-Reduce Parser

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. CSE P501 – Compilers LR Parsing Build Bottom-Up Parse Tree Handles Writing a Shift-Reduce Parser ACTION & GOTO Parse Tables Dotted Items SR & RR conflicts Next Jim Hogg - UW - CSE - P501

  2. LR Parsing ‘Middle End’ Back End Target Source Front End chars IR IR Scan Select Instructions Optimize tokens IR Allocate Registers Parse IR AST Emit Convert IR IR Machine Code AST = Abstract Syntax Tree IR = Intermediate Representation Jim Hogg - UW - CSE P501

  3. S aABe A  Abc| b B  d a b b c d e • Build Bottom-Up Parse Tree Black dot marks how much we've read of the input token stream so far (none) Shift dot to right Jim Hogg - UW - CSE - P501

  4. S aABe A  Abc| b B  d a b b c d e abbcde – done wrong – step 2 Can we reduce a ? No. Shift dot to right Jim Hogg - UW - CSE - P501

  5. S aABe A  Abc| b B  d a b  b c d e abbcde – done wrong – step 3 Can we reduce a or ab? Yes, using A  b. Note: a b (in red) marks current frontier Jim Hogg - UW - CSE - P501

  6. S aABe A  Abc| b B  d A | a b  b c d e abbcde – done wrong – step 4 Note: frontier is now aA Can we reduce aA or A ? No. Shift dot Jim Hogg - UW - CSE - P501

  7. S aABe A  Abc| b B  d A | a b b c d e abbcde – done wrong – step 5 Can we reduce aAb or Ab or b? Yes, using A  b Jim Hogg - UW - CSE - P501

  8. AA | | a b b c d e abbcde – done wrong – step 6 S  aABe A  Abc| b B  d Can we reduce aAA or AA or A? No. Shift dot Jim Hogg - UW - CSE - P501

  9. AA | | a b bc d e abbcde – done wrong – step 7 S  aABe A  Abc| b B  d Can we reduce aAAc or AAc or Ac or c? No. Shift dot Jim Hogg - UW - CSE - P501

  10. AA | | a b bcd e abbcde – done wrong – step 8 S  aABe A  Abc| b B  d Can we reduce aAAcdor AAcdor Acdor ddord? No. Shift dot Jim Hogg - UW - CSE - P501

  11. S  a A B e A  A b c| b B  d AA | | a b bcde abbcde – done wrong – step 9 Can we reduce aAAcdeor AAcdeor Acdeor cde or de or e? No. We are stuck! Jim Hogg - UW - CSE - P501

  12. Rewind • Rewind to step 5 • Let's not reduce using A  b. Shift instead • Try again! Jim Hogg - UW - CSE - P501

  13. A | a b b c d e abbcde – done right – step 5 S  aABe A  Abc| b B  d Can we reduce aAb or Ab or b? Yes, we could, using A  b. But it led to a dead-end, first time. So do not reduce. Instead, shift Jim Hogg - UW - CSE - P501

  14. A | a b bc d e abbcde – done right – step 6 S  aABe A  Abc| b B  d Can we reduce aAbc or Abc or bc or c? Yes, using A  Abc. Jim Hogg - UW - CSE - P501

  15. A | A | a b b c  d e abbcde – done right – step 7 S  aABe A  Abc| b B  d Can we reduce aAorA? No. Shift Jim Hogg - UW - CSE - P501

  16. A | A | a b b c d e abbcde – done right – step 8 S  aABe A  Abc| b B  d Can we reduce aAd or Ad ord? Yes, using B  d Jim Hogg - UW - CSE - P501

  17. A | A B | | a b b c d  e abbcde – done right – step 9 S  aABe A  Abc| b B  d Can we reduce aAB or AB? No. Shift Jim Hogg - UW - CSE - P501

  18. A | A B | | a b b c d e abbcde – done right – step 10 S aABe A  Abc| b B  d Can we reduce aABe or ABe or Be or e ? Yes, using S aABe Jim Hogg - UW - CSE - P501

  19. S A | A B | | a b b c d e abbcde – done right – step 11 S aABe A  Abc| b B  d We just executed a rightmost derivation, backwards: S => a AB e => a A d e => a A b c d e => a b b c d e Jim Hogg - UW - CSE - P501

  20. LR(1) Parsing • Left to right scan; Rightmost derivation; 1 token lookahead • "Bottom-Up" approach. Also called Shift-Reduce parser • The syntax of almost all practical programming languages can be specified by an LR(1) grammar • LALR(1) and SLR are subsets of LR(1) • LALR(1) can parse most real languages, has a smaller memory footprint, and is used by CUP • All variants (SLR, LALR, LR) use same algorithm – but different driver tables Jim Hogg - UW - CSE - P501

  21. LR Parsing in Greek • Bottom-up parser builds a rightmost derivation, backwards • Given the rightmost derivation: • S =>1=>2=>…=>n-2=>n-1=>n = w parser will first discover n-1=>n , then n-2=>n-1 , etc • But it discovers n-1=>n before seeing all of n: • S => a AB e => a A d e => a A b c d e => a b b c d e • X denotes rightmost terminal to derive • S <= a A B e  <= a Ad  e <= a A b c  d e <= a b b c d e •  denotes handle •  denotes top-of-stack = end of input so far seen • Parsing terminates when • 1 reduced to S (start symbol, success), or • No match can be found (syntax error) Jim Hogg - UW - CSE - P501

  22. Terminology : Sentential Forms • If S =>* , the string  is called a sentential form of the of the grammar (not yet a sentence, but on its way) • In the derivation S =>1=>2=>…=>n-2=>n-1=>n = w each of the iare sentential forms • A sentential form in a rightmost derivation is called a rightsentential form Jim Hogg - UW - CSE - P501

  23. Handles Informally, a handle is a substring of the frontier that matches the RHS of the "correct " production • Even ifA   is a production,  is a handle only if it matches the frontier at a point where A   was used in the derivation. So, it’s a handle if we should reduce by it (yes, this definition is circular) •  may appear in many other places in the frontier without being a handle for that particular production Jim Hogg - UW - CSE - P501

  24. Handle – the Dragon Definition Formally, a handleof a right-sentential form  is a production A   and a position in  where  may be replaced by A to produce the previous right-sentential form in the rightmost derivation of  Jim Hogg - UW - CSE - P501

  25. Handle Examples In the derivation: S => a A B e => a Ad e => a Ab c d e => abbcde • abbcde is a right sentential form whose handle is Ab at position 2 • aAbcde is a right sentential form whose handle is AAbc at position 4 • Note: some books take the left of the match as the position – but it really doesn't matter Jim Hogg - UW - CSE - P501

  26. Writing a Shift-Reduce Parser Key Data structures • A stack holding the frontier of the tree • The token-stream of remaining input Jim Hogg - UW - CSE - P501

  27. Shift-Reduce Parser Operations • Reduce– if the top of stack is a handle (RHS of some A that we should use to reduce), pop , push A • Shift– push the next input symbol onto the stack • Accept– announce success • Error– syntax error discovered Jim Hogg - UW - CSE - P501

  28. S aABe A Abc| b B d Shift-Reduce Example – Step 0 Stack Input Action $ abbcde$ shift • Note: • “$” marks bottom-of-stack • “$” also marks end-of-input • Neither one takes part in the parse. They don’t move. Jim Hogg - UW - CSE - P501

  29. S aABe A Abc| b B d Shift-Reduce Example – Step 1 Stack Input Action $ abbcde$ shift $a bbcde$ shift • At each step, look for a handle at top-of-stack • If we find a handle, then reduce • If we don’t find a handle, then shift • We are relying on clairvoyance - foretelling the future - to decide which RHSs at top-of-stack are handles Jim Hogg - UW - CSE - P501

  30. S aAB e A Abc| b B d Shift-Reduce Example Stack Input Action 0 $abbcde$ shift 1 $abbcde$ shift 2 $abbcde$ reduce 3 $aAbcde$ shift 4 $aAbcde$ shift 5 $aAbcde$ reduce 6 $aAde$ shift 7 $aAde$ reduce 8 $aABe$ shift 9 $aABe$ reduce 10 $S$ accept Jim Hogg - UW - CSE - P501

  31. How Do We Automate This? • Lacking a clairvoyance function, we could resort to back-tracking. But it's too slow. It's a non-starter • Viable prefix– a prefix of a right-sentential form that can appear on the stack of the shift-reduce parser (on its way to a successful parse) • Idea: Construct a DFA to recognize viable prefixes given the stack and (one or two tokens from) remaining input • Perform reductions when we recognize handles Jim Hogg - UW - CSE - P501

  32. S' S$ S aABe A Abc| b B d DFA for prefixes for: e accept 8 9 S  a A B e B $ a A b c start 1 2 3 6 7 A  A b c b d 4 5 A  b B  d • We have augmented the grammar with a unique start symbol, S’ • This DFA replaces our clairvoyance function – equally magical at this point! • States 4,5,7,9 of this DFA define the handles • Eg: if stack is …aAbc then reduce using A  Abc (always at top of stack); then unwind, back to state 1 Jim Hogg - UW - CSE - P501

  33. Stack Input $ abbcde$ e accept 8 9 S  aABe B $ a start A b c 1 2 3 6 7 A  Abc b d 4 5 A  b B  d Trace – Step 1 S' S$ S aABe A Abc| b B d • Stack = { } • DFA = state 1; not a “reduce” state, {4,5,7,9}, so shift Jim Hogg - UW - CSE - P501

  34. Stack Input $a bbcde$ e accept 8 9 S  aABe B $ a start A b c 1 2 3 6 7 A  Abc b d 4 5 A  b B  d Trace – Step 2 S' S$ S aABe A Abc| b B d • Stack = a • DFA = 2; not a “reduce” state, so shift Jim Hogg - UW - CSE - P501

  35. Stack Input $ab bcde$ e accept 8 9 S  aABe B $ a start A b c 1 2 3 6 7 A  Abc b d 4 5 A  b B  d Trace – Step 3 S' S$ S aABe A Abc| b B d • Stack = ab • DFA = 4 => reduce, using production A b • So, pop b (RHS of production); and push A (LHS or production) • Retrace to DFA state 1 Jim Hogg - UW - CSE - P501

  36. Stack Input $aAbcde$ e accept 8 9 S  aABe B $ a start A b c 1 2 3 6 7 A  Abc b d 4 5 A  b B  d Trace – Step 4 S' S$ S aABe A Abc| b B d • Stack = aA • So transition states 1  2  3 • DFA = 3 => shift Jim Hogg - UW - CSE - P501

  37. Stack Input $aAbcde$ e accept 8 9 S  aABe B $ a start A b c 1 2 3 6 7 A  Abc b d 4 5 A  b B  d Trace – Step 5 S' S$ S aABe A Abc| b B d • Stack = aAb • DFA = 6 => shift Jim Hogg - UW - CSE - P501

  38. Stack Input $aAbcde$ e accept 8 9 S  aABe B $ a start A b c 1 2 3 6 7 A  Abc b d 4 5 A  b B  d Trace – Step 6 S' S$ S aABe A Abc| b B d • Stack = aAbc • DFA = 7 => reduce by A  Abc • So, pop Abc push A • Retreat to state 1 Jim Hogg - UW - CSE - P501

  39. Stack Input $aAde$ e accept 8 9 S  aABe B $ a start A b c 1 2 3 6 7 A  Abc b d 4 5 A  b B  d Trace – Step 7 S' S$ S aABe A Abc| b B d • Stack = aA • DFA = 3 => shift Jim Hogg - UW - CSE - P501

  40. Stack Input $aAde$ e accept 8 9 S  aABe B $ a start A b c 1 2 3 6 7 A  Abc b d 4 5 A  b B  d Trace – Step 8 S' S$ S aABe A Abc| b B d • Stack = aAd • DFA = 5 => reduce by B  d • So, pop d, push B • Retreat to DFA = 1 Jim Hogg - UW - CSE - P501

  41. Stack Input $aABe$ e accept 8 9 S  aABe B $ a start A b c 1 2 3 6 7 A  Abc b d 4 5 A  b B  d Trace – Step 9 S' S$ S aABe A Abc| b B d • Stack = aAB • So transition states 1  2  3  8 • DFA = 8 => shift Jim Hogg - UW - CSE - P501

  42. Stack Input $aABe$ e accept 8 9 S  aABe B $ a start A b c 1 2 3 6 7 A  Abc b d 4 5 A  b B  d Trace – Step 10 S' S$ S aABe A Abc| b B d • Stack = aABe • DFA = 9 => reduce by S  aABe • So pop aABe, push S • Retreat to DFA = 1 Jim Hogg - UW - CSE - P501

  43. Stack Input $S $ e accept 8 9 S  aABe B $ a start A b c 1 2 3 6 7 A  Abc b d 4 5 A  b B  d Trace – Step 11 S' S$ S aABe A Abc| b B d • Stack = S • DFA = 1 => shift Jim Hogg - UW - CSE - P501

  44. Stack Input $S$ e accept 8 9 S  aABe B $ a start A b c 1 2 3 6 7 A  Abc b d 4 5 A  b B  d Trace – Step 12 S' S$ S aABe A Abc| b B d • Stack = S$ • => accept Jim Hogg - UW - CSE - P501

  45. Cast out the Magic • We started with a magical clairvoyance function • We replaced this with, an equally magical, DFA • The DFA approach included too much repetition: • retreat to DFA = 1, then rescan the stack to find the new DFA state • we only replaced the handle with its NonTerminal LHS, so first part of stack is unchanged • Want the parser to run in linear time – proportional to total number of tokens • How do avoid repetition? • How to construct the magic DFA, for any grammar? Jim Hogg - UW - CSE - P501

  46. Avoiding DFA Rescanning • Observe: after a reduce, the contents of the stack are little altered: we replaced handle at top-of-stack with its LHS non-terminal • So, re-scanning the stack will step thru same DFA transitions as before, until the last one • So, record trace of DFA state numbers on stack to avoid the rescan Jim Hogg - UW - CSE - P501

  47. LR Stack • DFA pictures are nice, but we want a program to do it • Could change the stack to hold <state, token> pairs. Perhaps easier to understand and/or debug? • $ <s0,X0> <s1,X1> . . . <sn,Xn> • But, all we need are the states (think about it!) • on a reduce, pop top states - reduce rule tells us how many • then push corresponding LHS, non-terminal Jim Hogg - UW - CSE - P501

  48. S aABe A Abc A b B d Reminder - DFA for: e accept 8 9 S  a A B e B $ a A b c start 1 2 3 6 7 A  A b c b d 4 5 A  b B  d => shift => reduce Jim Hogg - UW - CSE - P501

  49. e accept 8 9 S  aABe B $ a start A b c 1 2 3 6 7 A  Abc b d 4 5 A  b B  d ACTION & GOTO Parse Tables S aABe A Abc A b B d Key sN = shift; transition to state number N rR = reduce using rule R gN = goto state number N acc = accept blank = syntax error in program

  50. DFA Transition Tables: Summary • ACTION => what to do after a shift • Row = current state • Column = next token (terminal) • sN = shift; move to DFA state number N • rR = reduce, using Rule number R • acc = accept the input program • blank = syntax error in input program • report, recover, continue • for P501, just report and stop! • GOTO=> what to do after a reduce • Row = current state (top-of-stack, after pushing non-terminal) – think of this as the uncovered state • Column = LHS of reduction (non-terminal) • gN = goto DFA state number N • blank = bug in the GOTO table Jim Hogg - UW - CSE - P501

More Related