CS412/413 Introduction to Compilers and Translators Spring ’99 Lecture 5: Bottom-up parsing
Outline
• Creating LL(1) grammars
• Limitations of LL(1) grammars
• Bottom-up parsing
• LR(0) parser construction
CS 412/413 Introduction to Compilers and Translators -- Spring '99 Andrew Myers
Administration
• Should have received mail about group assignments by now
• Homework 1 due next class (Friday)
• Monday considered 2 days late (-20%), Tuesday 3 days (-40%)
• No class next Monday (Feb 8)
Programming Assignment
• Due Monday, Feb 15
• Implement a lexer for Iota language
• Do not need to implement DFA construction
• Opportunity to work as group
• We expect high quality
Review
• Can construct recursive-descent parsers for LL(1) grammars
Language grammar → (how to perform this step?) → LL(1) grammar → predictive parse table → recursive-descent parser → recursive-descent parser w/ AST generation
Grammars
• Have been using grammar for language of “sums with parentheses”, e.g. (1+(3+4))+5
• Original grammar:
    S → S + E | E
    E → number | ( S )
• LL(1) grammar for same language:
    S → E S’
    S’ → ε | + S
    E → number | ( S )
Left-recursive vs Right-recursive
• Original grammar was left-recursive:
    S → S + E
    S → E
• LL(1) grammar is right-recursive: parsed top-down
    S → E S’
    S’ → ε | + S
• Left-recursive grammars don’t work with top-down parsing -- need an arbitrary amount of look-ahead
[Figure: parse trees for (...) + (...) + (...) + (...) ... under the right-recursive rule S → E + S and the left-recursive rule S → S + E; example input (1 + 2 + (3 + 4)) + 5]
How to create an LL(1) grammar
• Write a right-recursive grammar:
    S → E + S
    S → E
• Left-factor common prefixes, place suffix in new non-terminal:
    S → E S’
    S’ → ε
    S’ → + S
Right Recursion
• Right recursion : right-associative
• Left recursion : left-associative
[Figure: parse tree of (1 + 2 + (3 + 4)) + 5 under the right-recursive grammar S → E S’, next to the right-associative and left-associative trees of + nodes over 1, 2, 3, 4, 5]
Associativity
• We can provide left-associativity by massaging the recursive-descent code

void parse_S() {
    switch (token) {
        case '(':
        case number:
            parse_E(); parse_Sprime(); return;
        default:
            throw new ParseError();
    }
}

// Parses S’ (named parse_Sprime because S’ is not a legal identifier)
void parse_Sprime() {
    switch (token) {
        case '+':
            token = input.read(); parse_S(); return;
        case ')':
            return;   // S’ → ε
        case EOF:
            return;   // S’ → ε
        default:
            throw new ParseError();
    }
}
Associativity

void parse_S() {   // parses a sequence of E + E + E ...
    switch (token) {
        case '(':
        case number:
            parse_E();
            switch (token) {   // body of the S’ parser, inlined
                case '+':
                    token = input.read();
                    parse_S();   // tail recursion
                    return;
                case ')': return;
                case EOF: return;
                default: throw new ParseError();
            }
        default:
            throw new ParseError();
    }
}
Flattening Associative Operators

void parse_S() {   // parses an arbitrary sequence of E + E + E ...
    while (true) {
        switch (token) {
            case '(':
            case number:
                parse_E();
                switch (token) {
                    case '+': token = input.read(); break;   // go around the loop for the next E
                    case ')':
                    case EOF: return;
                    default: throw new ParseError();
                }
                break;
            default:
                throw new ParseError();
        }
    }
}

[Figure: flattened tree for (1 + 2 + (3+4)) + 5: the + node for 1 + 2 + (3+4) has three children, 1, 2, and the + node for 3+4]
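The loop-based parse_S above can be extended to build a tree as it goes. The following is a minimal sketch, not the course's implementation: it assumes single-character tokens, single-digit numbers, and a parenthesized string standing in for the AST, and the class name SumParser and its use of IllegalStateException in place of ParseError are hypothetical. It folds each new operand into the tree built so far, which gives a left-associative binary tree; collecting the operands in a list instead would give the flattened node of the figure.

```java
// Sketch only: single-character tokens, single-digit numbers, and a
// string-valued "AST" are simplifying assumptions.
class SumParser {
    private final String input;
    private int pos = 0;

    SumParser(String input) {
        this.input = input.replace(" ", "");
    }

    // Current token, with '$' standing in for EOF.
    private char token() {
        return pos < input.length() ? input.charAt(pos) : '$';
    }

    // Parses E + E + E ... with a loop, so the tree leans to the left.
    String parseS() {
        String left = parseE();
        while (true) {
            switch (token()) {
                case '+':
                    pos++;                                     // consume '+'
                    left = "(" + left + "+" + parseE() + ")";  // fold to the left
                    break;
                case ')':
                case '$':
                    return left;
                default:
                    throw new IllegalStateException("unexpected '" + token() + "' at " + pos);
            }
        }
    }

    // E -> number | ( S )
    String parseE() {
        char t = token();
        if (Character.isDigit(t)) {
            pos++;
            return String.valueOf(t);
        }
        if (t == '(') {
            pos++;
            String inner = parseS();
            if (token() != ')') {
                throw new IllegalStateException("expected ')' at " + pos);
            }
            pos++;
            return inner;
        }
        throw new IllegalStateException("unexpected '" + t + "' at " + pos);
    }

    // Parses a whole input string and checks that nothing is left over.
    static String parse(String text) {
        SumParser p = new SumParser(text);
        String tree = p.parseS();
        if (p.token() != '$') {
            throw new IllegalStateException("trailing input at " + p.pos);
        }
        return tree;
    }
}
```

With this sketch, SumParser.parse("1+2+3") yields ((1+2)+3): the grouping is to the left even though no left-recursive procedure is ever called.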
Summary
• Now have complete recipe for building a parser
Language grammar → LL(1) grammar → predictive parse table → recursive-descent parser → recursive-descent parser w/ AST generation
Bottom-up parsing
• A more powerful parsing technology
• LR grammars -- more power than LL
  • can handle left-recursive grammars, virtually all programming languages
• More natural expression of programming language syntax
• Shift-reduce parsers
  • automatic parser generators (e.g. yacc)
  • detect errors as soon as possible
  • allows better error recovery
Top-down parsing
S → S + E | E
E → number | ( S )
Input: (1+2+(3+4))+5
S → S+E → E+E → (S)+E → (S+E)+E → (S+E+E)+E → (E+E+E)+E → (1+E+E)+E → (1+2+E)+E → ...
• In left-most derivation, entire tree above a token (2) has been expanded when encountered
• Must be able to predict!
[Figure: parse tree for (1+2+(3+4))+5]
Bottom-up parsing
S → S + E | E
E → number | ( S )
• Right-most derivation -- backward
• Start with the tokens
• End with the start symbol
(1+2+(3+4))+5
← (E+2+(3+4))+5
← (S+2+(3+4))+5
← (S+E+(3+4))+5
← (S+(3+4))+5
← (S+(E+4))+5
← (S+(S+4))+5
← (S+(S+E))+5
← (S+(S))+5
← (S+E)+5
← (S)+5
← E+5
← S+5
← S+E
← S
Bottom-up parsing
S → S + E | E
E → number | ( S )
right-most derivation (read upward)    input (scanned part, then unscanned part)
(1+2+(3+4))+5      (1+2+(3+4))+5
(E+2+(3+4))+5      (1 +2+(3+4))+5
(S+2+(3+4))+5      (1 +2+(3+4))+5
(S+E+(3+4))+5      (1+2 +(3+4))+5
(S+(3+4))+5        (1+2 +(3+4))+5
(S+(E+4))+5        (1+2+(3 +4))+5
(S+(S+4))+5        (1+2+(3 +4))+5
(S+(S+E))+5        (1+2+(3+4 ))+5
(S+(S))+5          (1+2+(3+4 ))+5
(S+E)+5            (1+2+(3+4) )+5
(S)+5              (1+2+(3+4) )+5
E+5                (1+2+(3+4)) +5
S+5                (1+2+(3+4)) +5
S+E                (1+2+(3+4))+5
S                  (1+2+(3+4))+5
Bottom-up parsing
S → S + E | E
E → number | ( S )
• (1+2+(3+4))+5 ← (E+2+(3+4))+5 ← (S+2+(3+4))+5 ← (S+E+(3+4))+5 …
• Advantage of bottom-up parsing: can select productions based on more information
[Figure: parse tree for (1+2+(3+4))+5]
Top-down vs. Bottom-up
• Bottom-up: don’t need to figure out as much of the parse tree for a given amount of input
[Figure: two parse trees, labeled Top-down and Bottom-up, each divided into scanned and unscanned input]
Shift-reduce parsing
• Parsing is a sequence of shift and reduce operations
• Parser state is a stack of terminals and non-terminals (grows to the right)
• Unconsumed input is a string of terminals
• Current derivation step is always stack+input
• Shift -- push head of input onto stack
    stack    input
    (        1+2+(3+4))+5
    (1       +2+(3+4))+5
Reduce
• Replace symbols γ in top of stack with non-terminal symbol X, corresponding to production X → γ (pop γ, push X)
    stack    input
    (S+E     +(3+4))+5
        reduce S → S+E
    (S       +(3+4))+5
• What effect does this have on derivation?
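The two operations can be sketched directly on a list used as a stack. This is only an illustration under stated assumptions: symbols are plain strings, the class name ShiftReduceStack is hypothetical, and the caller decides when to shift and when to reduce.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch only: symbols are plain strings and the caller chooses each action.
class ShiftReduceStack {
    final List<String> stack = new ArrayList<>();   // grows to the right
    final List<String> input;                       // unconsumed terminals

    ShiftReduceStack(String... tokens) {
        input = new ArrayList<>(Arrays.asList(tokens));
    }

    // Shift: push the head of the input onto the stack.
    void shift() {
        stack.add(input.remove(0));
    }

    // Reduce by lhs -> rhs: the top of the stack must equal rhs; pop it, push lhs.
    void reduce(String lhs, String... rhs) {
        int start = stack.size() - rhs.length;
        if (start < 0 || !stack.subList(start, stack.size()).equals(Arrays.asList(rhs))) {
            throw new IllegalStateException("top of stack does not match " + Arrays.toString(rhs));
        }
        stack.subList(start, stack.size()).clear();
        stack.add(lhs);
    }

    // The current derivation step is always stack + input.
    String sententialForm() {
        return String.join("", stack) + String.join("", input);
    }
}
```

For the input ( 1 + 2 ), the sequence shift, shift, reduce("E", "1"), reduce("S", "E"), shift, shift, reduce("E", "2") leaves the sentential form (S+E), and reduce("S", "S", "+", "E") then turns it into (S). Each reduce undoes one step of the right-most derivation.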
Shift-reduce parsing
S → S + E | E
E → number | ( S )
derivation       stack   input stream     action
(1+2+(3+4))+5            (1+2+(3+4))+5    shift
(1+2+(3+4))+5    (       1+2+(3+4))+5     shift
(1+2+(3+4))+5    (1      +2+(3+4))+5      reduce E → num
(E+2+(3+4))+5    (E      +2+(3+4))+5      reduce S → E
(S+2+(3+4))+5    (S      +2+(3+4))+5      shift
(S+2+(3+4))+5    (S+     2+(3+4))+5       shift
(S+2+(3+4))+5    (S+2    +(3+4))+5        reduce E → num
(S+E+(3+4))+5    (S+E    +(3+4))+5        reduce S → S+E
(S+(3+4))+5      (S      +(3+4))+5        shift
(S+(3+4))+5      (S+     (3+4))+5         shift
(S+(3+4))+5      (S+(    3+4))+5          shift
(S+(3+4))+5      (S+(3   +4))+5           reduce E → num
Problem
• How do we know which action to take -- whether to shift or reduce, and which production?
• Sometimes can reduce but shouldn’t
  • e.g., X → ε can always be reduced
• Sometimes can reduce in different ways
Action Selection Problem
• Given stack σ and input symbol b, should we
  • shift b onto the stack (making it σb)
  • reduce some production X → γ, assuming that stack has the form αγ (making it αX)
• Should apply reduction X → γ depending on what stack prefix α is -- but α is different for different possible reductions, since γ’s have different length. How to keep track?
Parser States
• Idea: summarize all possible stack prefixes α as a parser state
• A state transition function updates the parser state as shifts and reductions are performed: DFA
• Summarizing discards information
  • affects what grammars parser handles
  • affects size of DFA (number of states)
LR(0) parser
• Left-to-right scanning, Right-most derivation, zero look-ahead tokens
• Too weak to handle most language grammars (including this one)
• But will help us understand how to build better parsers
LR(0) states
• A state is a set of items
• An LR(0) item is a production from the language with a separator “.” somewhere in the RHS of the production
• Stuff before “.” already on stack (beginnings of possible γ’s)
• Stuff after “.” : what we might see next
• The prefixes α are represented by the state
    state:  E → number .     (item)
            E → ( . S )      (item)
An LR(0) grammar: non-empty lists
S → ( L )
S → id
L → S
L → L , S
Example sentences: x   (x,y)   (x, (y,z), w)   ((((x))))   (x, (y, (z, w)))
Closure
S → ( L ) | id
L → S | L , S
Start state: Closure({ S’ → . S $ }) = { S’ → . S $,  S → . ( L ),  S → . id }
• Closure of a state adds items for all productions whose LHS occurs in an item in the state, just after “.”
• Added items have the “.” located at the beginning
• Like NFA → DFA conversion
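The closure rule can be written as a small worklist loop. The following is a sketch under stated assumptions rather than a reference implementation: items are encoded as strings such as "S -> . ( L )" with space-separated symbols, and the class name Lr0Closure and its helper names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.Set;

// Sketch only: items are strings such as "S -> . ( L )", symbols separated
// by spaces; this encoding is an assumption made for brevity.
class Lr0Closure {
    // Grammar from the slide, each row is {LHS, RHS symbols...}.
    static final String[][] PRODUCTIONS = {
        {"S'", "S", "$"},
        {"S", "(", "L", ")"},
        {"S", "id"},
        {"L", "S"},
        {"L", "L", ",", "S"},
    };

    // Symbol immediately after the dot, or null if the dot is at the end.
    static String symbolAfterDot(String item) {
        String[] parts = item.split(" ");
        for (int i = 0; i < parts.length - 1; i++) {
            if (parts[i].equals(".")) {
                return parts[i + 1];
            }
        }
        return null;
    }

    // Item for production p with the dot at the beginning of the RHS.
    static String initialItem(String[] p) {
        StringBuilder sb = new StringBuilder(p[0]).append(" -> .");
        for (int i = 1; i < p.length; i++) {
            sb.append(' ').append(p[i]);
        }
        return sb.toString();
    }

    // Adds items for every production whose LHS appears just after a dot.
    static Set<String> closure(Set<String> kernel) {
        Set<String> result = new LinkedHashSet<>(kernel);
        Deque<String> work = new ArrayDeque<>(kernel);
        while (!work.isEmpty()) {
            String next = symbolAfterDot(work.remove());
            for (String[] p : PRODUCTIONS) {
                if (p[0].equals(next)) {
                    String item = initialItem(p);
                    if (result.add(item)) {   // newly added items are processed too
                        work.add(item);
                    }
                }
            }
        }
        return result;
    }
}
```

Taking the closure of the single item S' -> . S $ gives the three items of the start state; taking the closure of S -> ( . L ) gives five items, because adding L -> . S in turn pulls in the two S productions.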
Applying shift actions
S → ( L ) | id
L → S | L , S
Start state: { S’ → . S $,  S → . ( L ),  S → . id }
  on (  : go to { S → ( . L ),  L → . S,  L → . L , S,  S → . ( L ),  S → . id }, which also goes to itself on (
  on id : go to { S → id . }, also reached on id from the state above
• In new state, include all items that have appropriate input symbol just after dot, and advance dot in those items (and take closure)
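The dot-advancing step can be sketched on its own. The code below is an illustration under assumptions, not the lecture's construction: items are strings such as "S -> . ( L )" with space-separated symbols, the names Lr0Advance and advance are hypothetical, and only the kernel of the new state is computed, so its closure still has to be taken afterwards.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Sketch only: items are strings such as "S -> . ( L )" (an assumed encoding).
// Only the dot-advancing step is shown; the closure of the result is not taken.
class Lr0Advance {
    // Items of `state` whose dot sits just before `symbol`, with the dot moved past it.
    static Set<String> advance(Set<String> state, String symbol) {
        Set<String> kernel = new LinkedHashSet<>();
        for (String item : state) {
            String[] parts = item.split(" ");
            for (int i = 0; i + 1 < parts.length; i++) {
                if (parts[i].equals(".") && parts[i + 1].equals(symbol)) {
                    parts[i] = symbol;      // swap the dot with the symbol after it
                    parts[i + 1] = ".";
                    kernel.add(String.join(" ", parts));
                    break;
                }
            }
        }
        return kernel;
    }
}
```

Applied to the start state, advancing on ( keeps only S -> ( . L ), advancing on id keeps only S -> id ., and advancing on S keeps only S' -> S . $. Items whose dot is not followed by the chosen symbol are dropped.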
Applying reduce actions
• Need to set state after reducing
• On reduction, pop back to old state and take DFA transition on non-terminal reduced
[Figure: DFA fragment. The start state { S’ → . S $, S → . ( L ), S → . id } goes on ( to { S → ( . L ), L → . S, L → . L , S, S → . ( L ), S → . id }, which goes on L to { S → ( L . ), L → L . , S }, on S to { L → S . }, and on id to { S → id . }. The states { L → S . } and { S → id . } are marked as states causing reductions]
Full DFA (Appel p. 63)
state  items                                                          transitions
1      S’ → . S $,  S → . ( L ),  S → . id                            on id: 2,  on (: 3,  on S: 4
2      S → id .
3      S → ( . L ),  L → . S,  L → . L , S,  S → . ( L ),  S → . id   on id: 2,  on (: 3,  on S: 7,  on L: 5
4      S’ → S . $                                                     on $: final state
5      S → ( L . ),  L → L . , S                                      on ): 6,  on ,: 8
6      S → ( L ) .
7      L → S .
8      L → L , . S,  S → . ( L ),  S → . id                           on id: 2,  on (: 3,  on S: 9
9      L → L , S .
Idea: stack is labeled w/state
S → ( L ) | id
L → S | L , S
Let’s try parsing ((x),y)
derivation   stack               input      action
((x),y)      1                   ((x),y)    shift, goto 3
((x),y)      1 (3                (x),y)     shift, goto 3
((x),y)      1 (3 (3             x),y)      shift, goto 2
((x),y)      1 (3 (3 x2          ),y)       reduce S → id
((S),y)      1 (3 (3 S7          ),y)       reduce L → S
((L),y)      1 (3 (3 L5          ),y)       shift, goto 6
((L),y)      1 (3 (3 L5 )6       ,y)        reduce S → ( L )
(S,y)        1 (3 S7             ,y)        reduce L → S
(L,y)        1 (3 L5             ,y)        shift, goto 8
(L,y)        1 (3 L5 ,8          y)         shift, goto 2
(L,y)        1 (3 L5 ,8 y2       )          reduce S → id
(L,S)        1 (3 L5 ,8 S9       )          reduce L → L , S
(L)          1 (3 L5             )          shift, goto 6
(L)          1 (3 L5 )6                     reduce S → ( L )
S            1 S4                $          done
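The trace above can be reproduced with a small table-driven loop. This is a sketch under stated assumptions: the transition table is transcribed by hand from the nine-state DFA, only the state indices are kept on the stack (the symbols are implied by them), the token list must end with "$", and the names ListParser, edge, and reduceIn are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch only: tables transcribed by hand from the nine-state DFA.
class ListParser {
    static final Map<String, Integer> GOTO = new HashMap<>();
    static final String[] REDUCE_LHS = new String[10];  // null: not a reduce state
    static final int[] REDUCE_LEN = new int[10];        // number of RHS symbols to pop

    static void edge(int from, String symbol, int to) {
        GOTO.put(from + " " + symbol, to);
    }

    static void reduceIn(int state, String lhs, int rhsLength) {
        REDUCE_LHS[state] = lhs;
        REDUCE_LEN[state] = rhsLength;
    }

    static {
        edge(1, "id", 2); edge(1, "(", 3); edge(1, "S", 4);
        edge(3, "id", 2); edge(3, "(", 3); edge(3, "S", 7); edge(3, "L", 5);
        edge(5, ")", 6);  edge(5, ",", 8);
        edge(8, "id", 2); edge(8, "(", 3); edge(8, "S", 9);
        reduceIn(2, "S", 1);   // S -> id
        reduceIn(6, "S", 3);   // S -> ( L )
        reduceIn(7, "L", 1);   // L -> S
        reduceIn(9, "L", 3);   // L -> L , S
    }

    // Returns true if tokens (which must end with "$") form a sentence.
    static boolean parse(List<String> tokens) {
        Deque<Integer> states = new ArrayDeque<>();
        states.push(1);
        int next = 0;
        while (true) {
            int state = states.peek();
            if (state == 4) {                        // S' -> S . $
                return tokens.get(next).equals("$");
            }
            String symbol;
            if (REDUCE_LHS[state] != null) {         // reduce: pop back to old state
                for (int i = 0; i < REDUCE_LEN[state]; i++) {
                    states.pop();
                }
                symbol = REDUCE_LHS[state];          // then transition on the non-terminal
            } else {                                 // shift the next token
                symbol = tokens.get(next++);
            }
            Integer target = GOTO.get(states.peek() + " " + symbol);
            if (target == null) {
                return false;                        // no transition: syntax error
            }
            states.push(target);
        }
    }
}
```

Because the grammar is LR(0), each state either shifts or reduces by exactly one production, so the loop never needs to look at the next token to pick an action. The input ((x),y) corresponds to the token list (, (, id, ), ",", id, ), $.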
Summary
• Grammars can be parsed bottom-up using a DFA + stack
• State construction converts grammar into states that capture information needed to know what action to take
• Stack entries labeled by state index
• Next time: SLR, LR(1) parsers, automatic parser generators