Bottom-Up Syntax Analysis
Mooly Sagiv
html://www.math.tau.ac.il/~msagiv/courses/wcc01.html
Textbook: Modern Compiler Implementation in C, Chapter 3
Efficient Parsers
• Pushdown automata
• Deterministic
• Report an error as soon as the input is not a prefix of a valid program
• Not usable for all context-free grammars
[Figure: a context-free grammar is given to bison, which produces a parser or reports “ambiguity errors”; the parser reads tokens and produces a parse tree]
Kinds of Parsers
• Top-Down (Predictive Parsing), LL
  • Construct the parse tree in a top-down manner
  • Find the leftmost derivation
  • For every non-terminal and token, predict the next production
• Bottom-Up, LR
  • Construct the parse tree in a bottom-up manner
  • Find the rightmost derivation in reverse order
  • For every potential right-hand side and token, decide when a production is found
Bottom-Up Syntax Analysis
• Input
  • A context-free grammar
  • A stream of tokens
• Output
  • A syntax tree or error
• Method
  • Construct the parse tree in a bottom-up manner
  • Find the rightmost derivation in reverse order
  • For every potential right-hand side and token, decide when a production is found
  • Report an error as soon as the input is not a prefix of a valid program
Plan
• Pushdown automata
• Bottom-up parsing (given a parser table)
• Constructing the parser table
• Interesting non-LR grammars
Pushdown Automaton
[Figure: a control unit, driven by the parser-table, reads the input u t w $ and operates on a stack holding $ and V]
Bottom-Up Parser Actions
• reduce A → β
  • Pop |β| symbols from the stack
  • Apply the associated action
  • Push a symbol goto[top, A] on the stack
• shift X
  • Push X onto the stack
  • Advance the input
• accept
  • Parsing is complete
• error
  • Report an error
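The four actions can be sketched as a small table-driven loop. This is only an illustration, not the lecture's implementation: the table below was worked out by hand for S → a S b | ε using the state numbers 0 to 4 of the automaton shown later in the slides, the stack holds automaton states rather than grammar symbols, and the names `PRODUCTIONS`, `ACTION`, `GOTO`, and `parse` are hypothetical.

```python
# Hand-built SLR(1) table for S -> a S b | epsilon (assumed, not taken from the slides).
# Each production maps to (left-hand side, length of right-hand side).
PRODUCTIONS = {
    "S->aSb": ("S", 3),
    "S->eps": ("S", 0),
}

ACTION = {
    (0, "a"): ("shift", 1), (0, "b"): ("reduce", "S->eps"), (0, "$"): ("reduce", "S->eps"),
    (1, "a"): ("shift", 1), (1, "b"): ("reduce", "S->eps"), (1, "$"): ("reduce", "S->eps"),
    (2, "b"): ("shift", 3),
    (3, "b"): ("reduce", "S->aSb"), (3, "$"): ("reduce", "S->aSb"),
    (4, "$"): ("accept", None),
}
GOTO = {(0, "S"): 4, (1, "S"): 2}


def parse(tokens):
    """Return the list of reductions performed, or raise SyntaxError."""
    stack = [0]                      # stack of automaton states
    tokens = list(tokens) + ["$"]    # append the end-of-input marker
    pos = 0
    reductions = []
    while True:
        state, tok = stack[-1], tokens[pos]
        entry = ACTION.get((state, tok))
        if entry is None:            # empty table cell: error action
            raise SyntaxError(f"unexpected {tok!r} at position {pos}")
        kind, arg = entry
        if kind == "shift":          # push the new state, advance the input
            stack.append(arg)
            pos += 1
        elif kind == "reduce":       # pop |rhs| states, then push goto[top, A]
            lhs, rhs_len = PRODUCTIONS[arg]
            if rhs_len:
                del stack[-rhs_len:]
            stack.append(GOTO[(stack[-1], lhs)])
            reductions.append(arg)
        else:                        # accept: parsing is complete
            return reductions
```

Calling `parse("aabb")` performs the reductions of the rightmost derivation in reverse order, while `parse("aab")` raises `SyntaxError` once the end marker is reached with an unmatched `a`.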
A Parser Table for S → a S b | ε
• Manual construction?
The Challenge
• How to construct a parser-table from a given grammar
• LR(1) grammars
  • Left-to-right scanning
  • Rightmost derivations (reverse)
  • 1 token of look-ahead
• Different solutions
  • Operator precedence
  • SLR(1): Simple LR(1)
  • CLR(1): Canonical LR(1)
  • LALR(1): Look-Ahead LR(1)
    • Yacc, Bison, JCUP
Grammar Hierarchy
[Figure: containment of grammar classes: SLR(1) ⊂ LALR(1) ⊂ CLR(1) ⊂ non-ambiguous CFG, with LL(1) also inside CLR(1)]
Constructing an SLR parsing table
• Add a production S’ → S$
• Construct a finite automaton accepting “valid stack symbols”
• The states of the automaton become the states of the parsing table
• Determine shift operations
• Determine goto operations
• Construct reduce entries by analyzing the grammar
A Finite Automaton for S’ → S$, S → a S b | ε
[Figure: states 0 to 4 with transitions 0 –a→ 1, 1 –a→ 1, 1 –S→ 2, 2 –b→ 3, 0 –S→ 4]
Constructing a Finite Automaton
• NFA
  • For X → X1 X2 … Xn
    • [X → X1 X2 … Xi . Xi+1 … Xn]
    • “prefixes of rhs (handles)”
    • X1 X2 … Xi is at the top of the stack and we expect Xi+1 … Xn
  • The initial state [S’ → .S$]
  • δ([X → X1 … Xi . Xi+1 … Xn], Xi+1) = [X → X1 … Xi Xi+1 . Xi+2 … Xn]
  • For every production Xi+1 → α: δ([X → X1 X2 … Xi . Xi+1 … Xn], ε) = [Xi+1 → .α]
• Convert into DFA
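The construction can be sketched in a few lines. Rather than building the NFA and then determinizing it, this sketch computes the DFA states directly with a closure step (which follows the ε-moves) and a goto step (which moves the dot over a symbol); the result is the same as applying the subset construction to the item NFA. The item encoding `(lhs, rhs, dot)` and the function names are assumptions made for this illustration, and the state numbers it produces need not match those on the slides.

```python
# Grammar from the slides: S' -> S $, S -> a S b | epsilon.
# An item is (lhs, rhs, dot): the dot sits just before rhs[dot].
GRAMMAR = {
    "S'": [("S", "$")],
    "S": [("a", "S", "b"), ()],
}
START_ITEM = ("S'", ("S", "$"), 0)


def closure(items):
    """Follow epsilon moves: add [B -> .gamma] whenever the dot precedes non-terminal B."""
    result = set(items)
    work = list(items)
    while work:
        lhs, rhs, dot = work.pop()
        if dot < len(rhs) and rhs[dot] in GRAMMAR:
            for prod in GRAMMAR[rhs[dot]]:
                new = (rhs[dot], prod, 0)
                if new not in result:
                    result.add(new)
                    work.append(new)
    return frozenset(result)


def goto(state, symbol):
    """Move the dot over `symbol` in every item that allows it, then close the result."""
    moved = {(l, r, d + 1) for (l, r, d) in state if d < len(r) and r[d] == symbol}
    return closure(moved) if moved else None


def build_dfa():
    """Return (states, edges); states[0] is the initial state."""
    states = [closure({START_ITEM})]
    edges = {}
    i = 0
    while i < len(states):
        state = states[i]
        symbols = sorted({r[d] for (_, r, d) in state if d < len(r)})
        for sym in symbols:
            if sym == "$":
                continue             # the end marker leads to accept, not to a new state
            target = goto(state, sym)
            if target not in states:
                states.append(target)
            edges[(i, sym)] = states.index(target)
        i += 1
    return states, edges
```

For this grammar `build_dfa()` yields five states and five transitions, including the self-loop on `a`.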
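NFA for S’ → S$, S → a S b | ε
• States: [S’ → .S$], [S’ → S.$], [S → .aSb], [S → a.Sb], [S → aS.b], [S → aSb.], [S → .]
• Symbol transitions: [S’ → .S$] –S→ [S’ → S.$]; [S → .aSb] –a→ [S → a.Sb] –S→ [S → aS.b] –b→ [S → aSb.]
• ε-transitions: from [S’ → .S$] and from [S → a.Sb] to [S → .aSb] and to [S → .]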
DFA
• State 0: [S’ → .S$], [S → .aSb], [S → .]
• State 1: [S → a.Sb], [S → .aSb], [S → .]
• State 2: [S → aS.b]
• State 3: [S → aSb.]
• State 4: [S’ → S.$]
• Transitions: 0 –a→ 1, 1 –a→ 1, 1 –S→ 2, 2 –b→ 3, 0 –S→ 4
Filling reduce entries
• For an item [A → α.] we need to know the tokens that can follow A in a derivation from S’
• Follow(A) = {t | S’ ⇒* αAtβ}
• See the textbook for an algorithm for constructing Follow from a given grammar
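As a rough illustration of how Follow can be computed, the sketch below runs the usual fixed-point iteration over nullable, First, and Follow for the grammar S’ → S$, S → a S b | ε. It is not the textbook's code; the grammar encoding and the names `first_of` and `compute_follow` are assumptions made for this example.

```python
# Productions as (lhs, rhs) pairs; the empty tuple stands for epsilon.
GRAMMAR = [
    ("S'", ("S", "$")),
    ("S", ("a", "S", "b")),
    ("S", ()),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}


def first_of(seq, first, nullable):
    """Return (First set of the sequence, whether the whole sequence can derive epsilon)."""
    out = set()
    for sym in seq:
        if sym not in NONTERMINALS:      # a terminal ends the scan
            out.add(sym)
            return out, False
        out |= first[sym]
        if sym not in nullable:
            return out, False
    return out, True


def compute_follow():
    nullable = set()
    first = {n: set() for n in NONTERMINALS}
    follow = {n: set() for n in NONTERMINALS}
    changed = True
    while changed:                       # iterate until nothing grows any more
        changed = False
        for lhs, rhs in GRAMMAR:
            f, eps = first_of(rhs, first, nullable)
            if eps and lhs not in nullable:
                nullable.add(lhs)
                changed = True
            if not f <= first[lhs]:
                first[lhs] |= f
                changed = True
            for i, sym in enumerate(rhs):
                if sym in NONTERMINALS:
                    # tokens that can start what comes after sym, plus
                    # Follow(lhs) if everything after sym can vanish
                    f, eps = first_of(rhs[i + 1:], first, nullable)
                    new = f | (follow[lhs] if eps else set())
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow
```

For this grammar `compute_follow()["S"]` contains exactly `b` and `$`.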
Follow(S) = {b, $}
[Figure: the DFA for S’ → S$, S → a S b | ε, annotated with reduce entries (r S → ε, r S → a S b) under the tokens in Follow(S)]
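Filling the table from the DFA and Follow(S) might look as follows. The sketch hard-codes the five DFA states and their transitions, writes right-hand sides as strings of single-character symbols, and records any cell that would receive two different actions; all names and encodings here are assumptions for illustration only.

```python
# DFA states for S' -> S $, S -> a S b | epsilon; items are (lhs, rhs, dot).
STATES = {
    0: [("S'", "S$", 0), ("S", "aSb", 0), ("S", "", 0)],
    1: [("S", "aSb", 1), ("S", "aSb", 0), ("S", "", 0)],
    2: [("S", "aSb", 2)],
    3: [("S", "aSb", 3)],
    4: [("S'", "S$", 1)],
}
EDGES = {(0, "a"): 1, (1, "a"): 1, (1, "S"): 2, (2, "b"): 3, (0, "S"): 4}
FOLLOW = {"S": {"b", "$"}}
NONTERMINALS = {"S", "S'"}


def build_slr_table():
    action, goto, conflicts = {}, {}, []

    def set_action(key, value):
        # A cell that already holds a different action is a conflict.
        if key in action and action[key] != value:
            conflicts.append((key, action[key], value))
        action[key] = value

    # Transitions on terminals become shifts; on non-terminals, gotos.
    for (state, sym), target in EDGES.items():
        if sym in NONTERMINALS:
            goto[(state, sym)] = target
        else:
            set_action((state, sym), ("shift", target))

    # [S' -> S.$] accepts on $; a completed item [A -> alpha.] reduces on Follow(A).
    for state, items in STATES.items():
        for lhs, rhs, dot in items:
            if lhs == "S'" and rhs[dot:] == "$":
                set_action((state, "$"), ("accept",))
            elif dot == len(rhs):
                for tok in FOLLOW[lhs]:
                    set_action((state, tok), ("reduce", lhs, rhs))
    return action, goto, conflicts
```

The grammar is SLR(1), so the returned conflict list is empty; states 0 and 1 reduce by S → ε on `b` and `$`, and state 3 reduces by S → a S b on the same tokens.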
Interesting Non SLR(1) Grammar
• S’ → S$
• S → L = R | R
• L → *R | id
• R → L
• Partial DFA
  • Initial state: [S’ → .S$], [S → .L=R], [S → .R], [L → .*R], [L → .id], [R → .L]
  • On L: [S → L.=R], [R → L.]
  • Then on =: [S → L=.R], [R → .L], [L → .*R], [L → .id]
• Follow(R) = {$, =}
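The reason this grammar is not SLR(1) can be checked mechanically. The sketch below takes the state reached on L and the Follow(R) set quoted on the slide, and reports every token on which the state would both shift and reduce; the function name `slr_conflicts` and the item encoding are hypothetical.

```python
# The state reached from the initial state on L; items are (lhs, rhs, dot).
STATE = [("S", ("L", "=", "R"), 1), ("R", ("L",), 1)]
FOLLOW = {"R": {"$", "="}}           # as given on the slide
TERMINALS = {"=", "*", "id", "$"}


def slr_conflicts(state, follow):
    """List (token, description) pairs where SLR(1) wants both a shift and a reduce."""
    shift_tokens = {
        rhs[dot] for _, rhs, dot in state
        if dot < len(rhs) and rhs[dot] in TERMINALS
    }
    conflicts = []
    for lhs, rhs, dot in state:
        if dot == len(rhs):          # completed item: reduce on Follow(lhs)
            for tok in sorted(follow[lhs] & shift_tokens):
                conflicts.append((tok, f"shift vs reduce {lhs} -> {' '.join(rhs)}"))
    return conflicts
```

Here the only conflict is on `=`: the item [S → L.=R] asks for a shift, while [R → L.] asks for a reduce because `=` is in Follow(R).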
LR(1) Parser
• Item [A → α., t]
  • α is at the top of the stack and we are expecting t
• LR(1) State
  • Sets of items
• LALR(1) State
  • Merge LR(1) states whose items differ only in their look-aheads
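To see how the look-ahead component helps, the following sketch computes LR(1) closures for the grammar S → L = R | R, L → *R | id, R → L. It uses the standard closure rule (for [A → α.Bβ, t], add [B → .γ, w] for every w in First(βt)) in a simplified form that assumes no non-terminal derives ε and that `first` never recurses into itself, both of which hold for this grammar. The item encoding, the `?` placeholder look-ahead on the start item, and all names are assumptions for illustration.

```python
GRAMMAR = {
    "S'": [("S", "$")],
    "S": [("L", "=", "R"), ("R",)],
    "L": [("*", "R"), ("id",)],
    "R": [("L",)],
}


def first(symbol):
    """First set of one symbol (assumes no epsilon productions and no left recursion)."""
    if symbol not in GRAMMAR:
        return {symbol}
    out = set()
    for rhs in GRAMMAR[symbol]:
        out |= first(rhs[0])
    return out


def closure(items):
    """LR(1) closure; an item is (lhs, rhs, dot, lookahead)."""
    result = set(items)
    work = list(items)
    while work:
        lhs, rhs, dot, la = work.pop()
        if dot < len(rhs) and rhs[dot] in GRAMMAR:
            rest = rhs[dot + 1:]
            # First(rest + lookahead); rest cannot vanish in this grammar
            lookaheads = first(rest[0]) if rest else {la}
            for prod in GRAMMAR[rhs[dot]]:
                for w in lookaheads:
                    new = (rhs[dot], prod, 0, w)
                    if new not in result:
                        result.add(new)
                        work.append(new)
    return frozenset(result)


def goto(state, symbol):
    """Move the dot over `symbol`, keeping each item's look-ahead, then close."""
    return closure({(l, r, d + 1, la) for (l, r, d, la) in state
                    if d < len(r) and r[d] == symbol})


START = closure({("S'", ("S", "$"), 0, "?")})
AFTER_L = goto(START, "L")
reduce_on = {la for (_, r, d, la) in AFTER_L if d == len(r)}
shift_on = {r[d] for (_, r, d, _) in AFTER_L if d < len(r) and r[d] not in GRAMMAR}
```

In the state reached on L, the completed item [R → L.] carries only the look-ahead `$`, so the reduce no longer competes with the shift on `=`.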
Interesting Non LR(1) Grammars
• Ambiguous
  • Arithmetic expressions
  • Dangling-else
• Common derived prefix
  • A → B1 a b | B2 a c
  • B1 → ε
  • B2 → ε
• Optional non-terminals
  • St → OptLab Ass
  • OptLab → id : | ε
  • Ass → id := Exp
Summary
• LR is a powerful technique
• Generates efficient parsers
• Generation tools exist
  • Bison, yacc, CUP
• But some grammars need to be tuned
  • Shift/Reduce conflicts
  • Reduce/Reduce conflicts
  • Efficiency of the generated parser