Compilers
4. Bottom-up Parsing
Chih-Hung Wang
References
1. C. N. Fischer and R. J. LeBlanc. Crafting a Compiler with C. Pearson Education Inc., 2009.
2. D. Grune, H. Bal, C. Jacobs, and K. Langendoen. Modern Compiler Design. John Wiley & Sons, 2000.
3. Alfred V. Aho, Ravi Sethi, and Jeffrey D. Ullman. Compilers: Principles, Techniques, and Tools. Addison-Wesley, 1986. (2nd Ed. 2006)
Creating a bottom-up parser automatically
• LR: Left-to-right parse, Rightmost derivation
• create a node when all children are present
• handle: nodes representing the right-hand side of a production
[Figure: partial parse tree for the input aap + ( noot + mies ), with nodes expression, term, rest_expression and IDENT leaves]
LR(0) Parsing
• Theoretically important but too weak to be useful.
• running example: expression grammar
    input → expression EOF
    expression → expression ‘+’ term | term
    term → IDENTIFIER | ‘(’ expression ‘)’
• short-hand notation
    Z → E $
    E → E ‘+’ T | T
    T → i | ‘(’ E ‘)’
LR(0) Parsing
• keep track of progress inside potential handles when consuming input tokens
• LR items: N → α · β (the dot marks how much of the right-hand side has been recognized)
• initial set S0
    Z → · E $
    E → · E ‘+’ T
    E → · T
    T → · i
    T → · ‘(’ E ‘)’
Closure algorithm for LR(0)
• Inference rule: if an item set contains N → α · P β, it must also contain P → · γ for every production P → γ.
• The important part is the inference rule: it predicts new handle hypotheses from the hypothesis that we are looking for a certain non-terminal, and is sometimes called the prediction rule. It corresponds to an ε-move, in that it allows the automaton to move to another state without consuming input.
• Reduce item: an item with the dot at the end
• Shift item: all other items
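The closure step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the representation of an item as a pair (rule index, dot position) and the names GRAMMAR, closure and show are choices made for this sketch, not something taken from the slides.

```python
# Grammar in the slides' short-hand notation: (left-hand side, right-hand side).
GRAMMAR = [
    ("Z", ("E", "$")),
    ("E", ("E", "+", "T")),
    ("E", ("T",)),
    ("T", ("i",)),
    ("T", ("(", "E", ")")),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}


def closure(items):
    """Return the epsilon-closure of a set of LR(0) items.

    An item is a pair (rule_index, dot_position); this encoding is an
    assumption of the sketch.
    """
    result = set(items)
    worklist = list(items)
    while worklist:
        rule, dot = worklist.pop()
        rhs = GRAMMAR[rule][1]
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            # Prediction rule: the dot is in front of a non-terminal P,
            # so add P -> . gamma for every production of P.
            for index, (lhs, _) in enumerate(GRAMMAR):
                if lhs == rhs[dot] and (index, 0) not in result:
                    result.add((index, 0))
                    worklist.append((index, 0))
    return frozenset(result)


def show(item):
    """Format an item with '.' standing for the dot."""
    rule, dot = item
    lhs, rhs = GRAMMAR[rule]
    return f"{lhs} -> {' '.join(rhs[:dot] + ('.',) + rhs[dot:])}"


# The initial item set S0 is the closure of the single item Z -> . E $
S0 = closure({(0, 0)})
for item in sorted(S0):
    print(show(item))
```

Starting from Z → · E $ alone, the prediction rule pulls in all four E and T productions, which gives the five items of S0.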
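Transition Diagram
• S0: Z → · E $, E → · E ‘+’ T, E → · T, T → · i, T → · ‘(’ E ‘)’
• S1: T → i ·
• S2: E → T ·
• S3: Z → E · $, E → E · ‘+’ T
• S4: E → E ‘+’ · T, T → · i, T → · ‘(’ E ‘)’
• S5: E → E ‘+’ T ·
• S6: Z → E $ ·
• Transitions: S0 –i→ S1, S0 –T→ S2, S0 –E→ S3, S3 –‘+’→ S4, S3 –$→ S6, S4 –T→ S5, S4 –i→ S1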
LR(0) parsing example (1)
• stack: S0
• input: i + i $
• shift input token (i) onto the stack
• compute new state
LR(0) parsing example (2)
• stack: S0 i S1
• input: + i $
• reduce handle on top of the stack (T → i)
• compute new state
LR(0) parsing example (3)
• stack: S0 T S2
• input: + i $
• reduce handle on top of the stack (E → T)
• compute new state
LR(0) parsing example (4)
• stack: S0 E S3
• input: + i $
• shift input token (+) onto the stack
• compute new state
LR(0) parsing example (5)
• stack: S0 E S3 + S4
• input: i $
• shift input token (i) onto the stack
• compute new state
LR(0) parsing example (6)
• stack: S0 E S3 + S4 i S1
• input: $
• reduce handle on top of the stack (T → i)
• compute new state
LR(0) parsing example (7)
• stack: S0 E S3 + S4 T S5
• input: $
• reduce handle on top of the stack (E → E ‘+’ T)
• compute new state
LR(0) parsing example (8)
• stack: S0 E S3
• input: $
• shift input token ($) onto the stack
• compute new state
LR(0) parsing example (9)
• stack: S0 E S3 $ S6
• input: (empty)
• reduce handle on top of the stack (Z → E $)
• compute new state
LR(0) parsing example (10)
• stack: S0 Z
• input: (empty)
• accept!
Precomputing the item set (1) • Initial item set
Precomputing the item set (2) • Next item set
The LR push-down automaton
• Two major moves and a minor move
• Shift move: removes the first token from the present input and pushes it onto the stack
• Reduce move (for a reduce item N → α ·): the symbols of α are removed from the stack; N is then pushed onto the stack
• Termination: the input has been parsed successfully when it has been reduced to the start symbol.
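The moves above, together with the precomputed item sets, can be combined into a small table-driven parser. The following is a minimal sketch, not the construction used in the referenced books: the function names build_automaton and parse are hypothetical, the stack holds only state numbers (the grammar symbols are implied by the states), and the state numbers it generates need not match S0 to S6 in the slides.

```python
GRAMMAR = [
    ("Z", ("E", "$")),
    ("E", ("E", "+", "T")),
    ("E", ("T",)),
    ("T", ("i",)),
    ("T", ("(", "E", ")")),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}
SYMBOLS = sorted({sym for _, rhs in GRAMMAR for sym in rhs})


def closure(items):
    """Epsilon-closure of a set of LR(0) items (rule_index, dot)."""
    result, worklist = set(items), list(items)
    while worklist:
        rule, dot = worklist.pop()
        rhs = GRAMMAR[rule][1]
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            for index, (lhs, _) in enumerate(GRAMMAR):
                if lhs == rhs[dot] and (index, 0) not in result:
                    result.add((index, 0))
                    worklist.append((index, 0))
    return frozenset(result)


def build_automaton():
    """Precompute all item sets and the transitions between them."""
    states, transitions = [closure({(0, 0)})], {}
    for number, state in enumerate(states):  # the list grows as we go
        for symbol in SYMBOLS:
            # Move the dot over `symbol` in every item that allows it.
            moved = {(r, d + 1) for r, d in state
                     if d < len(GRAMMAR[r][1]) and GRAMMAR[r][1][d] == symbol}
            if not moved:
                continue
            target = closure(moved)
            if target not in states:
                states.append(target)
            transitions[number, symbol] = states.index(target)
    return states, transitions


def parse(tokens):
    """Run the push-down automaton; return the list of moves made."""
    states, transitions = build_automaton()
    stack, position, moves = [0], 0, []  # stack of state numbers
    while True:
        state = states[stack[-1]]
        reducible = [r for r, d in state if d == len(GRAMMAR[r][1])]
        if reducible:
            if len(state) > 1:
                raise ValueError("LR(0) conflict in this state")
            lhs, rhs = GRAMMAR[reducible[0]]
            del stack[len(stack) - len(rhs):]  # reduce: pop the handle
            moves.append(f"reduce {lhs} -> {' '.join(rhs)}")
            if lhs == "Z":
                return moves  # reduced to the start symbol: accept
            stack.append(transitions[stack[-1], lhs])  # push N
        else:
            token = tokens[position] if position < len(tokens) else None
            if (stack[-1], token) not in transitions:
                raise SyntaxError(f"unexpected token {token!r}")
            stack.append(transitions[stack[-1], token])  # shift
            moves.append(f"shift {token}")
            position += 1


for move in parse(["i", "+", "i", "$"]):
    print(move)
```

For the input i + i $ the sketch makes the same nine moves as the worked example before accepting. Because it is a pure LR(0) driver, it decides between shift and reduce from the state alone and never inspects the look-ahead.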
Another Example of LR(0) from Fischer (1)—closure0 G1 example
Another Example of LR(0) from Fischer (2)
G2 example
• S' → S $
• S → ID | λ
LR comments
• Bottom-up parsing, unlike top-down parsing, has no problems with left recursion.
• On the other hand, bottom-up parsing has a slight problem with right recursion.
LR(0) conflicts (1)
• shift-reduce conflict
• Exists in a state when table construction cannot use the next k tokens (k = 0 for LR(0)) to decide whether to shift the next input token or call for a reduction.
• array indexing: T → i [ E ]
    T → i · [ E ] (shift)
    T → i · (reduce)
• ε-rule: RestExpr → ε
    Expr → Term · RestExpr (shift)
    RestExpr → · (reduce)
LR(0) conflicts (2)
• reduce-reduce conflict
• Exists when table construction cannot use the next k tokens to distinguish between multiple reductions that can be applied in the inadequate state.
• assignment statement: Z → V := E $
    V → i · (reduce)
    T → i · (reduce)
    (different reduce rules)
• a typical LR(0) table contains many conflicts
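Both kinds of conflict can be detected mechanically by inspecting an item set. The sketch below is illustrative only: items are written as (left-hand side, right-hand side, dot position) triples and the function name lr0_conflicts is hypothetical. It treats every item whose dot is not at the end as a shift item, following the definition given with the closure algorithm.

```python
def lr0_conflicts(state):
    """Return the kinds of LR(0) conflict present in one item set."""
    reduce_items = [item for item in state if item[2] == len(item[1])]
    shift_items = [item for item in state if item[2] < len(item[1])]
    found = []
    if reduce_items and shift_items:
        found.append("shift-reduce")
    if len(reduce_items) > 1:
        found.append("reduce-reduce")
    return found


# The three inadequate states from the conflict slides.
array_indexing = [("T", ("i", "[", "E", "]"), 1), ("T", ("i",), 1)]
epsilon_rule = [("Expr", ("Term", "RestExpr"), 1), ("RestExpr", (), 0)]
assignment = [("V", ("i",), 1), ("T", ("i",), 1)]

print(lr0_conflicts(array_indexing))
print(lr0_conflicts(epsilon_rule))
print(lr0_conflicts(assignment))
```

A state for which the function returns an empty list is adequate for LR(0): it contains either shift items only or exactly one reduce item.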
Handling LR(0) conflicts • Use a one-token look-ahead • Use a two-dimensional ACTION table • different construction of ACTION table • SLR(1) – Simple LR • LR(1) • LALR(1) – Look-Ahead LR
SLR(1) parsing
• A handle should not be reduced to a non-terminal N if the look-ahead is a token that cannot follow N.
• reduce N → α · iff token ∈ FOLLOW(N)
• FOLLOW(N) for the running example
    FOLLOW(Z) = { $ }
    FOLLOW(E) = { ‘+’, ‘)’, $ }
    FOLLOW(T) = { ‘+’, ‘)’, $ }
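The FOLLOW sets listed above can be reproduced with the usual fixed-point computation. This is a sketch under assumptions: the function names first_sets and follow_sets are made up for this example, and FOLLOW of the start symbol is seeded with $ by convention, to match FOLLOW(Z) = { $ } on the slide.

```python
GRAMMAR = [
    ("Z", ("E", "$")),
    ("E", ("E", "+", "T")),
    ("E", ("T",)),
    ("T", ("i",)),
    ("T", ("(", "E", ")")),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}


def first_sets():
    """Compute FIRST for every non-terminal, plus the nullable set."""
    first = {n: set() for n in NONTERMINALS}
    nullable = set()
    changed = True
    while changed:
        changed = False
        for lhs, rhs in GRAMMAR:
            before = (len(first[lhs]), lhs in nullable)
            for sym in rhs:
                first[lhs] |= first[sym] if sym in NONTERMINALS else {sym}
                if sym not in nullable:
                    break
            else:  # every symbol was nullable (or rhs is empty)
                nullable.add(lhs)
            changed |= before != (len(first[lhs]), lhs in nullable)
    return first, nullable


def follow_sets(start="Z"):
    """Compute FOLLOW for every non-terminal."""
    first, nullable = first_sets()
    follow = {n: set() for n in NONTERMINALS}
    follow[start].add("$")  # convention assumed by this sketch
    changed = True
    while changed:
        changed = False
        for lhs, rhs in GRAMMAR:
            trailer = set(follow[lhs])  # tokens that can follow the rhs
            for sym in reversed(rhs):
                if sym in NONTERMINALS:
                    if not trailer <= follow[sym]:
                        follow[sym] |= trailer
                        changed = True
                    if sym in nullable:
                        trailer = trailer | first[sym]
                    else:
                        trailer = set(first[sym])
                else:
                    trailer = {sym}
    return follow


FOLLOW = follow_sets()
for name in sorted(FOLLOW):
    print(name, sorted(FOLLOW[name]))
```

The right-hand sides are scanned from right to left so that `trailer` always holds the tokens that can appear immediately after the symbol being visited.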
SLR(1) ACTION table shift
SLR(1) ACTION/GOTO table
• Rules
    1: Z → E $
    2: E → T
    3: E → E ‘+’ T
    4: T → i
    5: T → ‘(’ E ‘)’
• sn – shift to state n (e.g. s7)
• rn – reduce rule n
Example of resolving conflicts (1)
• A new rule T → i ‘[’ E ‘]’
    1: Z → E $
    2: E → T
    3: E → E ‘+’ T
    4: T → i
    5: T → ‘(’ E ‘)’
    6: T → i ‘[’ E ‘]’
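Example of resolving conflicts (2)
• Rules
    1: Z → E $
    2: E → T
    3: E → E ‘+’ T
    4: T → i
    5: T → ‘(’ E ‘)’
    6: T → i ‘[’ E ‘]’
• Items in the inadequate state (the table entry s5 shown on the slide belongs to this state)
    T → i ·
    T → i · ‘[’ E ‘]’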
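How the one-token look-ahead resolves this state can be sketched by building its ACTION row. The function name slr1_action_row and the item encoding are hypothetical, and the FOLLOW sets below were worked out by hand for the grammar with rule 6 added (‘]’ joins FOLLOW(E) and FOLLOW(T)); they are an assumption of this sketch rather than something shown on the slide.

```python
def slr1_action_row(state, follow, terminals):
    """Map each look-ahead token to the set of actions the state allows."""
    row = {}
    for lhs, rhs, dot in state:
        if dot == len(rhs):
            # Reduce item: allowed only for tokens in FOLLOW(lhs).
            action = f"reduce {lhs} -> {' '.join(rhs)}"
            tokens = follow[lhs]
        elif rhs[dot] in terminals:
            # Shift item: allowed only for the token after the dot.
            action = "shift"
            tokens = {rhs[dot]}
        else:
            continue  # dot before a non-terminal: a GOTO entry, not an action
        for token in tokens:
            row.setdefault(token, set()).add(action)
    return row


TERMINALS = {"i", "+", "(", ")", "[", "]", "$"}
# Hand-computed FOLLOW sets for the six-rule grammar (assumption of the sketch).
FOLLOW = {
    "Z": {"$"},
    "E": {"+", ")", "]", "$"},
    "T": {"+", ")", "]", "$"},
}
state = [("T", ("i",), 1), ("T", ("i", "[", "E", "]"), 1)]

row = slr1_action_row(state, FOLLOW, TERMINALS)
for token in sorted(row):
    print(token, sorted(row[token]))
```

Every look-ahead token ends up with exactly one action: ‘[’ is not in FOLLOW(T), so it selects the shift, while the tokens in FOLLOW(T) select the reduction by T → i. A token mapped to two or more actions would indicate a conflict that SLR(1) cannot resolve.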
Another Example of LR(0) Conflicts(3) num plus num times num $
Another Example of LR(0) Conflicts(4) Follow(E)= {plus, $}
Unfortunately …
• SLR(1) leaves many shift-reduce conflicts unsolved
• problem: the FOLLOW(N) set is a union of the look-aheads of all alternatives of N in all states
• example
    S → A | x b
    A → a A b | B
    B → x
    Follow(S) = {$}
    Follow(A) = {b, $}
    Follow(B) = {b, $}
Another Example of SLR Problem Follow(A)={b, c, $}
Make the Grammar SLR(1) Follow(A1)={b, $}
Another Example 2 of SLR(1) (1)
G3:
    S → E $
    E → E + T | T
    T → T * P | P
    P → ID | ( E )
Another Example 2 of SLR(1) (2)
G3:
    S → E $
    E → E + T | T
    T → T * P | P
    P → ID | ( E )
• SLR(1) Action Table for G3
    Follow(E) = {$, +, )}
    Follow(T) = {$, +, ), *}
SLR(1) Conflicts Ex2
    Elem → ( List , Elem )
    Elem → Scalar
    List → List , Elem
    List → Elem
    Scalar → ID
    Scalar → ( Scalar )
    Follow(Elem) = { “)”, “,”, … }
LR(1) parsing
• The LR(1) technique does not rely on FOLLOW sets, but rather keeps the specific look-ahead with each item
• LR(1) item: N → α · β {σ}
• ε-closure for LR(1) item sets: if set S contains an item P → α · N β {σ}, then for each production rule N → γ, S must contain the item N → · γ {τ}, where τ = FIRST(β {σ})
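The LR(1) closure rule can be tried out on the grammar from the "Unfortunately …" slide (S → A | x b, A → a A b | B, B → x), where SLR(1) fails. The sketch below makes two simplifying assumptions that are not in the slides: the grammar has no ε-rules, so FIRST(β {σ}) is just FIRST of the first symbol of β, or {σ} when β is empty; and both S-items are used as the start set with look-ahead $. The names first_of, closure and goto are hypothetical.

```python
GRAMMAR = [
    ("S", ("A",)),
    ("S", ("x", "b")),
    ("A", ("a", "A", "b")),
    ("A", ("B",)),
    ("B", ("x",)),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}


def first_sets():
    """FIRST sets for a grammar WITHOUT epsilon rules (assumption)."""
    first = {n: set() for n in NONTERMINALS}
    changed = True
    while changed:
        changed = False
        for lhs, rhs in GRAMMAR:
            head = rhs[0]
            new = first[head] if head in NONTERMINALS else {head}
            if not new <= first[lhs]:
                first[lhs] |= new
                changed = True
    return first


FIRST = first_sets()


def first_of(beta, lookahead):
    """FIRST(beta {lookahead}), valid only without epsilon rules."""
    if not beta:
        return {lookahead}
    return FIRST[beta[0]] if beta[0] in NONTERMINALS else {beta[0]}


def closure(items):
    """Closure of LR(1) items (rule_index, dot, lookahead)."""
    result, worklist = set(items), list(items)
    while worklist:
        rule, dot, look = worklist.pop()
        rhs = GRAMMAR[rule][1]
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            # P -> alpha . N beta {sigma}: add N -> . gamma {tau}
            # for every tau in FIRST(beta {sigma}).
            for tau in first_of(rhs[dot + 1:], look):
                for index, (lhs, _) in enumerate(GRAMMAR):
                    item = (index, 0, tau)
                    if lhs == rhs[dot] and item not in result:
                        result.add(item)
                        worklist.append(item)
    return frozenset(result)


def goto(state, symbol):
    """Move the dot over `symbol` and close the resulting item set."""
    return closure({(r, d + 1, la) for r, d, la in state
                    if d < len(GRAMMAR[r][1]) and GRAMMAR[r][1][d] == symbol})


start = closure({(0, 0, "$"), (1, 0, "$")})
after_x = goto(start, "x")               # S -> x . b {$},  B -> x . {$}
after_a_x = goto(goto(start, "a"), "x")  # B -> x . {b}
print(sorted(after_x))
print(sorted(after_a_x))
```

In the state reached on x, the reduce item B → x · carries only the look-ahead $, so a following b selects the shift of S → x · b and the SLR(1) conflict disappears. The look-ahead b for B → x · shows up only in the separate state reached after a x, which is exactly the distinction that the single FOLLOW(B) = {b, $} set cannot express.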