Implementing LR Parsers

Implementing LR Parsers Adapted from Notes by Prof. Saman Amarasinghe (MIT) LR parsing tables and building LR(0), SLR(1) and LR(1) Parsers

11 Actions of a Shift-Reduce Parser • Shift to sn • Push input token into the symbol stack • Push sn into state stack • Advance to next input symbol • Reduce • Pop both stacks as many times as the number of symbols on the RHS of rule n • Push LHS of rule n into symbol stack • Lookup [top of the state stack][top of symbol stack] • Push that state (in goto k) into state stack • Accept • End of stream reached and stack only has the start symbol • Reject • End of stream reached but stack has more than the start symbol

53 Building a LR(0) parser engine • Add the special production S’  S $ • Find the items of the CFG • Create the DFA • using closure and goto functions • Build the parse table LR(0) Parser Engine

Creating the LR(0) DFA s1 X <S>  <X> • $ s0 s2 <S>  • <X> $ <X>  • ( <X> ) <X>  • ( ) ( <X>  ( • <X> ) <X>  ( • ) <X>  • ( <X> ) <X>  • ( ) ( s3 <X>  ( <X> • ) X ) ) s5 s4 <X>  ( ) • <X>  ( <X> ) •

50 Creating the LR(0) parse tables • For each state • Transition to another state using a terminal symbol is a shift to that state (shift to sn) • Transition to another state using a non-terminal is a goto that state (goto sn) • If there is an item A  • in the state do a reduction with that production for all terminals (reduce k)

Building LR(0) Parse Table Example s1 X <S>  <X> • $ s0 s2 <S>  • <X> $ <X>  • ( <X> ) <X>  • ( ) ( <X>  ( • <X> ) <X>  ( • ) <X>  • ( <X> ) <X>  • ( ) ( s3 <X>  ( <X> • ) X ) ) s5 s4 <X>  ( ) • <X>  ( <X> ) •

Changing the Language • String of one more more left parentheses followed by the same number of right parentheses <S>  <X> $ <X>  ( <X> ) <X>  ( ) • String of zero or more more left parentheses followed by the same number of right parentheses <S>  <X> $ <X>  ( <X> ) <X>  

16 Building the LR(0) Parse Table s1 X <S>  <X> • $ s0 s2 s3 <S>  • <X> $ <X>  • ( <X> ) <X>  • ( <X>  ( • <X> ) <X>  • ( <X> ) <X>  • <X>  ( <X> • ) ( ) Shift/reduce conflict X s4 Shift/reduce conflict <X>  ( <X> ) •

Idea behind SLR(1) grammars • Many shift/reduce conflicts in LR(0) • an item X   • in the current state identifies a reduction • But does not select when to reduce • Thus, have to perform the reduction on all input symbols • Do the reduction only when the input symbol truly follows the reduction • Need to calculate the terminals that can follow a non-terminal symbol

22 Building the SLR(1) Parse Table s1 X <S>  <X> • $ s0 s2 s3 <S>  • <X> $ <X>  • ( <X> ) <X>  • ( <X>  ( • <X> ) <X>  • ( <X> ) <X>  • <X>  ( <X> • ) ( ) X s4 follow(<X>) = { ), $ } follow(<S>) = { $ } <X>  ( <X> ) •

Expanded Example Grammar <S>  <X> $ (1) <X>  <Y> (2) <X>  ( (3) <Y>  ( <Y> ) (4) <Y>   (5) • Change language to: • Zero or more open parentheses followed by matching # of close parentheses • or a single open parenthesis

( s0 s1 s2 <S>  • <X> $ <X>  • <Y> <X>  • ( <Y>  • ( <Y> ) <Y>  • <X>  ( • <Y>  ( • <Y> ) <Y>  • ( <Y> ) <Y>  • <Y>  ( • <Y> ) <Y>  • ( <Y> ) <Y>  • s3 s4 s6 s5 <Y>  ( <Y> • ) <Y>  ( <Y> ) • <X>  <Y> • <S>  <X> • $ 26 Expanded Example DFA <S>  <X> $ <X>  <Y> <X>  ( <Y>  ( <Y> ) <Y>   ( ( Y Y Y X )

( s0 s1 s2 <S>  • <X> $ <X>  • <Y> <X>  • ( <Y>  • ( <Y> ) <Y>  • <X>  ( • <Y>  ( • <Y> ) <Y>  • ( <Y> ) <Y>  • <Y>  ( • <Y> ) <Y>  • ( <Y> ) <Y>  • s3 s4 s5 s6 <Y>  ( <Y> • ) <Y>  ( <Y> ) • <S>  <X> • $ <X>  <Y> • 31 ( ( Y Y Y X )

31 LR(1) Items • Items will keep info on • production • right-hand-side position (the dot) • look ahead symbol • LR(1) item is of the form [A   •  a] • A   is a production • The dot in A  • denotes the position • a is a terminal or the end marker ($) • For the item [A  • a] • a is the next symbol after A in the string i.e. there exist a derivation S   A a 

) Y ( s0 s1 s2 [<S>  • <X> $ ?] [<X>  • <Y> $] [<X>  • ( $] [<Y>  • (<Y>) $] [<Y>  • $] [<X>  ( • $] [<Y>  ( • <Y> ) $] [<Y>  • ( <Y>) )] [<Y>  • )] [<Y>  ( • <Y>) )] [<Y>  • ( <Y> ) )] [<Y>  • )] s7 s3 s4 s8 s5 s6 [<Y>  (<Y> • ) )] [<Y>  (<Y> • ) $] [<Y>  (<Y>) • )] [<Y>  (<Y>) • $] [<S>  <X> •$ ?] [<X>  <Y> • $] 46 Expanded Example LR(1) DFA <S>  <X> $ <X>  <Y> <X>  ( <Y>  ( <Y> ) <Y>   ( ( Y Y X )

) Y s7 s8 [<Y>  (<Y> • ) )] [<Y>  (<Y>) • )] 51 s7 s7 … s8 => ( s0 s2 ( [<S>  • <X> $ ?] [<X>  • <Y> $] [<X>  • ( $] [<Y>  • (<Y>) $] [<Y>  • $] [<Y>  ( • <Y>) )] [<Y>  • ( <Y> ) )] [<Y>  • )] s1 ( [<X>  ( • $] [<Y>  ( • <Y> ) $] [<Y>  • ( <Y>) )] [<Y>  • )] Y s3 Y X ) [<Y>  (<Y> • ) $] s6 s5 s4 [<X>  <Y> • $] [<S>  <X> •$ ?] [<Y>  (<Y>) • $]

) Y s7 s8 [<Y>  (<Y> • ) )] [<Y>  (<Y>) • )] 51 s7 s7 s8 s8 reduce (5) error s3 … s4 => ( s0 s2 ( [<S>  • <X> $ ?] [<X>  • <Y> $] [<X>  • ( $] [<Y>  • (<Y>) $] [<Y>  • $] [<Y>  ( • <Y>) )] [<Y>  • ( <Y> ) )] [<Y>  • )] s1 ( [<X>  ( • $] [<Y>  ( • <Y> ) $] [<Y>  • ( <Y>) )] [<Y>  • )] Y s3 Y X ) [<Y>  (<Y> • ) $] s6 s5 s4 [<X>  <Y> • $] [<S>  <X> •$ ?] [<Y>  (<Y>) • $]

SLR(1) 52 LR(1) s7 s7 … s8 => similar to s3…s4 except for “reduce” column.

( s0 s1 s2 [<S>  • <X> $ ?] [<X>  • <Y> $] [<X>  • ( $] [<Y>  • (<Y>) $] [<Y>  • $] [<X>  ( • $] [<Y>  ( • <Y> ) $] [<Y>  • ( <Y>) )] [<Y>  • )] [<Y>  ( • <Y>) )] [<Y>  • ( <Y> ) )] [<Y>  • )] s3 s4 s5 s6 [<Y>  (<Y> • ) {$,)}] [<Y>  (<Y>) • {$,)}] [<S>  <X> •$ ?] [<X>  <Y> • $] 46 Example -- LALR(1) DFA <S>  <X> $ <X>  <Y> <X>  ( <Y>  ( <Y> ) <Y>   ( ( Y Y Y X )

s3 s4 [<Y>  (<Y> • ) {$,)}] [<Y>  (<Y>) • {$,)}] 51 reduce(4) ( s0 s2 ( [<S>  • <X> $ ?] [<X>  • <Y> $] [<X>  • ( $] [<Y>  • (<Y>) $] [<Y>  • $] [<Y>  ( • <Y>) )] [<Y>  • ( <Y> ) )] [<Y>  • )] s1 ( [<X>  ( • $] [<Y>  ( • <Y> ) $] [<Y>  • ( <Y>) )] [<Y>  • )] Y Y Y X ) s6 s5 [<X>  <Y> • $] [<S>  <X> •$ ?]

52 LALR(1) reduce(4) s7 LR(1) s7 … s8 =>

SLR(1) 52 LALR(1) reduce(4)

A Hierarchy of Grammar Classes

LR Languages • The set of LR languages are independent of the lookahead distance k. • For all the languages with SLR(1), LALR(1) and LR(1) grammars we looked at, we could have found an LR(0) grammar!!! • This grammar may not be natural however.

LR Languages • Given any LR(k) grammar Gk, if L(Gk) is prefix-free, then there exists an (equivalent) LR(0) grammar G0 such that L(Gk) = L(G0) • A language L is prefix-freeiff [x in L and xy in L implies y ise]. • A language can be made prefix-free by concatenating a special end-marker to each string in the language.

LR vs LL In general, LL(k) grammars are LR(k) grammars, for any k.

LR(0) vs LL(k) • The following LR(0) grammar (with left recursion) is not LL(1). S  S0 | 0 • The following LR(0) grammar (with left recursion) is not LL(k), for any k. S  S0 | S1 | 0 | 1 • The following LL(1) grammar is not LR(0). S Z$ Z aZ | e

Not LR(k) for an k • The following regular grammar (with left recursion) is not LR(k), for any k. S  Cc | Bb C  Ca | a B  Ba | a

SLR(1) vs LL(k) • The following SLR(1) grammar (and hence, LR(1) grammar) requiring left factoring is not LL(k), for any k. E  T+E | T T  x *T | x

LL(1) vs SLR(1) • The following grammar is LL(1), but not SLR(1). S  1X | 2Ag X  Af | Bg A  3 | e B  4 | e

LL(1) vs LALR(1) • The following grammar is LL(1), but not LALR(1). S  aF | bG F  Xc | Yd G  Xd | Yc X  IA Y  IB A e I e Be

Implementing LR Parsers

Implementing LR Parsers

Presentation Transcript

Parsers and Grammars

Comparing Java XML parsers

XML Parsers

LR PARSERS

Parsers

LL(k) and LR(k) Parsers

Review: LR(k) parsers

Parsers

Parsers and Grammar

XML Parsers

Recursive Descent Parsers

Recursive Descent Parsers

MC365 XML Parsers

Parsing V Introduction to LR(1) Parsers

XML Parsers

Conflicts in Simple LR parsers

Parsers and Grammars

Parsers

LR Parsing