Understanding Bottom-Up Parsing Methods in LR Approach

Chapter 4-2 Chang Chi-Chung 2007.06.07

Bottom-Up Parsing • LR methods (Left-to-right, Rightmost derivation) • LR(0), SLR, Canonical LR = LR(1), LALR • Other special cases: • Shift-reduce parsing • Operator-precedence parsing

Operator-Precedence Parsing • Special case of shift-reduce parsing • See textbook section 4.6

Bottom-Up Parsing reduction E→ T → T * F → T * id → F * id → id * id rightmost derivation E → E + T | T T → T * F | F F → ( E ) | id

E → E + T | T T → T * F | F F → ( E ) | id Handle Pruning • Handle • A handle is a substring of grammar symbols in aright-sentential form that matches a right-hand side of a production

S A α ω β Handle Pruning • A rightmost derivation in reverse can be obtained by “handle pruning” S = γ0 →rm γ1 → rm γ2 →rm …. →rm γn-1 →rm γn =ω • Handle definition • S →*rm αAω →*rm αβω A handle A →β in the parse tree for αβω

Example: Handle a b b c d ea A b c d ea A d ea A BeS Grammar S  aA B eA  Ab c | bB  d Handle a b b c d ea A b c d ea A A e… ? NOT a handle, becausefurther reductions will fail (result is not a sentential form)

Shift-Reduce Parsing • Shift-Reduce Parsing is a form of bottom-up parsing • A stack holds grammar symbols • An input buffer holds the rest of the string to parsed. • Shift-Reduce parser action • shift • reduce • accept • error

Shift-Reduce Parsing • Shift • Shift the next input symbol onto the top of the stack. • Reduce • The right end of the string to be reduced must be at the top of the stack. • Locate the left end of the string within the stack and decide with what nonterminal to replace the string. • Accept • Announce successful completion of parsing • Error • Discover a syntax error and call recovery routine

Shift-Reduce Parsing E → E + T | T T → T * F | F F → ( E ) | id

Example: Shift-Reduce Parsing Grammar:S  aA B eA  Ab c | bB  d Shift-reduce correspondsto a rightmost derivation:S rma A B e rmaA d e rm a A b c d e rm a b b c d e Reducing a sentence:a b b c d ea A b c d ea A d ea A BeS These matchproduction’sright-hand sides S A A A A A A B A B a b b c d e a b b c d e a b b c d e a b b c d e

Conflicts During Shift-Reduce Parsing • Conflicts Type • shift-reduce • reduce-reduce • Shift-reduce and reduce-reduce conflicts are caused by • The limitations of the LR parsing method (even when the grammar is unambiguous) • Ambiguity of the grammar

Shift-Reduce Conflict GrammarE  E+EE  E*EE  (E)E  id Stack $$id $E$E+$E+id$E+E$E+E*$E+E*id$E+E*E$E+E$E Input id+id*id$+id*id$+id*id$id*id$*id$*id$id$$$$$ Actionshiftreduce E  idshiftshiftreduce E  idshift (or reduce?)shiftreduce E  idreduce E  E * Ereduce E  E + Eaccept How toresolveconflicts? Find handlesto be reduced

Shift-Reduce Conflict Stack $…$…ifEthenS Input …$else…$ Action…shift or reduce? Ambiguous grammar:S  ifEthenS |ifEthenSelseS | other Shift else to ifEthenS or Reduce ifEthenS Resolve in favorof shift, so elsematches closestif

Reduce-Reduce Conflict GrammarC  A BA  aB  a Stack $$a Input aa$a$ Actionshiftreduce A  a orB  a ? Resolve in favorof reduce A  a,otherwise we’re stuck! 動彈不得

LR • LR parser are table-driven • Much like the nonrecursive LL parsers. • The reasons of the using the LR parsing • An LR-parsering method is the most general nonbacktracking shift-reduce parsing method known, yet it can be implemented as efficiently as other. • LR parsers can be constructed to recognize virtually all programming-language constructs for which context-free grammars can be written. • An LR parser can detect a syntactic error as soon as it is possible to do so on a left-to-right scan of the input. • The class of grammars that can be parsed using LR methods is a proper superset of the class of grammars that can be parsed with predictive or LL methods.

LR(0) • An LR parser makes shift-reduce decisions by maintaining states to keep track. • An item of a grammar G is a production of G with a dot at some position of the body. • Example A → X Y Z A → ．X Y Z A → X．Y Z A → X Y ．Z A → X Y Z ． A → X．Y Z items stack next derivations with input strings Note that production A  has one item [A •]

LR(0) • Canonical LR(0) Collection • One collection of sets of LR(0) items • Provide the basis for constructing a DFA that is used to make parsing decisions. • LR(0) automation • The canonical LR(0) collection for a grammar • Augmented the grammar • If G is a grammar with start symbol S, then G’ is the augmented grammar for G with new start symbol S’ and new production S’ → S • Closure function • Goto function

Use of the LR(0) Automaton

Function Closure • If I is a set of items for a grammar G, then closure(I) is the set of items constructed from I. • Create closure(I) by the two rules: • add every item in I to closure(I) • IfA→α．Bβ is in closure(I) andB→γis a production, then add the item B→．γtoclosure(I). Applythis rule untill no more new items can be added toclosure(I). • Divide all the sets of items into two classes • Kernel items • initial item S’ → ．S, and all items whose dots are not at the left end. • Nonkernel items • All items with their dots at the left end, except for S’ → ．S

Example • The grammar G E’ → E E → E + T | T T → T * F | F F → ( E ) | id • LetI = { E’ → ．E} , then closure(I) = { E’ →．E E →．E + T E →．T T →．T * F T →．F F →．( E ) F →．id}

Exercise • The grammar G E’ → E E → E + T | T T → T * F | F F → ( E ) | id • LetI = { E → E +．T }

Function Goto • Function Goto(I, X) • I is a set of items • X is a grammar symbol • Goto(I, X) is defined to be the closure of the set of all items [A  α X‧β] such that [A  α‧Xβ] is in I. • Goto function is used to define the transitions in the LR(0) automation for a grammar.

Example • The grammar G • E’ → E • E → E + T | T • T → T * F | F • F → ( E ) | id • I = { E’ → E ． E → E ．+ T } • Goto (I, +) = { E → E +．T T →．T * F T →．F F →．(E) F →．id }

Constructing the LR(0) Collection • The grammar is augmented with a new start symbol S’ and production S’S • Initially, set C = closure({[S’•S]})(this is the start state of the DFA) • For each set of items I  C and each grammar symbol X  (NT) such thatGOTO(I, X)  Cand goto(I, X)  , add the set of items GOTO(I, X) to C • Repeat 3 until no more sets can be added to C

Example: The Parse of id * id

Model of an LR Parser

Structure of the LR Parsing Table • Parsing Table consists of two parts: • A parsing-action function ACTION • A goto function GOTO • The Action function, Action[i, a], have one of four forms: • Shift j, where j is a state. • The action taken by the parser shifts input a to the stack, but use state j to represent a. • Reduce A→β. • The action of the parser reduces β on the top of the stack to head A. • Accept • The parser accepts the input and finishes parsing. • Error

Structure of the LR Parsing Table(1) • The GOTO Function, GOTO[Ii, A], defined on sets of items. • If GOTO[Ii, A] = Ij, then GOTO also maps a state i and a nonterminal A to state j.

Configuration ( = LR parser state): ($s0s1s2 … sm,aiai+1 … an$) stack input ($ X1 X2 … Xm,aiai+1 … an$) LR-Parser Configurations Ifaction[sm, ai] = shift sthen push s (ai), and advance input (s0 s1 s2 … sm, aiai+1 … an $)(s0s1s2 … sms, ai+1 … an$) Ifaction[sm, ai] = reduce A   and goto[sm-r, A] = s with r = || then pop r symbols, and push s ( push A )( (s0 s1 s2 … sm, ai ai+1 … an $) (s0s1 s2 … sm-rs, aiai+1 … an$) Ifaction[sm, ai] = accept then stop Ifaction[sm, ai] = error then attempt recovery

Example LR Parse Table action goto Grammar:1. E  E+T2. E  T3. T  T*F4. T  F5. F  (E)6. F  id state shift & goto 5 reduce byproduction #1

Example Grammar0. S  E 1. E  E+T2. E  T3. T  T*F4. T  F5. F  (E)6. F  id

SLR Grammars • SLR (Simple LR): a simple extension of LR(0) shift-reduce parsing • SLR eliminates some conflicts by populating the parsing table with reductions A on symbols in FOLLOW(A) Shift on + State I2:E  id•+EE  id• State I0:S  •EE  •id+EE  •id goto(I0,id) goto(I3,+) S  EE  id+EE  id FOLLOW(E)={$}thus reduce on $

SLR Parsing Table • Reductions do not fill entire rows • Otherwise the same as LR(0) 1. S  E2. E  id+E3. E  id Shift on + FOLLOW(E)={$}thus reduce on $

Constructing SLR Parsing Tables • Augment the grammar with S’ S • Construct the set C={I0, I1, …, In} of LR(0) items • State i is constructed from Ii • If [A•a] Iiand goto(Ii, a)=Ij then set action[i, a]=shiftj • If [A•]  Iithen set action[i,a]=reduce A for all a  FOLLOW(A) (apply only if AS’) • If [S’S•] is in Ii then set action[i,$]=accept • If goto(Ii, A)=Ij then set goto[i, A]=jset goto table • Repeat 3-4 until no more entries added • The initial state iis the Iiholding item [S’•S]

Example SLR Grammar and LR(0) Items Augmentedgrammar:0. C’  C1. C  A B2. A  a3. B  a I0 = closure({[C’  •C]})I1 = goto(I0,C) = closure({[C’  C•]})… State I1:C’  C• State I4:C  A B• final goto(I0,C) goto(I2,B) State I0:C’  •CC  •A BA  •a State I2:C  A•BB  •a start goto(I0,A) goto(I2,a) State I5:B  a• goto(I0,a) State I3:A  a•

Example SLR Parsing Table State I0:C’  •CC  •A BA  •a State I1:C’  C• State I2:C  A•BB  •a State I3:A  a• State I4:C  A B• State I5:B  a• 1 Grammar:0. C’  C1. C  A B2. A  a3. B  a C 4 B start A 2 0 a 5 a 3

SLR and Ambiguity • Every SLR grammar is unambiguous, but not every unambiguous grammar is SLR, maybe LR(1) • Consider for example the unambiguous grammarS L=R | RL  *R | idR  L FOLLOW(R) = {=, $} I0:S’  •SS  •L=RS  •RL  •*RL  •idR  •L I1:S’  S• I2:S  L•=RR  L• I3:S  R• I4:L  *•RR  •L L  •*RL  •id I5:L  id• I6:S  L=•RR  •L L  •*RL  •id I7:L  *R• I8:R  L• I9:S  L=R• action[2,=]=s6action[2,=]=r5 Has no SLRparsing table no

LR(1) Grammars • SLR too simple • LR(1) parsing uses lookahead to avoid unnecessary conflicts in parsing table • LR(1) item = LR(0) item + lookahead LR(0) item[A•] LR(1) item [A•, a]

SLR Versus LR(1) • Split the SLR states by adding LR(1) lookahead • Unambiguous grammarS L = R | RL  * R | idR  L I2:S  L•=RR  L• split S  L•=R R  L• action[2,=]=s6 Should not reduce, because noright-sentential form begins with R=

LR(1) Items • An LR(1) item [A•,a]contains a lookahead terminal a, meaning  already on top of the stack, expect to seea • For items of the form[A•,a]the lookahead a is used to reduce A only if the next input is a • For items of the form[A•, a]with  the lookahead has no effect

The Closure Operation for LR(1) Items • Start with closure(I) = I • If [A•B, a]  closure(I) then for each production B in the grammar and each terminal b  FIRST(a) add the item [B•, b] to I if not already in I • Repeat 2 until no new items can be added

The Goto Operation for LR(1) Items • For each item [A•X, a]  I, add the set of items closure({[AX•, a]}) to goto(I,X) if not already there • Repeat step 1 until no more items can be added to goto(I,X)

The grammar G • S’ → S • S → C C • C → cC | d Example • Let I= { (S’ → •S, $) } • I0 = closure(I) = { S’ → •S, $ S → • C C, $ C → •cC, c/d C → •d, c/d } • goto(I0, S) = closure( {S’ → S •, $ } ) = {S’ → S •, $ } = I1

Exercise • The grammar G • S’ → S • S → C C • C → cC | d • Let I= { (S → C •C, $) } • I2 = closure(I) = ? • I3 = goto(I2, c) = ?

Construction of the sets of LR(1) Items • Augment the grammar with a new start symbol S’ and production S’S • Initially, set C = closure({[S’•S,$]})(this is the start state of the DFA) • For each set of items I  C and each grammar symbol X  (NT) such that goto(I, X)  Cand goto(I, X)  , add the set of items goto(I, X) to C • Repeat 3 until no more sets can be added to C

LR(1) Automation

Construction of the Canonical LR(1) Parsing Tables • Augment the grammar with S’S • Construct the set C={I0,I1,…,In} of LR(1) items • State i of the parser is constructed from Ii • If [A•a, b]  Iiand goto(Ii,a)=Ij then set action[i,a]=shift j • If [A•, a]  Ii then set action[i,a]=reduce A (apply only if AS’) • If [S’S•, $] is in Ii then set action[i,$]=accept • If goto(Ii,A)=Ij then set goto[i,A]=j • Repeat 3 until no more entries added • The initial state i is the Ii holding item [S’•S,$]

Example • The grammar G • S’ → S • S → C C • C → cC | d

Example Grammar and LR(1) Items • Unambiguous LR(1) grammar:S L=R | RL  *R | idR  L • Augment with S’  S • LR(1) items (next slide)

Understanding Bottom-Up Parsing Methods in LR Approach

Understanding Bottom-Up Parsing Methods in LR Approach

Presentation Transcript

Chapter 4 Part 2

Chapter 4 Part 2

Tutorial 4: Chapter 2

Chapter 4 - 2

Chapter 4 Section 2

Chapter 4 section 2

Chapter 2 Section 4

Chapter 2 Section 4

Chapter 4- Part 2

Chapter 4 Lesson 2

Chapter 2, Section 4

Unit 2 : Chapter 4

Chapter 4 Lesson 2

Chapter 4 - 2

Chapter 4 - Part 2

Unit 2, Chapter 4

Chapter 4: Part 2

Chapter 2, Section 2-4

Chapter 4 Section 2

Chapter 4 Section 2

Chapter 2 4

Chapter 4 Lesson 2