Chapter 4-2. Chang Chi-Chung 2007.06.07. Bottom-Up Parsing. LR methods (Left-to-right, Rightmost derivation): LR(0), SLR, Canonical LR = LR(1), LALR. Other special cases: Shift-reduce parsing, Operator-precedence parsing. Operator-Precedence Parsing.
Chapter 4-2 Chang Chi-Chung 2007.06.07
Bottom-Up Parsing
• LR methods (Left-to-right, Rightmost derivation)
  • LR(0), SLR, Canonical LR = LR(1), LALR
• Other special cases:
  • Shift-reduce parsing
  • Operator-precedence parsing
Operator-Precedence Parsing
• Special case of shift-reduce parsing
• See textbook section 4.6
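Bottom-Up Parsing
• Grammar:
  E → E + T | T
  T → T * F | F
  F → ( E ) | id
• Reduction of id * id to E: id * id, F * id, T * id, T * F, T, E
• Read in reverse, the reduction steps form a rightmost derivation: E ⇒ T ⇒ T * F ⇒ T * id ⇒ F * id ⇒ id * id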
Handle Pruning
• Grammar: E → E + T | T, T → T * F | F, F → ( E ) | id
• Handle
  • A handle is a substring of grammar symbols in a right-sentential form that matches the right-hand side of a production and whose reduction is one step along the reverse of a rightmost derivation
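Handle Pruning
• A rightmost derivation in reverse can be obtained by “handle pruning”:
  S = γ0 ⇒rm γ1 ⇒rm γ2 ⇒rm … ⇒rm γn-1 ⇒rm γn = ω
• Handle definition
  • If S ⇒*rm αAω ⇒rm αβω, then the production A → β in the position following α is a handle of αβω
[Figure: parse tree for αβω with root S; the subtree rooted at A derives β, with α to its left and ω to its right]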
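Example: Handle
• Grammar:
  S → a A B e
  A → A b c | b
  B → d
• Reducing at the handle in each step: a b b c d e, a A b c d e, a A d e, a A B e, S
• Reducing at a substring that is not a handle: a b b c d e, a A b c d e, a A A c d e, … ?
  • The second b in a A b c d e matches the right-hand side of A → b, but it is NOT a handle, because further reductions will fail (the result is not a sentential form)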
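The difference between a handle and a mere right-hand-side match can be checked by brute force. The sketch below is an illustration only, not part of the original slides: it encodes the example grammar as strings, and the hypothetical helper `reduces_to_start` tries every possible sequence of reductions (not only those of a rightmost derivation), which is enough to show that the wrong reduction leads to a dead end.

```python
# Grammar from the example: S -> a A B e, A -> A b c | b, B -> d
GRAMMAR = [("S", "aABe"), ("A", "Abc"), ("A", "b"), ("B", "d")]

def reductions(form):
    """Yield every string obtained from `form` by one reduction step."""
    for head, body in GRAMMAR:
        start = form.find(body)
        while start != -1:
            yield form[:start] + head + form[start + len(body):]
            start = form.find(body, start + 1)

def reduces_to_start(form, seen=None):
    """Return True if some sequence of reductions turns `form` into S."""
    if form == "S":
        return True
    seen = set() if seen is None else seen
    if form in seen:          # already explored without success
        return False
    seen.add(form)
    return any(reduces_to_start(nxt, seen) for nxt in reductions(form))

if __name__ == "__main__":
    print(reduces_to_start("aAbcde"))   # reached by reducing the handle b
    print(reduces_to_start("aAAcde"))   # reached by reducing the wrong b
```

Reducing the first b keeps the string reducible to S, while reducing the second b in a A b c d e yields a A A c d e, from which no sequence of reductions reaches S.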
Shift-Reduce Parsing
• Shift-reduce parsing is a form of bottom-up parsing
  • A stack holds grammar symbols
  • An input buffer holds the rest of the string to be parsed
• Shift-reduce parser actions
  • shift
  • reduce
  • accept
  • error
Shift-Reduce Parsing
• Shift
  • Shift the next input symbol onto the top of the stack
• Reduce
  • The right end of the string to be reduced must be at the top of the stack
  • Locate the left end of the string within the stack and decide with what nonterminal to replace the string
• Accept
  • Announce successful completion of parsing
• Error
  • Discover a syntax error and call an error recovery routine
Shift-Reduce Parsing
• Grammar: E → E + T | T, T → T * F | F, F → ( E ) | id
Example: Shift-Reduce Parsing
• Grammar:
  S → a A B e
  A → A b c | b
  B → d
• Shift-reduce parsing corresponds to a rightmost derivation:
  S ⇒rm a A B e ⇒rm a A d e ⇒rm a A b c d e ⇒rm a b b c d e
• Reducing a sentence: a b b c d e, a A b c d e, a A d e, a A B e, S
  • The reduced substrings match the productions’ right-hand sides
[Figure: parse tree for a b b c d e growing bottom-up, one panel per reduction step]
Conflicts During Shift-Reduce Parsing
• Conflict types
  • shift-reduce
  • reduce-reduce
• Shift-reduce and reduce-reduce conflicts are caused by
  • The limitations of the LR parsing method (even when the grammar is unambiguous)
  • Ambiguity of the grammar
Shift-Reduce Conflict
• Grammar: E → E + E | E * E | ( E ) | id
Stack       Input        Action
$           id+id*id$    shift
$id         +id*id$      reduce E → id
$E          +id*id$      shift
$E+         id*id$       shift
$E+id       *id$         reduce E → id
$E+E        *id$         shift (or reduce?)
$E+E*       id$          shift
$E+E*id     $            reduce E → id
$E+E*E      $            reduce E → E * E
$E+E        $            reduce E → E + E
$E          $            accept
• How to resolve conflicts? Find the handles to be reduced
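One common way to settle the "shift or reduce?" question for this ambiguous grammar is to consult operator precedence. The sketch below is an assumption-laden illustration rather than the method of the slides: the `PREC` table (with * binding tighter than +, both left-associative) and the function `parse` are hypothetical names, and the parser is an ad hoc stack machine, not a table-driven LR parser.

```python
# Assumed precedences: * binds tighter than +; both are left-associative.
PREC = {"+": 1, "*": 2}

def parse(tokens):
    """Shift-reduce parse for E -> E+E | E*E | (E) | id; return the actions taken."""
    stack, actions = ["$"], []
    tokens = list(tokens) + ["$"]
    pos = 0
    while True:
        look = tokens[pos]
        if stack[-1] == "id":
            stack[-1] = "E"
            actions.append("reduce E -> id")
            continue
        if stack[-3:] == ["(", "E", ")"]:
            stack[-3:] = ["E"]
            actions.append("reduce E -> (E)")
            continue
        if stack[-1] == "E" and stack[-2] in PREC and stack[-3] == "E":
            op = stack[-2]
            # Shift-reduce conflict: reduce unless the lookahead binds tighter.
            if not (look in PREC and PREC[look] > PREC[op]):
                stack[-3:] = ["E"]
                actions.append(f"reduce E -> E{op}E")
                continue
        if look == "$":
            if stack == ["$", "E"]:
                actions.append("accept")
                return actions
            raise SyntaxError(f"cannot reduce stack {stack}")
        stack.append(look)          # shift the lookahead
        actions.append("shift")
        pos += 1

if __name__ == "__main__":
    for step in parse(["id", "+", "id", "*", "id"]):
        print(step)
```

For id+id*id the sketch shifts * when the stack holds $E+E, reproducing the action column of the table; for id*id+id it reduces E*E before shifting +.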
Shift-Reduce Conflict
• Ambiguous grammar: S → if E then S | if E then S else S | other
Stack                Input      Action
$…                   …$         …
$… if E then S       else …$    shift or reduce?
• Shift else onto if E then S, or reduce if E then S?
• Resolve in favor of shift, so that else matches the closest if
Reduce-Reduce Conflict
• Grammar:
  C → A B
  A → a
  B → a
Stack    Input    Action
$        aa$      shift
$a       a$       reduce A → a or B → a ?
• Resolve in favor of reduce A → a, otherwise we’re stuck (the parse cannot make further progress)
LR
• LR parsers are table-driven
  • Much like the nonrecursive LL parsers
• Reasons for using LR parsing
  • LR parsing is the most general nonbacktracking shift-reduce parsing method known, yet it can be implemented as efficiently as other shift-reduce methods
  • LR parsers can be constructed to recognize virtually all programming-language constructs for which context-free grammars can be written
  • An LR parser can detect a syntactic error as soon as it is possible to do so on a left-to-right scan of the input
  • The class of grammars that can be parsed using LR methods is a proper superset of the class of grammars that can be parsed with predictive or LL methods
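LR(0)
• An LR parser makes shift-reduce decisions by maintaining states that keep track of where it is in a parse
• An item of a grammar G is a production of G with a dot at some position of the body
• Example: the production A → X Y Z yields four items
  A → .X Y Z
  A → X.Y Z
  A → X Y.Z
  A → X Y Z.
• In the item A → X.Y Z, a string derivable from X has already been seen (it is on the stack), and a string derivable from Y Z is expected next in the input
• Note that the production A → ε has only one item, [A → .]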
LR(0)
• Canonical LR(0) collection
  • A collection of sets of LR(0) items
  • Provides the basis for constructing a DFA that is used to make parsing decisions
• LR(0) automaton
  • The DFA built from the canonical LR(0) collection for a grammar
• Constructing the collection requires
  • An augmented grammar: if G is a grammar with start symbol S, then G’ is the augmented grammar for G with new start symbol S’ and new production S’ → S
  • The closure function
  • The goto function
Function Closure
• If I is a set of items for a grammar G, then closure(I) is the set of items constructed from I by two rules:
  • Initially, add every item in I to closure(I)
  • If A → α.Bβ is in closure(I) and B → γ is a production, then add the item B → .γ to closure(I) if it is not already there. Apply this rule until no more new items can be added to closure(I)
• The items are divided into two classes
  • Kernel items: the initial item S’ → .S, and all items whose dots are not at the left end
  • Nonkernel items: all items with their dots at the left end, except for S’ → .S
Example
• The grammar G
  E’ → E
  E → E + T | T
  T → T * F | F
  F → ( E ) | id
• Let I = { E’ → .E }, then closure(I) = {
  E’ → .E
  E → .E + T
  E → .T
  T → .T * F
  T → .F
  F → .( E )
  F → .id }
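The closure rules can be written as a small worklist loop. The following is a minimal sketch, assuming an item is represented as a tuple (head, body, dot position); that representation and the names `GRAMMAR`, `closure` and `show` are illustrative choices, not something the slides prescribe.

```python
# Augmented expression grammar from the example; bodies are tuples of symbols.
GRAMMAR = {
    "E'": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",)],
}

def closure(items):
    """Return closure(I) for a set of LR(0) items (head, body, dot)."""
    result = set(items)
    work = list(items)
    while work:
        _head, body, dot = work.pop()
        # Rule 2: the dot stands before a nonterminal B, so add B -> .gamma.
        if dot < len(body) and body[dot] in GRAMMAR:
            for gamma in GRAMMAR[body[dot]]:
                item = (body[dot], gamma, 0)
                if item not in result:
                    result.add(item)
                    work.append(item)
    return frozenset(result)

def show(item):
    """Format an item with a '.' marking the dot position."""
    head, body, dot = item
    return f"{head} -> {' '.join(body[:dot])}.{' '.join(body[dot:])}"

if __name__ == "__main__":
    for item in sorted(closure({("E'", ("E",), 0)})):
        print(show(item))
```

Applied to { E’ → .E }, the sketch produces the seven items listed in the example.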
Exercise
• The grammar G
  E’ → E
  E → E + T | T
  T → T * F | F
  F → ( E ) | id
• Let I = { E → E +.T }
• closure(I) = ?
Function Goto
• Function Goto(I, X)
  • I is a set of items
  • X is a grammar symbol
• Goto(I, X) is defined to be the closure of the set of all items [A → αX.β] such that [A → α.Xβ] is in I
• The Goto function is used to define the transitions in the LR(0) automaton for a grammar
Example
• The grammar G
  E’ → E
  E → E + T | T
  T → T * F | F
  F → ( E ) | id
• I = {
  E’ → E.
  E → E.+ T }
• Goto(I, +) = {
  E → E +.T
  T → .T * F
  T → .F
  F → .( E )
  F → .id }
Constructing the LR(0) Collection
1. The grammar is augmented with a new start symbol S’ and production S’ → S
2. Initially, set C = closure({[S’ → •S]}) (this is the start state of the DFA)
3. For each set of items I ∈ C and each grammar symbol X ∈ (N ∪ T) such that GOTO(I, X) ∉ C and GOTO(I, X) ≠ ∅, add the set of items GOTO(I, X) to C
4. Repeat 3 until no more sets can be added to C
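The goto function and the construction loop can be sketched together. This is an illustration under assumptions: items are tuples (head, body, dot), and the names `GRAMMAR`, `SYMBOLS`, `closure`, `goto` and `canonical_collection` are hypothetical. The expression grammar from the earlier examples is used as input.

```python
GRAMMAR = {
    "E'": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",)],
}
# Every grammar symbol (terminals and nonterminals) that occurs in a body.
SYMBOLS = sorted({s for bodies in GRAMMAR.values() for b in bodies for s in b})

def closure(items):
    """closure(I) for LR(0) items represented as (head, body, dot)."""
    result, work = set(items), list(items)
    while work:
        _head, body, dot = work.pop()
        if dot < len(body) and body[dot] in GRAMMAR:
            for gamma in GRAMMAR[body[dot]]:
                item = (body[dot], gamma, 0)
                if item not in result:
                    result.add(item)
                    work.append(item)
    return frozenset(result)

def goto(items, symbol):
    """Move the dot over `symbol` in every item that allows it, then close."""
    moved = {(h, b, d + 1) for (h, b, d) in items
             if d < len(b) and b[d] == symbol}
    return closure(moved)

def canonical_collection(start_item):
    """Steps 2-4: grow C until no new nonempty goto set appears."""
    collection = [closure({start_item})]
    for state in collection:        # the list grows while it is traversed
        for symbol in SYMBOLS:
            target = goto(state, symbol)
            if target and target not in collection:
                collection.append(target)
    return collection

if __name__ == "__main__":
    C = canonical_collection(("E'", ("E",), 0))
    print(len(C), "sets of items")
```

Iterating over the list while appending to it plays the role of step 4: newly added sets are visited in turn until nothing more can be added.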
Structure of the LR Parsing Table
• The parsing table consists of two parts:
  • A parsing-action function ACTION
  • A goto function GOTO
• The ACTION function, ACTION[i, a], takes a state i and a terminal a (or $) and has one of four forms:
  • Shift j, where j is a state: the parser shifts input a onto the stack, but uses state j to represent a
  • Reduce A → β: the parser reduces β on the top of the stack to head A
  • Accept: the parser accepts the input and finishes parsing
  • Error: the parser discovers an error in its input
Structure of the LR Parsing Table (1)
• The GOTO function, GOTO[Ii, A], is defined on sets of items
• If GOTO[Ii, A] = Ij, then GOTO also maps a state i and a nonterminal A to state j
LR-Parser Configurations
• Configuration (= LR parser state): (stack, input)
  (s0 s1 s2 … sm, ai ai+1 … an $)
  which represents the right-sentential form X1 X2 … Xm ai ai+1 … an, where Xj is the grammar symbol associated with state sj
• If action[sm, ai] = shift s, then push s (representing ai) and advance the input:
  (s0 s1 s2 … sm, ai ai+1 … an $) ⊢ (s0 s1 s2 … sm s, ai+1 … an $)
• If action[sm, ai] = reduce A → β and goto[sm-r, A] = s with r = |β|, then pop r symbols and push s (representing A):
  (s0 s1 s2 … sm, ai ai+1 … an $) ⊢ (s0 s1 s2 … sm-r s, ai ai+1 … an $)
• If action[sm, ai] = accept, then stop
• If action[sm, ai] = error, then attempt recovery
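The four moves translate directly into a driver loop over a stack of states. The sketch below is illustrative only: the `ACTION` and `GOTO` dictionaries are a hand-built table for the tiny grammar C → A B, A → a, B → a (an assumption made for the example), and `lr_parse` is a hypothetical name. Error handling simply raises instead of attempting recovery.

```python
# Productions by number: 1. C -> A B   2. A -> a   3. B -> a
# Each entry stores (head, length of body).
PRODS = {1: ("C", 2), 2: ("A", 1), 3: ("B", 1)}

# Hand-built table for this grammar (assumed for the example).
ACTION = {
    (0, "a"): ("shift", 3),
    (1, "$"): ("accept",),
    (2, "a"): ("shift", 5),
    (3, "a"): ("reduce", 2),
    (4, "$"): ("reduce", 1),
    (5, "$"): ("reduce", 3),
}
GOTO = {(0, "C"): 1, (0, "A"): 2, (2, "B"): 4}

def lr_parse(tokens):
    """Run the LR driver; return the production numbers used, in reduction order."""
    stack = [0]                         # stack of states, s0 at the bottom
    tokens = list(tokens) + ["$"]
    pos, output = 0, []
    while True:
        act = ACTION.get((stack[-1], tokens[pos]))
        if act is None:                 # blank table entry means error
            raise SyntaxError(f"unexpected {tokens[pos]!r} in state {stack[-1]}")
        if act[0] == "shift":
            stack.append(act[1])        # push s and advance the input
            pos += 1
        elif act[0] == "reduce":
            head, r = PRODS[act[1]]
            del stack[len(stack) - r:]  # pop r = |beta| states
            stack.append(GOTO[(stack[-1], head)])
            output.append(act[1])
        else:                           # accept
            return output

if __name__ == "__main__":
    print(lr_parse("aa"))
```

For the input aa the reductions occur in the order A → a, B → a, C → A B, which is the reverse of the rightmost derivation C ⇒ A B ⇒ A a ⇒ a a.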
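Example LR Parse Table
• Grammar:
  1. E → E + T
  2. E → T
  3. T → T * F
  4. T → F
  5. F → ( E )
  6. F → id
[Table: one row per state, with action columns and goto columns. An action entry such as s5 means shift and go to state 5; an entry such as r1 means reduce by production #1]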
Example
• Grammar:
  0. S → E
  1. E → E + T
  2. E → T
  3. T → T * F
  4. T → F
  5. F → ( E )
  6. F → id
SLR Grammars
• SLR (Simple LR): a simple extension of LR(0) shift-reduce parsing
• SLR eliminates some conflicts by populating the parsing table with reductions A → α only on symbols in FOLLOW(A)
• Example grammar:
  S → E
  E → id + E
  E → id
• State I0: S → •E, E → •id + E, E → •id
• State I2 = goto(I0, id): E → id•+ E, E → id•
  • Shift on + (transition goto(I2, +))
  • FOLLOW(E) = {$}, thus reduce E → id on $
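SLR Parsing Table
• Reductions do not fill entire rows
• Otherwise the same as LR(0)
• Grammar:
  1. S → E
  2. E → id + E
  3. E → id
[Table: SLR parsing table for this grammar. In the state containing E → id•+ E and E → id•, the entry for + is a shift; FOLLOW(E) = {$}, thus the reduce entry appears only under $]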
Constructing SLR Parsing Tables
1. Augment the grammar with S’ → S
2. Construct the collection C = {I0, I1, …, In} of sets of LR(0) items
3. State i is constructed from Ii:
   • If [A → α•aβ] ∈ Ii and goto(Ii, a) = Ij, then set action[i, a] = shift j
   • If [A → α•] ∈ Ii, then set action[i, a] = reduce A → α for all a ∈ FOLLOW(A) (apply only if A ≠ S’)
   • If [S’ → S•] is in Ii, then set action[i, $] = accept
4. If goto(Ii, A) = Ij, then set goto[i, A] = j (this fills the goto table)
5. Repeat 3-4 until no more entries are added
6. The initial state i is the one constructed from the set Ii holding the item [S’ → •S]
Example SLR Grammar and LR(0) Items
• Augmented grammar:
  0. C’ → C
  1. C → A B
  2. A → a
  3. B → a
• I0 = closure({[C’ → •C]}), I1 = goto(I0, C) = closure({[C’ → C•]}), …
• States:
  State I0 (start): C’ → •C, C → •A B, A → •a
  State I1 = goto(I0, C) (final): C’ → C•
  State I2 = goto(I0, A): C → A•B, B → •a
  State I3 = goto(I0, a): A → a•
  State I4 = goto(I2, B): C → A B•
  State I5 = goto(I2, a): B → a•
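Example SLR Parsing Table
• Grammar:
  0. C’ → C
  1. C → A B
  2. A → a
  3. B → a
• States:
  State I0: C’ → •C, C → •A B, A → •a
  State I1: C’ → C•
  State I2: C → A•B, B → •a
  State I3: A → a•
  State I4: C → A B•
  State I5: B → a•
[Figure: DFA with start state 0 and transitions 0 –C→ 1, 0 –A→ 2, 0 –a→ 3, 2 –B→ 4, 2 –a→ 5, shown next to the SLR parsing table]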
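The table for this small grammar can be derived mechanically from the construction steps. The sketch below is one possible rendering, with several assumptions stated up front: items are (production index, dot) pairs, the grammar has no ε-productions (so `first` can stay very simple), and the symbol order in `SYMBOLS` is chosen only so that the generated state numbers agree with I0 to I5 above. All function names are hypothetical.

```python
# Augmented grammar; the list index is the production number.
PRODS = [("C'", ("C",)), ("C", ("A", "B")), ("A", ("a",)), ("B", ("a",))]
NONTERMS = {head for head, _ in PRODS}
SYMBOLS = ["C", "A", "B", "a"]      # order picked to reproduce I0..I5

def closure(items):
    result, work = set(items), list(items)
    while work:
        p, dot = work.pop()
        body = PRODS[p][1]
        if dot < len(body) and body[dot] in NONTERMS:
            for q, (head, _) in enumerate(PRODS):
                if head == body[dot] and (q, 0) not in result:
                    result.add((q, 0))
                    work.append((q, 0))
    return frozenset(result)

def goto(items, symbol):
    return closure({(p, d + 1) for p, d in items
                    if d < len(PRODS[p][1]) and PRODS[p][1][d] == symbol})

def first(symbol):
    """FIRST of one symbol; assumes no epsilon-productions or left recursion."""
    if symbol not in NONTERMS:
        return {symbol}
    return set().union(*(first(b[0]) for h, b in PRODS if h == symbol))

def follow_sets():
    follow = {n: set() for n in NONTERMS}
    follow[PRODS[0][0]].add("$")
    changed = True
    while changed:
        changed = False
        for head, body in PRODS:
            for i, sym in enumerate(body):
                if sym in NONTERMS:
                    new = first(body[i + 1]) if i + 1 < len(body) else follow[head]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow

def set_action(table, key, value):
    """Store an action; a different existing entry means the grammar is not SLR."""
    if table.get(key, value) != value:
        raise ValueError(f"conflict at {key}: {table[key]} vs {value}")
    table[key] = value

def slr_table():
    states, action, goto_table = [closure({(0, 0)})], {}, {}
    follow = follow_sets()
    for i, state in enumerate(states):      # states grows during the loop
        for symbol in SYMBOLS:
            target = goto(state, symbol)
            if not target:
                continue
            if target not in states:
                states.append(target)
            j = states.index(target)
            if symbol in NONTERMS:
                goto_table[i, symbol] = j
            else:
                set_action(action, (i, symbol), ("shift", j))
        for p, dot in state:
            head, body = PRODS[p]
            if dot == len(body):
                if p == 0:
                    set_action(action, (i, "$"), ("accept",))
                else:
                    for a in follow[head]:
                        set_action(action, (i, a), ("reduce", p))
    return action, goto_table

ACTION, GOTO = slr_table()
```

Because FOLLOW(A) = {a} and FOLLOW(B) = {$}, state 3 reduces by A → a only on a and state 5 reduces by B → a only on $, which is how SLR separates the two reductions that conflicted in the earlier reduce-reduce example.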
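SLR and Ambiguity
• Every SLR grammar is unambiguous, but not every unambiguous grammar is SLR (it may still be LR(1))
• Consider for example the unambiguous grammar
  1. S → L = R
  2. S → R
  3. L → * R
  4. L → id
  5. R → L
• FOLLOW(R) = {=, $}
• LR(0) items:
  I0: S’ → •S, S → •L=R, S → •R, L → •*R, L → •id, R → •L
  I1: S’ → S•
  I2: S → L•=R, R → L•
  I3: S → R•
  I4: L → *•R, R → •L, L → •*R, L → •id
  I5: L → id•
  I6: S → L=•R, R → •L, L → •*R, L → •id
  I7: L → *R•
  I8: R → L•
  I9: S → L=R•
• In state I2 both action[2, =] = s6 and action[2, =] = r5 apply: a shift-reduce conflict, so the grammar has no SLR parsing table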
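The conflict in state I2 comes down to = being a member of FOLLOW(R). A minimal sketch that checks this by computing FIRST and FOLLOW as fixed points follows; it assumes a grammar without ε-productions (true here), and the names `first_sets` and `follow_sets` are illustrative.

```python
PRODS = [
    ("S'", ("S",)),
    ("S", ("L", "=", "R")),
    ("S", ("R",)),
    ("L", ("*", "R")),
    ("L", ("id",)),
    ("R", ("L",)),
]
NONTERMS = {head for head, _ in PRODS}

def first_sets():
    """FIRST for each nonterminal; valid only without epsilon-productions."""
    first = {n: set() for n in NONTERMS}
    changed = True
    while changed:
        changed = False
        for head, body in PRODS:
            new = first[body[0]] if body[0] in NONTERMS else {body[0]}
            if not new <= first[head]:
                first[head] |= new
                changed = True
    return first

def follow_sets(first):
    follow = {n: set() for n in NONTERMS}
    follow["S'"].add("$")
    changed = True
    while changed:
        changed = False
        for head, body in PRODS:
            for i, sym in enumerate(body):
                if sym not in NONTERMS:
                    continue
                if i + 1 == len(body):          # sym ends the body
                    new = follow[head]
                elif body[i + 1] in NONTERMS:
                    new = first[body[i + 1]]
                else:
                    new = {body[i + 1]}
                if not new <= follow[sym]:
                    follow[sym] |= new
                    changed = True
    return follow

FOLLOW = follow_sets(first_sets())
# State I2 = { S -> L.=R, R -> L. }: shift on "=", reduce R -> L on FOLLOW(R).
shift_symbols = {"="}
conflicts = shift_symbols & FOLLOW["R"]

if __name__ == "__main__":
    print(sorted(FOLLOW["R"]), sorted(conflicts))
```

The = enters FOLLOW(R) through L → * R, because FOLLOW(L) contains = and everything that follows L can also follow that R. In state I2, however, R was reached directly from S → R, where = can never follow it.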
LR(1) Grammars
• SLR is too simple
• LR(1) parsing uses lookahead to avoid unnecessary conflicts in the parsing table
• LR(1) item = LR(0) item + lookahead
  • LR(0) item: [A → α•β]
  • LR(1) item: [A → α•β, a]
SLR Versus LR(1)
• Split the SLR states by adding LR(1) lookahead
• Unambiguous grammar: S → L = R | R, L → * R | id, R → L
• State I2 contains S → L•=R and R → L•; split it by item:
  • S → L•=R: action[2, =] = s6
  • R → L•: should not reduce on =, because no right-sentential form begins with R=
LR(1) Items
• An LR(1) item [A → α•β, a] contains a lookahead terminal a, meaning that α is already on top of the stack and we expect to see βa
• For items of the form [A → α•, a], the lookahead a is used to reduce A → α only if the next input is a
• For items of the form [A → α•β, a] with β ≠ ε, the lookahead has no effect
The Closure Operation for LR(1) Items
1. Start with closure(I) = I
2. If [A → α•Bβ, a] ∈ closure(I), then for each production B → γ in the grammar and each terminal b ∈ FIRST(βa), add the item [B → •γ, b] to closure(I) if it is not already there
3. Repeat 2 until no new items can be added
The Goto Operation for LR(1) Items
1. For each item [A → α•Xβ, a] ∈ I, add the set of items closure({[A → αX•β, a]}) to goto(I, X) if not already there
2. Repeat step 1 until no more items can be added to goto(I, X)
Example
• The grammar G
  S’ → S
  S → C C
  C → cC | d
• Let I = { [S’ → •S, $] }
• I0 = closure(I) = {
  [S’ → •S, $]
  [S → •C C, $]
  [C → •cC, c/d]
  [C → •d, c/d] }
• goto(I0, S) = closure({ [S’ → S•, $] }) = { [S’ → S•, $] } = I1
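The LR(1) closure and goto operations differ from their LR(0) counterparts only in carrying a lookahead. The sketch below is an illustration under assumptions: items are tuples (head, body, dot, lookahead), the grammar has no ε-productions and no left recursion (so the hypothetical helper `first` only needs to look at the first symbol of βa), and the names `closure1` and `goto1` are made up for the example.

```python
GRAMMAR = {
    "S'": [("S",)],
    "S": [("C", "C")],
    "C": [("c", "C"), ("d",)],
}

def first(symbols):
    """FIRST of a nonempty symbol string; assumes no epsilon or left recursion."""
    sym = symbols[0]
    if sym not in GRAMMAR:
        return {sym}
    return set().union(*(first(body) for body in GRAMMAR[sym]))

def closure1(items):
    """closure(I) for LR(1) items (head, body, dot, lookahead)."""
    result, work = set(items), list(items)
    while work:
        _head, body, dot, a = work.pop()
        if dot < len(body) and body[dot] in GRAMMAR:
            beta_a = body[dot + 1:] + (a,)      # the string beta followed by a
            for b in first(beta_a):
                for gamma in GRAMMAR[body[dot]]:
                    item = (body[dot], gamma, 0, b)
                    if item not in result:
                        result.add(item)
                        work.append(item)
    return frozenset(result)

def goto1(items, symbol):
    """Advance the dot over `symbol`, keep each lookahead, then close."""
    moved = {(h, b, d + 1, a) for (h, b, d, a) in items
             if d < len(b) and b[d] == symbol}
    return closure1(moved)

if __name__ == "__main__":
    I0 = closure1({("S'", ("S",), 0, "$")})
    for item in sorted(I0):
        print(item)
```

Each lookahead is stored as a separate item, so the slide's shorthand [C → •cC, c/d] corresponds to two tuples here, one with lookahead c and one with lookahead d.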
Exercise
• The grammar G
  S’ → S
  S → C C
  C → cC | d
• Let I = { [S → C•C, $] }
• I2 = closure(I) = ?
• I3 = goto(I2, c) = ?
Construction of the Sets of LR(1) Items
1. Augment the grammar with a new start symbol S’ and production S’ → S
2. Initially, set C = closure({[S’ → •S, $]}) (this is the start state of the DFA)
3. For each set of items I ∈ C and each grammar symbol X ∈ (N ∪ T) such that goto(I, X) ∉ C and goto(I, X) ≠ ∅, add the set of items goto(I, X) to C
4. Repeat 3 until no more sets can be added to C
Construction of the Canonical LR(1) Parsing Tables
1. Augment the grammar with S’ → S
2. Construct the collection C = {I0, I1, …, In} of sets of LR(1) items
3. State i of the parser is constructed from Ii:
   • If [A → α•aβ, b] ∈ Ii and goto(Ii, a) = Ij, then set action[i, a] = shift j
   • If [A → α•, a] ∈ Ii, then set action[i, a] = reduce A → α (apply only if A ≠ S’)
   • If [S’ → S•, $] is in Ii, then set action[i, $] = accept
4. If goto(Ii, A) = Ij, then set goto[i, A] = j
5. Repeat 3-4 until no more entries are added
6. The initial state i is the one constructed from the set Ii holding the item [S’ → •S, $]
Example
• The grammar G
  S’ → S
  S → C C
  C → cC | d
Example Grammar and LR(1) Items
• Unambiguous LR(1) grammar:
  S → L = R | R
  L → * R | id
  R → L
• Augment with S’ → S
• LR(1) items (next slide)