290 likes | 422 Views
More SLR /LR(1). CMSC 431 Shon Vick. Remember Computing FIRST (N). If N e First (N) includes e if N aABC First (N) includes a if N X1X2 First (N) includes First (X1) if N X1X2… and X1 e ,
E N D
More SLR /LR(1) CMSC 431 Shon Vick
RememberComputing FIRST (N) • If N e First (N) includes e • if N aABC First (N) includes a • if N X1X2 First (N) includes First (X1) • if N X1X2… and X1 e, • First (N) includes First (X2) • Obvious generalization to First (a) where a is X1X2...
Computing Follow (N) • Follow (N) is computed from productions in which N appears on the rhs • For the sentence symbol S, Follow (S) includes $ • if A a N b, Follow (N) includes First (b) • because an expansion of N will be followed by an expansion from b • if A a N, Follow (N) includes Follow (A) • because N will be expanded in the context in which A is expanded • if A a N B , B e, Follow (N) includes Follow (A)
Recall our Example • A grammar to generate all palindromes over S = { a, b } 1) S--> P 2) P --> a Pa 3) P --> b P b 4) P --> c • LR parsers work with an augmented grammar in which the start symbol never appears in the right side of a production. Here the original grammar was rules 2-4
Computing the Items • S0: S--> .P , P --> .a P a, P--> .bP b, P-->.c • S1: S--> P. • S2: P --> a.Pa, P-->.aPa,P-->.bPb,P-->.c • S3:P--> b.P b, P-->.aPa,P-->.bPb,P-->.c • S4: P--> c. • S5: P--> aP.a • S6:P--> bP.b • S7: P--> aPa. • S8: P--> bP b.
Finite State Machine • Draw the FSA. The major difference is that transitions can be both terminal and non-terminal symbols. • The Goto and Action Parts of the parsing table come from the FSA as detailed for SLR by Algo 4.8 page 227-228
FSA c Io S-> .P P -> .aPa P -> .bPb P ->.c I1 S-> P. P a I2 P -> a.Pa P -> .aPa P -> .bPb P ->.c a I5 P-> a P.a b b P I4 P-> c. c a I3 P -> b.Pb P -> .aPa P -> .bPb P ->.c a I7 P-> a Pa. c b P 1) P -> aPa 2) P -> bPb 3) P -> c b I6 P-> bP.b I8 P-> bPb.
Parsing Table Contd • Si means shift the input symbol and goto state I. • Rj means reduce by jth production. Note that we are not storing all the items in the state in our table. • example: abcba$ • if we go thru, parsing algorithm, we get
Example Contd • StateInputAction • $0 abcba$ shift • $0a2 bcba$ shift • $0a2b3 cba$ shift • $0a2b3c4 ba$ reduce • $0a2b3P6 ba$ shift • $0a2b3P6b8 a$ reduce • $0a2P5 a$ shift • $0a2P5a7 $ reduce • $0P 1 $accept
See also • Appel pages 60-62 especially Figure 3.2.1
Shift/Reduce Conflicts • An LR(0) state contains a conflict if its canonical set has two items that recommend conflicting actions. • shift/reduce conflict - when one item prompts a shift action, the other prompts a reduce action. • reduce/reduce conflict - when two items prompt for reduce actions by different production. • A grammar is said be to be LR(0) grammar, if the table does not have any conflicts.
SLR Summary • Uses DFA to recognize viable prefixes of grammar G • Each state in the DFA: • is the set of LR(0 items valid for a viable prefix • “encodes” information about the symbols that have been shifted onto the stack • Valid LR(0) items are computed by applying the closure and goto functions to the initial, valid item [S’ -> .S] (this is called the canonical collection of LR(0) items) • Uses FOLLOW to disambiguate actions
SLR(1) Summary • If A -> aAb is in Ikand goto(Ik, a) = Ij, then set actions[k,a] to sj • If A -> a is in Ikthen set actions[k,b]to rule#, for all be FOLLOW(A) • If S’ -> S. is inIkthen set actions[k,$] to accept Rules 1-3 may define conflicting actions for an entry in the actions table. In this case, the grammar is not SLR(1).
Shift/Reduce Conflict S’ -> .S S -> .A b | d c | b A c A -> .d A very simple language = {db, dc, bdc} Follow(S) = {$}, Follow(A) = {b,c} Form part f the SLR(1) parser: I0 S’ -> .S S -> .A b S -> . d c S -> . b A c A -> .d I1 S -> d .c A -> d. But since c is in Follow(A), we don’t whether to shift or reduce in I1 D1: S’ -> S ->dc D2; S’ ->S ->bAc ->bdc
Reduce/Reduce Conflict S’-> S S -> b A e | b B d | A c A -> d B -> Ec E-> d S’ -> S -> Ac -> dc S’ ->S -> bBd -> bEcd -> bdcd S’ -> S -> bAe -> bde d I2 A -> d. E -> d . I0 S’ -> .S S -> . b A e S -> .b B d S -> .A c A -> .d I1 S -> b .A e S -> b .B d A -> . d B -> .E c E -> .d b Which reduction should be taken? There is not enough context to decide!
LR(1) • Solution: keep more information about context. Namely keep track what next input symbol can be as part of DFA state • Idea: keep an input look-ahead as part of each item - these are called LR(1) items • Always a subset of Follow(A) for any non-terminal A (may not be a proper subset) • Can give rise to larger parsers (i.e. many states) than SLR but recognizes a greater number of constructs
LR(k) Items • The table construction algorithm for an LR(k) parser uses LR(k) items to represent the set of possible states in a parse • An LR(k) item is a pair [a, b], where • ais a production from G with a “.”at some position in the rhs • bis a look-ahead string containing k (where k is typically 1) symbols that are terminals or $ • Example LR(1) item [A -> X . Y Z , a] b a
LR(k) Items • What’s the point of the look-ahead symbols? • Carry them along to allow us to choose correct reduction when there is any choice • Look-ahead symbols are bookkeeping unless item has unless reducing (i.e. has a “.” at the right end) [A -> X . Y Z , a] [A -> X Y Z . , a] Use to Guide Reduction No Use The point: for [A -> a ., a] and [B -> a ., b], we can decide between reducing to A or to B by looking at limited right context
* rm rm Canonical LR(1) Items • The canonical collection of sets of LR(1) items: • Sets of valid items for viable prefixes of the grammar • Sets of items derivable from [S’ -> .S, $] using Goto and Closure • We say that an LR(1) item [A -> a . b, a] is valid for a viable prefix g if there is a derivation S d A w dab w where g = d a and either a is the first symbol of w, or w is e and a is $
LR(1) closure • Given a LR(1) item [A ->a .b, a], its closure contains the item and any other LR(1) items that can generate legal sub-strings to followa . • Thus, if the parser has the suffix a of a viable prefix on its stack, a sub-string of the input should reduce to Bb(or somegfor some other item [B ->. g, b] in the closure).
Computing LR(1) closure • Regard Algo given in Appel on page pg 63 • How is this different than the previous algorithm for computing closure given pg 60?
An Algorithm for Computing Closure function closure(I) repeat new_item <- false for each item [A -> a . B b ,a] e I, each production B -> g e G, and each terminal b e FIRST(ba), if [B -> .g,b] !e I then add [B -> .g,b] to I new_item <- true endif endfor until (new_item = false) return I
LR(1) Goto • Let I be a set of LR(1) items and X be a grammar symbol • Then, Goto(I,X) is the closure of the set of all items [A -> aX. b, a]such that [A -> a .X b, a] • That is, if I is the set of valid items for some viable prefixg , then goto(I,X) is the set of valid items for the viable prefixg X
LR(1) Goto function goto(I, X) J <- set of items [A -> a X.b,a] such that [A -> a .X b,a] e I J’ <- closure(J) return J’
Canonical Collection of Sets of LR(1) Items • We start the construction of the canonical collection of sets of LR(1) items with the item [S’ -> .S, $], where • S’ is the start symbol of the augmented grammar G • S is the start symbol of G • $ is the right end of sentence marker
Computing the Canonical Collection of Sets of LR(1) Items procedure items(G’) C <- closure({[S’ -> .S,$]}) repeat new_item <- false for each set of items I in C and each grammar symbol X such that goto(I,X) != {} and goto(I,X) ! e C add goto(I,X) to C new_item <- true endfor until (new_item = false)
LR(1) Table Construction • Construct the canonical collection of sets of LR(1) items for G • State i of the parser is constructed from Ii • If [A -> a. a b, b] e Ii and Goto(ii ,a) = Ij then set ACTION[i,a] to Shift j – here a must be a terminal • If [A -> a. , a] e Ii then set ACTION[i,a] to Reduce using A -> a • If [S’ -> S. , S] e Ii then set ACTION[i,$] to Accept • If Goto(Ii,A) = Ij then set Goto[i,A] to j • All other entries in ACTION and GOTO are set to Error • The initial state of the parser is the state constructed from the set containing the item [S’ -> .S,$]
References • http://www.cs.nyu.edu/courses/spring02/G22.2130-001/parsing2.ppt • http://www.cs.rpi.edu/~moorthy/Courses/compiler98/Lectures/lecturesinppt/lecture7.ppt • http://www.cs.rutgers.edu/~tdnguyen/classes/cs415/lectures/lecture11.pdf • Modern Compiler Implementation in Java, Andrew Appel, Cambridge University Press