Sections 4.5, 4.6
Aggelos Kiayias
Computer Science & Engineering Department, The University of Connecticut
371 Fairfield Road, Box U-155, Storrs, CT 06269-1155
aggelos@cse.uconn.edu
http://www.cse.uconn.edu/~akiayias
Bottom Up Parsing
• “Shift-Reduce” Parsing
• Reduce a string to the start symbol of the grammar.
• At every step a particular substring is matched (in left-to-right fashion) to the right side of some production and is then replaced by the non-terminal on the left-hand side of that production.
Consider:
S → aABe
A → Abc | b
B → d
Reduction sequence: abbcde, aAbcde, aAde, aABe, S
Rightmost derivation: S ⇒ aABe ⇒ aAde ⇒ aAbcde ⇒ abbcde
Handles
• Handle of a string = substring that matches the RHS of some production AND whose reduction to the non-terminal on the LHS is a step along the reverse of some rightmost derivation.
• Formally: a handle of a right sentential form γ is a pair <A → β, location of β in γ> that satisfies the above property.
• i.e. A → β is a handle of αβw at the location immediately after the end of α, if: S ⇒*rm αAw ⇒rm αβw
• A certain sentential form may have many different handles.
• Right sentential forms of a non-ambiguous grammar have one unique handle [but potentially many substrings that look like handles!].
Example
Consider:
S → aABe
A → Abc | b
B → d
S ⇒ aABe ⇒ aAde ⇒ aAbcde ⇒ abbcde
It follows that:
S → aABe is a handle of aABe in location 1.
B → d is a handle of aAde in location 3.
A → Abc is a handle of aAbcde in location 2.
A → b is a handle of abbcde in location 2.
Example, II
Grammar:
S → aABe
A → Abc | b
B → d
Consider aAbcde (it is a right sentential form).
Is [A → b, aAbcde] a handle?
If it is, then there must be a derivation: S ⇒rm … ⇒rm aAAbcde ⇒rm aAbcde
There is no way ever to get two consecutive A’s in this grammar. ⇒ Impossible
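Example, III
Grammar:
S → aABe
A → Abc | b
B → d
Consider aAbcde (it is a right sentential form).
Is [B → d, aAbcde] a handle?
If it is, then there must be a derivation: S ⇒rm … ⇒rm aAbcBe ⇒rm aAbcde
We try to obtain aAbcBe: S ⇒rm aABe ⇒?? aAbcBe
aAbcBe is not a right sentential form: in a rightmost derivation from aABe, B must be rewritten before A, so A cannot be expanded while B is still present.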
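The two negative answers above can be checked mechanically. The following is a minimal sketch, assuming single-character symbols and an arbitrary length bound of 8 (the function name and the bound are illustrative, not from the slides). It enumerates every form reachable by rightmost derivations; because no production shortens a form, the bound does not hide any form of length 8 or less.

```python
from collections import deque

# Grammar from the slides: S -> aABe, A -> Abc | b, B -> d
GRAMMAR = {"S": ["aABe"], "A": ["Abc", "b"], "B": ["d"]}

def right_sentential_forms(start="S", max_len=8):
    """All forms reachable by rightmost derivations, up to max_len symbols."""
    seen = {start}
    queue = deque([start])
    while queue:
        form = queue.popleft()
        # Position of the rightmost non-terminal, if there is one.
        idx = max((i for i, c in enumerate(form) if c in GRAMMAR), default=None)
        if idx is None:
            continue  # only terminals left: nothing to expand
        for rhs in GRAMMAR[form[idx]]:
            new = form[:idx] + rhs + form[idx + 1:]
            if len(new) <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return seen

forms = right_sentential_forms()
print("aAbcde" in forms)   # the form under discussion
print("aAAbcde" in forms)  # needed for [A -> b, aAbcde] to be a handle
print("aAbcBe" in forms)   # needed for [B -> d, aAbcde] to be a handle
```

Under these assumptions only the first membership test succeeds, which matches the argument on the two slides.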
Handle Pruning
• A rightmost derivation in reverse can be obtained by “handle-pruning.”
• Apply this to the previous example.
S → aABe
A → Abc | b
B → d

Right sentential form    Reducing production
abbcde                   A → b
aAbcde                   A → Abc
aAde                     B → d
aABe                     S → aABe
S
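As a sanity check, the pruning steps can be replayed in a few lines. This is only a sketch: the handle locations are copied from the earlier example slide rather than computed, and the names `STEPS` and `prune` are made up for illustration.

```python
# Grammar from the slides: S -> aABe, A -> Abc | b, B -> d
# Each step: (1-based location of the handle, LHS, RHS).
# The steps are taken from the slides; locating them is the hard part
# and is NOT done here.
STEPS = [
    (2, "A", "b"),
    (2, "A", "Abc"),
    (3, "B", "d"),
    (1, "S", "aABe"),
]

def prune(form, steps):
    """Replay the reductions and return every sentential form visited."""
    forms = [form]
    for loc, lhs, rhs in steps:
        start = loc - 1
        # The handle must literally occur at the stated location.
        assert form[start:start + len(rhs)] == rhs, (form, loc, rhs)
        form = form[:start] + lhs + form[start + len(rhs):]
        forms.append(form)
    return forms

forms = prune("abbcde", STEPS)
for f in forms:
    print(f)
```

Reading the printed forms bottom to top gives the rightmost derivation of abbcde.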
Handle Pruning, II
• Consider the cut of a parse-tree of a certain right sentential form.
[Figure: parse tree rooted at S with an interior node A above the handle. Labels: left part, handle, remaining input (only terminals here), viable prefix.]
Shift Reduce Parsing with a Stack
• The “big” problem: given the sentential form, locate the handle.
• General idea for S-R parsing using a stack:
1. “shift” input symbols onto the stack until a handle is found on top of it.
2. “reduce” the handle to the corresponding non-terminal.
(other operations: “accept” when the input is consumed and only the start symbol is on the stack; also “error”.)
• Viable prefix: prefix of a right sentential form that appears on the stack of a Shift-Reduce parser.
What happens with ambiguous grammars
Consider: E → E + E | E * E | ( E ) | id
Derive id+id*id by two different rightmost derivations.
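For reference, the two rightmost derivations can be written out as follows. The first expands E + E at the top (so * ends up lower in the parse tree), the second expands E * E at the top. The subscript rm follows the notation used earlier for rightmost derivation steps.

```latex
\begin{align*}
E &\Rightarrow_{rm} E + E \Rightarrow_{rm} E + E * E \Rightarrow_{rm} E + E * \mathbf{id}
   \Rightarrow_{rm} E + \mathbf{id} * \mathbf{id} \Rightarrow_{rm} \mathbf{id} + \mathbf{id} * \mathbf{id} \\
E &\Rightarrow_{rm} E * E \Rightarrow_{rm} E * \mathbf{id} \Rightarrow_{rm} E + E * \mathbf{id}
   \Rightarrow_{rm} E + \mathbf{id} * \mathbf{id} \Rightarrow_{rm} \mathbf{id} + \mathbf{id} * \mathbf{id}
\end{align*}
```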
Example
Grammar: E → E + E | E * E | ( E ) | id

STACK    INPUT            Remark
$        id + id * id$    Shift
$ id     + id * id$       Reduce by E → id
$ E      + id * id$
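The trace can be continued to the end with a small replay script. This is a sketch under an explicit assumption: the action list below is hand-written and corresponds to only one of the two possible parses (the one that reduces E * E before E + E). Nothing here decides between shift and reduce; the names `ACTIONS` and `replay` are illustrative.

```python
# Grammar: E -> E + E | E * E | ( E ) | id
# ASSUMPTION: hand-written action list for one of the two possible parses.
SHIFT = "shift"
ACTIONS = [
    SHIFT, ("E", ["id"]),
    SHIFT, SHIFT, ("E", ["id"]),
    SHIFT, SHIFT, ("E", ["id"]),
    ("E", ["E", "*", "E"]),
    ("E", ["E", "+", "E"]),
]

def replay(tokens, actions):
    """Apply shift/reduce actions and return the (stack, input) rows visited."""
    stack, rest = ["$"], list(tokens) + ["$"]
    rows = [(" ".join(stack), " ".join(rest))]
    for action in actions:
        if action == SHIFT:
            stack.append(rest.pop(0))
        else:
            lhs, rhs = action
            # A reduction is only legal if the handle sits on top of the stack.
            assert stack[-len(rhs):] == rhs, (stack, rhs)
            del stack[-len(rhs):]
            stack.append(lhs)
        rows.append((" ".join(stack), " ".join(rest)))
    return rows

rows = replay(["id", "+", "id", "*", "id"], ACTIONS)
for stack, rest in rows:
    print(f"{stack:<12} {rest}")
```

With the stack at `$ E + E` and `*` as the next input, the list above chooses to shift; choosing to reduce by E → E + E instead yields the other parse.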
Conflicts
• Conflicts [appear in ambiguous grammars]: either “shift/reduce” or “reduce/reduce”.
• Another example:
stmt → if expr then stmt
     | if expr then stmt else stmt
     | other (any other statement)

Stack            Input
if … then        else …        Shift/Reduce conflict
More Conflicts
stmt → id ( parameter-list )
stmt → expr := expr
parameter-list → parameter-list , parameter | parameter
parameter → id
expr-list → expr-list , expr | expr
expr → id | id ( expr-list )
Consider the string A(I,J).
The corresponding token stream is id ( id , id ).
After three shifts: Stack = id ( id    Input = , id )
Reduce/Reduce conflict … what to do? (It really depends on what A is: an array or a procedure?)
Removing Conflicts
• One way is to manipulate the grammar.
• cf. what we did in the top-down approach to transform a grammar so that it is LL(1).
• Nevertheless:
• We will see that shift/reduce and reduce/reduce conflicts are best dealt with after they are discovered.
• This simplifies the design.
Operator-Precedence Parsing
• Problems encountered so far in shift/reduce parsing:
• identify a handle.
• resolve conflicts (if they occur).
• Operator grammars: a class of grammars where handle identification and conflict resolution are easy.
• Operator grammars: no production right side is ε or has two adjacent non-terminals.
• Note: such a grammar is typically ambiguous.
E → E - E | E + E | E * E | E / E | E ^ E | - E | ( E ) | id
Basic Technique
• Resolving ambiguity:
• For the terminals of the grammar, define the relations <. , .=. and .>
• a <. b means that a yields precedence to b.
• a .=. b means that a has the same precedence as b.
• a .> b means that a takes precedence over b.
• E.g. * .> + or + <. *
• Many handles are possible. We will use <. , .=. and .> in a clever way to find the correct handle (i.e., the one that respects the precedence).
Using Operator-Precedence Relations
• GOAL: delimit the handle of a right sentential form.
• <. will mark the beginning, .> will mark the end and .=. will be in between.
• Since no two adjacent non-terminals appear in the RHS of any production, the same is true for any sentential form.
• So a sentential form has the shape β0 a1 β1 a2 β2 … an βn, where each ai is a terminal and each βi is either a non-terminal or the empty string.
• We drop all non-terminals and we write the corresponding relation between each consecutive pair of terminals.
• Example for $id+id*id$ using standard precedence: $<.id.>+<.id.>*<.id.>$
• Example for $E+E*id$ … $<.+<.*<.id.>$
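The annotation step is easy to sketch in code. The relation table below is an assumption: it encodes "standard precedence" for the terminals id, +, * and $ only (* binds tighter than +, both left-associative), and `annotate` is an illustrative name. The single non-terminal is written E.

```python
# ASSUMED relation table for the terminals id, +, * and $.
# Pairs that are missing (e.g. id followed by id) are errors.
REL = {
    ("$", "id"): "<.", ("$", "+"): "<.", ("$", "*"): "<.",
    ("id", "+"): ".>", ("id", "*"): ".>", ("id", "$"): ".>",
    ("+", "id"): "<.", ("+", "+"): ".>", ("+", "*"): "<.", ("+", "$"): ".>",
    ("*", "id"): "<.", ("*", "+"): ".>", ("*", "*"): ".>", ("*", "$"): ".>",
}

def annotate(symbols):
    """Drop non-terminals (here: 'E') and insert the relation between
    each consecutive pair of terminals."""
    terminals = [s for s in symbols if s != "E"]
    out = [terminals[0]]
    for a, b in zip(terminals, terminals[1:]):
        out.append(REL[(a, b)])
        out.append(b)
    return "".join(out)

print(annotate(["$", "id", "+", "id", "*", "id", "$"]))
print(annotate(["$", "E", "+", "E", "*", "id", "$"]))
```

Both printed strings should agree with the two examples on the slide.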
Using Operator-Precedence
• … Then
1. Scan the string to discover the first .>
2. Scan backwards skipping .=. (if any) until a <. is found. (we will associate to the right)
3. The handle is the substring delimited by the two steps above (including any in-between or surrounding non-terminals).
E.g. consider the sentential form E+E*E: we obtain $+*$ and from this the string $ <. + <. * .> $
• The handle is E*E
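The three steps can be sketched as a small function. The relation table is the same assumed table for id, +, * and $ as in standard precedence, the non-terminal is written E, and the input is assumed to be delimited by $ on both sides; `find_handle` is an illustrative name.

```python
# ASSUMED relation table for the terminals id, +, * and $.
REL = {
    ("$", "id"): "<.", ("$", "+"): "<.", ("$", "*"): "<.",
    ("id", "+"): ".>", ("id", "*"): ".>", ("id", "$"): ".>",
    ("+", "id"): "<.", ("+", "+"): ".>", ("+", "*"): "<.", ("+", "$"): ".>",
    ("*", "id"): "<.", ("*", "+"): ".>", ("*", "*"): ".>", ("*", "$"): ".>",
}

def find_handle(symbols):
    """Return the handle of a $-delimited sentential form, or None."""
    t = [i for i, s in enumerate(symbols) if s != "E"]   # terminal positions
    # rels[k] is the relation between terminal k and terminal k + 1.
    rels = [REL[(symbols[i], symbols[j])] for i, j in zip(t, t[1:])]
    if ".>" not in rels:
        return None
    end = rels.index(".>")            # step 1: first .>
    k = end - 1
    while k >= 0 and rels[k] != "<.":  # step 2: scan back, skipping .=.
        k -= 1
    if k < 0:
        return None
    # Step 3: everything strictly between the two delimiting terminals,
    # which automatically includes the surrounding non-terminals.
    return symbols[t[k] + 1 : t[end + 1]]

print(find_handle(["$", "E", "+", "E", "*", "E", "$"]))
print(find_handle(["$", "id", "+", "id", "*", "id", "$"]))
```

Slicing between the terminal before the <. and the terminal after the .> is what pulls the adjacent non-terminals into the handle.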
Operator Precedence Parser
Set ip to point to the first symbol of w$
Stack = $
Repeat forever:
    if topofstack == $ and ip == $ then accept
    else {
        a = topofstack; b = ip;
        if a <. b or a .=. b then { push(b); advance ip; }
        else if a .> b then
            repeat pop() until the top stack terminal is related by <. to the terminal most recently popped
        else error
    }
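A runnable version of this loop might look as follows. It is a sketch, not the course's implementation: the relation table is an assumed one for id, +, * and $ only, the stack holds terminals only (as in the trace on the next slide), and `parse` returns the popped handles instead of building a tree.

```python
# ASSUMED relation table for the terminals id, +, * and $.
REL = {
    ("$", "id"): "<.", ("$", "+"): "<.", ("$", "*"): "<.",
    ("id", "+"): ".>", ("id", "*"): ".>", ("id", "$"): ".>",
    ("+", "id"): "<.", ("+", "+"): ".>", ("+", "*"): "<.", ("+", "$"): ".>",
    ("*", "id"): "<.", ("*", "+"): ".>", ("*", "*"): ".>", ("*", "$"): ".>",
}

def parse(tokens):
    """Return the terminals popped at each reduction, in order.
    Raises SyntaxError when no relation holds between a and b."""
    stack = ["$"]
    tokens = list(tokens) + ["$"]
    ip = 0
    reductions = []
    while True:
        a, b = stack[-1], tokens[ip]
        if a == "$" and b == "$":
            return reductions                 # accept
        rel = REL.get((a, b))
        if rel in ("<.", ".=."):
            stack.append(b)                   # shift
            ip += 1
        elif rel == ".>":
            popped = []
            while True:                       # reduce
                top = stack.pop()
                popped.append(top)
                # Stop once the new top is related by <. to the
                # terminal most recently popped.
                if REL.get((stack[-1], top)) == "<.":
                    break
            reductions.append(popped[::-1])
        else:
            raise SyntaxError(f"no relation between {a!r} and {b!r}")

result = parse(["id", "+", "id", "*", "id"])
print(result)
```

Because non-terminals are ignored, a parser of this kind may accept some strings that are not in the language unless each reduction is also checked against the productions.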
Example

STACK       INPUT            Remark
$           id + id * id$    $ <. id
$ id        + id * id$       id .> +
$           + id * id$       $ <. +
$ +         id * id$         + <. id
$ + id      * id$            id .> *
$ +         * id$            + <. *
$ + *       id$              * <. id
$ + * id    $                id .> $
$ + *       $                * .> $
$ +         $                + .> $
$           $                accept

A sequence of pops corresponds to the application of some of the productions.
Operator Precedence Table Construction
• Basic techniques for operators:
• If operator θ1 has higher precedence than operator θ2, then set θ1 .> θ2 and θ2 <. θ1.
• If the operators are of equal precedence (or the same operator), set
  θ1 .> θ2 and θ2 .> θ1 if the operators associate to the left,
  θ1 <. θ2 and θ2 <. θ1 if the operators associate to the right.
• For every operator θ, make θ <. ( and ( <. θ and ) .> θ and θ .> )
• id has higher precedence than any other symbol.
• $ has lowest precedence.
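These rules translate almost directly into a table builder. The sketch below makes some assumptions: operators are given as precedence levels (lowest first), all operators on one level share the same associativity, and only the entries named by the rules above are filled in (relations among parentheses, id and $ themselves, apart from $ versus id, are left out). `build_table` is an illustrative name.

```python
def build_table(levels, right_assoc=()):
    """levels: lists of operators, lowest precedence first.
    right_assoc: operators that associate to the right."""
    prec = {op: i for i, ops in enumerate(levels) for op in ops}
    rel = {}
    for a in prec:
        for b in prec:
            if prec[a] > prec[b]:
                rel[(a, b)] = ".>"
            elif prec[a] < prec[b]:
                rel[(a, b)] = "<."
            elif a in right_assoc:
                rel[(a, b)] = "<."      # equal precedence, right-associative
            else:
                rel[(a, b)] = ".>"      # equal precedence, left-associative
    for op in prec:
        rel[(op, "(")] = "<."
        rel[("(", op)] = "<."
        rel[(")", op)] = ".>"
        rel[(op, ")")] = ".>"
        rel[(op, "id")] = "<."          # id has higher precedence
        rel[("id", op)] = ".>"
        rel[("$", op)] = "<."           # $ has lowest precedence
        rel[(op, "$")] = ".>"
    rel[("$", "id")] = "<."
    rel[("id", "$")] = ".>"
    return rel

table = build_table([["+", "-"], ["*", "/"], ["^"]], right_assoc={"^"})
print(table[("*", "+")], table[("+", "*")], table[("^", "^")])
```

Keeping associativity as a separate set keeps the precedence levels readable; a per-level flag would work just as well.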
Unary Operators • Unary operators that are not also used as binary operators are treated as before. • Problem: the – sign. • Typical solution: have the lexical analyzer return a different token when it sees a unary minus.
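One common way for a lexer to make this distinction is to look at the previous token: a minus is binary only when it follows an operand or a closing parenthesis. The sketch below assumes this heuristic; the token names MINUS and UMINUS are hypothetical, every identifier or number is reported simply as id, and characters outside the recognized set are silently skipped.

```python
import re

def tokenize(text):
    """Split text into tokens, returning UMINUS for a unary minus."""
    tokens = []
    for lexeme in re.findall(r"[A-Za-z_]\w*|\d+|[-+*/^()]", text):
        if lexeme == "-":
            prev = tokens[-1] if tokens else None
            # Binary only after an operand or ')'; otherwise unary.
            tokens.append("MINUS" if prev in ("id", ")") else "UMINUS")
        elif lexeme[0].isalnum() or lexeme[0] == "_":
            tokens.append("id")
        else:
            tokens.append(lexeme)
    return tokens

print(tokenize("-a - (-b) * c"))
```

With two distinct tokens, the precedence table can give UMINUS its own row and column, independent of binary minus.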