Lecture #10, Feb. 14, 2007
• Modified sets of items construction
• Rules for building LR parse tables
• The Action rules
• The GOTO rules
• Conflicts and ambiguity
• Shift-reduce and reduce-reduce conflicts
• Parser generators and ambiguity
• Ambiguous expression grammar
• Ambiguous if-then-else grammar
• Ml-yacc
Assignments
• Homework
  • Assignment 7 will be accepted until the end of the week
    • good review for exam!
  • Assignment 8 (paper and pencil) is posted & due Mon. Feb 19
    • good review for exam!
  • Assignment 9 (programming) is posted & due Wed. Feb 21
    • in case you're interested
• Project 1 is due today.
  • Email me the code
  • Name files with your last name as discussed in the Project 1 description.
• Midterm Exam will be Monday, Feb 19, 2007
  • Exam will be closed book
  • Exam will take 60 minutes
  • We will have a short lecture after the exam.
• Project 2 will be assigned next Monday, Feb 19, 2007
To facilitate Table building
• To facilitate table building we modify the sets of items construction slightly.
• Each item now has three components:
  • A production
  • A location for the dot
  • A terminal symbol that can validly follow the production. This is similar to, but not quite the same as, the terminals in the Follow set of the production's left-hand side.
• Examples:
  • [ Start → . E, EOF ]
  • [ T → T . * F, + ]
• Grammar:
  Start → E
  E → E + T | T
  T → T * F | F
  F → ( E ) | id
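One possible way to represent a modified item in code is sketched below. The `Item` class, its field names, and the `show` helper are illustrative assumptions, not something the lecture defines; the two values built at the end are shaped like the example items on this slide.

```python
from typing import NamedTuple, Optional, Tuple

class Item(NamedTuple):
    """Hypothetical representation of a modified (three-part) item."""
    lhs: str                # left-hand side of the production
    rhs: Tuple[str, ...]    # right-hand side symbols of the production
    dot: int                # location of the dot within rhs
    lookahead: str          # terminal that may follow the production

    def next_symbol(self) -> Optional[str]:
        """Symbol just after the dot, or None when the dot is at the end."""
        return self.rhs[self.dot] if self.dot < len(self.rhs) else None

    def show(self) -> str:
        """Render the item in the bracket notation used on the slides."""
        body = list(self.rhs)
        body.insert(self.dot, ".")
        return f"[ {self.lhs} → {' '.join(body)}, {self.lookahead} ]"

start_item = Item("Start", ("E",), 0, "EOF")
mid_item = Item("T", ("T", "*", "F"), 1, "+")
print(start_item.show())
print(mid_item.show())
```

Because a `NamedTuple` is hashable, items of this shape can be stored directly in Python sets, which is convenient for the closure and GOTO constructions.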
Modified Closure
• Let I be a set of modified items
• Then Closure(I) =
  • For each i ∈ I, where i = [A → α . B β, x]
  • For each p ∈ Productions where p = B → γ
  • For each t ∈ Terminals where t ∈ First(βx)
  • Add [B → . γ, t] to Closure(I) if it's not already there
• Repeat until no new items can be added
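The closure rule can be sketched in Python as follows. This is a minimal illustration under stated assumptions: items are plain tuples `(lhs, rhs, dot, lookahead)`, the grammar is the expression grammar from the earlier slide, and because that grammar has no empty productions, First(βx) is simply the First set of the first symbol of β, or {x} when β is empty. The names `GRAMMAR`, `first`, and `closure` are hypothetical.

```python
GRAMMAR = {
    "Start": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",)],
}

def first(sym, seen=frozenset()):
    """First set of a single symbol (assumes no empty productions)."""
    if sym not in GRAMMAR:           # a terminal is its own First set
        return {sym}
    if sym in seen:                  # already being expanded: adds nothing new
        return set()
    result = set()
    for rhs in GRAMMAR[sym]:
        result |= first(rhs[0], seen | {sym})
    return result

def closure(items):
    """Closure of a set of (lhs, rhs, dot, lookahead) items."""
    result = set(items)
    work = list(items)
    while work:                      # repeat until nothing new is added
        lhs, rhs, dot, x = work.pop()
        if dot == len(rhs) or rhs[dot] not in GRAMMAR:
            continue                 # dot is at the end or before a terminal
        B, beta = rhs[dot], rhs[dot + 1:]
        lookaheads = first(beta[0]) if beta else {x}   # First(beta x)
        for gamma in GRAMMAR[B]:
            for t in lookaheads:
                item = (B, gamma, 0, t)
                if item not in result:
                    result.add(item)
                    work.append(item)
    return frozenset(result)

I0 = closure({("Start", ("E",), 0, "EOF")})
print(len(I0))   # 17 items for this grammar
```

Note that the lookahead `*` appears on the T and F items but never on the E items, since `*` can follow a T inside `T * F` but cannot directly follow an E.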
Modified GOTO
• GOTO(I,X) =
  • For each item in I of the form [A → α . X β, a]
    • i.e. the dot comes just before the X
  • Let J be the set of items [A → α X . β, a]
    • i.e. move the dot after the X
  • Return the Closure(J)
Modified Sets of items Construction
• Start with a grammar with a Start symbol with only 1 production: Start → E
• If the grammar isn't of that form, create a new grammar that is of that form with a new start symbol that accepts the same set of strings.
• C := { Closure( { [ Start → . E, EOF] }) }
• Repeat
  • For each set of items I ∈ C
  • For each X ∈ NonTerminal ∪ Terminal
  • Compute new := GOTO(I,X)
  • If new is not empty, and new is not already in C, add it to C
• Until no new sets of items can be added to C
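A compact sketch of GOTO and the sets of items construction, written for the expression grammar used in this lecture. It is an illustration only: the tuple item layout `(lhs, rhs, dot, lookahead)` and the names `GRAMMAR`, `SYMBOLS`, `first`, `closure`, `goto`, and `sets_of_items` are assumptions, and `first` relies on the grammar having no empty productions.

```python
GRAMMAR = {
    "Start": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",)],
}
# Every terminal and nonterminal, in a fixed order.
SYMBOLS = sorted(set(GRAMMAR) | {s for ps in GRAMMAR.values() for rhs in ps for s in rhs})

def first(sym, seen=frozenset()):
    """First set of one symbol (assumes no empty productions)."""
    if sym not in GRAMMAR:
        return {sym}
    if sym in seen:
        return set()
    return set().union(*(first(rhs[0], seen | {sym}) for rhs in GRAMMAR[sym]))

def closure(items):
    result, work = set(items), list(items)
    while work:
        lhs, rhs, dot, x = work.pop()
        if dot < len(rhs) and rhs[dot] in GRAMMAR:
            beta = rhs[dot + 1:]
            for gamma in GRAMMAR[rhs[dot]]:
                for t in (first(beta[0]) if beta else {x}):
                    item = (rhs[dot], gamma, 0, t)
                    if item not in result:
                        result.add(item)
                        work.append(item)
    return frozenset(result)

def goto(items, X):
    """Move the dot over X in every item that allows it, then close."""
    moved = {(lhs, rhs, dot + 1, a) for lhs, rhs, dot, a in items
             if dot < len(rhs) and rhs[dot] == X}
    return closure(moved)            # the closure of the empty set is empty

def sets_of_items():
    C = [closure({("Start", ("E",), 0, "EOF")})]
    for I in C:                      # sets appended below are visited too
        for X in SYMBOLS:
            new = goto(I, X)
            if new and new not in C:
                C.append(new)
    return C

C = sets_of_items()
print(len(C), "sets of items")
```

Iterating over the list `C` while appending to it plays the role of the "repeat until no new sets can be added" loop: the loop ends exactly when a full pass adds nothing.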
Building Tables for LR parsers • Once the sets of items have been constructed, then the tables can be constructed by using • The set of items • The GOTO construction • The grammar • Each set of items corresponds to a state. • States and Terminals index the ACTION table • States and Non-Terminals index the GOTO table
Construction of ACTION table
• Let C be the sets of items constructed for a grammar. There is one state i for each set c_i ∈ C
• If [A → α . a β, b] ∈ c_i and GOTO(c_i, a) = c_j, then set ACTION[i, a] to shift j (note that “a” is a terminal symbol)
• If [A → α . , a] ∈ c_i and A is not Start, then set ACTION[i, a] to reduce(A → α)
• If [Start → E . , EOF] ∈ c_i, then set ACTION[i, EOF] to accept
• Any conflict in these rules means the grammar is not LR(1). Every ambiguous grammar produces such conflicts.
Construction of GOTO table
• If GOTO(c_i, A) = c_j, then set GOTO[i, A] to j (note that “A” is a Non-Terminal symbol)
• All other entries are error entries
• The Start state of the parser is the state derived from Closure( { [ Start → . E, EOF] })
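The ACTION and GOTO rules can be put together into a small working recognizer. The sketch below is one possible arrangement, not the lecture's code: items are tuples `(production number, dot, lookahead)`, the grammar is the unambiguous expression grammar, `first` assumes no empty productions, and all function names are hypothetical. A conflict raises an error instead of being resolved.

```python
GRAMMAR = [                              # production number = list index
    ("Start", ("E",)),
    ("E", ("E", "+", "T")), ("E", ("T",)),
    ("T", ("T", "*", "F")), ("T", ("F",)),
    ("F", ("(", "E", ")")), ("F", ("id",)),
]
NONTERMS = {lhs for lhs, _ in GRAMMAR}
SYMBOLS = sorted(NONTERMS | {s for _, rhs in GRAMMAR for s in rhs})

def first(sym, seen=frozenset()):
    """First set of one symbol (assumes no empty productions)."""
    if sym not in NONTERMS:
        return {sym}
    if sym in seen:
        return set()
    return set().union(*(first(rhs[0], seen | {sym})
                         for lhs, rhs in GRAMMAR if lhs == sym))

def closure(items):
    result, work = set(items), list(items)
    while work:
        p, dot, x = work.pop()
        rhs = GRAMMAR[p][1]
        if dot < len(rhs) and rhs[dot] in NONTERMS:
            beta = rhs[dot + 1:]
            for q, (lhs, _) in enumerate(GRAMMAR):
                if lhs == rhs[dot]:
                    for t in (first(beta[0]) if beta else {x}):
                        if (q, 0, t) not in result:
                            result.add((q, 0, t))
                            work.append((q, 0, t))
    return frozenset(result)

def goto(items, X):
    return closure({(p, d + 1, a) for p, d, a in items
                    if d < len(GRAMMAR[p][1]) and GRAMMAR[p][1][d] == X})

def set_action(action, i, a, entry):
    if action.get((i, a), entry) != entry:
        raise ValueError(f"conflict in state {i} on {a}")
    action[i, a] = entry

def build_tables():
    states = [closure({(0, 0, "EOF")})]  # state 0 is the start state
    action, goto_table = {}, {}
    i = 0
    while i < len(states):               # states grows while it is scanned
        for X in SYMBOLS:
            J = goto(states[i], X)
            if J:
                if J not in states:
                    states.append(J)
                j = states.index(J)
                if X in NONTERMS:
                    goto_table[i, X] = j
                else:
                    set_action(action, i, X, ("shift", j))
        for p, d, a in states[i]:
            if d == len(GRAMMAR[p][1]):  # dot at the end: accept or reduce
                set_action(action, i, a, ("accept",) if p == 0 else ("reduce", p))
        i += 1
    return action, goto_table

def parse(tokens):
    """Return True when tokens form a sentence of the grammar."""
    action, goto_table = build_tables()
    stack, tokens, pos = [0], list(tokens) + ["EOF"], 0
    while True:
        entry = action.get((stack[-1], tokens[pos]))
        if entry is None:                # missing entry = error entry
            return False
        if entry[0] == "accept":
            return True
        if entry[0] == "shift":
            stack.append(entry[1])
            pos += 1
        else:                            # reduce: pop the handle, then GOTO
            lhs, rhs = GRAMMAR[entry[1]]
            del stack[len(stack) - len(rhs):]
            stack.append(goto_table[stack[-1], lhs])

print(parse("id + id * id".split()))
```

Missing dictionary entries stand in for the error entries of both tables, so only shift, reduce, accept, and GOTO moves are stored.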
Parser generators • Programs that analyze grammars to produce efficient winning strategies • ml-yacc uses a LALR(1) table-driven parser • Look-Ahead 1 symbol • Left to right processing of input • Right-most derivation • ml-yacc reads a grammar, produces a table • ml-yacc attaches semantic actions to reduce moves
ml-yacc makes a virtue out of Ambiguity
• Why not: expr -> expr + expr | expr * expr | Number
• Ambiguity ! ! !
[Figure: a parse tree over expr, +, *, and Number, with leaves 2, 17, and 3]
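To make the ambiguity concrete, the sketch below enumerates every parse tree that the grammar expr -> expr + expr | expr * expr | Number allows for one token list and evaluates each tree. The function name `all_values` and the choice of the input 2 + 17 * 3 are illustrative assumptions.

```python
import operator

OPS = {"+": operator.add, "*": operator.mul}

def all_values(tokens):
    """Yield the value of every parse tree of tokens under the
    ambiguous grammar expr -> expr + expr | expr * expr | Number."""
    if len(tokens) == 1:                     # a single Number
        yield tokens[0]
        return
    for k in range(1, len(tokens), 2):       # operators sit at odd positions
        for left in all_values(tokens[:k]):  # operator k is the tree's root
            for right in all_values(tokens[k + 1:]):
                yield OPS[tokens[k]](left, right)

values = list(all_values([2, "+", 17, "*", 3]))
print(values)   # [53, 57]
```

The two results correspond to the trees 2 + (17 * 3) and (2 + 17) * 3, which is why a parser for this grammar needs extra guidance to pick one.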
Factoring - A hard solution
• Fix the grammar
  E : E + T | T ;
  T : T * F | F ;
  F : ( E ) | Number ;
• Problems
  • Grammar is harder to understand
  • Grammar is bigger
[Figure: a parse tree over E, T, F, +, *, and Number, with leaves 3, 2, and 17]
Using Ambiguity
• Ambiguity means the parser can't decide between
  • Shifting a terminal, or reducing a handle to a Non-Terminal (shift-reduce)
  • Reducing a handle to one of two or more Non-Terminals (reduce-reduce)
    • T → rhs and S → rhs are both in the grammar and rhs is the handle.
• This choice means we can't construct a unique parse tree for every string.
• But what if we could direct the parser to always prefer one choice over the other?
• Then
  • The parse tree would always be unique
  • The grammar might even be smaller
Ambiguous Expression Grammar
Grammar 1:
  Start → E
  E → E + E | E * E | ( E ) | id
Grammar 2:
  Start → E
  E → E + T | T
  T → T * F | F
  F → ( E ) | id
• Contrast the two grammars.
• Convince yourself they both accept the same set of strings.
• Which one is ambiguous?
• Which one is simpler?
• Which one is smaller?
An LR parser for the ambiguous EXP grammar
  Start → E
  E → E + E | E * E | ( E ) | id
• The sets of items construction has 11 states
• It has 4 shift-reduce conflicts
State of the stack 1
Stack: . . . Exp * Exp
Input: + 3 EOF
State 8: { [E → E . + E, _ ], [E → E . * E, _ ], [E → E * E ., _ ] }
• Choices
  • Action(8,+) = shift 5
  • Action(8,+) = reduce by 2
• Reducing by 2 means (E * E) has higher precedence than (E + E)
• Generally this is what we want.
State of the stack 2
Stack: . . . Exp * Exp
Input: * 3 EOF
State 8: { [E → E . + E, _ ], [E → E . * E, _ ], [E → E * E ., _ ] }
• Choices
  • Action(8,*) = shift 4
  • Action(8,*) = reduce by 2
• Reducing by 2 means that (E * E) is left associative
• Shifting * means (E * E) is right associative
• The other 2 shift-reduce conflicts are similar, but concern the precedence and associativity of (E + E)
ml-yacc, a Better Solution • ml-yacc allows ambiguous grammars to be disambiguated via declarations of precedence and associativity • For example: • %left ‘+’ • %left ‘*’ • Declares that * has higher precedence than + and that both are left associative • If ambiguity remains the following rules are used • always shift on a shift/reduce conflict • do the first reduction listed in the grammar on a reduce/reduce conflict
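The way precedence and associativity declarations settle a shift-reduce conflict can be sketched as a small decision function. This follows the conventional yacc-style scheme, which ml-yacc is assumed here to share: a rule takes the precedence of its last terminal, that precedence is compared with the lookahead token's, and ties are broken by associativity. The table `PRECEDENCE` and the function `resolve` are hypothetical names.

```python
# Levels mirror "%left '+'" followed by "%left '*'": a later line binds tighter.
PRECEDENCE = {"+": (1, "left"), "*": (2, "left")}

def resolve(rule_rhs, lookahead):
    """Choose 'shift' or 'reduce' for a shift-reduce conflict."""
    rule_ops = [s for s in rule_rhs if s in PRECEDENCE]
    if not rule_ops or lookahead not in PRECEDENCE:
        return "shift"                       # default when nothing is declared
    rule_level, _ = PRECEDENCE[rule_ops[-1]] # the rule's last terminal decides
    token_level, assoc = PRECEDENCE[lookahead]
    if token_level > rule_level:
        return "shift"                       # lookahead binds tighter
    if token_level < rule_level:
        return "reduce"                      # rule binds tighter
    # Same level: associativity breaks the tie.
    return {"left": "reduce", "right": "shift", "nonassoc": "error"}[assoc]

print(resolve(("E", "*", "E"), "+"))   # reduce: * binds tighter than +
print(resolve(("E", "+", "E"), "*"))   # shift:  * binds tighter than +
print(resolve(("E", "*", "E"), "*"))   # reduce: * is left associative
```

These three calls correspond to the conflicts discussed on the two "State of the stack" slides.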
Partial Ml-yacc file
%left PLUS
%left TIMES
%%
Start: E EOF ( E )
E : E PLUS E ( Add(E1,E2) )
  | E TIMES E ( Mult(E1,E2) )
  | LP E RP ( E )
  | id ( Id id )
Much more about ml-yacc next time!
If-then-else example
Goal → Stmt
Stmt → IF Exp THEN Stmt
     | IF Exp THEN Stmt ELSE Stmt
     | ID := Exp
State 9: { [Stmt → IF Exp THEN Stmt . , _ ], [Stmt → IF Exp THEN Stmt . ELSE Stmt, _ ] }
• Action(9,ELSE) = Shift 10
• Action(9,ELSE) = Reduce by 1
State of machine
Stack: . . . IF Exp THEN Stmt
Input: ELSE x := 3 … EOF
State 9: { [Stmt → IF Exp THEN Stmt . , _ ], [Stmt → IF Exp THEN Stmt . ELSE Stmt, _ ] }
• Action(9,ELSE) = Shift 10: ELSE is associated with the closest IF on the stack
• Action(9,ELSE) = Reduce by 1: ELSE is associated with the IF further away
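The effect of preferring shift can be seen in a tiny parser that always consumes an ELSE as soon as one is available. Note the swap of technique: this is a hand-written recursive-descent sketch, not an LR parser, but taking the ELSE greedily mirrors the shift choice. The token spelling, the tuple tree shape, and the name `parse_stmt` are assumptions made for illustration.

```python
def parse_stmt(tokens, pos=0):
    """Parse one statement starting at pos; return (tree, next position)."""
    if tokens[pos] == "IF":
        cond = tokens[pos + 1]
        assert tokens[pos + 2] == "THEN"
        then_branch, pos = parse_stmt(tokens, pos + 3)
        # Greedy choice: take the ELSE right away (the analogue of shifting).
        if pos < len(tokens) and tokens[pos] == "ELSE":
            else_branch, pos = parse_stmt(tokens, pos + 1)
            return ("IfThenElse", cond, then_branch, else_branch), pos
        return ("IfThen", cond, then_branch), pos
    return ("Stmt", tokens[pos]), pos + 1    # any other token: a simple statement

tokens = "IF c1 THEN IF c2 THEN s1 ELSE s2".split()
tree, end = parse_stmt(tokens)
print(tree)
```

The ELSE ends up inside the inner IF, that is, attached to the closest IF, which is the outcome the shift action produces.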
Ml-yacc file
%nonassoc THEN
%nonassoc ELSE
%%
Goal: Stmt EOF ( Stmt )
Stmt: IF EXP THEN Stmt ( IfThen(EXP,Stmt) )
    | IF EXP THEN Stmt ELSE Stmt ( IfThenElse(EXP,Stmt1,Stmt2) )
    | ID ASSIGNOP EXP ( Assign(ID,EXP) )
Some sample ambiguous grammars • These examples can be found in the directory • http://www.cs.pdx.edu/~sheard/course/Cs321/LexYacc/AmbiguousExamples/