Bottom-up parsing

Bottom-up parsing
• Goal of parser: build a derivation
• Top-down parser: build a derivation by working from the start symbol toward the input
  • builds parse tree from root to leaves
  • builds leftmost derivation
• Bottom-up parser: build a derivation by working from the input back toward the start symbol
  • builds parse tree from leaves to root
  • builds reverse rightmost derivation

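Bottom-up parsing
• The parser looks for a substring of the parse tree's frontier
  • that matches the rhs of a production and
  • whose reduction to the non-terminal on the lhs represents one step along the reverse of a rightmost derivation
• Such a substring is called a handle.
• Important: not all substrings that match a rhs are handles.
  • Example: with E → E + T | T, T → T * F | F, F → id, the sentential form E + T * id contains E + T, which matches a rhs, but the handle is id; reducing E + T first gives E * id, which cannot be derived from E.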

Bottom-up parsing techniques
• Shift-reduce parsing
  • Shift input symbols until a handle is found. Then, reduce the substring to the non-terminal on the lhs.
• Operator-precedence parsing
  • Based on shift-reduce parsing.
  • Identifies handles based on precedence rules.

Example: Shift-reduce parsing

Grammar:
1. S → E
2. E → E + E
3. E → E * E
4. E → num
5. E → id

Input to parse: id1 + num * id2

STACK              ACTION
$                  Shift
$ [id1]            Reduce (rule 5)
$ E                Shift
$ E +              Shift
$ E + [num]        Reduce (rule 4)
$ E + E            Shift
$ E + E *          Shift
$ E + E * [id2]    Reduce (rule 5)
$ E + [E * E]      Reduce (rule 3)
$ [E + E]          Reduce (rule 2)
$ [E]              Reduce (rule 1)
$ S                Accept

Handles: shown in [brackets]

Shift-Reduce parsing
• A shift-reduce parser has 4 actions:
  • Shift -- next input symbol is shifted onto the stack
  • Reduce -- handle is at top of stack
    • pop handle
    • push appropriate lhs
  • Accept -- stop parsing & report success
  • Error -- call error reporting/recovery routine
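The four actions can be sketched as a small loop for the example grammar (rules 1 to 5). This is only an illustration, not a general parser: the names shift_reduce, pick_reduction and PREC are hypothetical, the tokens id1 and id2 are both written as the token type "id", and the choice between shifting and reducing is made by an assumed precedence (* above +, both left-associative), which is what the example trace implies.

```python
# Grammar of the shift-reduce example: rule number -> (lhs, rhs).
RULES = {
    1: ("S", ["E"]),
    2: ("E", ["E", "+", "E"]),
    3: ("E", ["E", "*", "E"]),
    4: ("E", ["num"]),
    5: ("E", ["id"]),
}
# Assumption of this sketch: * binds tighter than +, both left-associative.
PREC = {"+": 1, "*": 2}


def pick_reduction(stack, look):
    """Return the rule whose handle is on top of the stack, or None."""
    top = stack[-1]
    if top == "id":
        return 5
    if top == "num":
        return 4
    if stack[-3:] in (["E", "+", "E"], ["E", "*", "E"]):
        # Reduce only if the look-ahead operator does not bind tighter.
        if PREC.get(look, 0) <= PREC[stack[-2]]:
            return 2 if stack[-2] == "+" else 3
        return None
    if stack == ["$", "E"] and look == "$":
        return 1
    return None


def shift_reduce(tokens):
    """Return the list of (stack before the action, action) steps."""
    tokens = list(tokens) + ["$"]
    stack, trace, pos = ["$"], [], 0
    while True:
        look = tokens[pos]
        snapshot = " ".join(stack)
        rule = pick_reduction(stack, look)
        if rule is not None:
            lhs, rhs = RULES[rule]
            del stack[len(stack) - len(rhs):]   # pop handle
            stack.append(lhs)                   # push lhs
            trace.append((snapshot, f"Reduce (rule {rule})"))
        elif stack == ["$", "S"] and look == "$":
            trace.append((snapshot, "Accept"))
            return trace
        elif look != "$":
            stack.append(look)                  # shift next input symbol
            pos += 1
            trace.append((snapshot, "Shift"))
        else:
            trace.append((snapshot, "Error"))
            return trace


if __name__ == "__main__":
    for stack_text, action in shift_reduce(["id", "+", "num", "*", "id"]):
        print(f"{stack_text:<20}{action}")
```

Run on id + num * id, the loop goes through the same sequence of shifts and reductions as the example trace. The precedence check is what makes it shift * instead of reducing E + E.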

Shift-Reduce parsing
• How can we know when we have found a handle?
  • Analyze the grammar beforehand.
  • Build tables
  • Look ahead in the input
• LR(1) parsers recognize precisely those languages in which one symbol of look-ahead is enough to determine whether to reduce or shift.
  • L: for left-to-right parse of the input
  • R: for reverse rightmost derivation
  • 1: for one symbol of look-ahead

How does it work?
• Read input, one token at a time
• Use stack to keep track of current state
  • The state at the top of the stack summarizes the information below.
  • The stack contains information about what has been parsed so far.
• Use parsing table to determine action based on current state and look-ahead symbol.
• How do we build a parsing table?
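The table-driven loop described above can be sketched as follows. It uses the expression grammar from the class examples, numbered here as 1. E → E + T, 2. E → T, 3. T → T * F, 4. T → F, 5. F → id. The ACTION and GOTO tables are an SLR(1) table worked out by hand for this sketch, so they are an assumption and not taken from the notes. The names lr_parse, ACTION, GOTO and PRODUCTIONS are hypothetical.

```python
# Production number -> (lhs, length of rhs).
PRODUCTIONS = {
    1: ("E", 3),  # E -> E + T
    2: ("E", 1),  # E -> T
    3: ("T", 3),  # T -> T * F
    4: ("T", 1),  # T -> F
    5: ("F", 1),  # F -> id
}

# Hand-built SLR(1) table (assumption of this sketch).
# ("s", n) = shift and go to state n, ("r", k) = reduce by production k.
ACTION = {
    0: {"id": ("s", 4)},
    1: {"+": ("s", 5), "$": ("acc", None)},
    2: {"*": ("s", 6), "+": ("r", 2), "$": ("r", 2)},
    3: {"+": ("r", 4), "*": ("r", 4), "$": ("r", 4)},
    4: {"+": ("r", 5), "*": ("r", 5), "$": ("r", 5)},
    5: {"id": ("s", 4)},
    6: {"id": ("s", 4)},
    7: {"*": ("s", 6), "+": ("r", 1), "$": ("r", 1)},
    8: {"+": ("r", 3), "*": ("r", 3), "$": ("r", 3)},
}
GOTO = {
    (0, "E"): 1, (0, "T"): 2, (0, "F"): 3,
    (5, "T"): 7, (5, "F"): 3,
    (6, "F"): 8,
}


def lr_parse(tokens):
    """Return the production numbers in reduction order, or None on error."""
    tokens = list(tokens) + ["$"]
    states = [0]          # stack of states; the top one summarizes the rest
    pos = 0
    reductions = []
    while True:
        entry = ACTION[states[-1]].get(tokens[pos])
        if entry is None:
            return None   # no table entry: syntax error
        kind, arg = entry
        if kind == "s":
            states.append(arg)
            pos += 1
        elif kind == "r":
            lhs, length = PRODUCTIONS[arg]
            del states[len(states) - length:]       # pop one state per rhs symbol
            states.append(GOTO[(states[-1], lhs)])  # follow the goto on the lhs
            reductions.append(arg)
        else:
            return reductions                        # accept


if __name__ == "__main__":
    print(lr_parse(["id", "+", "id", "*", "id"]))
```

Read backwards, the returned list of productions is a rightmost derivation of the input, which is the sense in which the parser builds a reverse rightmost derivation.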

LR parsing techniques
• SLR (not in the book)
  • Simple LR parsing
  • Easy to implement, but not strong enough to handle some grammars
  • Uses LR(0) items
• Canonical LR
  • Larger parser but powerful
  • Uses LR(1) items
• LALR (not in the book)
  • Condensed version of canonical LR
  • May introduce reduce/reduce conflicts that canonical LR does not have
  • Uses LR(1) items

Class examples

E' → E
E → E + T
E → T
T → T * F
T → F
F → id

S' → S
S → L = R
S → R
L → * R
L → id
R → L
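As a sketch of how the LR(0) items used by SLR can be computed, the code below builds the canonical collection of item sets for the second class example (S' → S, S → L = R, ...). An item is represented as a pair (production index, dot position). The names closure, goto, canonical_collection and show are hypothetical helpers for this illustration, not part of any library.

```python
# Second class example; production 0 is the augmented start production.
GRAMMAR = [
    ("S'", ("S",)),
    ("S", ("L", "=", "R")),
    ("S", ("R",)),
    ("L", ("*", "R")),
    ("L", ("id",)),
    ("R", ("L",)),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}


def closure(items):
    """Add (i, 0) for every production i of a non-terminal right after a dot."""
    result = set(items)
    work = list(items)
    while work:
        prod, dot = work.pop()
        rhs = GRAMMAR[prod][1]
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            for i, (lhs, _) in enumerate(GRAMMAR):
                if lhs == rhs[dot] and (i, 0) not in result:
                    result.add((i, 0))
                    work.append((i, 0))
    return frozenset(result)


def goto(items, symbol):
    """Move the dot over `symbol` in every item where that is possible."""
    moved = {(p, d + 1) for p, d in items
             if d < len(GRAMMAR[p][1]) and GRAMMAR[p][1][d] == symbol}
    return closure(moved)


def canonical_collection():
    """Return (list of item sets, transitions {(state, symbol): state})."""
    start = closure({(0, 0)})
    states, index, transitions = [start], {start: 0}, {}
    for state in states:  # the list grows while we iterate over it
        symbols = {GRAMMAR[p][1][d] for p, d in state if d < len(GRAMMAR[p][1])}
        for sym in sorted(symbols):
            target = goto(state, sym)
            if target not in index:
                index[target] = len(states)
                states.append(target)
            transitions[(index[state], sym)] = index[target]
    return states, transitions


def show(item):
    """Format an item such as (1, 1) as 'S -> L . = R'."""
    prod, dot = item
    lhs, rhs = GRAMMAR[prod]
    return f"{lhs} -> " + " ".join(list(rhs[:dot]) + ["."] + list(rhs[dot:]))


if __name__ == "__main__":
    states, transitions = canonical_collection()
    for number, state in enumerate(states):
        print(number, sorted(show(item) for item in state))
```

The state reached from the start state on L contains both S → L . = R and R → L . Since = can follow R in this grammar (S ⇒ L = R ⇒ * R = R), an SLR table would hold both a shift and a reduce on = in that state, which illustrates why SLR is not strong enough for this grammar.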
