Lecture 7 Predictive Parsing

Lecture 7 Predictive Parsing CSCE 531 Compiler Construction • Topics • Review Top Down Parsing • First • Follow • LL (1) Table construction • Readings: 4.4 • Homework: Program 2 February 1, 2006

Overview • Last Time • Ambiguity in classic programming language grammars • Expressions • If-Then-Else • Top-Down Parsing • Modifying Grammars to facilitate Top-down parsing • Today’s Lecture • Regroup halfway to Test 1 • First and Follow • LL(1) property • References: • Homework:

Removing the IF-ELSE Ambiguity Stmt  if Expr then Stmt | if Expr then Stmt else Stmt | other stmts Stmt  MatchedStmt | UnmatchedStmt MatchedStmt  if Expr then MatchedStmt else MatchedStmt | OthersStatements UnmatchedStmt  if Expr then MatchedStmt else | if Expr then MatchedStmt else UmatchedStmt

Recursive Descent Parsers • Recall the parser from Chapter 2 • A recursive descent parser has a routine for each nonterminal. These routines can call each other. If one of these fails then it may backtrack to a point where there is an alternative choice. • In certain cases the grammar is restricted enough where backtracking would never be required. • Such a parser is called a predictive parser. • The parser from Chapter 2 is a predictive parser.

Transition Diagrams for Predictive Parsers • To construct the transition diagram for a predictive parser: • Eliminate left recursion from the grammar • Left factor the grammar • For each nonterminal A do • Create an initial state and final state. • For each production A  X1X2 … Xn create a path from the initial state to the final state labeled X1X2 … Xn • end

Example • E  T E’ • E’  + T E’ | - T E’ | ε • T  F T’ • T’  * F T’ | / F T’ | ε • F  id | num | ( E ) ε E E’ T E’ T E’ + 1 2 3 1 2 3 3 - E’ T T 2 3 F T’ 4 5 6 Etcetera Some of the rest in the text.

Predictive Parsing using Transition Diagrams

Table Driven Predictive Parsing input Predictive Parsing Program output Stack Parsing Table M

Table Driven Predictive Parsing • The stack is initialized to contain $S, the $ is the “bottom” marker. • The input has a $ added to the end. • The parse table, M[X, a] contains what should be done when we see nonterminal X on the stack and current token “a” • Parse Actions for • X = top of stack, and • a = current token • If X = a = $ then halt and announce success. • If X = a != $ then pop X off the stack and advance the input pointer to the next token. • If X is nonterminal consult the table entry M[X, a], details on next slide.

M[X, a] Actions • If X is nonterminal then consult M[X, a]. • The entry will be either a production or an error entry. • If M[X, a] = {X  UVW} the parser • replaces X on the top of the stack • with W, V, U with the U on the top • As output print the name of the production used.

Algorithm 4.3 • Set ip to the first token in w$. • Repeat • Let X be the top of the stack and a be the current token • if X is a terminal or $ then • if X = a then • pop X from the stack and advance the ip • else error() • else /* X is a nonterminal */ • if M[X, a] = X  Y1Y2 …Yk then begin • pop X from the stack • push Yk Yk-1 …Y2Y1 onto the stack withY1 on top • output the productionX  Y1Y2 …Yk • end • else error() • Until X = $

Parse Table for Expression Grammar Figure 4.15+

Parse Trace of (z + q) * x + w * y

First and Follow Functions • We are going to develop two auxilliary functions for facilitating the computing of parse tables. • FIRST(α) is the set of tokens that can start strings derivable from α, • also if α ε then we add ε to First(α). • FOLLOW(N) is the set of tokens that can follow the nonterminal N in some sentential form, i.e., • FOLLOW(N) = { t | S * αNtβ }

Algorithm to Compute First • Input: Grammar symbol X • Output: FIRST(X) • Method • If X is a terminal, then FIRST(X) = {X} • If X  є is a production, then add є to FIRST(X). • For each production X  Y1Y2 … Yk • If Y1Y2 … Yi-1  є then add all tokens in FIRST(Yi) to FIRST(X) • If Y1Y2 … Yk  є then add є to FIRST(X)

Example of First Calculation • FIRST(token) = {token} for tokens: + - * / ( ) id num • FIRST(F) = { id, num, ( } • FIRST(T’) = ? • T’є so … • T’ *FT’ so … • T’ /FT’ so … • FIRST(T’) = {є … } • FIRST(T) = FIRST(F) • FIRST(E’) = ? • FIRST(E) = ? • E  T E’ • E’  + T E’ | - T E’ | є • T  F T’ • T’  * F T’ | / F T’ | є • F  id | num | ( E )

Algorithm to Compute Follow (p 189) • Input: nonterminal A • Output: FOLLOW(A) • Method • Add $ to FOLLOW(S), where • $ is the end_of_input marker • And S is the start state • If A  αBβ is a production, then every token in FIRST(β) is added to FOLLOW(B) (note not є) • If A  αB is a production or • if A  αBβ is a production and β  є then every • token in FOLLOW(A) is added to FOLLOW(B)

Example of FOLLOW Calculation • Add $ to FOLLOW(E) • E TE’ • Add FIRST*(E’) to FOLLOW(T) • E’+ T E’ (similarly E’+T E’) • Add FIRST*(E’) to FOLLOW(T) • E’є, so FOLLOW(E’) is added to FOLLOW(T) • TF T’ • Add FIRST*(T’) to FOLLOW(F) • T’є, so FOLLOW(T’) is added to FOLLOW(F) • F( E ) • Add FIRST( ‘)’ ) to FOLLOW(E) • E  T E’ • E’  + T E’ | - T E’ | є • T  F T’ • T’  * F T’ | / F T’ | є • F  id | num | ( E )

Construction of a Predictive Parse Table • Algorithm 4.4 • Input: Grammar G • Output: Predictive Parsing Table M[N, a] • Method • For each production Aα do • For each a in FIRST(α), add Aα to M[A, a] • If є is in FIRST(α), add Aα to M[A, b] for each token b in FOLLOW(A) If є is in FIRST(α) and $ is in FOLLOW(A) then add Aα to M[A, $] • Mark all other entries of M as “error”

Predictive Parsing Example • FIRST(S) = { i, a } • FIRST(S’) = {є, e } • FIRST(E) = { b } • FOLLOW(S) = { $, e } • FOLLOW(S’) = { $, e} • FOLLOW(E) = { t • Example 4.18 in text table in Figure 4.15 (slide 11) • Example 4.19 • S  iEtSS’ | a • S’  eS | є • E  b

LL(1) Grammars • A grammar is called LL(1) if its parsing table has no multiply defined entries. • LL(1) grammars • Must not be ambiguous. • Must not be left-recursive. • G is LL(1) if and only if whenever A  α | β • FIRST(α) ∩ FIRST(β) = Φ • At most one of α and β can derive є • If β * є then FIRST(α) ∩ FOLLOW(A) = Φ

Error Recovery in Predictive Parsing • Panic Mode Error recovery • If M[A, a] is an error, then throw away input tokens until one in a synchronizing set. • Heuristics for the synchronizing sets for A • Add FOLLOW(A) to the synchronizing set for A • If ‘;’ is a separator or terminator of statements then keywords that can begin statements should not be in synchronizing set for the nonterminal “Expr” because a missing “;” would cause skipping keywords. • …

Parse Table with Synch Entries • Figure 4.18

Trace with Error Recovery • Figure 4.19

Bottom up Parsing • Idea – recognize right hand sides of productions so that we produce a rightmost derivation • “Handle-pruning”

Reductions in a Shift-Reduce Parser • Figure 4.21 • E  E + E | E * E | ( E ) | id

Lecture 7 Predictive Parsing