Compiler Designs and Constructions

Chapter 6: Top-Down Parser
Dr. Mohsen Chitsaz
Objectives:
• Types of Parsers
• Backtracking vs. Non-backtracking
• PDA

Definitions:
• Alphabet: a set of characters (tokens)
• Language: a set of sentences over the alphabet
• Sentence: a string of tokens
• Grammar: a set of rules (productions) that control the order in which words may occur
• Syntax: the set of rules that define well-formed sentences
• Parser: software that checks the syntax
• Parsing: the process of breaking down a sequence of lexemes according to the grammar

Types of Parser
• Universal Parsing Method
  • Example: Cocke-Younger-Kasami algorithm
  • Can parse any grammar
  • Too inefficient for production compilers
• Top-Down Parsing
  • Leftmost derivation
• Bottom-Up Parsing
  • Rightmost derivation (in reverse)

Top-Down Parsing
• Non-backtracking vs. Backtracking
• Large Plant, Green Plant
• Recursive Descent Parsing (Predictive Parser)
  • We execute a set of recursive procedures to process the input
  • There is one procedure associated with each non-terminal of the grammar

Example
• <type> --> <simple>
• <type> --> ^id
• <type> --> array [<simple>] of <type>
• <simple> --> integer
• <simple> --> char
• <simple> --> num .. num

Procedure Match (T: token);
Begin
  If LookaHead == T then
    LookaHead := NextToken   { advance to the next input token }
  Else
    Error();
End;

Procedure Type();
Begin
  If LookaHead in {integer, char, num} then
    Simple()
  Else If LookaHead == '^' then
    Begin
      Match('^');
      Match(id);
    End
  Else If LookaHead == array then
    Begin
      Match(array);
      Match('[');
      Simple();
      Match(']');
      Match(of);
      Type();
    End
  Else
    Error();
End;

Procedure Simple();
Begin
  If LookaHead == integer then
    Match(integer)
  Else If LookaHead == char then
    Match(char)
  Else If LookaHead == num then
    Begin
      Match(num);
      Match('..');
      Match(num);
    End
  Else
    Error();
End;
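The three procedures above can be sketched as a small runnable recursive descent recognizer. This is only an illustration under assumptions: tokens are plain Python strings, and the names `Parser`, `ParseError`, `accepts`, and the `"$end"` marker are hypothetical helpers that do not appear on the slides.

```python
class ParseError(Exception):
    """Raised when the lookahead token does not fit the grammar."""


class Parser:
    def __init__(self, tokens):
        # Append a hypothetical end marker so the lookahead is always defined.
        self.tokens = list(tokens) + ["$end"]
        self.pos = 0

    @property
    def lookahead(self):
        return self.tokens[self.pos]

    def match(self, expected):
        # Mirrors Match: consume the token if it is the expected one.
        if self.lookahead == expected:
            self.pos += 1
        else:
            raise ParseError(f"expected {expected!r}, found {self.lookahead!r}")

    def parse_type(self):
        # One procedure per non-terminal, selected by the lookahead token.
        if self.lookahead in ("integer", "char", "num"):
            self.parse_simple()
        elif self.lookahead == "^":
            self.match("^")
            self.match("id")
        elif self.lookahead == "array":
            self.match("array")
            self.match("[")
            self.parse_simple()
            self.match("]")
            self.match("of")
            self.parse_type()
        else:
            raise ParseError(f"unexpected token {self.lookahead!r}")

    def parse_simple(self):
        if self.lookahead == "integer":
            self.match("integer")
        elif self.lookahead == "char":
            self.match("char")
        elif self.lookahead == "num":
            self.match("num")
            self.match("..")
            self.match("num")
        else:
            raise ParseError(f"unexpected token {self.lookahead!r}")


def accepts(tokens):
    """Return True if the whole token list is a valid <type>."""
    parser = Parser(tokens)
    try:
        parser.parse_type()
        parser.match("$end")
        return True
    except ParseError:
        return False
```

For example, `accepts(["^", "id"])` follows the second branch of `parse_type`, while a token list that stops after `array [ integer` fails inside `match("]")`.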


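Example
• Which production should we use when the input token is: integer
• Candidates:
  • <type> --> ^id
  • <type> --> <simple>
  • <simple> --> integer
  • <simple> --> char
• Because integer can begin only <simple>, we use <type> --> <simple> and then <simple> --> integer.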

Example
• First(<simple>) = {integer, char, num}
• First(^id) = {^}
• First(array [<simple>] of <type>) = {array}
• First(<type>) = {^, array, integer, char, num}

Definition:
• If A --> ab, then First(A) = {a}
• If A --> α and A --> β, then First(α) and First(β) must be disjoint

Push Down Automata (Non-recursive Predictive Parsing)

• PDA = {Q, Σ, H, δ, q0, Z, F}
• Q = finite set of states
• Σ = input alphabet
• H = set of stack symbols
• δ = transition function δ(q, a, z)
• q0 = starting state
• Z = starting stack
• F = set of final states

Operations of δ
• State operations: change(), stay()
• Stack operations: push(), pop(), replace(), none()
• Input operations: advance(), retain()

Example:
• Σ = { (, ), -| }
• H = {A, $}
• q0 = S
• Z = $
(State diagram with the labels INPUT and START is not reproduced here.)

If the input is (())-|
Stack    Input
$        (())-|
$A       ())-|
$AA      ))-|
$A       )-|
$        -|
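The trace above can be reproduced with a few lines of code. The transitions used here are an assumption inferred from the trace, since the state diagram is not available: push A on "(", pop A on ")", and accept when the end marker -| meets the bottom symbol $. The function name `run_pda` is hypothetical.

```python
END = "-|"     # end-of-input marker from the example alphabet
BOTTOM = "$"   # starting stack symbol Z


def run_pda(tokens):
    """Return (accepted, trace); trace holds (stack, remaining input) pairs."""
    stack = [BOTTOM]
    trace = []
    pos = 0
    while True:
        trace.append(("".join(stack), "".join(tokens[pos:])))
        if pos >= len(tokens):
            return False, trace          # input ended without the end marker
        symbol, top = tokens[pos], stack[-1]
        if symbol == "(":
            stack.append("A")            # push(A), advance
            pos += 1
        elif symbol == ")" and top == "A":
            stack.pop()                  # pop, advance
            pos += 1
        elif symbol == END and top == BOTTOM:
            return True, trace           # accept
        else:
            return False, trace          # no transition defined: reject


accepted, trace = run_pda(list("(())") + [END])
for stack_text, input_text in trace:
    print(f"{stack_text:<6}{input_text}")
```

Unbalanced inputs such as `(()` or `())` followed by the end marker are rejected because no transition applies.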

Derivation Tree
• S --> (S) | ε

LL(1) Grammar (push-down machine)
• L: scan the input from the left
• L: leftmost derivation
• 1: one token of lookahead

Definition:
• A production is called NULL if its RHS is the empty string ε: A --> ε
• A production is called NULLABLE if its RHS can derive ε. For example, with
  • B --> A
  • A --> ε
  the production B --> A is nullable.

Example of LL(1) Grammar:
1- S --> AbB-|     First(AbB-|) = {a, e, g, b}   (A is nullable, so b is included)
2- S --> d         First(d) = {d}
3- A --> CAb       First(CAb) = {a, e}
4- A --> B         First(B) = {g, ε}
5- B --> gSd       First(gSd) = {g}
6- B --> ε         First(ε) = {ε}
7- C --> a         First(a) = {a}
8- C --> ed        First(ed) = {e}

First(α) = { a | α =>* aβ }
• The set of terminal symbols that occur at the beginning of strings derived from α

Follow(α) = the set of input symbols that can immediately follow α
• Follow(A) = {b}
• Follow(B) = {-|, b, d}
• S =>(1) AbB =>(4) BbB   (B is followed by b)
• S =>(1) AbB =>(5) AbgSd =>(1) AbgAbBd   (B is followed by d)

SELECT (PREDICT):
• SELECT(A --> α) = First(α) if α is not nullable; otherwise (First(α) − {ε}) ∪ Follow(A)
• SELECT(1) = First(AbB-|) = {a, e, g, b}   (the leading A is nullable, so the b after it is included)
• SELECT(2) = {d}
• SELECT(3) = {a, e}
• SELECT(4) = First(4) ∪ Follow(A) = {g, b}
• SELECT(5) = {g}
• SELECT(6) = First(6) ∪ Follow(B) = {-|, b, d}
• SELECT(7) = {a}
• SELECT(8) = {e}

Is this grammar LL(1)?
• S: SELECT(1) ∩ SELECT(2) = {a, e, g, b} ∩ {d} = Ø
• A: SELECT(3) ∩ SELECT(4) = {a, e} ∩ {g, b} = Ø
• B: SELECT(5) ∩ SELECT(6) = {g} ∩ {-|, b, d} = Ø
• C: SELECT(7) ∩ SELECT(8) = {a} ∩ {e} = Ø
• Definition: a grammar is LL(1) if and only if productions with the same LHS have disjoint prediction (SELECT) sets.
• All four intersections are empty, so this grammar is LL(1).
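The First, Follow, and SELECT sets for the eight productions can be checked mechanically. The sketch below is a standard fixed-point computation, not necessarily the method used in the course. One modelling assumption is made: the end marker -| is treated as a symbol that follows the start symbol S rather than as a literal symbol inside production 1, which is how the example derivations are written (without -|). All function names are hypothetical.

```python
EPS = "ε"
END = "-|"

# Productions numbered as in the example; [] is a NULL right-hand side.
PRODUCTIONS = {
    1: ("S", ["A", "b", "B"]),
    2: ("S", ["d"]),
    3: ("A", ["C", "A", "b"]),
    4: ("A", ["B"]),
    5: ("B", ["g", "S", "d"]),
    6: ("B", []),
    7: ("C", ["a"]),
    8: ("C", ["e", "d"]),
}
NONTERMINALS = {lhs for lhs, _ in PRODUCTIONS.values()}


def first_of(symbols, first):
    """First set of a symbol string, given the First sets found so far."""
    result = set()
    for sym in symbols:
        sym_first = first[sym] if sym in NONTERMINALS else {sym}
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)  # every symbol was nullable, or the string is empty
    return result


def compute_first():
    first = {nt: set() for nt in NONTERMINALS}
    changed = True
    while changed:  # repeat until no set grows
        changed = False
        for lhs, rhs in PRODUCTIONS.values():
            new = first_of(rhs, first)
            if not new <= first[lhs]:
                first[lhs] |= new
                changed = True
    return first


def compute_follow(first, start="S"):
    follow = {nt: set() for nt in NONTERMINALS}
    follow[start].add(END)  # assumption: -| follows the start symbol
    changed = True
    while changed:
        changed = False
        for lhs, rhs in PRODUCTIONS.values():
            for i, sym in enumerate(rhs):
                if sym not in NONTERMINALS:
                    continue
                tail = first_of(rhs[i + 1:], first)
                new = tail - {EPS}
                if EPS in tail:  # nothing but nullable symbols after sym
                    new |= follow[lhs]
                if not new <= follow[sym]:
                    follow[sym] |= new
                    changed = True
    return follow


def compute_select(first, follow):
    select = {}
    for number, (lhs, rhs) in PRODUCTIONS.items():
        rhs_first = first_of(rhs, first)
        select[number] = rhs_first - {EPS}
        if EPS in rhs_first:  # nullable RHS: add Follow of the LHS
            select[number] |= follow[lhs]
    return select


def is_ll1(select):
    """True if productions with the same LHS have disjoint SELECT sets."""
    seen = {}
    for number, (lhs, _) in PRODUCTIONS.items():
        if seen.setdefault(lhs, set()) & select[number]:
            return False
        seen[lhs] |= select[number]
    return True


first = compute_first()
follow = compute_follow(first)
select = compute_select(first, follow)
```

Under the stated assumption, `follow["B"]` comes out as the three symbols -|, b, d and `is_ll1(select)` is true.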

Creating the Table:
• Stack symbols = rows
• Input symbols = columns
• A --> bα: for row A, column b = REPLACE(α reversed), ADVANCE
• A --> α: for row A, columns in the selection set = REPLACE(α reversed), RETAIN
• A --> b: for row A, column b = POP, ADVANCE
• A --> ε: for row A, columns in the selection set = POP, RETAIN
• Row b, column b = POP, ADVANCE
• Row Δ, column -| = ACCEPT
• All other entries are ERROR

Parse Table
The numbered table entries correspond to the productions:
1- Replace(Δ B b A), Retain
2- Pop, Advance
3- Replace(b A C), Retain
4- Replace(B), Retain
5- Replace(d S), Advance
6- Pop, Retain
7- Pop, Advance
8- Replace(d), Advance

Input: b g d d -|
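A table-driven parse of this input can be sketched as follows. The table is filled from the SELECT sets and the eight numbered actions given earlier; the starting stack (just S, with Δ pushed by action 1) and the names `TABLE`, `add`, and `parse` are assumptions made for illustration.

```python
END = "-|"
BOTTOM = "Δ"

# (stack symbol, input symbol) -> (symbols to push, advance?)
# Push lists are bottom-to-top, i.e. the reversed right-hand side.
TABLE = {}


def add(row, columns, push, advance):
    for col in columns:
        TABLE[(row, col)] = (push, advance)


add("S", ["a", "e", "g", "b"], [BOTTOM, "B", "b", "A"], False)  # 1
add("S", ["d"], [], True)                                       # 2
add("A", ["a", "e"], ["b", "A", "C"], False)                    # 3
add("A", ["g", "b"], ["B"], False)                              # 4
add("B", ["g"], ["d", "S"], True)                               # 5
add("B", [END, "b", "d"], [], False)                            # 6
add("C", ["a"], [], True)                                       # 7
add("C", ["e"], ["d"], True)                                    # 8
add("b", ["b"], [], True)   # terminal on the stack matches the input
add("d", ["d"], [], True)


def parse(tokens):
    """Return (accepted, steps); steps lists the (row, column) pairs used."""
    stack = ["S"]
    pos = 0
    steps = []
    while stack and pos < len(tokens):
        top, look = stack[-1], tokens[pos]
        if top == BOTTOM and look == END:
            steps.append("accept")
            return True, steps
        action = TABLE.get((top, look))
        if action is None:
            return False, steps      # ERROR entry
        push, advance = action
        stack.pop()                  # POP, or the first half of REPLACE
        stack.extend(push)           # an empty push list is a plain POP
        if advance:
            pos += 1
        steps.append((top, look))
    return False, steps


ok, steps = parse("b g d d -|".split())
```

For the input above, the steps use rows S, A, B, b, B, S, d in that order before Δ meets -| and the parser accepts.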

Parse Tree

Recovery in a Top-Down Parser
• An advantage of top-down parsing (TDP) is efficient error recovery.
• Error: the token on top of the stack is not the same as the lookahead token.
• Recovery:
  • Pop the stack until the top is a synchronization token (ST). An ST is a token that can end a block of code.
  • Read input symbols until the current lookahead symbol matches the symbol on top of the stack, or the end of input is reached.
  • If the end of input has not been reached, the parser has recovered.

LL(1) Grammars & Parser:
• Facts:
  • The grammar is context-free (CFG)
  • The parser produces a leftmost derivation
  • The grammar is unambiguous
  • Parsing takes O(n) time
  • Suitable for automatically generated parsers

Making Grammar: LL(1)
1- Remove Common Prefixes (Left Factor)
• Before: S --> aBCd, S --> aB
• After: S --> aBD, D --> ε, D --> Cd
• Before:
  • <s> --> if <exp> then <action> endif;
  • <s> --> if <exp> then <action> else <action> endif;
• After:
  • <s> --> if <exp> then <action> <rest>
  • <rest> --> endif;
  • <rest> --> else <action> endif;
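Left factoring as shown above can be sketched in code. This is a minimal one-pass version, assuming each alternative is a list of symbols; the function names and the generated non-terminal name (`S_rest1` rather than D or <rest>) are hypothetical, and the new non-terminal may itself need another factoring pass.

```python
def common_prefix(sequences):
    """Longest prefix shared by all the given symbol lists."""
    prefix = []
    for column in zip(*sequences):
        if len(set(column)) != 1:
            break
        prefix.append(column[0])
    return prefix


def left_factor(nonterminal, alternatives):
    """Factor alternatives of one non-terminal that share a leading symbol."""
    groups = {}
    for alt in alternatives:
        head = alt[0] if alt else None   # None marks an empty alternative
        groups.setdefault(head, []).append(alt)

    result = {nonterminal: []}
    counter = 0
    for head, group in groups.items():
        if head is None or len(group) == 1:
            result[nonterminal].extend(group)   # nothing to factor
            continue
        prefix = common_prefix(group)
        counter += 1
        new_name = f"{nonterminal}_rest{counter}"  # hypothetical naming scheme
        result[nonterminal].append(prefix + [new_name])
        # Remainders after the prefix; [] stands for the ε alternative.
        result[new_name] = [alt[len(prefix):] for alt in group]
    return result


factored = left_factor("S", [["a", "B", "C", "d"], ["a", "B"]])
```

Here `factored` maps S to the single alternative a B S_rest1, and S_rest1 to the alternatives C d and ε, which matches the first example up to the name of the new non-terminal.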

2- Remove Left Recursion
• Example: E --> E + T, E --> T, T --> id
• General form: A --> Aα, A --> β
• Rewritten: A --> A1 A2, A1 --> β, A2 --> α A2, A2 --> ε
• Example rewritten: E --> E1 E2, E1 --> T, E2 --> + T E2, E2 --> ε, T --> id
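The rewriting of A --> Aα | β into A --> A1 A2, A1 --> β, A2 --> α A2 | ε can be sketched as a short function. It handles immediate left recursion only, and the function name and the convention of appending "1" and "2" to the non-terminal name are assumptions for illustration.

```python
def remove_immediate_left_recursion(nonterminal, alternatives):
    """Rewrite A --> A alpha | beta as A --> A1 A2, A1 --> beta, A2 --> alpha A2 | epsilon.

    alternatives is a list of right-hand sides, each a list of symbols.
    Returns a dict from non-terminal names to their new alternatives.
    """
    # alpha parts: what follows the leading A in the left-recursive alternatives
    recursive = [alt[1:] for alt in alternatives if alt and alt[0] == nonterminal]
    # beta parts: the alternatives that do not start with A
    others = [alt for alt in alternatives if not alt or alt[0] != nonterminal]
    if not recursive:
        return {nonterminal: alternatives}   # nothing to do

    head = nonterminal + "1"   # hypothetical names for the new non-terminals
    tail = nonterminal + "2"
    return {
        nonterminal: [[head, tail]],
        head: others,
        tail: [alpha + [tail] for alpha in recursive] + [[]],  # [] is ε
    }


rewritten = remove_immediate_left_recursion("E", [["E", "+", "T"], ["T"]])
```

For the expression example, `rewritten` gives E --> E1 E2, E1 --> T, and E2 --> + T E2 | ε.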

3- Remove Unreachable Productions
• Before: S --> a, B --> S
• After: S --> a   (B cannot be reached from S, so B --> S is dropped)

4- Corner Substitution
• Before: S --> a, S --> AS, A --> ab, A --> cA
• After substituting for the leading A in S --> AS and left factoring: S --> aB, B --> ε, B --> bS, S --> cAS, A --> ab, A --> cA
• In general: S --> AB, B --> α1, B --> α2, B --> α3 becomes S --> Aα1, S --> Aα2, S --> Aα3

5- Singleton Substitution
• S --> B
• S --> S', S' --> B
• Example:
  • <S> --> <LABEL> <UNLABLE>
  • <LABEL> --> id:
  • <LABEL> --> ε
  • <UNLABLE> --> id := <EXP>;

Singleton Substitution
• Substitute for <LABEL>:
  • <S> --> id: <UNLABLE>
  • <S> --> <UNLABLE>
  • <UNLABLE> --> id := <EXP>;
• Substitute for <UNLABLE> in the second production:
  • <S> --> id: <UNLABLE>
  • <S> --> id := <EXP>;
  • <UNLABLE> --> id := <EXP>;
• Left factor on id:
  • <S> --> id <id-rest>
  • <id-rest> --> : <UNLABLE>
  • <id-rest> --> := <EXP>;
  • <UNLABLE> --> id := <EXP>;

Some Grammars Cannot Be LL(1)
• If-Then-Else (the dangling else):
  • if <exp> then <action>
  • if <exp> then <action> else <action>
• The same problem in abstract form:
  • S --> aBcS
  • S --> aBcSeS
  • B --> d
  • S --> b

Q-Grammar:
• A --> a α1
• A --> b α1
• A --> ε
S-Grammar:
• A --> a α1
• A --> b α2
Q-grammars and S-grammars are LL(1), so a push-down machine can be built for them.
