Today’s Agenda • Compilation > Syntax Analysis > Parsing • Top-Down Parsing • Recursive Descent • Backtracking • Predictive Parsing
Parsing • Requirement Specification: • Input – Program as a token stream • Output – Parse Tree (or Abstract Syntax Tree) • Side-Effect – Symbol Table (all ids entered with scope) • Error – Syntax Error: the program does not conform to the grammar • E.g. invalid assignment statement: • x := ; • E.g. function body not closed: • … fun() ret integer { return 0; endmodule
Top-down Parsing • Strategy: • Find a leftmost derivation for an input string • Simplest Technique: • Recursive Descent algorithm • Consider the following grammar:
S --> if E then S else S
S --> begin S L
S --> print E
L --> end
L --> ; S L
E --> num = num
Recursive Descent Parsing
Token tok;                                  /* current lookahead token */
void S(void); void L(void); void E(void);   /* one function per nonterminal */
void advance(void) { tok = getToken(); }
void eat(Token t) { if (tok == t) advance(); else error(); }
void S(void) {
    switch (tok) {
    case IF:    eat(IF); E(); eat(THEN); S(); eat(ELSE); S(); break;
    case BEGIN: eat(BEGIN); S(); L(); break;
    case PRINT: eat(PRINT); E(); break;
    default:    error();
    }
}
Recursive Descent Parsing
void L(void) {
    switch (tok) {
    case END:  eat(END); break;
    case SEMI: eat(SEMI); S(); L(); break;
    default:   error();
    }
}
void E(void) { eat(NUM); eat(EQ); eat(NUM); }
Question: Where is the PDA?
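To try the parser out, the functions above need a token source and an error routine. The following is a minimal sketch under assumptions that are not in the slides: tokens come from a plain array terminated by a hypothetical END_OF_INPUT marker (standing in for getToken()), error() only sets a flag, and parse() is a made-up driver name.

```c
#include <assert.h>
#include <stddef.h>

/* Token kinds of the grammar; END_OF_INPUT is an assumed end marker. */
typedef enum { IF, THEN, ELSE, BEGIN, END, PRINT, SEMI, NUM, EQ,
               END_OF_INPUT } Token;

static const Token *input;  /* hypothetical token source: a plain array */
static size_t pos;          /* index of the current token */
static Token tok;           /* current lookahead token */
static int failed;          /* set by error() instead of aborting */

void S(void); void L(void); void E(void);

void advance(void) { if (tok != END_OF_INPUT) tok = input[++pos]; }
void error(void)   { failed = 1; }
void eat(Token t)  { if (tok == t) advance(); else error(); }

void S(void) {
    switch (tok) {
    case IF:    eat(IF); E(); eat(THEN); S(); eat(ELSE); S(); break;
    case BEGIN: eat(BEGIN); S(); L(); break;
    case PRINT: eat(PRINT); E(); break;
    default:    error();
    }
}

void L(void) {
    switch (tok) {
    case END:  eat(END); break;
    case SEMI: eat(SEMI); S(); L(); break;
    default:   error();
    }
}

void E(void) { eat(NUM); eat(EQ); eat(NUM); }

/* Returns 1 if the array (terminated by END_OF_INPUT) is a valid S.
   Every recursive call follows a successful eat(), so parsing always
   terminates even after an error has been recorded. */
int parse(const Token *tokens) {
    assert(tokens != NULL);
    input = tokens; pos = 0; failed = 0;
    tok = input[0];
    S();
    return !failed && tok == END_OF_INPUT;
}
```

Calling parse() on the tokens of `begin print num = num ; print num = num end` returns 1; dropping the second num of the first statement makes it return 0. Note that the only stack in sight is the C call stack.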
Recursive Descent Parsing - Conflicts • Consider a different example • Grammar: S --> cAd    A --> ab | a • Input string: cad
void S(void) {
    switch (tok) {
    case c: eat(c); A(); eat(d); break;
    default: error();
    }
}
void A(void) {
    switch (tok) {
    case a: /* Which rule to apply? */ break;
    default: error();
    }
}
Recursive Descent Parsing - Conflicts • Use Look-aheads • How many? • Backtracking • How do you keep track of the rules (yet to be tried)? • Rewriting the grammar • Use a bottom-up strategy
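As an illustration of the backtracking option, here is a small sketch for the grammar S --> cAd, A --> ab | a. It assumes the input is a C string of single-character tokens; term(), A_alt(), and parse() are hypothetical helper names. The rules yet to be tried are tracked by a loop index, and the saved input position is restored before each new attempt.

```c
#include <assert.h>
#include <stddef.h>

static const char *in;  /* input string; the terminating '\0' is the end marker */
static size_t pos;      /* index of the next unread character */

/* Match one terminal and consume it on success. */
int term(char c) {
    if (in[pos] == c) { pos++; return 1; }
    return 0;
}

/* Try alternative number `alt` of A: 0 is A --> ab, 1 is A --> a. */
int A_alt(int alt) {
    switch (alt) {
    case 0:  return term('a') && term('b');
    case 1:  return term('a');
    default: return 0;
    }
}

/* S --> cAd, trying each alternative of A in turn. */
int S(void) {
    size_t start = pos;
    if (!term('c')) return 0;
    size_t save = pos;
    for (int alt = 0; alt < 2; alt++) {
        pos = save;                  /* backtrack: undo the failed attempt */
        if (A_alt(alt) && term('d')) return 1;
    }
    pos = start;
    return 0;
}

/* Returns 1 if the whole string s derives from S. */
int parse(const char *s) {
    assert(s != NULL);
    in = s; pos = 0;
    return S() && in[pos] == '\0';
}
```

On the input cad, the first alternative (ab) fails at b, the position is reset, and the second alternative (a) succeeds. The cost is that input may be rescanned, which is what predictive parsing avoids.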
Recursive Descent Parsing - Conflicts • Example inputs: (1-2)+3, (1-2) • Consider the following grammar:
E --> E + T | E - T | T
T --> T * F | T / F | F
F --> id | num | ( E )
and the (attempted) r.d. parser:
void E(void) {
    switch (tok) {
    case ? : E(); eat(PLUS); T(); break;
    case ? : E(); eat(MINUS); T(); break;
    case ? : T(); break;
    …
    }
}
First terminal symbols (of r.h.s.) of rules are not distinct!
Predictive Parsing – Predicting-terminals • Given a string of symbols β, define FIRST(β) • as the set of all terminals that can begin a string derived from β. • E.g. (for the grammar in the previous slide): • FIRST(T*F) = { id, num, ( } • Given two grammar rules X --> β1 and X --> β2 • if β1 and β2 have non-disjoint FIRST sets • then the grammar cannot be used for predictive parsing (without additional look-ahead)
Predictive Parsing – FIRST sets • Consider the grammar: • Z --> d | X Y Z • Y --> ε | c • X --> Y | a • FIRST(XYZ)? • Not just FIRST(X) • X derives Y, so it must include FIRST(Y) • X and Y can both derive ε, so it must also include FIRST(Z) • Must identify nullable symbols and what follows them
Predictive Parsing – FIRST, FOLLOW • Define nullable(X) • as true if X can derive the empty string • Define FIRST(β) • as the set of terminals that can begin strings derived from β • Define FOLLOW(X) • as the set of terminals that can immediately follow X. • i.e. t is in FOLLOW(X) if there is a derivation from start symbol for δXtη • this may happen if the derivation contains XYZt where Y and Z can both derive ε.
Predictive Parsing – Computing FIRST and FOLLOW • Inductive definition: FIRST, FOLLOW, and nullable are the smallest sets for which these properties hold:
for each terminal t, FIRST[t] = { t }
for each production rule X --> Y1Y2…Yk
    if all Yi are nullable then nullable[X] = true
    for each i from 1 to k, each j from i+1 to k
        if Y1, …, Yi-1 are nullable then FIRST[X] = FIRST[X] U FIRST[Yi]
        if Yi+1, …, Yk are nullable then FOLLOW[Yi] = FOLLOW[X] U FOLLOW[Yi]
        if Yi+1, …, Yj-1 are nullable then FOLLOW[Yi] = FOLLOW[Yi] U FIRST[Yj]
Predictive Parsing – Computing FIRST and FOLLOW
1. (a) Initialize FIRST[X] = {}, FOLLOW[X] = {}, and nullable[X] = false for each nonterminal X
   (b) Initialize FIRST[t] = {t} for each terminal t
2. repeat
       for each production rule X --> Y1Y2…Yk
           if all Yi are nullable (or k = 0) then nullable[X] = true
   until no more changes in nullable
3. repeat
       for each production rule X --> Y1Y2…Yk
           for each i from 1 to k, each j from i+1 to k {
               if Y1, …, Yi-1 are nullable then FIRST[X] = FIRST[X] U FIRST[Yi]
               if Yi+1, …, Yk are nullable then FOLLOW[Yi] = FOLLOW[X] U FOLLOW[Yi]
               if Yi+1, …, Yj-1 are nullable then FOLLOW[Yi] = FOLLOW[Yi] U FIRST[Yj]
           }
   until no more changes in FIRST or FOLLOW sets
Predictive Parsing – Computing FIRST • Z --> d | X Y Z • Y --> ε | c • X --> Y | a
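The fixed-point computation can be sketched in C for this grammar. The encoding is an assumption made for illustration: symbols are small integers (T_A, T_C, T_D for a, c, d and N_X, N_Y, N_Z for X, Y, Z), sets of terminals are bitmasks, and nullable is iterated in the same loop as FIRST and FOLLOW, which reaches the same fixed point. All names are hypothetical.

```c
#include <assert.h>

enum { T_A, T_C, T_D, N_X, N_Y, N_Z, NSYM };  /* terminals, then nonterminals */
enum { NTERM = 3, MAXRHS = 3, NRULES = 6 };

typedef struct { int lhs, len, rhs[MAXRHS]; } Rule;

static const Rule rules[NRULES] = {
    { N_Z, 1, { T_D } },            /* Z --> d     */
    { N_Z, 3, { N_X, N_Y, N_Z } },  /* Z --> X Y Z */
    { N_Y, 0, { 0 } },              /* Y --> ε     */
    { N_Y, 1, { T_C } },            /* Y --> c     */
    { N_X, 1, { N_Y } },            /* X --> Y     */
    { N_X, 1, { T_A } },            /* X --> a     */
};

int nullable[NSYM];
unsigned first[NSYM], follow[NSYM];  /* bit t set <=> terminal t is in the set */

/* True if rhs[from..to-1] holds only nullable symbols (true if empty). */
static int all_nullable(const Rule *r, int from, int to) {
    assert(0 <= from && to <= r->len);
    for (int i = from; i < to; i++)
        if (!nullable[r->rhs[i]]) return 0;
    return 1;
}

/* OR bits into *set and report whether the set grew. */
static int add(unsigned *set, unsigned bits) {
    unsigned old = *set;
    *set |= bits;
    return *set != old;
}

void compute_sets(void) {
    for (int s = 0; s < NSYM; s++) { nullable[s] = 0; first[s] = follow[s] = 0; }
    for (int t = 0; t < NTERM; t++) first[t] = 1u << t;
    int changed;
    do {
        changed = 0;
        for (int n = 0; n < NRULES; n++) {
            const Rule *r = &rules[n];
            if (!nullable[r->lhs] && all_nullable(r, 0, r->len)) {
                nullable[r->lhs] = 1;
                changed = 1;
            }
            for (int i = 0; i < r->len; i++) {
                int yi = r->rhs[i];
                if (all_nullable(r, 0, i))
                    changed |= add(&first[r->lhs], first[yi]);
                if (all_nullable(r, i + 1, r->len))
                    changed |= add(&follow[yi], follow[r->lhs]);
                for (int j = i + 1; j < r->len; j++)
                    if (all_nullable(r, i + 1, j))
                        changed |= add(&follow[yi], first[r->rhs[j]]);
            }
        }
    } while (changed);  /* stop when nullable, FIRST, and FOLLOW are stable */
}
```

After compute_sets(), X and Y are nullable and Z is not; FIRST[X] = {a, c}, FIRST[Y] = {c}, FIRST[Z] = {a, c, d}; FOLLOW[X] = FOLLOW[Y] = {a, c, d}, and FOLLOW[Z] is empty because this sketch uses no end-of-input marker.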
Predictive Parsing - FIRST • Extend definitions of FIRST and nullable to strings of symbols: • FIRST(Xβ) = FIRST[X] if not nullable[X] • FIRST(Xβ) = FIRST[X] U FIRST(β) if nullable[X] • nullable(β) if each symbol in β is nullable.
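The extension to strings amounts to a left-to-right scan that stops at the first non-nullable symbol. A minimal sketch, assuming the per-symbol sets of the grammar Z --> d | X Y Z, Y --> ε | c, X --> Y | a have already been worked out and are hard-coded here as bitmasks; the names and the encoding are illustrative only.

```c
#include <assert.h>
#include <stddef.h>

enum { T_A, T_C, T_D, N_X, N_Y, N_Z, NSYM };  /* a, c, d, X, Y, Z */

/* Assumed per-symbol results for the example grammar (bit t = terminal t). */
static const int nullable[NSYM] = { 0, 0, 0, 1, 1, 0 };
static const unsigned first[NSYM] = {
    1u << T_A, 1u << T_C, 1u << T_D,   /* FIRST[t] = { t }        */
    (1u << T_A) | (1u << T_C),         /* FIRST[X] = { a, c }     */
    1u << T_C,                         /* FIRST[Y] = { c }        */
    7u,                                /* FIRST[Z] = { a, c, d }  */
};

/* FIRST(beta) for a symbol string beta of length n. */
unsigned first_of_string(const int *beta, size_t n) {
    assert(beta != NULL || n == 0);
    unsigned set = 0;
    for (size_t i = 0; i < n; i++) {
        set |= first[beta[i]];
        if (!nullable[beta[i]]) break;  /* later symbols cannot come first */
    }
    return set;
}

/* nullable(beta): true if every symbol in beta is nullable. */
int nullable_string(const int *beta, size_t n) {
    assert(beta != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (!nullable[beta[i]]) return 0;
    return 1;
}
```

For β = X Y Z the scan passes over the nullable X and Y and stops at Z, giving FIRST(XYZ) = {a, c, d}; the empty string has an empty FIRST set and is nullable.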
Predictive Parsing – Table Construction • Recursive Descent parser • one function for each nonterminal X • each X-production has a clause • choose one of the clauses based on next token T • Reduces to interpreting (lookup and execute) a 2-d table: • one entry for (X, T) • Predictive Parsing table • for each T in FIRST(β), enter rule X --> β in row X, column T • if nullable(β), also enter rule X --> β in row X, column T for each T in FOLLOW[X]
Predictive Parsing – Table Construction • Z --> d | X Y Z • Y-->ε | c • X --> Y | a
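The table for this grammar can be filled in mechanically. The sketch below assumes the nullable, FIRST, and FOLLOW results for the grammar are already known and hard-codes them; each table cell is a bitmask of rule numbers so that multiply-defined entries stay visible. Rule numbering and all identifiers are choices made for this example.

```c
#include <assert.h>

enum { T_A, T_C, T_D, N_X, N_Y, N_Z, NSYM };  /* a, c, d, X, Y, Z */
enum { NTERM = 3, NNONTERM = 3, MAXRHS = 3, NRULES = 6 };

typedef struct { int lhs, len, rhs[MAXRHS]; } Rule;

static const Rule rules[NRULES] = {
    /* 0 */ { N_Z, 1, { T_D } },            /* Z --> d     */
    /* 1 */ { N_Z, 3, { N_X, N_Y, N_Z } },  /* Z --> X Y Z */
    /* 2 */ { N_Y, 0, { 0 } },              /* Y --> ε     */
    /* 3 */ { N_Y, 1, { T_C } },            /* Y --> c     */
    /* 4 */ { N_X, 1, { N_Y } },            /* X --> Y     */
    /* 5 */ { N_X, 1, { T_A } },            /* X --> a     */
};

/* Assumed results of the FIRST/FOLLOW computation (bit t = terminal t). */
static const int nullable[NSYM] = { 0, 0, 0, 1, 1, 0 };
static const unsigned first[NSYM]  = { 1u, 2u, 4u, 3u, 2u, 7u };
static const unsigned follow[NSYM] = { 0u, 0u, 0u, 7u, 7u, 0u };

/* table[row][T]: bit n set <=> rule n is entered; rows are X, Y, Z. */
unsigned table[NNONTERM][NTERM];

void build_table(void) {
    for (int x = 0; x < NNONTERM; x++)
        for (int t = 0; t < NTERM; t++) table[x][t] = 0;
    for (int n = 0; n < NRULES; n++) {
        const Rule *r = &rules[n];
        assert(r->lhs >= N_X);            /* left-hand side is a nonterminal */
        unsigned predict = 0;
        int eps = 1;                      /* is the prefix scanned so far nullable? */
        for (int i = 0; i < r->len && eps; i++) {
            predict |= first[r->rhs[i]];  /* builds FIRST(beta) */
            eps = nullable[r->rhs[i]];
        }
        if (eps) predict |= follow[r->lhs];   /* nullable(beta): add FOLLOW[X] */
        for (int t = 0; t < NTERM; t++)
            if (predict & (1u << t)) table[r->lhs - N_X][t] |= 1u << n;
    }
}

/* Returns 1 if no cell holds more than one rule. */
int is_ll1(void) {
    for (int x = 0; x < NNONTERM; x++)
        for (int t = 0; t < NTERM; t++)
            if (table[x][t] & (table[x][t] - 1)) return 0;
    return 1;
}
```

For this grammar the cells (X, a), (Y, c), and (Z, d) each receive two rules, so is_ll1() returns 0: the grammar is ambiguous and cannot be parsed predictively with one token of look-ahead.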
LL(1) parsing • Construct the table for the following example: • E --> T E’ • E’ --> + E | ε • T --> F T’ • T’ --> * T | ε • F --> num
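One possible answer, shown as a table-driven parser so that the stack of the PDA becomes explicit. This is a sketch under stated assumptions: the second rule is read as E’ --> + E | ε, an end-of-input marker $ (END_MARK) is added to the terminals so that FOLLOW[E] = { $ }, the table entries were derived by hand from FIRST and FOLLOW, and E1 and T1 stand for E’ and T’.

```c
#include <assert.h>
#include <stddef.h>

/* Terminals first, then nonterminals. END_MARK is the assumed $ token. */
enum { NUM, PLUS, TIMES, END_MARK, E, E1, T, T1, F, NSYM };
enum { NTERM = 4, MAXRHS = 2, STACK_MAX = 64 };

typedef struct { int len; int rhs[MAXRHS]; } Rhs;

static const Rhs rules[] = {
    /* 0 */ { 2, { T, E1 } },      /* E  --> T E'           */
    /* 1 */ { 2, { PLUS, E } },    /* E' --> + E            */
    /* 2 */ { 0, { 0 } },          /* E' --> ε  and T' --> ε */
    /* 3 */ { 2, { F, T1 } },      /* T  --> F T'           */
    /* 4 */ { 2, { TIMES, T } },   /* T' --> * T            */
    /* 5 */ { 1, { NUM } },        /* F  --> num            */
};

/* table[nonterminal - E][lookahead] = rule number, or -1 for a syntax error */
static const int table[5][NTERM] = {
    /*          num   +    *    $  */
    /* E  */ {   0,  -1,  -1,  -1 },
    /* E' */ {  -1,   1,  -1,   2 },
    /* T  */ {   3,  -1,  -1,  -1 },
    /* T' */ {  -1,   2,   4,   2 },
    /* F  */ {   5,  -1,  -1,  -1 },
};

/* Returns 1 if tokens (terminated by END_MARK) derive from E. */
int parse(const int *tokens) {
    assert(tokens != NULL);
    int stack[STACK_MAX];
    int top = 0;
    size_t pos = 0;
    stack[top++] = END_MARK;
    stack[top++] = E;
    while (top > 0) {
        int sym = stack[--top];
        int look = tokens[pos];
        if (look < 0 || look >= NTERM) return 0;   /* not a known terminal */
        if (sym < NTERM) {              /* terminal: must match the lookahead */
            if (sym != look) return 0;
            if (sym != END_MARK) pos++;
        } else {                        /* nonterminal: expand by table lookup */
            int r = table[sym - E][look];
            if (r < 0) return 0;
            if (top + rules[r].len > STACK_MAX) return 0;
            for (int i = rules[r].len - 1; i >= 0; i--)
                stack[top++] = rules[r].rhs[i];   /* push right-hand side reversed */
        }
    }
    return 1;
}
```

With these assumptions every cell holds at most one rule, so the grammar is LL(1). parse() accepts the token sequence for num + num * num and rejects num + and num num.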