Core
• Core: a simple programming language for which you will write an interpreter as your project.
• First, define the Core grammar.
• Next, look at the details of how an interpreter for Core may be written.
• Approach to be used in the interpreter: recursive descent (also called “syntax directed”).
CSE 3341/655; Part 2
BNF for Core
<prog> ::= program <decl seq> begin <stmt seq> end (1)
<decl seq> ::= <decl> | <decl> <decl seq> (2)
<stmt seq> ::= <stmt> | <stmt> <stmt seq> (3)
<decl> ::= int <id list>; (4)
<id list> ::= <id> | <id>, <id list> (5)
<stmt> ::= <assign> | <if> | <loop> | <input> | <output> (6)
<assign> ::= <id> = <exp>; (7)
<if> ::= if <cond> then <stmt seq> end;
       | if <cond> then <stmt seq> else <stmt seq> end; (8)
<loop> ::= while <cond> loop <stmt seq> end; (9)
<input> ::= read <id list>; (10)
<output> ::= write <id list>; (11)
BNF for Core (contd.)
<cond> ::= <comp> | !<cond>
         | [<cond> && <cond>] | [<cond> or <cond>] (12)
<comp> ::= (<op> <comp op> <op>) (13)
<exp> ::= <fac> | <fac>+<exp> | <fac>-<exp> (14)
<fac> ::= <op> | <op> * <fac> (15)
<op> ::= <int> | <id> | (<exp>) (16)
<comp op> ::= != | == | < | > | <= | >= (17)
<id> ::= <let> | <let><id> | <let><int> (18)
<let> ::= A | B | C | ... | X | Y | Z (19)
<int> ::= <digit> | <digit><int> (20)
<digit> ::= 0 | 1 | 2 | 3 | ... | 9 (21)
• Notes:
  • Problem with <exp>: consider 9-5+4; fix?
  • -5 is not a legal <int>; fix?
  • Productions (18)-(21) have no semantic significance.
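The first note on this slide can be made concrete with a small sketch. Production (14) is right-recursive, so an evaluator that follows it literally treats everything after the first operator as one <exp>. The Python below is only an illustration: the function names and the pre-split token list are assumptions, and it handles just integer operands with + and -.

```python
def eval_exp_right(tokens):
    """Evaluate by following <exp> ::= <fac> | <fac>+<exp> | <fac>-<exp> literally."""
    left = int(tokens[0])
    if len(tokens) == 1:
        return left
    op, rest = tokens[1], tokens[2:]
    right = eval_exp_right(rest)      # the whole remainder is a single <exp>
    return left + right if op == "+" else left - right


def eval_exp_left(tokens):
    """One possible fix: fold the operands from left to right."""
    result = int(tokens[0])
    for i in range(1, len(tokens), 2):
        operand = int(tokens[i + 1])
        result = result + operand if tokens[i] == "+" else result - operand
    return result


tokens = ["9", "-", "5", "+", "4"]
print(eval_exp_right(tokens))  # grouped as 9 - (5 + 4), which is 0
print(eval_exp_left(tokens))   # grouped as (9 - 5) + 4, which is 8
```

The left-to-right fold is only one way to get the conventional grouping; rewriting the grammar to be left-recursive is another, at the cost of complicating a recursive descent parser.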
Parse Tree for a simple program
program int X; begin X = 25; write X; end
<prog>
  program
  <decl seq>
    <decl>
      int
      <id list>
        <id>
          <let> X
      ;
  begin
  <stmt seq>
    <stmt>
      <assign>
        <id>
          <let> X
        =
        <exp> <...>
        ;
    <stmt seq>
      <stmt>
        <output>
          write
          <id list>
            <id>
              <let> X
          ;
  end
Concrete vs. Abstract Parse Trees
program int X; begin X = 25; write X; end
[Figure: the concrete parse tree of this program, rooted at <prog>, with "?" marks on some of its nodes.]
Abstract Parse Tree
program int X; begin X = 25; write X; end
[Figure: abstract parse tree for this program. Node labels: <prog>; <decl seq>, <stmt seq>; <decl>, <stmt>, <stmt>; <id list>, <id list>, <assign>, <output>; <id>, <id>, <fac>, <id>; <oper>, <int>; leaves X, X, X, 25.]
1. What if we had declared Y instead of X?
2. What if we had exchanged the two statements?
Core Interpreter
• Tokenizer: inputs the Core program, produces a stream of tokens;
• Parser: consumes the stream of tokens, produces the abstract parse tree (PT);
• Printer: given the PT, prints the original program in a pretty format;
• Executor: given the PT, executes the program;
• Parser, Printer, Executor: use the recursive descent approach.
Tokenizer
• Tokens:
  • Reserved words: program, begin, end, int, if, then, else, while, loop, read, write
  • Operators/special symbols: ; , = ! [ ] && or ( ) + - * != == < > <= >=
  • Integers (unsigned)
  • Identifiers (start with an uppercase letter, followed by zero or more uppercase letters, followed by zero or more digits)
Tokenizer methods ...
• getToken(): returns (info about) the current token; repeated calls to getToken() return the same token.
• skipToken(): skips the current token; the next token becomes the current token, so the next call to getToken() will return the new token.
• intVal(): returns the value of the current (integer) token (what if the current token is not an integer? Error!).
• idName(): returns the name (string) of the current (id) token (what if the current token is not an id? Error!).
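The four methods can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the course's reference tokenizer: the regular expression, the choice to return the token text from getToken(), the "eof" sentinel, and the exception types are all assumptions made here.

```python
import re

RESERVED = {"program", "begin", "end", "int", "if", "then", "else",
            "while", "loop", "read", "write"}

# Two-character symbols are listed first so "==" is not split into "=" "=".
TOKEN_RE = re.compile(
    r"\s*(?:(?P<word>[a-z]+)"
    r"|(?P<id>[A-Z]+[0-9]*)"
    r"|(?P<int>[0-9]+)"
    r"|(?P<sym>&&|!=|==|<=|>=|[;,=!\[\]()+\-*<>]))"
)


class Tokenizer:
    def __init__(self, text):
        self.tokens = []                      # list of (kind, text) pairs
        text = text.strip()
        pos = 0
        while pos < len(text):
            m = TOKEN_RE.match(text, pos)
            if m is None:
                raise SyntaxError(f"illegal input near position {pos}")
            kind = m.lastgroup                # name of the group that matched
            lexeme = m.group(kind)
            if kind == "word":                # lowercase word: keyword or "or"
                if lexeme == "or":
                    kind = "sym"
                elif lexeme in RESERVED:
                    kind = "reserved"
                else:
                    raise SyntaxError(f"unknown word {lexeme!r}")
            self.tokens.append((kind, lexeme))
            pos = m.end()
        self.tokens.append(("eof", ""))       # sentinel (assumption)
        self.pos = 0

    def getToken(self):
        """Return the current token's text; repeated calls return the same token."""
        return self.tokens[self.pos][1]

    def skipToken(self):
        """Advance to the next token, stopping at the end-of-input sentinel."""
        if self.tokens[self.pos][0] != "eof":
            self.pos += 1

    def intVal(self):
        kind, lexeme = self.tokens[self.pos]
        if kind != "int":
            raise ValueError("current token is not an integer")
        return int(lexeme)

    def idName(self):
        kind, lexeme = self.tokens[self.pos]
        if kind != "id":
            raise ValueError("current token is not an identifier")
        return lexeme


t = Tokenizer("program int X; begin X = 25; write X; end")
print(t.getToken(), t.getToken())   # the same token both times
t.skipToken()
t.skipToken()
print(t.idName())                   # the current token is now the identifier
```

Separating "look at the current token" from "move on" is what lets a parse procedure inspect a token to choose an alternative without consuming it.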
Recursive Descent
• Key idea:
  • A single procedure P_N corresponding to each non-terminal N
  • P_N is responsible for every occurrence of N and only occurrences of N
  • Will use this approach for parsing, printing, and execution
• Details:
  • Obtain the abstract parse tree
  • Pass the root node to P_S (S is the starting non-terminal)
  • Each P_N gets most of its work done by the procedures corresponding to the children of the node it receives as argument
Recursive Descent (contd.)
Example:
[Figure: an <if> node with children <cond>, <stmt seq>, <stmt seq>]
void execIf( ?? ) {
    bool b = evalCond( ?? );
    if (b) { execSS( ?? ); return; }
    else if ( ?alt? ) { execSS( ?? ); return; }
    else return;
}
So, need:
1. Non-terminal at current node
2. Alternative at current node
3. Move to children nodes
A (bad!) representation of PTs
An array representation of parse trees:
• Each node in the tree ↔ a row in the array;
• Each row has 5 columns:
  • Number corresponding to the non-terminal at the node;
  • Number corresponding to the alternative used;
  • The row numbers of the children nodes.
Representation of the <if> statement on the last page: ...
Recursive Descent (contd.)
void execIf( int n ) {                  // n is row no. of <if> node
    bool b = evalCond( PT[n,3] );       // PT is the parse tree array
    if (b) { execSS( PT[n,4] ); return; }
    else if (PT[n,2] == 2) { execSS( PT[n,5] ); return; }
    else return;
}
• Why do we need PT[n,1]?
• Why 5 columns in a row?
• What about <int>? What about <id>?
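To see the array representation at work, here is a small runnable sketch of execIf in Python. It is an illustration under assumptions: columns are 0-based here (the slides count from 1), and the condValues table and executed list are hypothetical stand-ins for real <cond> and <stmt seq> subtrees.

```python
IF, COND, SS = 8, 12, 3        # production numbers from the grammar slides

# Row layout: [non-terminal, alternative, child1, child2, child3]
PT = [
    [IF,   2, 1, 2, 3],              # row 0: <if> using the "else" alternative
    [COND, 1, None, None, None],     # row 1: <cond>
    [SS,   1, None, None, None],     # row 2: <stmt seq> of the then-branch
    [SS,   1, None, None, None],     # row 3: <stmt seq> of the else-branch
]
condValues = {1: True}   # hypothetical: truth value of each <cond> row
executed = []            # hypothetical: records which <stmt seq> rows ran


def evalCond(n):
    assert PT[n][0] == COND          # refuse rows that are not <cond> nodes
    return condValues[n]


def execSS(n):
    assert PT[n][0] == SS            # refuse rows that are not <stmt seq> nodes
    executed.append(n)


def execIf(n):
    assert PT[n][0] == IF            # the non-terminal column guards against misuse
    if evalCond(PT[n][2]):
        execSS(PT[n][3])
    elif PT[n][1] == 2:              # alternative 2 has an else-branch
        execSS(PT[n][4])


execIf(0)
print(executed)  # [2]: only the then-branch row was executed
```

The assertions show one use for the non-terminal column: each procedure can verify that it was handed a row of the kind it is responsible for.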
Recursive Descent (contd.)
void printIf( int n ) {      // n: row no. of <if> node
    // check PT[n,1] to see if this is <if> node
    write("if"); printCond( PT[n,3] );    // don't we have to evaluate the condition?
    write("then"); printSS( PT[n,4] );    // what if it was not an <SS>?
    if (PT[n,2] == 2) { write("else"); printSS( PT[n,5] ); }
    write("end;");
}
Recursive Descent (contd.)
void printAssign( int n ) {  // n: row no. of <assign> node
    // check PT[n,1] to see if this is <assign> node
    printId( PT[n,3] ); write("="); printExp( PT[n,4] );
}   // bug in this code!
Recursive Descent (contd.)
void execAssign( int n ) {   // n: row no. of <assign> node
    // check PT[n,1] to see if this is <assign> node
    int x = evalExp( PT[n,4] );   // don't we have to first take care of PT[n,3]?
    assignIdVal( PT[n,3], x );    // what about PT[n,2]? PT[n,5]?
}
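A runnable Python sketch of execAssign over the same kind of array may help with the questions in the comments. Everything beyond the slide's procedure names is an assumption for illustration: columns are 0-based, and idNames, expValues, and values are hypothetical side tables standing in for identifier names, expression evaluation, and the run-time store.

```python
ASSIGN, EXP, ID = 7, 14, 18          # production numbers from the grammar slides

# Row layout: [non-terminal, alternative, child1, child2, child3]
PT = [
    [ASSIGN, 1, 1, 2, None],         # row 0: <assign> with children <id>, <exp>
    [ID,     1, None, None, None],   # row 1: <id>
    [EXP,    1, None, None, None],   # row 2: <exp>
]
idNames = {1: "X"}     # hypothetical: <id> row -> identifier name
expValues = {2: 25}    # hypothetical: <exp> row -> value it evaluates to
values = {}            # hypothetical symbol table: identifier name -> value


def evalExp(n):
    assert PT[n][0] == EXP
    return expValues[n]


def assignIdVal(n, x):
    assert PT[n][0] == ID
    values[idNames[n]] = x


def execAssign(n):
    assert PT[n][0] == ASSIGN
    x = evalExp(PT[n][3])        # evaluate the right-hand side first
    assignIdVal(PT[n][2], x)     # then store it under the identifier's name


execAssign(0)
print(values)  # {'X': 25}
```

The side tables hint at why the slides call the plain array "bad": names and integer values have no natural place in a row of five numbers.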
Parser
• Parsing is harder: no tree to descend!
• The trick: build the tree *as* you descend!
• Approach: the calling procedure creates an "empty" node, by grabbing the next free row from the PT array, and passes it to the appropriate parse procedure.
Recursive Descent Parsing
(Note: "t" is the (global) Tokenizer.)
void parseIf( int n ) {          // node created by *caller* - who?
    PT[n,1] = 8;                 // why?
    string s = t.getToken();     // if s != "if" error!
    PT[n,3] = nextRow++;         // next free row; initialize?
    parseCond( PT[n,3] );        // bug!
    PT[n,4] = nextRow++;
    parseSS( PT[n,4] );          // bug!
    s = t.getToken();
    if (s != "else") { return; } // bug! bug!
    t.skipToken();
    PT[n,5] = nextRow++;
    parseSS( PT[n,5] );
    return;                      // not so fast!
}
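One reading of the bugs flagged above is that the keywords if, then, end and the final ; are never checked and skipped, and that the alternative number is never recorded. The Python sketch below repairs those points. It is not the course's solution: ListTokenizer, newRow, and expect are hypothetical helpers, the tokenizer is passed as a parameter instead of being global, columns are 0-based, and parseCond and parseSS are stand-ins that accept a single token each.

```python
class ListTokenizer:
    """Hypothetical stand-in tokenizer over a pre-split list of tokens."""
    def __init__(self, tokens):
        self.tokens, self.pos = list(tokens), 0

    def getToken(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else ""

    def skipToken(self):
        self.pos += 1


IF, COND, SS = 8, 12, 3    # production numbers from the grammar slides
PT = []                    # each row: [non-terminal, alternative, c1, c2, c3]


def newRow():
    """Grab the next free row, initialized to 'empty'."""
    PT.append([0, 0, None, None, None])
    return len(PT) - 1


def expect(t, word):
    """Check that the current token is `word`, then consume it."""
    if t.getToken() != word:
        raise SyntaxError(f"expected {word!r}, got {t.getToken()!r}")
    t.skipToken()


def parseCond(t, n):       # stand-in: treats one token as a whole <cond>
    PT[n][0], PT[n][1] = COND, 1
    t.skipToken()


def parseSS(t, n):         # stand-in: treats one token as a whole <stmt seq>
    PT[n][0], PT[n][1] = SS, 1
    t.skipToken()


def parseIf(t, n):
    PT[n][0] = IF
    expect(t, "if")
    PT[n][2] = newRow()
    parseCond(t, PT[n][2])
    expect(t, "then")
    PT[n][3] = newRow()
    parseSS(t, PT[n][3])
    if t.getToken() == "else":
        t.skipToken()
        PT[n][1] = 2                 # alternative with an else-branch
        PT[n][4] = newRow()
        parseSS(t, PT[n][4])
    else:
        PT[n][1] = 1                 # alternative without an else-branch
    expect(t, "end")
    expect(t, ";")


t = ListTokenizer(["if", "C", "then", "S1", "else", "S2", "end", ";"])
root = newRow()
parseIf(t, root)
print(PT[root])  # [8, 2, 1, 2, 3]
```

Here the caller creates the row and parseIf fills it in, which matches the approach described on the Parser slide.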