Develop an interpreter for the Core programming language using a recursive descent approach. Implement tokenizer, parser, printer, and executor for the language.

Core
• Core: a simple programming language for which you will write an interpreter as your project.
• First, define the Core grammar.
• Next, look at the details of how an interpreter for Core may be written.
• Approach used in the interpreter: recursive descent (also called "syntax directed").
CSE 3341/655; Part 2

BNF for Core
<prog>    ::= program <declseq> begin <stmtseq> end           (1)
<declseq> ::= <decl> | <decl> <declseq>                       (2)
<stmtseq> ::= <stmt> | <stmt> <stmtseq>                       (3)
<decl>    ::= int <id list>;                                  (4)
<id list> ::= <id> | <id>, <id list>                          (5)
<stmt>    ::= <assign> | <if> | <loop> | <in> | <out>         (6)
<assign>  ::= <id> = <exp>;                                   (7)
<if>      ::= if <cond> then <stmtseq> end;                   (8)
            | if <cond> then <stmtseq> else <stmtseq> end;
<loop>    ::= while <cond> loop <stmtseq> end;                (9)
<in>      ::= read <id list>;                                 (10)
<out>     ::= write <id list>;                                (11)

BNF for Core (contd.)
<cond>    ::= <comp> | !<cond>                                (12)
            | [<cond> && <cond>] | [<cond> or <cond>]
<comp>    ::= (<op> <comp op> <op>)                           (13)
<exp>     ::= <fac> | <fac> + <exp> | <fac> - <exp>           (14)
<fac>     ::= <op> | <op> * <fac>                             (15)
<op>      ::= <int> | <id> | (<exp>)                          (16)
<comp op> ::= != | == | < | > | <= | >=                       (17)
<id>      ::= <let> | <let><id> | <let><int>                  (18)
<let>     ::= A | B | C | ... | X | Y | Z                     (19)
<int>     ::= <digit> | <digit><int>                          (20)
<digit>   ::= 0 | 1 | 2 | 3 | ... | 9                         (21)
• Notes:
  Problem with <exp>: consider 9-5+4; fix?
  -5 is not a legal <int>; fix?
  Productions (18)-(21) have no semantic significance.
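
The first note can be checked by following productions (14)-(16) literally. The sketch below is only an illustration under stated assumptions: operands are restricted to unsigned integers, the input contains no spaces, parentheses, or identifiers, and all function names (readInt, evalFac, evalExpRightRecursive, evalExpLeftAssoc, rightRecursive, leftAssoc) are hypothetical. Because <exp> ::= <fac> - <exp> is right-recursive, 9-5+4 groups as 9-(5+4); the loop version shows one possible fix, not necessarily the one intended in the course.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: reads an unsigned integer starting at s[pos].
int readInt(const std::string& s, std::size_t& pos) {
    assert(pos < s.size() && std::isdigit(static_cast<unsigned char>(s[pos])));
    int value = 0;
    while (pos < s.size() && std::isdigit(static_cast<unsigned char>(s[pos]))) {
        value = value * 10 + (s[pos] - '0');
        ++pos;
    }
    return value;
}

// <fac> ::= <op> | <op> * <fac>   (operands restricted to <int> in this sketch)
int evalFac(const std::string& s, std::size_t& pos) {
    int left = readInt(s, pos);
    if (pos < s.size() && s[pos] == '*') {
        ++pos;
        return left * evalFac(s, pos);
    }
    return left;
}

// <exp> ::= <fac> | <fac> + <exp> | <fac> - <exp>, followed literally.
int evalExpRightRecursive(const std::string& s, std::size_t& pos) {
    int left = evalFac(s, pos);
    if (pos < s.size() && s[pos] == '+') { ++pos; return left + evalExpRightRecursive(s, pos); }
    if (pos < s.size() && s[pos] == '-') { ++pos; return left - evalExpRightRecursive(s, pos); }
    return left;
}

// One possible fix: a loop, so that + and - group from left to right.
int evalExpLeftAssoc(const std::string& s, std::size_t& pos) {
    int result = evalFac(s, pos);
    while (pos < s.size() && (s[pos] == '+' || s[pos] == '-')) {
        char op = s[pos++];
        int right = evalFac(s, pos);
        result = (op == '+') ? result + right : result - right;
    }
    return result;
}

// Convenience wrappers that start at the beginning of the string.
int rightRecursive(const std::string& s) { std::size_t p = 0; return evalExpRightRecursive(s, p); }
int leftAssoc(const std::string& s) { std::size_t p = 0; return evalExpLeftAssoc(s, p); }
```

With these definitions, rightRecursive("9-5+4") yields 0, while leftAssoc("9-5+4") yields the conventional 8.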

Parse Tree for a simple program
program int X; begin X = 25; write X; end
[Figure: parse tree rooted at <prog>, with children program, <decl seq>, begin, <stmt seq>, end. <decl seq> expands to <decl> (int <id list> ;); <stmt seq> expands to an <assign> statement (<id> = <exp> ;) followed by an <output> statement (write <id list> ;).]

Concrete vs. Abstract Parse Trees
program int X; begin X = 25; write X; end
[Figure: the parse tree for this program again, with several of its nodes marked "?".]

Abstract Parse Tree
program int X; begin X = 25; write X; end
[Figure: abstract parse tree. <prog> has children <decl seq> and <stmt seq>; <decl seq> contains a <decl> with an <id list> holding X; <stmt seq> contains an <assign> (X and the integer 25) and an <output> (an <id list> holding X).]
1. What if we had declared Y instead of X?
2. What if we had exchanged the two statements?

Core Interpreter
• Tokenizer: takes a Core program as input and produces a stream of tokens.
• Parser: consumes the stream of tokens and produces the abstract parse tree (PT).
• Printer: given the PT, prints the original program in a pretty format.
• Executor: given the PT, executes the program.
• Parser, Printer, and Executor all use the recursive descent approach.
• Related tools: Lex, YACC, Flex, Bison, ANTLR, …

Tokenizer
• Tokens:
  Reserved words: program, begin, end, int, if, then, else, while, loop, read, write
  Operators/special symbols: ; , = ! [ ] && or ( ) + - * != == < > <= >=
  Integers (unsigned)
  Identifiers (start with an uppercase letter, followed by zero or more uppercase letters, followed by zero or more digits)

Tokenizer methods ...
• getToken(): returns (info about) the current token. Repeated calls to getToken() return the same token.
• skipToken(): skips the current token; the next token becomes the current token, so the next call to getToken() returns the new token.
• intVal(): returns the value of the current (integer) token. (What if the current token is not an integer? Error!)
• idName(): returns the name (string) of the current (id) token. (What if the current token is not an id? Error!)
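
The four methods can be sketched as a small class. This is only an illustration under stated assumptions, not the course's reference tokenizer: tokens are returned as plain strings, an empty string signals end of input, lowercase words are not checked against the reserved-word list, unknown characters come back as one-character tokens, and the private helpers (advance, isDigit, isUpper, isLower, isSpace) are hypothetical.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>

class Tokenizer {
public:
    explicit Tokenizer(const std::string& source) : src(source), pos(0) { advance(); }

    // Returns the current token; repeated calls return the same token.
    const std::string& getToken() const { return current; }

    // Makes the next token the current one.
    void skipToken() { advance(); }

    // Value of the current token, which must be an integer.
    int intVal() const {
        if (current.empty() || !isDigit(current[0]))
            throw std::runtime_error("current token is not an integer");
        return std::stoi(current);
    }

    // Name of the current token, which must be an identifier.
    const std::string& idName() const {
        if (current.empty() || !isUpper(current[0]))
            throw std::runtime_error("current token is not an identifier");
        return current;
    }

private:
    static bool isDigit(char c) { return std::isdigit(static_cast<unsigned char>(c)) != 0; }
    static bool isSpace(char c) { return std::isspace(static_cast<unsigned char>(c)) != 0; }
    static bool isUpper(char c) { return c >= 'A' && c <= 'Z'; }
    static bool isLower(char c) { return c >= 'a' && c <= 'z'; }

    // Scans the next token into `current` ("" at end of input).
    void advance() {
        while (pos < src.size() && isSpace(src[pos])) ++pos;
        const std::size_t start = pos;
        if (pos < src.size()) {
            const char c = src[pos++];
            if (isDigit(c)) {                       // unsigned integer
                while (pos < src.size() && isDigit(src[pos])) ++pos;
            } else if (isUpper(c)) {                // identifier: letters, then digits
                while (pos < src.size() && isUpper(src[pos])) ++pos;
                while (pos < src.size() && isDigit(src[pos])) ++pos;
            } else if (isLower(c)) {                // reserved word or "or" (not validated)
                while (pos < src.size() && isLower(src[pos])) ++pos;
            } else if (pos < src.size()) {          // two-character symbols: && != == <= >=
                const char d = src[pos];
                const bool pair = (c == '&' && d == '&') ||
                    ((c == '!' || c == '=' || c == '<' || c == '>') && d == '=');
                if (pair) ++pos;
            }
        }
        assert(pos <= src.size());
        current = src.substr(start, pos - start);
    }

    std::string src;
    std::size_t pos;
    std::string current;
};
```

Throwing an exception is one way to realize the "error!" cases noted for intVal() and idName(); a real project may prefer a different error-reporting scheme.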

Recursive Descent
• Key idea:
  A single procedure PN corresponds to each non-terminal N.
  PN is responsible for every occurrence of N and only occurrences of N.
  We will use this approach for parsing, printing, and execution.
• Details:
  • Obtain the abstract parse tree.
  • Pass the root node to PS (S is the starting non-terminal).
  • Each PN gets most of its work done by the procedures corresponding to the children of the node it receives as argument.

Recursive Descent (contd.)
Example: an <if> node with children <cond>, <stmt seq>, and (optionally) a second <stmt seq>.
    void execIf( ?? ) {
      bool b = evalCond( ?? );
      if (b) { execSS( ?? ); return; }
      else if ( ?alt? ) { execSS( ?? ); return; }
      else return;
    }
So, we need:
1. The non-terminal at the current node
2. The alternative used at the current node
3. A way to move to the children nodes

A (bad!) representation of PTs
An array representation of parse trees:
• Each node in the tree ↔ a row in the array.
• Each row has 5 columns:
  • The number corresponding to the non-terminal at the node (column 1);
  • The number corresponding to the alternative used (column 2);
  • The row numbers of the children nodes (columns 3 to 5).
Representation of the <if> statement in the last page: ...

Recursive Descent (contd)
    void execIf( int n ) {              // n is row no. of <if> node
      bool b = evalCond( PT[n,3] );     // PT is the parse tree array
      if (b) { execSS( PT[n,4] ); return; }
      else if (PT[n,2] == 2) { execSS( PT[n,5] ); return; }
      else return;
    }
• Why do we need PT[n,1]?
• Why 5 columns in a row?
• What about <int>? What about <id>?
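
The control flow of execIf can be tried out on a tiny hand-built array. The following is a sketch under loud assumptions rather than a full executor: PT[n][k] is the C++ spelling of the slide's PT[n,k], row 0 is a dummy so that 0 can mean "no child", evalCond and execSS are stubs (a <cond> row simply stores its truth value in column 3, and "executing" a <stmt seq> only records its row number), and addRow, makeIf, and executed are hypothetical names.

```cpp
#include <array>
#include <cassert>
#include <vector>

using Row = std::array<int, 6>;   // columns 1..5 as on the slide; column 0 unused
std::vector<Row> PT(1);           // row 0 is a dummy, so 0 can mean "no child"
std::vector<int> executed;        // rows of the <stmt seq> nodes that were run

// Hypothetical helper: appends a row and returns its row number.
int addRow(int nonTerm, int alt, int c3 = 0, int c4 = 0, int c5 = 0) {
    Row r = {{0, nonTerm, alt, c3, c4, c5}};
    PT.push_back(r);
    return static_cast<int>(PT.size()) - 1;
}

// Stub (assumption): column 3 of a <cond> row holds its truth value.
bool evalCond(int n) { return PT[n][3] != 0; }

// Stub (assumption): running a <stmt seq> just records its row number.
void execSS(int n) { executed.push_back(n); }

void execIf(int n) {              // n is the row number of an <if> node
    assert(PT[n][1] == 8);        // column 1 confirms this really is an <if> node
    bool b = evalCond(PT[n][3]);
    if (b) { execSS(PT[n][4]); return; }
    if (PT[n][2] == 2) { execSS(PT[n][5]); return; }
}

// Builds an <if> node (production 8) with stub children; returns its row.
int makeIf(bool condValue, bool hasElse) {
    int cond = addRow(12, 1, condValue ? 1 : 0);
    int thenPart = addRow(3, 1);
    int elsePart = hasElse ? addRow(3, 1) : 0;
    return addRow(8, hasElse ? 2 : 1, cond, thenPart, elsePart);
}
```

Calling execIf on makeIf(true, true) records the then-branch, on makeIf(false, true) the else-branch, and on makeIf(false, false) nothing at all.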

Recursive Descent (contd)
    void printIf( int n ) {     // n: row no. of <if> node
      // check PT[n,1] to see if this is <if> node
      write("if");
      printCond( PT[n,3] );     // don't we have to evaluate the condition?
      write("then");
      printSS( PT[n,4] );       // what if it was not an <SS>?
      if (PT[n,2] == 2) { write("else"); printSS( PT[n,5] ); }
      write("end;");
    }

Recursive Descent (contd)
    void printAssign( int n ) { // n: row no. of <assign> node
      // check PT[n,1] to see if this is <assign> node
      printId( PT[n,3] );
      write("=");
      printExp( PT[n,4] );
    }                           // bug in this code!

Recursive Descent (contd)
    void execAssign( int n ) {      // n: row no. of <assign> node
      // check PT[n,1] to see if this is <assign> node
      int x = evalExp( PT[n,4] );   // don't we have to first take care of PT[n,3]?
      assignIdVal( PT[n,3], x );    // what about PT[n,2]? PT[n,5]?
    }
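
Printing and executing an assignment are two descents over the same rows. The sketch below hand-builds the array for X = 25 - 5 * 2; and walks it both ways. Everything beyond the slide's procedure names is an assumption made for illustration: PT[n][k] stands for PT[n,k], node numbers follow the production numbers, an <int> row keeps its value in column 3, an <id> row keeps an index into a hypothetical names table, and addRow, check, idName, memory, and buildSample are invented helpers. Emitting the closing ";" in printAssign is one plausible reading of the bug flagged on the printAssign slide.

```cpp
#include <array>
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Node numbers follow the production numbers of the grammar.
enum NonTerm { NT_ASSIGN = 7, NT_EXP = 14, NT_FAC = 15, NT_OP = 16, NT_ID = 18, NT_INT = 20 };

using Row = std::array<int, 6>;     // columns 1..5 as on the slides
std::vector<Row> PT(1);             // row 0 is a dummy, so 0 means "no child"
std::vector<std::string> names;     // assumption: an <id> row stores an index into this
std::map<std::string, int> memory;  // current values of the variables

int addRow(int nonTerm, int alt, int c3 = 0, int c4 = 0) {
    Row r = {{0, nonTerm, alt, c3, c4, 0}};
    PT.push_back(r);
    return static_cast<int>(PT.size()) - 1;
}

// The "check PT[n,1]" step from the slides.
void check(int n, int nonTerm) { assert(PT[n][1] == nonTerm); (void)n; (void)nonTerm; }

const std::string& idName(int n) { check(n, NT_ID); return names[PT[n][3]]; }

int evalExp(int n);
std::string printExp(int n);

int evalOp(int n) {
    check(n, NT_OP);
    int child = PT[n][3];
    if (PT[n][2] == 1) return PT[child][3];              // <int>
    if (PT[n][2] == 2) return memory.at(idName(child));  // <id>
    return evalExp(child);                               // ( <exp> )
}
int evalFac(int n) {
    check(n, NT_FAC);
    int v = evalOp(PT[n][3]);
    return PT[n][2] == 2 ? v * evalFac(PT[n][4]) : v;
}
int evalExp(int n) {
    check(n, NT_EXP);
    int v = evalFac(PT[n][3]);
    if (PT[n][2] == 2) return v + evalExp(PT[n][4]);
    if (PT[n][2] == 3) return v - evalExp(PT[n][4]);
    return v;
}
void execAssign(int n) {
    check(n, NT_ASSIGN);
    int x = evalExp(PT[n][4]);       // evaluate the right-hand side first
    memory[idName(PT[n][3])] = x;    // then store it under the variable's name
}

std::string printOp(int n) {
    int child = PT[n][3];
    if (PT[n][2] == 1) return std::to_string(PT[child][3]);
    if (PT[n][2] == 2) return idName(child);
    return "(" + printExp(child) + ")";
}
std::string printFac(int n) {
    std::string s = printOp(PT[n][3]);
    return PT[n][2] == 2 ? s + " * " + printFac(PT[n][4]) : s;
}
std::string printExp(int n) {
    std::string s = printFac(PT[n][3]);
    if (PT[n][2] == 2) return s + " + " + printExp(PT[n][4]);
    if (PT[n][2] == 3) return s + " - " + printExp(PT[n][4]);
    return s;
}
std::string printAssign(int n) {
    check(n, NT_ASSIGN);
    return idName(PT[n][3]) + " = " + printExp(PT[n][4]) + ";";
}

// Hand-builds the rows for: X = 25 - 5 * 2;  Returns the <assign> row.
int buildSample() {
    names.push_back("X");
    int id = addRow(NT_ID, 1, static_cast<int>(names.size()) - 1);
    int f25 = addRow(NT_FAC, 1, addRow(NT_OP, 1, addRow(NT_INT, 1, 25)));
    int f2 = addRow(NT_FAC, 1, addRow(NT_OP, 1, addRow(NT_INT, 1, 2)));
    int f5x2 = addRow(NT_FAC, 2, addRow(NT_OP, 1, addRow(NT_INT, 1, 5)), f2);
    int rest = addRow(NT_EXP, 1, f5x2);
    return addRow(NT_ASSIGN, 1, id, addRow(NT_EXP, 3, f25, rest));
}
```

For the sample tree, printAssign returns the text X = 25 - 5 * 2; and execAssign stores 15 in memory under X. Note how many rows a one-line statement needs, which is part of why the slides call this representation bad.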

Parser
Parsing is harder: there is no tree to descend!
The trick: build the tree *as* you descend!
Approach: the calling procedure creates an "empty" node, by grabbing the next free row from the PT array, and passes it to the appropriate parse procedure.

Recursive Descent Parsing
(Note: "t" is the (global) Tokenizer.)
    void parseIf( int n ) {       // node created by *caller* - who?
      PT[n,1] = 8;                // why?
      string s = t.getToken();    // if s != "if" error!
      PT[n,3] = nextRow++;        // next free row; initialize?
      parseCond( PT[n,3] );       // bug!
      PT[n,4] = nextRow++;
      parseSS( PT[n,4] );         // bug!
      s = t.getToken();
      if (s != "else") { return; }   // bug! bug!
      t.skipToken();
      PT[n,5] = nextRow++;
      parseSS( PT[n,5] );
      return;                     // not so fast!
    }
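
One way to address the points flagged in parseIf is sketched below: "if" is checked and skipped, "then" is checked and skipped, the alternative number is recorded in column 2, and "end" and ";" are consumed on both paths. This is a hedged illustration, not the official solution. TokenStream is a minimal stand-in for the tokenizer with pre-split tokens, parseCond and parseSS are stubs that each consume exactly one placeholder token, and newRow and expect are hypothetical helpers. PT[n][k] stands for the slide's PT[n,k].

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Minimal stand-in for the tokenizer (assumption): tokens are already split.
struct TokenStream {
    std::vector<std::string> tokens;
    std::size_t pos = 0;
    std::string getToken() const { return pos < tokens.size() ? tokens[pos] : ""; }
    void skipToken() { if (pos < tokens.size()) ++pos; }
};

using Row = std::array<int, 6>;   // columns 1..5 as on the slides
std::vector<Row> PT(1);           // row 0 is a dummy, so 0 means "no child"
TokenStream t;                    // the (global) tokenizer

// Grabs the next free row, initialized to zeros, and returns its number.
int newRow() {
    PT.push_back(Row{});
    return static_cast<int>(PT.size()) - 1;
}

// Checks that the current token is `want`, then skips it.
void expect(const std::string& want) {
    if (t.getToken() != want)
        throw std::runtime_error("expected '" + want + "' but found '" + t.getToken() + "'");
    t.skipToken();
}

// Stubs (assumption): a real parser would descend into <cond> and <stmt seq>.
void parseCond(int n) { PT[n][1] = 12; t.skipToken(); }
void parseSS(int n)   { PT[n][1] = 3;  t.skipToken(); }

void parseIf(int n) {             // n: row created by the caller
    assert(n > 0 && n < static_cast<int>(PT.size()));
    PT[n][1] = 8;                 // <if> is production (8)
    expect("if");

    int cond = newRow();          // take the row first, then index PT again,
    PT[n][3] = cond;              // because push_back may move the rows
    parseCond(cond);
    expect("then");

    int thenPart = newRow();
    PT[n][4] = thenPart;
    parseSS(thenPart);

    PT[n][2] = 1;                 // alternative 1: no else part
    if (t.getToken() == "else") {
        t.skipToken();
        PT[n][2] = 2;             // alternative 2: with else part
        int elsePart = newRow();
        PT[n][5] = elsePart;
        parseSS(elsePart);
    }
    expect("end");
    expect(";");
}
```

The rows are obtained before PT is indexed again because the vector may reallocate when it grows; with the fixed-size array assumed on the slides, PT[n,3] = nextRow++ does not have that problem.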