Simple One-Pass Compiler
Natawut Nupairoj, Ph.D.
Department of Computer Engineering, Chulalongkorn University
Outline
• Translation Scheme
• Annotated Parse Tree
• Parsing Fundamentals
• Top-Down Parsers
• Abstract Stack Machine
• Simple Code Generation
Simple One-Pass Compiler
[Figure: source program (text stream, e.g. "main() {") -> Scanner -> Parser -> object code (text stream)]
Sample Grammar
expr -> expr + term
expr -> expr - term
expr -> term
term -> 0 | 1 | 2 | ... | 9
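Derivation
• String: 9 - 5 + 2
expr => expr + term
     => expr - term + term
     => term - term + term
     => 9 - term + term
     => 9 - 5 + term
     => 9 - 5 + 2
• Leftmost vs. rightmost derivation: the steps above always expand the leftmost nonterminal, so this is a leftmost derivation; a rightmost derivation expands the rightmost nonterminal at each step.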
Parse Tree (for 9 - 5 + 2)
expr
  expr
    expr
      term
        9
    -
    term
      5
  +
  term
    2
Translation Scheme
• Context-free grammar with embedded semantic actions.
expr ::= expr + term   { print('+'); }
expr ::= expr - term   { print('-'); }
expr ::= term
term ::= 0             { print('0'); }
term ::= 1             { print('1'); }
...
term ::= 9             { print('9'); }
• Executing the actions emits (prints out) a translation.
Parse Tree with Semantic Actions
expr
  expr
    expr
      term
        9
        { print('9') }
    -
    term
      5
      { print('5') }
    { print('-') }
  +
  term
    2
    { print('2') }
  { print('+') }
• Depth-first traversal
• Input: 9 - 5 + 2
• Output: 9 5 - 2 +
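The depth-first traversal can be sketched in Python as follows. The tuple encoding of tree nodes and the helper names (leaf, digit, binary, traverse, translate_tree) are assumptions made for this illustration and are not part of the slides.

```python
# Hypothetical tree encoding (not from the slides): a node is a pair
# (label, children); a semantic action is a pair ("action", text_to_print).

def leaf(symbol):
    """A terminal symbol in the tree."""
    return (symbol, [])

def digit(d):
    """term ::= d { print(d) }"""
    return ("term", [leaf(d), ("action", d)])

def binary(op, left, right):
    """expr ::= expr op term { print(op) }"""
    return ("expr", [left, leaf(op), right, ("action", op)])

def traverse(node, out):
    """Depth-first, left-to-right walk; an action runs when it is reached."""
    label, payload = node
    if label == "action":
        out.append(payload)
        return
    for child in payload:
        traverse(child, out)

def translate_tree(tree):
    out = []
    traverse(tree, out)
    return " ".join(out)

# Parse tree for 9 - 5 + 2 (the innermost expr derives a single term).
tree = binary("+", binary("-", ("expr", [digit("9")]), digit("5")), digit("2"))
print(translate_tree(tree))  # 9 5 - 2 +
```

Because each action sits at the right end of its production, it runs only after both operands have been visited, which is what produces postfix output.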
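Location of Semantic Actions
• Semantic actions can be placed anywhere on the right-hand side (RHS) of a production; the position determines when the action runs during the traversal.
expr ::= { print('+'); } expr + term
expr ::= { print('-'); } expr - term
expr ::= term
term ::= 0 { print('0'); }
term ::= 1 { print('1'); }
...
term ::= 9 { print('9'); }
• With the operator actions placed first, 9 - 5 + 2 is emitted in prefix form: + - 9 5 2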
Parsing Approaches
• Top-down parsing
  • builds the parse tree from the start symbol
  • matches the resulting terminal string against the input stream
  • simple but limited in power
• Bottom-up parsing
  • starts from the input token stream
  • builds the parse tree from the terminals up until the start symbol is reached
  • complex but powerful
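Top-Down vs. Bottom-Up
[Figure: two panels. Top-down parsing: start at the start symbol, derive a result, and match it against the input token stream. Bottom-up parsing: start at the input token stream and build up to the result, the start symbol.]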
Example
type ::= simple | ^id | array [ simple ] of type
simple ::= integer | char | num dotdot num
Input token string: array [ num dotdot num ] of integer
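Top-Down Parsing with Left-to-Right Scanning of the Input Stream
Production applied: type -> array [ simple ] of type
Input: array [ num dotdot num ] of integer
[Figure: partial parse tree for type drawn above the input, with an arrow marking the lookahead token]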
Backtracking (Recursive-Descent Parsing)
Input: array [ num dotdot num ] of integer
[Figure: the alternatives of simple (integer, char, num) shown against the input, with an arrow marking the lookahead token]
Predictive Parsing
type ::= simple | ^id | array [ simple ] of type
simple ::= integer | char | num dotdot num
Production selected: type -> array [ simple ] of type
Input: array [ num dotdot num ] of integer
[Figure: parse tree for type above the input, with an arrow marking the lookahead token]
The Program for Predictive Parser
[Figure: the predictive parser calls match('array'); match (the scanner) reads "array [" from the input text stream and answers OK; the predictive parser produces the output]
The Program for Predictive Parsing

procedure match ( t : token );
begin
    if lookahead = t then
        lookahead := nexttoken
    else error
end;

procedure type;
begin
    if lookahead is in { integer, char, num } then
        simple
    else if lookahead = '^' then begin
        match ( '^' ); match ( id )
    end
    else if lookahead = array then begin
        match ( array ); match ( '[' ); simple;
        match ( ']' ); match ( of ); type
    end
    else error
end;

procedure simple;
begin
    if lookahead = integer then
        match ( integer )
    else if lookahead = char then
        match ( char )
    else if lookahead = num then begin
        match ( num ); match ( dotdot ); match ( num )
    end
    else error
end;
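The same parser can be sketched in runnable Python. Assumptions for this illustration only: the input arrives already split into tokens on whitespace, "$" marks the end of input, and the names PredictiveParser, ParseError, type_ and accepts are hypothetical.

```python
class ParseError(Exception):
    """Raised where the pseudocode calls error."""

class PredictiveParser:
    def __init__(self, tokens):
        # "$" is an assumed end-of-input marker, not part of the grammar.
        self.tokens = list(tokens) + ["$"]
        self.pos = 0

    @property
    def lookahead(self):
        return self.tokens[self.pos]

    def match(self, t):
        # Consume the lookahead if it is the expected token.
        if self.lookahead == t:
            self.pos += 1
        else:
            raise ParseError(f"expected {t!r}, found {self.lookahead!r}")

    def type_(self):
        # type ::= simple | ^id | array [ simple ] of type
        if self.lookahead in ("integer", "char", "num"):
            self.simple()
        elif self.lookahead == "^":
            self.match("^")
            self.match("id")
        elif self.lookahead == "array":
            self.match("array")
            self.match("[")
            self.simple()
            self.match("]")
            self.match("of")
            self.type_()
        else:
            raise ParseError(f"unexpected {self.lookahead!r}")

    def simple(self):
        # simple ::= integer | char | num dotdot num
        if self.lookahead == "integer":
            self.match("integer")
        elif self.lookahead == "char":
            self.match("char")
        elif self.lookahead == "num":
            self.match("num")
            self.match("dotdot")
            self.match("num")
        else:
            raise ParseError(f"unexpected {self.lookahead!r}")

def accepts(text):
    """True if the whitespace-separated token string is a valid type."""
    parser = PredictiveParser(text.split())
    try:
        parser.type_()
        parser.match("$")   # the whole input must be consumed
        return True
    except ParseError:
        return False

print(accepts("array [ num dotdot num ] of integer"))  # True
```

Each procedure decides which alternative to use by looking only at the lookahead token, so no backtracking is needed.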
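Mapping Between Production and Parser Code
Production: type -> array [ simple ] of type
Code: match(array); match('['); simple; match(']'); match(of); type
• Each terminal becomes a match call, which consumes a token from the scanner.
• Each nonterminal becomes a call to a parser procedure; for example, simple performs the parsing (recognition) of simple.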
Lookahead Symbols
A -> α
FIRST(α) = the set of first tokens of the strings generated from α
FIRST(simple) = { integer, char, num }
FIRST(^id) = { ^ }
FIRST(array [ simple ] of type) = { array }
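A minimal Python sketch of how FIRST sets could be computed by fixed-point iteration. The grammar encoding (a dict from nonterminal to a list of alternatives, with an empty list standing for the empty string) and the names first_sets, first_of_string and EPS are assumptions for this illustration.

```python
EPS = "ε"  # assumed marker for the empty string

def first_of_string(symbols, first, grammar):
    """FIRST of a sequence of grammar symbols, given the current FIRST sets."""
    result = set()
    for sym in symbols:
        if sym not in grammar:          # a terminal starts the string
            result.add(sym)
            return result
        result |= first[sym] - {EPS}
        if EPS not in first[sym]:       # sym cannot vanish, so stop here
            return result
    result.add(EPS)                     # every symbol can vanish (or no symbols)
    return result

def first_sets(grammar):
    """Iterate until no FIRST set grows any further."""
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, alternatives in grammar.items():
            for alt in alternatives:
                new = first_of_string(alt, first, grammar)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first

grammar = {
    "type": [["simple"], ["^", "id"],
             ["array", "[", "simple", "]", "of", "type"]],
    "simple": [["integer"], ["char"], ["num", "dotdot", "num"]],
}
first = first_sets(grammar)
print(sorted(first["simple"]))  # ['char', 'integer', 'num']
```

The three alternatives of type have the FIRST sets { integer, char, num }, { ^ } and { array }, which do not overlap; that is what lets the parser choose an alternative from one lookahead token.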
Rules for Predictive Parser
• If A -> α and A -> β, then FIRST(α) and FIRST(β) must be disjoint.
• ε-production: used as the default when the lookahead matches no other alternative.
stmt -> begin opt_stmts end
opt_stmts -> stmt_list opt_stmts | ε
Left Recursion
• Left recursion => the parser loops forever, because the procedure for A calls itself without consuming any input.
A -> A α | β
expr -> expr + term | term
• Rewriting...
A -> β R
R -> α R | ε
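The rewriting rule can be sketched as a small Python function. It handles only immediate left recursion in a single nonterminal; the list-of-symbols encoding (with an empty list for ε) and the name eliminate_left_recursion are assumptions for this illustration.

```python
def eliminate_left_recursion(nt, alternatives, new_nt):
    """Rewrite  A -> A α1 | ... | β1 | ...  as  A -> β R,  R -> α R | ε.

    nt is A, new_nt is the fresh nonterminal R, and each alternative is a
    list of symbols. An empty list stands for ε.
    """
    # α parts: alternatives that start with A, with the leading A removed.
    alphas = [alt[1:] for alt in alternatives if alt and alt[0] == nt]
    # β parts: alternatives that do not start with A.
    betas = [alt for alt in alternatives if not alt or alt[0] != nt]
    if not alphas:
        return {nt: alternatives}       # nothing to rewrite
    return {
        nt: [beta + [new_nt] for beta in betas],
        new_nt: [alpha + [new_nt] for alpha in alphas] + [[]],
    }

result = eliminate_left_recursion(
    "expr",
    [["expr", "+", "term"], ["expr", "-", "term"], ["term"]],
    "rest",
)
for lhs, alts in result.items():
    print(lhs, "->", " | ".join(" ".join(a) or "ε" for a in alts))
# expr -> term rest
# rest -> + term rest | - term rest | ε
```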
Example
Left-recursive grammar:
expr -> expr + term
expr -> expr - term
expr -> term
term -> 0 | 1 | 2 | ... | 9
After removing left recursion:
expr -> term rest
rest -> + term rest | - term rest | ε
term -> 0 | 1 | 2 | ... | 9
Semantic Actions
expr -> term rest
rest -> + term { print('+'); } rest
      | - term { print('-'); } rest
      | ε
term -> 0 { print('0'); }
      | 1 { print('1'); }
      ...
expr -> term rest
rest -> + term { print('+'); } rest
      | - term { print('-'); } rest
      | ε
term -> 0 { print('0'); }
      ...

procedure expr;
begin
    term(); rest();
end;

procedure rest;
begin
    if lookahead = '+' then begin
        match('+'); term(); print('+'); rest();
    end
    else if lookahead = '-' then begin
        match('-'); term(); print('-'); rest();
    end;
end;
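A runnable Python sketch of this translator, including a term procedure for single digits. Assumptions for this illustration only: tokens are single non-blank characters, "$" marks the end of input, output is collected in a list instead of being printed directly, and the names Translator, ParseError and to_postfix are hypothetical.

```python
class ParseError(Exception):
    """Raised on a syntax error in the input."""

class Translator:
    def __init__(self, text):
        # One token per non-blank character; "$" is an assumed end marker.
        self.tokens = [c for c in text if not c.isspace()] + ["$"]
        self.pos = 0
        self.out = []   # collected output of the print actions

    @property
    def lookahead(self):
        return self.tokens[self.pos]

    def match(self, t):
        if self.lookahead == t:
            self.pos += 1
        else:
            raise ParseError(f"expected {t!r}, found {self.lookahead!r}")

    def expr(self):
        # expr -> term rest
        self.term()
        self.rest()

    def rest(self):
        # rest -> + term {print('+')} rest | - term {print('-')} rest | ε
        if self.lookahead in ("+", "-"):
            op = self.lookahead
            self.match(op)
            self.term()
            self.out.append(op)   # the action runs after term, before rest
            self.rest()
        # otherwise: ε-production, consume nothing

    def term(self):
        # term -> 0 {print('0')} | ... | 9 {print('9')}
        if self.lookahead.isdigit():
            d = self.lookahead
            self.match(d)
            self.out.append(d)
        else:
            raise ParseError(f"digit expected, found {self.lookahead!r}")

def to_postfix(text):
    translator = Translator(text)
    translator.expr()
    translator.match("$")   # the whole input must be consumed
    return " ".join(translator.out)

print(to_postfix("9 - 5 + 2"))  # 9 5 - 2 +
```

The two operator branches of rest are merged here because they differ only in the operator; the order of match, term, action and rest is the same as in the pseudocode.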