Compiler Design 22. ANTLR AST Traversal (AST as Input, AST Grammars)
Kanat Bolazar, April 13, 2010
Chars → (Lexer) → Tokens → (Parser) → AST → (Tree Parser) → ...
ANTLR Syntax
• Grammar file, name.g:
    /** doc comment */
    kind grammar name;
    options {…}
    tokens {…}
    scopes…
    @header {…}
    @members {…}
    rules…
• One rule:
    /** doc comment */
    rule[String s, int z] returns [int x, int y] throws E
    options {…}
    scopes
    @init {…}
    @after {…}
        :
        |
        ;
        catch [Exception e] {…}
        finally {…}
• Trees:
    ^(root child1 … childN)
What is LL(*)?
• Natural extension to LL(k) lookahead DFA: allow a cyclic DFA that can skip ahead past common prefixes to see what follows
• Analogy: like trying to decide which line to get in at the movies: the line is long and you can't see the sign from the back, so you run ahead to read the sign
• Predict, then proceed normally with the LL parse
• No need to specify k a priori
• Weakness: can't deal with recursive left-prefixes
    ticket_line : PEOPLE+ STAR WARS 9
                | PEOPLE+ AVATAR 2
                ;
LL(*) Example
    s : ID+ ':' 'x'
      | ID+ '.' 'y'
      ;

    void s() {
        int alt = 0;
        while ( LA(1)==ID ) consume();   // skip past the common ID+ prefix
        if ( LA(1)==':' ) alt = 1;
        if ( LA(1)=='.' ) alt = 2;
        switch (alt) {
            case 1 : …
            case 2 : …
            default : error;
        }
    }
• Note: 'x', 'y' not in prediction DFA
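The prediction step for rule s can be sketched in plain Java. This is a minimal illustration and not the code ANTLR generates: the class name LLStarSketch, the method predict, and the use of strings as tokens are all assumptions made for the example. It only looks ahead and never consumes input, which is the part of the slide's pseudocode that decides the alternative.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of LL(*) prediction for: s : ID+ ':' 'x' | ID+ '.' 'y' ; */
class LLStarSketch {
    /**
     * Returns 1 or 2 for the predicted alternative, or 0 if neither applies.
     * Tokens are plain strings; "ID" stands for any identifier token.
     */
    static int predict(List<String> tokens) {
        int i = 0;
        // Cyclic DFA state: loop over the common ID+ prefix, however long it is.
        while (i < tokens.size() && tokens.get(i).equals("ID")) {
            i++;
        }
        if (i == 0 || i == tokens.size()) {
            return 0; // ID+ needs at least one ID, and a token must follow it
        }
        // The first token after the prefix decides; 'x' and 'y' are never inspected.
        switch (tokens.get(i)) {
            case ":": return 1;
            case ".": return 2;
            default:  return 0;
        }
    }

    public static void main(String[] args) {
        System.out.println(predict(Arrays.asList("ID", "ID", "ID", ":", "x"))); // 1
        System.out.println(predict(Arrays.asList("ID", ".", "y")));             // 2
    }
}
```

No fixed k appears anywhere: the loop runs as far as the ID+ prefix extends, which is why k does not have to be chosen in advance.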
Tree Rewrite Rules
• Maps an input grammar fragment to an output tree grammar fragment
    grammar T;
    options {output=AST;}
    stat : 'return' expr ';' -> ^('return' expr) ;
    // one tree: 'int' is the root and every ID is a child
    decl : 'int' ID (',' ID)* -> ^('int' ID+) ;
    // alternative version of decl: one ^('int' ID) tree per ID
    decl : 'int' ID (',' ID)* -> ^('int' ID)+ ;
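The difference between ^('int' ID+) and ^('int' ID)+ is easiest to see in the resulting tree shapes. The sketch below builds both by hand. DeclNode is a hypothetical stand-in for a tree node, not ANTLR's CommonTree, and the method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical minimal tree node; not ANTLR's CommonTree. */
class DeclNode {
    final String text;
    final List<DeclNode> children = new ArrayList<>();

    DeclNode(String text, DeclNode... kids) {
        this.text = text;
        children.addAll(Arrays.asList(kids));
    }

    /** LISP-style rendering: a leaf is its text, an inner node is (root child1 ... childN). */
    String toStringTree() {
        if (children.isEmpty()) return text;
        StringBuilder sb = new StringBuilder("(").append(text);
        for (DeclNode c : children) sb.append(' ').append(c.toStringTree());
        return sb.append(')').toString();
    }
}

class DeclRewriteSketch {
    /** Shape produced by ^('int' ID+): one root with all IDs as children. */
    static DeclNode oneTree(List<String> ids) {
        DeclNode root = new DeclNode("int");
        for (String id : ids) root.children.add(new DeclNode(id));
        return root;
    }

    /** Shape produced by ^('int' ID)+: a separate small tree for each ID. */
    static List<DeclNode> treePerId(List<String> ids) {
        List<DeclNode> trees = new ArrayList<>();
        for (String id : ids) trees.add(new DeclNode("int", new DeclNode(id)));
        return trees;
    }

    public static void main(String[] args) {
        List<String> ids = Arrays.asList("a", "b", "c");    // as if parsing: int a, b, c
        System.out.println(oneTree(ids).toStringTree());    // (int a b c)
        for (DeclNode t : treePerId(ids)) {
            System.out.println(t.toStringTree());           // (int a), then (int b), then (int c)
        }
    }
}
```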
Template Rewrite Rules
• Reference template name with attribute assignments as args:
    grammar T;
    options {output=template;}
    s : ID '=' INT ';' -> assign(x={$ID.text},y={$INT.text}) ;
• Template assign is defined like this:
    group T;
    assign(x,y) ::= "<x> := <y>;"
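To make the effect of the assign template concrete, here is a rough Java imitation of what filling in its two attributes amounts to. It does not use the StringTemplate library: TemplateSketch and render are hypothetical names, and the simple text substitution is an assumption that only covers plain attribute references such as the ones in assign.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for template rendering; not the StringTemplate API. */
class TemplateSketch {
    /**
     * Replaces each <name> in the template with its attribute value.
     * Naive on purpose: values that themselves contain <name> are not handled.
     */
    static String render(String template, Map<String, String> attrs) {
        String out = template;
        for (Map.Entry<String, String> e : attrs.entrySet()) {
            out = out.replace("<" + e.getKey() + ">", e.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> attrs = new LinkedHashMap<>();
        attrs.put("x", "count"); // plays the role of $ID.text
        attrs.put("y", "42");    // plays the role of $INT.text
        System.out.println(render("<x> := <y>;", attrs)); // count := 42;
    }
}
```

For the input count = 42; the rule s would therefore emit count := 42; with the parser supplying only the attribute values and the template holding the output syntax.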
ANTLR AST (Abstract Syntax Tree) Processing
• ANTLR allows creation and manipulation of ASTs
• 1. Generate an AST (file.mj → AST in memory):
    grammar MyLanguage;
    options { output = AST; ASTLabelType = CommonTree; }
• 2. Traverse, process AST → AST:
    tree grammar TypeChecker;
    options { tokenVocab = MyLanguage; output = AST; ASTLabelType = CommonTree; }
• 3. AST → action (Java):
    tree grammar Interpreter;
    options { tokenVocab = MyLanguage; }
AST Processing: Calculator 2, 3
• ANTLR expression evaluator (calculator) examples:
• http://www.antlr.org/wiki/display/ANTLR3/Expression+evaluator
• We are interested in the examples that build an AST and then evaluate (interpret) that AST.
• These are in calculator.zip, as examples 2 and 3.
• Expr → AST → Eval
    grammar Expr;
    options { output=AST; ASTLabelType=CommonTree; }

    tree grammar Eval;
    options { tokenVocab=Expr; ASTLabelType=CommonTree; }
    grammar Expr;
    options { output=AST; ASTLabelType=CommonTree; }

    prog: ( stat {System.out.println($stat.tree.toStringTree());} )+ ;

    stat: expr NEWLINE        -> expr
        | ID '=' expr NEWLINE -> ^('=' ID expr)
        | NEWLINE             ->
        ;

    expr: multExpr (('+'^|'-'^) multExpr)* ;

    multExpr : atom ('*'^ atom)* ;

    atom: INT
        | ID
        | '('! expr ')'!
        ;

    tree grammar Eval;
    options { tokenVocab=Expr; ASTLabelType=CommonTree; }

    @header {
    import java.util.HashMap;
    }

    @members {
    HashMap memory = new HashMap();
    }

    prog: stat+ ;

    stat: expr {System.out.println($expr.value);}
        | ^('=' ID expr) {memory.put($ID.text, new Integer($expr.value));}
        ;

    expr returns [int value]
        : ^('+' a=expr b=expr) {$value = a+b;}
        | ^('-' a=expr b=expr) {$value = a-b;}
        | ^('*' a=expr b=expr) {$value = a*b;}
        | ID
          {
          Integer v = (Integer)memory.get($ID.text);
          if ( v!=null ) $value = v.intValue();
          else System.err.println("undefined var "+$ID.text);
          }
        | INT {$value = Integer.parseInt($INT.text);}
        ;
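What the Eval tree grammar does can be mirrored by a hand-written tree walker. The sketch below is only an analogy in plain Java, not ANTLR output: ExprNode and EvalSketch are hypothetical names, nodes are binary or leaves, and a leaf is treated as INT when it starts with a digit and as ID otherwise.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical AST node: an operator with two children, or a leaf (INT or ID). */
class ExprNode {
    final String text;
    final ExprNode left, right;

    ExprNode(String text) { this(text, null, null); }

    ExprNode(String text, ExprNode left, ExprNode right) {
        this.text = text;
        this.left = left;
        this.right = right;
    }
}

class EvalSketch {
    /** Same role as the memory map in the Eval grammar's @members section. */
    final Map<String, Integer> memory = new HashMap<>();

    /** Mirrors rule expr: one case per tree pattern. */
    int expr(ExprNode t) {
        if (t.left != null) {                       // ^(op a b)
            int a = expr(t.left), b = expr(t.right);
            switch (t.text) {
                case "+": return a + b;
                case "-": return a - b;
                case "*": return a * b;
                default: throw new IllegalArgumentException("unknown operator " + t.text);
            }
        }
        if (Character.isDigit(t.text.charAt(0))) {  // INT
            return Integer.parseInt(t.text);
        }
        Integer v = memory.get(t.text);             // ID
        if (v != null) return v;
        System.err.println("undefined var " + t.text);
        return 0;                                   // value stays at the int default
    }

    /** Mirrors rule stat: ^('=' ID expr) stores a value, a bare expr is evaluated. */
    Integer stat(ExprNode t) {
        if (t.text.equals("=")) {
            memory.put(t.left.text, expr(t.right));
            return null;
        }
        return expr(t);
    }

    public static void main(String[] args) {
        EvalSketch ev = new EvalSketch();
        // x = 3 + 4   as the tree (= x (+ 3 4))
        ev.stat(new ExprNode("=", new ExprNode("x"),
                new ExprNode("+", new ExprNode("3"), new ExprNode("4"))));
        // x * 2       as the tree (* x 2)
        System.out.println(ev.stat(new ExprNode("*", new ExprNode("x"), new ExprNode("2")))); // 14
    }
}
```

The tree grammar removes exactly this kind of boilerplate: the pattern matching on node shape is generated, and only the actions in braces are written by hand.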
AST → AST, AST → Template
• The ANTLR Tree construction page has examples of processing ASTs:
• AST → AST: can be used for typechecking or for transformations (such as taking the derivative of a polynomial or formula)
• AST → Java (action): often the final step, after which the AST is no longer needed
• AST → Template: can simplify the Java actions when the output is templatized
• Please see the Calculator examples as well. They show which files have to be shared so that tree grammars can be used.
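As a small illustration of an AST → AST pass, the derivative case mentioned above can be sketched as a recursive function that returns a new tree. This is hand-written Java under stated assumptions, not an ANTLR tree grammar: DNode and DerivativeSketch are hypothetical names, only + and * are supported, and the result is not simplified.

```java
/** Hypothetical AST node: an operator with two children, or a leaf. */
class DNode {
    final String text;
    final DNode left, right;

    DNode(String text) { this(text, null, null); }

    DNode(String text, DNode left, DNode right) {
        this.text = text;
        this.left = left;
        this.right = right;
    }

    /** LISP-style rendering, in the same spirit as toStringTree(). */
    String toStringTree() {
        if (left == null) return text;
        return "(" + text + " " + left.toStringTree() + " " + right.toStringTree() + ")";
    }
}

class DerivativeSketch {
    /** Builds the tree of d(t)/d(var) using the sum and product rules. */
    static DNode d(DNode t, String var) {
        if (t.left == null) {
            // Leaf: the variable itself becomes 1, anything else is constant.
            return new DNode(t.text.equals(var) ? "1" : "0");
        }
        switch (t.text) {
            case "+": // (a + b)' = a' + b'
                return new DNode("+", d(t.left, var), d(t.right, var));
            case "*": // (a * b)' = a' * b + a * b'
                return new DNode("+",
                        new DNode("*", d(t.left, var), t.right),
                        new DNode("*", t.left, d(t.right, var)));
            default:
                throw new IllegalArgumentException("unsupported operator " + t.text);
        }
    }

    public static void main(String[] args) {
        // 3*x + 5 as the tree (+ (* 3 x) 5)
        DNode f = new DNode("+",
                new DNode("*", new DNode("3"), new DNode("x")),
                new DNode("5"));
        System.out.println(d(f, "x").toStringTree()); // (+ (+ (* 0 x) (* 3 1)) 0)
    }
}
```

The input tree is left untouched and a new tree comes out, which is the same contract a tree grammar with output=AST follows.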
Our Tree Grammar
• Look at sample output from our AST generator (syntax_test_ast.txt); the source line is on the left, the AST it produces on the right:
     9. program X27                          (program X27
    10.
    11. // constants
    12. final int CONST = 25;                (final (TYP int) CONST 25)
    13. final char CH = '\n';                (final (TYP char) CH '\n')
    14. final notype[] B3 = 35;              (final (ARRAY notype) B3 35)
    15.
    16. // classes (types)
    17. class Helper {                       (class Helper
    18.   // only variable declarations...
    19.   int x;                               (VARLIST (VAR (TYP int) x)
    20.   char y;                                       (VAR (TYP char) y)
    21.   foo[] bar;                                    (VAR (ARRAY foo) bar)))
    22. }
• We can create our tree grammar from this
• Also look at the imaginary tokens in your AST generation (for example TYP, ARRAY, VAR and VARLIST in the output above, which do not appear in the source text)