Lecture 7 CUP: An LALR Parser Generator for Java
1. Introduction and Example
• CUP:
    • a system for generating LALR parsers from simple specifications.
    • like yacc, but written in Java: it uses specifications that include embedded Java code, and produces parsers that are implemented in Java.
An example CUP grammar (1. preamble)
    /* CUP specification for a simple expression evaluator (no actions) */
    import java_cup.runtime.*;

    // Preliminaries to set up and use the scanner.
    init with {: scanner.init(); :};
    scan with {: return scanner.next_token(); :};
2. Declaration of terminals and nonterminals
    /* Terminals (tokens returned by the scanner). */
    terminal PLUS, MINUS, TIMES, DIVIDE, MOD;
    terminal UMINUS, LPAREN, RPAREN, SEMI;
    terminal Integer NUMBER;

    /* Non terminals */
    non terminal expr_list, expr_part;
    non terminal Integer expr, term, factor;
    // no type ==> no value associated with the symbol
3. Precedences and associativity
    /* Precedences */
    precedence left PLUS, MINUS;
    precedence left TIMES, DIVIDE, MOD;
    precedence left UMINUS;
Rules:
• Terminals declared on the same line have the same precedence.
• A has lower precedence than B iff the line declaring A appears above the line declaring B.
• Possible associativities: left, right and nonassoc.
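The effect of these three precedence lines can be illustrated outside CUP with a small hand-written evaluator. This is a minimal sketch, not CUP's own mechanism (CUP uses the declarations to resolve conflicts while building the LALR table). The class name PrecedenceSketch and its level table are hypothetical; the level numbers simply follow the declaration order, so a later line binds tighter.

```java
import java.util.Map;

/** Illustration only: precedence climbing over the three levels declared above. */
class PrecedenceSketch {
    // Level 1: PLUS, MINUS. Level 2: TIMES, DIVIDE, MOD. Level 3: UMINUS.
    private static final Map<Character, Integer> LEVEL =
        Map.of('+', 1, '-', 1, '*', 2, '/', 2, '%', 2);
    private static final int UMINUS_LEVEL = 3;

    private final String src;
    private int pos = 0;

    private PrecedenceSketch(String src) { this.src = src.replace(" ", ""); }

    static int eval(String text) {
        PrecedenceSketch p = new PrecedenceSketch(text);
        int v = p.expr(1);
        if (p.pos != p.src.length()) {
            throw new IllegalArgumentException("unexpected input at " + p.pos);
        }
        return v;
    }

    // Parses an expression whose binary operators all have level >= minLevel.
    private int expr(int minLevel) {
        int left = operand();
        while (pos < src.length()) {
            Integer level = LEVEL.get(src.charAt(pos));
            if (level == null || level < minLevel) break;
            char op = src.charAt(pos++);
            // "left" associativity: the right operand may only contain tighter operators.
            int right = expr(level + 1);
            left = apply(op, left, right);
        }
        return left;
    }

    private int operand() {
        if (pos >= src.length()) throw new IllegalArgumentException("operand expected");
        char c = src.charAt(pos);
        if (c == '-') { pos++; return -expr(UMINUS_LEVEL); }   // unary minus binds tightest
        if (c == '(') { pos++; int v = expr(1); expect(')'); return v; }
        int start = pos;
        while (pos < src.length() && Character.isDigit(src.charAt(pos))) pos++;
        if (start == pos) throw new IllegalArgumentException("number expected at " + pos);
        return Integer.parseInt(src.substring(start, pos));
    }

    private void expect(char c) {
        if (pos >= src.length() || src.charAt(pos) != c) {
            throw new IllegalArgumentException("'" + c + "' expected at " + pos);
        }
        pos++;
    }

    private static int apply(char op, int a, int b) {
        switch (op) {
            case '+': return a + b;
            case '-': return a - b;
            case '*': return a * b;
            case '/': return a / b;
            default:  return a % b;   // '%'
        }
    }

    public static void main(String[] args) {
        System.out.println(eval("1 + 2 * 3"));  // 7: TIMES binds tighter than PLUS
        System.out.println(eval("8 - 3 - 2"));  // 3: MINUS is left associative
        System.out.println(eval("-2 * 3"));     // -6: UMINUS binds tighter than TIMES
    }
}
```

Changing a level in the table, or recursing with `level` instead of `level + 1` for a right-associative operator, changes the grouping in the same way that editing the precedence lines would.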
4. The production rules
    /* The grammar without actions */
    expr_list ::= expr_list expr_part
                | expr_part;
    expr_part ::= expr SEMI;
    expr ::= expr PLUS expr
           | expr MINUS expr
           | expr TIMES expr
           | expr DIVIDE expr
           | expr MOD expr
           | MINUS expr %prec UMINUS
           | LPAREN expr RPAREN
           | NUMBER
           ;
Grammar file format in summary
• preamble
    • declarations that specify how the parser is to be generated and supply parts of the runtime code.
• terminal and nonterminal declarations (name and type)
• precedence and associativity of terminals
• grammar rules
How to use CUP
• prepare a grammar file (say, parser.cup)
• invoke:
    • java java_cup.Main < parser.cup, or
    • java java_cup.Main parser.cup
• Two files are then produced:
    • sym.java // contains one constant declaration for each terminal (and nonterminal); used by the scanner to refer to terminals.
    • parser.java // implements the parser
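To make the role of sym.java concrete, here is a hedged sketch of the kind of constant class it contains and how scanner code might refer to it. The class SymSketch, the helper codeFor and all numeric values are invented for illustration; the real numbering is assigned by CUP when it generates the file and is not shown in these slides.

```java
/** Hypothetical stand-in for a generated sym.java; the numeric values are made up. */
class SymSketch {
    static final int EOF = 0;
    static final int PLUS = 2;
    static final int MINUS = 3;
    static final int TIMES = 4;
    static final int DIVIDE = 5;
    static final int MOD = 6;
    static final int LPAREN = 7;
    static final int RPAREN = 8;
    static final int SEMI = 9;
    static final int NUMBER = 10;

    /** Maps a single-character lexeme to its terminal code, or -1 if it is not one. */
    static int codeFor(char c) {
        switch (c) {
            case '+': return PLUS;
            case '-': return MINUS;
            case '*': return TIMES;
            case '/': return DIVIDE;
            case '%': return MOD;
            case '(': return LPAREN;
            case ')': return RPAREN;
            case ';': return SEMI;
            default:  return -1;
        }
    }

    public static void main(String[] args) {
        // The scanner names terminals through the constants, never through raw numbers.
        System.out.println(codeFor('+') == PLUS);
    }
}
```

Referring to terminals only through named constants keeps the scanner valid when the grammar changes and the codes are regenerated.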
Grammar rules with action codes
    /* The grammar */
    expr_list ::= expr_list expr_part
                | expr_part;
    expr_part ::= expr:e
                  {: System.out.println("= " + e); :}
                  SEMI
                ;
Grammar rules with action codes (continued)
    expr ::= expr:e1 PLUS expr:e2
             {: RESULT = new Integer(e1.intValue() + e2.intValue()); :}
           | expr:e1 MINUS expr:e2
             {: RESULT = new Integer(e1.intValue() - e2.intValue()); :}
           | expr:e1 TIMES expr:e2
             {: RESULT = new Integer(e1.intValue() * e2.intValue()); :}
           | expr:e1 DIVIDE expr:e2
             {: RESULT = new Integer(e1.intValue() / e2.intValue()); :}
           | expr:e1 MOD expr:e2
             {: RESULT = new Integer(e1.intValue() % e2.intValue()); :}
           | NUMBER:n
             {: RESULT = n; :}
           | MINUS expr:e
             {: RESULT = new Integer(0 - e.intValue()); :}
             %prec UMINUS
           | LPAREN expr:e RPAREN
             {: RESULT = e; :}
           ;
Variables available in the action code
• RESULT: bound to the value of the head (left-hand side) nonterminal.
• the name (label) assigned to each symbol on the rhs.
• nameleft, nameright: of type int, giving the position of the symbol's lexeme sequence in the input.
    expr ::= expr:e1 PLUS expr:e2
             {: RESULT = new Integer(e1.intValue() + e2.intValue()); :}
    // here e1left and e1right are both usable.
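The relationship between labels, positions and RESULT can be sketched in plain Java. The class below is a minimal stand-in, not the real java_cup.runtime.Symbol, and reducePlus is a hypothetical helper that imitates what the action for expr ::= expr:e1 PLUS expr:e2 does. It assumes that the reduced symbol spans from the left position of the first rhs symbol to the right position of the last one, and that positions are character offsets with an exclusive right end; both points are assumptions made for illustration.

```java
/** Illustration only: how labelled values and left/right positions combine on a reduction. */
class PositionSketch {
    /** Minimal stand-in for a parser symbol: a position span plus a semantic value. */
    static final class Sym {
        final int left, right;
        final Object value;
        Sym(int left, int right, Object value) {
            this.left = left;
            this.right = right;
            this.value = value;
        }
    }

    // Imitates: expr ::= expr:e1 PLUS expr:e2 {: RESULT = e1 + e2; :}
    static Sym reducePlus(Sym e1, Sym plus, Sym e2) {
        Integer result = (Integer) e1.value + (Integer) e2.value;  // plays the role of RESULT
        // e1.left corresponds to e1left, e2.right to e2right.
        return new Sym(e1.left, e2.right, result);
    }

    public static void main(String[] args) {
        // Input text "12 + 30" with assumed character offsets.
        Sym e1 = new Sym(0, 2, 12);
        Sym plus = new Sym(3, 4, null);   // PLUS has no declared type, so no value
        Sym e2 = new Sym(5, 7, 30);
        Sym sum = reducePlus(e1, plus, e2);
        System.out.println(sum.value + " spans [" + sum.left + ", " + sum.right + ")");
        // prints: 42 spans [0, 7)
    }
}
```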
The java_cup.runtime.Symbol class
    public class Symbol {
        public Symbol(int id, int l, int r, Object o)
        public Symbol(int id, Object o)
        public Symbol(int id, int l, int r)
        …
        public int sym;          // kind of Symbol
        public int parse_state;  // used while staying in stack
        boolean used_by_parser = false;
        public int left, right;  // left and right position in input stream
        public Object value;     // field for storing semantic value
        public String toString();
    }
The scanner interface
    package java_cup.runtime;
    public interface Scanner {
        /** Return the next token, or <code>null</code> on end-of-file. */
        public Symbol next_token() throws java.lang.Exception;
    }
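A hand-written scanner that follows this contract can be sketched as below. To stay self-contained the sketch does not import java_cup.runtime: Token and TokenSource are local stand-ins for Symbol and Scanner, and the token codes are hypothetical placeholders for the constants a generated sym class would provide. Only a subset of the terminals (NUMBER, PLUS, SEMI) is handled.

```java
/** Self-contained sketch of a scanner that returns null at end of input. */
class ScannerSketch {
    // Hypothetical token codes; a real scanner would use the generated sym constants.
    static final int NUMBER = 1, PLUS = 2, SEMI = 3;

    /** Stand-in for Symbol: token kind, position span, optional semantic value. */
    static final class Token {
        final int sym, left, right;
        final Object value;
        Token(int sym, int left, int right, Object value) {
            this.sym = sym;
            this.left = left;
            this.right = right;
            this.value = value;
        }
    }

    /** Stand-in for the Scanner interface shown above. */
    interface TokenSource {
        Token nextToken() throws Exception;
    }

    static final class StringScanner implements TokenSource {
        private final String input;
        private int pos = 0;

        StringScanner(String input) { this.input = input; }

        @Override
        public Token nextToken() {
            while (pos < input.length() && Character.isWhitespace(input.charAt(pos))) pos++;
            if (pos >= input.length()) return null;   // end of input
            int start = pos;
            char c = input.charAt(pos);
            if (Character.isDigit(c)) {
                while (pos < input.length() && Character.isDigit(input.charAt(pos))) pos++;
                return new Token(NUMBER, start, pos, Integer.valueOf(input.substring(start, pos)));
            }
            pos++;
            switch (c) {
                case '+': return new Token(PLUS, start, pos, null);
                case ';': return new Token(SEMI, start, pos, null);
                default:
                    throw new IllegalArgumentException(
                        "unexpected character '" + c + "' at " + start);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        TokenSource s = new StringScanner("12 + 30;");
        for (Token t = s.nextToken(); t != null; t = s.nextToken()) {
            System.out.println(t.sym + " [" + t.left + "," + t.right + ") " + t.value);
        }
    }
}
```

Only NUMBER carries a value here, which matches the declarations in the example grammar where NUMBER is the only typed terminal.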
java_cup.runtime.lr_parser
    public abstract class lr_parser {
        public lr_parser() {}
        public lr_parser(Scanner s) { this(); setScanner(s); }
        private Scanner _scanner;
        public void setScanner(Scanner s) { _scanner = s; }
        public Scanner getScanner() { return _scanner; }
        public void user_init() throws java.lang.Exception { %initWithCode }
        public Symbol scan() throws java.lang.Exception {
            Symbol sym = getScanner().next_token();
            return (sym != null) ? sym : new Symbol(EOF_sym());
        }
        public void report_fatal_error(String message, Object info) throws java.lang.Exception
        public Symbol parse() throws java.lang.Exception