
Compiler Design 15. ANTLR, ANTLRWorks Lexer and Parser Generator

Kanat Bolazar, March 11, 2010


Presentation Transcript


  1. Compiler Design 15. ANTLR, ANTLRWorks Lexer and Parser Generator (Kanat Bolazar, March 11, 2010)

  2. ANTLR
     • ANTLR is a popular lexer and parser generator written in Java.
     • It accepts LL(*) grammars and does top-down parsing.
     • Similarities with LL(1) grammars:
        • Does top-down parsing
        • Grammar has to be rewritten to remove left recursion
        • Uses lookahead tokens to decide which path to take
        • You can think of it as recursive-descent parsing.
     • Differences:
        • How far we can look ahead is not constrained
     • CommonTokenStream defines LA(k) and LT(k):
        • Both look ahead to the k-th next token
        • LA(k) returns an int, the token code
        • LT(k) returns a Token object
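The lookahead idea on this slide can be sketched without ANTLR at all. The class below is a hand-written stand-in, not ANTLR's CommonTokenStream: the class name, the token codes and the tiny rule `expr : INT (PLUS INT)*` are all assumptions made for illustration. It only shows how LA(k) and LT(k) relate, and how a recursive-descent method uses LA(1) to choose a path.

```java
import java.util.Arrays;
import java.util.List;

/** Hand-written stand-in for a token stream; NOT ANTLR's CommonTokenStream. */
public class LookaheadSketch {
    // Hypothetical token codes, chosen only for this sketch.
    public static final int EOF = -1, INT = 1, PLUS = 2;

    /** Minimal token: a type code plus the matched text. */
    public static final class Token {
        public final int type;
        public final String text;
        public Token(int type, String text) { this.type = type; this.text = text; }
    }

    private final List<Token> tokens;
    private int pos = 0; // index of the next unconsumed token

    public LookaheadSketch(List<Token> tokens) { this.tokens = tokens; }

    /** LT(k): the k-th token ahead (k = 1 is the next token), or an EOF token. */
    public Token LT(int k) {
        int i = pos + k - 1;
        return i < tokens.size() ? tokens.get(i) : new Token(EOF, "<EOF>");
    }

    /** LA(k): only the integer type code of LT(k). */
    public int LA(int k) { return LT(k).type; }

    public void consume() { pos++; }

    /** expr : INT (PLUS INT)* ; written as a loop, so no left recursion is needed. */
    public int expr() {
        int value = atom();
        while (LA(1) == PLUS) { // one token of lookahead picks the path
            consume();
            value += atom();
        }
        return value;
    }

    /** atom : INT ; */
    public int atom() {
        if (LA(1) != INT) {
            throw new IllegalStateException("expected INT, got " + LT(1).text);
        }
        int v = Integer.parseInt(LT(1).text);
        consume();
        return v;
    }

    public static void main(String[] args) {
        LookaheadSketch p = new LookaheadSketch(Arrays.asList(
            new Token(INT, "1"), new Token(PLUS, "+"), new Token(INT, "2")));
        System.out.println(p.LA(2) == PLUS); // prints true
        System.out.println(p.expr());        // prints 3
    }
}
```

A parser generated by ANTLR follows the same pattern, one method per grammar rule, but may look further ahead than one token when a decision needs it.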

  3. ANTLRWorks
     • ANTLRWorks is an IDE (integrated development environment) for ANTLR.
     • It has many nice features:
        • Automatically fills in common token definitions
        • Has standard IDE features like syntax highlighting
        • Regexp FSM (lexer machine) for tokens
     • Has a very nice debugger which can show:
        • input and output
        • parse tree and AST (abstract syntax tree)
        • call (rule) stack and events
        • grammar rule that is being executed
        • forward and backward execution
     • But it also has some bugs in its automatic features

  4. Running ANTLR: Inputs, Steps
     • You need three files before you run ANTLR:
        • a grammar file, Xyz.g (Microjava.g)
        • a Java test runner, Test.java
        • a test input file, such as sample.mj
     • There are three steps to running ANTLR:
        • antlr: Generate lexer and parser classes:
           • XyzLexer.java
           • XyzParser.java
        • javac: Compile these two and Test.java:
           • XyzLexer.class, XyzParser.class
           • Test.class
        • java: Run Test with your input file
     • javac and java need two items in CLASSPATH:
        • antlrworks JAR file (antlrworks-1.3.1.jar)
        • current directory (.)

  5. Step 1. ANTLR
     • You may have an antlr executable:
        • antlr Xyz.g
        • Make sure the file Xyz.g declares "grammar Xyz" (grammar name and file name must match)
     • If you only have a JAR file instead, use:
        • java -jar antlr-3.2.jar Xyz.g
     • This creates two Java class source code files:
        • XyzLexer.java
        • XyzParser.java
     • By default, these files go in the current directory
     • You can instead state where *.java should go:
        • antlr -o src Xyz.g
        • This time, *.java will appear in the src directory.
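As a small illustration of step 1, the sketch below assembles the JAR form of the command and lists the source files it is expected to produce. The class and method names are hypothetical, and combining `-o` with the `java -jar` form is an assumption: the slide shows `-o` only with the `antlr` executable. The code only builds the argument list; it does not run ANTLR.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative helper (hypothetical names); builds the command but does not run it. */
public class AntlrCommandSketch {

    /** Builds "java -jar <antlrJar> [-o <outputDir>] <grammarFile>"; outputDir may be null. */
    public static List<String> antlrCommand(String antlrJar, String outputDir, String grammarFile) {
        List<String> cmd = new ArrayList<>();
        cmd.add("java");
        cmd.add("-jar");
        cmd.add(antlrJar);
        if (outputDir != null) { // assumption: -o works the same in the JAR form
            cmd.add("-o");
            cmd.add(outputDir);
        }
        cmd.add(grammarFile);
        return cmd;
    }

    /** The two sources the slide says are generated for "grammar <name>". */
    public static List<String> generatedSources(String grammarName) {
        return Arrays.asList(grammarName + "Lexer.java", grammarName + "Parser.java");
    }

    public static void main(String[] args) {
        // prints: java -jar antlr-3.2.jar -o src Xyz.g
        System.out.println(String.join(" ", antlrCommand("antlr-3.2.jar", "src", "Xyz.g")));
        // prints: [XyzLexer.java, XyzParser.java]
        System.out.println(generatedSources("Xyz"));
    }
}
```

The resulting list could be handed to `java.lang.ProcessBuilder` from a build script, which is one way to automate this step.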

  6. Step 2. Compile with javac
     • In addition to the generated lexer and parser, you need your own runner:
        • Test.java
        • See ANTLR examples online for runner examples.
     • Before javac, set the CLASSPATH environment variable to contain:
        • . (the current directory)
        • antlrworks-1.3.1.jar
     • In Linux/Unix, under bash, you may do:
        • export CLASSPATH=.:antlrworks-1.3.1.jar
        • In practice, give the full path to the antlrworks JAR file rather than the bare file name shown here.
     • On Windows, you may want to go to System / Environment Variables (and replace : with ; here)
     • Now (go under src if needed), compile everything:
        • javac Test.java
        • As Test uses the other classes, everything will be compiled.
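The ":" versus ";" difference mentioned above is exposed in Java as `File.pathSeparator`. The sketch below joins the two CLASSPATH entries with a given separator; the class name and the JAR location `/path/to/` are placeholders, not values from the slides.

```java
import java.io.File;

/** Illustrative helper (hypothetical name) for assembling a CLASSPATH value. */
public class ClasspathSketch {

    /** Joins classpath entries with the given separator (":" on Unix, ";" on Windows). */
    public static String classpath(String separator, String... entries) {
        return String.join(separator, entries);
    }

    public static void main(String[] args) {
        // Placeholder location; use the real full path to your antlrworks JAR.
        String jar = "/path/to/antlrworks-1.3.1.jar";
        // File.pathSeparator is the separator for the platform this runs on.
        System.out.println(classpath(File.pathSeparator, ".", jar));
    }
}
```

The same value can also be passed per command with the `-cp` option of javac and java instead of exporting CLASSPATH.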

  7. Step 3. Run with java
     • Again, set the CLASSPATH environment variable as before
     • Go under src if needed (if you used the -o option)
     • Run your test, giving your input file:
        • java Test < input.txt
        • java Test < input.txt > output.txt
        • java Microjava < sample.mj
     • A grammar with no evaluation:
        • will be quiet if everything is OK
        • will only give syntax errors if the input is not good
     • A grammar with output will display the output.
     • ANTLR doesn't allow running interactively
        • It buffers input and output
        • You can't enter 1+1 and see 2 right away
        • All output will be seen after EOF
        • (Control-Z in Windows, Control-D in Linux/Unix)
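The buffering behaviour described on this slide can be imitated with plain Java. The sketch below is not ANTLR code: it is a hypothetical runner that reads all of its input up to EOF first and only then evaluates each line as a sum such as 1+1, which is why nothing would appear until Control-D or Control-Z is typed.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

/** Stand-in for a buffering runner (hypothetical; no ANTLR classes involved). */
public class BufferedInputSketch {

    /** Reads every byte until EOF before returning, so no early output is possible. */
    public static String readAll(InputStream in) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buf.write(chunk, 0, n);
            }
            return new String(buf.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Evaluates each non-blank line of the form a+b+... and returns one result per line. */
    public static String evaluateAll(String input) {
        StringBuilder out = new StringBuilder();
        for (String line : input.split("\\R")) {
            if (line.trim().isEmpty()) continue;
            int sum = 0;
            for (String term : line.split("\\+")) {
                sum += Integer.parseInt(term.trim());
            }
            out.append(sum).append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String input = readAll(System.in);   // blocks until EOF
        System.out.print(evaluateAll(input)); // all results appear only now
    }
}
```

Run as `java BufferedInputSketch < input.txt`, it behaves like the redirected commands above; typed at a terminal, it stays silent until EOF.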

  8. ANTLRWorks, Other Java IDE
     • Instead of these steps, you can use ANTLRWorks.
     • To run under ANTLRWorks, just use its debugger.
     • It has ANTLR inside, and knows how to set the CLASSPATH for compiling and running.
     • *.java files produced by ANTLRWorks will be different, as they contain debugger commands.
     • To run ANTLR under a Java IDE, you may be able to define custom build rules for *.g files.
     • You should add the antlrworks JAR file to your project, to have the ANTLR runtime libraries.
     • Make sure the libraries are used during both compilation and running.

  9. Next Steps
     • We will next see:
        • A demonstration of using ANTLR (three steps)
        • ANTLRWorks screenshots
     • We will also look at some grammar examples:
        • Calculator without evaluation
        • Calculator with evaluation
        • Calculator with AST
        • MicroJava lexer
        • Starting steps for MicroJava parser
