CS 153: Concepts of Compiler Design November 3 Class Meeting. Department of Computer Science San Jose State University Fall 2014 Instructor: Ron Mak www.cs.sjsu.edu/~mak. Unofficial Field Trip. Computer History Museum in Mt. View http://www.computerhistory.org/
CS 153: Concepts of Compiler Design. November 3 Class Meeting. Department of Computer Science, San Jose State University. Fall 2014. Instructor: Ron Mak, www.cs.sjsu.edu/~mak
Unofficial Field Trip • Computer History Museum in Mt. View • http://www.computerhistory.org/ • Saturday, November 8, 11:30 – closing time • Special free admission. • Do a self-guided tour of the new Revolution exhibit. • See a life-size working model of Charles Babbage’s Difference Engine in operation, a hand-cranked mechanical computer designed in the early 1800s. • Experience a fully restored IBM 1401 mainframe computer from the early 1960s in operation. • General info: http://en.wikipedia.org/wiki/IBM_1401 • My summer seminar: http://www.cs.sjsu.edu/~mak/1401/ • Restoration: http://ed-thelen.org/1401Project/1401RestorationPage.html
Unofficial Field Trip • The new Revolution exhibit is now open! • Walk through a timeline of the First 2000 Years of Computing History. • Historic computer systems, data processing equipment, and other artifacts. • Small theater presentations. (Pictured: Hollerith Census Machine, Atanasoff-Berry Computer)
Unofficial Field Trip • Babbage Difference Engine, fully operational. • Hand-cranked mechanical computer. • Computed polynomial functions. • Designed by Charles Babbage in the early to mid 1800s. • Arguably the world’s first computer scientist, lived 1791-1871. • He wasn’t able to build it because he lost his funding. • Live demo at 1:00 • His plans survived and this working model was built. • Includes a working printer! http://www.computerhistory.org/babbage/
Unofficial Field Trip • IBM 1401 computer, fully restored and operational • A small transistor-based mainframe computer. • Extremely popular with small businesses in the late 1950s through the mid 1960s • Maximum of 16K bytes of memory. • 800 card/minute card reader (wire brushes). • 600 line/minute line printer (impact). • 6 magnetic tape drives, no disk drives.
Tesla Headquarters Visit
• Friday, November 14 at 2:00
• 3500 Deer Creek Road, Palo Alto
• Attendees (name, role, department):
    Bourne Joseph, student, Computer Engineering
    Caires Debra, professor, Computer Science
    Dhillon Gurleen, student, Computer Engineering
    Dimov Dima, student, Computer Science
    Estell Khalil, student, Computer Engineering
    Flores Pedro, student, Computer Science
    Gallegos Kevin, student, Computer Engineering
    Inzunza Oscar, student, Computer Science
    Joshi Hardik, student, Computer Science
    Kang Jack, student, Computer Science
    Kannan Prakasam, student, Computer Science
    Koumis Alex, student, Computer Engineering
    Long Camille, student, Computer Engineering
    Mak Ron, professor, Computer Science
    Narahari Srikanth, student, Computer Science
    Nguyen Duy, student, Computer Engineering
    Patel Jay, student, Computer Science
    Shah Shukan, student, Computer Science
    Thorpe David, student, Computer Science
    Torrefranca Daniel, student, Computer Engineering
    Tran Duc, student, Computer Engineering
    Tran Hung, student, Computer Engineering
    Tsui Helen, student, Computer Engineering
    Wei Eileen, student, Computer Science
    Wong Nelson, student, Computer Engineering
Pcl
• Pcl is a teeny, tiny subset of Pascal.
• Use JavaCC to generate a Pcl parser and integrate it with our Pascal interpreter’s
    • symbol table components
    • parse tree components
• We’ll be able to parse a program and print the symbol table and the parse tree
    • in our favorite XML format
• Sample program test.pcl:
    PROGRAM test;
    VAR
        i, j, k : integer;
        x, y, z : real;
    BEGIN
        i := 1;
        j := i + 3;
        x := i + j;
        y := 314.15926e-02 + i - j + k;
        z := x + i*j/k - x/y/z
    END.
Pcl Challenges • Get the JJTree parse trees to build properly with respect to operator precedence. • Use embedded definite node descriptors! • Decorate the parse tree with data type information. • Can be done as the tree is built, or as a separate pass. • You can use the visitor pattern to implement the pass. • Hook up to the symbol table and parse tree printing classes from the Pascal interpreter.
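The type-decoration pass can be sketched in plain Java, independent of JJTree. Everything named below (`TypeDecorationSketch`, `Node`, `Var`, `BinOp`, `TypeSetter`) is a hypothetical stand-in, not the JJTree-generated `SimpleNode` or the course's visitor class. The typing rule assumed here is Pascal's: `/` always yields real, and the other arithmetic operators yield real if either operand is real.

```java
import java.util.Map;

public class TypeDecorationSketch {
    enum Type { INTEGER, REAL }

    // Hypothetical visitor interface; JJTree generates its own when asked to.
    interface Visitor {
        Type visit(Var node);
        Type visit(BinOp node);
    }

    // Hypothetical parse tree node. The 'type' field is the decoration.
    static abstract class Node {
        Type type;
        abstract Type accept(Visitor v);
    }

    static class Var extends Node {
        final String name;
        Var(String name) { this.name = name; }
        @Override Type accept(Visitor v) { return v.visit(this); }
    }

    static class BinOp extends Node {
        final char op;
        final Node left, right;
        BinOp(char op, Node left, Node right) {
            this.op = op; this.left = left; this.right = right;
        }
        @Override Type accept(Visitor v) { return v.visit(this); }
    }

    // Separate pass: visit children first, then set this node's type.
    static class TypeSetter implements Visitor {
        private final Map<String, Type> symTab;  // stand-in for a symbol table
        TypeSetter(Map<String, Type> symTab) { this.symTab = symTab; }

        @Override public Type visit(Var node) {
            node.type = symTab.get(node.name);
            return node.type;
        }

        @Override public Type visit(BinOp node) {
            Type l = node.left.accept(this);
            Type r = node.right.accept(this);
            boolean real = node.op == '/' || l == Type.REAL || r == Type.REAL;
            node.type = real ? Type.REAL : Type.INTEGER;
            return node.type;
        }
    }

    // Decorates the tree for i*j/k, with i, j, k declared integer.
    static Type demo() {
        Map<String, Type> symTab =
            Map.of("i", Type.INTEGER, "j", Type.INTEGER, "k", Type.INTEGER);
        Node tree = new BinOp('/',
            new BinOp('*', new Var("i"), new Var("j")), new Var("k"));
        return tree.accept(new TypeSetter(symTab));
    }

    public static void main(String[] args) {
        System.out.println(demo());  // prints REAL
    }
}
```

The inner `i*j` node is decorated INTEGER and the outer division REAL, which is why the types have to be computed bottom-up.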
Pcl, cont’d
    options {
        JJTREE_OUTPUT_DIRECTORY="src/wci/frontend";
        NODE_EXTENDS="wci.intermediate.icodeimpl.ICodeNodeImpl";
        ...
    }

    PARSER_BEGIN(PclParser)
    ...
    public class PclParser
    {
        // Create and initialize the symbol table stack.
        symTabStack = SymTabFactory.createSymTabStack();
        Predefined.initialize(symTabStack);
        ...
        // Parse a Pcl program.
        Reader reader = new FileReader(sourceFilePath);
        PclParser parser = new PclParser(reader);
        SimpleNode rootNode = parser.program();
        ...
Pcl, cont’d
        ...
        // Print the cross-reference table.
        CrossReferencer crossReferencer = new CrossReferencer();
        crossReferencer.print(symTabStack);

        // Visit the parse tree nodes to decorate them with type information.
        TypeSetterVisitor typeVisitor = new TypeSetterVisitor();
        rootNode.jjtAccept(typeVisitor, null);

        // Create and initialize the ICode wrapper for the parse tree.
        ICode iCode = ICodeFactory.createICode();
        iCode.setRoot(rootNode);
        programId.setAttribute(ROUTINE_ICODE, iCode);

        // Print the parse tree.
        ParseTreePrinter treePrinter = new ParseTreePrinter(System.out);
        treePrinter.print(symTabStack);
    }
    PARSER_END(PclParser)
Demo
JavaCC Grammar Repository • Check these out to get ideas and models: http://mindprod.com/jgloss/javacc.html
Syntax Error Handling and JavaCC • Detect the error. • JavaCC does that based on the grammar in the .jj file. • Flag the error. • JavaCC does that for you with its error messages. • Recover from the error so you can continue parsing. • You set this up using JavaCC.
Token Errors
• By default, JavaCC throws an exception whenever it encounters a bad token.
• Token errors are considered extremely serious and will stop the translation unless you take care to recover from them.
• Example LOGO program that moves a cursor on a screen:
    FORWARD 20
    RIGHT 120
    FORWARD 20
Token Errors, cont’d
    SKIP : {
        " " | "\n" | "\r" | "\r\n"
    }

    TOKEN : {
        <FORWARD : "FORWARD">
      | <RIGHT : "RIGHT">
      | <DIGITS : (["1"-"9"])+ (["0"-"9"])*>
    }
• What happens if we feed the tokenizer bad input?
    FORWARD 20
    LEFT 120
    FORWARD 20
logo_tokenizer.jj
Token Errors, cont’d
• One way to recover from a token error is to skip over the erroneous token.
    public static void main(String[] args) throws Exception
    {
        java.io.Reader reader = new java.io.FileReader(args[0]);
        SimpleCharStream scs = new SimpleCharStream(reader);
        LogoTokenManager mgr = new LogoTokenManager(scs);

        while (true) {
            try {
                if (readAllTokens(mgr).kind == EOF) break;
            }
            catch (TokenMgrError tme) {
                System.out.println("TokenMgrError: " + tme.getMessage());
                skipTo(' ');
            }
        }
    }
Token Errors, cont’d
    private static void skipTo(char delimiter)
        throws java.io.IOException
    {
        String skipped = "";
        char ch;

        System.out.print("*** SKIPPING ... ");
        while ((ch = input_stream.readChar()) != delimiter) {
            skipped += ch;
        }
        System.out.println("skipped '" + skipped + "'");
    }
logo_skip_chars.jj
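The same skip-to-delimiter recovery can be tried without JavaCC. The sketch below is a hand-written scanner, not the generated `LogoTokenManager`; the class name `SkipRecoverySketch` and the `<skipped ...>` marker are hypothetical, and its digit rule is simplified (it accepts a leading zero, unlike the `DIGITS` token above). Unlike `readChar()`, it also stops at the end of the input instead of throwing.

```java
import java.util.ArrayList;
import java.util.List;

public class SkipRecoverySketch {
    // Scans LOGO input. On a bad character, skips ahead to the next
    // delimiter (or end of input), records what was skipped, and continues.
    static List<String> scan(String input, char delimiter) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            char ch = input.charAt(pos);
            if (ch == ' ' || ch == '\n' || ch == '\r') {
                pos++;                                   // SKIP characters
            } else if (input.startsWith("FORWARD", pos)) {
                tokens.add("FORWARD");
                pos += "FORWARD".length();
            } else if (input.startsWith("RIGHT", pos)) {
                tokens.add("RIGHT");
                pos += "RIGHT".length();
            } else if (Character.isDigit(ch)) {
                int start = pos;
                while (pos < input.length()
                        && Character.isDigit(input.charAt(pos))) {
                    pos++;
                }
                tokens.add(input.substring(start, pos));
            } else {
                // Token error: skip to the delimiter.
                int start = pos;
                while (pos < input.length()
                        && input.charAt(pos) != delimiter) {
                    pos++;
                }
                tokens.add("<skipped '" + input.substring(start, pos) + "'>");
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(scan("FORWARD 20\nLEFT 120\nFORWARD 20", ' '));
        // prints [FORWARD, 20, <skipped 'LEFT'>, 120, FORWARD, 20]
    }
}
```

Note that the `120` that belonged to the bad `LEFT` command survives as a stray token, so the token stream is clean but no longer forms valid commands.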
Synchronize the Parser • Skipping over a bad token isn’t a complete solution. • The parser still needs to synchronize at the next good token and then attempt to continue parsing.
Synchronize the Parser, cont’d
• First, add an error token to represent any invalid input characters:
    SKIP : {
        " "
    }

    TOKEN : {
        <FORWARD : "FORWARD">
      | <RIGHT : "RIGHT">
      | <DIGITS : (["1"-"9"])+ (["0"-"9"])*>
      | <EOL : "\r" | "\n" | "\r\n">
      | <ERROR : ~["\r", "\n"]>    // any character except \r or \n
    }
Synchronize the Parser, cont’d
• A program consists of one or more move (FORWARD) and turn (RIGHT) commands.
• Must also allow for an erroneous command.
    void Program() : {}
    {
        (
            try {
                  MoveForward() {System.out.println("Processed Move FORWARD");}
                | TurnRight()   {System.out.println("Processed Turn RIGHT");}
                | Error()       {handleError(token);}
            }
            catch (ParseException ex) {
                handleError(ex.currentToken);
            }
        )+
    }
Synchronize the Parser, cont’d
• The Error() production rule is invoked for the <ERROR> token.
• The <ERROR> token consumes the bad character.
    void MoveForward() : {} { <FORWARD> <DIGITS> <EOL> }

    void TurnRight() : {} { <RIGHT> <DIGITS> <EOL> }

    void Error() : {} { <ERROR> }
Synchronize the Parser, cont’d
• The JAVACODE header precedes pure Java code that’s inserted into the generated parser.
    JAVACODE
    String handleError(Token token)
    {
        System.out.println("*** ERROR: Line " + token.beginLine +
                           " after \"" + token.image + "\"");
        Token t;

        // Synchronize the parser to the next "good" token (EOL).
        // Also stop at EOF so input that ends without an EOL cannot loop forever.
        do {
            t = getNextToken();
        } while (t.kind != EOL && t.kind != EOF);

        return t.image;
    }
• You can do this better with a complete synchronization set!
logo_synchronize.jj
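The remark about a complete synchronization set can be illustrated outside JavaCC. This is a minimal sketch of panic-mode recovery over a list of token kinds; the `Kind` enum and the `synchronize` helper are hypothetical names, not part of the JavaCC API. The assumption is that the set holds every token at which parsing can safely resume: EOL plus the tokens that start a command.

```java
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

public class SyncSetSketch {
    enum Kind { FORWARD, RIGHT, DIGITS, EOL, ERROR, EOF }

    // Returns the index of the first token at or after 'pos' whose kind
    // is in syncSet. EOF (or the end of the list) always stops the scan.
    static int synchronize(List<Kind> tokens, int pos, Set<Kind> syncSet) {
        while (pos < tokens.size()
                && tokens.get(pos) != Kind.EOF
                && !syncSet.contains(tokens.get(pos))) {
            pos++;
        }
        return pos;
    }

    // A bad command with no EOL after it, followed by a good command:
    //   <ERROR> 120 FORWARD 20 <EOL> <EOF>
    static final List<Kind> TOKENS = List.of(
        Kind.ERROR, Kind.DIGITS, Kind.FORWARD, Kind.DIGITS, Kind.EOL, Kind.EOF);

    public static void main(String[] args) {
        Set<Kind> eolOnly = EnumSet.of(Kind.EOL);
        Set<Kind> fullSet = EnumSet.of(Kind.EOL, Kind.FORWARD, Kind.RIGHT);

        System.out.println(synchronize(TOKENS, 0, eolOnly));  // prints 4
        System.out.println(synchronize(TOKENS, 0, fullSet));  // prints 2
    }
}
```

With EOL alone, recovery swallows the good FORWARD command along with the bad input. With the larger set, the parser can resume at FORWARD and still process that command.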
Repair the Parse Tree
• After the parser recovers from an error, you may want to remove a partially-built AST node.
• The erroneous production must call jjtree.popNode().
    JAVACODE
    String handleError(Token token) #void
    {
        System.out.println("*** ERROR: Line " + token.beginLine +
                           " after \"" + token.image + "\"");
        Token t;

        // Stop at EOF too, so input that ends without an EOL cannot loop forever.
        do {
            t = getNextToken();
        } while (t.kind != EOL && t.kind != EOF);

        jjtree.popNode();
        return t.image;
    }
logo_tree_recover.jjt
Debugging the Parser
• Add the option to debug the parser.
• Print production rule method calls and returns.
• Print which tokens are consumed.
    options {
        DEBUG_PARSER=true;
    }
Review: Interpreter vs. Compiler • Same front end • parser, scanner, tokens • Same intermediate tier • symbol tables, parse trees
Review: Interpreter vs. Compiler, cont’d • Different back end operations. • Interpreter: Use the symbol tables and parse trees to execute the source program. • executor • Compiler: Use the symbol tables and parse trees to generate an object program for the source program. • code generator
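The two back ends can be contrasted on one small tree. In this sketch the `Node` class is a hypothetical stand-in for the shared parse tree, `execute` plays the interpreter's executor, and `generate` plays the compiler's code generator. The emitted instructions are stack-machine mnemonics (`ldc`, `iadd`, `imul`) chosen for illustration; only integer constants, `+`, and `*` are handled.

```java
import java.util.ArrayList;
import java.util.List;

public class BackEndSketch {
    // Hypothetical parse tree node: a leaf has op == 0 and holds a value.
    static class Node {
        final char op;
        final int value;
        final Node left, right;

        Node(int value) {
            this.op = 0; this.value = value; this.left = null; this.right = null;
        }

        Node(char op, Node left, Node right) {
            this.op = op; this.value = 0; this.left = left; this.right = right;
        }
    }

    // Interpreter back end: walk the tree and compute the result now.
    static int execute(Node n) {
        if (n.op == 0) return n.value;
        int l = execute(n.left);
        int r = execute(n.right);
        return n.op == '+' ? l + r : l * r;
    }

    // Compiler back end: walk the same tree in postorder and emit
    // instructions that will compute the result later.
    static void generate(Node n, List<String> code) {
        if (n.op == 0) {
            code.add("ldc " + n.value);
            return;
        }
        generate(n.left, code);
        generate(n.right, code);
        code.add(n.op == '+' ? "iadd" : "imul");
    }

    // Tree for 1 + 3*4.
    static Node sample() {
        return new Node('+', new Node(1),
                        new Node('*', new Node(3), new Node(4)));
    }

    public static void main(String[] args) {
        System.out.println(execute(sample()));  // prints 13
        List<String> code = new ArrayList<>();
        generate(sample(), code);
        System.out.println(code);  // prints [ldc 1, ldc 3, ldc 4, imul, iadd]
    }
}
```

Both methods traverse the tree the same way; the only difference is whether a node's action is performed immediately or written out as object code.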
Target Machines • A compiler’s back end code generator produces object code for a target machine. • Target machine: the Java Virtual Machine (JVM) • Object language: the Jasmin assembly language • The Jasmin assembler translates the assembly language program into .class files. • The java command runs the JVM, which loads and executes .class files.
Target Machines, cont’d • Instead of using javac to compile a source program written in Java into a .class file, use your compiler to compile a source program written in your chosen language into a Jasmin object program, and then use the Jasmin assembler to create the .class file. • No matter what language the source program was originally written in, once it’s been compiled into a .class file, the JVM will be able to load and execute it. • The JVM runs on a wide variety of hardware platforms.
Java Virtual Machine (JVM) Architecture • Java stack • runtime stack • Heap area • dynamically allocated objects • automatic garbage collection • Class area • code for methods • constant pool • Native method stacks • support native methods, e.g., written in C • (not shown in the diagram)
Java Virtual Machine Architecture, cont’d • The runtime stack contains stack frames. • Stack frame = activation record. • Each stack frame contains • local variables array • operand stack • program counter (PC) What is missing in the JVM that we had in our Pascal interpreter?
The JVM’s Java Runtime Stack • Each method invocation pushes a stack frame. • Equivalent to the activation record of our Pascal interpreter. • The stack frame currently on top of the runtime stack is the active stack frame. • A stack frame is popped off when the method returns, possibly leaving behind a return value on top of the stack.
Contents of a JVM Stack Frame • Operand stack • for doing computations • Local variables array • equivalent to the memory map in our Pascal interpreter’s activation record • Program counter (PC) • keeps track of the currently executing instruction
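How the three parts of a frame cooperate can be seen in a toy simulation. This is a minimal sketch, not how a real JVM is implemented: the class `StackFrameSketch` and its `run` method are hypothetical, instructions are plain strings, and only five integer instructions (`ldc`, `iload`, `istore`, `iadd`, `ireturn`) are handled.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class StackFrameSketch {
    // Runs one method's code in a single simulated stack frame.
    static int run(String[] code, int localsSize) {
        int[] locals = new int[localsSize];            // local variables array
        Deque<Integer> operands = new ArrayDeque<>();  // operand stack
        int pc = 0;                                    // program counter

        while (pc < code.length) {
            String[] parts = code[pc].split(" ");
            switch (parts[0]) {
                case "ldc":      // push a constant
                    operands.push(Integer.parseInt(parts[1]));
                    break;
                case "iload":    // push the value of a local variable
                    operands.push(locals[Integer.parseInt(parts[1])]);
                    break;
                case "istore":   // pop into a local variable
                    locals[Integer.parseInt(parts[1])] = operands.pop();
                    break;
                case "iadd":     // pop two values, push their sum
                    operands.push(operands.pop() + operands.pop());
                    break;
                case "ireturn":  // pop the return value and leave the frame
                    return operands.pop();
                default:
                    throw new IllegalArgumentException(
                        "Unknown instruction: " + code[pc]);
            }
            pc++;
        }
        throw new IllegalStateException("Code ended without ireturn");
    }

    // i := 1; j := i + 3; return j   (i is local 0, j is local 1)
    static final String[] PROGRAM = {
        "ldc 1", "istore 0",
        "iload 0", "ldc 3", "iadd", "istore 1",
        "iload 1", "ireturn"
    };

    public static void main(String[] args) {
        System.out.println(run(PROGRAM, 2));  // prints 4
    }
}
```

Variables are addressed by slot number rather than by name, which is the main difference from a name-keyed memory map.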
JVM Instructions • Load and store values • Arithmetic operations • Type conversions • Object creation and management • Runtime stack management (push/pop values) • Branching • Method call and return • Throwing exceptions • Concurrency
Jasmin Assembler • Download from: • http://jasmin.sourceforge.net/ • The site also includes: • User Guide • Instruction set • Sample programs
Example Jasmin Program
• hello.j:
    .class public HelloWorld
    .super java/lang/Object

    .method public static main([Ljava/lang/String;)V
        .limit stack 2
        .limit locals 1

        getstatic java/lang/System/out Ljava/io/PrintStream;
        ldc "Hello World."
        invokevirtual java/io/PrintStream/println(Ljava/lang/String;)V
        return
    .end method
• Assemble:
    java -jar jasmin.jar hello.j
• Execute:
    java HelloWorld
Demo