290 likes | 455 Views
Introduction. Compiler is tool: which translate notations from one system to another, usually from source code (high level code) to machine code (object code, target code, low level code). Compiler. Source Code. Compiler. Target Code. ?. Error Messages. What is Involved.
E N D
Introduction Compiler is tool: which translate notations from one system to another, usually from source code (high level code) to machine code (object code, target code, low level code). D. M. Akbar Hussain: Department of Software & Media Technology
Compiler Source Code Compiler Target Code ? Error Messages D. M. Akbar Hussain: Department of Software & Media Technology
What is Involved • Programming Languages • Components of a Compiler • Applications • Formal Languages D. M. Akbar Hussain: Department of Software & Media Technology
Programming Languages • We use natural languages to communicate • We use programming languages to speak with computers D. M. Akbar Hussain: Department of Software & Media Technology
Component of Compilers • Analysis • Lexical Analysis • Syntax Analysis • Semantic Analysis • Synthesis • Intermediate Code Generation • Code Optimization • Code Generation D. M. Akbar Hussain: Department of Software & Media Technology
The Input • Read string input • sequence of characters • Character set: • ASCII • ISO Latin-1 • ISO 10646 (16-bit = unicode) Ada, Java • Others (EBCDIC, JIS, etc) D. M. Akbar Hussain: Department of Software & Media Technology
The Output • Tokens: kind, name • Punctuation ( ) ; , [ ] • Operators + - ** := • Keywords begin end if while try catch • Identifiers Square_Root • String literals “press Enter to continue” • Character literals ‘x’ • Numeric literals • Integer • Floating_point D. M. Akbar Hussain: Department of Software & Media Technology
Lexical Analysis • Free form languages All modern languages are free form • White space, tabs, new line, carriage return does not matter just Ignore these. • Ordering of token is only important • Fixed format languages • Layout is critical • Fortran, label in cols 1-6 • COBOL, area A B • Lexical analyzer must know about layout to find tokens D. M. Akbar Hussain: Department of Software & Media Technology
Relevant Formalisms • Type 3 (Regular) Grammars • Regular Expressions • Finite State Machines Useful for program construction, even if hand-written D. M. Akbar Hussain: Department of Software & Media Technology
Interface to Lexical Analyzer • Either:Convert entire file to a file of tokens • Lexical analyzer is separate phase • Or:Parser calls lexical analyzer to supply next token • This approach avoids extra I/O • Parser builds tree incrementally, using successive tokens as tree nodes D. M. Akbar Hussain: Department of Software & Media Technology
Performance Issues • Speed • Lexical analysis can become bottleneck • Minimize processing per character • Skip blanks fast • I/O is also an issue (read large blocks) • We compile frequently • Compilation time is important • Especially during development • Communicate with parser through global variables D. M. Akbar Hussain: Department of Software & Media Technology
Syntax Analysis (SA) D. M. Akbar Hussain: Department of Software & Media Technology
Semantic Analyserzer D. M. Akbar Hussain: Department of Software & Media Technology
Intermediate Code Generation D. M. Akbar Hussain: Department of Software & Media Technology
Code Optimization D. M. Akbar Hussain: Department of Software & Media Technology
Code Generation D. M. Akbar Hussain: Department of Software & Media Technology
Data structure tools • Syntax tree: • Literal table: • Symbol table: D. M. Akbar Hussain: Department of Software & Media Technology
Error handler • One of the difficult part of a compiler. • Must handle a wide range of errors • Must handle multiple errors. • Must not get stuck. • Must not get into an infinite loop (typical simple-minded strategy:count errors, stop if count gets too high). D. M. Akbar Hussain: Department of Software & Media Technology
Kinds of errors ? D. M. Akbar Hussain: Department of Software & Media Technology
Sample compiler • TINY:Pages 22-26, 4-pass compiler for the TINY language based on Pascal. • C-Minus based on C :Pages 26-27 and Appendix A. D. M. Akbar Hussain: Department of Software & Media Technology
TINY Example read x; if x > 0 then fact := 1; repeat fact := fact * x; x := x - 1 until x = 0; write fact end D. M. Akbar Hussain: Department of Software & Media Technology
C-Minus Example int fact( int x ) { if (x > 1) return x * fact(x-1); else return 1; } void main( void ) { int x; x = read(); if (x > 0) write( fact(x) ); } D. M. Akbar Hussain: Department of Software & Media Technology
Structure of the TINY Compiler globals.h main.c util.h util.c scan.h scan.c parse.h parse.c symtab.h symtab.c analyze.h analyze.c code.h code.c cgen.h cgen.c D. M. Akbar Hussain: Department of Software & Media Technology
Conditional Compilation Options • NO_PARSE: Builds a scanner-only compiler. • NO_ANALYZE: Builds a compiler that parses and scans only. • NO_CODE: Builds a compiler that performs semantic analysis, but generates no code. D. M. Akbar Hussain: Department of Software & Media Technology
Listing Options (built in - not flags) • EchoSource: Echoes the TINY source program to the listing, together with line numbers. • TraceScan: Displays information on each token as the scanner recognizes it. • TraceParse: Displays the syntax tree in a linearlised format. • TraceAnalyze: Displays summary information on the symbol table and type checking. • TraceCode: Prints code generation-tracing comments to the code file. D. M. Akbar Hussain: Department of Software & Media Technology