CSC 338: Compiler design and implementation
Dr. Mohamed Ben Othman

Goals:
• Compilers are used every day on most computers.
• Allow students to implement big programming projects.
• Parts of most projects (especially those containing a command language) can be built in the same way compilers are built.

Introduction
• Terminology
• Compiler implementation languages
• Compiler structure
• Example

Terminology
[Figure: C Program → C Compiler → 8086 Machine Code]
• Compilers: translate programs written in high-level languages. The entire program has to be translated before execution (comparable to translating a book).
• Interpreters: translate statement by statement (or line by line); each translated statement is given to the CPU and executed before the next statement is translated (comparable to a human interpreter).
• High-level language
• Source language
• Source code
• Implementation language
• Target machine: the machine for which the translation is done.

How to choose the compiler implementation language
• The first compiler was written in assembly language.
• The compiler of the Fortran 77 programming language was written in Pascal.
• It is preferable to write the compiler in the source language that it translates into machine language.

Translation operation
[Figure: Source Code → Compiler → Object Code → Linker → Executable Code]
Some compilers do not follow this pattern:
• Load-and-go compilers: produce programs that are ready for execution.
• Cross-language compilers: translate between high-level languages.

Compiler structure
[Figure: Source Code → Front End → Intermediate Object Code → Back End → Machine Code]
• A compiler is divided into two parts: the Front End and the Back End.
• The Front End translates a program from the source language to an intermediate language.
• The Back End translates from the intermediate language to the machine language.

The Front End
[Figure: Source Program → Lexical analyzer → Tokens → Syntactic analyzer → Parse tree → Semantic analyzer → Abstract syntax tree → Intermediate code generator → Intermediate code]

The Back End
[Figure: Intermediate code → Machine-independent optimizer → Optimized intermediate code → Object code generator → Object code → Machine-dependent optimizer → Optimized object code]

Lexical analysis
• The lexical analyzer reads the input program as a character stream and produces a stream of lexemes (or token strings) as output.
• The lexical analyzer reads the input program character by character until it has read a complete word (symbol).
• The lexical analyzer searches for the current word in a table (called the symbol table) and adds it if it is not found.
• For each symbol, the lexical analyzer outputs a token. Tokens are generally integer numbers.
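
The steps above can be sketched in a few lines of Python. This is only an illustration under simple assumptions: a "word" is a run of letters and digits or a single other character, the symbol table is a plain dictionary, and the integer token is the word's position in that table. The function name `tokenize` is hypothetical and not part of the course material.

```python
def tokenize(source, symbol_table):
    """Scan `source` character by character and return a list of integer tokens."""
    tokens = []
    i = 0
    while i < len(source):
        ch = source[i]
        if ch.isspace():
            i += 1          # whitespace only separates words
            continue
        # Read one word: a run of letters/digits, or a single other character.
        start = i
        if ch.isalnum():
            while i < len(source) and source[i].isalnum():
                i += 1
        else:
            i += 1
        word = source[start:i]
        # Search the symbol table and add the word if it is not found.
        if word not in symbol_table:
            symbol_table[word] = len(symbol_table)
        # The token is the integer associated with the word (an assumption).
        tokens.append(symbol_table[word])
    return tokens


symbol_table = {}
print(tokenize("sum = sum + x", symbol_table))  # [0, 1, 0, 2, 3]
print(symbol_table)  # {'sum': 0, '=': 1, '+': 2, 'x': 3}
```

Note that both occurrences of `sum` map to the same integer, because the second lookup finds the entry added by the first.
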

Syntactic Analysis
• The syntactic analyzer (or parser) takes as input the token stream produced by the lexical analyzer.
• The parser produces a parse tree.
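
As a rough illustration of a parser turning a token stream into a parse tree, here is a minimal recursive-descent sketch. The grammar (`expr -> operand (OP operand)*`, `operand -> ID | NUM`), the `(kind, text)` token format, and the tuple-based tree are all assumptions made for this example; the slides do not prescribe a parsing method.

```python
def parse_operand(tokens):
    """operand -> ID | NUM. Returns (subtree, remaining tokens)."""
    if not tokens or tokens[0][0] not in ("ID", "NUM"):
        raise SyntaxError("expected an identifier or a number")
    return ("operand", tokens[0]), tokens[1:]


def parse_expr(tokens):
    """expr -> operand (OP operand)*. Returns (parse tree, remaining tokens)."""
    tree, rest = parse_operand(tokens)
    while rest and rest[0][0] == "OP":
        operator = rest[0]
        right, rest = parse_operand(rest[1:])
        # Each operator adds a new level on top of the tree built so far.
        tree = ("expr", tree, operator, right)
    return tree, rest


# Token stream for the expression: Sum + x
tokens = [("ID", "Sum"), ("OP", "+"), ("ID", "x")]
tree, rest = parse_expr(tokens)
print(tree)
# ('expr', ('operand', ('ID', 'Sum')), ('OP', '+'), ('operand', ('ID', 'x')))
```

A token stream that breaks the grammar, such as `Sum +` with nothing after the operator, raises `SyntaxError` instead of producing a tree.
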

Semantic Analysis
• The semantic analyzer determines whether the user program is meaningful.
• A program may respect the syntax rules without respecting the meaning.
• Semantic analysis mainly ensures that data types are used properly.
• Semantic analysis is part of syntactic analysis.
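
A type check on an assignment is one small example of what "data types are used properly" can mean. The sketch below assumes a dictionary of declared variable types and a single rule, borrowed from Pascal, that an Integer value may be assigned to a Real variable but not the reverse. The function name and the variable names are hypothetical.

```python
def assignment_is_valid(declared, target, value_type):
    """Return True if a value of `value_type` may be assigned to `target`."""
    target_type = declared.get(target)  # None if the variable is undeclared
    if target_type == value_type:
        return True
    # Assumed rule: an Integer value can be widened to Real, never the reverse.
    return target_type == "Real" and value_type == "Integer"


declared = {"Average": "Real", "Sum": "Integer"}
print(assignment_is_valid(declared, "Average", "Integer"))  # True
print(assignment_is_valid(declared, "Sum", "Real"))         # False
```

Both assignments are syntactically well formed; only the second one is rejected, which is the distinction between syntax and meaning made above.
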

Detecting errors in the source program
• In all the phases above, the main goal is to determine whether the source program respects the rules of the source language.
• Example:
    for (int i $ 1; i<n; i++) x++;     (this error may be detected at lexical analysis)
    if x > N Y -= 3 else Y += 3;       (this is a syntax error)
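
To show why the first statement can already be rejected at lexical analysis, here is a minimal sketch that flags characters outside the language's alphabet. The set `ALLOWED` is an assumption chosen for this example, not the real character set of any particular language.

```python
import string

# Hypothetical alphabet of the source language (an assumption for this sketch).
ALLOWED = set(string.ascii_letters + string.digits + string.whitespace
              + "+-*/<>=!(){}[];,.")


def lexical_errors(source):
    """Return (position, character) pairs for characters outside ALLOWED."""
    return [(pos, ch) for pos, ch in enumerate(source) if ch not in ALLOWED]


print(lexical_errors("for (int i $ 1; i<n; i++) x++;"))  # [(11, '$')]
print(lexical_errors("if x > N Y -= 3 else Y += 3;"))    # []
```

The second statement passes this check because every character in it is legal; its problem only shows up when the parser looks at the order of the tokens.
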

Intermediate code generation
• The intermediate code generator produces code that does not depend on the target machine.
• The intermediate code has to be very close to machine language and very easy to translate into it.
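
The slides do not name a specific intermediate language. One common choice is three-address code, where every instruction has at most one operator; the sketch below assumes that form. The tuple-based expression tree, the `generate` function, and the temporary names `t1`, `t2`, ... are hypothetical.

```python
import itertools


def generate(tree, code, temps):
    """Emit three-address instructions for `tree` into `code`.

    `tree` is either a name/number given as a string, or a tuple
    (operator, left, right). Returns the name holding the result.
    """
    if isinstance(tree, str):
        return tree
    operator, left, right = tree
    left_name = generate(left, code, temps)
    right_name = generate(right, code, temps)
    temp = f"t{next(temps)}"  # fresh temporary for this result
    code.append(f"{temp} = {left_name} {operator} {right_name}")
    return temp


code = []
result = generate(("+", "Sum", ("*", "x", "2")), code, itertools.count(1))
print(code)    # ['t1 = x * 2', 't2 = Sum + t1']
print(result)  # t2
```

Each emitted line maps almost directly onto one machine instruction, yet nothing in it refers to a particular machine's registers or instruction set.
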

Object Code
• The target code is the machine code.
• Machine code generation is not the same as intermediate code generation.
• Assembly language code may be generated at the same time.

Lexical analysis example: (Pascal program)
PROGRAM AverageNumbers(Input, Output);
CONST
  Amount = 3;
VAR
  Average : Real;
  x : ARRAY[1..Amount] OF Integer;
  i, Sum : Integer;
BEGIN
  x[1]:=3; x[2]:=6; x[3]:=10;
  Sum := 0;
  FOR i := 1 TO Amount DO
    Sum := Sum + x[i];
  Average := Sum/Amount
END. { AverageNumbers }

Lexical analysis result
PROGRAM ID (ID , ID );
CONST ID = NUMLITERAL;
VAR ID : ID ;
ID : ARRAY[NUMLITERAL .. ID ] OF ID ;
ID, ID : ID ;
BEGIN
and so on
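
A result of this kind can be produced with a small classifier built on Python's `re` module. This is a sketch under several assumptions: the keyword set is a hand-picked subset of Pascal, only unsigned integer literals are recognized, and comments in braces are not handled. The names `KEYWORDS` and `classify` are hypothetical.

```python
import re

# Hand-picked subset of Pascal keywords (an assumption for this sketch).
KEYWORDS = {"PROGRAM", "CONST", "VAR", "ARRAY", "OF",
            "BEGIN", "END", "FOR", "TO", "DO"}

# Integer literal, word, two-character symbol, or any other non-space character.
LEXEME = re.compile(r"\d+|[A-Za-z]\w*|:=|\.\.|\S")


def classify(source):
    """Replace each lexeme with its token class, as in the result above."""
    result = []
    for lexeme in LEXEME.findall(source):
        if lexeme.isdigit():
            result.append("NUMLITERAL")
        elif lexeme[0].isalpha():
            # Keywords stand for themselves; every other word is an ID.
            result.append(lexeme.upper() if lexeme.upper() in KEYWORDS else "ID")
        else:
            result.append(lexeme)  # punctuation and operators stay as they are
    return result


print(" ".join(classify("CONST Amount = 3;")))
# CONST ID = NUMLITERAL ;
print(" ".join(classify("Sum := Sum + x[i];")))
# ID := ID + ID [ ID ] ;
```
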

Miscellaneous
• Symbol table: contains all keywords and symbols.
• Symbol table handler: manages the symbol table.
• Error handling: gives a clear description of errors.
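
One possible shape for a symbol table handler is sketched below. It assumes the table is a dictionary that is preloaded with the keywords and then filled with identifiers as they are met; the class name, method names, and the attribute layout are hypothetical.

```python
class SymbolTableHandler:
    """Manages a symbol table that holds both keywords and user symbols."""

    def __init__(self, keywords):
        # The table starts out containing every keyword of the language.
        self._entries = {word: {"kind": "keyword"} for word in keywords}

    def lookup(self, name):
        """Return the entry for `name`, or None if it is not in the table."""
        return self._entries.get(name)

    def insert(self, name, **attributes):
        """Add `name` if it is missing and return its entry."""
        if name not in self._entries:
            self._entries[name] = {"kind": "identifier", **attributes}
        return self._entries[name]


table = SymbolTableHandler(["PROGRAM", "VAR", "BEGIN", "END"])
table.insert("Sum", type="Integer")
print(table.lookup("Sum"))    # {'kind': 'identifier', 'type': 'Integer'}
print(table.lookup("BEGIN"))  # {'kind': 'keyword'}
print(table.lookup("Total"))  # None
```

Keeping keywords and identifiers in one table means the lexical analyzer needs only a single lookup per word to decide what kind of token to produce.
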

Summary:
[Figure: Source Program → Lexical analysis → Syntax analysis → Inter. code generation → Code optimization → Code generation → Target program, with the Symbol table and Error handling shown alongside the phases]