Introduction to Compiling
CSCI 327 Computer Science Theory
Why should I care?
• A compiler is a translator. Lots of other computing problems also involve translating complex data from one format to another.
• If you understand how a compiler works, then you can be a better compiler user.
Phases
source code → lexical analyzer → syntax analyzer → semantic analyzer → intermediate code gen → code optimizer → code generator → machine code
Symbol Table: built and consulted along the way
Preprocessor
• Handles the precompile tasks, such as #include
Lexical Analysis
characters → Lexical Analysis → tokens
• converts characters into tokens
• The input "A = B + 90;" is converted into
  • id : A
  • assignment_operator
  • id : B
  • addition_operator
  • integer : 90
  • semicolon
• based on regular expressions
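A minimal sketch of regular-expression-based tokenizing for the "A = B + 90;" example. The token names follow the slide; the `TOKEN_SPEC` table, the `tokenize` function, and the exact patterns are assumptions made for illustration, not the lexer of any particular compiler.

```python
import re

# Hypothetical token table: (token name, regular expression).
# Names mirror the slide's example; the patterns are illustrative.
TOKEN_SPEC = [
    ("integer", r"\d+"),
    ("id", r"[A-Za-z_]\w*"),
    ("assignment_operator", r"="),
    ("addition_operator", r"\+"),
    ("semicolon", r";"),
    ("skip", r"\s+"),  # whitespace separates tokens but is not a token
]

# One combined pattern with a named group per token kind.
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text):
    """Convert a string of characters into a list of (kind, text) tokens."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at position {pos}")
        kind = match.lastgroup  # name of the group that matched
        if kind != "skip":
            tokens.append((kind, match.group()))
        pos = match.end()
    return tokens


print(tokenize("A = B + 90;"))
# [('id', 'A'), ('assignment_operator', '='), ('id', 'B'),
#  ('addition_operator', '+'), ('integer', '90'), ('semicolon', ';')]
```

Each token keeps its kind and its original text, so later phases can tell `id : A` from `id : B`.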
Syntax Analysis
tokens → Syntax Analysis → parse tree
• converts tokens into a parse tree
• based on Context Free Grammar
• attempts to recover from errors
• begins to build the symbol table
  • adds A and B as variables, but types are not yet known.
• There is no need to progress into Semantic Analysis if there were syntax errors.
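A rough sketch of the idea, assuming a tiny hypothetical grammar (stmt → id = expr ; and expr → operand { + operand }) and tokens given as (kind, text) pairs. The function name `parse_statement` and the tuple-based tree shape are invented for illustration. Unlike a real parser, this one stops at the first syntax error instead of trying to recover.

```python
def parse_statement(tokens):
    """Parse one assignment statement into a parse tree and a symbol table.

    Assumed grammar (illustrative only):
        stmt    -> id '=' expr ';'
        expr    -> operand ('+' operand)*
        operand -> id | integer
    """
    pos = 0
    symbol_table = {}

    def expect(kind):
        nonlocal pos
        if pos >= len(tokens) or tokens[pos][0] != kind:
            found = tokens[pos][0] if pos < len(tokens) else "end of input"
            raise SyntaxError(f"expected {kind}, found {found}")
        value = tokens[pos][1]
        pos += 1
        return value

    def operand():
        if pos < len(tokens) and tokens[pos][0] == "integer":
            return ("num", int(expect("integer")))
        name = expect("id")
        symbol_table.setdefault(name, None)  # variable seen, type not yet known
        return ("id", name)

    def expr():
        node = operand()
        while pos < len(tokens) and tokens[pos][0] == "addition_operator":
            expect("addition_operator")
            node = ("add", node, operand())
        return node

    target = expect("id")
    symbol_table.setdefault(target, None)
    expect("assignment_operator")
    tree = ("assign", ("id", target), expr())
    expect("semicolon")
    return tree, symbol_table


# Tokens for "A = B + 90;" in the form a lexer might produce.
tokens = [
    ("id", "A"), ("assignment_operator", "="), ("id", "B"),
    ("addition_operator", "+"), ("integer", "90"), ("semicolon", ";"),
]
tree, symbols = parse_statement(tokens)
print(tree)     # ('assign', ('id', 'A'), ('add', ('id', 'B'), ('num', 90)))
print(symbols)  # {'A': None, 'B': None}
```

The `None` entries in the symbol table stand for "type not yet known", which is the state the slide describes at the end of this phase.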
Semantic Analysis
parse tree + symbol table → Semantic Analysis → augmented parse tree
semantics = meaning
  int A, B;
  A = B + 90;
• can B be added with 90?
  • yes, if B is an int or float
  • no, if B is a string, etc
• what type of addition is that?
  • int add ≠ float add
• can A be assigned the result of that operation?
• No need to proceed if there are semantic errors.
[Figure: augmented parse tree with nodes stmt, id, equal, expr, op (int-add), id, num]
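The questions above can be sketched as a small type checker over a tuple-based parse tree. The tree shape, the function names `check_expr` and `check_assign`, and the rule that an int may be assigned to a float (but not the reverse) are all assumptions for illustration; real languages differ on these rules.

```python
def check_expr(node, symbol_table):
    """Return (annotated_node, type) for an expression, or raise TypeError."""
    kind = node[0]
    if kind == "num":
        return node, "int"
    if kind == "id":
        return node, symbol_table[node[1]]
    if kind == "add":
        left, left_type = check_expr(node[1], symbol_table)
        right, right_type = check_expr(node[2], symbol_table)
        numeric = ("int", "float")
        if left_type not in numeric or right_type not in numeric:
            raise TypeError(f"cannot add {left_type} and {right_type}")
        # int add is not the same operation as float add, so record which one.
        result = "float" if "float" in (left_type, right_type) else "int"
        return (f"{result}-add", left, right), result
    raise ValueError(f"unknown node kind {kind!r}")


def check_assign(tree, symbol_table):
    """Annotate an ('assign', ('id', name), expr) tree or raise TypeError."""
    _, (_, target), expr = tree
    annotated, expr_type = check_expr(expr, symbol_table)
    target_type = symbol_table[target]
    # Assumption: widening int -> float is allowed, narrowing is not.
    widening = target_type == "float" and expr_type == "int"
    if target_type != expr_type and not widening:
        raise TypeError(f"cannot assign {expr_type} to {target_type} {target}")
    return ("assign", ("id", target), annotated)


# int A, B;  A = B + 90;
tree = ("assign", ("id", "A"), ("add", ("id", "B"), ("num", 90)))
print(check_assign(tree, {"A": "int", "B": "int"}))
# ('assign', ('id', 'A'), ('int-add', ('id', 'B'), ('num', 90)))
```

With `{"A": "int", "B": "string"}` the same call raises `TypeError`, which corresponds to the "no, if B is a string" case.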
Intermediate Code Generation
augmented parse tree → Intermediate Code Gen → 3 addr code
Three address code is easy to optimize compared to assembly.
Source code of
  float A, B, C;
  A = B + C * 90;
Yields 3 address code of
  temp1 = (float) 90;
  temp2 = C * temp1;
  A = B + temp2;
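One way to produce that three address code is a post-order walk of the expression tree that introduces a temporary for every nested result. This is a sketch under simplifying assumptions: the tuple tree shape and the name `to_three_address` are invented here, the top-level expression must be a binary operation, and every variable is assumed to have the same type as the assignment target.

```python
def to_three_address(target, expr, target_type):
    """Lower `target = expr` to a list of three address statements."""
    code = []
    counter = 0

    def new_temp():
        nonlocal counter
        counter += 1
        return f"temp{counter}"

    def operand(node):
        """Return a name holding the node's value, emitting code if needed."""
        kind = node[0]
        if kind == "id":
            return node[1]
        if kind == "num":
            if target_type == "float":
                # Integer literal in a float expression: emit the conversion.
                temp = new_temp()
                code.append(f"{temp} = (float) {node[1]};")
                return temp
            return str(node[1])
        # Nested operation: lower the children first, then name the result.
        text = binary(node)
        temp = new_temp()
        code.append(f"{temp} = {text};")
        return temp

    def binary(node):
        symbol = {"add": "+", "mul": "*"}[node[0]]
        left = operand(node[1])
        right = operand(node[2])
        return f"{left} {symbol} {right}"

    code.append(f"{target} = {binary(expr)};")
    return code


# float A, B, C;  A = B + C * 90;
expr = ("add", ("id", "B"), ("mul", ("id", "C"), ("num", 90)))
for line in to_three_address("A", expr, "float"):
    print(line)
# temp1 = (float) 90;
# temp2 = C * temp1;
# A = B + temp2;
```

Every emitted statement has at most one operator and at most three names, which is what makes this form convenient for the optimizer that follows.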
Optimization
Input of
  temp1 = (float) 90;
  temp2 = C * temp1;
  A = B + temp2;
• temp1 and temp2 should probably be in registers
• If the 90.0 gets used again soon, then try to save the contents of that register.
• reorder adjacent independent statements so that they sit closer to other appearances of the same operands
Code Generation
• output could be either relocatable machine code or assembly code
• Plug the input A = B + temp2 into a template to generate (assuming temp2 is already held in register R2)
  MOV R1, B     move B into register R1
  ADD R1, R2    add temp2 to R1
  MOV A, R1     move result of operation to memory
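The template idea can be sketched as simple string substitution. The `ADD_TEMPLATE` list, the `generate_add` function, and the `registers` mapping are hypothetical names for illustration; the MOV/ADD mnemonics are the slide's generic assembly, not a specific instruction set.

```python
# Template for `dest = left + right`, using the slide's generic mnemonics.
ADD_TEMPLATE = [
    "MOV {reg}, {left}",
    "ADD {reg}, {right}",
    "MOV {dest}, {reg}",
]


def generate_add(dest, left, right, registers, scratch="R1"):
    """Fill in the add template.

    `registers` maps names that already live in a register to that register,
    e.g. {"temp2": "R2"}; any other name is treated as a memory operand.
    """
    return [
        line.format(
            reg=scratch,
            left=registers.get(left, left),
            right=registers.get(right, right),
            dest=dest,
        )
        for line in ADD_TEMPLATE
    ]


# A = B + temp2, assuming temp2 was left in R2 by earlier code.
for line in generate_add("A", "B", "temp2", {"temp2": "R2"}):
    print(line)
# MOV R1, B
# ADD R1, R2
# MOV A, R1
```

A real code generator would also pick the scratch register itself rather than defaulting to R1; that choice is fixed here to keep the sketch short.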