Compiler Summary Mooly Sagiv html://www.math.tau.ac.il/~msagiv/courses/wcc03.html
Lecture Format
• Compilation stages and techniques – Review
• Break
• Last year’s exam
Compiler Structure
[Figure: compiler pipeline. Input program → Lexical analysis → Tokens → Parser → AST → Context Analysis → Annotated AST → Code Generation → Assembly → Assembler → Binary code → Linker (together with Library code) → Executable code. The Parser reports syntax errors; Context Analysis reports “semantic” errors.]
Lexical Analysis
• Input: Text (input-file)
• Output: Tokens (with values)
[Figure: JLex turns regular expressions into a scanner; the scanner turns the input program into tokens.]
Techniques
• Convert regular expressions to finite automata
• Accepting states are associated with actions
• Actions which are defined first take priority
• Scanner uses backtracking to find longest match
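The longest-match and first-rule-priority policies above can be sketched as follows. This is a minimal illustration, not how JLex is implemented: it tries each rule with Python's `re` module instead of running one combined automaton, and the rule set `RULES` is a hypothetical example.

```python
import re

# Hypothetical token rules for illustration; list order encodes priority.
RULES = [
    ("IF", r"if"),
    ("ID", r"[a-z][a-z0-9]*"),
    ("NUM", r"[0-9]+"),
    ("WS", r"[ \t\n]+"),
]
COMPILED = [(name, re.compile(pattern)) for name, pattern in RULES]


def scan(text):
    """Return (token, lexeme) pairs: longest match wins, earliest rule on ties."""
    tokens = []
    pos = 0
    while pos < len(text):
        best_name, best_len = None, 0
        for name, regex in COMPILED:
            m = regex.match(text, pos)
            # Strict '>' keeps the earlier rule when two matches have equal length.
            if m and len(m.group()) > best_len:
                best_name, best_len = name, len(m.group())
        if best_name is None:
            raise SyntaxError(f"illegal character at offset {pos}: {text[pos]!r}")
        if best_name != "WS":  # whitespace is matched but not reported
            tokens.append((best_name, text[pos:pos + best_len]))
        pos += best_len
    return tokens
```

On the input `if iffy 42`, the lexeme `if` matches both `IF` and `ID` with the same length, so the earlier rule wins, while `iffy` is an `ID` because that match is longer.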
Syntax Analysis (Parsing)
• Input: Stream of tokens
• Output: Abstract Syntax Tree
[Figure: Jcup turns a context free grammar into a parser; the parser turns tokens into an AST.]
Recursive Descent (Top Down Parsing)
• Procedure for every non-terminal
• The procedure of every non-terminal identifies leftmost derivations
• Consider a single token
• Decide which production to apply
• Works for a limited class of grammars
• [The parser tables can be constructed algorithmically]
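A minimal sketch of the scheme above, assuming a hypothetical expression grammar (`expr → term {(+|-) term}`, `term → factor {(*|/) factor}`, `factor → NUM | ( expr )`). Each non-terminal gets one procedure, and each procedure looks at a single token to decide which production to apply. The tuple-based AST and the regex tokenizer are illustrative choices; the tokenizer silently skips characters it does not recognize.

```python
import re


class Parser:
    def __init__(self, text):
        # "$" marks end of input.
        self.tokens = re.findall(r"\d+|[-+*/()]", text) + ["$"]
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos]

    def advance(self):
        tok = self.tokens[self.pos]
        self.pos += 1
        return tok

    def expect(self, tok):
        if self.peek() != tok:
            raise SyntaxError(f"expected {tok!r}, found {self.peek()!r}")
        return self.advance()

    def expr(self):
        node = self.term()
        while self.peek() in ("+", "-"):  # one token decides whether to continue
            op = self.advance()
            node = (op, node, self.term())
        return node

    def term(self):
        node = self.factor()
        while self.peek() in ("*", "/"):
            op = self.advance()
            node = (op, node, self.factor())
        return node

    def factor(self):
        tok = self.peek()
        if tok == "(":
            self.advance()
            node = self.expr()
            self.expect(")")
            return node
        if tok.isdigit():
            return int(self.advance())
        raise SyntaxError(f"unexpected token {tok!r}")


def parse(text):
    parser = Parser(text)
    tree = parser.expr()
    parser.expect("$")
    return tree
```

For example, `parse("1 + 2 * 3")` builds `("+", 1, ("*", 2, 3))`, because `term` consumes the multiplication before `expr` sees the addition.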
Bottom-Up Parsing
• Construct the tree from the leaves
• Store “states” on the stack
• Identifies rightmost derivations in reverse order
• Works for a limited class of grammars
• The parser tables can be constructed algorithmically (SLR(0))
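A table-driven sketch of the idea above, assuming a hypothetical two-production grammar (`E → E + n`, `E → n`). The `ACTION` and `GOTO` tables were built by hand for this grammar, whereas a generator would construct them algorithmically. The parser keeps states on the stack and records each reduction, so the recorded list is a rightmost derivation in reverse.

```python
# Hypothetical grammar:  (1) E -> E + n    (2) E -> n
# Each entry: production number -> (left-hand side, right-hand-side length, text)
PRODUCTIONS = {1: ("E", 3, "E -> E + n"), 2: ("E", 1, "E -> n")}

# Hand-built tables for the grammar above (an assumption of this sketch).
ACTION = {
    (0, "n"): ("shift", 2),
    (1, "+"): ("shift", 3),
    (1, "$"): ("accept", None),
    (2, "+"): ("reduce", 2),
    (2, "$"): ("reduce", 2),
    (3, "n"): ("shift", 4),
    (4, "+"): ("reduce", 1),
    (4, "$"): ("reduce", 1),
}
GOTO = {(0, "E"): 1}


def lr_parse(tokens):
    """Return the reductions performed, in the order they were applied."""
    stack = [0]  # stack of parser states
    stream = list(tokens) + ["$"]
    reductions = []
    i = 0
    while True:
        state, tok = stack[-1], stream[i]
        if (state, tok) not in ACTION:
            raise SyntaxError(f"unexpected {tok!r} in state {state}")
        kind, arg = ACTION[(state, tok)]
        if kind == "shift":
            stack.append(arg)
            i += 1
        elif kind == "reduce":
            lhs, length, text = PRODUCTIONS[arg]
            del stack[-length:]  # pop one state per right-hand-side symbol
            stack.append(GOTO[(stack[-1], lhs)])
            reductions.append(text)
        else:  # accept
            return reductions
```

For the token list `["n", "+", "n"]` the parser first reduces by `E -> n` and then by `E -> E + n`; read backwards, that is the rightmost derivation of the input.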
Context Analysis
• Input: Abstract Syntax Tree
• Output: Annotated Abstract Syntax Tree
• Semantic errors
• Several tree traversals
• [Can be declaratively defined using attribute grammars]
• Examples:
  – Name resolution
  – Type checking
  – Consistency of usages
  – Private fields, …
  – “Allocate” stack slots (offsets) for variables
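Three of the examples above (name resolution, type checking, and slot allocation) can be sketched in one pass over a toy program. The tuple-based AST, the two types `int` and `bool`, and the 4-byte slot size are assumptions made for this illustration only.

```python
SLOT_SIZE = 4  # hypothetical size of one stack slot in bytes


def analyze(program):
    """Return (symbol table, list of semantic error messages)."""
    symbols = {}  # name -> {"type": ..., "offset": ...}
    errors = []
    next_offset = 0

    def type_of(expr):
        kind = expr[0]
        if kind == "num":
            return "int"
        if kind == "bool":
            return "bool"
        if kind == "var":  # name resolution
            name = expr[1]
            if name not in symbols:
                errors.append(f"undeclared variable {name}")
                return None
            return symbols[name]["type"]
        if kind == "add":  # type checking
            left, right = type_of(expr[1]), type_of(expr[2])
            if left is None or right is None:
                return None  # error already reported below this node
            if left != "int" or right != "int":
                errors.append("operands of + must be int")
                return None
            return "int"
        raise ValueError(f"unknown expression kind {kind!r}")

    for stmt in program:
        if stmt[0] == "decl":
            _, name, typ = stmt
            if name in symbols:
                errors.append(f"redeclared variable {name}")
            else:
                # "Allocate" the next stack slot for the variable.
                symbols[name] = {"type": typ, "offset": next_offset}
                next_offset += SLOT_SIZE
        elif stmt[0] == "assign":
            _, name, expr = stmt
            actual = type_of(expr)
            if name not in symbols:
                errors.append(f"undeclared variable {name}")
            elif actual is not None and actual != symbols[name]["type"]:
                errors.append(f"cannot assign {actual} to {name}")
    return symbols, errors
```

The symbol table plays the role of the annotations: later phases can read each variable's type and stack offset from it.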
Code Generation
• Input: AST
• Output: Assembly
Code generation of procedures
• Generate prologue assembly code for opening a stack frame
  – Local (automatic) variables
  – Callee-save registers
• Generate code for procedure body
• Generate epilogue assembly code for closing the stack frame
  – Restore callee-save registers
  – Return to the caller
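The prologue/body/epilogue structure above might look like this for an x86-64 target in Intel syntax. The target, the frame-pointer convention, and the helper name `gen_procedure` are assumptions of this sketch, and stack alignment is ignored.

```python
def gen_procedure(name, local_bytes, callee_saved, body):
    """Return assembly lines for one procedure (hypothetical x86-64 target)."""
    lines = [f"{name}:"]
    # Prologue: open the stack frame.
    lines += ["    push rbp", "    mov rbp, rsp"]
    if local_bytes:
        lines.append(f"    sub rsp, {local_bytes}")  # room for local variables
    lines += [f"    push {reg}" for reg in callee_saved]
    # Body: already generated instructions.
    lines += [f"    {ins}" for ins in body]
    # Epilogue: restore callee-save registers in reverse order, close the frame.
    lines += [f"    pop {reg}" for reg in reversed(callee_saved)]
    lines += ["    mov rsp, rbp", "    pop rbp", "    ret"]
    return lines
```

Calling `gen_procedure("f", 16, ["rbx", "r12"], ["mov rax, 0"])` yields a frame with 16 bytes of locals that saves `rbx` and `r12` on entry and restores them, in the opposite order, before returning.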
Code generation for procedure body
• Code for control flow statements
  – Normal
  – Runtime checks
  – Exceptions
• Code for side-effect free expressions
  – 2-phase weighted tree (optimal)
• Code for basic blocks
  – Avoids stores/loads
  – Construct dependency graphs
  – Optimize dependency graphs
  – Generate code with symbolic registers
  – Allocate architectural registers to symbolic registers
• Code for procedure invocation
  – Store caller-save registers
  – Transfer actual parameters
  – Actual call
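The 2-phase weighted tree technique for side-effect free expressions can be sketched as follows, in the style of Sethi-Ullman numbering. Phase one computes for each subtree the number of registers it needs; phase two generates code for the heavier operand first. The three-address pseudo-instructions, the leaf weight of 1, and the register names are assumptions of this sketch, and spilling is not modelled.

```python
def weight(node):
    """Phase 1: registers needed to evaluate node without spilling."""
    if isinstance(node, str):  # a leaf is a variable name
        return 1
    _, left, right = node
    wl, wr = weight(left), weight(right)
    return wl + 1 if wl == wr else max(wl, wr)


def gen(node, regs, code):
    """Phase 2: append instructions to code, leaving the result in regs[0]."""
    if not regs:
        raise ValueError("out of registers; spilling is not modelled")
    if isinstance(node, str):
        code.append(f"load {regs[0]}, {node}")
        return regs[0]
    op, left, right = node
    # Weights are recomputed here for brevity; a real pass would cache them.
    if weight(left) >= weight(right):
        lreg = gen(left, regs, code)
        rreg = gen(right, regs[1:], code)
    else:
        rreg = gen(right, regs, code)  # heavier operand goes first
        lreg = gen(left, regs[1:], code)
    code.append(f"{op} {regs[0]}, {lreg}, {rreg}")
    return regs[0]
```

For the tree `("add", "a", ("mul", "b", "c"))` the weight is 2, and the multiplication is evaluated before `a` is loaded, so two registers suffice.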
Heap allocated data
• “Long” lived
  – Duration can exceed procedure body
• Relies on garbage collection
  – Library with the help of the compiler
• Garbage collection techniques
  – Stop the world vs. incremental
  – Generational
• Garbage collecting algorithms
  – Mark and sweep
  – Copying
  – Reference counts
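Of the algorithms listed above, mark and sweep is the simplest to sketch. The heap here is modelled as a dictionary from object ids to the ids they reference, which is an assumption made for illustration; a real collector walks raw memory using descriptors supplied by the compiler.

```python
def mark_and_sweep(heap, roots):
    """Free unreachable objects in heap (id -> list of referenced ids).

    Returns the set of freed ids; heap is modified in place.
    """
    # Mark phase: everything reachable from the roots is live.
    marked = set()
    worklist = list(roots)
    while worklist:
        obj = worklist.pop()
        if obj in marked:
            continue
        marked.add(obj)
        worklist.extend(heap[obj])
    # Sweep phase: reclaim every object that was not marked.
    freed = {obj for obj in heap if obj not in marked}
    for obj in freed:
        del heap[obj]
    return freed
```

With a heap of `{"a": ["b"], "b": [], "c": ["d"], "d": ["c"]}` and the single root `a`, the objects `c` and `d` are freed even though they reference each other, which plain reference counting would not achieve.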
Runtime descriptors
• Additional information on the stack/heap
  – Type information
  – Dynamic class binding
  – Dispatch tables
  – Array size
• Generate code to fill information
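A sketch of how dispatch tables support dynamic class binding. The names `ClassDescriptor`, `Obj`, and `call` are hypothetical; the point is that a subclass inherits its parent's table layout, so a method keeps the same slot index in every class and a call site only needs that index plus the object's descriptor.

```python
class ClassDescriptor:
    """Runtime descriptor holding a class's dispatch table."""

    def __init__(self, name, parent=None, methods=None):
        self.name = name
        # Copy the parent's layout so inherited methods keep their slot index.
        self.slots = dict(parent.slots) if parent else {}
        self.table = list(parent.table) if parent else []
        for method_name, impl in (methods or {}).items():
            if method_name in self.slots:
                self.table[self.slots[method_name]] = impl  # override in place
            else:
                self.slots[method_name] = len(self.table)  # new slot at the end
                self.table.append(impl)


class Obj:
    """Every object carries a reference to its class descriptor."""

    def __init__(self, descriptor):
        self.descriptor = descriptor


def call(obj, slot, *args):
    # The slot index is fixed at compile time; the table is chosen at run time.
    return obj.descriptor.table[slot](obj, *args)


# Illustrative classes: Circle overrides area and inherits label.
shape = ClassDescriptor("Shape", methods={"area": lambda self: 0, "label": lambda self: "shape"})
circle = ClassDescriptor("Circle", shape, {"area": lambda self: 3})
```

A call such as `call(Obj(circle), shape.slots["area"])` uses the slot computed from the static type `Shape` but runs the `Circle` implementation.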
Assembler
• Convert assembly to binary
• Resolve labels
  – Two phase
  – Backpatch
• Simple overloading
• Produce relocation information
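Label resolution by backpatching can be sketched in a single pass. The toy instruction set (one word per instruction, `jmp` as the only instruction with a label operand) is an assumption of this sketch. Forward references are emitted with a placeholder and recorded, then patched once all labels are known.

```python
def assemble(lines):
    """Return a list of (opcode, target) words with jump targets resolved."""
    code = []
    labels = {}  # label name -> address (index into code)
    fixups = []  # (index of the jump in code, label it refers to)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.endswith(":"):
            labels[line[:-1]] = len(code)  # label marks the next instruction
        elif line.startswith("jmp "):
            fixups.append((len(code), line[4:].strip()))
            code.append(("jmp", None))  # placeholder, patched below
        else:
            code.append((line, None))
    # Backpatch: fill in every recorded jump now that all labels are known.
    for index, label in fixups:
        if label not in labels:
            raise NameError(f"undefined label {label}")
        code[index] = ("jmp", labels[label])
    return code
```

The two-phase alternative would instead collect all label addresses in a first pass and emit code in a second pass, which avoids the fixup list at the cost of reading the input twice.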
Linker
• Relocate code
• Address changes
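Relocation can be sketched with a toy object format, assumed here for illustration: each module is a list of words plus a list of indices (`relocs`) of the words that hold module-relative addresses. Symbol resolution across modules is not modelled.

```python
def link(modules):
    """Concatenate modules into one image, adjusting relocatable addresses."""
    image = []
    for module in modules:
        base = len(image)  # address at which this module is placed
        words = list(module["code"])
        for index in module["relocs"]:
            words[index] += base  # module-relative address becomes absolute
        image.extend(words)
    return image
```

If a second module is placed at address 3, a word in it that referred to its own address 1 must become 4 in the linked image.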
Loader
• Part of the operating system
• Initializes runtime state
Tiger vs. TC
• Tiger: records, procedure nesting
• TC: object oriented, classes, virtual functions