Topic 6 - Code Generation. Dr. William A. Maniatty, Assistant Prof., Dept. of Computer Science, University At Albany. CSI 511 Programming Languages and Systems Concepts, Fall 2002, Monday Wednesday 2:30-3:50, LI 99.
Introduction to Code Generation • Compilers use multi-phase methods for code generation
Code Generation Vs. Semantic Analysis • Code generation focuses on the last 2 steps. • Code generation uses a more linear (less hierarchical) representation • Programs are broken into basic blocks • Longest sequential stretches of code • Basic blocks are connected via a control flow graph • Every entry is the destination of an edge, every exit is the source of an edge
Basic Blocks Revisited • So what is a basic block? • Longest sequential code segment guaranteed to run from start to finish. • Begins with either: • A label (destination of a branch) • The instruction after a branch • Ends with either: • The instruction just before a label • A branch instruction (conditional or unconditional)
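The begin/end rules above can be sketched as a small partitioning routine. This is a minimal illustration, not the course's own code: the instruction encoding (a `(label, text)` pair), the `is_branch` helper, and the GCD-style three-address program are all assumptions made for the example.

```python
from typing import List, Optional, Tuple

# Hypothetical encoding: each instruction is (label, text); label is None
# when the instruction carries no label.
Instr = Tuple[Optional[str], str]


def is_branch(text: str) -> bool:
    # Assumption for this sketch: branches start with "goto" or "if".
    return text.startswith(("goto", "if"))


def split_basic_blocks(code: List[Instr]) -> List[List[Instr]]:
    blocks: List[List[Instr]] = []
    current: List[Instr] = []
    for label, text in code:
        # A label may be a branch target, so it begins a new block.
        if label is not None and current:
            blocks.append(current)
            current = []
        current.append((label, text))
        # A branch (conditional or unconditional) ends the block.
        if is_branch(text):
            blocks.append(current)
            current = []
    if current:
        blocks.append(current)
    return blocks


# Hypothetical three-address rendering of subtraction-based GCD.
gcd_code: List[Instr] = [
    (None, "a := a1"),
    (None, "b := a2"),
    ("top", "if a == b goto done"),
    (None, "if a < b goto less"),
    (None, "a := a - b"),
    (None, "goto top"),
    ("less", "b := b - a"),
    (None, "goto top"),
    ("done", "rv := a"),
]

blocks = split_basic_blocks(gcd_code)
for number, block in enumerate(blocks, start=1):
    print(f"B{number}:", [text for _, text in block])
```

Under these assumptions the nine instructions fall into six blocks; every block either ends in a branch or is followed by a labeled instruction.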
Importance of Basic Blocks • Basic blocks are important because: • They can be optimized • e.g. Instruction reordering can avoid pipeline stalls • Inside a basic block: • In the beginning we can assume an infinite number of registers • But the target machine has a finite number of registers • So we pick victim registers and spill their contents to memory.
An Example • Find the syntax tree and basic blocks in Euclid's GCD algorithm.
The Syntax Tree • The syntax tree looks like:
Basic Blocks and Control Flow Graph • The Control Flow Graph: • Uses infix notation • Registers a1, a2, and rv are used for parameter passing • Temporaries are treated like an infinite pool of registers.
Phases Vs. Passes • Stages of compilation can either: • Pipeline processing between multiple stages • Each stage is called a phase • Later stages get information in incremental chunks • Or store the final results of one stage and give the next stage access to the stored results • Each stage is called a pass • Only one stage needs to be memory resident at a time • Later stages can exploit full information
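One way to picture the distinction is with a toy two-stage pipeline. In this sketch, which assumes a made-up "scanner" and "upper-casing" stage (neither is from the slides), the phase-style version hands tokens over one at a time through generators, while the pass-style version stores the complete result of one stage before the next begins.

```python
from typing import Iterable, Iterator, List


def scan_phase(source: str) -> Iterator[str]:
    """Phase style: yield one token at a time to the next stage."""
    for token in source.split():
        yield token


def upper_phase(tokens: Iterable[str]) -> Iterator[str]:
    """Phase style: process each token as soon as it arrives."""
    for token in tokens:
        yield token.upper()


def scan_pass(source: str) -> List[str]:
    """Pass style: store the full token list before returning."""
    return source.split()


def upper_pass(tokens: List[str]) -> List[str]:
    """Pass style: the whole token list is available at once."""
    return [token.upper() for token in tokens]


source = "a := b + c"
phased = list(upper_phase(scan_phase(source)))
passed = upper_pass(scan_pass(source))
print(phased == passed)
```

Both styles compute the same result here; the difference is that `upper_pass` could inspect the entire token list (full information) before emitting anything, whereas `upper_phase` only ever sees the tokens delivered so far.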
Intermediate Forms • Intermediate Forms (or IFs) • Link the compiler's back end to the front end • Can be classified by level of abstraction • High level IFs use graph or DAG representations • Low level IFs look more like assembly code • Triples and Quads (2 operand and 3 operand) • High level IFs support • Incremental compilation • IDE support
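To make the low-level forms concrete, here is a rough sketch of one statement, `x := (a + b) * c`, written as quads and as triples. The statement, the tuple layouts, and the tiny interpreters are assumptions for illustration only; real compilers differ in the details.

```python
from typing import Dict, List, Tuple, Union

OPS = {"+": lambda p, q: p + q, "*": lambda p, q: p * q}

# Quads name their result explicitly: (operator, operand1, operand2, result).
Quad = Tuple[str, str, str, str]
quads: List[Quad] = [("+", "a", "b", "t1"), ("*", "t1", "c", "x")]

# Triples leave the result implicit: (operator, operand1, operand2).
# An integer operand refers to the result of an earlier triple by index.
Operand = Union[str, int]
Triple = Tuple[str, Operand, Operand]
triples: List[Triple] = [("+", "a", "b"), ("*", 0, "c")]


def run_quads(code: List[Quad], env: Dict[str, int]) -> Dict[str, int]:
    env = dict(env)  # work on a copy
    for op, left, right, result in code:
        env[result] = OPS[op](env[left], env[right])
    return env


def run_triples(code: List[Triple], env: Dict[str, int]) -> List[int]:
    results: List[int] = []
    for op, left, right in code:
        lv = results[left] if isinstance(left, int) else env[left]
        rv = results[right] if isinstance(right, int) else env[right]
        results.append(OPS[op](lv, rv))
    return results


env = {"a": 2, "b": 3, "c": 4}
quad_value = run_quads(quads, env)["x"]
triple_value = run_triples(triples, env)[-1]
print(quad_value, triple_value)
```

The quad form needs a named temporary (`t1`) for the intermediate sum, while the triple form refers to it by position, which is more compact but makes reordering instructions harder.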
Example • P-Code - Pascal, JVM - Java, RTL - GNU • Compilers can have the same front end. • Retargeting the compiler means implementing a new back end. • If you have M front ends and N target architectures • Without IFs you need M×N front end/back end combinations • With IFs you need M front ends + N back ends
Address Space • Split data and instruction space • Insert labels • Purge NOPS • Add Memory Management
Live Variable Analysis • Simple idea: • Compute the control flow graph • For each node in the graph • If a variable could be used along some control flow path out of the node, the variable is said to be live • Otherwise (the variable will never be used later) the variable is said to be dead. • So if a dead variable is in a register, that register will be available for use after the last reference.
Some Useful Notation • For a basic block B • in[B] are variables live on entry to B • i.e. Used as input by B or any of B's descendants. • def[B] are variables defined in B. • i.e. On the LHS of an assignment statement • out[B] are variables live after executing B. • use[B] are variables used by B • i.e. Used as input for an operation
A Live Variable Analysis Algorithm • Input: A control flow graph with def and use computed for each block. • Output: out[B], the set of live variables at the end of each block's execution.
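The slide states only the input and output, so the following is a sketch of the standard iterative formulation rather than a reproduction of the lecture's algorithm. It assumes the usual equations out[B] = union of in[S] over successors S, and in[B] = use[B] ∪ (out[B] − def[B]), with use[B] taken to mean variables read in B before any definition in B. The flow graph (blocks `B1` to `B6`, a GCD-style loop) is hypothetical.

```python
from typing import Dict, List, Set, Tuple


def live_variables(
    succ: Dict[str, List[str]],
    use: Dict[str, Set[str]],
    defs: Dict[str, Set[str]],
) -> Tuple[Dict[str, Set[str]], Dict[str, Set[str]]]:
    live_in: Dict[str, Set[str]] = {b: set() for b in succ}
    live_out: Dict[str, Set[str]] = {b: set() for b in succ}
    changed = True
    while changed:  # repeat until a fixed point is reached
        changed = False
        for b in succ:
            new_out: Set[str] = set()
            for s in succ[b]:
                new_out |= live_in[s]
            new_in = use[b] | (new_out - defs[b])
            if new_out != live_out[b] or new_in != live_in[b]:
                live_out[b], live_in[b] = new_out, new_in
                changed = True
    return live_in, live_out


# Hypothetical flow graph for a subtraction-based GCD loop.
succ = {"B1": ["B2"], "B2": ["B6", "B3"], "B3": ["B5", "B4"],
        "B4": ["B2"], "B5": ["B2"], "B6": []}
use = {"B1": {"a1", "a2"}, "B2": {"a", "b"}, "B3": {"a", "b"},
       "B4": {"a", "b"}, "B5": {"a", "b"}, "B6": {"a"}}
defs = {"B1": {"a", "b"}, "B2": set(), "B3": set(),
        "B4": {"a"}, "B5": {"b"}, "B6": {"rv"}}

live_in, live_out = live_variables(succ, use, defs)
for block in succ:
    print(block, sorted(live_out[block]))
```

In this example `a` and `b` stay live around the loop, while nothing is live after the final block, so registers holding `a` and `b` could be reused once `B6` has read `a`.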
Global Common Subexpression Elimination • Consider a program with two statements x := y+z and later w := y+z • Evaluating y+z more than once is wasteful
Global Common Subexpression Elimination • We can eliminate it as follows: • For each block B using y+z, search back along the arcs of the flow graph, stopping at blocks that compute y+z or that modify the values of y or z. • Create a new variable u := y+z in blocks that first compute y+z • Replace x := y+z with u := y+z; x := u • Replace w := y+z with w := u
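The rewrite step can be sketched on a straight-line sequence of statements. This is a deliberate simplification: it does not perform the backward search over the flow graph, it only shows the replacement rules and the effect of a statement that modifies y or z. The statement encoding, the function name, and the assumption that the temporary `u` does not clash with an existing variable are all hypothetical.

```python
from typing import List, Tuple, Union

Expr = Union[str, Tuple[str, str, str]]  # a variable, or (op, left, right)
Stmt = Tuple[str, Expr]                  # (target, right-hand side)


def eliminate_common_subexpr(
    stmts: List[Stmt], expr: Tuple[str, str, str], temp: str = "u"
) -> List[Stmt]:
    _, left, right = expr
    available = False  # True while temp still holds the value of expr
    out: List[Stmt] = []
    for target, rhs in stmts:
        if rhs == expr:
            if not available:
                out.append((temp, expr))  # first computation: u := y+z
                available = True
            out.append((target, temp))    # x := u  (or w := u)
        else:
            out.append((target, rhs))
        if target in (left, right):
            available = False             # y or z changed: u is stale
    return out


y_plus_z = ("+", "y", "z")
program: List[Stmt] = [("x", y_plus_z), ("v", "x"), ("w", y_plus_z)]
optimized = eliminate_common_subexpr(program, y_plus_z)

# If y is modified in between, y+z must be recomputed.
killed: List[Stmt] = [("x", y_plus_z), ("y", "x"), ("w", y_plus_z)]
recomputed = eliminate_common_subexpr(killed, y_plus_z)
print(optimized)
print(recomputed)
```

In the first program y+z is evaluated once and reused; in the second, the assignment to y plays the role of a block that "modifies the values of y or z", so the expression is evaluated again.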