Handling nested procedures • Method 1 : static (access) links • Reference to the frame of the lexically enclosing procedure • Static chains of such links are created. • How do we use them to access non-locals? • The compiler knows the nesting level s of the scope that declares a variable • The compiler knows the nesting level t of the current scope (t >= s) • Follow t - s links
Handling nested procedures • Method 1 : static (access) links • Setting the links: • if the callee is nested directly within the caller, set its static link to point to the caller's frame pointer (or stack pointer) • if the callee has the same nesting level as the caller, set its static link to wherever the caller's static link points • in general, for a caller at level n and a callee at level m (m <= n+1), follow n - m + 1 links starting from the caller's frame
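The two rules for setting a link and the rule for reaching a non-local can be sketched in a few lines. This is a minimal model, not a real runtime: the `Frame` class, its fields, and the example procedures `main`, `p`, and `q` are all hypothetical names chosen for illustration.

```python
class Frame:
    """Hypothetical activation record: nesting level, static link, locals."""
    def __init__(self, level, static_link, variables):
        self.level = level
        self.static_link = static_link
        self.variables = variables


def static_link_for_callee(caller, callee_level):
    """Return the frame of the callee's lexically enclosing procedure."""
    # The enclosing procedure sits at level callee_level - 1, so walk
    # caller.level - callee_level + 1 links starting from the caller's frame.
    frame = caller
    for _ in range(caller.level - callee_level + 1):
        frame = frame.static_link
    return frame


def load_nonlocal(current, var_level, name):
    """Follow (current level - variable's level) links, then read the variable."""
    frame = current
    for _ in range(current.level - var_level):
        frame = frame.static_link
    return frame.variables[name]


main = Frame(0, None, {"x": 10})
p = Frame(1, static_link_for_callee(main, 1), {"y": 20})   # p is nested in main
q = Frame(2, static_link_for_callee(p, 2), {"z": 30})      # q is nested in p
q2 = Frame(2, static_link_for_callee(q, 2), {"z": 31})     # q calls itself
```

Here the recursive activation `q2` gets the same static link as `q` (the frame of `p`), so it still reaches `y` in one hop and `x` in two.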
Handling nested procedures • Method 2 : Displays • A Display encodes the static link info in an array. • The ith element of the array points to the frame of the most recent procedure at scope level i • How? • When a new stack frame is created for a procedure at nesting level i, • save the current value of D[i] in the new stack frame (to be restored on exit) • set D[i] to the new stack frame
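The save/set/restore protocol for D[i] can be modeled as follows. This is only a sketch under simple assumptions: frames are plain dictionaries, and the `Display` class and its method names are hypothetical.

```python
class Display:
    """Hypothetical display: D[i] is the most recent frame at nesting level i."""
    def __init__(self, max_depth):
        self.D = [None] * max_depth

    def enter(self, level, variables):
        # Save the current D[level] in the new frame, then install the new frame.
        frame = {"level": level, "saved": self.D[level], "vars": variables}
        self.D[level] = frame
        return frame

    def leave(self, frame):
        # Restore the entry that was overwritten on entry.
        self.D[frame["level"]] = frame["saved"]

    def load(self, level, name):
        # One indexed lookup, no matter how deeply nested the current procedure is.
        return self.D[level]["vars"][name]


display = Display(3)
f_main = display.enter(0, {"x": 10})
f_p = display.enter(1, {"y": 20})
f_q = display.enter(2, {"z": 30})
f_q2 = display.enter(2, {"z": 31})       # recursive call at the same level
inner = display.load(2, "z")
display.leave(f_q2)
outer = display.load(2, "z")
```

Compared with static links, every non-local access costs a single array lookup, at the price of the save and restore on each call and return.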
Handling nested procedures • Displays vs. Static links • criteria • added overhead • space • nesting depth • frequency of non-local accesses
Parameter passing • By value • actual parameter is copied • By reference • address of actual parameter is stored • By value-result • call by value, AND • the values of the formal parameters are copied back into the actual parameters. • Example:
    int a;
    void test(int x) { x = 2; a = 0; }
    int main () { a = 1; test(a);
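To see how the three modes differ on the `test(a)` example, the call can be simulated by hand. The sketch below is not how a compiler implements parameter passing; it only models the storage of the global `a` with a dictionary (the names `run_test` and `env` are hypothetical) and reports the value of `a` after the call returns.

```python
def run_test(mode):
    """Simulate `a = 1; test(a);` where test does `x = 2; a = 0;`."""
    env = {"a": 1}                 # storage for the global variable a

    if mode == "reference":
        # x is an alias for a: every access to x goes to a's storage.
        env["a"] = 2               # x = 2 writes through to a
        env["a"] = 0               # a = 0
    else:
        x = env["a"]               # value and value-result both copy a into x
        x = 2                      # x = 2 changes only the copy
        env["a"] = 0               # a = 0
        if mode == "value-result":
            env["a"] = x           # on return, copy the formal back into the actual

    return env["a"]


results = {mode: run_test(mode) for mode in ("value", "reference", "value-result")}
```

After the call, `a` is 0 under call by value and call by reference, but 2 under value-result, because the copy-back on return overwrites the assignment `a = 0` made inside `test`.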
Stack maintenance • Calling sequence : • code executed by the caller before and after a call • code executed by the callee at the beginning • code executed by the callee at the end
Stack maintenance • A typical calling sequence : • Caller assembles arguments and transfers control • evaluate arguments • place arguments in stack frame and/or registers • save caller-saved registers • save return address • jump to callee's first instruction
Stack maintenance • A typical calling sequence : • Callee saves info on entry • allocate memory for stack frame, update stack pointer • save callee-saved registers • save old frame pointer • update frame pointer • Callee executes
Stack maintenance • A typical calling sequence : • Callee restores info on exit and returns control • place return value in appropriate location • restore callee-saved registers • restore frame pointer • pop the stack frame • jump to return address • Caller restores info • restore caller-saved registers
Code generation • Our book's target machine (appendix A): • opcode source1, source2, destination • add r1, r2, r3 • addI r1, c, r2 • loadI c, r2 • load r1, r2 • loadAI r1, c, r2 • loadAO r1, r2, r3 • i2i r1, r2 • cmp_LE r1, r2, r3 • cbr r1, l1, l2 • jump r1
Code generation • Let's start with some examples. • Generate code from a tree representing x = a+2 - (c+d-4) • Issues: • which children should go first? • what if we already had a-c in a register? • Does it make a difference if a and c are floating point as opposed to integer? • Generate code for a case statement • Generate code for w = w*2*x*y*z
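As a starting point for the first example, a naive tree walk can emit code for x = a+2 - (c+d-4). The sketch below makes several assumptions: trees are nested tuples, every variable lives at an offset from $sp, each value gets a fresh virtual register, and the left child is always evaluated first. The `sub` opcode is assumed to exist alongside `add`, and the function names are hypothetical.

```python
import itertools


def gen(node, code, fresh):
    """Emit code for an expression tree; return the register holding its value."""
    if isinstance(node, str):                 # variable: load it from the frame
        reg = f"r{next(fresh)}"
        code.append(f"loadAI $sp, @{node}, {reg}")
        return reg
    if isinstance(node, int):                 # constant: load immediate
        reg = f"r{next(fresh)}"
        code.append(f"loadI {node}, {reg}")
        return reg
    op, left, right = node
    left_reg = gen(left, code, fresh)         # left child first (one possible order)
    right_reg = gen(right, code, fresh)
    reg = f"r{next(fresh)}"
    code.append(f"{op} {left_reg}, {right_reg}, {reg}")
    return reg


def gen_assign(target, expr):
    """Emit code for `target = expr` and return it as a list of instructions."""
    code, fresh = [], itertools.count(1)
    result = gen(expr, code, fresh)
    code.append(f"storeAI {result}, $sp, @{target}")
    return code


# x = a + 2 - (c + d - 4)
tree = ("sub", ("add", "a", 2), ("sub", ("add", "c", "d"), 4))
listing = gen_assign("x", tree)
```

The result is ten instructions over nine virtual registers. It ignores the issues listed above: a smarter selector would fold the constants into `addI`, and the order in which children are visited affects how many registers are live at once.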
Code generation • Code generation = • instruction selection • instruction scheduling • register allocation
Instruction selection • IR to assembly • Why is it an issue? • Example: copy a value from r1 to r2 • Let me count the ways... • Criteria • How hard is it? • Use a cost model to choose. • How about register usage?
Instruction selection • How hard is it? • Can make locally optimal choices • Finding a globally optimal selection is NP-complete • Criteria • speed of generated code • size of generated code • power consumption • Considering registers • Assume enough registers are available, let register allocator figure it out.
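The "copy a value from r1 to r2" example can be made concrete with a tiny cost model. This is a sketch under assumed numbers: the latencies are borrowed from the scheduling example (memory ops 3 cycles, multiply 2, everything else 1), `multI` is assumed to be available, and the spill slot `@tmp` is hypothetical.

```python
LATENCY = {"loadAI": 3, "storeAI": 3, "multI": 2}    # anything else: 1 cycle


def cycles(sequence):
    """Cost model 1: total latency of the sequence, ignoring any overlap."""
    return sum(LATENCY.get(instr.split()[0], 1) for instr in sequence)


# Four ways to copy r1 into r2.
CANDIDATES = [
    ["i2i r1, r2"],
    ["addI r1, 0, r2"],
    ["multI r1, 1, r2"],
    ["storeAI r1, $sp, @tmp", "loadAI $sp, @tmp, r2"],
]


def select(candidates, cost):
    """Pick the cheapest sequence under the given cost function."""
    return min(candidates, key=cost)


fastest = select(CANDIDATES, cycles)    # criterion: speed
smallest = select(CANDIDATES, len)      # criterion: code size
```

Under both criteria `i2i` and `addI` tie, and `min` simply returns the first. A real selector needs a tie-breaking rule, and a different criterion (power, for instance) could rank the candidates differently.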
Instruction scheduling • Reorder instructions to hide latencies. • Latencies: memory ops : 3 cycles • multiply : 2 cycles • everything else: 1 cycle • Example (issue cycle in parentheses):
    (1)  loadAI  $sp, @w, r1
    (4)  add     r1, r1, r1
    (5)  loadAI  $sp, @x, r2
    (8)  mult    r1, r2, r1
    (9)  loadAI  $sp, @y, r2
    (12) mult    r1, r2, r1
    (13) loadAI  $sp, @z, r2
    (16) mult    r1, r2, r1
    (18) storeAI r1, $sp, @w
Instruction scheduling • Reorder instructions to hide latencies. • Example (issue cycle in parentheses): original order, then the reordered schedule, which uses one extra register (r3)
  Original order:
    (1)  loadAI  $sp, @w, r1
    (4)  add     r1, r1, r1
    (5)  loadAI  $sp, @x, r2
    (8)  mult    r1, r2, r1
    (9)  loadAI  $sp, @y, r2
    (12) mult    r1, r2, r1
    (13) loadAI  $sp, @z, r2
    (16) mult    r1, r2, r1
    (18) storeAI r1, $sp, @w
  Reordered:
    (1)  loadAI  $sp, @w, r1
    (2)  loadAI  $sp, @x, r2
    (3)  loadAI  $sp, @y, r3
    (4)  add     r1, r1, r1
    (5)  mult    r1, r2, r1
    (6)  loadAI  $sp, @z, r2
    (7)  mult    r1, r3, r1
    (9)  mult    r1, r2, r1
    (11) storeAI r1, $sp, @w
Instruction scheduling • Reorder instructions to hide latencies. • Example 2 (issue cycle in parentheses):
    (1)  loadAI  $sp, @x, r1
    (4)  mult    r1, r1, r1
    (6)  mult    r1, r1, r1
    (8)  mult    r1, r1, r1
    (10) storeAI r1, $sp, @x
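The issue cycles in these schedules can be reproduced with a small timing model. The sketch assumes a single-issue, in-order machine that stalls until every source register is ready, with the stated latencies (memory ops 3, multiply 2, everything else 1). The tuple encoding of instructions is hypothetical, and $sp is treated as always available.

```python
LATENCY = {"loadAI": 3, "storeAI": 3, "mult": 2}     # everything else: 1 cycle


def issue_cycles(program):
    """program: list of (opcode, source registers, destination register or None)."""
    ready = {}                # register -> first cycle its value can be used
    cycle, issued = 0, []
    for opcode, sources, dest in program:
        # At most one instruction per cycle, in order; wait for all operands.
        cycle = max([cycle + 1] + [ready.get(r, 1) for r in sources])
        issued.append(cycle)
        if dest is not None:
            ready[dest] = cycle + LATENCY.get(opcode, 1)
    return issued


original = [
    ("loadAI", [], "r1"), ("add", ["r1", "r1"], "r1"),
    ("loadAI", [], "r2"), ("mult", ["r1", "r2"], "r1"),
    ("loadAI", [], "r2"), ("mult", ["r1", "r2"], "r1"),
    ("loadAI", [], "r2"), ("mult", ["r1", "r2"], "r1"),
    ("storeAI", ["r1"], None),
]
reordered = [
    ("loadAI", [], "r1"), ("loadAI", [], "r2"), ("loadAI", [], "r3"),
    ("add", ["r1", "r1"], "r1"), ("mult", ["r1", "r2"], "r1"),
    ("loadAI", [], "r2"), ("mult", ["r1", "r3"], "r1"),
    ("mult", ["r1", "r2"], "r1"), ("storeAI", ["r1"], None),
]
example2 = [
    ("loadAI", [], "r1"), ("mult", ["r1", "r1"], "r1"),
    ("mult", ["r1", "r1"], "r1"), ("mult", ["r1", "r1"], "r1"),
    ("storeAI", ["r1"], None),
]
```

In the second example every instruction depends on the one before it, so no reordering can hide the latencies; the first example improves only because the loads are independent of the arithmetic.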
Instruction scheduling • Reorder instructions to hide latencies. • We need to collect dependence info • Scheduling affects register lifetimes ==> different demand for registers • Should we do register allocation before or after? • How hard is it? • more than one instruction may be ready • too many variables may be live at the same time • NP-complete!
Register allocation • Consists of two parts: • register allocation • register assignment • Goal : minimize spills • How hard is it? • single basic block with one size of data: polynomial • otherwise, NP-complete • Common approach: based on graph coloring.
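The graph-coloring view can be illustrated with a greedy heuristic. This is a simplified sketch, not a production allocator and not any specific published algorithm: nodes are live ranges, an edge means two values are live at the same time, and a node that finds no free register is simply marked as spilled. The example graph and the function name `color` are hypothetical.

```python
def color(interference, k):
    """Greedily give each node one of k registers; return (assignment, spills)."""
    assignment, spills = {}, []
    # Visit the most constrained (highest-degree) live ranges first.
    for node in sorted(interference, key=lambda n: -len(interference[n])):
        taken = {assignment[n] for n in interference[node] if n in assignment}
        free = [r for r in range(k) if r not in taken]
        if free:
            assignment[node] = free[0]
        else:
            spills.append(node)       # no register left: keep this value in memory
    return assignment, spills


# a, b, c are mutually live; d overlaps only with c.
interference = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
three_regs, spills3 = color(interference, 3)
two_regs, spills2 = color(interference, 2)
```

With three registers everything fits, and `d` reuses the register of `a`. With two registers the triangle a, b, c cannot be colored, so one of its members has to be spilled.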