Course project presentations • Midterm project presentation • Originally scheduled for Tuesday Nov 4th • Can move to Th Nov 6th or Th Nov 13th
From last time: Loop-invariant code motion • Two steps: analysis and transformations • Step 1: find invariant computations in loop • invariant: computes same result each time evaluated • Step 2: move them outside loop • to top if used within loop: code hoisting • to bottom if used after loop: code sinking
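To make the two steps concrete, here is a minimal before/after sketch in C (the function and variable names are invented for illustration):

void before(int *a, int n, int x, int y) {
    for (int i = 0; i < n; i++)
        a[i] = x * y + i;        /* x*y computes the same result every iteration */
}

void after(int *a, int n, int x, int y) {
    int t = x * y;               /* moved to the top of the loop: code hoisting */
    for (int i = 0; i < n; i++)
        a[i] = t + i;
}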
Detecting loop invariants • An expression is invariant in a loop L iff: (base cases) • it’s a constant • it’s a variable use, all of whose defs are outside of L (inductive cases) • it’s a pure computation all of whose args are loop-invariant • it’s a variable use with only one reaching def, and the rhs of that def is loop-invariant
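As a hedged illustration of the four rules at the source level (all names invented):

void example(int *a, int n, int k, int c) {
    for (int i = 0; i < n; i++) {
        int u = 5;               /* constant: invariant (base case)               */
        int v = k + c;           /* k and c are defined outside L, and + is a
                                    pure computation of invariant args: invariant */
        int w = v * 2;           /* v has one reaching def, and that def's rhs
                                    is invariant: invariant                       */
        a[i] = u + v + w + i;    /* i is redefined each iteration, so this rhs
                                    is not invariant                              */
    }
}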
Computing loop invariants • Option 1: iterative dataflow analysis • optimistically assume all expressions loop-invariant, and propagate • Option 2: build def/use chains • follow chains to identify and propagate invariant expressions • Option 3: SSA • like option 2, but using SSA instead of def/use chains
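As a rough sketch of option 2, here is a fixed-point propagation over a toy IR; every type and helper below is invented for illustration, not a real compiler API:

#include <stdbool.h>

typedef struct Instr Instr;
struct Instr {
    bool   in_loop;      /* is the defining instruction inside L?          */
    bool   pure;         /* no side effects (not a call, store, etc.)      */
    bool   invariant;    /* analysis result                                */
    int    nuses;
    Instr **use_defs;    /* the single reaching def of each use, or NULL
                            when a use has several reaching defs           */
};

/* One pass of the inductive rules; rerun until nothing changes.
   (A worklist seeded from def/use chains avoids the full rescans.) */
static bool pass(Instr **ins, int n) {
    bool changed = false;
    for (int j = 0; j < n; j++) {
        Instr *p = ins[j];
        if (p->invariant || !p->pure) continue;
        bool inv = true;
        for (int u = 0; u < p->nuses; u++) {
            Instr *d = p->use_defs[u];
            if (d == NULL) { inv = false; break; }   /* several reaching defs:
                                                        treated conservatively */
            if (d->in_loop && !d->invariant) { inv = false; break; }
        }
        if (inv) { p->invariant = true; changed = true; }
    }
    return changed;
}

void find_loop_invariants(Instr **ins, int n) {
    while (pass(ins, n))
        ;                /* iterate to a fixed point */
}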
Example using def/use chains • (three worked-example slides; the CFG figures are not preserved in this text version) • The slides step through the same criteria as above: constants, variable uses whose defs are all outside L, pure computations of loop-invariant args, and variable uses with a single reaching def whose rhs is loop-invariant
Loop invariant detection using SSA • An expression is invariant in a loop L iff: (base cases) • it’s a constant • it’s a variable use whose single def is outside of L (inductive cases) • it’s a pure computation all of whose args are loop-invariant • it’s a variable use whose single def’s rhs is loop-invariant • (note: φ functions are not pure computations, so they don’t qualify under the pure-computation rule)
Example using SSA • (worked-example slide; the SSA-form CFG figure is not preserved here, and the body repeats the criteria from the previous slide)
Example using SSA and preheader • (worked-example slide; same criteria, with a preheader block inserted before the loop header to give hoisted code a place to land)
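Since the example figures do not survive, here is a stand-in sketch at the source level (names invented). It shows why the preheader helps: SSA gives every variable exactly one def, and the preheader gives hoisted defs a block that runs exactly once, just before the loop is entered:

long sum_before(const long *v, long n, long a, long b) {
    long s = 0;
    for (long i = 0; i < n; i++) {
        long x = a + b;          /* single def; rhs invariant (a, b defined outside L) */
        s += v[i] + x;
    }
    return s;
}

long sum_after(const long *v, long n, long a, long b) {
    long x = a + b;              /* preheader position: runs once before the loop */
    long s = 0;
    for (long i = 0; i < n; i++)
        s += v[i] + x;
    return s;
}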
Summary of data dependencies • We’ve seen SSA, a way to encode data dependencies better than just def/use chains • makes CSE easier • makes loop invariant detection easier • makes code motion easier • Now we move on to looking at how to encode control dependencies
Control Dependencies • A node (basic block) Y is control-dependent on another node X iff X determines whether Y executes: • there exists a path from X to Y s.t. every node on the path other than X and Y is post-dominated by Y • X is not post-dominated by Y
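A small invented example of how the definition plays out at the source level:

void f(void) {}
void g(void) {}
void h(void) {}

void example(int p, int q) {
    if (p) {          /* block X: the test of p                               */
        f();          /* block Y: there is a path from X to Y (the true edge)
                         and Y does not post-dominate X (the false edge skips
                         it), so Y is control-dependent on X                  */
    }
    g();              /* runs on every path: post-dominates X, hence NOT
                         control-dependent on X                               */
    while (q) {
        h();          /* control-dependent on the loop test, which decides on
                         each iteration whether the body runs again           */
    }
}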
Control Dependence Graph • Control dependence graph: Y is a descendant of X iff Y is control-dependent on X • label each child edge with the required condition • group all children with the same condition under a region node • Program dependence graph: superimpose the dataflow graph (in SSA form or not) on top of the control dependence graph
Summary of Control Dependence Graph • More flexible way of representing control dependencies than the CFG (less constraining) • Makes code motion a local transformation • However, much harder to convert back to an executable form
Course summary so far • Dataflow analysis • flow functions, lattice theoretic framework, optimistic iterative analysis, precision, MOP • Advanced Program Representations • SSA, CDG, PDG • Along the way, several analyses and opts • reaching defns, const prop & folding, available exprs & CSE, liveness & DAE, loop invariant code motion • Next: dealing with procedures
Costs of procedure calls • Up until now, we treated calls conservatively: • make the flow function for call nodes return top • start iterative analysis with incoming edge of the CFG set to top • This leads to less precise results: “lost-precision” cost • Calls also incur a direct runtime cost • cost of call, return, argument & result passing, stack frame maintenance • “direct runtime” cost
Addressing costs of procedure calls • Technique 1: try to get rid of calls, using inlining and other techniques • Technique 2: interprocedural analysis, for calls that are left
Inlining • Replace call with body of callee • Turn parameter- and result-passing into assignments • do copy prop to eliminate copies • Manage variable scoping correctly • rename variables where appropriate
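A hedged before/after sketch of these steps (all names invented):

int add1(int x) { return x + 1; }

int caller_before(int a) {
    return add1(a * 2);
}

int caller_after(int a) {
    int x_1 = a * 2;        /* parameter passing becomes an assignment; the
                               callee's x is renamed to x_1 to keep scoping
                               correct                                       */
    int r_1 = x_1 + 1;      /* inlined body; result passing becomes an
                               assignment                                    */
    return r_1;             /* copy propagation can now remove x_1 and r_1   */
}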
Program representation for inlining • Call graph: • nodes are procedures • edges are calls, labelled by invocation counts/frequency • Hard cases for building the call graph: • calls to/from external routines • calls through pointers, function values, messages • Where in the compiler should inlining be performed?
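A minimal sketch of such a call-graph representation (all type and field names invented):

#include <stddef.h>

typedef struct Procedure Procedure;

typedef struct {
    Procedure *callee;     /* NULL for the hard cases: indirect calls
                              through pointers, or external routines   */
    long       count;      /* invocation count or frequency estimate   */
} CallEdge;

struct Procedure {         /* node: one procedure */
    const char *name;
    CallEdge   *calls;     /* outgoing edges, one per call site */
    size_t      ncalls;
};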
Inlining pros and cons • Pros • eliminate overhead of call/return sequence • eliminate overhead of passing args & returning results • can optimize callee in context of caller and vice versa • Cons • can increase compiled code space requirements • can slow down compilation • recursion? • Virtual inlining: simulate inlining during analysis of caller, but don’t actually perform the inlining
Which calls to inline (discussion) • What affects the decision as to which calls to inline?
Which calls to inline • What affects the decision as to which calls to inline? • size of caller and callee (easy to compute size before inlining, but what about size after inlining?) • frequency of call (static estimates or dynamic profiles) • call sites where callee benefits most from optimization (not clear how to quantify) • programmer annotations (if so, annotate procedure or call site? Also, should the compiler really listen to the programmer?)
Inlining heuristics • Strategy 1: superficial analysis • examine source code of callee to estimate space costs, use this to determine when to inline • doesn’t account for post-inlining optimizations • How can we do better?
Inlining heuristics • Strategy 2: deep analysis • perform inlining • perform post-inlining analysis/optimizations • estimate benefits from opts, and measure code space after opts • undo inlining if costs exceed benefits • better accounts for post-inlining effects • much more expensive in compile-time • How can we do better?
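As a rough sketch of this strategy in code, where every function below is a hypothetical compiler hook rather than a real API:

#include <stdbool.h>

typedef struct CallSite   CallSite;
typedef struct IRSnapshot IRSnapshot;

IRSnapshot *snapshot_ir(CallSite *cs);
void        inline_and_optimize(CallSite *cs);
int         code_growth(CallSite *cs, IRSnapshot *before);   /* bytes        */
int         estimated_benefit(CallSite *cs);                 /* cycles saved */
void        rollback(CallSite *cs, IRSnapshot *before);

bool try_inline(CallSite *cs, int max_cost_per_benefit) {
    IRSnapshot *before = snapshot_ir(cs);
    inline_and_optimize(cs);                 /* really inline, then optimize */
    int cost    = code_growth(cs, before);
    int benefit = estimated_benefit(cs);
    if (benefit <= 0 || cost > benefit * max_cost_per_benefit) {
        rollback(cs, before);                /* costs exceed benefits: undo  */
        return false;
    }
    return true;
}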
Inlining heuristics • Strategy 3: amortized version of 2 [Dean & Chambers 94] • perform strategy 2: an inlining “trial” • record cost/benefit trade-offs in persistent database • reuse previous cost/benefit results for “similar” call sites
Inlining heuristics • Strategy 4: use machine learning techniques • For example, use genetic algorithms to evolve heuristics for inlining • fitness is evaluated on how well the heuristics do on a set of benchmarks • cross-populate and mutate heuristics • Can work surprisingly well to derive various heuristics for compilers
Another way to remove procedure calls
int f(...) {
    if (...)
        return g(...);              /* g is called in tail position */
    ...
    return h(i(....), j(...));      /* h is in tail position; i and j are not */
}
Tail call elimination • Tail call: last thing before return is a call • callee returns, then caller immediately returns • Can splice out one stack frame creation and destruction by jumping to the callee rather than calling it • callee reuses caller’s stack frame & return address • callee will return directly to caller’s caller • effect on debugging?
Tail recursion elimination • If last operation is self-recursive call, what does tail call elimination do?
Tail recursion elimination • If last operation is self-recursive call, what does tail call elimination do? • Transforms recursion into loop: tail recursion elimination • common optimization in compilers for functional languages • required by some language specifications, e.g. Scheme • turns stack space usage from O(n) to O(1)
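A minimal sketch of the transformation (names invented):

/* Tail-recursive: the recursive call is the last thing the
   function does, so the caller's frame is dead at that point. */
long sum_rec(const long *v, long n, long acc) {
    if (n == 0) return acc;
    return sum_rec(v + 1, n - 1, acc + v[0]);    /* self tail call */
}

/* After tail recursion elimination: the call becomes a back edge
   and the parameters become loop-updated variables; stack usage
   drops from O(n) frames to O(1). */
long sum_loop(const long *v, long n, long acc) {
    for (;;) {
        if (n == 0) return acc;
        acc += v[0];
        v   += 1;
        n   -= 1;
    }
}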