Advanced Optimizations in Programming: Loop Invariant Detection with SSA

Recap from last time • We were trying to do Common Subexpression Elimination • Compute expressions that are available at each program point

Recap from last time in FX := Y op Z(in) = in – { X ! * } – { * ! ... X ... } [ { X ! Y op Z | X  Y Æ X  Z} X := Y op Z out in FX := Y(in) = in – { X ! * } – { * ! ... X ... } [ { X ! E | Y ! E 2 in } X := Y out

Example

Problems • z := j * 4 is not optimized to z := x, even though x contains the value j * 4 • m := b + a is not optimized to m := i, even i contains a + b • w := 4 * m it not optimized to w := x, even though x contains the value 4 *m

Problems: more abstractly • Available expressions overly sensitive to name choices, operand orderings, renamings, assignments • Use SSA: distinct values have distinct names • Do copy prop before running available exprs • Adopt canonical form for commutative ops

Example in SSA in FX := Y op Z(in) = X := Y op Z out in0 in1 FX := Y(in0, in1) = X := (Y,Z) out

Example in SSA in X := Y op Z FX := Y op Z(in) = in [ { X ! Y op Z } out in0 in1 FX := Y(in0, in1) = (in0Å in1 ) [ { X ! E | Y ! E 2 in0Æ Z ! E 2 in1 } X := (Y,Z) out

Example in SSA

What about pointers?

What about pointers? • Option 1: don’t use SSA for point-to memory • Option 2: insert copies between SSA vars and real vars

SSA helps us with CSE • Let’s see what else SSA can help us with • Loop-invariant code motion

Loop-invariant code motion • Two steps: analysis and transformations • Step1: find invariant computations in loop • invariant: computes same result each time evaluated • Step 2: move them outside loop • to top if used within loop: code hoisting • to bottom if used after loop: code sinking

Example

Detecting loop invariants • An expression is invariant in a loop L iff: (base cases) • it’s a constant • it’s a variable use, all of whose defs are outside of L (inductive cases) • it’s a pure computation all of whose args are loop-invariant • it’s a variable use with only one reaching def, and the rhs of that def is loop-invariant

Computing loop invariants • Option 1: iterative dataflow analysis • optimistically assume all expressions loop-invariant, and propagate • Option 2: build def/use chains • follow chains to identify and propagate invariant expressions • Option 3: SSA • like option 2, but using SSA instead of ef/use chains

Example using def/use chains • An expression is invariant in a loop L iff: (base cases) • it’s a constant • it’s a variable use, all of whose defs are outside of L (inductive cases) • it’s a pure computation all of whose args are loop-invariant • it’s a variable use with only one reaching def, and the rhs of that def is loop-invariant

Loop invariant detection using SSA • An expression is invariant in a loop L iff: (base cases) • it’s a constant • it’s a variable use, all of whose single defs are outside of L (inductive cases) • it’s a pure computation all of whose args are loop-invariant • it’s a variable use whose single reaching def, and the rhs of that def is loop-invariant •  functions are not pure

Example using SSA • An expression is invariant in a loop L iff: (base cases) • it’s a constant • it’s a variable use, all of whose single defs are outside of L (inductive cases) • it’s a pure computation all of whose args are loop-invariant • it’s a variable use whose single reaching def, and the rhs of that def is loop-invariant •  functions are not pure

Example using SSA and preheader • An expression is invariant in a loop L iff: (base cases) • it’s a constant • it’s a variable use, all of whose single defs are outside of L (inductive cases) • it’s a pure computation all of whose args are loop-invariant • it’s a variable use whose single reaching def, and the rhs of that def is loop-invariant •  functions are not pure

Code motion

Example

Lesson from example: domination restriction

Domination restriction in for loops

Avoiding domination restriction

Another example

Data dependence restriction

Avoiding data restriction

Representing control dependencies • We’ve seen SSA, a way to encode data dependencies better than just def/use chains • makes CSE easier • makes loop invariant detection easier • makes code motion easier • Now we move on to looking at how to encode control dependencies

Control Dependences • A node (basic block) Y is control-dependent on another X iff X determines whether Y executes • there exists a path from X to Y s.t. every node in the path other than X and Y is post-dominated by Y • X is not post-dominated by Y

Example

Control Dependence Graph • Control dependence graph: Y descendent of X iff Y is control dependent on X • label each child edge with required condition • group all children with same condition under region node • Program dependence graph: super-impose dataflow graph (in SSA form or not) on top of the control dependence graph

Example

Another example

Summary so far • Dataflow analysis • flow functions, lattice theoretic framework, optimistic iterative analysis, MOP • Advanced Program Representations • SSA, CDG, PDG, code motion • Next: dealing with procedures

Interprocedural analyses and optimizations

Costs of procedure calls • Up until now, we treated calls conservatively: • make the flow function for call nodes return top • start iterative analysis with incoming edge of the CFG set to top • This leads to less precise results: “lost-precision” cost • Calls also incur a direct runtime cost • cost of call, return, argument & result passing, stack frame maintainance • “direct runtime” cost

Addressing costs of procedure calls • Technique 1: try to get rid of calls, using inlining and other techniques • Technique 2: interprocedural analysis, for calls that are left

Inlining • Replace call with body of callee • Turn parameter- and result-passing into assignments • do copy prop to eliminate copies • Manage variable scoping correctly • rename variables where appropriate

Call graph nodes are procedures edges are calls, labelled by invocation counts/frequency Hard cases for builing call graph calls to/from external routines calls through pointers, function values, messages Where in the compiler should inlining be performed? Program representation for inlining

Inlining pros and cons (discussion)

Inlining pros and cons • Pros • eliminate overhead of call/return sequence • eliminate overhead of passing args & returning results • can optimize callee in context of caller and vice versa • Cons • can increase compiled code space requirements • can slow down compilation • recursion? • Virtual inlining: simulate inlining during analysis of caller, but don’t actually perform the inlining

Advanced Optimizations in Programming: Loop Invariant Detection with SSA

Advanced Optimizations in Programming: Loop Invariant Detection with SSA

Presentation Transcript

From Last Time…

From last time…

From Last time

From Last Time

From last time

Recap from Last Week

Recap from last lesson

From Last Time…

From last time…

Recap of Last Time….

From last time…

Recap from the last lessons

From last time…

Recap from last time -I

From Last Time:

From Last Time…

From Last Time…

From Last Time….

From Last Time…

Recap from last time

From last time…

Recap from last time