Classic Examples of Local and Global Code Optimizations

Explore various methods like constant folding, strength reduction, and common subexpression elimination for optimizing code both locally and globally. Find classic examples to understand these techniques better.

mosleyj

Presentation Transcript


  1. Classic Examples of Local and Global Code Optimizations
     • Local: constant folding, constant combining, strength reduction, constant propagation, common subexpression elimination, backward copy propagation
     • Global: dead code elimination, constant propagation, forward copy propagation, common subexpression elimination, code motion, loop strength reduction, induction variable elimination

  2. Local: Constant Folding
     • Goal: eliminate unnecessary operations
     • Rules:
       • X is an arithmetic operation
       • If src1(X) and src2(X) are constant, then change X by applying the operation
     • Example:
         r7 = 4 + 1        src1(X) = 4, src2(X) = 1; folds to r7 = 5
         r5 = 2 * r4
         r6 = r5 * 2
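
The folding rule above can be sketched in a few lines of Python. The tuple encoding `(dest, op, src1, src2)`, with ints as literals and strings as registers, is an assumption made for illustration and not part of the slides; `fold_constants` is a hypothetical helper name.

```python
import operator

# Hypothetical IR: (dest, op, src1, src2); ints are literals, strings are registers.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def fold_constants(block):
    """Replace arithmetic on two literal sources with a move of the result."""
    out = []
    for dest, op, a, b in block:
        if op in OPS and isinstance(a, int) and isinstance(b, int):
            out.append((dest, "mov", OPS[op](a, b), None))
        else:
            out.append((dest, op, a, b))
    return out

# The block from the slide: only r7 = 4 + 1 has two literal sources.
block = [("r7", "+", 4, 1), ("r5", "*", 2, "r4"), ("r6", "*", "r5", 2)]
folded = fold_constants(block)
```

Only the first operation changes; the other two still read a register and are left alone.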

  3. Local: Constant Combining
     • Goal: eliminate unnecessary operations
       • First operation often becomes dead after constant combining
     • Rules:
       • Operations X and Y in same basic block
       • X and Y have at least one literal src
       • Y uses dest(X)
       • None of the srcs of X have defs between X and Y (excluding Y)
     • Example:
         r7 = 5
         r5 = 2 * r4
         r6 = r5 * 2       →  r6 = r4 * 4
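
A minimal sketch of constant combining, assuming a hypothetical tuple IR `(dest, op, src1, src2)` and restricting the rewrite to the case where X and Y share an associative opcode (`+` or `*`). The slides do not spell out which opcode pairs combine, so that restriction is an assumption.

```python
import operator

# Hypothetical IR: (dest, op, src1, src2); ints are literals, strings are registers.
COMBINE = {"+": operator.add, "*": operator.mul}

def split(a, b):
    """Return (register, literal) if exactly one source is a literal, else None."""
    if isinstance(a, str) and isinstance(b, int):
        return a, b
    if isinstance(a, int) and isinstance(b, str):
        return b, a
    return None

def combine_constants(block):
    out = list(block)
    for j, (dy, opy, ay, by) in enumerate(out):
        y = split(ay, by) if opy in COMBINE else None
        if y is None:
            continue
        reg_y, lit_y = y
        # Walk backward to the nearest definition of the register Y reads.
        for i in range(j - 1, -1, -1):
            dx, opx, ax, bx = out[i]
            if dx != reg_y:
                continue
            x = split(ax, bx) if opx == opy else None
            if x is not None:
                reg_x, lit_x = x
                # The register source of X must not be redefined from X up to Y
                # (X itself is included, which is slightly conservative).
                if all(out[k][0] != reg_x for k in range(i, j)):
                    out[j] = (dy, opy, reg_x, COMBINE[opy](lit_x, lit_y))
            break
    return out

block = [("r7", "mov", 5, None), ("r5", "*", 2, "r4"), ("r6", "*", "r5", 2)]
combined = combine_constants(block)
```

After the rewrite `r5 = 2 * r4` is untouched but may now be dead, which is the situation the slide's second bullet describes.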

  4. Local: Strength Reduction
     • Goal: replace expensive operations with cheaper ones
     • Rules (common):
       • X is a multiplication operation where src1(X) or src2(X) is a constant integer literal of the form 2^k
       • Change X by using a shift operation
       • For k = 1, can use add
     • Example:
         r7 = 5
         r5 = 2 * r4       →  r5 = r4 + r4
         r6 = r4 * 4       →  r6 = r4 << 2
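
The power-of-two rule can be sketched as follows. The tuple IR `(dest, op, src1, src2)` and the name `reduce_strength` are assumptions for illustration; the sketch handles only multiplication by a literal 2^k with k ≥ 1.

```python
# Hypothetical IR: (dest, op, src1, src2); ints are literals, strings are registers.
def reduce_strength(block):
    """Rewrite reg * 2^k as an add (k = 1) or a left shift (k > 1)."""
    out = []
    for dest, op, a, b in block:
        # Put the operands in (register, literal) order when b is not a literal.
        reg, lit = (a, b) if isinstance(b, int) else (b, a)
        is_pow2 = (op == "*" and isinstance(reg, str) and isinstance(lit, int)
                   and lit >= 2 and (lit & (lit - 1)) == 0)
        if not is_pow2:
            out.append((dest, op, a, b))
        elif lit == 2:
            out.append((dest, "+", reg, reg))
        else:
            out.append((dest, "<<", reg, lit.bit_length() - 1))
    return out

block = [("r7", "mov", 5, None), ("r5", "*", 2, "r4"), ("r6", "*", "r4", 4)]
reduced = reduce_strength(block)
```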

  5. Local: Constant Propagation
     • Goal: replace register uses with literals (constants) in a single basic block
     • Rules:
       • Operation X is a move to register with src1(X) literal
       • Operation Y uses dest(X)
       • There is no def of dest(X) between X and Y (excluding defs at X and Y)
       • Replace dest(X) in Y with src1(X)
     • Example (original operation → after propagation):
         r1 = 5
         r2 = _x
         r3 = 7
         r4 = r4 + r1      →  r4 = r4 + 5
         r1 = r1 + r2      →  r1 = 5 + _x
         r1 = r1 + 1       →  r1 = 5 + _x + 1
         r3 = 12
         r8 = r1 - r2      →  r8 = 5 + _x + 1 - _x
         r9 = r3 + r5      →  r9 = 12 + r5
         r3 = r2 + 1       →  r3 = _x + 1
         r7 = r3 - r1      →  r7 = _x + 1 - 5 - _x - 1
         M[r7] = 0
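
A minimal sketch of the local rule, assuming a hypothetical tuple IR `(dest, op, src1, src2)` in which only Python ints count as literals. Unlike the slide's example it does not treat the label `_x` as a literal or build compound expressions; it only substitutes integer constants.

```python
# Hypothetical IR: (dest, op, src1, src2); ints are literals, strings are registers.
def propagate_constants(block):
    known = {}   # register -> literal it currently holds
    out = []
    for dest, op, a, b in block:
        # Substitute sources first: the operation reads before it writes.
        a, b = known.get(a, a), known.get(b, b)
        out.append((dest, op, a, b))
        if op == "mov" and isinstance(a, int):
            known[dest] = a          # X: move of a literal
        else:
            known.pop(dest, None)    # any other def ends the propagation
    return out

block = [
    ("r1", "mov", 5, None),
    ("r3", "mov", 7, None),
    ("r4", "+", "r4", "r1"),
    ("r3", "mov", 12, None),
    ("r9", "+", "r3", "r5"),
    ("r1", "+", "r1", "r2"),
    ("r8", "-", "r1", "r2"),
]
propagated = propagate_constants(block)
```

The last operation keeps `r1` because the preceding non-move definition of `r1` removes it from `known`.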

  6. Local: Common Subexpression Elimination (CSE)
     • Goal: eliminate re-computations of an expression
       • More efficient code
       • Resulting moves can get copy propagated (see later)
     • Rules:
       • Operations X and Y have the same opcode and Y follows X
       • src(X) = src(Y) for all srcs
       • For all srcs, no def of a src between X and Y (excluding Y)
       • No def of dest(X) between X and Y (excluding X and Y)
       • Replace Y with move dest(Y) = dest(X)
     • Example:
         r1 = r2 + r3
         r4 = r4 + 1
         r1 = 6
         r6 = r2 + r3
         r2 = r1 - 1
         r5 = r4 + 1
         r7 = r2 + r3
         r5 = r1 - 1       →  r5 = r2
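
The rules above amount to tracking which expressions are still available in the block. The sketch below assumes a hypothetical tuple IR `(dest, op, src1, src2)` and matches sources positionally, so it does not exploit commutativity.

```python
# Hypothetical IR: (dest, op, src1, src2); ints are literals, strings are registers.
def local_cse(block):
    available = {}   # (op, src1, src2) -> register that still holds the value
    out = []
    for dest, op, a, b in block:
        key = (op, a, b)
        if op != "mov" and key in available:
            out.append((dest, "mov", available[key], None))
        else:
            out.append((dest, op, a, b))
        # dest is redefined: drop expressions that read it or are held in it.
        available = {k: r for k, r in available.items()
                     if r != dest and dest not in k[1:]}
        # Record the expression unless the operation overwrote its own source.
        if op != "mov" and dest not in (a, b) and key not in available:
            available[key] = dest
    return out

block = [
    ("r1", "+", "r2", "r3"),
    ("r4", "+", "r4", 1),
    ("r1", "mov", 6, None),
    ("r6", "+", "r2", "r3"),
    ("r2", "-", "r1", 1),
    ("r5", "+", "r4", 1),
    ("r7", "+", "r2", "r3"),
    ("r5", "-", "r1", 1),
]
optimized = local_cse(block)
```

On the slide's block only the final operation is rewritten: every other repeated expression has a source or its holding register redefined in between.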

  7. Local: Backward Copy Propagation
     • Goal: propagate LHS of moves backward
       • Eliminates useless moves
     • Rules (dataflow required):
       • X and Y in same block
       • Y is a move to register
       • dest(X) is a register that is not live out of the block
       • Y uses dest(X)
       • dest(Y) not used or defined between X and Y (excluding X and Y)
       • No uses of dest(X) after the first redef of dest(Y)
       • Replace dest(X) (which is src(Y)) with dest(Y) in X and on the path from X to Y, and remove Y
     • Example (r6 not live out of the block):
         r1 = r8 + r9
         r2 = r9 + r1
         r4 = r2
         r6 = r2 + 1       →  r7 = r2 + 1
         r9 = r1
         r7 = r6           →  removed
         r5 = r7 + 1
         r4 = 0
         r8 = r2 + r7
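
A conservative sketch of backward copy propagation, assuming a hypothetical tuple IR `(dest, op, src1, src2)` and a caller-supplied `live_out` set (the slides say dataflow is required but give no liveness data, so the set used here is an assumption). It is stricter than the slide's rules: it refuses the rewrite whenever dest(X) is read anywhere after Y before being redefined.

```python
# Hypothetical IR: (dest, op, src1, src2); ints are literals, strings are registers.
def reads(ins, reg):
    """True if the instruction uses reg as a source operand."""
    return reg in ins[2:]

def _safe(out, i, j):
    dy, sy = out[j][0], out[j][2]
    # dest(Y) must be neither read nor written strictly between X and Y.
    if any(ins[0] == dy or reads(ins, dy) for ins in out[i + 1:j]):
        return False
    # Conservative: dest(X) must not be read after Y before it is redefined.
    for ins in out[j + 1:]:
        if reads(ins, sy):
            return False
        if ins[0] == sy:
            break
    return True

def backward_copy_propagate(block, live_out):
    out = list(block)
    j = 0
    while j < len(out):
        dy, opy, sy, _ = out[j]
        i = None
        if opy == "mov" and isinstance(sy, str) and sy not in live_out:
            # X is the nearest earlier definition of the register Y copies.
            i = next((k for k in range(j - 1, -1, -1) if out[k][0] == sy), None)
        if i is not None and _safe(out, i, j):
            _, opx, ax, bx = out[i]
            out[i] = (dy, opx, ax, bx)            # X now writes dest(Y)
            for k in range(i + 1, j):             # rename uses between X and Y
                d, op, a, b = out[k]
                out[k] = (d, op, dy if a == sy else a, dy if b == sy else b)
            del out[j]                            # Y is now redundant
        else:
            j += 1
    return out

block = [
    ("r1", "+", "r8", "r9"), ("r2", "+", "r9", "r1"), ("r4", "mov", "r2", None),
    ("r6", "+", "r2", 1), ("r9", "mov", "r1", None), ("r7", "mov", "r6", None),
    ("r5", "+", "r7", 1), ("r4", "mov", 0, None), ("r8", "+", "r2", "r7"),
]
# Assumed liveness: every register except r6 is live out of the block.
live_out = {"r1", "r2", "r4", "r5", "r7", "r8", "r9"}
result = backward_copy_propagate(block, live_out)
```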

  8. Global: Dead Code Elimination
     • Goal: eliminate any operation whose result is never used
     • Rules (dataflow required):
       • X is an operation with no use in def-use (DU) chain, i.e. dest(X) is not live
       • Delete X if removable (not a mem store or branch)
     • Rules too simple!
       • Misses deletion of r4, even after deleting r7, since r4 is live in loop
       • Better is to trace UD chains backwards from “critical” operations
     • Example (one basic block per line):
         r1 = 3; r2 = 10
         r4 = r4 + 1; r7 = r1 * r4        (r7 not live)
         r3 = r3 + 1
         r2 = 0
         r3 = r2 + r1
         M[r1] = r3
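
The "trace backwards from critical operations" idea can be sketched as a mark-and-sweep pass. This version assumes a hypothetical tuple IR `(dest, op, src1, src2)` where a store `M[src1] = src2` has `dest = None`, and it is flow-insensitive: every definition of a register is treated as reaching every use. That is coarser than real UD chains but still conservative.

```python
# Hypothetical IR: (dest, op, src1, src2); a store M[src1] = src2 has dest None.
CRITICAL = {"store", "branch"}

def mark_sweep(instrs):
    """Keep only operations that some critical operation transitively depends on."""
    defs = {}
    for idx, ins in enumerate(instrs):
        if ins[0] is not None:
            defs.setdefault(ins[0], []).append(idx)
    marked = set()
    work = [i for i, ins in enumerate(instrs) if ins[1] in CRITICAL]
    while work:
        i = work.pop()
        if i in marked:
            continue
        marked.add(i)
        for src in instrs[i][2:]:
            if isinstance(src, str):
                work.extend(defs.get(src, []))   # follow the use back to its defs
    return [ins for i, ins in enumerate(instrs) if i in marked]

program = [
    ("r1", "mov", 3, None), ("r2", "mov", 10, None),
    ("r4", "+", "r4", 1), ("r7", "*", "r1", "r4"),
    ("r3", "+", "r3", 1), ("r2", "mov", 0, None),
    ("r3", "+", "r2", "r1"), (None, "store", "r1", "r3"),
]
kept = mark_sweep(program)
```

Because nothing critical depends on `r7` or `r4`, both operations are dropped together, including the self-referencing `r4 = r4 + 1` that a pure liveness test keeps alive.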

  9. Global: Constant Propagation
     • Goal: globally replace register uses with literals
     • Rules (dataflow required):
       • X is a move to a register with src1(X) literal
       • Y uses dest(X)
       • dest(X) has only one def at X for use-def (UD) chains to Y
       • Replace dest(X) in Y with src1(X)
     • Example (one basic block per line; original → after propagation):
         r1 = 4; r2 = 10
         r5 = 2; r7 = r1 * r5             →  r7 = 8
         r3 = r3 + r5                     →  r3 = r3 + 2
         r2 = 0
         r3 = r2 + r1; r6 = r7 * r4       →  r3 = r2 + 4; r6 = 8 * r4
         M[r1] = r3                       →  M[4] = r3
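
A deliberately simplified sketch of the global rule. Instead of real UD chains it assumes a register is propagatable only if it has exactly one definition in the whole function and that definition is a literal move; it also assumes every use is reached by that definition. The tuple IR `(dest, op, src1, src2)` is hypothetical, and folding is interleaved so that `r7 = 4 * 2` becomes a literal move that can itself be propagated.

```python
import operator
from collections import Counter

# Hypothetical IR: (dest, op, src1, src2); a store M[src1] = src2 has dest None.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def propagate_globally(instrs):
    instrs = list(instrs)
    changed = True
    while changed:
        changed = False
        def_count = Counter(ins[0] for ins in instrs if ins[0] is not None)
        # Registers whose single definition is a move of a literal.
        consts = {d: a for d, op, a, _ in instrs
                  if op == "mov" and isinstance(a, int) and def_count[d] == 1}
        for i, (d, op, a, b) in enumerate(instrs):
            a, b = consts.get(a, a), consts.get(b, b)
            if op in OPS and isinstance(a, int) and isinstance(b, int):
                new = (d, "mov", OPS[op](a, b), None)   # fold after propagating
            else:
                new = (d, op, a, b)
            if new != instrs[i]:
                instrs[i] = new
                changed = True
    return instrs

program = [
    ("r1", "mov", 4, None), ("r2", "mov", 10, None),
    ("r5", "mov", 2, None), ("r7", "*", "r1", "r5"),
    ("r3", "+", "r3", "r5"), ("r2", "mov", 0, None),
    ("r3", "+", "r2", "r1"), ("r6", "*", "r7", "r4"),
    (None, "store", "r1", "r3"),
]
result = propagate_globally(program)
```

`r2` has two definitions, so its uses are left alone, which matches the slide.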

  10. Global: Forward Copy Propagation
     • Goal: globally propagate RHS of moves forward
       • Reduces dependence chain
       • May be possible to eliminate moves
     • Rules (dataflow required):
       • X is a move with src1(X) register
       • Y uses dest(X)
       • dest(X) has only one def at X for UD chains to Y
       • src1(X) has no def on any path from X to Y
       • Replace dest(X) in Y with src1(X)
     • Example (one basic block per line; original → after propagation):
         r1 = r2; r3 = r4
         r6 = r3 + 1                      →  r6 = r4 + 1
         r2 = 0
         r5 = r2 + r3                     →  r5 = r2 + r4

  11. Global: Common Subexpression Elimination (CSE)
     • Goal: eliminate recomputations of an expression
     • Rules:
       • X and Y have the same opcode and X dominates Y
       • src(X) = src(Y) for all srcs
       • For all srcs, no def of a src on any path between X and Y (excluding Y)
       • Insert rx = dest(X) immediately after X for new register rx
       • Replace Y with move dest(Y) = rx
     • Example (one basic block per line):
         r1 = r2 * r6
         r3 = r4 / r7                     (insert r10 = r3 immediately after)
         r2 = r2 + 1; r3 = r3 + 1
         r1 = r3 * 7
         r5 = r2 * r6; r8 = r4 / r7       →  r8 = r10
         r9 = r3 * 7

  12. Global: Code Motion
     • Goal: move loop-invariant computations to preheader
     • Rules:
       • Operation X in block that dominates all exit blocks
       • X is the only operation to modify dest(X) in loop body
       • All srcs of X have no defs in any of the basic blocks in the loop body
       • Move X to end of preheader
       • Note 1: if one src of X is a memory load, need to check for stores in loop body
       • Note 2: X must be movable and not cause exceptions
     • Example (one basic block per line):
         preheader: r1 = 0                (r4 = M[r5] moved here)
         header:    r4 = M[r5]; r7 = r4 * 3
         r8 = r2 + 1; r7 = r8 * r4
         r3 = r2 + 1
         r1 = r1 + r7
         M[r1] = r3

  13. Global: Loop Strength Reduction
     • Replace expensive computations with induction variables
     • Before:
         B1: i := 0
             t1 := n-2
         B2: t2 := 4*i
             A[t2] := 0
             i := i+1
         B3: if i < t1 goto B2
     • After:
         B1: i := 0
             t1 := n-2
             t2 := 4*i
         B2: A[t2] := 0
             i := i+1
             t2 := t2+4
         B3: if i < t1 goto B2
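
To check that the two versions of the loop agree, the blocks can be transcribed into Python. The function names, the list standing in for the byte-addressed array `A`, and the test size `n = 6` are assumptions for illustration; the `while True ... break` shape mirrors the test-at-bottom structure of B2/B3.

```python
def zero_original(a, n):
    """Before: the multiply 4*i is recomputed on every iteration."""
    i = 0
    t1 = n - 2
    while True:
        t2 = 4 * i
        a[t2] = 0
        i = i + 1
        if not (i < t1):
            break
    return a

def zero_reduced(a, n):
    """After: t2 is initialised once in B1 and advanced by 4 per iteration."""
    i = 0
    t1 = n - 2
    t2 = 4 * i
    while True:
        a[t2] = 0
        i = i + 1
        t2 = t2 + 4
        if not (i < t1):
            break
    return a

n = 6
before = zero_original([1] * (4 * n), n)
after = zero_reduced([1] * (4 * n), n)
```

Both functions write the same offsets (0, 4, 8, 12 for `n = 6`), but the second needs only an addition inside the loop.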

  14. Global: Induction Variable Elimination
     • Replace induction variable in expressions with another
     • Before:
         B1: i := 0
             t1 := n-2
             t2 := 4*i
         B2: A[t2] := 0
             i := i+1
             t2 := t2+4
         B3: if i<t1 goto B2
     • After:
         B1: t1 := 4*n
             t1 := t1-8
             t2 := 4*i
         B2: A[t2] := 0
             t2 := t2+4
         B3: if t2<t1 goto B2
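
The same kind of transcription shows why the exit test can be rewritten: since t2 = 4*i throughout the loop, `i < n-2` holds exactly when `t2 < 4*n - 8`. The function names, the list standing in for `A`, and `n = 6` are assumptions; in the second function `t2` starts at the literal 0, which is the value of `4*i` for the initial `i = 0`.

```python
def zero_with_i(a, n):
    """Before: both i and t2 are maintained; the exit test uses i."""
    i = 0
    t1 = n - 2
    t2 = 4 * i
    while True:
        a[t2] = 0
        i = i + 1
        t2 = t2 + 4
        if not (i < t1):
            break
    return a

def zero_without_i(a, n):
    """After: i is gone; the bound is scaled so the test can use t2."""
    t1 = 4 * n
    t1 = t1 - 8
    t2 = 0              # value of 4*i for the initial i = 0
    while True:
        a[t2] = 0
        t2 = t2 + 4
        if not (t2 < t1):
            break
    return a

n = 6
with_i = zero_with_i([1] * (4 * n), n)
without_i = zero_without_i([1] * (4 * n), n)
```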
