CS412/413 Introduction to Compilers and Translators April 12, 1999 Lecture 28: Loops and check removal
Administration
• Prelim 2 in 4 days
• PA 4 due April 28
• Reading: Appel 17, Muchnick 7.1-7.4, 14.1, 17.4.3
CS 412/413 Introduction to Compilers and Translators -- Spring '99 Andrew Myers

Loops
• Most execution time in most programs is spent in loops: 90/10 is typical
• Loops are the most important targets of optimization
• Loop optimizations:
  • loop-invariant code motion
  • loop unrolling
  • loop peeling
  • strength reduction of expressions containing induction variables
  • removal of bounds checks
• When to apply loop optimizations?

High-level optimization?
• Loops may be hard to recognize in IR or quadruple form: should we apply loop optimizations to source code or high-level IR?
• Many kinds of loop constructs: while, do/while, continue
• Loop optimizations benefit from other IR-level optimizations and vice versa, so we want to be able to interleave them
• Problem: identifying loops in the control-flow graph

Definition of a loop
• A loop is a set of nodes in the control-flow graph, with one distinguished node called the header.
• Every node is reachable from the header, and the header is reachable from every node: a strongly connected component
• No entering edges from outside except to the header
• Nodes with outgoing edges: loop exit nodes
[Figure: control-flow graph of a loop, with the header and a loop exit labeled]

Nested loops
• Control-flow graph may contain many loops, and loops may contain each other
• Control-flow analysis: identify the loops and nesting structure
[Figure: control-flow graph with an inner loop labeled]

Dominators
• The notion of a dominator will help us do control-flow analysis (CFA)
• Node A dominates node B if the only way to reach B from the start node is through A
• An edge in the control-flow graph is a back edge if its destination dominates its source
• A loop contains at least one back edge
[Figure: control-flow graph with nodes 1-5 and one back edge labeled]

Dominator tree
• Domination is transitive: if A dominates B and B dominates C, then A dominates C.
• A immediately dominates B if A dominates B and the domination is not implied transitively
• Every control-flow graph has a dominator tree
[Figure: a control-flow graph with nodes 1-10 and its dominator tree]

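As a rough illustration of the dominator tree, the sketch below derives each node's immediate dominator from dominator sets that are assumed to be already computed. The six-node example and the name `immediate_dominators` are hypothetical and are not taken from the lecture's figure; the sketch also assumes every node is reachable from the start node.

```python
def immediate_dominators(dom, start):
    """Map each node other than start to its immediate dominator.

    dom[n] is the set of all dominators of n (including n itself).
    Assumes every node is reachable from start.
    """
    idom = {}
    for n, dominators_of_n in dom.items():
        if n == start:
            continue  # the start node is the root of the dominator tree
        strict = dominators_of_n - {n}
        # The strict dominators of n form a chain under domination, so the
        # closest one is the one with the largest dominator set of its own.
        idom[n] = max(strict, key=lambda d: len(dom[d]))
    return idom


# Hypothetical dominator sets for a CFG with edges
# 1->2, 2->3, 2->4, 3->5, 4->5, 5->2, 5->6.
dom = {
    1: {1}, 2: {1, 2}, 3: {1, 2, 3}, 4: {1, 2, 4},
    5: {1, 2, 5}, 6: {1, 2, 5, 6},
}
idom = immediate_dominators(dom, 1)
print(idom)
```

Each entry `idom[n]` is the parent of `n` in the dominator tree, so the tree can be read off directly from the returned dictionary.
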
Finding dominators
• Goal: for every node in the control-flow graph, find its set of dominators
• Properties of dominators:
  1. Every node dominates itself
  2. A node B is dominated by another node A if A dominates all of the predecessors of B

Dominator data-flow analysis
• Forward analysis
• A node B is dominated by another node A if A dominates all of the predecessors of B: $\mathit{in}[n] = \bigcap_{n' \in \mathit{pred}[n]} \mathit{out}[n']$
• Every node dominates itself: $\mathit{out}[n] = \mathit{in}[n] \cup \{n\}$
• Formally: $L$ = sets of nodes ordered by $\subseteq$, flow functions $F_n(x) = x \cup \{n\}$, $\top = \{\text{all } n\}$
• Standard iterative analysis works fine

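The iterative analysis can be sketched in a few lines. This is a minimal version, assuming the CFG is given as a successor map in which every node appears as a key; the example graph and the names `dominators` and `cfg` are hypothetical.

```python
def dominators(succ, start):
    """Iteratively compute out[n], the set of dominators of each node n.

    succ maps every node to the list of its successors; start is the entry.
    """
    nodes = set(succ)
    pred = {n: set() for n in nodes}
    for n, targets in succ.items():
        for t in targets:
            pred[t].add(n)

    # Start from the top element (all nodes) everywhere except the entry,
    # whose only dominator is itself.
    out = {n: set(nodes) for n in nodes}
    out[start] = {start}

    changed = True
    while changed:
        changed = False
        for n in nodes - {start}:
            # in[n] = intersection of out[p] over all predecessors p of n
            pred_outs = [out[p] for p in pred[n]]
            in_n = set.intersection(*pred_outs) if pred_outs else set()
            new_out = in_n | {n}  # every node dominates itself
            if new_out != out[n]:
                out[n] = new_out
                changed = True
    return out


# Hypothetical CFG: a diamond (2 -> 3/4 -> 5) inside a loop 5 -> 2.
cfg = {1: [2], 2: [3, 4], 3: [5], 4: [5], 5: [2, 6], 6: []}
dom = dominators(cfg, 1)
```

Because the sets only shrink from the top element, the loop terminates, and the order in which nodes are visited affects only how many passes are needed, not the result.
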
Completing control-flow analysis
• Each back edge n → h has an associated natural loop with h as its header: all nodes reachable from h that reach n without going through h
• For each back edge, find its natural loop
• Nest loops based on the subset relationship between natural loops
• Exception: natural loops may share the same header; merge them into a larger loop.
[Figure: control-flow graph with nodes 1-10]

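The steps above can be sketched as follows, assuming dominator sets are already available. The CFG, the dominator sets, and the function names are hypothetical; loops that share a header are merged by collecting them under the same dictionary key.

```python
def natural_loop(pred, n, h):
    """Natural loop of back edge n -> h: h plus every node that can
    reach n without passing through h."""
    loop = {h, n}
    worklist = [n] if n != h else []
    while worklist:
        m = worklist.pop()
        for p in pred[m]:
            if p not in loop:  # h is already in the set, so we never walk past it
                loop.add(p)
                worklist.append(p)
    return loop


def find_loops(succ, dom):
    """Return {header: set of loop nodes}, merging loops with the same header."""
    pred = {n: [] for n in succ}
    for n, targets in succ.items():
        for t in targets:
            pred[t].append(n)

    loops = {}
    for n, targets in succ.items():
        for h in targets:
            if h in dom[n]:  # n -> h is a back edge: destination dominates source
                loops.setdefault(h, set()).update(natural_loop(pred, n, h))
    return loops


# Hypothetical CFG with an inner loop (4 -> 3) nested in an outer loop (5 -> 2).
cfg = {1: [2], 2: [3], 3: [4], 4: [3, 5], 5: [2, 6], 6: []}
dom = {
    1: {1}, 2: {1, 2}, 3: {1, 2, 3}, 4: {1, 2, 3, 4},
    5: {1, 2, 3, 4, 5}, 6: {1, 2, 3, 4, 5, 6},
}
loops = find_loops(cfg, dom)
```

Nesting then follows from set inclusion: here the loop headed by node 3 is a proper subset of the loop headed by node 2, so it is the inner loop.
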
Loop-invariant hoisting
• Idea: move computations that always give the same result out of the loop: only compute once!
• Hoisting quadruple q: t = a + b. Use reaching definitions analysis to see (conservatively) whether a and b are constant within the loop
• Must also ensure that q is guaranteed to be executed by the loop, that q is the only definition of t, and that t is not live-in at h
[Figure: quadruple q before and after being hoisted above the loop header h]

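A minimal sketch of the invariance test alone, assuming a straight-line loop body given as hypothetical `(dest, op, arg1, arg2)` tuples in which integers are constants and strings are variables. It stands in for the reaching-definitions check and does not verify the other safety conditions listed above (guaranteed execution, single definition of t, t not live-in at h).

```python
def invariant_quads(body):
    """Return the indices of quadruples in a straight-line loop body whose
    operands cannot change from one iteration to the next."""
    # Record where each variable is defined inside the loop.
    defs = {}
    for idx, (dest, _op, _a, _b) in enumerate(body):
        defs.setdefault(dest, []).append(idx)

    invariant = set()

    def operand_ok(x, use_idx):
        if isinstance(x, int):
            return True  # a literal constant
        inside = defs.get(x, [])
        if not inside:
            return True  # defined only outside the loop
        # Otherwise: exactly one definition in the loop, already known to be
        # invariant, and placed before this use.
        return (len(inside) == 1
                and inside[0] in invariant
                and inside[0] < use_idx)

    changed = True
    while changed:
        changed = False
        for idx, (_dest, _op, a, b) in enumerate(body):
            if idx not in invariant and operand_ok(a, idx) and operand_ok(b, idx):
                invariant.add(idx)
                changed = True
    return invariant


# Hypothetical loop body: t and u are invariant, i and v are not.
body = [
    ("t", "+", "a", "b"),
    ("u", "*", "t", 4),
    ("i", "+", "i", 1),
    ("v", "+", "u", "i"),
]
hoistable = invariant_quads(body)
```

The fixed-point loop is what lets `u = t * 4` be recognized as invariant once `t = a + b` has been.
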
Induction variables
• Induction variables are variables with value a*i + b on the i-th iteration of a natural loop
• Various optimizations can exploit information about induction variables:
  • strength reduction
  • array bounds check elimination
  • loop unrolling

Identifying induction variables
• Basic induction variables: only one definition, of the form i = i + K
• Derived induction variables: one definition, of the form j = i * M + N

    j = 3;
    for (i = 0; i < n; i++) {
        j = j + 1;
        k = i*4 + 8;
        m = k*12 + 1;
        …
    }

Strength reduction
• Every derived induction variable k can be written as a*i + b, where a and b are constants and i is some basic induction variable
• For all distinct (a,b) pairs:
  • insert before the loop header: k' = b (the value of a*i + b on entry when i starts at 0)
  • insert inside the loop, right after the update i = i + 1: k' = k' + a
• Replace the definition of any k whose formula is a*i + b with k = k'
• Result: multiplication(s) replaced by a single addition
• Additional optimizations facilitated: copy/constant propagation, dead/useless variable elimination, dead code elimination

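A small before/after sketch of the transformation, using the derived induction variables k = i*4 + 8 and m = k*12 + 1 = 48*i + 97 from the earlier example. The function names and the use of Python are illustrative assumptions; a compiler would perform this rewrite on quadruples.

```python
def original(n):
    """Loop with multiplications on every iteration."""
    results = []
    for i in range(n):
        k = i * 4 + 8
        m = k * 12 + 1
        results.append((k, m))
    return results


def reduced(n):
    """Same loop after strength reduction: additions only."""
    results = []
    k_prime = 8    # k' = b for k = 4*i + 8, with i starting at 0
    m_prime = 97   # m' = b for m = 48*i + 97
    for i in range(n):
        k = k_prime  # definition of k replaced by a copy
        m = m_prime  # definition of m replaced by a copy
        results.append((k, m))
        # Updates placed alongside i = i + 1:
        k_prime += 4
        m_prime += 48
    return results
```

After this rewrite, copy propagation can replace uses of `k` and `m` with `k_prime` and `m_prime`, which is one of the follow-on optimizations the slide mentions.
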
Loop unrolling
• Loop unrolling: creates K copies of the loop in sequence
[Figure: loop with header h and its unrolled form for K = 2, captioned "Useless unrolling"]

Using induction variables
• When the loop test expression depends on an induction variable (e.g. i < n), one loop test (i+K-1 < n) can ensure that every test in the entire unrolled body would succeed: remove all interior loop tests
• An additional loop is needed to "finish up" the remaining 0..K-1 iterations
[Figure: loop with header h before and after unrolling, captioned "Useful unrolling"]

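A minimal sketch of useful unrolling with K = 4, written as ordinary Python rather than compiler IR. The summation loop and the function names are assumptions chosen for illustration.

```python
K = 4  # unrolling factor


def sum_rolled(xs):
    """Original loop: one test per element."""
    total = 0
    i = 0
    n = len(xs)
    while i < n:
        total += xs[i]
        i += 1
    return total


def sum_unrolled(xs):
    """Unrolled loop: one test per K elements, plus a clean-up loop."""
    total = 0
    i = 0
    n = len(xs)
    # A single test i + K - 1 < n guarantees that all K copies are in range,
    # so the interior tests are gone.
    while i + K - 1 < n:
        total += xs[i]
        total += xs[i + 1]
        total += xs[i + 2]
        total += xs[i + 3]
        i += K
    # Finish up the remaining 0..K-1 iterations.
    while i < n:
        total += xs[i]
        i += 1
    return total
```

The clean-up loop is what keeps the two versions equivalent when the trip count is not a multiple of K.
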
Array bounds checks
• Iota+: on every expression a[i], must ensure i < length a and i ≥ 0 (together: i <u length a, an unsigned comparison)
• Checking array bounds is expensive: it adds a conditional jump to every array access expression
• Array indices are often induction variables, so induction variable information can be used to avoid the bounds check entirely

Bounds checks
• Can eliminate the bounds check if we can prove at compile time that it will always succeed

    i = 0;
    while (i < length a) {
        a[i] = b[i];
        i++;
    }

Rules
• Given a reference a[k] where k is an induction variable with value a*i + b, must find a conditional test on some induction variable j such that:
  • the test terminates the loop
  • the test dominates the reference to a[k]
  • the test is against some loop invariant such that provably k <u length a
• When to perform the optimization?
  • AST? Need domination analysis, and other optimizations have not been done yet.
  • Quadruples? Hard to recognize array element and length expressions reliably.

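To make the rules concrete, here is a deliberately narrow sketch that handles only indices of the form i + b. It assumes a loop shaped like `i = i_init; while (i < length(arr) + c) { ... arr[i + b] ... ; i = i + step }` with a positive constant step, where the test dominates the reference and i is updated nowhere else. The function name and its parameters are hypothetical, and overflow is ignored.

```python
def bounds_check_removable(i_init, b, c):
    """Return True if arr[i + b] is provably in bounds for the loop

        i = i_init
        while i < length(arr) + c:
            ... arr[i + b] ...
            i = i + step        # step is a positive constant

    The answer is conservative: False only means "not proven".
    """
    # i never decreases, so i + b >= i_init + b on every iteration.
    lower_ok = i_init + b >= 0
    # The dominating test gives i <= length + c - 1, hence
    # i + b <= length + (b + c) - 1 < length whenever b + c <= 0.
    upper_ok = b + c <= 0
    return lower_ok and upper_ok


# The loop "i = 0; while (i < length a) { a[i] = ...; i++; }":
print(bounds_check_removable(i_init=0, b=0, c=0))
```

Under these assumptions the check on `a[i]` in the copy loop can be removed. The check on `b[i]` cannot be removed by this argument alone, because the loop test compares against the length of `a`, not of `b`.
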
Null checks
• Another costly operation: checking for null pointers
• Java, Iota+: needed on every
  • field access or assignment (except on this)
  • method invocation (except on this)
  • array element access
  • string operation
• Idea: once we've checked for null, we shouldn't need to check again

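The idea in the last bullet can be sketched for a single basic block. The instruction format `(op, dest, ...)` with a hypothetical `nullcheck` opcode is an assumption made for this example; handling branches requires the edge-sensitive analysis that the lecture develops separately.

```python
def remove_redundant_null_checks(block):
    """Drop null checks on variables already checked earlier in the same
    straight-line block and not redefined since."""
    known_non_null = set()
    result = []
    for instr in block:
        op, target = instr[0], instr[1]
        if op == "nullcheck":
            if target in known_non_null:
                continue  # already checked and not reassigned: redundant
            known_non_null.add(target)
        else:
            # Any other instruction defines its destination, so earlier
            # knowledge about that variable no longer holds.
            known_non_null.discard(target)
        result.append(instr)
    return result


# Hypothetical block for u = p.x + p.y, followed by a reassignment of p.
block = [
    ("nullcheck", "p"),
    ("load", "tx", "p", 4),
    ("nullcheck", "p"),
    ("load", "ty", "p", 8),
    ("assign", "p", "q"),
    ("nullcheck", "p"),
]
optimized = remove_redundant_null_checks(block)
```

The second check on `p` is dropped, while the check after `p` is reassigned is kept.
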
Example
Source: u = p.x + p.y

    Original                    After CSE, CP, BP
    t1 = p != 0                 t1 = p != 0
    if t1 goto L1 else L2       if t1 goto L1 else L2
    L2: abort                   L2: abort
    L1: ax = p + 4              L1: ax = p + 4
    tx = M[ax]                  tx = M[ax]
    t2 = p != 0                 t2 = t1                  (CSE: t2 = t1)
    if t2 goto L4 else L3       goto L4                  (CP: if t1 goto ...; BP: t1 = true)
    L3: abort                   L3: abort
    L4: ay = p + 8              L4: ay = p + 8
    ty = M[ay]                  ty = M[ay]
    u = tx + ty                 u = tx + ty

Boolean propagation
• Propagated information is a set of (variable, value) pairs
  • (v, ?), (v, T), (v, F)
• Doesn't fit into the standard dataflow analysis model
  • different information leaves on different out-edges of if statements
  • need to explicitly represent information on each edge

Finishing optimization
(columns show successive stages, left to right)

    t1 = p != 0                 t1 = p != 0                 CJUMP p != 0, L1
    if t1 goto L1 else L2       if t1 goto L1 else L2       ABORT
    L2: abort                   L2: abort                   L1: MOVE(u, M[p+4] + M[p+8])
    L1: ax = p+4                L1: ax = p+4
    tx = M[ax]                  tx = M[ax]
    t2 = t1
    goto L4
    L3: abort
    L4: ay = p + 8              ay = p + 8
    ty = M[ay]                  ty = M[ay]
    u = tx + ty                 u = tx + ty

Summary
• Optimizing loop code is critical to good performance
• Loops can be identified automatically in the control-flow graph using dominator data-flow analysis; this allows loop optimizations to be interleaved with other optimizations
• Induction variables enable many loop optimizations: loop unrolling, strength reduction, array bounds check removal.
• Avoiding array bounds checks and null checks is important for performance but not done often enough; one reason why Java is slow and C is tough to program in.

Summary of Optimization
• IR optimizations:
  • Quadruple representation/control-flow graphs make optimizations easier to express and check for safety
  • Data-flow analysis
  • Control-flow analyses (loops & traces)
• Low-level optimizations:
  • Graph-coloring register allocation
  • List scheduling of instructions