Advanced Compiler Techniques
Loops
LIU Xianhua
School of EECS, Peking University

Content
• Concepts:
  • Dominators
  • Depth-First Ordering
  • Back edges
  • Graph depth
  • Reducibility
  • Natural Loops
• Efficiency of Iterative Algorithms
• Dependences & Loop Transformation
“Advanced Compiler Techniques”

Loops are Important!
• Loops dominate program execution time.
  • They need special treatment during optimization.
• Loops also affect the running time of program analyses.
  • e.g., a dataflow problem can be solved in a single pass if the program has no loops.

Dominators
• Node d dominates node n if every path from the entry to n goes through d.
  • Written as: d dom n
• Quick observations:
  • Every node dominates itself.
  • The entry dominates every node.
• Common cases:
  • The test of a while loop dominates all blocks in the loop body.
  • The test of an if-then-else dominates all blocks in either branch.

Dominator Tree
• Immediate dominance: d idom n
  • d dom n, d ≠ n, and there is no m (other than d and n) such that d dom m and m dom n.
• Immediate dominance relationships form a tree.
[Figure: a flow graph on nodes 1-5 and its dominator tree]

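As a small illustration of the definition, the immediate dominator of a node can be read off from full dominator sets: the strict dominators of a node form a chain, so the closest one is the strict dominator with the largest dominator set of its own. The Python sketch below assumes the dominator sets listed on the Example: Dominators slide; the name `immediate_dominators` is hypothetical.

```python
# Dominator sets as listed on the "Example: Dominators" slide (node -> its dominators).
DOM = {1: {1}, 2: {1, 2}, 3: {1, 2, 3}, 4: {1, 4}, 5: {1, 5}}

def immediate_dominators(dom):
    """Return node -> immediate dominator (the entry has none)."""
    idom = {}
    for n, ds in dom.items():
        strict = ds - {n}          # dominators of n other than n itself
        if strict:
            # The strict dominators form a chain; the closest one to n
            # is dominated by all the others, so its own set is largest.
            idom[n] = max(strict, key=lambda d: len(dom[d]))
    return idom

IDOM = immediate_dominators(DOM)
```

Each entry of `IDOM` is one edge (parent to child) of the dominator tree.
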
Finding Dominators
• A dataflow analysis problem: for each node, find all of its dominators.
  • Direction: forward
  • Confluence: set intersection
  • Boundary: $\mathrm{OUT}[\mathrm{Entry}] = \{\mathrm{Entry}\}$
  • Initialization: $\mathrm{OUT}[B] = $ all nodes
  • Equations:
    • $\mathrm{OUT}[B] = \mathrm{IN}[B] \cup \{B\}$
    • $\mathrm{IN}[B] = \bigcap_{p \in \mathrm{pred}(B)} \mathrm{OUT}[p]$

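The equations above can be sketched as a straightforward iterative solver. This is a minimal Python sketch, not the course's implementation: the function name `dominators` and the five-node edge list `CFG` are assumptions (the slide's figure did not survive extraction), with the edges chosen so that the result reproduces the dominator sets of the example slide.

```python
def dominators(succ, entry):
    """Iterate OUT[B] = IN[B] | {B}, IN[B] = intersection of OUT[p] over predecessors p."""
    nodes = set(succ)
    pred = {n: [] for n in nodes}
    for n, targets in succ.items():
        for t in targets:
            pred[t].append(n)

    out = {n: set(nodes) for n in nodes}   # initialization: all nodes
    out[entry] = {entry}                   # boundary condition

    changed = True
    while changed:
        changed = False
        for b in sorted(nodes):
            if b == entry:
                continue
            in_b = set(nodes)
            for p in pred[b]:
                in_b &= out[p]             # confluence: set intersection
            new_out = in_b | {b}
            if new_out != out[b]:
                out[b] = new_out
                changed = True
    return out

# Hypothetical flow graph (node -> successors); entry is node 1.
CFG = {1: [2, 4], 2: [3], 3: [2, 5], 4: [5], 5: [1]}
DOM = dominators(CFG, entry=1)
```

Because the confluence operator is intersection, every set starts at "all nodes" and only shrinks, which is why the initialization uses the full node set rather than the empty set.
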
Example: Dominators
[Figure: flow graph on nodes 1-5, annotated with the dominator set of each node]
• Node 1: {1}
• Node 2: {1,2}
• Node 3: {1,2,3}
• Node 4: {1,4}
• Node 5: {1,5}

Depth-First Search
• Start at the entry.
• If you can follow an edge to an unvisited node, do so.
• If not, backtrack to your parent (the node from which you were visited).

Depth-First Spanning Tree
• Root = entry.
• Tree edges are the edges along which we first visit the node at the head.
[Figure: depth-first spanning tree of the example flow graph on nodes 1-5]

Depth-First Node Order
• The reverse of the order in which a DFS retreats from the nodes: 1-4-5-2-3.
• Alternatively, the reverse of a postorder traversal of the tree; the postorder here is 3-2-5-4-1.
[Figure: depth-first spanning tree of the example flow graph on nodes 1-5]

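A minimal Python sketch of the definition: record each node when the search retreats from it (postorder), then reverse that list. The edge list `CFG` is a hypothetical five-node graph, not the slide's figure, so the orders it produces differ from the ones quoted above; `depth_first_order` is likewise an illustrative name.

```python
def depth_first_order(succ, entry):
    """Return (postorder, DF order) for a depth-first search from entry."""
    visited, postorder = set(), []

    def visit(n):
        visited.add(n)
        for s in succ[n]:
            if s not in visited:
                visit(s)
        postorder.append(n)        # the search retreats from n here

    visit(entry)
    return postorder, postorder[::-1]

# Hypothetical flow graph (node -> successors, visited in the listed order).
CFG = {1: [2, 4], 2: [3], 3: [2, 5], 4: [5], 5: [1]}
POSTORDER, DF_ORDER = depth_first_order(CFG, 1)
```

The result depends on the order in which successors are tried, so one flow graph can have several DFSTs and several DF orders.
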
Four Kinds of Edges
• Tree edges.
• Advancing (forward) edges (node to proper descendant).
• Retreating edges (node to ancestor, including edges to self).
• Cross edges (between two nodes, neither of which is an ancestor of the other).

A Little Magic
• Of these edges, only retreating edges go from high to low in DF order.
  • Example of proof: you must retreat from the head of a tree edge before you can retreat from its tail.
• Also surprising: all cross edges go right to left in the DFST.
  • Assuming we add children of any node from the left.

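The four kinds of edges can be told apart during the search itself. The Python sketch below is one way to do it, under assumptions: `classify_edges` is an illustrative name and `GRAPH` is a hypothetical five-node graph with an extra edge 1->3 so that every kind of edge appears.

```python
def classify_edges(succ, entry):
    """Label every edge reached by a DFS as tree, retreating, advancing, or cross."""
    pre = {}        # node -> visit number
    done = set()    # nodes the search has already retreated from
    kinds = {}

    def visit(u):
        pre[u] = len(pre)
        for v in succ[u]:
            if v not in pre:
                kinds[(u, v)] = "tree"        # first visit to v
                visit(v)
            elif v not in done:
                kinds[(u, v)] = "retreating"  # v is still open: an ancestor of u, or u itself
            elif pre[v] > pre[u]:
                kinds[(u, v)] = "advancing"   # v was visited after u: a proper descendant
            else:
                kinds[(u, v)] = "cross"       # v was finished before u was visited
        done.add(u)

    visit(entry)
    return kinds

# Hypothetical flow graph (node -> successors, visited in the listed order).
GRAPH = {1: [2, 3, 4], 2: [3], 3: [2, 5], 4: [5], 5: [1]}
KINDS = classify_edges(GRAPH, 1)
```

In this run the cross edge 4->5 points from a later-visited subtree to an earlier-visited one, which is the right-to-left behaviour described above.
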
Example: Non-Tree Edges
[Figure: the example flow graph on nodes 1-5, with its non-tree edges labeled retreating, forward, and cross]

Back Edges
• An edge is a back edge if its head dominates its tail.
• Theorem: every back edge is a retreating edge in every DFST of every flow graph.
  • The head dominates the tail, so the head is reached before the tail in any DFST.
  • The search must reach the tail before retreating from the head, so the tail is a descendant of the head.
• The converse is almost always true, but not always.

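Given dominator sets, the definition of a back edge is a one-line membership test. In the Python sketch below, the dominator sets are the ones listed on the Example: Dominators slide, while the edge list `CFG` and the name `back_edges` are assumptions made for illustration.

```python
# Hypothetical flow graph (node -> successors) consistent with the dominator sets below.
CFG = {1: [2, 4], 2: [3], 3: [2, 5], 4: [5], 5: [1]}
DOM = {1: {1}, 2: {1, 2}, 3: {1, 2, 3}, 4: {1, 4}, 5: {1, 5}}

def back_edges(succ, dom):
    """Return all edges tail -> head whose head dominates the tail."""
    return [(tail, head)
            for tail in succ
            for head in succ[tail]
            if head in dom[tail]]

BACK = back_edges(CFG, DOM)
```

With these assumed edges the back edges are 3 -> 2 and 5 -> 1, the same two edges named on the natural-loops example slide.
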
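Example: Back Edges
[Figure: the example flow graph on nodes 1-5 with its back edges marked; dominator sets are {1} for node 1, {1,2} for node 2, {1,2,3} for node 3, {1,4} for node 4, and {1,5} for node 5]
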
Reducible Flow Graphs
• A flow graph is reducible if every retreating edge in any DFST for that flow graph is a back edge.
• Testing reducibility: remove all back edges from the flow graph and check that the result is acyclic.
• Hint why it works: every cycle must include some retreating edge in every DFST.
  • In particular, the edge that enters the first node of the cycle that is visited.

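DFST on a Cycle
• Depth-first search reaches one node of the cycle first.
• The search must reach the other nodes of the cycle before leaving the cycle.
• So the cycle edge that enters the first-visited node is a retreating edge.
[Figure: a cycle in a flow graph, annotated with the three observations above]
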
Why Reducibility?
• Folk theorem: all flow graphs in practice are reducible.
• Fact: if you use only while-loops, for-loops, repeat-loops, if-then(-else), break, and continue, then your flow graph is reducible.

Example: Remove Back Edges
[Figure: the example flow graph on nodes 1-5 with its back edges removed]
• The remaining graph is acyclic.

Example: Nonreducible Graph
[Figure: a flow graph on nodes A, B, C, shown with two of its DFSTs]
• In any DFST, one of the cycle's edges will be a retreating edge.
• But no heads dominate their tails, so deleting back edges leaves the cycle.

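The reducibility test described above (delete the back edges, then check for cycles) can be sketched in Python as follows. Everything named here is an assumption for illustration: the helper names, the five-node graph `REDUCIBLE`, and the three-node graph `NONREDUCIBLE` with edges A->B, A->C, B->C, C->B, which is the classic shape of a nonreducible graph rather than a transcription of the slide's figure.

```python
def dominators(succ, entry):
    """Iterative dominator sets: dom[b] = {b} | intersection of dom[p] over predecessors p."""
    pred = {n: [] for n in succ}
    for n in succ:
        for s in succ[n]:
            pred[s].append(n)
    dom = {n: set(succ) for n in succ}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for b in succ:
            if b == entry:
                continue
            new = set(succ)
            for p in pred[b]:
                new &= dom[p]
            new |= {b}
            if new != dom[b]:
                dom[b], changed = new, True
    return dom

def is_reducible(succ, entry):
    """Remove back edges (head dominates tail) and check the rest is acyclic."""
    dom = dominators(succ, entry)
    remaining = {n: [s for s in succ[n] if s not in dom[n]] for n in succ}
    # Kahn's algorithm: a graph is acyclic iff every node can be peeled off.
    indegree = {n: 0 for n in succ}
    for n in remaining:
        for s in remaining[n]:
            indegree[s] += 1
    ready = [n for n in succ if indegree[n] == 0]
    removed = 0
    while ready:
        n = ready.pop()
        removed += 1
        for s in remaining[n]:
            indegree[s] -= 1
            if indegree[s] == 0:
                ready.append(s)
    return removed == len(succ)

REDUCIBLE = {1: [2, 4], 2: [3], 3: [2, 5], 4: [5], 5: [1]}
NONREDUCIBLE = {"A": ["B", "C"], "B": ["C"], "C": ["B"]}
```

In `NONREDUCIBLE` neither B nor C dominates the other, so no edge is a back edge and the B-C cycle survives the deletion step.
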
Why Care About Back/Retreating Edges?
• Visiting nodes in the proper order during an iterative algorithm ensures that the number of passes is limited by the number of “nested” back edges.
• The depth of nested loops is an upper bound on the number of nested back edges.

DF Order and Retreating Edges
• Suppose that for a reaching-definitions (RD) analysis, we visit nodes during each iteration in DF order.
• The fact that a definition d reaches a block will propagate in one pass along any increasing sequence of blocks.
• When d arrives at the tail of a retreating edge, it is too late to propagate d from the tail's OUT to the head's IN in that pass.
  • The IN at the head has already been computed for that round.

Example: DF Order
• Definition d is Gen'd by node 2.
[Figure: the example flow graph on nodes 1-5, showing how far d has propagated after the first pass and after the second pass]

Depth of a Flow Graph
• The depth of a flow graph with a given DFST and DF order is the greatest number of retreating edges along any acyclic path.
• For RD, if we visit nodes in DF order, we converge in depth+2 passes.
  • Depth+1 passes to follow that number of increasing segments.
  • 1 more pass to realize we have converged.

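Example: Depth = 2
• Path: 1->4->7 ---> 3->10->17 ---> 6->18->20
• Pass 1 follows the increasing segment 1->4->7.
• A retreating edge leads from 7 to 3.
• Pass 2 follows the increasing segment 3->10->17.
• A retreating edge leads from 17 to 6.
• Pass 3 follows the increasing segment 6->18->20.
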
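The pass count for this example can be checked with a small simulation. The Python sketch below makes several simplifying assumptions: only the nine nodes on the path are modelled, node numbers are taken to be positions in DF order, a single definition d is generated at node 1 and never killed, and `passes_until_converged` is an illustrative name.

```python
def passes_until_converged(edges, gen_node, order):
    """Count passes of a forward 'd reaches OUT[n]' analysis visiting nodes in the given order."""
    pred = {}
    for a, b in edges:
        pred.setdefault(b, []).append(a)

    reached = {n: False for n in order}   # does d reach OUT[n]?
    passes = 0
    while True:
        passes += 1
        changed = False
        for n in order:
            # OUT[n] holds d if n generates it or it arrives from some predecessor.
            new = n == gen_node or any(reached[p] for p in pred.get(n, []))
            if new != reached[n]:
                reached[n] = new
                changed = True
        if not changed:
            return passes

PATH = [1, 4, 7, 3, 10, 17, 6, 18, 20]
EDGES = list(zip(PATH, PATH[1:]))          # includes the retreating edges 7->3 and 17->6
PASSES = passes_until_converged(EDGES, gen_node=1, order=sorted(PATH))
```

Here `PASSES` comes out as 4: three passes to carry d across the three increasing segments, plus one pass in which nothing changes, matching depth + 2 with depth = 2.
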
Similarly . . .
• Available expressions (AE) also works in depth+2 passes.
  • Unavailability propagates along retreat-free node sequences in one pass.
• So does live variables (LV), if we use the reverse of DF order.
  • A use propagates backward, in one pass, along paths that do not use a retreating edge.

In General . . .
• The depth+2 bound works for any monotone framework, as long as information only needs to propagate along acyclic paths.
  • Example: if a definition reaches a point, it does so along an acyclic path.

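However . . .
• Constant propagation does not have this property.
    L: a = b
       b = c
       c = 1
       goto L
[Figure: flow graph of the loop, with blocks a = b, b = c, and c = 1]
• The constant 1 reaches b only after one trip around the loop and reaches a only after a second trip, so the information has to travel along a cyclic path.
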
Why Depth+2 is Good
• Normal control-flow constructs produce reducible flow graphs with the number of back edges at most the nesting depth of loops.
  • Nesting depth tends to be small.
• A study by Knuth has shown that the average depth of typical flow graphs is approximately 2.75.

Example: Nested Loops
• 3 nested while-loops: depth = 3
• 3 nested repeat-loops: depth = 1
[Figure: flow graphs of the two loop nests]

Natural Loops
• A natural loop is defined by:
  • A single entry point called the header.
    • The header dominates all nodes in the loop.
  • A back edge that enters the loop header.
    • Otherwise, it is not possible for the flow of control to return to the header directly from the "loop"; i.e., there really is no loop.

Find Natural Loops
• The natural loop of a back edge a->b is {b} plus the set of nodes that can reach a without going through b.
  • Remove b from the flow graph, then find all nodes that can reach a (a itself and its direct and indirect predecessors).
• Theorem: two natural loops are either disjoint, identical, or nested.

Example: Natural Loops
[Figure: the example flow graph on nodes 1-5, with the natural loop of back edge 5 -> 1 and the natural loop of back edge 3 -> 2 outlined]

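The construction of a natural loop can be sketched as a backward search from the tail of the back edge that never expands past the header. In the Python sketch below, the edge list `CFG` is an assumption (the slide's figure is not recoverable), chosen so that 5 -> 1 and 3 -> 2 are its back edges; `natural_loop` is an illustrative name.

```python
def natural_loop(succ, tail, head):
    """Natural loop of back edge tail -> head: head plus all nodes reaching tail without passing head."""
    pred = {n: [] for n in succ}
    for n in succ:
        for s in succ[n]:
            pred[s].append(n)

    loop = {head, tail}                 # head is in the loop but is never expanded
    stack = [tail] if tail != head else []
    while stack:
        n = stack.pop()
        for p in pred[n]:
            if p not in loop:
                loop.add(p)
                stack.append(p)
    return loop

# Hypothetical flow graph (node -> successors).
CFG = {1: [2, 4], 2: [3], 3: [2, 5], 4: [5], 5: [1]}
INNER = natural_loop(CFG, tail=3, head=2)
OUTER = natural_loop(CFG, tail=5, head=1)
```

For this assumed graph the loop of 3 -> 2 is nested inside the loop of 5 -> 1, one of the three cases allowed by the theorem.
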
Relationship between Loops
• If two loops do not have the same header:
  • they are either disjoint, or
  • one is entirely contained (nested) within the other.
  • Innermost loop: one that contains no other loop.
• If two loops share the same header:
  • it is hard to tell which is the inner loop;
  • combine them and treat them as one loop.
[Figure: a flow graph on nodes 1-4]

Basic Parallelism
• Examples:
    FOR i = 1 TO 100
      a[i] = b[i] + c[i]

    FOR i = 11 TO 20
      a[i] = a[i-1] + 3

    FOR i = 11 TO 20
      a[i] = a[i-10] + 3
• Does there exist a data dependence edge between two different iterations?
• A data dependence edge is loop-carried if it crosses iteration boundaries.
• DoAll loops: loops without loop-carried dependences.

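For loops this small, the question "does any iteration touch an element written by a different iteration?" can be answered by brute force. The Python sketch below is only an illustration: `loop_carried` is a hypothetical helper, it considers accesses to array a only, and it assumes each iteration writes a distinct element (true for all three examples), so output dependences are not examined.

```python
def loop_carried(iterations, write_index, read_indexes):
    """True if some iteration reads an element of a that a different iteration writes."""
    writer = {write_index(i): i for i in iterations}   # element -> iteration that writes it
    for i in iterations:
        for read_index in read_indexes:
            w = writer.get(read_index(i))
            if w is not None and w != i:
                return True
    return False

# a[i] = b[i] + c[i], i = 1..100: no reads of a at all.
LOOP1 = loop_carried(range(1, 101), lambda i: i, [])
# a[i] = a[i-1] + 3, i = 11..20: iteration i reads what iteration i-1 wrote.
LOOP2 = loop_carried(range(11, 21), lambda i: i, [lambda i: i - 1])
# a[i] = a[i-10] + 3, i = 11..20: reads a[1..10], which the loop never writes.
LOOP3 = loop_carried(range(11, 21), lambda i: i, [lambda i: i - 10])
```

Under these assumptions the first and third loops are DoAll loops and the second is not.
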
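Data Dependence of Variables
• Output dependence: "a =" followed by "a =" (write, then write)
• Input dependence: "= a" followed by "= a" (read, then read)
• True dependence: "a =" followed by "= a" (write, then read)
• Anti-dependence: "= a" followed by "a =" (read, then write)
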
Affine Array Accesses
• Common patterns of data accesses (i, j, k are loop indexes):
    A[i], A[j], A[i-1], A[0], A[i+j], A[2*i], A[2*i+1], A[i,j], A[i-1, j+1]
• Array indexes are affine expressions of surrounding loop indexes.
  • Loop indexes: $i_n, i_{n-1}, \dots, i_1$
  • Integer constants: $c_n, c_{n-1}, \dots, c_0$
  • Array index: $c_n i_n + c_{n-1} i_{n-1} + \dots + c_1 i_1 + c_0$
  • Affine expression: linear expression + a constant term ($c_0$)

Formulating Data Dependence Analysis
    FOR i := 2 to 5 do
      A[i-2] = A[i] + 1;
• Between read access A[i] and write access A[i-2] there is a dependence if:
  • there exist two iterations $i_r$ and $i_w$ within the loop bounds, such that
  • iterations $i_r$ and $i_w$ read and write the same array element, respectively:
    $\exists$ integers $i_w, i_r$: $2 \le i_w, i_r \le 5$ and $i_r = i_w - 2$
• Between write access A[i-2] and write access A[i-2] there is a dependence if:
    $\exists$ integers $i_w, i_v$: $2 \le i_w, i_v \le 5$ and $i_w - 2 = i_v - 2$
  • To rule out the case when the same instance depends on itself, add the constraint $i_w \ne i_v$.

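Because the loop bounds here are tiny, both existence questions can be settled by enumerating every pair of iterations. The Python sketch below does exactly that; the variable names are illustrative, and a real compiler would use an integer-constraint solver instead of enumeration.

```python
ITERATIONS = range(2, 6)   # i = 2, 3, 4, 5

# Read A[i_r] and write A[i_w - 2] touch the same element when i_r == i_w - 2.
READ_WRITE = [(i_w, i_r)
              for i_w in ITERATIONS
              for i_r in ITERATIONS
              if i_r == i_w - 2]

# Two writes A[i_w - 2] and A[i_v - 2] from different iterations (i_w != i_v).
WRITE_WRITE = [(i_w, i_v)
               for i_w in ITERATIONS
               for i_v in ITERATIONS
               if i_w != i_v and i_w - 2 == i_v - 2]
```

The read/write system has the solutions (i_w, i_r) = (4, 2) and (5, 3), so a dependence exists; the write/write system has none once the constraint i_w ≠ i_v is added.
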
Memory Disambiguation
• Undecidable at compile time:
    read(n)
    For i = …
      a[i] = a[n]

Domain of Data Dependence Analysis
• Only use loop bounds and array indexes that are affine functions of loop variables.
    for i = 1 to n
      for j = 2i to 100
        a[i + 2j + 3][4i + 2j][i * i] = …
        … = a[1][2i + 1][j]
• Assume a data dependence between the read and write operations if there exist integers $i_r, j_r, i_w, j_w$ such that:
    $1 \le i_w, i_r \le n$
    $2 i_w \le j_w \le 100$
    $2 i_r \le j_r \le 100$
    $i_w + 2 j_w + 3 = 1$
    $4 i_w + 2 j_w = 2 i_r + 1$
• Equate each dimension of the array access; ignore non-affine ones (here the third dimension, i * i).
  • No solution: no data dependence.
  • Solution: there may be a dependence.

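For this particular system, a short worked argument shows that no integer solution exists, so the read and the write are independent. The derivation below uses only the loop bounds and the two equated dimensions; either line alone is enough.

```latex
\begin{align*}
i_w \ge 1,\quad j_w \ge 2 i_w \ge 2
  &\;\Longrightarrow\; i_w + 2 j_w + 3 \;\ge\; 1 + 4 + 3 = 8 > 1, \\
4 i_w + 2 j_w = 2\,(2 i_w + j_w) \text{ is even},\quad 2 i_r + 1 \text{ is odd}
  &\;\Longrightarrow\; 4 i_w + 2 j_w \ne 2 i_r + 1 .
\end{align*}
```

The first line contradicts the first-dimension equation and the second line contradicts the second-dimension equation, whatever the value of n.
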
Iteration Space
• An abstraction for loops.
• Each iteration is represented as a point (its coordinates) in the iteration space.
    for i = 0, 5
      for j = 0, 3
        a[i, j] = 3
[Figure: iteration space with axes i and j]

Iteration Space
• An abstraction for loops.
    for i = 0, 5
      for j = i, 3
        a[i, j] = 0
[Figure: iteration space with axes i and j]

Iteration Space
• An abstraction for loops.
    for i = 0, 5
      for j = i, 7
        a[i, j] = 0
[Figure: iteration space with axes i and j]

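The three iteration spaces above can be enumerated directly as sets of (i, j) points. The Python sketch below assumes that `for x = lo, hi` includes both bounds (Fortran-style) and that a loop whose lower bound exceeds its upper bound runs zero times; `iteration_space` is an illustrative name.

```python
def iteration_space(i_bounds, j_lower, j_upper):
    """List the (i, j) points of a two-deep loop nest with inclusive bounds."""
    i_lo, i_hi = i_bounds
    return [(i, j)
            for i in range(i_lo, i_hi + 1)
            for j in range(j_lower(i), j_upper(i) + 1)]

RECTANGLE = iteration_space((0, 5), lambda i: 0, lambda i: 3)   # for j = 0, 3
TRIANGLE = iteration_space((0, 5), lambda i: i, lambda i: 3)    # for j = i, 3
TRAPEZOID = iteration_space((0, 5), lambda i: i, lambda i: 7)   # for j = i, 7
```

Under those assumptions the first nest has 24 points, the second has 10 (the rows i = 4 and i = 5 are empty), and the third has 33.
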
Affine Access
[Figure: content not recoverable from the extracted text]

Affine Transform
[Figure: an iteration space with axes i and j mapped by an affine transform to a new space with axes u and v]

Loop Transformation
    for i = 1, 100
      for j = 1, 200
        A[i, j] = A[i, j] + 3
      end_for
    end_for

    for u = 1, 200
      for v = 1, 100
        A[v, u] = A[v, u] + 3
      end_for
    end_for

Old Iteration Space
    for i = 1, 100
      for j = 1, 200
        A[i, j] = A[i, j] + 3
      end_for
    end_for

New Iteration Space
    for u = 1, 200
      for v = 1, 100
        A[v, u] = A[v, u] + 3
      end_for
    end_for

Old Array Accesses
    for i = 1, 100
      for j = 1, 200
        A[i, j] = A[i, j] + 3
      end_for
    end_for

New Array Accesses
    for u = 1, 200
      for v = 1, 100
        A[v, u] = A[v, u] + 3
      end_for
    end_for

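The old and new loop nests differ only by the mapping u = j, v = i, so they should compute the same array. The Python sketch below checks this on a dictionary standing in for the 2-D array A; the helper names and the initial contents produced by `fresh_array` are assumptions made for the test.

```python
def run_original(A):
    for i in range(1, 101):
        for j in range(1, 201):
            A[i, j] = A[i, j] + 3
    return A

def run_interchanged(A):
    # Same iterations, visited with the loops interchanged: u = j, v = i.
    for u in range(1, 201):
        for v in range(1, 101):
            A[v, u] = A[v, u] + 3
    return A

def fresh_array():
    """Hypothetical initial contents: a distinct value per element."""
    return {(i, j): i * 1000 + j for i in range(1, 101) for j in range(1, 201)}

SAME = run_original(fresh_array()) == run_interchanged(fresh_array())
```

The interchange is safe here because every iteration reads and writes only its own element A[i, j], so there is no loop-carried dependence whose order could be violated.
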