CMPUT680 - Fall 2006
Topic A: Data Dependence in Loops
José Nelson Amaral
http://www.cs.ualberta.ca/~amaral/courses/680

Reading
Michael Wolfe, High Performance Compilers for Parallel Computing, Addison-Wesley, 1996. Chapter 5.
Randy Allen and Ken Kennedy, Optimizing Compilers for Modern Architectures: A Dependence-based Approach, Morgan Kaufmann, 200. Chapter 2.
CMPUT 680 - Compiler Design and Optimization

Basic Concept and Motivation
• A loop-carried data dependence occurs when a memory access in iteration i of a loop cannot be performed before an access in some earlier iteration i-k has been performed.
• There is a data dependence between an access a at iteration i-k and an access b at iteration i if:
    • a and b access the same memory location,
    • there is a path from a to b, and
    • either a or b is a write.

Three Types of Data Dependence
• Flow dependence (δ^f): a write followed by a read.
      X = ...
      ... = X
• Anti-dependence (δ^-1): a read followed by a write.
      ... = X
      X = ...
• Output dependence (δ^0): a write followed by a write.
      X = ...
      X = ...

Data Dependence
Example 1:
    S1: A = 0
    S2: B = A
    S3: A = B + 1
    S4: C = A
S1 δ^f S2: S2 is flow dependent on S1.
S1 is the source and S2 is the target of the dependence.
[Figure: dependence graph over S1, S2, S3, S4]
(Wolfe, p. 138)

Data Dependence
For Example 1 above:
S2 δ^f S3: S3 is flow-dependent on S2
S1 δ^0 S3: S3 is output-dependent on S1
S2 δ^-1 S3: S3 is anti-dependent on S2
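
The relations listed for Example 1 can be recomputed mechanically. The sketch below is only an illustration, not the algorithm from Wolfe or Allen-Kennedy: the `Stmt` record and the `dependences` function are hypothetical names, each statement is assumed to assign exactly one scalar, and a relation is assumed to be killed by any intervening write to the same variable.

```python
from typing import NamedTuple


class Stmt(NamedTuple):
    """Hypothetical model of one assignment: one variable written, a set read."""
    name: str
    writes: str
    reads: frozenset


def dependences(stmts):
    """Return (source, target, kind, variable) tuples for straight-line code."""
    found = []
    for i, src in enumerate(stmts):
        for dst_index in range(i + 1, len(stmts)):
            dst = stmts[dst_index]
            between = stmts[i + 1:dst_index]
            for var in sorted({src.writes} | src.reads):
                # Assumption: an intervening write to var hides the relation.
                if any(s.writes == var for s in between):
                    continue
                if src.writes == var and var in dst.reads:
                    found.append((src.name, dst.name, "flow", var))
                if var in src.reads and dst.writes == var:
                    found.append((src.name, dst.name, "anti", var))
                if src.writes == var and dst.writes == var:
                    found.append((src.name, dst.name, "output", var))
    return found


# Example 1: S1: A = 0, S2: B = A, S3: A = B + 1, S4: C = A
EXAMPLE_1 = [
    Stmt("S1", "A", frozenset()),
    Stmt("S2", "B", frozenset({"A"})),
    Stmt("S3", "A", frozenset({"B"})),
    Stmt("S4", "C", frozenset({"A"})),
]

for dep in dependences(EXAMPLE_1):
    print(dep)
```

Under these assumptions S4 depends on S3 but not on S1, because S3 overwrites A before S4 reads it.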

Parameterized Dependences
    DO I = 1, N
S1      A(I+1) = A(I) + B(I)
    ENDDO
"Statement S1 depends upon itself."
    DO I = 1, N
S1      A(I+2) = A(I) + B(I)
    ENDDO
"Statement S1 depends on an instance of itself two iterations previous."
We need to be able to describe such dependences formally.
(Allen-Kennedy, p. 39)

Loop Normalization
Given a loop of the form:
    DO I = L, U STEP S
        ...
    ENDDO
the normalized value of the iteration in which I = k can be obtained from:
    Normalized(k) = (k - L + S) / S
Example:
    DO I = 5, 26 STEP 3
        ...
    ENDDO
Iteration space:            5  8  11  14  …  26
Normalized iteration space: 1  2  3   4   …  8
(Allen-Kennedy, p. 39)
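
A minimal sketch of the normalization formula, applied to the example loop. The function name `normalized` and its error handling are illustrative choices, not part of the original slides.

```python
def normalized(k, lower, step):
    """Map index value k of 'DO I = lower, upper STEP step' to its
    1-based normalized iteration number, (k - lower + step) / step."""
    offset = k - lower + step
    if offset % step != 0 or offset // step < 1:
        raise ValueError(f"{k} is not an index value of this loop")
    return offset // step


# DO I = 5, 26 STEP 3
index_values = list(range(5, 26 + 1, 3))
normalized_values = [normalized(k, 5, 3) for k in index_values]
print(index_values)       # index values visited by the loop
print(normalized_values)  # their normalized iteration numbers
```

The same formula also works for a negative step, for example `normalized(2, 6, -2)` for a loop running from 6 down to 2 by -2.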

Data Dependences
Loop carried: between two statement instances in two different iterations of a loop.
Loop independent: between two statement instances in the same loop iteration.
Lexically forward: the source comes before the target.
Lexically backward: otherwise.
The right-hand side of an assignment is considered to precede the left-hand side.

Review of Linear Algebra: Lexicographic Order
Two n-vectors a and b are equal, a = b, if a_i = b_i for 1 ≤ i ≤ n.
We say that a is less than b, a < b, if a_i < b_i for 1 ≤ i ≤ n.
We say that a is lexicographically less than b at level j, a «_j b, if a_i = b_i for 1 ≤ i < j and a_j < b_j.
We say that a is lexicographically less than b, a « b, if there is a j, 1 ≤ j ≤ n, such that a «_j b.
(Wolfe, p. 86)
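
The level-j definition translates directly into a scan from the left. This is a small sketch; `lex_level` is a hypothetical helper name.

```python
def lex_level(a, b):
    """Return the 1-based level j such that a is lexicographically less
    than b at level j, or None if a is not lexicographically less than b."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    for j, (x, y) in enumerate(zip(a, b), start=1):
        if x < y:
            return j        # equal before position j, smaller at position j
        if x > y:
            return None     # b is smaller at the first difference
    return None             # a == b


print(lex_level((1, 2, 3), (1, 3, 0)))
print(lex_level((1, 2), (1, 2)))
```

Note that (1, 2, 3) is lexicographically less than (1, 3, 0) even though it is not less than it in the component-wise sense.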

Lexicographic Order: Example of vectors

Properties of Lexicographic Order
Let n ≥ 1, and let i, j, k denote arbitrary vectors in R^n.
1. For each u in 1 ≤ u ≤ n, the relation «_u in R^n is irreflexive and transitive.
2. The n relations «_u are pairwise disjoint: i «_u j and i «_v j imply that u = v.
3. If i ≠ j, there is a unique integer u such that 1 ≤ u ≤ n and exactly one of the following two conditions holds: i «_u j or j «_u i.
4. i «_u j and j «_v k together imply that i «_w k, where w = min(u, v).

Data Dependence in Loops: An Example
Find the dependence relations due to the array X in the program below:
(S1) for i = 2 to 9 do
(S2)     X[i] = Y[i] + Z[i]
(S3)     A[i] = X[i-1] + 1
(S4) end for
Solution: To find the data dependence relations in a simple loop, we can unroll the loop and see which statement instances depend on which others:
        i = 2                 i = 3                 i = 4
(S2)    X[2] = Y[2] + Z[2]    X[3] = Y[3] + Z[3]    X[4] = Y[4] + Z[4]
(S3)    A[2] = X[1] + 1       A[3] = X[2] + 1       A[4] = X[3] + 1
(Wolfe, p. 140)

Data Dependence in Loops
In the loop above, S3 in iteration i reads the element X[i-1] that S2 wrote in iteration i-1. There is therefore a loop-carried, lexically forward, flow dependence from S2 to S3.
Data dependence graph for statements in a loop: S2 → S3, labeled (1,3).
(1,3) := iteration distance is 1, latency is 3.
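
The unrolling argument can be replayed in a few lines. The sketch below simulates only the accesses to X in the loop `for i = 2 to 9` and records, for every read in S3, the iteration in which S2 wrote that element. The function name and the tuple layout are illustrative assumptions.

```python
def flow_dependences_on_x(lower=2, upper=9):
    """Return (source_iteration, target_iteration, distance) for every
    read of X in S3 that sees a value written by S2 inside the loop."""
    last_writer = {}  # element index of X -> iteration in which S2 wrote it
    found = []
    for i in range(lower, upper + 1):
        # S2: X[i] = Y[i] + Z[i]   (writes X[i])
        last_writer[i] = i
        # S3: A[i] = X[i-1] + 1    (reads X[i-1])
        if i - 1 in last_writer:
            source = last_writer[i - 1]
            found.append((source, i, i - source))
    return found


for dep in flow_dependences_on_x():
    print(dep)
```

Every recorded pair has distance 1; the read of X[1] in iteration 2 has no source inside the loop.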

Iteration Space (an informal introduction)
• Iteration space and iteration-space-dependence-graph
Example: Show the iteration space dependence graph for the loop in our example.
Solution:
[Figure: iteration space dependence graph, with the iterations laid out along an axis marked 0 to 9]
We need an abstraction for this.

Iteration Space (an informal introduction)
Iteration Vector: a vector formed by the values that the index expression used to access an array takes over the iterations of the loop.
(S1) for i = 3 to 9 do
(S2)     X[i] = Y[i] + Z[i]
(S3)     A[i] = X[i-2] + 1
(S4)     B[i] = A[i-1] + 2
(S5) end for
[Figure: dependence graph over S2, S3, S4]
For each dependency, there is an iteration vector for the source and one for the target (subscript: statement number; superscript: S for source, T for target):
i_2^S(X) = [3; 4; 5; 6; 7; 8; 9]
i_3^T(X) = [1; 2; 3; 4; 5; 6; 7]
i_3^S(A) = [3; 4; 5; 6; 7; 8; 9]
i_4^T(A) = [2; 3; 4; 5; 6; 7; 8]

Iteration Space (an informal introduction)
Distance Vector: a vector formed by the difference between the iteration vectors of the source and target of a dependency.
For the same loop (i = 3 to 9, with X[i] written in S2, X[i-2] read in S3, A[i] written in S3, A[i-1] read in S4):
i_2^S(X) = [3; 4; 5; 6; 7; 8; 9]
i_3^T(X) = [1; 2; 3; 4; 5; 6; 7]
i_3^S(A) = [3; 4; 5; 6; 7; 8; 9]
i_4^T(A) = [2; 3; 4; 5; 6; 7; 8]
d(X) = i_3^T(X) - i_2^S(X) = [-2; -2; -2; -2; -2; -2; -2]
d(A) = i_4^T(A) - i_3^S(A) = [-1; -1; -1; -1; -1; -1; -1]
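
The vectors on this slide can be regenerated as follows. This sketch follows the slide's own convention (target minus source, computed on the subscripts accessed in each iteration); the helper names `access_vector` and `distance` are hypothetical.

```python
def access_vector(lower, upper, offset):
    """Subscripts accessed by a reference of the form [i + offset]
    for i = lower .. upper (inclusive)."""
    return [i + offset for i in range(lower, upper + 1)]


def distance(target, source):
    """Element-wise difference, target minus source, as on the slide."""
    return [t - s for t, s in zip(target, source)]


src_x = access_vector(3, 9, 0)    # S2 writes X[i]
tgt_x = access_vector(3, 9, -2)   # S3 reads  X[i-2]
src_a = access_vector(3, 9, 0)    # S3 writes A[i]
tgt_a = access_vector(3, 9, -1)   # S4 reads  A[i-1]

d_x = distance(tgt_x, src_x)
d_a = distance(tgt_a, src_a)
print(d_x)
print(d_a)
```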

Iteration Space (an informal introduction)
Direction Vector: contains only information about the direction of the dependence, but no iteration distance information.
The elements of a direction vector are <, >, and =. Other authors use +, -, 0.
For the same loop:
dir(X) = [<; <; <; <; <; <; <]
dir(A) = [<; <; <; <; <; <; <]

Iteration Space (an informal introduction)
• Each element of the direction vector can be stored in two bits.
• Given a distance vector, we can compute the direction vector, but not vice-versa.

Iteration Space (an informal introduction)
Example: Show the index variable iteration vectors and normalized iteration vectors for the iterations in the loop below:
    for i = 2 to 6 do
        for j = 6 to 2 by -2 do
            A[i, j] = A[i, j+2] + 1
        end for
    end for
Solution: Since there are two nested loops, the iteration space has two dimensions.

Iteration Space (an informal introduction)
[Figure: iteration space dependence graph corresponding to the index variable iteration vectors of the loop above; horizontal axis i = 2, 3, 4, 5, 6; vertical axis j = 2, 4, 6]
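
Since the figure is not reproduced here, the iteration space can be enumerated instead. The sketch below lists each index variable iteration vector (i, j) together with its normalized iteration vector, using Normalized(k) = (k - L + S) / S per loop, and then records which earlier iteration wrote the element read by A[i, j+2]. Function names are illustrative assumptions.

```python
def iteration_space():
    """Return (index_vector, normalized_vector) pairs in execution order
    for: for i = 2 to 6 / for j = 6 to 2 by -2."""
    points = []
    for i in range(2, 6 + 1):
        for j in range(6, 2 - 1, -2):
            normalized = ((i - 2 + 1) // 1, (j - 6 + (-2)) // (-2))
            points.append(((i, j), normalized))
    return points


def flow_dependences():
    """Return (source, target) pairs of normalized iteration vectors for
    the statement A[i, j] = A[i, j+2] + 1."""
    written = {}  # element (i, j) of A -> normalized iteration that wrote it
    deps = []
    for (i, j), normalized in iteration_space():
        source = written.get((i, j + 2))   # element read on the right-hand side
        if source is not None:
            deps.append((source, normalized))
        written[(i, j)] = normalized       # element written on the left-hand side
    return deps


for point in iteration_space()[:3]:
    print(point)
for source, target in flow_dependences()[:2]:
    print(source, "->", target)
```

In normalized coordinates every dependence found this way goes from (a, b) to (a, b + 1): the inner loop carries it, and the outer loop does not.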

Distance/Direction Vectors
• It is often convenient to deal with incompletely specified direction vectors.
Example 1: {(0, 0, 0, 1), (0, -1, 0, 1), (0, 0, 1, 1), (0, -1, 1, 1)} ==> {(0, 0, 0, 1)}
Example 2: {(0, -1, 0, -1), (0, 0, 0, -1), (0, 1, 0, -1)} ==> {(0, *, 0, -1)}
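
One way to obtain an incompletely specified vector such as the one in Example 2 is to summarize the set component by component. This is a sketch under stated assumptions: `summarize` is a hypothetical helper, "*" is emitted only when a component takes all three values -1, 0 and 1, and any other mixture is reported as a set of values. A component-wise summary may in general describe more vectors than the original set contains.

```python
def summarize(vectors):
    """Collapse a set of direction vectors into one summary vector,
    component by component."""
    summary = []
    for component in zip(*vectors):
        values = set(component)
        if len(values) == 1:
            summary.append(values.pop())        # fully specified component
        elif values == {-1, 0, 1}:
            summary.append("*")                 # any direction
        else:
            summary.append(frozenset(values))   # partially specified
    return tuple(summary)


example_2 = [(0, -1, 0, -1), (0, 0, 0, -1), (0, 1, 0, -1)]
print(summarize(example_2))
```

For Example 2 the second component takes all three values, which is what the * records.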

Distance/Direction Vectors
• Let a, b denote two vectors in R^n and s their direction vector. Then a « b if and only if s has one of the following forms:
    (1, *, *, …, *)
    (0, 1, *, …, *)
    (0, 0, 1, *, …, *)
    (0, 0, …, 0, 1)
  More precisely, a «_u b for u in 1 ≤ u ≤ n if and only if s has the form with a leading 1 after (u - 1) zeros.
• Notation: (0, 1, -1) ↔ (=, >, <)
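
The test on the form of s can be sketched as below. The code assumes that the direction vector of a and b is the component-wise sign of b - a, which is an assumption about the convention rather than something stated on the slide; the function names are hypothetical.

```python
def sign(x):
    """Return -1, 0 or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def direction_vector(a, b):
    """Assumed convention: component-wise sign of b - a."""
    return tuple(sign(y - x) for x, y in zip(a, b))


def lex_less_level(s):
    """Return u if s has a leading 1 after (u - 1) zeros, else None."""
    for u, component in enumerate(s, start=1):
        if component == 1:
            return u
        if component == -1:
            return None
    return None


s = direction_vector((1, 2, 5), (1, 4, 0))
print(s, lex_less_level(s))
```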

An Example
    do i = 3, 100
S:      A[2i] = B[i] + 2
T:      C[i] = D[i] + 2*A[2i + 1] + A[2i - 4] + A[i]
    done
What are the dependences and the dependence distance vectors in the example above?

An Example
Splitting T so that each statement reads A once:
    do i = 3, 100
S:      A[2i] = B[i] + 2
T1:     TEMP1 = D[i] + 2*A[2i + 1]
T2:     TEMP2 = TEMP1 + A[2i - 4]
T3:     C[i] = TEMP2 + A[i]
    done
i_S(A)  = [6; 8; 10; 12; 14; 16; …; 198; 200]
i_T1(A) = [7; 9; 11; 13; 15; 17; …; 199; 201]
i_T2(A) = [2; 4; 6; 8; 10; 12; …; 194; 196]
i_T3(A) = [3; 4; 5; 6; 7; 8; …; 99; 100]

An Example
For the same loop:
i_S(A)  = [6; 8; 10; 12; 14; 16; …; 198; 200]
i_T1(A) = [7; 9; 11; 13; 15; 17; …; 199; 201]
d(T1,S) = i_T1(A) - i_S(A) = [1; 1; …; 1]
The subscripts differ by 1 in every iteration, but S writes only even-numbered elements of A and T1 reads only odd-numbered ones. The two statements never access the same location, so there is no dependence between S and T1.

An Example
For the same loop:
i_S(A)  = [6; 8; 10; 12; 14; 16; …; 198; 200]
i_T2(A) = [2; 4; 6; 8; 10; 12; …; 194; 196]
d(T2,S) = i_T2(A) - i_S(A) = [-4; -4; …; -4]
T2 is flow dependent on S with dependence distance -4 in subscript terms. Since A[2i - 4] = A[2(i - 2)], the element that T2 reads in iteration i was written by S two iterations earlier.

An Example
For the same loop:
i_S(A)  = [6; 8; 10; 12; 14; 16; …; 198; 200]
i_T3(A) = [3; 4; 5; 6; 7; 8; …; 99; 100]
d(T3,S) = i_T3(A) - i_S(A)
T3 is flow dependent on S with dependence distance (i - 2i) = -i. The distance is not constant: for even i ≥ 6, T3 reads in iteration i the element that S wrote in iteration i/2.
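
The example can also be checked by brute force. The sketch below simulates only the accesses to A, records which iteration of S wrote each element, and reports for T1, T2 and T3 the (source iteration, target iteration) pairs in which a read sees a value written by S. It looks for flow dependences only, and all names in it are illustrative.

```python
def flow_dependences(lower=3, upper=100):
    """Return {statement: [(source_iteration, target_iteration), ...]}
    for the reads of A in T1, T2 and T3."""
    reads = {
        "T1": lambda i: 2 * i + 1,   # T1 reads A[2i + 1]
        "T2": lambda i: 2 * i - 4,   # T2 reads A[2i - 4]
        "T3": lambda i: i,           # T3 reads A[i]
    }
    writer = {}                      # element of A -> iteration in which S wrote it
    found = {name: [] for name in reads}
    for i in range(lower, upper + 1):
        writer[2 * i] = i            # S: A[2i] = B[i] + 2 runs first in the body
        for name, subscript in reads.items():
            source = writer.get(subscript(i))
            if source is not None:
                found[name].append((source, i))
    return found


deps = flow_dependences()
for name, pairs in deps.items():
    print(name, len(pairs), pairs[:3])
```

In iteration terms, T2 always reads a value written two iterations earlier, T3 reads in iteration 2k the value written in iteration k, and T1 never reads an element written by S.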

Wolfe's Definition
From Wolfe, p. 140:
"A dependence is lexically forward when the source comes before the target without passing through a loop back edge."
"An anti-dependence from a statement to itself is considered lexically forward."
Sk: x[i] = x[i+1] + 1
    x[1] = x[2] + 1
    (back edge)
    x[2] = x[3] + 1
    (back edge)
    x[3] = x[4] + 1

Wolfe's Definition
From Wolfe, p. 140:
"A self-flow dependence is lexically backward."
Sk: x[i] = x[i-1] + 1
    x[1] = x[0] + 1
    (back edge)
    x[2] = x[1] + 1
    (back edge)
    x[3] = x[2] + 1

Allen-Kennedy Definition
From Allen-Kennedy, p. 45:
"Suppose that there is a dependence from statement S1 on iteration i of a loop nest of n loops and statement S2 on iteration j; then the dependence distance vector d(i,j) is defined as a vector of length n such that:
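
The defining formula is not reproduced in this transcript. The usual statement of the distance vector, given here as an assumption to be checked against Allen-Kennedy, p. 45, is the component-wise difference between the target iteration j and the source iteration i:

```latex
d(i, j)_k = j_k - i_k, \qquad 1 \le k \le n
```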

Allen-Kennedy Definition
From Allen-Kennedy, p. 46:
"Suppose that there is a dependence from statement S1 on iteration i of a loop nest of n loops and statement S2 on iteration j; then the dependence direction vector D(i,j) is defined as a vector of length n such that:
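
As with the distance vector, the formula itself is missing from this transcript. The usual statement, given here as an assumption to be checked against Allen-Kennedy, p. 46, derives each component of D(i,j) from the sign of the corresponding distance component:

```latex
D(i, j)_k =
\begin{cases}
\text{``<''} & \text{if } d(i, j)_k > 0 \\
\text{``=''} & \text{if } d(i, j)_k = 0 \\
\text{``>''} & \text{if } d(i, j)_k < 0
\end{cases}
```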

Allen-Kennedy Definition
From Allen-Kennedy, p. 50:
"Statement S2 has a loop-carried dependence on statement S1 if and only if S1 references location M on iteration i, S2 references M on iteration j, and d(i,j) > 0 (that is, D(i,j) contains a "<" as its leftmost non-"=" component)."
"A loop-carried dependence from statement S1 to statement S2 is said to be backward if S2 appears before S1 in the loop body or if S1 and S2 are the same statement. The carried dependence is said to be forward if S2 appears after S1 in the loop body."
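
The three definitions fit together as sketched below. The code assumes the usual component-wise forms, d(i,j)_k = j_k - i_k and a direction symbol chosen by the sign of d(i,j)_k, which are stated as assumptions in the docstrings; the function names are hypothetical.

```python
def distance_vector(i, j):
    """Assumed definition: d(i, j)_k = j_k - i_k."""
    if len(i) != len(j):
        raise ValueError("iteration vectors must have the same length")
    return tuple(jk - ik for ik, jk in zip(i, j))


def direction_vector(i, j):
    """Assumed definition: '<' if d_k > 0, '=' if d_k == 0, '>' if d_k < 0."""
    return tuple("<" if d > 0 else "=" if d == 0 else ">"
                 for d in distance_vector(i, j))


def is_loop_carried(i, j):
    """True when the leftmost non-'=' component of D(i, j) is '<'."""
    for symbol in direction_vector(i, j):
        if symbol != "=":
            return symbol == "<"
    return False  # all '=': source and target are in the same iteration


source, target = (1, 2), (1, 4)
print(distance_vector(source, target),
      direction_vector(source, target),
      is_loop_carried(source, target))
```

A pair with an all-"=" direction vector lies within a single iteration, so under these definitions it is loop independent rather than loop carried.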