ELEC 7770 Advanced VLSI Design, Spring 2007
Constraint Graph and Performance Optimization
Vishwani D. Agrawal
James J. Danaher Professor
ECE Department, Auburn University, Auburn, AL 36849
vagrawal@eng.auburn.edu
http://www.eng.auburn.edu/~vagrawal/COURSE/E7770_Spr07
ELEC 7770: Advanced VLSI Design (Agrawal)

Retiming Theorem
• Given a network G(V, E, W) and a cycle time T, (r_1, . . . ) is a feasible retiming if and only if:
  • r_i – r_j ≤ w_ij for all edges (v_i, v_j) ∈ E
  • r_i – r_j ≤ W(v_i, v_j) – 1 for all node pairs v_i, v_j such that D(v_i, v_j) > T
where
  W(v_i, v_j) is the minimum weight of any path from v_i to v_j
  D(v_i, v_j) is the maximum delay among all minimum-weight paths from v_i to v_j

Timing Optimization
• Find the clock period (T) by path analysis.
• Set clock period to T/2 and find a feasible retiming.
• If feasible, further reduce the clock period to half.
• If not feasible, increase clock period.
• Do a binary search for optimum clock period.
• Retime the circuit.

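The search loop in these steps can be sketched as follows. This is a minimal illustration, not the course's implementation: the function name `min_clock_period`, the `tolerance` parameter and the stand-in feasibility oracle are all assumptions made for the example. In a real flow the oracle would be the retiming feasibility check.

```python
from typing import Callable

def min_clock_period(t_upper: float,
                     is_feasible: Callable[[float], bool],
                     tolerance: float = 0.01) -> float:
    """Binary search for the smallest feasible clock period in (0, t_upper].

    Assumes t_upper (found by path analysis) is feasible and that
    feasibility is monotonic in the clock period.
    """
    lo, hi = 0.0, t_upper              # invariant: hi is feasible, lo is not
    while hi - lo > tolerance:
        mid = (lo + hi) / 2
        if is_feasible(mid):
            hi = mid                   # feasible: try a shorter period
        else:
            lo = mid                   # infeasible: back off to a longer period
    return hi

# Hypothetical oracle for illustration only: a retiming exists iff T >= 10.
best = min_clock_period(15.0, lambda t: t >= 10.0)
print(best)   # a value within 0.01 above 10
```

Starting from T = 15, the first two periods probed are 7.5 and 11.25.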
Representing a Constraint
r_i – r_j ≤ w_ij, or equivalently r_j ≥ r_i – w_ij
[Figure: directed edge from vertex r_i to vertex r_j with weight –w_ij.]

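Constraint Graph
Constraints:
  r_1 ≥ r_0 + 3
  r_1 ≥ r_2 + 1
  r_2 ≥ r_0 + 1
  r_2 ≥ r_1 – 1
  r_3 ≥ r_1 + 1
  r_3 ≥ r_2 + 4
  r_0 ≥ r_3 – 6
[Figure: constraint graph on vertices r_0, r_1, r_2, r_3. Each constraint r_j ≥ r_i + c is an edge r_i → r_j of weight c: r_0 → r_1 (3), r_2 → r_1 (1), r_0 → r_2 (1), r_1 → r_2 (–1), r_1 → r_3 (1), r_2 → r_3 (4), r_3 → r_0 (–6).]
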
Feasibility Condition
• A set of values for the variables can be found if and only if the constraint graph has no positive cycles.
• This is also the condition for solvability of the longest path problem, whose solution satisfies the set of constraints.

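Example: Infeasible Constraints
Constraints:
  x_1 ≥ x_2 + 6
  x_2 ≥ x_1 – 3
[Figure: plot of the two constraints in the (x_1, x_2) plane, with axis marks at 0, 3 and 6, and the two-vertex constraint graph with edge weights 6 and –3.]
The cycle through x_1 and x_2 has weight 6 – 3 = 3. A positive cycle means no longest path can be found.
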
Solving a Constraint Set
Constraints:
  r_1 ≥ r_0 + 3
  r_1 ≥ r_2 + 1
  r_2 ≥ r_0 + 1
  r_2 ≥ r_1 – 1
  r_3 ≥ r_1 + 1
  r_3 ≥ r_2 + 4
  r_0 ≥ r_3 – 6
[Figure: constraint graph on r_0, r_1, r_2, r_3 with edge weights 3, 1, 1, –1, 1, 4, –6.]
Longest path from source r_0: r_0, r_1, r_2, r_3
Path lengths: s_0 = 0, s_1 = 3, s_2 = 2, s_3 = 6
Solution: r_0 = 0, r_1 = 3, r_2 = 2, r_3 = 6

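As a rough sketch of how such a constraint set can be solved mechanically, the snippet below runs a longest-path relaxation (Bellman-Ford with max in place of min) over the constraint graph. The function name `solve_constraints` and the tuple encoding `(j, i, c)` for "r_j ≥ r_i + c" are assumptions chosen for this example. It returns `None` when the values keep growing, which is the positive-cycle case.

```python
def solve_constraints(num_vars, constraints, source=0):
    """Solve constraints of the form r_j >= r_i + c by longest paths.

    constraints: list of (j, i, c) tuples, each meaning r_j >= r_i + c,
    i.e. an edge r_i -> r_j of weight c in the constraint graph.
    Returns the list of values, or None if a positive cycle is detected.
    Variables unreachable from the source stay at -inf.
    """
    neg_inf = float("-inf")
    s = [neg_inf] * num_vars
    s[source] = 0
    for _ in range(num_vars):              # |V| passes: the last one only detects change
        changed = False
        for j, i, c in constraints:
            if s[i] != neg_inf and s[i] + c > s[j]:
                s[j] = s[i] + c            # longer path to r_j found
                changed = True
        if not changed:
            return s
    return None                            # still changing: positive cycle

# The seven constraints of this slide.
slide = [(1, 0, 3), (1, 2, 1), (2, 0, 1), (2, 1, -1),
         (3, 1, 1), (3, 2, 4), (0, 3, -6)]
print(solve_constraints(4, slide))         # [0, 3, 2, 6]

# The infeasible pair x_1 >= x_2 + 6, x_2 >= x_1 - 3 (x_1 is index 0).
print(solve_constraints(2, [(0, 1, 6), (1, 0, -3)]))   # None
```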
The General Path Problem
• Find the shortest (or longest) path in a graph from a source vertex to any other vertex.
• Graph has vertices and directed edges:
  • Edge weights can be positive or negative
  • Graph can be cyclic
• Single source vertex – a vertex with 0 in-degree
• Inconsistent problem:
  • Negative cycles for shortest path
  • Positive cycles for longest path

Dijkstra’s Shortest Path Algorithm
• Greedy algorithm.
• Applies to directed graphs with non-negative edge weights, including directed acyclic graphs (DAGs) with positive edge weights.
• Computational complexity O(|E| + |V| log |V|) ≤ O(n²)
• References:
  • A. Aho, J. Hopcroft and J. Ullman, Data Structures and Algorithms, Reading, Massachusetts: Addison-Wesley, 1983.
  • T. Cormen, C. Leiserson and R. Rivest, Introduction to Algorithms, New York: McGraw-Hill, 1990.

Dijkstra’s Shortest Path Algorithm
[Figure: example graph with source v_0 and vertices v_1, v_2, v_3; edge weights w_01 = 15, 3, 10, 2, 6.]
s_i = path weight (v_0, v_i)
Each step marks the path with smallest weight and updates the unmarked path weights.

Dijkstra’s Algorithm, G(V, E, W)
    s_0^(1) = 0                          initialize source
    for ( i = 1 to n )                   initialize path weights, n = |V| – 1
        s_i^(1) = w_0i
    repeat {
        Select an unmarked vertex v_q such that s_q is minimal
        Mark v_q
        foreach ( unmarked vertex v_i )
            s_i = min { s_i, s_q + w_qi }
    } until (all vertices are marked)

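The pseudocode above can be sketched in Python as shown below. This version uses a binary heap from the standard `heapq` module, so its running time is O((|V| + |E|) log |V|) rather than the bound quoted earlier. The example graph is hypothetical: it borrows the edge weights 15, 3, 10, 2, 6 from the slides, but the edge directions are assumed for illustration.

```python
import heapq

def dijkstra(graph, source):
    """Shortest path weights from source; all edge weights must be >= 0.

    graph: dict mapping each vertex to a dict {successor: weight}.
    """
    dist = {v: float("inf") for v in graph}
    dist[source] = 0
    marked = set()
    heap = [(0, source)]
    while heap:
        d, q = heapq.heappop(heap)         # unmarked vertex with minimal s_q
        if q in marked:
            continue                       # stale heap entry
        marked.add(q)
        for i, w in graph[q].items():      # update unmarked successors
            if i not in marked and d + w < dist[i]:
                dist[i] = d + w
                heapq.heappush(heap, (dist[i], i))
    return dist

# Hypothetical topology (an assumption); weights taken from the slides.
graph = {
    "v0": {"v1": 15, "v2": 10},
    "v1": {"v3": 3},
    "v2": {"v1": 2, "v3": 6},
    "v3": {},
}
print(dijkstra(graph, "v0"))   # {'v0': 0, 'v1': 12, 'v2': 10, 'v3': 15}
```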
Dijkstra’s Longest Path Algorithm
[Figure: the example graph with edge weights w_01 = 15, 3, 10, 2, 6, and the same graph with all weights negated: w_01 = –15, –3, –10, –2, –6.]
s_i = path length (v_0, v_i)

Dijkstra’s Alg. for Cycles, Neg. Weights
[Figure: graph with source v_0 and vertices v_1, v_2, v_3 containing a cycle; edge weights w_01 = 15, 3, 5, 2, 4, –2.]
s_i = path weight (v_0, v_i)
There exists a v_0 to v_3 path of length 5.

Bellman’s Equations – Shortest Path
For all vertices v_i:
    s_i = min { s_q + w_qi : v_q ∈ pred(v_i) }
where s_q = minimum path weight between the source and v_q.
[Figure: vertex v_i with predecessors v_j, v_k, v_m, v_n and incoming edge weights w_ji, w_ki, w_mi, w_ni.]

Bellman-Ford Algorithm, G(V, E, W)
Bellman-Ford {
    s_0^(1) = 0                          initialize source
    for ( i = 1 to n )                   initialize path weights, n = |V| – 1
        s_i^(1) = w_0i
    for ( j = 1 to n ) {                 n iterations
        for ( i = 1 to n )
            s_i^(j+1) = min { s_i^(j), s_k^(j) + w_ki } over all v_k ∈ pred(v_i)
        if ( s_i^(j+1) == s_i^(j) for all i ) return (true)
    }
    return (false)
}
Complexity = O(|V||E|) ≤ O(n³)

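A Python sketch of this iteration is given below, under a few assumptions: the function name `bellman_ford` and the edge encoding `(k, i, w)` are chosen for the example, and the initial vector holds only the source (rather than the one-edge weights w_0i), so up to |V| passes are made. Both example graphs are hypothetical and are not the graphs drawn on the slides.

```python
def bellman_ford(num_vertices, edges, source=0):
    """Shortest path weights by repeated relaxation.

    edges: list of (k, i, w) for a directed edge v_k -> v_i of weight w.
    Returns (True, s) once the weights stabilise, or (False, s) if they
    are still changing after |V| passes (negative cycle).
    """
    inf = float("inf")
    s = [inf] * num_vertices
    s[source] = 0
    for _ in range(num_vertices):
        nxt = s[:]                          # s^(j+1) starts as a copy of s^(j)
        for k, i, w in edges:
            if s[k] + w < nxt[i]:           # s_i^(j+1) = min{s_i^(j), s_k^(j) + w_ki}
                nxt[i] = s[k] + w
        if nxt == s:
            return True, s
        s = nxt
    return False, s

# Hypothetical graph with a cycle (v2 -> v3 -> v1 -> v2) and a negative edge.
cyclic = [(0, 1, 15), (0, 2, 5), (2, 3, 4), (3, 1, -2), (1, 2, 2)]
print(bellman_ford(4, cyclic))              # (True, [0, 7, 5, 9])

# Same graph with the cycle weight made negative: 4 - 2 - 3 = -1.
negative = [(0, 1, 15), (0, 2, 5), (2, 3, 4), (3, 1, -2), (1, 2, -3)]
print(bellman_ford(4, negative)[0])         # False
```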
Bellman-Ford Shortest Path
n = 3
[Figure: example graph with source v_0 and vertices v_1, v_2, v_3; edge weights w_01 = 15, 3, 10, 2, 6.]
s_i = path weight (v_0, v_i)

Bellman-Ford Longest Path
Reverse the sign of the weights and solve the shortest path problem. (Alternative: keep the original weights and change the min operator in the algorithm to max.)
n = 3 (shortest path), weights reversed
[Figure: example graph with negated edge weights w_01 = –15, –3, –10, –2, –6.]
s_i = path weight (v_0, v_i)

Bellman’s Equations – Longest Path
For all vertices v_i:
    s_i = max { s_q + w_qi : v_q ∈ pred(v_i) }
where s_q = maximum path weight between the source and v_q.
[Figure: vertex v_i with predecessors v_j, v_k, v_m, v_n and incoming edge weights w_ji, w_ki, w_mi, w_ni.]

Bellman-Ford for Cycles, Neg. Weights
n = 3 (shortest path)
[Figure: graph with source v_0 and vertices v_1, v_2, v_3 containing a cycle; edge weights w_01 = 15, 3, 5, 2, 4, –2.]
s_i = path weight (v_0, v_i)
Dijkstra’s shortest path algorithm gave an incorrect result for this graph.

Bellman-Ford for Negative Cycle
n = 3 (shortest path)
[Figure: graph with source v_0 and vertices v_1, v_2, v_3 containing a negative cycle; edge weights w_01 = 15, –3, 5, 2, 2, 4.]
s_i = path weight (v_0, v_i)
Values not stabilized after n iterations. Inconsistent problem: negative cycle.

Retiming Example
[Figure: circuit with flip-flops (FF) and gates a, b, c with delays 10, 5 and 5.]

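Retiming Graph
[Figure: the circuit (FF; gates a, b, c with delays 10, 5, 5) and its retiming graph. Vertices: h (delay 0), a (delay 10), b (delay 5), c (delay 5). Edges: h → a (weight 0), a → b (weight 0), b → c (weight 1), c → h (weight 1).]
Critical path = 15. It is the longest path consisting only of zero weight edges.
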
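The critical path definition on this slide can be checked with a small sketch. The function name `critical_path` and the dictionary encoding of the graph are assumptions for the example. The edge list is the one implied by the feasibility constraints of this example (h → a and a → b with weight 0, b → c and c → h with weight 1). The sketch assumes every cycle contains at least one register, so the zero-weight subgraph is acyclic.

```python
from functools import lru_cache

def critical_path(delay, edges):
    """Largest total vertex delay along any path of zero-weight edges.

    delay: dict vertex -> delay; edges: dict (u, v) -> register count.
    Assumes the zero-weight edges form no cycle.
    """
    succ = {v: [] for v in delay}
    for (u, v), w in edges.items():
        if w == 0:
            succ[u].append(v)              # keep only register-free edges

    @lru_cache(maxsize=None)
    def longest_from(u):
        # Delay of u plus the best continuation through zero-weight edges.
        return delay[u] + max((longest_from(v) for v in succ[u]), default=0)

    return max(longest_from(v) for v in delay)

delay = {"h": 0, "a": 10, "b": 5, "c": 5}
edges = {("h", "a"): 0, ("a", "b"): 0, ("b", "c"): 1, ("c", "h"): 1}
print(critical_path(delay, edges))         # 15
```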
Feasibility Constraints
[Figure: the circuit and its retiming graph, repeated from the previous slide.]
r_i – r_j ≤ w_ij for all edges i → j:
  r_h – r_a ≤ 0
  r_a – r_b ≤ 0
  r_b – r_c ≤ 1
  r_c – r_h ≤ 1
Retiming should not cause negative edge weights.

Constraint Graph
[Figure: the circuit and the constraint graph with vertices r_h, r_a, r_b, r_c and edge weights 0, 0, –1, –1.]
r_i – r_j ≤ w_ij for all edges i → j:
  r_h – r_a ≤ 0
  r_a – r_b ≤ 0
  r_b – r_c ≤ 1
  r_c – r_h ≤ 1
Retiming should not cause negative edge weights.
Observation: The constraint graph has the same structure as the original retiming graph, with the signs of the weights reversed. Vertex labels are the retiming integer variables.

Max Delay for Min Weight Paths
[Figure: retiming graph with vertices h (0), a (10), b (5), c (5) and edge weights 0, 0, 1, 1.]
T = 15

  Pair (i,j)   W(i,j)   D(i,j)
  (b,c)        1        10
  (b,h)        2        10
  (b,a)        2        20
  (c,h)        1        5
  (c,a)        1        15
  (c,b)        1        20
  (h,a)        0        10
  (h,b)        0        15
  (h,c)        1        20
  (a,b)        0        15
  (a,c)        1        20
  (a,h)        2        20

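One way to obtain such a table is sketched below. It is not necessarily how the slide's values were derived. Each edge u → v is given the pair (w_uv, –delay(u)), and an all-pairs shortest path pass (Floyd-Warshall) with lexicographic comparison finds the fewest registers first and the largest delay second. The function name `compute_w_d` and the data encoding are assumptions, and the sketch assumes every cycle carries at least one register.

```python
def compute_w_d(delay, edges):
    """Return (W, D) dicts keyed by ordered vertex pairs (u, v), u != v.

    W: minimum register count over all u -> v paths.
    D: maximum total vertex delay among those minimum-weight paths.
    """
    inf = float("inf")
    verts = list(delay)
    best = {(u, v): (inf, 0) for u in verts for v in verts}
    for (u, v), w in edges.items():
        best[(u, v)] = min(best[(u, v)], (w, -delay[u]))
    for k in verts:                                    # Floyd-Warshall on pairs
        for u in verts:
            for v in verts:
                (wa, ya), (wb, yb) = best[(u, k)], best[(k, v)]
                if wa == inf or wb == inf:
                    continue                           # no path through k
                best[(u, v)] = min(best[(u, v)], (wa + wb, ya + yb))
    W = {p: w for p, (w, y) in best.items() if p[0] != p[1]}
    # The pair sums delays of all vertices except the last; add it back.
    D = {p: delay[p[1]] - y for p, (w, y) in best.items() if p[0] != p[1]}
    return W, D

delay = {"h": 0, "a": 10, "b": 5, "c": 5}
edges = {("h", "a"): 0, ("a", "b"): 0, ("b", "c"): 1, ("c", "h"): 1}
W, D = compute_w_d(delay, edges)
print(W[("b", "a")], D[("b", "a")])                    # 2 20
```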
Timing Optimization, T = 7.5?
[Figure: constraint graph (feasibility) with vertices r_h, r_a, r_b, r_c and edge weights 0, 0, –1, –1.]
Add the constraint r_i – r_j ≤ W(i,j) – 1 for every path (i,j) such that D(i,j) > 7.5.
With the W and D values listed under "Max Delay for Min Weight Paths", every pair except (c,h), where D(c,h) = 5, has D(i,j) > 7.5.

Timing Optimization, T = 7.5?
[Figure: constraint graph with the timing edges added; the edge weights shown are 1, 0 and –1.]
Positive cycle: no solution. For example, r_h → r_a → r_b → r_h has weight 1 + 1 – 1 = 1, which follows from r_h – r_a ≤ –1, r_a – r_b ≤ –1 and r_b – r_h ≤ 1.

Timing Optimization, T = 11.25?
[Figure: constraint graph with timing edges added for the pairs with D(i,j) > 11.25.]
Pairs with D(i,j) > 11.25: (b,a), (c,a), (c,b), (h,b), (h,c), (a,b), (a,c), (a,h).
Solution: r_h = 0, r_a = 0, r_b = 1, r_c = 0

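The two trials (T = 7.5 and T = 11.25) can be reproduced with the sketch below. It builds the feasibility and timing constraints from the retiming theorem and solves them with a longest-path relaxation from r_h. The function name `retime` and the data layout are assumptions for illustration, and the W and D values are copied from the table of this example. Only distinct vertex pairs are considered, as in that table, so a single vertex whose own delay exceeds T is not checked separately.

```python
def retime(vertices, edges, W, D, T, source):
    """Return a retiming dict for clock period T, or None if infeasible."""
    # r_i - r_j <= c is r_j >= r_i - c: constraint-graph edge i -> j, weight -c.
    cons = [(i, j, -w) for (i, j), w in edges.items()]            # feasibility
    cons += [(i, j, -(W[(i, j)] - 1)) for (i, j) in W if D[(i, j)] > T]  # timing
    r = {v: float("-inf") for v in vertices}
    r[source] = 0
    for _ in range(len(vertices)):
        changed = False
        for i, j, c in cons:
            if r[i] + c > r[j]:
                r[j] = r[i] + c
                changed = True
        if not changed:
            return r
    return None                                # positive cycle: no solution

vertices = ["h", "a", "b", "c"]
edges = {("h", "a"): 0, ("a", "b"): 0, ("b", "c"): 1, ("c", "h"): 1}
table = {  # (i, j): (W, D), copied from the slides
    ("b", "c"): (1, 10), ("b", "h"): (2, 10), ("b", "a"): (2, 20),
    ("c", "h"): (1, 5), ("c", "a"): (1, 15), ("c", "b"): (1, 20),
    ("h", "a"): (0, 10), ("h", "b"): (0, 15), ("h", "c"): (1, 20),
    ("a", "b"): (0, 15), ("a", "c"): (1, 20), ("a", "h"): (2, 20),
}
W = {p: wd[0] for p, wd in table.items()}
D = {p: wd[1] for p, wd in table.items()}

print(retime(vertices, edges, W, D, 7.5, "h"))    # None
print(retime(vertices, edges, W, D, 11.25, "h"))  # {'h': 0, 'a': 0, 'b': 1, 'c': 0}
```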
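Retiming Graph
[Figure: the circuit (FF; gates a, b, c with delays 10, 5, 5) and its retiming graph with vertices h (0), a (10), b (5), c (5), annotated with original and retimed edge weights.]
r_c = 0, r_h = 0, r_a = 0, r_b = 1
w_ij_retimed = w_ij + r_j – r_i
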
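Applying the formula to each edge is a one-liner; the sketch below uses the edge weights implied by the feasibility constraints of this example and the retiming values given on this slide. The function name `apply_retiming` is an assumption.

```python
def apply_retiming(edges, r):
    """Retimed weight of each edge i -> j: w_ij + r_j - r_i."""
    return {(i, j): w + r[j] - r[i] for (i, j), w in edges.items()}

edges = {("h", "a"): 0, ("a", "b"): 0, ("b", "c"): 1, ("c", "h"): 1}
r = {"h": 0, "a": 0, "b": 1, "c": 0}
retimed = apply_retiming(edges, r)
print(retimed)
# {('h', 'a'): 0, ('a', 'b'): 1, ('b', 'c'): 0, ('c', 'h'): 1}
```

No retimed weight is negative, and the register count around the cycle is still 2, as expected for a legal retiming.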
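Retimed Circuit
[Figure: retimed circuit with FF and gates a, c, b (delays 10, 5, 5), and the retimed graph with vertices h (0), a (10), b (5), c (5) and edge weights 1, 1, 0, 0. Annotation in the figure: "Logic optimization will remove these."]
r_c = 0, r_h = 0, r_a = 0, r_b = 1
Critical Path = 10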