ELEC 7770 Advanced VLSI Design, Spring 2014
Constraint Graph and Retiming Solution
Vishwani D. Agrawal, James J. Danaher Professor
ECE Department, Auburn University, Auburn, AL 36849
vagrawal@eng.auburn.edu
http://www.eng.auburn.edu/~vagrawal/COURSE/E7770_Spr14/course.html
ELEC 7770: Advanced VLSI Design (Agrawal)

Retiming Theorem
• Given a network G(V, E, W) and a cycle time T, (r_1, ...) is a feasible retiming if and only if:
  1. r_i – r_j ≤ w_ij for all edges (v_i, v_j) ∈ E
  2. r_i – r_j ≤ W(v_i, v_j) – 1 for all node pairs v_i, v_j such that D(v_i, v_j) > T
• Where:
  • W(v_i, v_j) is the minimum weight over all paths from v_i to v_j
  • D(v_i, v_j) is the maximum delay among all minimum-weight paths from v_i to v_j

Retiming Theorem Explained
• Condition 1, r_i – r_j ≤ w_ij, is related to edge weight:
  • The original circuit is feasible, so the original weight w_ij is non-negative.
  • Originally, r_i = r_j = 0.
  • Retiming adds r_j flip-flops to e_ij and removes r_i flip-flops from e_ij. The net reduction r_i – r_j must not exceed w_ij, so that the retimed weight w_ij + r_j – r_i of e_ij stays non-negative.
• Condition 2, r_i – r_j ≤ W(v_i, v_j) – 1, is related to path delay: every path whose delay exceeds the clock period T must keep at least one flip-flop after retiming, so that every zero-weight path has delay at most T.

Examine Condition 2
[Figure: three paths from v_i (retiming r_i) to v_j (retiming r_j), labeled (W1, D1), (W2, D2) and (W3, D3)]
• W1 = W2 < W3, so W(v_i, v_j) = W1 = W2, the minimum weight among the paths.
• D1 > D2, therefore D(v_i, v_j) = D1, the maximum delay of a minimum-weight path.
• If D1 ≤ T, there is no requirement on r_i, r_j.
• If D1 > T, the retimed weight must satisfy W1' = W1 – r_i + r_j ≥ 1 (at least 1 FF on the path), or r_i – r_j ≤ W1 – 1.

Timing Optimization
• Find the clock period (T) by path analysis.
• Set the clock period to T/2 and find a feasible retiming.
• If feasible, halve the clock period again.
• If not feasible, increase the clock period to a value between the last infeasible and the last feasible period.
• Do a binary search for the optimum clock period.
• Retime the circuit.
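
The binary search in the steps above can be sketched as follows. This is a minimal illustration, not the course's code: the function name `min_clock_period`, the tolerance `tol`, and the stand-in oracle `lambda t: t >= 10.0` are all assumptions. In practice the oracle would be a retiming feasibility check for the given period.

```python
def min_clock_period(t_initial, feasible, tol=0.01):
    """Binary search for the smallest feasible clock period.

    t_initial: period of the original circuit (assumed feasible).
    feasible:  hypothetical oracle, returns True if a retiming exists for T.
    """
    lo, hi = 0.0, t_initial      # hi is always a feasible period
    probes = []                  # periods tried, kept for inspection
    while hi - lo > tol:
        mid = (lo + hi) / 2
        probes.append(mid)
        if feasible(mid):
            hi = mid             # feasible: try a shorter period
        else:
            lo = mid             # infeasible: period must be longer
    return hi, probes


# Stand-in oracle: pretend every period of at least 10 is feasible.
best, probes = min_clock_period(15.0, lambda t: t >= 10.0)
print(probes[:2])  # [7.5, 11.25]
```

Starting from T = 15, the first two periods probed are 7.5 and 11.25.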

Representing a Constraint
• r_i – r_j ≤ w_ij, or equivalently r_j ≥ r_i – w_ij
• In the constraint graph this is an edge from r_i to r_j with weight –w_ij.
[Figure: edge from r_i to r_j labeled –w_ij]

Constraint Graph
• Constraints:
  r1 ≥ r0 + 3
  r1 ≥ r2 + 1
  r2 ≥ r0 + 1
  r2 ≥ r1 – 1
  r3 ≥ r1 + 1
  r3 ≥ r2 + 4
  r0 ≥ r3 – 6
[Figure: constraint graph on r0, r1, r2, r3 with one edge per constraint, edge weights 3, 1, 1, –1, 1, 4 and –6]

Feasibility Condition
• A set of values for the variables can be found if and only if the constraint graph has no positive cycles.
• This is also the condition for the solvability of the longest path problem, which provides a solution to the set of constraints.

Example: Infeasible Constraints
• x1 ≥ x2 + 6
• x2 ≥ x1 – 3
[Figure: two-vertex constraint graph on x1 and x2 with edge weights 6 and –3]
• The cycle through x1 and x2 has weight 6 – 3 = 3. A positive cycle means no longest path can be found.
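
As a short worked check of this example, substituting the second constraint into the first shows the contradiction directly:

```latex
\begin{aligned}
x_1 &\ge x_2 + 6, \qquad x_2 \ge x_1 - 3 \\
\Rightarrow\quad x_1 &\ge (x_1 - 3) + 6 = x_1 + 3 \\
\Rightarrow\quad 0 &\ge 3 \quad \text{(contradiction)}
\end{aligned}
```

The surplus 3 is exactly the weight of the cycle, 6 + (–3).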

Solving a Constraint Set
• Constraints:
  r1 ≥ r0 + 3
  r1 ≥ r2 + 1
  r2 ≥ r0 + 1
  r2 ≥ r1 – 1
  r3 ≥ r1 + 1
  r3 ≥ r2 + 4
  r0 ≥ r3 – 6
[Figure: constraint graph on r0, r1, r2, r3 with edge weights 3, 1, 1, –1, 1, 4 and –6]
• Longest paths from source r0 to r0, r1, r2, r3
• Path lengths: s0 = 0, s1 = 3, s2 = 2, s3 = 6
• Solution: r0 = 0, r1 = 3, r2 = 2, r3 = 6
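
A minimal Python sketch of solving such a constraint set by longest-path relaxation is given below. The function name `solve_constraints` and the tuple encoding `(j, i, c)` for "r_j ≥ r_i + c" are assumptions made for this illustration. The two data sets are the constraint set above and the infeasible x1, x2 example.

```python
def solve_constraints(n, constraints, source=0):
    """Longest paths from `source` in the constraint graph.

    constraints: list of (j, i, c) meaning r_j >= r_i + c,
                 i.e. an edge i -> j of weight c.
    Returns the list of path lengths, or None if a positive cycle
    keeps the values growing. Unreachable vertices stay at -inf.
    """
    NEG = float("-inf")
    s = [NEG] * n
    s[source] = 0
    for _ in range(n):                 # n passes are enough without positive cycles
        changed = False
        for j, i, c in constraints:
            if s[i] != NEG and s[i] + c > s[j]:
                s[j] = s[i] + c        # a longer path to j was found
                changed = True
        if not changed:
            return s                   # values are stable: a solution
    return None                        # still changing: positive cycle


slide_constraints = [(1, 0, 3), (1, 2, 1), (2, 0, 1), (2, 1, -1),
                     (3, 1, 1), (3, 2, 4), (0, 3, -6)]
print(solve_constraints(4, slide_constraints))   # [0, 3, 2, 6]

# x1 >= x2 + 6 and x2 >= x1 - 3, with x1 as index 0 and x2 as index 1
infeasible = [(0, 1, 6), (1, 0, -3)]
print(solve_constraints(2, infeasible))          # None
```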

The General Path Problem
• Find the shortest (or longest) path in a graph from a source vertex to all other vertices.
• Graph has vertices and directed edges:
  • Edge weights can be positive or negative
  • Graph can be cyclic
  • Single source vertex: a vertex with 0 in-degree (not a necessary condition)
• Inconsistent problems:
  • Negative weight cycles for shortest path
  • Positive weight cycles for longest path

Dijkstra’s Shortest Path Algorithm
• Greedy algorithm.
• Applies to directed acyclic graphs (DAG) with positive edge weights.
• Computational complexity O(|E| + |V| log |V|) ≤ O(n^2)
• References:
  • A. Aho, J. Hopcroft and J. Ullman, Data Structures and Algorithms, Reading, Massachusetts: Addison-Wesley, 1983.
  • T. Cormen, C. Leiserson and R. Rivest, Introduction to Algorithms, New York: McGraw-Hill, 1990.

Dijkstra’s Shortest Path Algorithm, Example 1
[Figure: four-vertex graph with source v0 and vertices v1, v2, v3; edge weights shown: w01 = 15, 3, 10, 2, 6]
• s_i = path weight (v0, v_i)
• Each step marks the path with the smallest weight and updates the unmarked path weights.

Dijkstra’s Shortest Path Algorithm, Example 2
[Figure: four-vertex graph with source v0 and vertices v1, v2, v3; edge weights shown: w01 = 15, 3, 6, 2, 10]
• s_i = path weight (v0, v_i)
• Each step marks the path with the smallest weight and updates the unmarked path weights.

Dijkstra’s Algorithm, G(V, E, W)
  s_0^(1) = 0                          initialize source
  for ( i = 1 to n )                   initialize path weights, n = |V| – 1
      s_i^(1) = w_0i
  repeat {
      select an unmarked vertex v_q such that s_q is minimal
      mark v_q
      foreach ( unmarked vertex v_i )
          s_i = min { s_i, s_q + w_qi }
  } until ( all vertices are marked )
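
The pseudocode above could be realized in Python roughly as follows. This sketch uses a binary heap from the standard `heapq` module to pick the minimal unmarked vertex, which is an implementation choice not stated on the slide. The example graph is hypothetical: it borrows the weights 15, 10, 2, 3, 6 from Example 1, but the edge directions are assumed.

```python
import heapq


def dijkstra(adj, source):
    """Shortest path weights from source; all edge weights must be >= 0.

    adj: {u: [(v, w), ...]}; every vertex must appear as a key.
    """
    dist = {u: float("inf") for u in adj}
    dist[source] = 0
    marked = set()
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)     # unmarked vertex with minimal weight
        if u in marked:
            continue                   # stale heap entry, already marked
        marked.add(u)
        for v, w in adj[u]:
            if v not in marked and d + w < dist[v]:
                dist[v] = d + w        # update unmarked path weight
                heapq.heappush(heap, (dist[v], v))
    return dist


# Hypothetical graph: edge directions are assumed, not taken from the figure.
graph = {
    "v0": [("v1", 15), ("v2", 10)],
    "v1": [("v3", 3)],
    "v2": [("v1", 2), ("v3", 6)],
    "v3": [],
}
print(dijkstra(graph, "v0"))  # {'v0': 0, 'v1': 12, 'v2': 10, 'v3': 15}
```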

Try Dijkstra’s Algorithm for Your Graph
http://www.dgp.toronto.edu/people/JamesStewart/270/9798s/Laffra/DijkstraApplet.html

Dijkstra’s Longest Path Algorithm
• Either change min to max,
• or change all positive weights to negatives.
[Figure: the example graph drawn twice, once with edge weights w01 = 15, 3, 10, 2, 6 and once with the signs reversed: w01 = –15, –3, –10, –2, –6]
• s_i = path length (v0, v_i)

Dijkstra’s Alg. Does Not Work for Cycles, Mixed Weights
[Figure: graph with source v0 and vertices v1, v2, v3; edge weights shown: w01 = 15, 3, 5, 2, 4 and –2]
• s_i = path weight (v0, v_i)
• The algorithm stops because all vertices are marked.
• But there exists a v0 to v3 path of length 5.
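
The failure mode can be reproduced on a small example. The sketch below follows the mark-once pseudocode literally, using a weight matrix with infinity for missing edges. The three-vertex graph is hypothetical and is not the graph in the figure. It has one negative edge, so the vertex marked early is never revisited.

```python
INF = float("inf")


def dijkstra_mark_once(w, source=0):
    """Array version of the slide's pseudocode: each vertex is marked once."""
    n = len(w)
    s = [0 if i == source else w[source][i] for i in range(n)]
    marked = [False] * n
    for _ in range(n):
        # select the unmarked vertex with minimal path weight
        q = min((i for i in range(n) if not marked[i]), key=lambda i: s[i])
        marked[q] = True
        for i in range(n):
            if not marked[i]:          # marked vertices are never updated
                s[i] = min(s[i], s[q] + w[q][i])
    return s


# Hypothetical graph: v0->v1 = 4, v0->v2 = 3, v1->v2 = -2
w = [[INF, 4, 3],
     [INF, INF, -2],
     [INF, INF, INF]]
print(dijkstra_mark_once(w))   # [0, 4, 3]
print(w[0][1] + w[1][2])       # 2, the true shortest v0 -> v2 weight
```

Vertex v2 is marked with weight 3 before v1 is processed, so the shorter path through the negative edge is missed.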

Bellman’s Equations – Shortest Path
[Figure: vertex v_i with predecessors v_j, v_k, v_m, v_n and incoming edge weights w_ji, w_ki, w_mi, w_ni]
• For all vertices: s_i = min (s_q + w_qi), with the minimum taken over v_q ∈ pred(v_i)
• s_q = minimum path weight between source and v_q

Bellman-Ford Algorithm, G(V, E, W)
  Bellman-Ford {
      s_0^(1) = 0                          initialize source
      for ( i = 1 to n )                   initialize path weights, n = |V| – 1
          s_i^(1) = w_0i
      for ( j = 1 to n ) {                 n iterations
          for ( i = 1 to n )               n nodes
              s_i^(j+1) = min { s_i^(j), s_k^(j) + w_ki }, minimum over v_k ∈ pred(v_i)
          if ( s_i^(j+1) == s_i^(j) for all i ) return (true)
      }
      return (false)
  }
Complexity = O(|V||E|) ≤ O(n^3)
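
A minimal Python sketch of the same iteration is shown below. The function name `bellman_ford`, the edge-list encoding `(k, i, w)` for an edge v_k → v_i, and both example graphs are assumptions for illustration. Each pass computes the next values from the previous ones only, as in the pseudocode, and the function returns false when the values have not stabilized.

```python
def bellman_ford(n_vertices, edges, source=0):
    """Shortest path weights with negative edges allowed.

    edges: list of (k, i, w) for an edge v_k -> v_i of weight w.
    Returns (True, s) when the values stabilize, (False, s) when they do
    not, which indicates a negative cycle reachable from the source.
    """
    INF = float("inf")
    s = [INF] * n_vertices
    s[source] = 0
    for _ in range(n_vertices):
        nxt = list(s)                  # s(j+1) is built from s(j) only
        for k, i, w in edges:
            if s[k] + w < nxt[i]:
                nxt[i] = s[k] + w
        if nxt == s:
            return True, s             # stable: shortest paths found
        s = nxt
    return False, s                    # not stable: inconsistent problem


# Hypothetical graph with a negative edge but no negative cycle
ok_edges = [(0, 1, 4), (0, 2, 3), (1, 2, -2)]
print(bellman_ford(3, ok_edges))       # (True, [0, 4, 2])

# Hypothetical graph with a negative cycle v1 -> v2 -> v1 of weight -1
bad_edges = [(0, 1, 1), (1, 2, -3), (2, 1, 2)]
print(bellman_ford(3, bad_edges)[0])   # False
```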

Bellman-Ford Shortest Path
[Figure: four-vertex graph with source v0 and vertices v1, v2, v3; edge weights shown: w01 = 15, 3, 10, 2, 6]
• n = 3
• s_i = path weight (v0, v_i)

Bellman-Ford Longest Path
• Reverse the sign of the weights and solve the shortest path problem.
• Alternative: keep the original weights and change the min operator in the algorithm to max.
[Figure: example graph with weights reversed: w01 = –15, –3, –10, –2, –6]
• n = 3 (shortest path)
• s_i = path weight (v0, v_i)

Bellman’s Equations – Longest Path
[Figure: vertex v_i with predecessors v_j, v_k, v_m, v_n and incoming edge weights w_ji, w_ki, w_mi, w_ni]
• For all vertices: s_i = max (s_q + w_qi), with the maximum taken over v_q ∈ pred(v_i)
• s_q = maximum path weight between source and v_q

Bellman-Ford for Cycles, Neg. Weights
[Figure: graph with source v0 and vertices v1, v2, v3; edge weights shown: w01 = 15, 3, 5, 2, 4 and –2]
• n = 3 (shortest path)
• s_i = path weight (v0, v_i)
• This was incorrect with Dijkstra’s shortest path algorithm.

Bellman-Ford for Negative Cycle
[Figure: graph with source v0 and vertices v1, v2, v3; edge weights shown: w01 = 15, –3, 5, 2, 4 and 2]
• n = 3 (shortest path)
• s_i = path weight (v0, v_i)
• Values not stabilized after n iterations.
• Inconsistent problem: negative cycle.

Retiming Example
[Figure: circuit with flip-flops (FF) and gates a, b, c; delays: a = 10, b = 5, c = 5]

Retiming Graph
[Figure: the circuit and its retiming graph; vertices h (delay 0), a (delay 10), b (delay 5), c (delay 5); edge weights 0, 0, 1, 1]
• Critical path = 15
• It is the longest path consisting only of zero-weight edges.

Feasibility Constraints (Condition 1)
[Figure: the circuit and its retiming graph, with r_b FFs and r_c FFs marked]
• r_i – r_j ≤ w_ij for all edges i → j:
  rh – ra ≤ 0
  ra – rb ≤ 0
  rb – rc ≤ 1
  rc – rh ≤ 1
• Retiming should not cause negative edge weights.

Constraint Graph
[Figure: the circuit and its constraint graph on rh, ra, rb, rc with edge weights 0, 0, –1, –1]
• Constraints for Condition 1, r_i – r_j ≤ w_ij for all edges i → j:
  rh – ra ≤ 0    rh – 0 ≤ ra
  ra – rb ≤ 0    ra – 0 ≤ rb
  rb – rc ≤ 1    rb – 1 ≤ rc
  rc – rh ≤ 1    rc – 1 ≤ rh
• Retiming should not cause negative edge weights.
• Observation: the constraint graph has the same structure as the original retiming graph, with the signs of the weights reversed. Vertex labels are the retiming integer variables.

Max Delay for Min Weight Paths
[Figure: retiming graph with vertices h (0), a (10), b (5), c (5) and edge weights 0, 0, 1, 1]
T = 15
  Pair     W    D
  (b,c)    1    10
  (b,h)    2    10
  (b,a)    2    20
  (c,h)    1    5
  (c,a)    1    15
  (c,b)    1    20
  (h,a)    0    10
  (h,b)    0    15
  (h,c)    1    20
  (a,b)    0    15
  (a,c)    1    20
  (a,h)    2    20
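
The W and D values can be computed rather than read off by hand. The sketch below is one possible way, not necessarily the method used in the course: it runs an all-pairs shortest path with lexicographic pair costs (weight, minus delay), so that among minimum-weight paths the one with the largest delay wins. The edge list is taken from the Condition 1 constraints (h→a and a→b with weight 0, b→c and c→h with weight 1). The names `wd_matrices`, `delay` and `edges` are assumptions.

```python
from itertools import product

delay = {"h": 0, "a": 10, "b": 5, "c": 5}
edges = {("h", "a"): 0, ("a", "b"): 0, ("b", "c"): 1, ("c", "h"): 1}


def wd_matrices(delay, edges):
    """Return (W, D) dictionaries keyed by vertex pair (u, v), u != v."""
    inf = float("inf")
    verts = list(delay)
    # cost = (path weight, -(delay of every path vertex except the last))
    cost = {(u, v): (inf, 0) for u in verts for v in verts}
    for (u, v), w in edges.items():
        cost[u, v] = min(cost[u, v], (w, -delay[u]))
    for k, u, v in product(verts, repeat=3):       # k is the outermost loop
        via = (cost[u, k][0] + cost[k, v][0], cost[u, k][1] + cost[k, v][1])
        if via[0] < inf and via < cost[u, v]:      # tuples compare lexicographically
            cost[u, v] = via
    W = {(u, v): c[0] for (u, v), c in cost.items() if u != v and c[0] < inf}
    D = {(u, v): delay[v] - cost[u, v][1] for (u, v) in W}
    return W, D


W, D = wd_matrices(delay, edges)
print(W["b", "a"], D["b", "a"])   # 2 20
print(W["c", "h"], D["c", "h"])   # 1 5
```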

Timing Optimization, T = 7.5?
[Figure: constraint graph (feasibility) on rh, ra, rb, rc with Condition 1 edge weights 0, 0, –1, –1]
• Add constraints for Condition 2: r_i – r_j ≤ W(i,j) – 1 for paths (i,j) with D(i,j) > 7.5
• With the W and D values listed under "Max Delay for Min Weight Paths", every pair except (c,h), which has D(c,h) = 5, has D > 7.5.

Timing Optimization, T = 7.5?
[Figure: constraint graph on rh, ra, rb, rc with the Condition 2 edges added; edge weights shown are 1, 0 and –1]
• Positive cycle; no solution for longest path.

Timing Optimization, T = 11.25?
[Figure: constraint graph on rh, ra, rb, rc with the Condition 2 edges for pairs with D(i,j) > 11.25; edge weights shown are 1, 0 and –1]
• Solution: rh = 0, ra = 0, rb = 1, rc = 0
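
The two feasibility checks (T = 7.5 and T = 11.25) can be reproduced with a short script. This is a sketch under stated assumptions: the function name `retime` and the data layout are invented for illustration, the (W, D) pairs are copied from the table on the "Max Delay for Min Weight Paths" slide, and rh is used as the source fixed at 0. Each constraint r_i – r_j ≤ c is treated as r_j ≥ r_i – c and solved by longest-path relaxation.

```python
edges = {("h", "a"): 0, ("a", "b"): 0, ("b", "c"): 1, ("c", "h"): 1}
WD = {  # (W, D) for each vertex pair, copied from the slide table
    ("b", "c"): (1, 10), ("b", "h"): (2, 10), ("b", "a"): (2, 20),
    ("c", "h"): (1, 5),  ("c", "a"): (1, 15), ("c", "b"): (1, 20),
    ("h", "a"): (0, 10), ("h", "b"): (0, 15), ("h", "c"): (1, 20),
    ("a", "b"): (0, 15), ("a", "c"): (1, 20), ("a", "h"): (2, 20),
}


def retime(T, source="h"):
    """Return a retiming {vertex: r} for clock period T, or None."""
    # Condition 1: r_i - r_j <= w_ij for every edge
    cons = [(i, j, w) for (i, j), w in edges.items()]
    # Condition 2: r_i - r_j <= W - 1 for every pair with D > T
    cons += [(i, j, W - 1) for (i, j), (W, D) in WD.items() if D > T]
    r = {v: float("-inf") for v in "habc"}
    r[source] = 0
    for _ in range(len(r)):
        changed = False
        for i, j, c in cons:
            if r[i] - c > r[j]:        # enforce r_j >= r_i - c
                r[j] = r[i] - c
                changed = True
        if not changed:
            return r                   # stable: feasible retiming
    return None                        # positive cycle: infeasible


print(retime(7.5))     # None
print(retime(11.25))   # {'h': 0, 'a': 0, 'b': 1, 'c': 0}
```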

Retiming Graph
[Figure: the circuit and its retiming graph, annotated with the retiming values]
• rh = 0, ra = 0, rb = 1, rc = 0
• w_ij_retimed = w_ij + r_j – r_i

Retimed Circuit
[Figure: retimed circuit and retimed graph with edge weights 1, 1, 0, 0; figure annotation: "Logic optimization will remove these."]
• rh = 0, ra = 0, rb = 1, rc = 0
• Critical path = 10
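
As a check on these numbers, the retimed weights and the critical path can be computed from the formula w_ij_retimed = w_ij + r_j – r_i. The sketch below assumes the edge list implied by the Condition 1 constraints. The helper name `critical_path` is hypothetical, and its recursion relies on the zero-weight edges forming no cycle, which holds when every cycle keeps at least one flip-flop.

```python
delay = {"h": 0, "a": 10, "b": 5, "c": 5}
edges = {("h", "a"): 0, ("a", "b"): 0, ("b", "c"): 1, ("c", "h"): 1}
r = {"h": 0, "a": 0, "b": 1, "c": 0}

# w_retimed(i, j) = w(i, j) + r_j - r_i
retimed = {(i, j): w + r[j] - r[i] for (i, j), w in edges.items()}


def critical_path(delay, weights):
    """Largest total delay along any path made only of zero-weight edges."""
    zero = {}
    for (i, j), w in weights.items():
        if w == 0:
            zero.setdefault(i, []).append(j)

    def longest_from(v):
        # assumes the zero-weight subgraph is acyclic
        return delay[v] + max((longest_from(u) for u in zero.get(v, [])),
                              default=0)

    return max(longest_from(v) for v in delay)


print(retimed)                        # h->a: 0, a->b: 1, b->c: 0, c->h: 1
print(critical_path(delay, edges))    # 15
print(critical_path(delay, retimed))  # 10
```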

Correlator Circuit
[Figure: correlator retiming graph with host h and vertices a, b, c, d (delay 3 each) and e, f, g (delay 7 each); edge weights are 0 or 1]
• Critical path delay = 24
• ra = 0, rb = 0, rc = 0, rd = 0, re = 0, rf = 0, rg = 0, rh = 0
• Initial retiming vector = {0,0,0,0,0,0,0,0}

Retiming Optimization

Retiming of Correlator Circuit
[Figure: correlator retiming graph with edge weights updated step by step, for example 0→1, 0→2→1, 1→2→0, 1→3→1]
• Critical path delay = 13.5
• ra = –1, rb = –1, rc = –2, rd = –2, re = –2, rf = –1, rg = 0, rh = 0
• Retiming vector = {-1,-1,-2,-2,-2,-1,0,0}

Retimed Correlator Circuit
[Figure: retimed correlator graph with vertices a, b, c, d (delay 3 each), e, f, g (delay 7 each) and host h; edge weights are 0 or 1]
• Critical path delay = 13
• ra = –1, rb = –1, rc = –2, rd = –2, re = –2, rf = –1, rg = 0, rh = 0
• Retiming vector = {-1,-1,-2,-2,-2,-1,0,0}

References
• C. E. Leiserson, F. Rose and J. B. Saxe, “Optimizing Synchronous Circuits by Retiming,” Proc. 3rd Caltech Conf. on VLSI, 1983, pp. 87-116.
• C. E. Leiserson and J. B. Saxe, “Retiming Synchronous Circuitry,” Algorithmica, vol. 6, pp. 5-35, 1991.
• G. De Micheli, Synthesis and Optimization of Digital Circuits, New York: McGraw-Hill, 1994, Section 9.3.1.
• N. Maheshwari and S. S. Sapatnekar, Timing Analysis and Optimization of Sequential Circuits, Boston: Springer, 1999, Chapter 4.