ELEC 7770 Advanced VLSI Design, Spring 2012: Retiming
Vishwani D. Agrawal, James J. Danaher Professor
ECE Department, Auburn University, Auburn, AL 36849
vagrawal@eng.auburn.edu
http://www.eng.auburn.edu/~vagrawal/COURSE/E7770_Spr12/course.html
Retiming
• Retiming is a function-preserving transformation of a synchronous sequential circuit.
• Flip-flops are moved according to specific rules.
• Original references:
  • C. E. Leiserson, F. Rose and J. B. Saxe, “Optimizing Synchronous Circuits by Retiming,” Proc. 3rd Caltech Conf. on VLSI, 1983, pp. 87-116.
  • C. E. Leiserson and J. B. Saxe, “Retiming Synchronous Circuitry,” Algorithmica, vol. 6, pp. 5-35, 1991.
A Trivial Example: Reduced Hardware
[Figure: a circuit and its retimed version, showing the flip-flop (FF) positions.]
Example 2: Faster Clock
[Figure: a circuit and its retimed version, showing the flip-flop (FF) positions.]
Example 3: Reduced Flip-Flops
[Figure: a circuit and its retimed version, showing the flip-flop (FF) positions.]
Applications of Retiming
• Performance optimization
• Area optimization
• Power optimization
• Testability enhancement
• FPGA optimization
Fundamental Operation of Retiming
• A retiming move shifts all of the memory elements at the inputs of a combinational block to all of its outputs, or vice versa.
[Figure: a flip-flop (FF) at each input of a combinational logic block is equivalent (≡) to a flip-flop at its output.]
A Correlator Circuit
• Adder delay = 7
• Comparator delay = 3
[Figure: correlator circuit with primary input (PI), primary output (PO), host, four comparators (=) with reference inputs a1, a2, a3, a4, three adders (+), and flip-flops.]
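Graph Model
• Vertex $v_i$: combinational block with delay $d(v_i)$, assumed unchanged by retiming; $d(\text{host}) = 0$.
• Edge $e(v_i, v_j)$ or $e_{ij}$: weight $w_{ij}$ = number of flip-flops between $v_i$ and $v_j$.
[Figure: graph of the correlator with host vertex h (delay 0), vertices a, b, c, d (delay 3) and e, f, g (delay 7); four edges have weight 1 and the remaining edges have weight 0.]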
Path Delay and Path Weight
• A sequence of connected nodes specifies a path. A path does not pass through the host node.
• Path delay = $\sum d(v_i)$ = combinational delay of the path
• Path weight = $\sum w_{ij}$ = clock delay of the path (number of flip-flops along it)
• Retiming of a node $i$ is denoted by an integer $r_i$.
  • $r_i$ is the net number of registers moved across node $i$ from its outputs to its inputs; initially $r_i = 0$.
  • Register moved from output to input: $r_i \to r_i + 1$
  • Register moved from input to output: $r_i \to r_i - 1$
• After retiming, edge weight $w'_{ij} = w_{ij} + r_j - r_i$
Example of Node Retiming
[Figure: a chain of six nodes, each with delay 3, before and after retiming.]
• Before: $r_1 = r_2 = r_3 = r_4 = r_5 = r_6 = 0$; for the path shown, $\sum d(v_i) = 12$ and $\sum w_{ij} = 0$.
• After: $r_2 = -1$, $r_5 = 1$, all other $r_i = 0$; for the same path, $\sum d(v_i) = 12$ and $\sum w_{ij} = 2$.
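The edge rule $w'_{ij} = w_{ij} + r_j - r_i$ can be tried out on this six-node chain. The sketch below is illustrative only: it assumes the path in question runs from node 2 to node 5 (four nodes of delay 3, hence delay 12), and it assumes one flip-flop on each of the edges 1→2 and 5→6 so that the move stays legal. Neither assumption can be read off the flattened figure, and the helper names `retime` and `path_weight` are made up for this example.

```python
# Hypothetical six-node chain: (tail, head) -> number of flip-flops.
# The weights on 1->2 and 5->6 are assumptions, not taken from the slide.
w = {(1, 2): 1, (2, 3): 0, (3, 4): 0, (4, 5): 0, (5, 6): 1}

# Retiming values from the slide: r2 = -1, r5 = 1, all others 0.
r = {1: 0, 2: -1, 3: 0, 4: 0, 5: 1, 6: 0}


def retime(weights, r):
    """Return retimed edge weights w'_ij = w_ij + r_j - r_i."""
    return {(i, j): wij + r[j] - r[i] for (i, j), wij in weights.items()}


def path_weight(weights, path):
    """Sum the edge weights along a list of consecutive nodes."""
    return sum(weights[(i, j)] for i, j in zip(path, path[1:]))


w_new = retime(w, r)
path = [2, 3, 4, 5]  # assumed path: four nodes, delay 4 * 3 = 12

print("retimed weights:", w_new)
print("path weight:", path_weight(w, path), "->", path_weight(w_new, path))
```

Under these assumptions the two flip-flops at the ends of the chain move inward, so the weight of the path from node 2 to node 5 rises from 0 to 2 while the total number of flip-flops on the chain is unchanged.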
Legal Retiming
• Retiming is legal if the retimed circuit has no negative edge weights.
• A legally retimed circuit is functionally equivalent to the original circuit; proof by Leiserson and Saxe (1991).
• Retiming is the most general method for changing the register count and position without knowing the functions of vertices.
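Example
[Figure: a circuit with nodes a, b, c, d, x and one flip-flop (FF), alongside its graph model with vertices c, x and host; one edge has weight 1 and the others have weight 0.]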
Example: Illegal Retiming
[Figure: the example graph before (retiming vector = {0, 0, 0}) and after (retiming vector = {0, 0, -1}) retiming, with the corresponding circuit. Edge weights change 1 → 0, 0 → -1, 0 → -1 and 0 → 1; the negative weights make this retiming illegal.]
Example: Legal Retiming
[Figure: the example graph before (retiming vector = {0, 0, 0}) and after (retiming vector = {0, 1, 0}) retiming, with the corresponding circuit. Edge weights change 0 → 1, 1 → 0 and 0 → 1; no weight is negative, so the retiming is legal.]
Correlator Circuit
• Critical path delay = 24
• Initial retiming vector $(r_a, r_b, r_c, r_d, r_e, r_f, r_g, r_h)$ = {0, 0, 0, 0, 0, 0, 0, 0}
[Figure: correlator graph with the original edge weights and $r = 0$ at every vertex.]
Retimed Correlator Circuit
• Critical path delay = 13
• Retiming vector $(r_a, r_b, r_c, r_d, r_e, r_f, r_g, r_h)$ = {-1, -1, -2, -2, -2, -1, 0, 0}
[Figure: correlator graph after retiming; three edge weights change 0 → 1 and two change 1 → 0.]
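The weight changes on this slide can be re-derived from the retiming vector with the edge rule $w'_{ij} = w_{ij} + r_j - r_i$. The Python sketch below does so. The edge list is an assumption: the figure is flattened in this transcript, so the connectivity is reconstructed from the usual Leiserson and Saxe correlator (host h, comparators a to d, adders e to g) and should be checked against the original drawing.

```python
# Assumed correlator graph: (tail, head) -> number of flip-flops.
EDGES = {
    ("h", "a"): 1, ("a", "b"): 1, ("b", "c"): 1, ("c", "d"): 1,
    ("d", "e"): 0, ("c", "e"): 0, ("e", "f"): 0, ("b", "f"): 0,
    ("f", "g"): 0, ("a", "g"): 0, ("g", "h"): 0,
}

# Retiming vector from the slide, in the order a, b, c, d, e, f, g, h.
R = dict(zip("abcdefgh", [-1, -1, -2, -2, -2, -1, 0, 0]))

# Apply w'_ij = w_ij + r_j - r_i to every edge.
retimed = {(i, j): w + R[j] - R[i] for (i, j), w in EDGES.items()}

# Edges whose weight changed, as (old, new) pairs.
changed = {e: (EDGES[e], retimed[e]) for e in EDGES if EDGES[e] != retimed[e]}

# The retiming is legal if no retimed weight is negative.
legal = all(w >= 0 for w in retimed.values())

for (i, j), (old, new) in changed.items():
    print(f"{i}->{j}: {old} -> {new}")
print("legal:", legal)
```

With this reconstruction the script reports three edges going 0 → 1 and two going 1 → 0, which matches the counts annotated on the slide.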
Retiming Theorem
• Given a network $G(V, E, W)$ and a cycle time $T$, $(r_1, \ldots)$ is a feasible retiming if and only if:
  1. $r_i - r_j \le w_{ij}$ for all edges $(v_i, v_j) \in E$
  2. $r_i - r_j \le W(v_i, v_j) - 1$ for all node pairs $v_i, v_j$ such that $D(v_i, v_j) > T$
• Here $W(v_i, v_j)$ is the minimum weight over all paths from $v_i$ to $v_j$, and $D(v_i, v_j)$ is the maximum delay among all minimum-weight paths from $v_i$ to $v_j$.
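The two conditions can be checked mechanically once $W$ and $D$ are known. The sketch below computes them with an all-pairs shortest-path pass over the ordered pair (weight, negated delay), the approach described by Leiserson and Saxe, and then tests a retiming vector. It rests on several assumptions: the correlator edge list is reconstructed from the flattened figure, the host is treated as an ordinary vertex of delay 0 (so paths through it are included), and the function names are invented for this example.

```python
from itertools import product

DELAY = {"h": 0, "a": 3, "b": 3, "c": 3, "d": 3, "e": 7, "f": 7, "g": 7}
EDGES = {  # assumed correlator graph: (tail, head) -> flip-flop count
    ("h", "a"): 1, ("a", "b"): 1, ("b", "c"): 1, ("c", "d"): 1,
    ("d", "e"): 0, ("c", "e"): 0, ("e", "f"): 0, ("b", "f"): 0,
    ("f", "g"): 0, ("a", "g"): 0, ("g", "h"): 0,
}
INF = float("inf")


def wd_matrices(delay, edges):
    """Return W (min path weight) and D (max delay among min-weight paths)."""
    # A path costs (sum of w, -sum of delays of all vertices but the last),
    # compared lexicographically: fewest flip-flops first, then longest delay.
    dist = {(u, v): ((0, 0) if u == v else (INF, 0))
            for u, v in product(delay, repeat=2)}
    for (u, v), w in edges.items():
        dist[u, v] = min(dist[u, v], (w, -delay[u]))
    # Floyd-Warshall; product() varies its first element (k) slowest.
    for k, u, v in product(delay, repeat=3):
        via = (dist[u, k][0] + dist[k, v][0], dist[u, k][1] + dist[k, v][1])
        if via[0] < INF and via < dist[u, v]:
            dist[u, v] = via
    W = {uv: cost[0] for uv, cost in dist.items()}
    D = {(u, v): delay[v] - cost[1] for (u, v), cost in dist.items()}
    return W, D


def is_feasible(r, T, delay, edges):
    """Check conditions 1 and 2 of the retiming theorem for cycle time T."""
    W, D = wd_matrices(delay, edges)
    cond1 = all(r[u] - r[v] <= w for (u, v), w in edges.items())
    cond2 = all(r[u] - r[v] <= W[u, v] - 1 for (u, v) in W if D[u, v] > T)
    return cond1 and cond2


zero = dict.fromkeys(DELAY, 0)
slide = dict(zip("abcdefgh", [-1, -1, -2, -2, -2, -1, 0, 0]))
print(is_feasible(zero, 13, DELAY, EDGES))   # unretimed circuit at T = 13
print(is_feasible(slide, 13, DELAY, EDGES))  # slide's vector at T = 13
```

Under these assumptions the unretimed circuit fails condition 2 at $T = 13$ (the zero-weight path from d to g has delay 24), while the vector {-1, -1, -2, -2, -2, -1, 0, 0} satisfies both conditions.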
Proof of Condition 1
• We assume that the original network is legal, i.e., all edge weights are non-negative.
• For an arbitrary edge $(v_i, v_j) \in E$: $r_i - r_j \le w_{ij}$, or equivalently $w_{ij} + r_j - r_i \ge 0$, means that after retiming the new weight $w'_{ij} = w_{ij} + r_j - r_i$ is non-negative. Thus, condition 1 ensures the legality of retiming.
[Figure: edge (i, j) with $w_{ij}$ original flip-flops; $r_i$ flip-flops move across node i and $r_j$ across node j, leaving $w'_{ij} = w_{ij} + r_j - r_i \ge 0$ retimed flip-flops.]
Proof of Condition 2
• Given: $d(v_i) < T$, for all $i$.
• Any retimed path whose combinational delay exceeds the clock period must have at least one flip-flop.
• The above is the requirement for correct operation.
[Figure: path (i, j) with $D(i, j) > T$ and original weight $W_{ij}$; $r_i$ flip-flops move across node i and $r_j$ across node j, giving retimed weight $W'_{ij} = W_{ij} + r_j - r_i \ge 1$.]
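The path-weight expression $W'_{ij} = W_{ij} + r_j - r_i$ follows from summing the edge rule along the path, because the retiming values of the intermediate nodes cancel. A short derivation, writing the path as $v_i = u_0 \to u_1 \to \cdots \to u_k = v_j$ (the $u_m$ labels are introduced here only for illustration):

```latex
\begin{aligned}
W'_{ij} &= \sum_{m=0}^{k-1} \left( w_{u_m u_{m+1}} + r_{u_{m+1}} - r_{u_m} \right) \\
        &= \sum_{m=0}^{k-1} w_{u_m u_{m+1}} + r_{u_k} - r_{u_0} \\
        &= W_{ij} + r_j - r_i
\end{aligned}
```

Requiring $W'_{ij} \ge 1$ for every minimum-weight path with $D(v_i, v_j) > T$ and rearranging gives $r_i - r_j \le W(v_i, v_j) - 1$, which is condition 2.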
Retiming Optimization Problem
• Given the initial retiming graph $G(V, E, d, w)$ of a synchronous system and a required clock period $P$, find a feasible retiming transformation such that the retimed graph $G'$ satisfies $CP(G') \le P$.
• Solution:
  • Algorithm 1: finds $CP(G)$, the critical path delay of $G$
  • Algorithm 2: finds a feasible retiming $G \to G'$
Algorithm 1: Critical Path Delay
• Delete all edges $(v_i, v_j)$ for which $w_{ij} \ge 1$.
• Create a level order for the vertices such that an edge $(v_i, v_j)$ places $v_j$ at a higher level than $v_i$.
• Traversing all nodes $v$ in level order, compute $\Delta(v)$:
  • $\Delta(v) = d(v)$, if $v$ has no incoming edge
  • $\Delta(v) = d(v) + \max_i \{\Delta(v_i)\}$, over all incoming edges $(v_i, v)$
• $CP(G) = \max_j \{\Delta(v_j)\}$, over all vertices $v_j$
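The three steps of Algorithm 1 map directly onto a topological sort followed by one pass over the vertices. The sketch below uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later). The correlator edge list is an assumption reconstructed from the flattened figure, the host is treated as an ordinary vertex of delay 0, and the function name `critical_path` is invented for this example.

```python
from graphlib import TopologicalSorter

DELAY = {"h": 0, "a": 3, "b": 3, "c": 3, "d": 3, "e": 7, "f": 7, "g": 7}
EDGES = {  # assumed correlator graph: (tail, head) -> flip-flop count
    ("h", "a"): 1, ("a", "b"): 1, ("b", "c"): 1, ("c", "d"): 1,
    ("d", "e"): 0, ("c", "e"): 0, ("e", "f"): 0, ("b", "f"): 0,
    ("f", "g"): 0, ("a", "g"): 0, ("g", "h"): 0,
}


def critical_path(delay, edges):
    """Return (CP, delta), where delta[v] is the arrival delay at vertex v."""
    # Step 1: delete edges with weight >= 1, keeping zero-weight edges only.
    preds = {v: [] for v in delay}
    for (u, v), w in edges.items():
        if w == 0:
            preds[v].append(u)
    # Step 2: level order, i.e. every vertex after all of its predecessors.
    order = TopologicalSorter(preds).static_order()
    # Step 3: delta(v) = d(v) + largest delta among zero-weight predecessors.
    delta = {}
    for v in order:
        delta[v] = delay[v] + max((delta[u] for u in preds[v]), default=0)
    return max(delta.values()), delta


cp, delta = critical_path(DELAY, EDGES)
print(cp)     # prints 24
print(delta)
```

`TopologicalSorter` raises `CycleError` if the zero-weight edges form a cycle, which would indicate a combinational loop in the circuit.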
Algorithm 1 Application
[Figure: the correlator graph, and the same graph with only its zero-weight edges retained, annotated with ∆ values.]
• ∆ = 3 at each of a, b, c, d
• ∆(e) = 10, ∆(f) = 17, ∆(g) = 24
• CP(G) = ∆ = 24
Algorithm 2: Retiming for Period = P
• Initialize the retiming variable, $r(v) = 0$, for all $v$.
• Repeat $|V| - 1$ times:
  • Derive the retimed graph from the current $r(v)$.
  • Run Algorithm 1 to determine $\Delta(v)$ for all $v$.
  • For each $v$ such that $\Delta(v) > P$, set $r(v)$ to $r(v) + 1$.
• Derive the retimed graph and run Algorithm 1:
  • If $CP(G) > P$, then no feasible retiming exists.
  • Otherwise, $CP(G) \le P$ and the retimed graph is the required result.
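The loop of Algorithm 2 can be sketched in a few lines of Python on top of a critical-path routine. As before, the correlator edge list is an assumption reconstructed from the flattened figure, the host is treated as an ordinary vertex of delay 0, and the names `critical_path` and `feasible_retiming` are invented for this example.

```python
from graphlib import TopologicalSorter

DELAY = {"h": 0, "a": 3, "b": 3, "c": 3, "d": 3, "e": 7, "f": 7, "g": 7}
EDGES = {  # assumed correlator graph: (tail, head) -> flip-flop count
    ("h", "a"): 1, ("a", "b"): 1, ("b", "c"): 1, ("c", "d"): 1,
    ("d", "e"): 0, ("c", "e"): 0, ("e", "f"): 0, ("b", "f"): 0,
    ("f", "g"): 0, ("a", "g"): 0, ("g", "h"): 0,
}


def critical_path(delay, edges):
    """Algorithm 1: return (CP, delta) using zero-weight edges only."""
    preds = {v: [u for (u, x), w in edges.items() if x == v and w == 0]
             for v in delay}
    delta = {}
    for v in TopologicalSorter(preds).static_order():
        delta[v] = delay[v] + max((delta[u] for u in preds[v]), default=0)
    return max(delta.values()), delta


def retime(edges, r):
    """Retimed weights w'_ij = w_ij + r_j - r_i."""
    return {(u, v): w + r[v] - r[u] for (u, v), w in edges.items()}


def feasible_retiming(delay, edges, P):
    """Algorithm 2: return (r, retimed_edges), or None if period P fails."""
    r = dict.fromkeys(delay, 0)
    for _ in range(len(delay) - 1):           # repeat |V| - 1 times
        _, delta = critical_path(delay, retime(edges, r))
        for v in delay:
            if delta[v] > P:                  # too slow: pull a register in
                r[v] += 1
    retimed = retime(edges, r)
    cp, _ = critical_path(delay, retimed)
    if cp > P or any(w < 0 for w in retimed.values()):
        return None
    return r, retimed


result = feasible_retiming(DELAY, EDGES, 13)
print(result)
print(feasible_retiming(DELAY, EDGES, 6))     # below the adder delay of 7
```

The vector this loop returns need not equal the one shown on the slides: several retimings can meet the same period, and adding a constant to every $r(v)$ leaves all edge weights unchanged.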
Algorithm 2 Application, P = 13
[Figure: two snapshots of the correlator graph annotated with ∆ values. Initial graph: ∆ = 3 at a, b, c, d; ∆ = 10, 17 and 24 at the adders; CP(G) = ∆ = 24. After the first iteration: edge weights are updated and the largest ∆ shown is 14.]
Retimed Circuit for P = 13
• Critical path delay = 13
• Retiming vector $(r_a, r_b, r_c, r_d, r_e, r_f, r_g, r_h)$ = {-1, -1, -2, -2, -2, -1, 0, 0}
[Figure: the retimed correlator graph with its final edge weights.]
References
• Two papers by Leiserson et al. (see the original references on the Retiming slide).
• G. De Micheli, Synthesis and Optimization of Digital Circuits, New York: McGraw-Hill, 1994.
• N. Maheshwari and S. S. Sapatnekar, Timing Analysis and Optimization of Sequential Circuits, Boston: Springer, 1999.