Retiming EECS 290A Sequential Logic Synthesis and Verification
Outline • Motivation • Graphs • Classical approach to retiming • Improved approach using clock skew
Motivation • Retiming can reduce the clock cycle of the circuit • [Figure: before retiming, the critical path has delay 4; after retiming, all paths have delay 2]
Directed Graphs • A graph is a set of vertices and edges, G = (V,E) • Each edge is directed (has a source and a sink) • A path is a sequence of vertices connected by edges • A cycle is a path that starts and ends at the same vertex • A graph is strongly connected if there exists a path from any vertex to any other vertex • In the general formulation of graph problems, each edge e has a distance d(e) and a latency t(e) • In this lecture (on retiming): • The graph is the sequential netlist • Vertices are combinational nodes • Edges are wires • Vertices have a combinational delay • The latency of an edge is the number of latches on the edge
Classical Formulation • During retiming, registers are moved across combinational nodes: $w_r(e_{uv}) = r(v) + w(e_{uv}) - r(u)$, where $r(v)$ is the number of registers moved from the outputs to the inputs of $v$ • For each path $p: u \leadsto v$, its weight $w(p)$ is the total number of registers on its edges, and its delay $d(p)$ is the total delay of its vertices • The minimum clock period at which the circuit can run is the maximum delay over all zero-weight paths: $P = \max_{p:\, w(p) = 0} d(p)$ • Matrices $W(u,v)$ and $D(u,v)$ are defined for all pairs of vertices that are connected by a path that does not go through the host node: $W(u,v) = \min_{p:\, u \leadsto v} w(p)$ and $D(u,v) = \max_{p:\, u \leadsto v,\ w(p) = W(u,v)} d(p)$ • C. E. Leiserson and J. B. Saxe, "Retiming synchronous circuitry", Algorithmica, vol. 6, 1991, pp. 5-35.
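The W and D definitions above can be sketched as an all-pairs shortest-path computation over lexicographic pairs (register count first, negated delay second). This is a minimal illustration, not the slides' implementation: the function name `compute_wd`, the dictionary-based graph encoding, and the four-node ring example are assumptions made here, and the sketch has no host node.

```python
from itertools import product

def compute_wd(delay, weights):
    """Return (W, D) given vertex delays and per-edge register counts."""
    verts = list(delay)
    inf = float("inf")
    # best[u, v] = smallest (registers, -delay of every path vertex except v)
    best = {(u, v): (inf, 0) for u in verts for v in verts}
    for u in verts:
        best[u, u] = (0, 0)                      # the empty path from u to u
    for (u, v), w in weights.items():
        best[u, v] = min(best[u, v], (w, -delay[u]))
    # Floyd-Warshall; product() varies its first element slowest, so k is outermost.
    for k, u, v in product(verts, repeat=3):
        (w1, d1), (w2, d2) = best[u, k], best[k, v]
        if w1 < inf and w2 < inf:
            best[u, v] = min(best[u, v], (w1 + w2, d1 + d2))
    W = {p: w for p, (w, _) in best.items() if w < inf}
    # Add back the delay of the final vertex, which the pairs leave out.
    D = {p: delay[p[1]] - nd for p, (w, nd) in best.items() if w < inf}
    return W, D

# Hypothetical example: a ring of four unit-delay nodes, two registers on one edge.
delay = {"v0": 1, "v1": 1, "v2": 1, "v3": 1}
weights = {("v0", "v1"): 0, ("v1", "v2"): 0, ("v2", "v3"): 0, ("v3", "v0"): 2}

W, D = compute_wd(delay, weights)
# Clock period = largest delay among register-free (zero-weight) paths.
clock_period = max(D[p] for p in W if W[p] == 0)
print(W["v0", "v3"], D["v0", "v3"], clock_period)  # prints: 0 4 4
```

Minimizing the pair (w, -d) picks the fewest registers first and, among those paths, the largest delay, which is exactly the tie-break in the definition of D(u,v).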
Classical Formulation (cont.) • $W(u,v)$ denotes the minimum latency, in clock cycles, for data flowing from $u$ to $v$ • $D(u,v)$ gives the maximum delay from $u$ to $v$ over all paths with the minimum latency • The retiming labels for a clock period $P$ are computed by solving a linear programming problem: $r(u) - r(v) \le w(e_{uv})$ for all $e_{uv} \in E$, and $r(u) - r(v) \le W(u,v) - 1$ for all $u, v$ with $D(u,v) > P$ • The constraints ensure that after retiming: • the latency of each edge is non-negative • each path whose delay is larger than the clock period has at least one register on it
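Because every constraint above has the form r(u) - r(v) ≤ c, the system can also be solved as a shortest-path problem rather than with a general LP solver. The sketch below uses Bellman-Ford relaxation for that purpose; this is a standard technique for difference constraints, swapped in here for illustration. The names `compute_wd` and `retiming_labels`, the graph encoding, and the ring example are assumptions, not part of the slides.

```python
def compute_wd(delay, weights):
    """All-pairs W and D via Floyd-Warshall on (registers, -delay) pairs."""
    verts, inf = list(delay), float("inf")
    best = {(u, v): (inf, 0) for u in verts for v in verts}
    for u in verts:
        best[u, u] = (0, 0)
    for (u, v), w in weights.items():
        best[u, v] = min(best[u, v], (w, -delay[u]))
    for k in verts:
        for u in verts:
            for v in verts:
                (w1, d1), (w2, d2) = best[u, k], best[k, v]
                if w1 < inf and w2 < inf:
                    best[u, v] = min(best[u, v], (w1 + w2, d1 + d2))
    W = {p: w for p, (w, _) in best.items() if w < inf}
    D = {p: delay[p[1]] - nd for p, (w, nd) in best.items() if w < inf}
    return W, D

def retiming_labels(delay, weights, period):
    """Return labels r satisfying both constraint families, or None if infeasible."""
    W, D = compute_wd(delay, weights)
    # A constraint r(u) - r(v) <= c is an edge v -> u of length c.
    cons = [(v, u, w) for (u, v), w in weights.items()]
    cons += [(v, u, W[u, v] - 1) for (u, v) in W if D[u, v] > period]
    r = {v: 0 for v in delay}      # as if a virtual source reached every vertex at 0
    for _ in range(len(delay)):
        changed = False
        for src, dst, c in cons:
            if r[src] + c < r[dst]:
                r[dst] = r[src] + c
                changed = True
        if not changed:
            return r
    return None                    # still relaxing after |V| passes: negative cycle

# Hypothetical example: ring of four unit-delay nodes, two registers on one edge.
delay = {"v0": 1, "v1": 1, "v2": 1, "v3": 1}
weights = {("v0", "v1"): 0, ("v1", "v2"): 0, ("v2", "v3"): 0, ("v3", "v0"): 2}

labels = retiming_labels(delay, weights, period=2)
retimed = {(u, v): w + labels[v] - labels[u] for (u, v), w in weights.items()}
print(labels, retimed)
print(retiming_labels(delay, weights, period=1))  # prints: None
```

The labels are only determined up to a common offset, so negative values are fine; what matters is that every retimed edge weight is non-negative and every path longer than the period carries a register.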
Theorem • Let $G = (V,E,d,t)$ be a synchronous circuit with maximum mean cycle ratio $R(C^*(G))$, and let $\mathrm{min}(G)$ be the minimum clock period obtained by retiming $G$. Then $R(C^*(G)) \le \mathrm{min}(G) \le R(C^*(G)) + d_{\max} - 1$, where $d_{\max} = \max\{ d(v) : v \in V \}$ • M. C. Papaefthymiou, "Understanding retiming through maximum average-delay cycles", Mathematical Systems Theory 27(1), 1994, pp. 65-84.
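As a small worked instance of the bound, consider a hypothetical ring of four unit-delay nodes with two registers on one edge (the numbers match the Motivation slide, but the circuit itself is an assumption). The calculation also assumes that the ratio of a cycle is its total vertex delay divided by the number of registers on it, which the slide does not spell out.

$$
R(C^*(G)) = \frac{1 + 1 + 1 + 1}{2} = 2, \qquad d_{\max} = 1
$$

$$
2 \;\le\; \mathrm{min}(G) \;\le\; 2 + 1 - 1 = 2 \quad\Longrightarrow\quad \mathrm{min}(G) = 2
$$

With unit delays the two bounds coincide, so the theorem alone fixes the best retimed clock period; with larger $d_{\max}$ it only brackets it.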
The Initial State Problem • In some cases, for backward retiming, the initial state cannot be computed • [Figure: latches around an a+b node with initial values 0, 1, and ? (unknown)]
Solution to Initial State Problem • Additional hardware: multiplexers and a reset signal • Reset is 1 at the first clock, and 0 afterwards • [Figure: the a+b circuit with Reset-controlled MUXes; panels: "Detach initial values from latches", "Propagate constants in MUXes"]
Implementation of Retiming • Leiserson/Saxe compute the W and D matrices, generate constraints, and then solve the LP problem • Shenoy/Rudell compute the matrix one column at a time • Reduced space requirements, still prohibitive runtime • Sapatnekar proposed a way of utilizing retiming/skew equivalence to reduce the number of constraints generated • S. S. Sapatnekar, R. B. Deokar, "Utilizing the retiming-skew equivalence in a practical algorithm for retiming large circuits", IEEE Trans. CAD, vol. 15(10), Oct. 1996, pp. 1237-1248.
Sapatnekar's Retiming Algorithm • Find ASAP and ALAP skews for a feasible clock period • Use binary search to find a feasible clock period • Perform min-delay retiming by moving latches to fit the timing window • Perform min-area retiming under delay constraints by solving a reduced LP problem • The reduced set of constraints is generated using the skews • The LP problem is solved efficiently using a variation of the network simplex method • Improvement: start by finding the maximum ratio using Howard's algorithm
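The binary search over clock periods can be illustrated without the skew machinery. The sketch below is not Sapatnekar's algorithm: it swaps in the simpler FEAS relaxation from the Leiserson/Saxe paper as the feasibility test, and it assumes integer delays so that the search can run over integers. All function names and the ring example are hypothetical.

```python
def arrival_times(delay, weights):
    """Longest register-free path delay ending at each vertex (topological sweep)."""
    zero_in = {v: [] for v in delay}
    zero_out = {v: [] for v in delay}
    for (u, v), w in weights.items():
        if w == 0:
            zero_in[v].append(u)
            zero_out[u].append(v)
    pending = {v: len(zero_in[v]) for v in delay}
    ready = [v for v in delay if pending[v] == 0]
    arrival = {}
    while ready:
        v = ready.pop()
        arrival[v] = delay[v] + max((arrival[u] for u in zero_in[v]), default=0)
        for x in zero_out[v]:
            pending[x] -= 1
            if pending[x] == 0:
                ready.append(x)
    return arrival

def feas(delay, weights, period):
    """FEAS-style test: return labels r meeting `period`, or None."""
    r = {v: 0 for v in delay}

    def retimed():
        return {(u, v): w + r[v] - r[u] for (u, v), w in weights.items()}

    for _ in range(len(delay) - 1):
        for v, t in arrival_times(delay, retimed()).items():
            if t > period:
                r[v] += 1          # pull one register back across a late vertex
    ok = max(arrival_times(delay, retimed()).values()) <= period
    return r if ok else None

def min_period(delay, weights):
    """Binary search for the smallest feasible integer clock period."""
    lo, hi = max(delay.values()), sum(delay.values())
    best = None
    while lo <= hi:
        mid = (lo + hi) // 2
        r = feas(delay, weights, mid)
        if r is not None:
            best, hi = (mid, r), mid - 1
        else:
            lo = mid + 1
    return best

# Hypothetical example: ring of four unit-delay nodes, two registers on one edge.
delay = {"v0": 1, "v1": 1, "v2": 1, "v3": 1}
weights = {("v0", "v1"): 0, ("v1", "v2"): 0, ("v2", "v3"): 0, ("v3", "v0"): 2}

period, r = min_period(delay, weights)
retimed = {(u, v): w + r[v] - r[u] for (u, v), w in weights.items()}
print(period, retimed)
```

The search is valid because feasibility is monotone in the period: any retiming that meets a period also meets every larger one. The upper bound `sum(delay.values())` is always feasible as long as every cycle carries at least one register.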
Example • Clock period = 3 • Buffer delay = 1 • ALAP skew = -1 • ASAP skew = -3 • The ALAP and ASAP skews give bounds on how far latches can move • [Figure: three latch placements between PI and PO: initial, ALAP, and ASAP]