120 likes | 175 Views
Learn about retiming, a technique that can reduce the clock cycle of a circuit, through this lecture outline. Topics covered include motivation, graphs, classical approach to retiming, and improved approach using clock skew.
E N D
Retiming EECS 290A Sequential Logic Synthesis and Verification
Outline • Motivation • Graphs • Classical approach to retiming • Improved approach using clock skew
Motivation • Retiming can reduce the clock cycle of the circuit Critical path has delay 4 All paths have delay 2
Directed Graphs • Graph is set of vertices and edges G = (V,E) • Each edge is directed (has a source and a sink) • A path is the sequence of vertices connected by edges • A cycle is the circular path • Graph is strongly connected if there exist a path from any vertex to any other vertex. • For the general formulation of the graph problems, each edge e has distance, d(e), and a latency, t(e) • In this lecture (on retiming) • Graph is the sequential netlist • Vertices are combinational nodes • Edges are wires • Vertices have combinational delay • Latency of an edge is the number of latches on the edge
Classical Formulation • During retiming the registers are moved over combinational nodes: wr(euv) = r(v) + w(euv) – r(u), where r(v) are the registers moved from the outputs to the inputs of v. • For each path p: uv we define its weight w(p) as the sum total of registers on all edges. • The minimum clock period stands for the maximum 0-weight path P = max p: w(p) = 0 {d(p)} • Matrices W(u,v) and D(u,v) are defined for all pairs of vertices that are connected by a path that does not go through the host node W(u,v) = min p: uv{w(p)} and D(u,v) = max p: uv and w(p)= W(u,v) {d(p)} C. E. Leiserson and J. B. Saxe. Retiming synchronous circuitry, Algorithmica, 1991, vol. 6, pp. 5-35.
Classical Formulation (cont.) • W(u,v) denotes the minimum latency, in clock cycles, for the data flowing from u to v • D(u,v) gives the maximum delay from u to v over all path with the minimum latency • The computation of retiming labels for the clock period P is performed by solving a Linear Programming problem: r(u) – r(v) w(euv), euv E r(u) – r(v) W(u,v) – 1, D(u,v) > P • The constraints ensure that after retiming • the latency of each edge is non-negative • each path whose delay is larger than the clock period has at least one register on it
Theorem • Theorem. Let G = (V,E,d,t) be a synchronous circuit with maximum mean cycle ratio R(C*(G)), and let min(G) be the minimum clock period obtained by retiming G. Then R(C*(G)) min(G) R(C*(G)) +dmax -1 where dmax = max{ d(v) : v V }. M. C. Papaefthymiou, “Understanding retiming through maximum average-delay cycles”, Mathematical Systems Theory 27(1), 1994, pp. 65-84.
0 1 0 ? a+b a+b a+b a+b 0 ? 0 0 The Initial State Problem • In some cases, for backward retiming, the initial state cannot be computed
Reset 1 Reset a+b a+b 0 1 Reset 1 a+b 0 0 0 1 Solution to Initial State Problem • Additional hardware: multiplexers and a reset signal • Reset is 1 at the first clock, and 0 afterwards Detach initial values from latches Propagate constants in MUXes
Implementation of Retiming • Leiserson/Saxe compute the matrices, generate constraints, and then solve the LP problem • Shenoy/Rudell compute the matrix one column at a time • Reduced space requirements, still prohibitive runtime • Sapatnekar proposed a way of utilizing retiming/skew equivalence to reduce the number of constraints generated S. S. Sapatnekar, R. B. Deokar, “Utilizing the retiming-skew equivalence in a practical algorithms for retiming large circuits”, IEEE Trans. CAD, vol. 15(10), Oct.1996, pp. 1237-1248.
Sapatenekar’s Retiming Algorithm • Find ASAP and ALAP skews for a feasible clock period • Use binary search to find a feasible clock period • Perform min-delay retiming by moving latched to fit the timing window • Perform min-area retiming under delay constraints by solving a reduced LP problem • The reduced set of constraints is generated using the skews • The LP problem is solved efficiently using a variation of network simplex method • Improvement: Start by finding maximum ration using Howard’s algorithm
Example Clock period = 3 Buffer delay = 1 ALAP skew = -1 ASAP skew = -3 Initial PO PI PO ALAP PI ASAP PO PI Bounds on how far latches can move