Skewed Flip-Flop Transformation for Minimizing Leakage in Sequential Circuits
Jun Seomun, Jaehyun Kim, Youngsoo Shin
Dept. of Electrical Engineering, KAIST, KOREA
Leakage Power in Technology Scaling
[Figure: dynamic power and leakage power, Power (W) from 0 to 250, versus technology (0.25µ, 0.18µ, 0.13µ, 0.10µ, 0.07µ). Source: Intel Corporation, 2002]
Overview of Mixed Vt Technique
• Mixed Vt CMOS
  • Low Vt: fast but high leakage
  • High Vt: low leakage but slow
• Value of mixed Vt is limited
  • It considers only the combinational portion of circuits
[Figure: a combinational circuit that is initially all low Vt; gates on the critical path stay low Vt, while high Vt gates can be assigned on some non-critical paths]
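The mixed Vt idea above can be illustrated with a small greedy sketch. This is not the authors' procedure: the path list, the delay values, the clock period, and the function names `path_delay` and `assign_high_vt` are all hypothetical and chosen only for illustration. Starting from an all low Vt design, each gate is tentatively swapped to high Vt and the swap is kept only if every timing path still meets the clock period.

```python
def path_delay(path, high_vt, d_low, d_high):
    """Sum gate delays along one path, using the high-Vt delay for swapped gates."""
    return sum(d_high[g] if g in high_vt else d_low[g] for g in path)


def assign_high_vt(paths, d_low, d_high, period):
    """Greedy sketch: start all low Vt, keep a high-Vt swap only if timing holds."""
    high_vt = set()
    gates = sorted({g for p in paths for g in p})
    for g in gates:
        trial = high_vt | {g}
        if all(path_delay(p, trial, d_low, d_high) <= period for p in paths):
            high_vt = trial
    return high_vt


# Hypothetical example: two paths sharing gate g3; all numbers are made up.
paths = [["g1", "g2", "g3"], ["g4", "g3"]]
d_low = {"g1": 10, "g2": 10, "g3": 10, "g4": 10}   # low-Vt delays (arbitrary units)
d_high = {"g1": 14, "g2": 14, "g3": 14, "g4": 14}  # high-Vt delays (arbitrary units)
result = assign_high_vt(paths, d_low, d_high, period=32)
print(result)
```

In this toy case only the gate that sits solely on the shorter path can take high Vt; every gate on the longer path would push it past the period.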
Motivation
• Leakage of sequential elements
  • Sequential elements take a large proportion in many controllers
[Figure: stacked bars, 0% to 100%, showing the share of combinational gates (Comb.) and flip-flops under mixed Vt for benchmarks s298, s344, s349, s382, s400, s444, s526, s641, s713, s838, s9234]
Why Not High Vt Flip-Flop?
• Large effects on the slack
  • The delay overhead of a high Vt flip-flop is larger than that of high Vt combinational gates
  • A flip-flop typically affects more than one timing path in a circuit
[Figure, left: delay of high Vt gate minus delay of low Vt gate for F/F, INV, NAND2, NOR2, NAND3, NAND4]
[Figure, right: (average # fanout timing paths on F/Fs) / (average # fanout timing paths on comb. gates) for benchmarks s298, s344, s349, s400, s444, s526, s641, s713, s838, s9234]
Skewed Flip-Flops
• Mixed Lgate flip-flop
  • Larger Lgate transistor
    • Smaller delay overhead than a high Vt transistor
    • Footprint of the gate remains almost the same
  • Selective assignment of larger Lgate within the flip-flop
    • Smaller delay overhead than assigning larger Lgate to the entire flip-flop
    • Leakage reduction can reach up to the same amount as when all transistors use the larger Lgate
• Leakage is unequal across the values of D and Q
  • Four kinds of SFFs
    • Characterized to minimize leakage for each of the four states (D & Q)
    • SF00, SF01, SF10 and SF11
[Figure: delay [ps] and leakage [nA] versus gate length (nm) from 45 to 50. Annotations: "Delay: 32%, Leakage: 72%"; "cf. high Vt inverter: Delay: 81%, Leakage: 92%"]
Skewed Flip-Flops
• Design of an SFF (in case of SF00)
  • Assume CK = 0 in idle state (clock gating)
[Figure: flip-flop schematic annotated with internal node values for D = 0, Q = 0, CK = 0; selected transistors are marked "Larger Lgate"]
Skewed Flip-Flops
• Skewed flip-flops
[Figure: schematics of the four skewed flip-flops SF00, SF01, SF10 and SF11]
Leakage Characteristic of SFFs
• 45-nm PTM, 4 nm biasing
[Figure: leakage current [nA], 0 to 1200, of the original flip-flop (Orig.) versus each SFF for D/Q states 0/0, 0/1, 1/0, 1/1. Panels: (a) SF00, (b) SF01, (c) SF10, (d) SF11]
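One way to read the per-state leakage charts together with idle state probabilities is as an expected value. The notation below is introduced here for illustration and does not appear in the slides: $p_{dq}$ is the assumed probability that a flip-flop idles with D = d and Q = q, and $I_{dq}^{(c)}$ is the leakage current of cell $c$ (the original flip-flop or one of the SFFs) in that state.

```latex
\mathbb{E}\left[ I_{\text{leak}}^{(c)} \right]
  = \sum_{d \in \{0,1\}} \sum_{q \in \{0,1\}} p_{dq}\, I_{dq}^{(c)},
\qquad
\sum_{d \in \{0,1\}} \sum_{q \in \{0,1\}} p_{dq} = 1
```

Under this reading, a cell that leaks little in one D/Q state is most useful for a flip-flop whose $p_{dq}$ is concentrated on that state.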
Timing Characteristic of SFFs
• 45-nm PTM, 4 nm biasing
[Figure: voltage [V] versus time waveforms of D and clk around the rising edge of CK, comparing setup time Tsu of the original flip-flop and SF00. Panels: (a) Rising Tsu, (b) Falling Tsu]
[Figure: delay [ps] of the original flip-flop (Orig.) versus each SFF for rising Tsu, falling Tsu, rising Tc-q and falling Tc-q. Panels: (a) SF00, (b) SF01, (c) SF10, (d) SF11]
SFF Transformation
• Utilize SFFs while maintaining timing constraints
  • Input: netlist & idle state probabilities of flip-flops
  • Output: new netlist with skewed flip-flops
[Figure: flow chart. Netlist & idle state probabilities, then initial SFF assignment, then skewed flip-flop transformation under timing constraints, then mixed Vt assignment on combinational subcircuits. The flip-flop transformation step consists of: find critical path, find candidate, substitute]
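The slides do not spell out how the initial SFF assignment uses the idle state probabilities. A minimal sketch, assuming the simplest plausible rule (pick the SFF that matches each flip-flop's most probable idle D/Q state), could look like this. The function name `initial_sff_assignment`, the data layout, the flip-flop names, and the probabilities are all hypothetical; the cell naming assumes the two digits in SF00 to SF11 are D then Q.

```python
def initial_sff_assignment(idle_probs):
    """Map each flip-flop to the SFF named after its most probable idle (D, Q) state.

    idle_probs: {flip_flop_name: {(d, q): probability}} (assumed layout).
    """
    assignment = {}
    for ff, probs in idle_probs.items():
        # Pick the (d, q) pair with the highest idle probability.
        (d, q), _ = max(probs.items(), key=lambda kv: kv[1])
        assignment[ff] = f"SF{d}{q}"
    return assignment


# Hypothetical idle state probabilities for two flip-flops.
example = {
    "ff_a": {(0, 0): 0.70, (0, 1): 0.10, (1, 0): 0.10, (1, 1): 0.10},
    "ff_b": {(0, 0): 0.05, (0, 1): 0.05, (1, 0): 0.10, (1, 1): 0.80},
}
assignment = initial_sff_assignment(example)
print(assignment)
```

A rule based on expected leakage over all four states would be an equally reasonable alternative; the sketch uses the most probable state only because it needs no leakage data.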
Half Skewed Flip-Flops (HSFs)
• For a smoother transition
  • HSF0: unchanged setup time delay
  • HSF1: unchanged clock-to-q delay
[Figure: schematics of HSF0 and HSF1]
SFF Transformation Algorithm
• Select a flip-flop to be transformed
  • Find critical path
  • Find candidate
    • Both ends of the most critical path
    • Larger timing improvement
• Substitute
  • (1) Most effective SFFs in terms of delay, given position and phase of transition
  • (2) If (1) fails, try HSFs
  • (3) If (2) fails, use the original flip-flops
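The three-step substitution order above can be sketched as a simple fallback chain. This is an illustration under assumptions, not the authors' implementation: the function `substitute`, the cell name "DFF" for the original flip-flop, and the `meets_timing` callback (standing in for a real static timing check) are hypothetical.

```python
from typing import Callable, Sequence


def substitute(
    ff: str,
    sff_options: Sequence[str],
    hsf_options: Sequence[str],
    original: str,
    meets_timing: Callable[[str, str], bool],
) -> str:
    """Try SFFs first, then HSFs, and fall back to the original flip-flop.

    sff_options is assumed to be ordered from most to least effective in delay.
    """
    for cell in list(sff_options) + list(hsf_options):
        if meets_timing(ff, cell):
            return cell
    return original


# Hypothetical timing checks standing in for a real timing analysis.
def only_hsf0_passes(ff: str, cell: str) -> bool:
    return cell == "HSF0"


def nothing_passes(ff: str, cell: str) -> bool:
    return False


chosen_a = substitute("ff_a", ["SF00"], ["HSF0", "HSF1"], "DFF", only_hsf0_passes)
chosen_b = substitute("ff_b", ["SF11"], ["HSF0", "HSF1"], "DFF", nothing_passes)
print(chosen_a, chosen_b)
```

Keeping the timing check behind a callback leaves the sketch independent of any particular timing engine.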
Experimental Results • For ISCAS benchmark circuits (45-nm PTM library)
Comparison with Mixed Vt Flip-Flops
[Figure: bars on a 0.6 to 1.0 scale comparing "Mixed Vt FFs + Mixed Vt comb." with "SFX + Mixed Vt comb." for benchmarks s298, s344, s349, s382, s400, s444, s526, s641, s713, s838, s9234]
Conclusion
• Proposed skewed flip-flops
  • A set of mixed Lgate flip-flops
  • Skewed characteristics in terms of leakage and delay
• A heuristic algorithm that substitutes SFFs
  • An average leakage saving of 16% is achieved, compared to the use of mixed Vt alone