Low Power Clocking Through the Use of Dual Edge Triggered Flip-Flops Gabriel Ricardo Theresa Holliday ACSEL Lab University of California, Davis
Outline
• Dual Edge Flip-Flops overview
• Standard Cell Characterization
• LEON Synthesis for SET design
• LEON Synthesis for DET design
• Issues with including Dual edge into synthesis flow
• Preliminary comparisons
• Conclusions and Future Work
• Questions
Symmetric Pulse Generator Flip-Flop (SPGFF)
• First-stage nodes X and Y are dynamic; the second stage is a static NAND.
• Results in small delay.
• Can be sized to trade some delay for power.
Operation of SPGFF
• The transparency window created by CLK and CLK3 for stage 1 (CLK1 and CLK4 for stage 2) allows X (Y) to evaluate conditionally based on input D.
• The output-stage NAND lets X and Y be passed to the output based on the clock value, without the need for a latch.
Transmission Gate Master Slave (TGMS)
Comparison between SPGFF and TGMS in 0.18 µm
Advantages of SPGFF
• Lowest clock energy among DET-CSEs, resulting in higher clock power savings.
• Energy-delay product comparable to high-performance single-edge-triggered clocked storage elements.
Characterization Methodology – Generating synthesis views
• Created an automated process for generating Synopsys Liberty format (.lib) synthesis models.
• Uses Perl scripts and gspice (a SPICE pre/post-processor).
• Characterized for timing and energy.
• Can easily be extended to generate Cadence synthesis models (.tlf).
Characterization Methodology – Trip-points
• Used the same trip-points as the technology library.
• Nominal conditions: 25 °C, 1.8 V supply.
• Can easily generate best- and worst-case corner models (over temperature and supply variation).
• Cell delay: clock 50% rise/fall to output (Q or QN) 50% rise/fall.
• Transition time: 10%-90% rise time, 90%-10% fall time.
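The trip-point definitions above can be illustrated with a small sketch. This is not the gspice/Perl flow from the slides; the function names and the piecewise-linear waveform format are assumptions made for illustration, and only the 1.8 V supply and the 50% / 10%-90% thresholds come from the slide.

```python
VDD = 1.8  # nominal supply from the slide, in volts


def crossing_time(times, volts, level, rising=True):
    """Return the first time the piecewise-linear waveform crosses `level`."""
    samples = list(zip(times, volts))
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        crosses_up = rising and (v0 < level <= v1)
        crosses_down = (not rising) and (v0 > level >= v1)
        if crosses_up or crosses_down:
            # Linear interpolation between the two bracketing samples.
            return t0 + (level - v0) * (t1 - t0) / (v1 - v0)
    raise ValueError("waveform never crosses the requested level")


def cell_delay(times, clk, q, clk_rising=True, q_rising=True):
    """Clock 50% crossing to output 50% crossing."""
    t_clk = crossing_time(times, clk, 0.5 * VDD, clk_rising)
    t_q = crossing_time(times, q, 0.5 * VDD, q_rising)
    return t_q - t_clk


def transition_time(times, volts, rising=True):
    """10%-90% rise time, or 90%-10% fall time."""
    low, high = 0.1 * VDD, 0.9 * VDD
    if rising:
        return crossing_time(times, volts, high, True) - crossing_time(times, volts, low, True)
    return crossing_time(times, volts, low, False) - crossing_time(times, volts, high, False)


# Hypothetical idealized waveforms (time in ns), for illustration only.
t = [0.0, 1.0, 2.0, 3.0, 4.0]
clk = [0.0, 1.8, 1.8, 1.8, 1.8]
q = [0.0, 0.0, 0.9, 1.8, 1.8]
delay_ns = cell_delay(t, clk, q)
rise_ns = transition_time(t, q)
```

Because a dual-edge element responds to both clock edges, the same measurement would be repeated with `clk_rising=False` to cover the falling-edge arc.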
Trip-points - Falling
Trip-points - Rising
Characterization Methodology - Drive Characteristics
• Build a 5x5 non-linear delay table.
• Clock slope values (ns): 0.03, 0.1, 0.4, 1.5, 3
• Output load values (fF): 0.35, 21, 38.5, 147, 311
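A sketch of how a 5x5 non-linear delay table is typically read between grid points, using bilinear interpolation. The two index axes are the values from the slide; the delay entries are hypothetical placeholders generated from a simple formula (the slides do not list measured delays), and the function names are assumptions.

```python
from bisect import bisect_right

# Index axes taken from the slide.
CLOCK_SLOPES_NS = [0.03, 0.1, 0.4, 1.5, 3.0]
LOADS_FF = [0.35, 21.0, 38.5, 147.0, 311.0]

# Hypothetical delay values in ns, NOT measured data: a made-up linear
# model used only to populate the table for this illustration.
DELAY_NS = [[0.20 + 0.10 * s + 0.002 * c for c in LOADS_FF] for s in CLOCK_SLOPES_NS]


def _bracket(axis, x):
    """Index of the lower grid point of the interval used for x (clamped)."""
    i = bisect_right(axis, x) - 1
    return max(0, min(i, len(axis) - 2))


def lookup(table, slope_ns, load_ff):
    """Bilinear interpolation in a table indexed by [slope][load]."""
    i = _bracket(CLOCK_SLOPES_NS, slope_ns)
    j = _bracket(LOADS_FF, load_ff)
    s0, s1 = CLOCK_SLOPES_NS[i], CLOCK_SLOPES_NS[i + 1]
    c0, c1 = LOADS_FF[j], LOADS_FF[j + 1]
    u = (slope_ns - s0) / (s1 - s0)
    v = (load_ff - c0) / (c1 - c0)
    return (
        (1 - u) * (1 - v) * table[i][j]
        + (1 - u) * v * table[i][j + 1]
        + u * (1 - v) * table[i + 1][j]
        + u * v * table[i + 1][j + 1]
    )


delay_at_typical_load = lookup(DELAY_NS, 0.25, 24.0)
```

Points outside the grid are extrapolated linearly from the nearest interval in this sketch; how a given synthesis tool treats out-of-range lookups is not covered by the slides.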
Characterization Methodology – Trip-points
• Setup time: sweep the data input transition toward the active clock edge until the clock-to-output delay increases by 10%.
• Hold time: sweep the data input transition away from the active clock edge until the clock-to-output delay increases by 10%.
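The 10% push-out criterion can be sketched as a bisection search. The slides describe a sweep, so bisection is a swapped-in technique here; `clk_to_q_ps` is a hypothetical analytic stand-in for a SPICE measurement, and the 3 ps tolerance mirrors the accuracy figure quoted for the setup and hold tables.

```python
def clk_to_q_ps(data_to_clk_ps):
    """Hypothetical stand-in for a SPICE run: clock-to-output delay (ps)
    as a function of how early the data arrives before the clock edge."""
    if data_to_clk_ps <= 20.0:
        return float("inf")  # data arrives too late; capture fails
    return 200.0 + 400.0 / (data_to_clk_ps - 20.0)


def setup_time_ps(measure, relaxed_ps=1000.0, tight_ps=0.0, pushout=0.10, tol_ps=3.0):
    """Smallest data-to-clock separation whose delay stays within the push-out limit.

    `relaxed_ps` must meet the limit and `tight_ps` must violate it.
    """
    limit = measure(relaxed_ps) * (1.0 + pushout)
    passing, failing = relaxed_ps, tight_ps
    while passing - failing > tol_ps:
        mid = 0.5 * (passing + failing)
        if measure(mid) <= limit:
            passing = mid
        else:
            failing = mid
    return passing


setup_ps = setup_time_ps(clk_to_q_ps)
```

Hold time would use the same search with the data transition moved to the other side of the clock edge.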
Characterization Methodology – Setup-hold
• Figure annotation: 10% push-out
Characterization Methodology – Setup and Hold
• Build a 3x2 non-linear delay table (3 ps accuracy).
• Clock slope values (ns): 0.03, 3
• Data slope values (ns): 0.03, 0.9, 3
Characterization Methodology – Internal energy
• Internal energy characterized over the same data points as the drive characteristics (5x5 lookup table).
• Data pin and clock pin energy tables generated (1x5 lookup table).
Characterization Results – single vs dual-edge – D to Q delay (plots: TGMS, SPGFF)
What is typical output load?
• Extracted output loading from the netlist for all CSEs.
• Average load = 24 fF (6.8 minimum-sized inverters).
• 90% of CSEs have a load of less than 60 fF (17 minimum-sized inverters).
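The two inverter-equivalent figures on this slide are mutually consistent, which a short calculation shows. The per-inverter input capacitance is not stated on the slide; it is inferred here from the 24 fF / 6.8 ratio, and the names are illustrative.

```python
AVERAGE_LOAD_FF = 24.0        # from the slide
AVERAGE_LOAD_INVERTERS = 6.8  # from the slide

# Implied input capacitance of one minimum-sized inverter (inferred, about 3.53 fF).
inverter_input_cap_ff = AVERAGE_LOAD_FF / AVERAGE_LOAD_INVERTERS


def load_in_inverters(load_ff):
    """Express a capacitive load as a number of minimum-sized inverter inputs."""
    return load_ff / inverter_input_cap_ff


ninetieth_percentile_inverters = load_in_inverters(60.0)
```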
Netlist extracted CSE output loading statistics
Characterization Results – single vs dual-edge – Delay (plots: TGMS, SPGFF; annotation: typical region of operation)
Characterization Results – zoomed-in – single vs dual-edge – Delay (plots: TGMS, SPGFF)
Characterization Results – single vs dual-edge – Energy-delay product (plots: TGMS, SPGFF)
Leon SPARC core configuration
Leon SPARC synthesis
• Synthesized using the TSMC 0.18 µm standard cell library.
• Target frequency of 200 MHz.
• Limit use of single sized D-FF.
SET synthesis flow
SET-CSE synthesis summary: Area and Power
Core summary: approximately 20k gates
Clock tree loading
* Based on library wire-load model
Clock tree power estimation
• High-fanout nets are beyond the interpolation range of the library's wire-load models.
• Wire-load models are not meant for estimating balanced distribution nets such as clock nets.
• Using library wire-load models for the clock tree is therefore not valid.
• Use an H-tree estimation equation to obtain a ballpark number.
H-tree estimation equation
• Equation developed by ACSEL lab member Nikola Nedovic.
• Recursively calculates H-tree loading for a given area, number of CSEs in the design, and number of H-tree levels.
H-tree estimation method
H-tree estimation method
* Table taken from Nedovic, Nikola, Ph.D. Dissertation, UCD, “CLOCKED STORAGE ELEMENTS FOR HIGH-PERFORMANCE APPLICATIONS”
H-tree estimation method
• Equation reduces to two terms: load due to CSEs and load due to wiring.
H-tree power: load switching power, clock driver power, total H-tree power
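The load-switching term can be illustrated with the standard first-order relation P = C · V² · f for a net that makes one full charge/discharge cycle per clock period. This is a generic sketch, not Nedovic's recursive H-tree equation, and it leaves out the clock driver power term. The CSE count, clock-pin capacitance, and wiring capacitance below are hypothetical placeholders; only the 1.8 V supply and 200 MHz target come from the slides.

```python
def clock_load_f(n_cse, cse_clock_pin_f, wiring_f):
    """Total clock load in farads: load due to CSEs plus load due to wiring."""
    return n_cse * cse_clock_pin_f + wiring_f


def switching_power_w(load_f, vdd_v, freq_hz):
    """Power to charge and discharge `load_f` once per clock period."""
    return load_f * vdd_v ** 2 * freq_hz


# Hypothetical numbers for illustration only.
load = clock_load_f(n_cse=1000, cse_clock_pin_f=2e-15, wiring_f=3e-12)
power_w = switching_power_w(load, vdd_v=1.8, freq_hz=200e6)
```

Since the result scales linearly with both load and frequency, the estimate is only as good as the capacitance figure fed into it, which is why the wire-load model is replaced by an H-tree estimate for the clock net.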
SET-CSE synthesis summary with H-tree estimate: Area and Power
SET-CSE power profile with H-tree estimate
SET-CSE Core power profile
Modeling DET-CSEs for Synthesis
• Need to model the timing parameters for both edges.
Modeling DET-CSEs for Synthesis
• Can model complex timing relationships for synthesis (figure labels: falling-edge timing arc, rising-edge timing arc).
Modeling DET-CSEs for Synthesis
• The synthesis tool will time, and (try to) meet constraints for, the dual-edge triggered synchronous system.
Modeling DET-CSEs for Synthesis
• The synthesis tool will use the worst timing arc relationship for the critical path constraint (figure labels: Critical, Not Critical).
Modeling DET-CSEs for Synthesis
• Synthesis tools are not capable of inferring a dual-edge triggered device from HDL code.
• For meeting timing we only care about the strictest constraint anyway (i.e., for one pair of launch and capture edges).
• It is therefore unnecessary to model a complex timing device.
Modeling DET-CSEs for Synthesis
• Simply model the DET-CSE as a SET-CSE with worst-edge timing parameters.
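A minimal sketch of the worst-edge collapse described above: the rising-edge and falling-edge timing of a DET-CSE are reduced to a single set of parameters by taking the more pessimistic value of each. The `EdgeTiming` type, the function name, and all numbers are hypothetical; the slides do not give per-edge values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EdgeTiming:
    setup_ns: float
    hold_ns: float
    clk_to_q_ns: float


def worst_edge_model(rising, falling):
    """Single-edge equivalent that is pessimistic for critical-path timing."""
    return EdgeTiming(
        setup_ns=max(rising.setup_ns, falling.setup_ns),
        hold_ns=max(rising.hold_ns, falling.hold_ns),
        clk_to_q_ns=max(rising.clk_to_q_ns, falling.clk_to_q_ns),
    )


# Hypothetical per-edge characterization results, for illustration only.
rising_edge = EdgeTiming(setup_ns=0.08, hold_ns=0.02, clk_to_q_ns=0.31)
falling_edge = EdgeTiming(setup_ns=0.06, hold_ns=0.05, clk_to_q_ns=0.35)
set_equivalent = worst_edge_model(rising_edge, falling_edge)
```

Taking the largest clock-to-output delay is conservative for the critical path only; a race-path check needs the smallest one instead, so it is handled with a separate model.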
Synthesis flow for DET-CSEs
Synthesis flow for DET-CSEs
• Use synthesis directives to force use of the DET-CSE modeled device.
• Synthesize for target throughput, not frequency.
• Use worst-case models for meeting critical-path timing constraints.
• Generate a worst-case hold model to verify the race path: fastest clk-Q with worst-case hold time.
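Two of the points above can be sketched numerically. The first function reflects one reading of "throughput, not frequency": a dual-edge design captures data on both edges, so the edge-to-edge timing budget follows the throughput while the physical clock runs at half that rate. The second is a first-order race-path check using the fastest clock-to-output delay against the worst-case hold time. Function names, the skew parameter, and the numeric inputs are assumptions, not values from the slides.

```python
def det_clock_plan(throughput_hz):
    """Timing budget between capturing edges, and the physical DET clock rate."""
    return {
        "edge_to_edge_ns": 1e9 / throughput_hz,
        "det_clock_hz": throughput_hz / 2,
    }


def hold_slack_ns(fastest_clk_to_q_ns, shortest_logic_ns, worst_hold_ns, skew_ns=0.0):
    """Positive slack means the race path is safe under this first-order model."""
    return fastest_clk_to_q_ns + shortest_logic_ns - worst_hold_ns - skew_ns


plan = det_clock_plan(200e6)  # 200 MHz single-edge target expressed as throughput
slack_ns = hold_slack_ns(fastest_clk_to_q_ns=0.15, shortest_logic_ns=0.0, worst_hold_ns=0.05)
```

A direct flop-to-flop connection (`shortest_logic_ns=0.0`) is the tightest case, which is why the hold model pairs the fastest clk-Q with the worst-case hold time.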
Modeling DET-CSEs for Synthesis
• Race-path modeling.
• May have an under-constrained race path.