This research explores the impact of circuit loading on flip-flop structures, presenting energy and delay measurements under varying loading conditions. It emphasizes the importance of considering load effects in flip-flop characterization to avoid sub-optimal structure selection. The study also discusses the influence of output buffering, showing that buffering at high load can improve performance and energy consumption for some structures.
Load-Sensitive Flip-Flop Characterization Seongmoo Heo and Krste Asanović MIT Laboratory for Computer Science http://www.cag.lcs.mit.edu/scale WVLSI 2001 April 19, 2001
Motivation • Flip-flops are one of the most important components in synchronous VLSI designs. • Critical effect on cycle time • Large fraction of total system power • Previously published work has failed to consider the effect of circuit loading on the relative ranking of flip-flop structures. [Kawaguchi et al. ’98] [Ko and Balsara ’00] [Kong et al ’00] [Lang et al ’97] [Nikolic et al ’00] [Nogawa and Ohtomo ’98] [Stojanovic and Oklobdzija ’99] [Stollo et al ’00] [Yuan and Svensson ’91] [Zyuban and Kogge ’99] [H.P. et al ’96] [J.M. et al ’96] • Fixed and usually overly large output load • Large or non-specified input drive • No output buffering
Observation 1. Different flip-flop designs have different inherent parasitics and output drive strength. • Different number and complexity of logic gates • Different kinds of feedback
Observation 2. Output loads in a circuit vary significantly. [Figure: histogram of flip-flop output load instances in a microprocessor datapath (a custom-designed 32-bit MIPS CPU in a 0.25μm process); y-axis: # of instances (0 to 120); marked loads: 1.8fF (min inv gate cap), 7.2fF (4 min inv gate cap), 28.8fF (16 min inv gate cap), 115.2fF (64 min inv gate cap)]
Our Proposal Load effects must be considered in flip-flop characterization to avoid sub-optimal selection. • We will present energy and delay measurements for various flip-flops across a range of output loading conditions (electrical effort, EE, and absolute load size) and show that the relative rankings of structures vary. • We will show that output buffering at high load can lead to better performance and energy consumption for some structures.
Related Work • Traditional Buffer Sizing • Logical Effort [Sutherland and Sproull] • Logical Effort: drive strength of a circuit structure • Electrical Effort: the ratio of output load to input load • Delay = intrinsic parasitic delay + LE x EE
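The delay model on this slide can be sketched in a few lines. This is only an illustration of the logical effort formula, not the authors' characterization flow (which used Hspice). The function names are hypothetical, and the inverter parasitic delay of 1 is a common normalization assumed here, not a value from the slides.

```python
def electrical_effort(c_load, c_in):
    """Electrical effort (EE): ratio of output load to input load."""
    return c_load / c_in


def stage_delay(parasitic, logical_effort, ee):
    """Delay = intrinsic parasitic delay + LE x EE (in normalized units)."""
    return parasitic + logical_effort * ee


# Assumed example: a minimum inverter (LE = 1 by definition, parasitic
# delay taken as 1) driving 4 minimum inverters (7.2fF / 1.8fF).
ee = electrical_effort(7.2, 1.8)
print(stage_delay(1.0, 1.0, ee))
```

A structure with a larger logical effort sees its delay grow faster with EE, which is why relative rankings can change with load.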
Overview • Flip-Flop Designs • Test Bench & Simulation Setup • Delay and Energy Characterization • Delay Analysis • Energy-versus-Delay Analysis • Summary
Flip-Flop Designs Fully static and single-ended [Nikolic et al ’00]
Test Bench • Sized clock buffer to give equal rise/fall time • Used a fixed, realistic input driver • Varied output load from 4 min inv cap (7.2fF) to 64 min inv cap (115.2fF) • 4 Load and Drive Configurations • EE4-min: min input drive, 4 min inv load (7.2fF) • EE16-min: min input drive, 16 min inv load (28.8fF) • EE64-min: min input drive, 64 min inv load (115.2fF) • EE4-big: 16x min input drive, 64 min inv load (115.2fF)
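The four configurations can be tabulated as a small sketch. The 1.8fF minimum inverter capacitance and the drive/load multiples come from the slides; the dictionary layout, the `describe` helper, and the reading of electrical effort as load multiple divided by drive multiple are assumptions made for illustration (they do match the configuration names).

```python
MIN_INV_CAP_FF = 1.8  # gate capacitance of a minimum inverter, from the slides

# name -> (input drive, output load), both in multiples of a minimum inverter
CONFIGS = {
    "EE4-min": (1, 4),
    "EE16-min": (1, 16),
    "EE64-min": (1, 64),
    "EE4-big": (16, 64),
}


def describe(name):
    """Return the electrical effort and absolute load (fF) of a configuration."""
    drive, load = CONFIGS[name]
    return {"ee": load / drive, "load_fF": load * MIN_INV_CAP_FF}


for name in CONFIGS:
    print(name, describe(name))
```

EE4-min and EE4-big share the same electrical effort but differ in absolute load size, which is what lets the two effects be separated.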
Simulation Setup • 0.25μm TSMC CMOS process, Vdd=2.5V, T=25°C • Hspice Levenberg-Marquardt method was used for transistor size optimization. • Transistor widths optimized for each load and drive configuration to give min delay or min energy for a given delay (transistor lengths were fixed at minimum). • Parasitic capacitances included in the circuit netlists.
Delay and Energy Characterization • Minimum D-Q delay [Stojanovic et al. ’99] (.Measure command) • Total energy = input energy + internal energy + clock energy – output energy • A single test waveform with ungated clock and data toggling every cycle • For a full characterization of energy dissipation, more realistic activity patterns should be considered [Heo, Krashinsky, Asanovic ARVLSI’01].
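The energy bookkeeping above amounts to a single sum. The sketch below just restates the slide's formula; the function name and the sample numbers are hypothetical and are not measurements from the study.

```python
def total_energy(input_e, internal_e, clock_e, output_e):
    """Total energy = input + internal + clock - output energy (all in fJ)."""
    return input_e + internal_e + clock_e - output_e


# Hypothetical per-cycle energies in fJ, for illustration only.
print(total_energy(input_e=10, internal_e=50, clock_e=30, output_e=20))
```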
Speed Ranking Without Buffering (Transistors sized at each load point, but only for min delay) • Delay = const. intrinsic parasitic delay + output drive delay (= load size × driving capability) • Driving Capability = f(# of stages, complexity) [Figure: speed ranking of PPCFF, SAFF, MSAFF, HLFF, and SSAPL]
Influence of Buffering on Performance (Assuming no penalty for inverting output; min. input drive was used.) [Figure: delay (ns) versus load (min inv cap) for SAFF, PPCFF, MSAFF, HLFF, and SSAPL, each plotted unbuffered, with one inverter, and with two inverters]
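Why buffering helps only at high load can be illustrated with the logical effort model from the Related Work slide. This is a textbook-style sketch, not the Hspice-based method used in the study: the flip-flop output stage parameters (`P_FF`, `G_FF`), the inverter parasitic delay, and the analytically sized buffer are all assumptions chosen for illustration.

```python
import math

P_FF = 3.0   # hypothetical parasitic delay of the flip-flop output stage
G_FF = 1.0   # hypothetical logical effort of the flip-flop output stage
P_INV = 1.0  # assumed parasitic delay of an inverter
C_IN = 1.0   # flip-flop input load, in minimum inverter capacitances


def unbuffered_delay(c_load):
    """Flip-flop drives the load directly."""
    return P_FF + G_FF * c_load / C_IN


def buffered_delay(c_load):
    """Flip-flop drives one inverter, sized to minimize the total delay."""
    c_buf = math.sqrt(c_load * C_IN / G_FF)  # minimizes the two effort terms
    return P_FF + G_FF * c_buf / C_IN + P_INV + c_load / c_buf


for load in (4, 16, 64):
    print(load, unbuffered_delay(load), buffered_delay(load))
```

Under these assumed parameters the unbuffered structure is faster at a load of 4 minimum inverters, while the buffered one is faster at 16 and 64. The crossover point depends on the structure's own parasitics and drive strength.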
Speed Ranking With Buffering Allowed • Less speed variation compared to original flip-flops [Figure: speed ranking of PPCFF, SAFF, MSAFF, HLFF, and SSAPL]
Energy-Delay Curve: EE4-min (min. drive + 4 min inv load, 7.2fF) • Each point sized for min energy for a given delay [Figure: energy (fJ) versus delay (ns); highlighted curve: PPCFF-unbuf]
Energy-Delay Curve: EE16-min (min. drive + 16 min inv load, 28.8fF) [Figure: energy (fJ) versus delay (ns); highlighted curves: SSAPL-unbuf, SSAPL-buf, HLFF-unbuf, HLFF-buf, PPCFF-unbuf]
Energy-Delay Curve: EE64-min (min. drive + 64 min inv load, 115.2fF) [Figure: energy (fJ) versus delay (ns); highlighted curves: MSAFF-unbuf, HLFF-buf, HLFF-unbuf]
Energy-Delay Curve: EE4-big (16x min. drive + 64 min inv load, 115.2fF) [Figure: energy (fJ) versus delay (ns)]
Energy-Delay Curve: EE4-min vs EE4-big [Figure: energy-delay curves for both configurations; labeled curves in each: PPCFF-unbuf, MSAFF-unbuf, SAFF-unbuf]
Summary • Different flip-flops have different gains and parasitics. • Real VLSI designs exhibit a variety of flip-flop output loads. • The output load size affects the relative performance and energy consumption of different flip-flop designs. • Therefore, output load effects should be accounted for when comparing flip-flops: • Electrical effort • Absolute output load size • Output buffering