600 likes | 662 Views
Low-Power Design of Digital VLSI Circuits Gate-Level Power Optimization. Vishwani D. Agrawal James J. Danaher Professor Dept. of Electrical and Computer Engineering Auburn University, Auburn, AL 36849 vagrawal@eng.auburn.edu http://www.eng.auburn.edu/~vagrawal. Components of Power.
E N D
Low-Power Design of Digital VLSI Circuits Gate-Level Power Optimization Vishwani D. Agrawal James J. Danaher Professor Dept. of Electrical and Computer Engineering Auburn University, Auburn, AL 36849 vagrawal@eng.auburn.edu http://www.eng.auburn.edu/~vagrawal Lectures 11-14: Gate-level optimization
Components of Power • Dynamic • Signal transitions • Logic activity • Glitches • Short-circuit (often neglected) • Static • Leakage Lectures 11-14: Gate-level optimization
Power of a Transition isc VDD Dynamic Power = CLVDD2/2+ Psc R Vo Vi CL R Ground Lectures 11-14: Gate-level optimization
Dynamic Power • Each transition of a gate consumes CV 2/2. • Methods of power saving: • Minimize load capacitances • Transistor sizing • Library-based gate selection • Reduce transitions • Logic design • Glitch reduction Lectures 11-14: Gate-level optimization
Glitch Power Reduction • Design a digital circuit for minimum transient energy consumption by eliminating hazards Lectures 11-14: Gate-level optimization
Theorem 1 • For correct operation with minimum energy consumption, a Boolean gate must produce no more than one event per transition. Output logic state changes One transition is necessary Output logic state unchanged No transition is necessary Lectures 11-14: Gate-level optimization
Event Propagation Single lumped inertial delay modeled for each gate PI transitions assumed to occur without time skew Path P1 1 3 1 0 2 4 6 P2 1 2 3 0 Path P3 5 2 0 Lectures 11-14: Gate-level optimization
Inertial Delay of an Inverter Vin dHL+dLH d = ──── 2 dHL dLH Vout time Lectures 11-14: Gate-level optimization
Multi-Input Gate A B Delay d < DPD C DPD: Differential path delay A B C DPD d d Hazard or glitch Lectures 11-14: Gate-level optimization
Balanced Path Delays A B Delay d < DPD DPD C Delay buffer A B C d No glitch Lectures 11-14: Gate-level optimization
Glitch Filtering by Inertia A B Delay d> DPD C A B C DPD d > DPD Filtered glitch Lectures 11-14: Gate-level optimization
Theorem • Given that events occur at the input of a gate, whose inertial delay is d, at times, t1 ≤ . . . ≤ tn , the number of events at the gate output cannot exceed tn – t1 ──── d min ( n , 1 + ) tn - t1 time t1 t2 t3 tn Lectures 11-14: Gate-level optimization
Minimum Transient Design • Minimum transient energy condition for a Boolean gate: | ti – tj | < d Where ti and tj are arrival times of input events and d is the inertial delay of gate Lectures 11-14: Gate-level optimization
Balanced Delay Method • All input events arrive simultaneously • Overall circuit delay not increased • Delay buffers may have to be inserted 1 1 1 1 1 No increase in critical path delay 3 1 1 1 1 1 Lectures 11-14: Gate-level optimization
Hazard Filter Method • Gate delay is made greater than maximum input path delay difference • No delay buffers needed (least transient energy) • Overall circuit delay may increase 1 1 1 1 1 3 1 1 1 1 Lectures 11-14: Gate-level optimization
Designing a Glitch-Free Circuit • Maintain specified critical path delay. • Glitch suppressed at all gates by • Path delay balancing • Glitch filtering by increasing inertial delay of gates or by inserting delay buffers when necessary. • A linear program optimally combines all objectives. Path delay = d1 |d1 – d2| < D Delay D Path delay = d2 Lectures 11-14: Gate-level optimization
Problem Complexity • Number of paths in a circuit can be exponential in circuit size. • Considering all paths through enumeration is infeasible for large circuits. • Example: c880 has 6.96M path constraints. Lectures 11-14: Gate-level optimization
Define Arrival Time Variables • di Gate delay. • Define two timing windowvariables per gate output: • tiEarliest time of signal transition at gate i. • Ti Latest time of signal transition at gate i. • Glitch suppression constraint: Ti – ti < di t1, T1 ti, Ti . . . di tn, Tn Reference: T. Raja, Master’s Thesis, Rutgers Univ., 2002. Lectures 11-14: Gate-level optimization
Linear Program • Variables: gate and buffer delays, arrival time variables. • Objective: minimize number of buffers. • Subject to: overall circuit delay constraint for all input-output paths. • Subject to: minimum transient energy condition for all multi-input gates. Lectures 11-14: Gate-level optimization
An Example: Full Adder add1b 1 1 1 1 1 1 1 1 1 Critical path delay = 6 Lectures 11-14: Gate-level optimization
Linear Program • Gate variables: d4 . . . d12 • Buffer delay variables: d15 . . . d29 • Window variables: t4 . . . t29 and T4 . . . . T29 Lectures 11-14: Gate-level optimization
Multiple-Input Gate Constraints For Gate 7: T7≥ T5 + d7 t7≤ t5 + d7d7 > T7 – t7 T7≥ T6 + d7 t7≤ t6 + d7 Glitch suppression Lectures 11-14: Gate-level optimization
Single-Input Gate Constraints Buffer 19: T16 + d19 = T19 t16 + d19 = t19 Lectures 11-14: Gate-level optimization
Critical Path Delay Constraints T11≤maxdelay T12≤maxdelay maxdelay is specified Lectures 11-14: Gate-level optimization
Objective Function • Need to minimize the number of buffers. • Because that leads to a nonlinear objective function, we use an approximate criterion: minimize ∑ (buffer delay) all buffers i.e., minimize d15 + d16 + ∙ ∙ ∙ + d29 • This gives a near optimum result. Lectures 11-14: Gate-level optimization
AMPL Solution: maxdelay =6 1 2 1 1 1 1 1 2 1 2 2 Critical path delay = 6 Lectures 11-14: Gate-level optimization
AMPL Solution: maxdelay =7 3 1 1 1 1 1 2 2 1 2 Critical path delay = 7 Lectures 11-14: Gate-level optimization
AMPL Solution: maxdelay ≥11 5 1 1 1 3 1 2 3 4 Critical path delay = 11 Lectures 11-14: Gate-level optimization
ALU4: Four-Bit ALU 74181 Maximum Power Savings (zero-buffer design): Peak = 33%, Average = 21% Lectures 11-14: Gate-level optimization
ALU4: Original and Low-Power Lectures 11-14: Gate-level optimization
Benchmark Circuits Normalized Power Max-delay (gates) 7 15 24 48 47 94 43 86 No. of Buffers 5 0 62 34 294 120 366 111 Circuit ALU4 C880 C6288 c7552 Average 0.80 0.79 0.68 0.68 0.40 0.36 0.44 0.42 Peak 0.68 0.67 0.54 0.52 0.36 0.34 0.34 0.32 Lectures 11-14: Gate-level optimization
C7552 Circuit: Spice Simulation Power Saving: Average 58%, Peak 68% Lectures 11-14: Gate-level optimization
References • R. Fourer, D. M. Gay and B. W. Kernighan, AMPL: A Modeling Language for Mathematical Programming, South San Francisco: The Scientific Press, 1993. • M. Berkelaar and E. Jacobs, “Using Gate Sizing to Reduce Glitch Power,” Proc. ProRISC Workshop, Mierlo, The Netherlands, Nov. 1996, pp. 183-188. • V. D. Agrawal, “Low Power Design by Hazard Filtering,” Proc. 10th Int’l Conf. VLSI Design, Jan. 1997, pp. 193-197. • V. D. Agrawal, M. L. Bushnell, G. Parthasarathy and R. Ramadoss, “Digital Circuit Design for Minimum Transient Energy and Linear Programming Method,” Proc. 12th Int’l Conf. VLSI Design, Jan. 1999, pp. 434-439. • T. Raja, V. D. Agrawal and M. L. Bushnell, “Minimum DynamicPower CMOS Circuit Design by a Reduced Constraint Set Linear Program,” Proc. 16thInt’l Conf. VLSI Design, Jan. 2003, pp. 527-532. • T. Raja, V. D. Agrawal, and M. L. Bushnell, “Transistor sizing of logicgates to maximize input delay variability,” J. Low Power Electron., vol.2, no. 1, pp. 121–128, Apr. 2006. • T. Raja, V. D. Agrawal, and M. L. Bushnell, “Variable Input Delay CMOS Logic for Low Power Design,” IEEE Trans. VLSI Design, vol. 17, mo. 10, pp. 1534-1545. October 2009. Lectures 11-14: Gate-level optimization
Exercise: Dynamic Power • An average gate • VDD, V = 1 volt • Output capacitance, C = 1pF • Activity factor, α = 10% • Clock frequency, f = 1GHz • What is the dynamic power consumption of a 1 million gate VLSI chip? Lectures 11-14: Gate-level optimization
Answer • Dynamic energy per transition = 0.5CV2 • Dynamic power per gate = Energy per second = 0.5 CV2 α f = 0.5 ✕ 10 – 12 ✕ 12 ✕ 0.1 ✕ 109 = 0.5 ✕ 10 – 4 = 50μW • Power for 1 million gate chip = 50W Lectures 11-14: Gate-level optimization
Components of Power • Dynamic • Signal transitions • Logic activity • Glitches • Short-circuit • Static • Leakage Lectures 11-14: Gate-level optimization
Subthreshold Conduction Vgs – Vth –Vds Ids = I0 exp( ───── ) × (1– exp ─── ) nVT VT Ids 1mA 100μA 10μA 1μA 100nA 10nA 1nA 100pA 10pA Subthreshold slope Saturation region Subthreshold region d g s Vth 0 0.3 0.6 0.9 1.2 1.5 1.8 V Vgs Lectures 11-14: Gate-level optimization
Thermal Voltage, vT VT = kT/q = 26 mV, at room temperature. When Vds is several times greater than VT Vgs – Vth Ids = I0 exp( ───── ) nVT Lectures 11-14: Gate-level optimization
Leakage Current • Leakage current equals Ids when Vgs= 0 • Leakage current, Ids = I0exp( – Vth/nVT) • At cutoff, Vgs = Vth, and Ids = I0 • Lowering leakage to 10-b ✕ I0 Vth = bnVTln 10 = 1.5b × 26 ln 10 = 90b mV • Example: To lower leakage to I0/1,000 Vth = 270 mV Lectures 11-14: Gate-level optimization
Threshold Voltage • Vth = Vt0 + γ[(Φs+Vsb)½ – Φs½] • Vt0 is threshold voltage when source is at body potential (0.4 V for 180nm process) • Φs = 2VTln(NA /ni)is surface potential • γ = (2qεsiNA)½tox /εox is body effect coefficient (0.4 to 1.0) • NA is doping level = 8×1017 cm–3 • ni = 1.45×1010 cm–3 Lectures 11-14: Gate-level optimization
Threshold Voltage, Vsb = 1.1V • Thermal voltage, VT = kT/q = 26 mV • Φs = 0.93 V • εox = 3.9×8.85×10-14 F/cm • εsi = 11.7×8.85×10-14 F/cm • tox = 40 Ao • γ = 0.6 V½ • Vth = Vt0 + γ[(Φs+Vsb)½- Φs½] = 0.68 V Lectures 11-14: Gate-level optimization
A Sample Calculation • VDD = 1.2V, 100nm CMOS process • Transistor width, W = 0.5μm • OFF device (Vgs = Vth) leakage • I0 = 20nA/μm, for low threshold transistor • I0 = 3nA/μm, for high threshold transistor • 100M transistor chip • Power = (100×106/2)(0.5×20×10-9A)(1.2V) = 600mW for all low-threshold transistors • Power = (100×106/2)(0.5×3×10-9A)(1.2V) = 90mW for all high-threshold transistors Lectures 11-14: Gate-level optimization
Dual-Threshold Chip • Low-threshold only for 20% transistors on critical path. • Leakage power = 600×0.2 + 90×0.8 = 120 + 72 = 192 mW Lectures 11-14: Gate-level optimization
Dual-Threshold CMOS Circuit Lectures 11-14: Gate-level optimization
Dual-Threshold Design • To maintain performance, all gates on critical paths are assigned low Vth . • Most other gates are assigned high Vth . • But, some gates on non-critical paths may also be assigned low Vth to prevent those paths from becoming critical. Lectures 11-14: Gate-level optimization
Integer Linear Programming (ILP) to Minimize Leakage Power • Use dual-threshold CMOS process • First, assign all gates low Vth • Use an ILP model to find the delay (Tc) of the critical path • Use another ILP model to find the optimal Vth assignment as well as the reduced leakage power for all gates without increasing Tc • Further reduction of leakage power possible by letting Tc increase Lectures 11-14: Gate-level optimization
ILP -Variables For each gate i define two variables. • Ti :the longest time at which the output of gate i can produce an event after the occurrence of an input event at a primary input of the circuit. • Xi :a variable specifyinglow or high Vth for gate i ; Xiis an integer [0, 1], 1 gate i is assigned low Vth , 0 gate i is assigned high Vth . Lectures 11-14: Gate-level optimization
ILP - objective function Leakage power: minimize the sum of all gate leakage currents, given by • ILi is the leakage current of gate i with low Vth • IHiis the leakage current of gate i with high Vth • Using SPICE simulation results, construct a leakage current look up table, which is indexed by the gate type and the input vector. Lectures 11-14: Gate-level optimization
ILP - Constraints Ti • For each gate (1) output of gate j is fanin of gate i (2) • Max delay constraints for primary outputs (PO) (3) Tmax is the maximum delay of the critical path Gate i Gate j Tj Lectures 11-14: Gate-level optimization
ILP Constraint Example • Assume all primary input (PI) signals on the left arrive at the same time. • For gate 2, constraints are Lectures 11-14: Gate-level optimization