This Ph.D. dissertation explores techniques to optimize power and performance of CMOS circuits affected by process variation. It addresses issues of leakage power, glitches, and statistical power optimization, presenting results and future work suggestions.
Power and Performance Optimization of Static CMOS Circuits with Process Variation • Yuanlin Lu, Department of ECE, Auburn University, Auburn, AL 36849 • Ph.D. Dissertation Committee: Dr. Vishwani D. Agrawal, Dr. Fa Foster Dai, Dr. Charles Stroud, Dr. Douglas Leonard (Outside Reader) • May 25, 2007
Outline • Motivation • Problem Statement • Background • Proposed Techniques • MILP1 for Leakage and Glitch Minimization • MILP2 for Statistical Leakage Optimization under Process Variation • MILP3 for Statistical Glitch Power Reduction under Process Variation • Results • Conclusion • Suggestions for Future Work Ph.D. Final Oral Examination
Motivation • Leakage power has become a dominant contributor to total power consumption • At 65nm, leakage is ~50% of total power consumption • Glitches consume 20%-70% of dynamic power • Variation of process parameters increases with technology scaling • both the average and the standard deviation of leakage power increase • some glitch elimination techniques (path balancing) are no longer effective • both power yield and timing yield are degraded
Problem Statement • Design a CMOS Circuit with Dual-Threshold Devices and Delay Elements to: • Globally minimize subthreshold leakage • Eliminate all glitches • Maintain specified performance • Statistically Design a CMOS Circuit with Dual-Threshold Devices: • Reduce the effect of process variation on subthreshold leakage • Achieve a specified timing yield • Statistically Design a CMOS Circuit by Dual-Threshold Assignment, Path Balancing and Gate Sizing to: • Minimize leakage and dynamic power (capacitance reduction and glitch elimination) • Reduce the effect of process variation on leakage and dynamic power • Achieve a specified timing yield • Allow Performance-Power Tradeoff
Power Consumption in CMOS Circuit • Total Power = Dynamic Switching Power + Short-Circuit Power + Subthreshold Leakage Power + Gate Leakage Power
Leakage and Delay • Increasing Vth can exponentially decrease Isub • But gate delay increases at the same time (T. Sakurai and A. R. Newton, Alpha-power Law, 1990), where α models channel effects (long channel α = 2, short channel α = 1.3) • When using dual-Vth techniques, one must consider the tradeoff between leakage reduction and performance degradation
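The tradeoff on this slide can be illustrated numerically. The sketch below assumes the common first-order forms Isub ∝ exp(−Vth/(n·vT)) and delay ∝ Vdd/(Vdd − Vth)^α; the supply voltage, slope factor n, and the two threshold values are illustrative assumptions and are not taken from the dissertation.

```python
import math

def relative_leakage(vth, n=1.5, v_thermal=0.026):
    """Subthreshold leakage up to a constant factor: exp(-Vth / (n * vT))."""
    return math.exp(-vth / (n * v_thermal))

def relative_delay(vth, vdd=1.0, alpha=1.3):
    """Alpha-power-law gate delay up to a constant factor: Vdd / (Vdd - Vth)^alpha."""
    return vdd / (vdd - vth) ** alpha

# Hypothetical low and high thresholds, in volts (assumed values).
low_vth, high_vth = 0.20, 0.30

# Ratios of the high-Vth device to the low-Vth device.
leak_ratio = relative_leakage(high_vth) / relative_leakage(low_vth)
delay_ratio = relative_delay(high_vth) / relative_delay(low_vth)
print(f"leakage x{leak_ratio:.3f}, delay x{delay_ratio:.3f}")
```

Under these assumed numbers the high-Vth device leaks more than ten times less while its delay grows by a much smaller factor, which is why high Vth is attractive on non-critical paths.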
Dual Threshold CMOS • (Table: dual-threshold device library, NAND02 @ 70nm, from Spice simulation) • To maintain performance, most gates on the critical path may be assigned low Vth • Most gates on the non-critical paths may be assigned high Vth to reduce leakage
Dynamic Power • Pdyn = ½·CL·Vdd²·A·F, where F is the clock frequency and A is the switching activity • Dynamic Power = Logic Switching Power + Glitch Power
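As a small worked example of the formula on this slide, the sketch below evaluates Pdyn for one node with and without glitch transitions. The load, supply, clock, and activity values are made up, and modeling glitches as an increase in the activity factor A is an assumption of this sketch.

```python
def dynamic_power(c_load, vdd, activity, freq):
    """P_dyn = 1/2 * C_L * Vdd^2 * A * F (watts when inputs are in SI units)."""
    return 0.5 * c_load * vdd ** 2 * activity * freq

# Hypothetical node: 10 fF load, 1.0 V supply, 1 GHz clock.
logic_only = dynamic_power(10e-15, 1.0, activity=0.2, freq=1e9)
# Glitches add extra transitions, modeled here as a higher activity factor.
with_glitches = dynamic_power(10e-15, 1.0, activity=0.3, freq=1e9)

glitch_power = with_glitches - logic_only
glitch_fraction = glitch_power / with_glitches
print(f"glitch share of dynamic power: {glitch_fraction:.0%}")
```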
Techniques to Eliminate Glitches • path delay difference < gate inertial delay [1] • Hazard Filtering (Gate/Transistor Sizing) • Increase gate inertial delay • Size the gate to change its delay • Path Balancing • Decrease path delay difference • Insert delay elements on the shorter-delay signal path • [1] V. D. Agrawal, International Conference on VLSI Design, 1997
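Timing Window for Calculating Path Delay Difference • The timing window of gate i is (ti, Ti), where ti and Ti are the earliest and latest times at which the output of gate i can produce an event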
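A timing window can be computed by propagating earliest and latest event times through the circuit. The sketch below is one plausible reading, assuming ti = min over fanins of tj plus the gate delay and Ti = max over fanins of Tj plus the gate delay; the slide's own equations are not reproduced here, and the circuit, gate names, and delays are hypothetical.

```python
def timing_windows(gates, order):
    """gates: name -> (delay, [fanin names]); primary inputs have no fanins.
    Returns name -> (t, T), the earliest and latest output event times."""
    win = {}
    for g in order:  # 'order' must be a topological order
        delay, fanins = gates[g]
        if not fanins:
            win[g] = (0.0, 0.0)  # events at primary inputs occur at time 0
        else:
            t = min(win[f][0] for f in fanins) + delay
            T = max(win[f][1] for f in fanins) + delay
            win[g] = (t, T)
    return win

def filters_glitches(gates, win, g):
    """True if the spread of input arrival times at g is below its inertial delay."""
    delay, fanins = gates[g]
    if not fanins:
        return True
    spread = max(win[f][1] for f in fanins) - min(win[f][0] for f in fanins)
    return spread < delay

# Unbalanced circuit: g2 sees input 'b' at time 0 and g1's output at time 2.
unbalanced = {"a": (0.0, []), "b": (0.0, []),
              "g1": (2.0, ["a"]), "g2": (1.5, ["g1", "b"])}
win_u = timing_windows(unbalanced, ["a", "b", "g1", "g2"])

# Path balancing: a 1.0-unit delay element 'buf' is inserted on the shorter path.
balanced = {"a": (0.0, []), "b": (0.0, []), "buf": (1.0, ["b"]),
            "g1": (2.0, ["a"]), "g2": (1.5, ["g1", "buf"])}
win_b = timing_windows(balanced, ["a", "b", "buf", "g1", "g2"])

print(filters_glitches(unbalanced, win_u, "g2"), filters_glitches(balanced, win_b, "g2"))
```

Because the same gate delay is added to both bounds, the window width Ti − ti at a gate's output equals the arrival spread at its inputs, so the check amounts to comparing that width against the gate's delay.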
Previous Work on Leakage Minimization and Glitch Power Reduction • Leakage Power Minimization by Dual-Vth CMOS Devices • Heuristic Algorithms (locally optimum solutions) • Q. Wang and S. B. K. Vrudhula, “Static Power Optimization of Deep Submicron CMOS Circuits for Dual VT Technology,” Proc. ICCAD, 1998, pp. 490-496. • L. Wei, Z. Chen, M. Johnson and K. Roy, “Design and Optimization of Low Voltage High Performance Dual Threshold CMOS Circuits,” Proc. DAC, 1998, pp. 489-494. • Integer Linear Programming (globally optimum solutions) • D. Nguyen, A. Davare, M. Orshansky, D. Chinnery, B. Thompson and K. Keutzer, “Minimization of Dynamic and Static Power Through Joint Assignment of Threshold Voltages and Sizing Optimization,” Proc. ISLPED, 2003, pp. 158-163. • F. Gao and J. P. Hayes, “Gate Sizing and Vt Assignment for Active-Mode Leakage Power Reduction,” Proc. ICCD, 2004, pp. 258-264. • Glitch Power Elimination by Linear Programming • T. Raja, V. D. Agrawal and M. L. Bushnell, “Minimum Dynamic Power CMOS Circuit Design by a Reduced Constraint Set Linear Program,” Proc. 16th International Conference on VLSI Design, 2003, pp. 527-532.
MILP1: Minimize Leakage and Dynamic Glitch Power Simultaneously • No process variation is considered. • MILP1 is a mixed integer linear program (both integer and continuous variables are used). • Objective, in a dual-threshold CMOS process: • Minimize leakage – MILP1 determines the optimal dual-threshold assignment • Eliminate glitches – MILP1 determines the delays and positions of the delay elements used to balance path delays
MILP1: A Mixed Integer Linear Program for Leakage and Glitch Power Reduction • Ideal objective function: Minimize {Total leakage + No. of glitch suppressing delay elements} • Alternative objective function (linear approximation): Minimize {C1·Total leakage + C2·Total glitch suppressing delay}
Variables and Constants • Each gate has four variables and four constants • Integer Variable: • Xi ∈ {0,1}: specifies the gate threshold voltage • Continuous-valued Variables: • Ti: latest time at which the output of gate i can produce an event after the occurrence of an event at primary inputs. • ti: earliest time at which the output of gate i can produce an event after the occurrence of an event at primary inputs. • Δdi,j: delay of the delay element inserted at the input of gate i coming from gate j. • Constants Determined by Spice Simulation: • ILi and IHi: leakage currents for low and high thresholds • DLi and DHi: delays for low and high thresholds
Constraints • (Figure: example circuit with timing windows (t0,T0), (t1,T1), (t2,T2), (t3,T3)) • Glitch suppression constraint for each gate i: constraints (1)-(5) • Constraints (1)-(5) make sure that T2 − t2 < d2 • Circuit delay constraint for each PO k: Tk ≤ Tmax, k = 1, 3 • Tmax can be the delay of the critical path or the clock period specified by the circuit designer
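The leakage part of the formulation can be illustrated on a toy circuit. The sketch below is not an MILP solver: it exhaustively enumerates low/high threshold choices, which is only feasible for a handful of gates, and it leaves out the glitch suppression constraints and delay elements. The circuit, the delay values (DL, DH), and the leakage values (IL, IH) are all made up for illustration.

```python
from itertools import product

# name: (fanins, D_L, D_H, I_L, I_H); delays in ns, leakage in nA (made-up values)
GATES = {
    "g1": ([], 1.0, 1.4, 10.0, 0.5),
    "g2": ([], 1.0, 1.4, 10.0, 0.5),
    "g3": (["g1", "g2"], 2.0, 2.6, 20.0, 1.0),
    "g4": (["g2"], 1.0, 1.4, 10.0, 0.5),
}
ORDER = ["g1", "g2", "g3", "g4"]  # topological order
OUTPUTS = ["g3", "g4"]            # gates driving primary outputs

def evaluate(assign):
    """Return (circuit delay, total leakage) for a dict gate -> 'L' or 'H'."""
    arrival, leak = {}, 0.0
    for g in ORDER:
        fanins, d_l, d_h, i_l, i_h = GATES[g]
        high = assign[g] == "H"
        leak += i_h if high else i_l
        delay = d_h if high else d_l
        arrival[g] = max((arrival[f] for f in fanins), default=0.0) + delay
    return max(arrival[o] for o in OUTPUTS), leak

def best_assignment(t_max):
    """Lowest-leakage assignment whose delay at every output is within t_max."""
    best = None
    for choice in product("LH", repeat=len(ORDER)):
        assign = dict(zip(ORDER, choice))
        delay, leak = evaluate(assign)
        if delay <= t_max and (best is None or leak < best[1]):
            best = (assign, leak, delay)
    return best

result = best_assignment(t_max=3.0)
print(result)
```

With Tmax set to the all-low-Vth critical path delay, only the gate off the critical paths (g4 here) can move to high Vth, which mirrors the intuition from the Dual Threshold CMOS slide.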
Choices for a Delay Element • Buffer of two cascaded inverters – consumes additional short-circuit, subthreshold leakage and dynamic power. • All delay buffers lie on non-critical paths and are assigned high Vth, so they contribute little to leakage • But they add to dynamic power • Transmission gate (always on) – increases resistance • Smaller area overhead • No subthreshold leakage • Minimal capacitance increase • Used previously in: • T. Raja, V. D. Agrawal and M. L. Bushnell, “Variable Input Delay CMOS Logic for Low Power Design,” Proc. 18th International Conference on VLSI Design, January 2005, pp. 598-605. • T. Raja, V. D. Agrawal and M. L. Bushnell, “Transistor Sizing of Logic Gates to Maximize Input Delay Variability,” JOLPE, vol. 2, no. 1, pp. 121-128, April 2006.
Transmission-Gate Delay Element with Minimum Capacitance • Two types of capacitances: • Diffusion capacitances: CSB, CDB • Channel capacitances: CGS, CGD • To minimize diffusion capacitances, we implement all the transmission-gate delay elements with minimum-width but longer-channel transistors
Transmission-Gate Delay Element with Minimum Capacitance (Cont.) • To implement a specified delay, the smallest L is needed when W is at its minimum. • This reduces the channel capacitance of the transmission gate, which is proportional to L·W. • So a minimum-width transmission gate has the minimum Ctotal and causes the smallest dynamic power overhead.
One Example: Process Variation Effect on Leakage and Performance • [Ref] S. Borkar, et al., DAC 2003. • 0.18 µm CMOS process • 20X leakage variation • 30% frequency variation • high-frequency chips that are too leaky must be discarded • low-leakage chips whose frequency is too low must also be discarded
Local and Global Process Variations • Inter-die Variation (Global Variation) • refers to wafer to wafer, or die to die variation on the same wafer • affects all devices on the same chip in the same way • Intra-die Variation (Local Variation) • occurs across an individual die / chip • devices at different locations on the same chip may have different process parameters
Comparison of Dynamic and Leakage Power Variation of Un-Optimized C432 (1,000 Samples) • (Figure: distributions of normalized dynamic power and normalized leakage power, with the nominal value marked)
Comparison of Leakage Distribution of C432 Due to Different Process Parameters’ Variation (3σ = 15%) • (Figure: leakage distributions, with the nominal value marked)
Comparison of Leakage Distribution of C432 Due to Different Process Parameters’ Variation (Cont.) • Subthreshold leakage is most sensitive to variation in the effective channel length. • Global variation has a stronger effect on subthreshold leakage than local variation.
Statistical Leakage Modeling • 2000 samples of the subthreshold leakage of one MUX cell @ 90nm, obtained by Monte Carlo Spice simulation • In the Spice model library, process parameters (Tox, Ndop, Vth) are random variables with Gaussian distributions • Statistical subthreshold leakage has a lognormal distribution • We use the statistical leakage model in [ref] R. Rao, et al., Parametric Yield Estimation Considering Leakage Variability, DAC, 2004.
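The link between Gaussian process parameters and lognormal leakage can be seen in a small Monte Carlo experiment. The sketch below is not the model of Rao et al.; it assumes only that leakage depends exponentially on a Gaussian threshold-voltage deviation, with an assumed spread and slope factor.

```python
import math
import random
import statistics

def sample_leakage(n_samples, sigma_vth=0.02, n=1.5, v_thermal=0.026, seed=0):
    """Leakage normalized to its nominal value when Vth deviates by a Gaussian dVth."""
    rng = random.Random(seed)
    return [math.exp(-rng.gauss(0.0, sigma_vth) / (n * v_thermal))
            for _ in range(n_samples)]

samples = sample_leakage(20000)
mean_leak = statistics.fmean(samples)      # pulled above nominal by the long right tail
median_leak = statistics.median(samples)   # stays close to the nominal value of 1.0
log_std = statistics.pstdev(math.log(x) for x in samples)
print(f"mean {mean_leak:.3f}, median {median_leak:.3f}, std of log {log_std:.3f}")
```

The exponential of a Gaussian is lognormal, so the mean leakage sits above the nominal value even though the underlying parameter varies symmetrically.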
Statistical Delay Modeling • Xi is a process parameter, Xi0 is the nominal value of Xi • Let {X1, X2, X3} = {Leff, Tox, Ndop} • Deterministic delay vs. statistical delay, where the statistical delay has a normal distribution described by its mean and standard deviation [ref] • [ref] A. Davoodi and A. Srivastava, ISLPED, 2005.
MILP2 Formulation (Deterministic vs. Statistical) • Deterministic Approach • Minimize total leakage over all gates (∀ i ∈ gate number), subject to the timing constraint at every primary output (∀ k ∈ PO) • The delay and subthreshold current of every gate are assumed to be fixed and unaffected by process variation. • Basic MILP1: minimize total leakage while keeping the circuit performance unchanged. • Statistical Approach • Minimize total nominal leakage over all gates (∀ i ∈ gate number), subject to the timing yield constraint at every primary output (∀ k ∈ PO) • Treat delay and timing intervals as random variables with normal distributions, and leakage as a random variable with a lognormal distribution • Basic MILP2: minimize total nominal leakage while keeping a certain timing yield (η).
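One common way to express a timing yield target for a normally distributed arrival time is μ + Φ⁻¹(η)·σ ≤ Tmax. The sketch below uses that form as an assumption; it is not necessarily the exact constraint used in MILP2, and the numbers are illustrative.

```python
from statistics import NormalDist

def meets_timing_yield(mu, sigma, t_max, eta):
    """True if P(T <= t_max) >= eta for T ~ Normal(mu, sigma)."""
    k = NormalDist().inv_cdf(eta)  # standard normal quantile for the yield target
    return mu + k * sigma <= t_max

def timing_yield(mu, sigma, t_max):
    """Probability that a Normal(mu, sigma) arrival time meets t_max."""
    return NormalDist(mu, sigma).cdf(t_max)

# Hypothetical primary output: mean arrival 9.0 ns, clock period 10.0 ns.
print(meets_timing_yield(9.0, 0.4, 10.0, eta=0.99))
print(meets_timing_yield(9.0, 0.5, 10.0, eta=0.99))
print(meets_timing_yield(9.0, 0.5, 10.0, eta=0.95))
```

Lowering η shrinks the guard band Φ⁻¹(η)·σ, which leaves more slack for high-Vth gates. That is consistent with the comparison of leakage savings at two timing yields reported in the results.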
Background • Dynamic power is normally much less sensitive to process variation because of its approximately linear relation to process parameters. • Deterministic path balancing becomes ineffective under process variation because the perfect hazard filtering conditions can easily be corrupted by a very slight variation in process parameters. • (Figure: C432 unoptimized for glitches vs. C432 optimized by path balancing, with the nominal value marked)
Gate Distribution without Considering Process Variation • (Figure panels: circuits unoptimized for glitch; circuits optimized for glitch by path balancing)
Gate Distribution under Process Variation • (Figure panels: circuits unoptimized for glitch; circuits optimized for glitch by path balancing) • Glitch power of unoptimized circuits is not sensitive to process variation • Glitch power of circuits optimized by path balancing is sensitive to process variation
Technique for Enhancing the Resistance of Glitch Power to Process Variations • Leave a relaxed margin in advance so that glitch filtering tolerates process variation
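The effect of leaving a margin can be illustrated with a small Monte Carlo experiment. The sketch below is a hypothetical single-gate model, not the MILP3 formulation: two input paths and the gate delay each vary independently with σ = 5% of nominal (3σ = 15%), and the path lengths and gaps are assumed values.

```python
import random

def filter_pass_rate(nominal_gap, gate_delay, sigma_rel=0.05, trials=20000, seed=1):
    """Fraction of samples in which the path delay difference stays below the gate delay."""
    rng = random.Random(seed)
    long_nominal = 5.0                        # assumed delay of the longer path
    short_nominal = long_nominal - nominal_gap
    passed = 0
    for _ in range(trials):
        long_path = rng.gauss(long_nominal, sigma_rel * long_nominal)
        short_path = rng.gauss(short_nominal, sigma_rel * short_nominal)
        delay = rng.gauss(gate_delay, sigma_rel * gate_delay)
        if abs(long_path - short_path) < delay:
            passed += 1
    return passed / trials

# Balanced just inside the filtering condition (gap barely below the gate delay).
tight = filter_pass_rate(nominal_gap=0.99, gate_delay=1.0)
# Balanced with a relaxed margin (gap well below the gate delay).
relaxed = filter_pass_rate(nominal_gap=0.30, gate_delay=1.0)
print(f"tight: {tight:.2f}, relaxed: {relaxed:.2f}")
```

In this toy model the design that only just satisfies the nominal condition loses glitch filtering in a large share of samples, while the design with margin keeps it in most of them. The extra delay inserted to create the margin is the cost.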
Results for C432: Monte Carlo Simulation (15% local process variation) • C432 optimized by the statistical MILP with greater emphasis on the resistance of glitch power to process variation (in Section 5.2.3.1) (blue) • C432 optimized by the deterministic MILP (in Section 5.1.2) (purple) • (Figure panels: Dynamic Power (logic simulation); Subthreshold Leakage (Spice simulation))
Results of MILP1: Leakage reduction and performance tradeoff (27℃, 70nm)
Results of MILP1: Leakage, Dynamic and Total Power Comparison (90℃, 70nm)
Results of MILP2: Comparison of nominal leakage power saving due to statistical modeling with two different timing yields (η)
Statistical Dual-threshold Assignment • The leakage in high-Vth gates is less sensitive to process variation. • The higher the percentage of high-Vth gates in a circuit, the narrower the leakage power distribution (standard deviation) and the lower the average leakage power (mean). • For global process variation, all gate delays have the same percentage of variation and do not affect the constraints in the MILP, which means the dual-threshold assignment will remain the same. • Subthreshold leakage is most sensitive to the Leff variation. • So, we only simulate the leakage distribution of all statistically optimized circuits with local Leff variation (3σ = 15%) by Spice. • To analyze the leakage distribution under process variation in the deterministic method, we considered the worst case, which is too pessimistic.
Results of MILP2: Leakage Power Distribution of Optimized Dual-Vth C7552 • The mean and standard deviation of leakage power are reduced by the statistical method.
Results of MILP2: Comparison of leakage power distribution with two different timing yields (η)
Results of MILP2: Comparison of mean of three leakage power distributions • (Figure: Mean (nW))
Results of MILP2: Comparison of standard deviation of three leakage power distributions • (Figure: Standard Deviation (nW))
Conclusion • A new mixed integer linear programming technique • Simultaneous minimization of leakage (dual-Vth) and elimination of glitches (path delay balancing). • Global tradeoff between power and performance. • Experimental results show 96%, 28% and 64% reductions in leakage, dynamic (glitch) and total power, respectively, for C7552. • A second mixed integer linear programming formulation • Statistically minimizes the leakage power in a dual-Vth process under process variations. • Experimental results show that 30% more leakage power reduction can be achieved by using this statistical approach. • The mean and standard deviation of the leakage power distribution are both reduced when a small yield loss is permitted.
Conclusion (cont.) • A third mixed integer linear programming formulation • Statistically minimize the total power, the leakage or the dynamic power in a dual-Vth process under process variations • The effect of process variation on glitch power is minimized.
Future Work • Gate leakage • MILP complexity • For an SOC, MILP constraints can be generated for its submodules at a lower level • This may not guarantee a global optimum, but would still give a reasonable result within acceptable run time. • Adopt a relaxed LP: use the LP solution as the starting point and then round off the variables • An approximately optimal solution can be achieved with acceptable run time.
Future Work (Cont.) • Iterative MILP for dual-Vth design • (Figure: three-gate path between flip-flops with an 8ns limit; LVT design gate delays 2 + 2 + 3 = 7ns; dual-Vth design gate delays 2 + 3 + 3.2 = 8.2ns) • Timing violations were found • The interdependency of delays of gates was neglected for simplicity in our MILP formulation. • If any timing violation is found, the new delays for all LVT cells are extracted from the current dual-Vth design and the MILP formulation is updated correspondingly. • A different optimal solution is then given by the CPLEX solver with fewer timing violations. • We continue iterations until all timing violations are eliminated.