EE 587 SoC Design & Test Partha Pande School of EECS Washington State University pande@eecs.wsu.edu
Power & Low Power Design Physical Design Methodologies
Metric 1: Power • Ref. 5.9 of HJS • If a design is improved with respect to power but the circuit becomes slower, the change might not be acceptable • Comparing the power of two designs can be misleading • The lower-power design might simply be slower
Metric 2: Energy / Operation • Rather than looking at power, look at the total energy needed to complete some operation • This fixes the obvious problem with the power metric, since changing the operating frequency does not change the answer
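The difference between the two metrics can be illustrated with a minimal sketch. It assumes the usual switching-power model P = C·Vdd²·f and an operation that takes a fixed number of clock cycles; the capacitance, voltage, and cycle count are hypothetical values chosen only for illustration.

```python
def dynamic_power(c_eff, vdd, freq):
    """Average switching power in watts, assuming P = C_eff * Vdd^2 * f."""
    return c_eff * vdd ** 2 * freq


def energy_per_operation(c_eff, vdd, freq, cycles_per_op):
    """Energy in joules for one operation: power times the time the operation takes."""
    time_per_op = cycles_per_op / freq
    return dynamic_power(c_eff, vdd, freq) * time_per_op


# Hypothetical design parameters (not taken from the slides).
C_EFF = 1e-9        # effective switched capacitance per cycle, farads
VDD = 1.2           # supply voltage, volts
CYCLES_PER_OP = 100

power_fast = dynamic_power(C_EFF, VDD, 200e6)
power_slow = dynamic_power(C_EFF, VDD, 100e6)
energy_fast = energy_per_operation(C_EFF, VDD, 200e6, CYCLES_PER_OP)
energy_slow = energy_per_operation(C_EFF, VDD, 100e6, CYCLES_PER_OP)

print(power_fast / power_slow)     # power scales with clock frequency
print(energy_fast, energy_slow)    # energy per operation stays the same
```

Halving the clock halves the power reading without making the design any more efficient, which is why energy per operation is the fairer comparison in this model.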
Technology Optimization • Energy per transition is proportional to $V_{DD}^2$ • When the supply voltage approaches the threshold voltage, delay increases significantly
Technology Optimization • Modification of the threshold voltage • Lowering the threshold voltage allows the supply voltage to be reduced, but the saving is offset by an increase in leakage current
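The trade-off on these two slides can be sketched numerically. The models below are assumptions, not formulas from the slides: a long-channel delay approximation (delay proportional to Vdd / (Vdd - Vt)²), switching energy proportional to Vdd², and subthreshold leakage that grows by one decade for every `swing` volts of threshold reduction. All values are normalized and hypothetical.

```python
def relative_energy(vdd):
    """Switching energy per transition, up to a constant factor (C * Vdd^2)."""
    return vdd ** 2


def relative_delay(vdd, vt):
    """Gate delay up to a constant factor, assuming delay ~ Vdd / (Vdd - Vt)^2."""
    if vdd <= vt:
        raise ValueError("this model only applies for vdd > vt")
    return vdd / (vdd - vt) ** 2


def relative_leakage(vt, swing=0.1):
    """Subthreshold leakage up to a constant factor.

    Assumes one decade of current per `swing` volts (0.1 V/decade is a
    hypothetical round number).
    """
    return 10 ** (-vt / swing)


# Lowering Vdd toward Vt = 0.5 V saves energy but the delay blows up.
for vdd in (2.0, 1.0, 0.6):
    print(vdd, relative_energy(vdd), relative_delay(vdd, vt=0.5))

# Lowering Vt from 0.5 V to 0.3 V restores speed at low Vdd,
# but leakage grows by about two decades in this model.
print(relative_delay(0.6, vt=0.3) / relative_delay(0.6, vt=0.5))
print(relative_leakage(0.3) / relative_leakage(0.5))
```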
Transistor Sizing • Optimum transistor sizing • The first stage drives the gate capacitance of the second stage and the parasitic capacitance • The input gate capacitance of both stages is given by NCref, where Cref represents the gate capacitance of a MOS device with the smallest allowable (W/L)
Transistor Sizing • When there is no parasitic capacitance contribution (i.e., α = 0), the energy increases linearly with N, and using devices with the smallest (W/L) ratios results in the lowest power. • At high values of α, when parasitic capacitances begin to dominate over the gate capacitances, the power first decreases with increasing device size and then starts to increase, resulting in an optimal value for N. • The initial decrease in supply voltage made possible by the reduction in delay more than compensates for the increase in capacitance due to increasing N. • After some point the increase in capacitance dominates the achievable reduction in voltage, since the incremental speed increase from transistor sizing is very small • Minimum-sized devices should be used when the total load capacitance is not dominated by the interconnect
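The sizing argument above can be reproduced with a small numerical sketch. Several things here are assumptions rather than statements from the slides: α is taken to be the ratio of parasitic capacitance to Cref, the stage delay is modeled as (1 + α/N) · Vdd / (Vdd - Vt)², the supply is lowered until the delay matches the minimum-size design at a reference voltage, and the switched capacitance is (α + N) · Cref. The voltages are hypothetical.

```python
import math


def supply_for_fixed_delay(n, alpha, v_ref=5.0, vt=1.0):
    """Supply voltage that keeps the delay equal to the N = 1 design at v_ref.

    Assumed delay model: (1 + alpha / n) * V / (V - vt)^2.
    """
    base = v_ref / (v_ref - vt) ** 2
    target = base * (1 + alpha) / (1 + alpha / n)
    # Solve target * (V - vt)^2 = V for the root above vt.
    return (2 * target * vt + 1 + math.sqrt(4 * target * vt + 1)) / (2 * target)


def relative_energy(n, alpha, v_ref=5.0, vt=1.0):
    """Energy per transition in units of Cref: (alpha + n) * V^2."""
    v = supply_for_fixed_delay(n, alpha, v_ref, vt)
    return (alpha + n) * v ** 2


def best_size(alpha, n_max=64):
    """Sizing factor N in 1..n_max with the lowest energy in this model."""
    return min(range(1, n_max + 1), key=lambda n: relative_energy(n, alpha))


print(best_size(alpha=0))    # no parasitics: minimum size wins
print(best_size(alpha=20))   # parasitic-dominated: an interior optimum appears
```

With α = 0 the delay does not depend on N in this model, so the supply cannot be lowered and the energy simply grows with N. With a large α the energy first falls and then rises again, matching the qualitative behavior described on the slide.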
Power Dissipation in Interconnects • In the deep-submicron era, interconnect wires (and the associated driver and receiver circuits) are responsible for an ever increasing fraction of the energy consumption of an integrated circuit. • Most of this increase is due to global wires, such as busses and clock and timing signals. • More than 90% of the power dissipation of traditional FPGA components (over a wide range of applications) is due to the interconnect • For gate array and cell library based designs it has been found that the power consumption of wires and clock signals can be up to 40% and 50% of the total on-chip power consumption respectively.
Low-Swing Circuits: Conventional Level Converter • Extra power rail • Special low-Vt device needed
Dynamically-Enabled Drivers • The basic idea is to control the charging/discharging time of the drivers so that a desirable swing on the interconnect is obtained. • Wire is floating when the driver is disabled
Low Swing Bus • Power dissipated in an n-bit bus • Increasing the number of switching bits n causes a proportional increase in power dissipation
Low Swing Bus • The voltage swing can be reduced by using an additional bus wire, called the dummy ground • This dummy ground is initially discharged to the real ground level and then immediately isolated from the ground. • The charge of the bus wiring capacitance is discharged to the dummy ground instead of the real ground. • When n bits of the bus signals switch from "1" to "0," the voltage swing is reduced to
Low Swing Bus • The bus power dissipation required to switch n bits of the bus is given as • The voltage swing is further reduced as the number of switching bits increases
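The expressions on these slides did not survive extraction, so the following is only a sketch of the idea under an idealized charge-sharing assumption, not the slides' own formulas. It assumes n bus wires, each with capacitance `c_bus` and precharged to Vdd, share their charge with a dummy ground of capacitance `c_dummy` that starts at 0 V. The function names and capacitance values are hypothetical.

```python
def conventional_bus_energy(n, c_bus, vdd):
    """Supply energy to recharge n full-swing bus wires: n * C * Vdd^2."""
    return n * c_bus * vdd ** 2


def dummy_ground_swing(n, c_bus, c_dummy, vdd):
    """Voltage swing on each switching wire after ideal charge sharing.

    The n wires and the dummy ground settle at a common voltage, so each
    wire only drops by vdd * c_dummy / (n * c_bus + c_dummy).
    """
    return vdd * c_dummy / (n * c_bus + c_dummy)


def dummy_ground_bus_energy(n, c_bus, c_dummy, vdd):
    """Supply energy to pull the n wires back up by the reduced swing."""
    swing = dummy_ground_swing(n, c_bus, c_dummy, vdd)
    return n * c_bus * vdd * swing


C_BUS = 1e-12      # hypothetical wire capacitance, farads
C_DUMMY = 4e-12    # hypothetical dummy-ground capacitance, farads
VDD = 1.0

for n in (1, 4, 16):
    print(n,
          dummy_ground_swing(n, C_BUS, C_DUMMY, VDD),
          dummy_ground_bus_energy(n, C_BUS, C_DUMMY, VDD),
          conventional_bus_energy(n, C_BUS, VDD))
```

In this model the swing shrinks as more bits switch, which is consistent with the last bullet above, and the supply energy grows more slowly than the linear increase of a full-swing bus.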
SSDLC • Symmetric Source-Follower Driver with Level Converter • The driver limits the interconnect swing to the range from Vtn to Vdd-Vtn • Assume that node in2 goes from low to high, i.e., from Vtn to Vdd-Vtn. • Initially, node A sits at Vtn and node B sits at ground. • During the transition period, with both N3 and P3 conducting, A and B rise to Vdd-Vtn • Consequently, N2 is turned on, and out goes low. The feedback transistor P1 pulls A further up to Vdd to cut off P2 completely. in2 and B stay at Vdd-Vtn.
Low Power Through Circuit Design • Low-Power Logic Styles: CMOS Versus Pass-Transistor Logic by Zimmermann and Fichtner • Power savings through proper choice of logic styles • Switching Capacitance • Transition Activity • Short Circuit Currents • Power dissipation of various logic styles needs to be analyzed
Circuit Design Styles • Nonclocked Logic • CMOS, Pseudo-NMOS, Differential Cascode Voltage Switch (DCVS), Pass-Transistor • Clocked Logic • Domino, Differential Current Switch Logic (DCSL)
Complementary CMOS - Advantages • Simple monotonic gates can be realized very efficiently with only a few transistors, one signal inversion level, and few circuit nodes • Area, power, and delay are reduced • Robustness against voltage scaling and transistor sizing • Input signals are connected to gate inputs only
Complementary CMOS - Disadvantages • Large PMOS transistors • Area, power, and delay increase • Series transistors in the output stage • Weak output driving capability • Delay increases
Pseudo-NMOS Logic • Reduced complexity of logic and hence, lower capacitance, and faster speed • Ratioed Logic, better suited for large fan-in design • Static Current • Power Dissipation is high
Performance of Pseudo-nMOS • J. M. Rabaey, A. Chandrakasan, and B. Nikolić, Digital Integrated Circuits, Upper Saddle River, New Jersey: Pearson Education, 2003.
Negative Aspects of Pseudo-nMOS • Output 0 state is ratioed logic. • Faster gates mean higher static power. • Low static power means slow gates.
DCVS Logic • No static power dissipation • Speed advantage of ratioed logic • Has larger area and switched capacitances
Pass-Transistor Logic Styles • One pass-transistor network is sufficient to perform the logic operation • Smaller number of transistors, smaller input loads • Threshold Voltage Drop • Swing restoration circuit required • Multiplexer Structure • Dual-rail logic required
Complementary Pass-Transistor Logic (CPL) • Small input loads • Power and delay are reduced • Efficient XOR and MUX implementation • Good output drive • Cross-coupled pull-up • Large short-circuit current • Substantial number of nodes • Inefficient realization of simple gates
Double Pass-Transistor Logic (DPL) • Both PMOS and NMOS logic networks are used in parallel • Full swing on the output signals • Number of transistors and the number of nodes are quite high • Substantial capacitive load
Swing Restored Pass-Transistor Logic (SRPL) • Derived from CPL, Output inverters are cross-coupled to a latch structure • Swing restoration and output buffering at the same time • Transistor sizing is difficult, poor output driving capability • Slow switching • Large short-circuit current
Single-Rail Pass-Transistor Logic (LEAP) • Only a single NMOS network is required • Area, power, and delay decrease • Swing restoration only works for • Robustness at low voltages is not guaranteed
Comparisons between CMOS and Pass-Transistor • Pass-transistor logic is often claimed to be the low-power logic style • All the comparisons were based on full adder implementations • Not representative • Full adders have limited importance even in arithmetic circuits
Comparisons between CMOS and PL • CPL performs better than CMOS in the case of the full adder implementation • For multiplexers and other monotonic gates, CMOS outperforms the other styles • For XOR, CPL is faster, but its power-delay product is higher • CPL provides the best performance among all pass-transistor design styles
Domino Logic • Nonratioed logic – sizing of pMOS transistor is not important for output levels. • Higher Speed • Only implements noninverting logic gates • Best suited for large fan-in gates • Switching activity is high • Lower noise immunity • Large clock load
Logic Activity • Probability of a 0 → 1 transition: • Static CMOS: $p_0 p_1 = p_0(1 - p_0)$ • Dynamic CMOS: $p_0$ • Example: 2-input NOR gate with input probabilities $p_1 = 0.5$, giving output probabilities $p_1 = 0.25$ and $p_0 = 0.75$ • Static CMOS: $P_{dyn} = 0.1875\, C_L V_{DD}^2 f_{CK}$ • Dynamic CMOS: $P_{dyn} = 0.75\, C_L V_{DD}^2 f_{CK}$
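The NOR example can be checked with a short sketch. It assumes statistically independent inputs and, for the dynamic gate, an output that is precharged high so that a 0 → 1 transition occurs whenever the evaluated output was 0. The function names are illustrative only.

```python
def nor_output_p1(input_p1s):
    """Probability that a NOR output is 1: every input must be 0.

    Assumes the inputs are statistically independent.
    """
    prob = 1.0
    for p1 in input_p1s:
        prob *= 1.0 - p1
    return prob


def static_activity(p1_out):
    """0 -> 1 transition probability for static CMOS: p0 * p1."""
    return (1.0 - p1_out) * p1_out


def dynamic_activity(p1_out):
    """0 -> 1 transition probability for a precharged-high dynamic gate: p0."""
    return 1.0 - p1_out


p1 = nor_output_p1([0.5, 0.5])
print(p1)                    # output is 1 only when both inputs are 0
print(static_activity(p1))   # multiplies C_L * VDD^2 * f_CK for static CMOS
print(dynamic_activity(p1))  # multiplies C_L * VDD^2 * f_CK for dynamic CMOS
```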
Selecting a Logic Style • Static CMOS: most reliable and predictable, reasonable in power and speed; voltage scaling and device sizing are well understood. • Pass-transistor logic: beneficial for multiplexer- and XOR-dominated circuits such as adders. • For large fan-in gates, static CMOS is inefficient; a choice can be made between pseudo-nMOS, dynamic CMOS, and domino CMOS.