Various Low-Power SoC Design Techniques Chong-Min Kyung KAIST
Contents • Introduction • Power Management using Voltage Island Technique • Energy (Power) Management Approach by ARM • Low Power Design Example with Samsung AP based on ARM 920T • IBM Low Power Design using PowerPC • Conclusions
Why Low Power? • Limited Battery Capacity (Mobile Devices) • For Minimal Heat Dissipation (Heat Sink, Cooler, System Size/Weight/Cost) • For Chip/System Reliability • Save Energy; it’s limited after all!
Power vs. Energy • Power-Critical Applications: heat dissipation requirement; power/ground metal line width; power/ground bounce due to IR drop • Energy-Critical Applications: battery lifetime; heat dissipation requirement
Applications for Low Power Technology • Medical: implantable hearing aid, cardiac pacemaker • Mobile devices: cellular phone • Military devices • Hard-to-access points: space • Too-many-to-access points: sensors/actuators in a ubiquitous world
Power Management using Voltage Island Technique
Typical Power Optimization Procedure [Flowchart; blocks: applications; functional partitioning; H/W description and synthesis; gate-level power optimization (using switching activity); initial layout (standard cell/wire place/route and layout); technology files; cell/interconnect delay and power modeling; constraints (delay, power, area, noise); parasitics (resistance, capacitance) and interconnects from layout; Vdd, Vt, Wg, Wint optimization; power-optimized netlist; customized layout (parameterized cell/wire design, place/route and layout); verification for min power, delay, area, noise (N: iterate, Y: optimized Vdd, Vt, Wg, Wint)]
Power Challenge • Active power density increasing with device scaling and increased frequency • Leakage power density increasing due to lower Vt and gate leakage • Stressing packaging, cooling, battery life, etc. • Complicates IDDq testing as well • Thinning gate oxides increase gate tunneling leakage (Source: Bergamaschi)
Low Power Levers • Dynamic Techniques • Clock gating • Data gating • Power gating • Variable frequency • Variable voltage supply • Variable device threshold • Structural Techniques • Voltage Islands • Multi-threshold devices • Multi-oxide devices • Minimize capacitance by custom design • Power efficient circuits • Parallelism in micro-architecture
Standby Mode Leakage Suppression • Disconnect inactive logic from supply in standby mode • Multi-threshold • use higher Vt header/footer • suppresses logic leakage • gate & sub-threshold • Multi-oxide • Use thick oxide header/footer • suppresses gate leakage • Header/footer gate voltage • Overdrive: increase freq. • under-drive: reduces leakage • Header/footer well bias • Forward bias : increase freq. • Reverse bias : reduce leakage • Voltage Islands
Standby Power Reduction Mechanism • On-chip supervisor manages standby power • Clock gating • Functional clock gating (fine clock control) • Voltage scaling, shutdown • SoC latch save/restore • Timeout and interrupt driven [Block diagram: DC/DC supplies with select and shutdown controls; 1.0-1.8 V scalable VDD domain containing the SoC logic, LSSD latches, and scan chains; 3.3 V I/O; battery-backed domain with RTC, suspend control logic, scan control logic, and reset logic; serial NVRAM behind an IIC controller; signals: system clocks, freeze I/O, freeze clock, IRQ, PG, wake, reset]
Voltage Island Concept • Trade off power for delay by running functional blocks at different voltages • Can use a mix of low and high Vt to balance performance and leakage • Switch off inactive blocks to reduce leakage power • Requires IP standards for power management, clock gating, etc. • E.g.: telecom ASIC with 1.0/1.2 V islands saved 16% active power and 50% standby power [Figure: IP1 and IP2 (low-Vt logic) supplied from Vdd1 and Vdd2 through switches under a power management unit, with Vddo as a further supply. Plot: Δdelay (ps, 0 to 30) versus voltage (Vdd, 0.7 to 1.3 V) for standard Vt and low Vt] (Source: Bergamaschi)
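A rough feel for the island trade-off can be had from the first-order relation that active power scales with the square of the supply voltage at a fixed frequency. The sketch below is only an illustration under that assumption; the function name and the 50% split are hypothetical, and the slides do not say how the telecom ASIC's logic was divided between its islands.

```python
def island_active_power_saving(low_fraction, v_high, v_low):
    """Fractional active-power saving when `low_fraction` of the switched
    capacitance moves from v_high to v_low at unchanged clock frequency.

    Assumes P_active ~ C * V^2 * f and ignores level shifters and leakage.
    """
    if not 0.0 <= low_fraction <= 1.0:
        raise ValueError("low_fraction must be between 0 and 1")
    per_block_saving = 1.0 - (v_low / v_high) ** 2
    return low_fraction * per_block_saving


# Hypothetical split: half of the switched capacitance sits in the 1.0 V island.
saving = island_active_power_saving(0.5, v_high=1.2, v_low=1.0)
print(f"active power saving: {saving:.1%}")
```

Under these assumptions a block moved from 1.2 V to 1.0 V draws about 31% less active power, so the chip-level saving depends mainly on how much of the switching activity can tolerate the slower island.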
Power Management Unit [Block diagram; blocks: bus interfaces; DC/DC converter; reconfigurable register units; control performance unit; well-bias generator; clock generator; clock control unit; power management state machine; clock & power-gating; monitor unit; device performance monitor; thermal monitor; timer/counter; power control unit; IP core interfaces]
Busses with Different Voltages • One clock and one signaling voltage per bus • Some approaches: • Temporarily scaling V and F for communication • Separating different voltages with bridges [Figure: a hot bus, a cool bus, and a cold bus connected through bridges]
Power Management • Independently controlled domain power switches • Multiple on-chip voltage islands • On-chip voltage regulators [Floorplan figure: memory arrays on Vdd 3 (low-Vt device arrays, optimized for low active power); memory arrays on Vdd 4 (high-Vt device arrays, optimized for low active power); ROM on Vdd 1; analog on Vdd 5; RLM 1, RLM 2, RLM 3; microcontroller on Vdd 2; DSP on Vdd 2; monitor logic on Vdd 4; I/Os, voltage regulators, and ground]
Functional Partitioning • Identifying functional components with similar inactive periods • Assigning functional components to possible chip-level power sources capable of providing required voltage level • Identifying the optimal grouping of components, based upon power sequencing (affects static power) and operating voltage (affects active power) that minimizes chip power within the limits (such as peak power) of the SoC • Identifying or creating, and connecting, logic signals that will be used to control power-sequencing circuitry or control clock gates • Connecting alternate voltage sources to latches or arrays used to save state across power sequencing
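The first partitioning step, finding components with similar inactive periods that can also share a supply, can be sketched as a simple grouping. This is a minimal illustration and not the flow used in the slides: it treats "similar" as "identical", the block names and period labels are hypothetical, and it ignores peak-power limits and sequencing order.

```python
from collections import defaultdict


def group_into_islands(components):
    """Group components that share a supply voltage and the same inactive periods.

    `components` is an iterable of (name, voltage, inactive_periods) tuples,
    where inactive_periods is an iterable of hashable labels.
    """
    islands = defaultdict(list)
    for name, voltage, inactive_periods in components:
        key = (voltage, frozenset(inactive_periods))
        islands[key].append(name)
    return dict(islands)


# Hypothetical blocks, not taken from the slides.
blocks = [
    ("dsp", 1.0, {"standby", "voice_idle"}),
    ("video", 1.0, {"standby", "voice_idle"}),
    ("cpu", 1.2, {"standby"}),
    ("rtc", 1.2, set()),
]
islands = group_into_islands(blocks)
for (voltage, idle), names in islands.items():
    print(voltage, sorted(idle), names)
```

Each resulting group is a candidate island: its members can be power-sequenced together (static power) and run from one supply (active power).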
Controlling VDD and VTH for low power • Software-hardware cooperation • Technology-circuit cooperation • MTCMOS: Multi-Threshold CMOS • VTCMOS: Variable-Threshold CMOS • Multiple: spatial assignment • Variable: temporal assignment
Dynamic power reduction • Through software-hardware cooperation • OS and application programming • If you don't need to hustle, relax and save power [Figure: normalized power ($P \propto fV^2$) versus required speed ($\propto f$), both axes from 0 to 1; the curve is super-linear. Block diagram: software passes the required speed to a controller, which sets the clock and VDD of the processor hardware]
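The super-linear shape of the power curve follows from the relation $P \propto fV^2$ on this slide. The short derivation below is a first-order sketch that assumes the supply voltage can be lowered in proportion to the clock frequency; $\alpha$ (activity factor) and $C$ (switched capacitance) are the usual textbook symbols and do not appear in the slides.

$$
P_{dyn} = \alpha C V_{DD}^{2} f, \qquad V_{DD} \propto f \;\Rightarrow\; P_{dyn} \propto f^{3}, \qquad E_{op} = \frac{P_{dyn}}{f} \propto V_{DD}^{2}
$$

Under that assumption, halving the required speed cuts power to roughly one eighth and energy per operation to roughly one quarter. Real designs save less because the voltage cannot drop below a minimum level and leakage does not scale the same way.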
Voltage Scaling Mechanism • Four power domains • On-chip supervisor for SoC voltage supplies • Level shifting and latching circuits at domain interfaces [Block diagram: battery feeding DC/DC supplies (select, shutdown; 3.3 V and 1.0-1.8 V outputs) and a linear regulator; constant 3.3 V I/O domain (drivers, receivers); voltage-scalable 1.0-1.8 V logic supply domain (CPU core, caches, I/O interface logic, memory interface, accelerators); persistent 1.8 V battery-backed domain (RTC logic, suspend control logic); regulated 1.0 V PLL supply domain]
Dynamic Voltage/Frequency Scaling • Frequency changed and Vdd dropped from 1.8 V to 1.0 V • PLL stays locked at 533 MHz while the CPU clock is switched from 266 MHz to 66 MHz and back to 266 MHz • The CPU continues to execute the Dhrystone benchmark throughout
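To put numbers on this transition, the sketch below applies the first-order dynamic power relation (power proportional to frequency times voltage squared) to the figures on the slide. Pairing 266 MHz with 1.8 V and 66 MHz with 1.0 V is an assumption, since the slide lists the voltage and frequency changes separately, and leakage is ignored.

```python
def dynamic_power_ratio(f_old_hz, v_old, f_new_hz, v_new):
    """Ratio of new to old dynamic power, assuming P ~ f * V^2."""
    return (f_new_hz / f_old_hz) * (v_new / v_old) ** 2


# Assumed pairing of the slide's operating points: 266 MHz @ 1.8 V -> 66 MHz @ 1.0 V.
power_ratio = dynamic_power_ratio(266e6, 1.8, 66e6, 1.0)
energy_per_cycle_ratio = (1.0 / 1.8) ** 2  # the frequency cancels out per cycle

print(f"dynamic power:    {power_ratio:.3f} of the original")
print(f"energy per cycle: {energy_per_cycle_ratio:.3f} of the original")
```

With these assumptions the slow point draws roughly 13 times less dynamic power, while each cycle of work costs about 31% of its original energy. The per-cycle figure is the one that matters for battery life when the same amount of work must still be done.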
Low Leakage Cells – Standby Power Reduction • Dual-Vt Storage Cells • Low Vt for high performance • High Vt for low leakage • Gated Vdd and DRG • Power Switch • Sub threshold leakage current dominates
Need for Energy Management • Today’s mobile consumers want: • longer battery life and • smaller, lighter products • Manufacturers are adding new features and applications to add product appeal: • media players (audio, video) • gaming • video capture • Increasing processing power requirements and longer battery life are conflicting requirements • Battery technology alone offers only incremental improvement over the next several years
Layers of power optimizations • Layers: software (OS, applications); system architecture; micro-architecture; circuits; ambient environment; Si conditions; power delivery • Important to optimize design at each level • ARM's partners have widely varying design-time, technology, legacy, and cost constraints • IEM: current focus on top two layers • Widely applicable dynamic power optimizations • Optimize for the requirements of the specific workload
Conventional Power Management • Conventional power management schemes manage the transitions between defined power states • STANDBY: off, but with state retained and clocks stopped • IDLE: a lower-power mode with a slow clock running • ON: fully powered up at maximum clock frequency • Despite the changing software workload, the system runs at maximum performance whenever there is any work to be done [State diagram: power manager with states ON, RESTART, IDLE, STANDBY]
Optimizing for utilization characteristics • Conventional power management optimizes power consumption when there is nothing to do (sleep modes). • IEM optimizes power when work is being done. • Only run fast enough to meet deadlines! • Running fast and idling wastes power. • The active- and sleep-mode techniques are orthogonal. [Figure: two plots of energy used against utilization (0% to 100%); the second is labeled Dynamic Voltage Scaling]
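The claim that running fast and then idling wastes power can be checked with a small energy comparison. The sketch below is an illustration only: it counts dynamic energy as proportional to voltage squared per cycle, assumes idle time costs nothing, and assumes the voltage can track frequency linearly down to a floor. All names and numbers are hypothetical except the 1.0 V to 1.8 V range, which appears elsewhere in the slides.

```python
def task_energy(cycles, voltage, c_eff=1.0):
    """Dynamic energy of a task: each cycle costs c_eff * V^2 (arbitrary units)."""
    return c_eff * voltage ** 2 * cycles


def dvs_voltage(cycles, deadline_s, f_max_hz, v_max, v_min):
    """Lowest voltage that still meets the deadline, assuming V scales
    linearly with frequency and cannot go below v_min."""
    f_required = cycles / deadline_s
    if f_required > f_max_hz:
        raise ValueError("deadline cannot be met even at full speed")
    return max(v_min, v_max * f_required / f_max_hz)


# Hypothetical workload: 50 M cycles due in 1 s on a 100 MHz, 1.8 V part.
cycles, deadline_s = 50e6, 1.0
f_max_hz, v_max, v_min = 100e6, 1.8, 1.0

e_fast = task_energy(cycles, v_max)  # run at full speed, then idle
e_dvs = task_energy(cycles, dvs_voltage(cycles, deadline_s, f_max_hz, v_max, v_min))
saving = 1.0 - e_dvs / e_fast
print(f"energy saved by stretching the task: {saving:.0%}")
```

Both schedules finish by the deadline and execute the same number of cycles; the saving comes entirely from the lower voltage, which is why it is orthogonal to whatever sleep mode is used in the remaining idle time.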
Meeting the Performance Requirement • Effective energy management requires: • Automatic performance prediction technology • Determining the lowest performance level that will get the software workload done just in time • Performance scaling technology • Delivering just enough performance to meet the current requirement • Responding rapidly to changing performance levels [Figure labels: scaling technology; performance prediction and monitoring; voltage scaling; threshold scaling]
Energy Management Control Components • Software component • To automatically predict future software workloads by interacting with instrumented operating systems and application software • To determine the software deadlines • To balance workload and deadlines with performance • Hardware component • To accurately measure the actual system performance • To independently manage the transitions of hardware scaling blocks, e.g., clock generators and power controllers • Together these components determine and manage the lowest performance level that gets the work done
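One simple way to picture the software component is a predictor that smooths recent utilization and then picks the lowest performance level that covers the prediction. The sketch below uses an exponential moving average, which is a stand-in chosen for illustration and not ARM's IEM policy; the function names, level set, and headroom are all hypothetical.

```python
def predict_utilization(history, alpha=0.5):
    """Exponential moving average of past utilization samples (0.0 to 1.0)."""
    if not history:
        raise ValueError("need at least one utilization sample")
    estimate = history[0]
    for sample in history[1:]:
        estimate = alpha * sample + (1.0 - alpha) * estimate
    return estimate


def choose_level(predicted, levels=(0.25, 0.5, 0.75, 1.0), headroom=0.1):
    """Lowest available performance level covering the prediction plus headroom."""
    target = min(1.0, predicted * (1.0 + headroom))
    for level in sorted(levels):
        if level >= target:
            return level
    return max(levels)


# Hypothetical utilization samples from an instrumented OS scheduler.
predicted = predict_utilization([0.2, 0.4, 0.4])
level = choose_level(predicted)
print(f"predicted utilization {predicted:.2f} -> run at {level:.0%} performance")
```

The chosen level would then be handed to the hardware component, which manages the actual clock and voltage transitions and reports the performance really achieved.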
Adaptive Voltage Scaling (AVS) • AVS is a closed loop control mechanism. • Feedback from the PMU indicates the earliest opportunity to change processor frequency based on the voltage levels being output to the SoC. • APC monitors the difference between the requested performance level and the actual level achieved. • Taking into account variations due to differences in process technology and ambient temperature the system dynamically changes the voltage applied. • The lowest energy consumption is achieved OR a specified performance level can be met.
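The closed-loop behaviour can be sketched as a controller that nudges the supply up when measured performance falls short of the request and down when there is spare margin. This is a toy model under stated assumptions: `toy_speed` is a made-up linear silicon model, and the step size, margin, and function names are hypothetical; only the 1.0 V to 1.8 V range comes from the slides.

```python
def avs_step(voltage, measured, requested, step=0.025, margin=0.05,
             v_min=1.0, v_max=1.8):
    """One control iteration: raise V if too slow, lower V if margin is excessive."""
    if measured < requested:
        voltage += step
    elif measured > requested * (1.0 + margin):
        voltage -= step
    return min(v_max, max(v_min, voltage))


def toy_speed(voltage, k=1.0, vth=0.4):
    """Hypothetical stand-in for the on-chip performance monitor."""
    return k * (voltage - vth)


requested = 0.8   # requested performance level (arbitrary units)
voltage = 1.8     # start at the top of the scalable range
for _ in range(100):
    voltage = avs_step(voltage, toy_speed(voltage), requested)

print(f"settled at {voltage:.3f} V, measured speed {toy_speed(voltage):.3f}")
```

Because the loop reacts to measured performance, a slower process corner or a hotter die (a different `k` or `vth` in this toy model) simply settles at a different voltage without any change to the controller.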
Limited Battery Improvement • Power Increase vs. Battery Improvement [ITRS 2001]
Year | 2001 | 2004 | 2007 | 2010 | 2013 | 2016
Feature size (nm) | 130 | 90 | 65 | 45 | 32 | 22
Dynamic power reduction (X) | 0 | 1.5 | 2.5 | 4.0 | 7.0 | 20
Stand-by power reduction (X) | 2 | 6 | 15 | 30 | 150 | 800
• Cellular phone: talk time about 12 hrs; standby about 1 month • Cellular phone: talk time 2 hrs to 4 hrs; standby about 1 week • Only 4~5X improvement in battery lifetime! [Figure: volumetric energy density (Whr/L, 200 to 800) versus gravimetric energy density (Whr/kg, 100 to 900) for Ni-MH, Li-ion/polymer, and fuel cell; axes annotated "Smaller" and "Lighter"]
Problem Statement • Power Analysis on CMOS Inverter
Problem Statement • Dynamic Power • Average Short Circuit Current • Sub-threshold Leakage Current
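The equations for these three components did not survive in the slide text. For orientation, the standard textbook forms for a CMOS inverter are given below; they are supplied here as an assumption about what the slide showed, with $\alpha$ the switching activity, $C_L$ the load capacitance, $n$ the sub-threshold slope factor, and $V_T = kT/q$ the thermal voltage.

$$
\begin{aligned}
P_{total} &= P_{dyn} + P_{sc} + P_{leak} \\
P_{dyn} &= \alpha\, C_L\, V_{DD}^{2}\, f \\
P_{sc} &= I_{sc,avg}\, V_{DD} \\
I_{sub} &= I_0 \exp\!\left(\frac{V_{GS} - V_{th}}{n V_T}\right)\left(1 - e^{-V_{DS}/V_T}\right), \qquad P_{leak} = I_{sub}\, V_{DD}
\end{aligned}
$$

The exponential dependence of $I_{sub}$ on $V_{th}$ is what makes leakage grow so quickly once thresholds are lowered to keep speed at reduced supply voltages.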
Problem Statement • Domination of Leakage Current
Item | Older processes | Newer processes
Feature size | > 0.25 um | 0.18/0.13/0.09 um ...
Performance (AP) | < 200 MHz | 300/400/533 MHz, 1 GHz
Core voltage | 5.0/3.3/2.5 V | 1.8/1.2/1.0 V ...
VTH (threshold) | > +/- 0.6 V | +/- 0.5, 0.4, 0.3 V ...
TR leakage | Negligible | Exponentially growing (SD/gate)
Stand-by mode | PLL-off (clock-off) | VTCMOS/MTCMOS, high VTH/high VDD
Low power | Focus on operating power | Focus on operating/stand-by power
Active and Leakage Power with CMOS Scaling • As CMOS scales down, the following stand-by leakage currents rise rapidly: • Source-to-drain leakage (diffusion + tunneling) as Lg scales down • Gate leakage current (tunneling) as Tox scales down • Body-to-drain leakage current (tunneling) as channel doping scales up
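Two cases of Leakage Mechanism [Figure: transistor cross-sections for the turn-off case (Vg = 0 V, Vd = Vdd) and the turn-on case (Vg = Vdd, Vd = 0 V); labeled mechanisms: sub-threshold leakage, source-to-drain tunneling, gate oxide tunneling, drain-to-body tunneling (BTB)]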
Gate Leakage Current Reduction with High-K Gate Dielectric [Figure: current density (A/cm²), log scale, versus Tox (Å, 20 to 40); curves for drain leakage, gate leakage, and gate leakage with a high-K gate dielectric]
Gate Leakage Current Reduction with High-K Gate Dielectric • As Tox scales down, gate leakage current increases exponentially, because tunneling probability rises exponentially as the physical tunneling distance shrinks. • A physically thicker gate dielectric gives lower leakage current but also lower oxide capacitance, which reduces on-current. • With a high-k (high dielectric constant) material, both a thicker physical layer and a higher oxide capacitance can be achieved. • With a high-k gate dielectric, gate leakage current can be several orders of magnitude lower at similar oxide capacitance.
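The trade-off between physical thickness and capacitance can be made concrete with the parallel-plate expression for gate capacitance per unit area and the usual equivalent oxide thickness (EOT). The relations below are standard; the example value $\kappa = 20$ is a hypothetical high-k material chosen only for the arithmetic, and $\kappa_{SiO_2} \approx 3.9$.

$$
C_{ox} = \frac{\kappa\, \varepsilon_0}{t_{phys}}, \qquad
EOT = t_{phys}\, \frac{\kappa_{SiO_2}}{\kappa}, \qquad
t_{phys} = 4\ \text{nm},\ \kappa = 20 \;\Rightarrow\; EOT = 4 \times \frac{3.9}{20} \approx 0.78\ \text{nm}
$$

In this example the carriers must tunnel through 4 nm of material while the gate still couples to the channel as strongly as a 0.78 nm SiO2 layer would, which is where the large leakage reduction at similar capacitance comes from.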
Power Saving vs. Abstraction Layers • System/algorithm/architecture levels have a large potential! [Figure: power saving versus abstraction layer over design time]
System Level Consideration for Low Power Design • Mobile device's behavior over time (operation time is less than 10%) • "Need various power modes in system" [Figure: timeline alternating idle/stand-by time with periodic wakeup and wakeup & operation]
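Why the idle portion matters can be seen from a duty-cycle average. The sketch below weights active and stand-by power by the fraction of time spent in each; the 10% duty cycle echoes the slide, while the power figures and function name are hypothetical.

```python
def average_power_mw(duty_cycle, p_active_mw, p_standby_mw):
    """Time-weighted average power for a device active `duty_cycle` of the time."""
    if not 0.0 <= duty_cycle <= 1.0:
        raise ValueError("duty_cycle must be between 0 and 1")
    return duty_cycle * p_active_mw + (1.0 - duty_cycle) * p_standby_mw


# Hypothetical figures: 200 mW when operating, 5 mW in stand-by, 10% operation time.
avg = average_power_mw(0.10, p_active_mw=200.0, p_standby_mw=5.0)
standby_share = (0.90 * 5.0) / avg
print(f"average power: {avg:.1f} mW, stand-by share: {standby_share:.0%}")
```

Even with stand-by power forty times lower than active power, stand-by accounts for nearly a fifth of the energy in this example, and its share grows as leakage rises or the duty cycle falls. That is the motivation for having several progressively deeper power modes.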
Power Management: Example
Mode | Description
General clock gating | Control the individual clock source for each IP block by switching the corresponding clock-source enable bit on or off
IDLE | Turn off the clock source to the CPU
STOP | Turn off all of the clock sources, including the external X-tal and internal PLLs
SLEEP | Turn off all of the clock sources and also the power supply for the internal logic, except for the wake-up logic circuitry
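The four modes form a strict ladder, each switching off everything the previous one did plus something more. A small lookup table captures that structure; the sketch below is only a model of the descriptions on this slide, and the enum, resource names, and helper function are hypothetical rather than register names from any Samsung part.

```python
from enum import Enum


class PowerMode(Enum):
    NORMAL = "normal"  # general clock gating: per-IP enable bits, nothing forced off
    IDLE = "idle"
    STOP = "stop"
    SLEEP = "sleep"


RESOURCES = ("cpu_clock", "ip_clocks", "xtal_pll", "logic_power", "wakeup_logic")

# Resources that each mode forces off, following the slide's descriptions.
FORCED_OFF = {
    PowerMode.NORMAL: set(),
    PowerMode.IDLE: {"cpu_clock"},
    PowerMode.STOP: {"cpu_clock", "ip_clocks", "xtal_pll"},
    PowerMode.SLEEP: {"cpu_clock", "ip_clocks", "xtal_pll", "logic_power"},
}


def is_on(mode, resource):
    """True if `resource` is still available in `mode`."""
    if resource not in RESOURCES:
        raise ValueError(f"unknown resource: {resource}")
    return resource not in FORCED_OFF[mode]


for mode in PowerMode:
    alive = [r for r in RESOURCES if is_on(mode, r)]
    print(f"{mode.name:<6} keeps: {', '.join(alive)}")
```

The wake-up logic stays on in every mode, which is what allows the deepest state to be left again.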
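Dynamic Voltage Scaling (DVS) • Reduction of stand-by power in a leaky process • By monitoring data bus congestion • By monitoring/guessing the performance needed for a specific application • Need to predict task execution time! • Power gain $\propto \Delta V^2$ [Figure: supply voltage versus time for a task, without DVS and with DVS; with DVS the task runs longer at a voltage lowered by $\Delta V$]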
Dynamic Voltage Scaling (DVS) • Stretch the execution by lowering the supply voltage • Quadratic Power saving • No later than the deadline • Processors supporting DVS • Intel Xscale • Transmeta Crusoe • DVS Algorithms • Can be implemented as HW or SW • Optimal solution in continuous voltage domain, but not in discrete voltage domain
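With only a few discrete operating points, a simple policy is to pick the lowest-voltage point that still finishes before the deadline. The sketch below illustrates that policy and why it falls short of the continuous optimum; it is not one of the published DVS algorithms, and the operating points and function name are hypothetical.

```python
def pick_operating_point(cycles, deadline_s, points):
    """Return the (frequency_hz, voltage) pair with the lowest voltage that
    completes `cycles` within `deadline_s`. Energy per cycle is taken as ~ V^2."""
    feasible = [(f, v) for f, v in points if cycles / f <= deadline_s]
    if not feasible:
        raise ValueError("no operating point meets the deadline")
    return min(feasible, key=lambda point: point[1])


# Hypothetical discrete operating points (Hz, V).
points = [(66e6, 1.0), (133e6, 1.3), (266e6, 1.8)]

freq, volt = pick_operating_point(cycles=100e6, deadline_s=1.0, points=points)
print(f"run at {freq / 1e6:.0f} MHz, {volt} V; finishes in {100e6 / freq:.2f} s")
```

Here the ideal continuous choice would be exactly 100 MHz, but the nearest usable point is 133 MHz, so the task finishes early at a higher voltage than strictly needed. That gap is the sense in which the continuous-domain optimum does not carry over directly to discrete voltage levels.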
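Voltage Scaling for Low Power • Low VDD gives low power: $P \propto V_{DD}^{2}$ • Low VDD gives low speed: $I_{ds} \propto (V_{DD} - V_{th})^{1 \sim 2}$ • Speed up with low Vth: $I_{ds} \propto (V_{DD} - V_{th})^{1 \sim 2}$ • Low Vth gives high leakage: $I_{leakage} \propto e^{-C \times V_{th}}$ • Leakage suppression is therefore needed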
Low-Leakage Solution - Technology [Figure: dynamic power (W) versus leakage power (W), log scales; VDD control moves between VDD = 1.5 V (high speed) and VDD = 1.0 V (low speed); VTH control moves between VTH = 0.25 V (high speed) and VTH = 0.5 V (low speed); MTCMOS is marked as a separate point]
VTCMOS & MTCMOS
Item | Multi-Threshold CMOS (MTCMOS) | Variable-Threshold CMOS (VTCMOS)
Schematic | Low-Vth logic between VDD and GND in series with a Hi-Vth switch driven by Sleep | Low-Vt logic with N-well bias Vpb = VDD or V+ and P-well bias Vnb = 0 or V-, set by a Vt control circuit
Principle | On-off control of internal VDD or VSS; special F/Fs, two Vth's | Threshold control with bulk bias; triple well is desirable
Merit | Low leakage in stand-by mode; conventional design environment | Low leakage in stand-by mode; conventional design environment
Demerit | Large serial MOSFET; ground bounce noise; ultra-low voltage region? (1 V) | Scalability? (junction leakage); TR reliability under 0.1 µm; latch-up immunity, Vth controllability, substrate noise, gate oxide reliability; gate leakage current
MTCMOS: Reduce Stand-by Power with High Speed • With a high-VTH switch, much lower leakage current flows between Vdd and Vss • The high-VTH MOSFET should have much lower (>10X) leakage current than the normal-VTH MOSFET [Figure: the same logic without a high-VTH switch (normal or low-VTH MOSFETs directly between Vdd and Vss) and with a high-VTH switch (MTCMOS), where the switch sits between a virtual ground and Vss]
Multi-Threshold CMOS (MTCMOS) • Mobile applications • Mostly in the idle state • Sub-threshold leakage current • Power gating • Low-VTH transistors for high-performance logic gates • High-VTH transistors for low-leakage current gates [Figure: logic component (low-Vth MOS) between VDD and virtual ground (VGND); current cutoff switch (high-Vth MOS) between VGND and VSS, driven by sleep control (SC); timing diagram of operating mode over time: active, sleep, active]
High Vt CCS Sizing • The effect of CCS (current cutoff switch) size • As the size decreases, logic performance also decreases. • As the size increases, leakage current and chip area also increase. • Proper sizing is very important. • CCS size should be chosen so that performance degradation stays within 2%. [Figure: low-Vt logic between VDD and a switch to GND driven by a control signal; $V_{op} = V_{DD} - \Delta V$, where $\Delta V$ is the voltage drop across the switch]
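The 2% rule can be turned into a sizing search: widen the switch until the voltage it drops no longer slows the logic by more than the limit. The sketch below is a rough model under stated assumptions, not the sizing method used for the Samsung design: gate delay follows the alpha-power law, delay proportional to V / (V - Vth)^alpha, the switch is a resistor inversely proportional to its width, and every parameter value is hypothetical.

```python
def gate_delay(v_op, vth=0.3, alpha=1.5):
    """Relative gate delay from the alpha-power law (arbitrary units)."""
    return v_op / (v_op - vth) ** alpha


def degradation(width_um, vdd=1.2, vth=0.3, i_peak_ma=10.0, r_unit_ohm_um=2000.0):
    """Fractional slowdown caused by the drop across a switch of given width."""
    r_switch = r_unit_ohm_um / width_um      # ohms; a wider switch has less resistance
    delta_v = i_peak_ma * 1e-3 * r_switch    # volts lost across the switch
    v_op = vdd - delta_v
    if v_op <= vth:
        return float("inf")                  # logic would not switch at all
    return gate_delay(v_op, vth) / gate_delay(vdd, vth) - 1.0


def min_switch_width(limit=0.02, start_um=50.0, step_um=10.0, max_um=100000.0):
    """Smallest width on the search grid that keeps degradation within `limit`."""
    width = start_um
    while width <= max_um:
        if degradation(width) <= limit:
            return width
        width += step_um
    raise ValueError("no width within the search range meets the limit")


width = min_switch_width()
print(f"switch width: {width:.0f} um, slowdown: {degradation(width):.2%}")
```

Stopping at the smallest width that meets the limit reflects the slide's point that anything larger only adds leakage and area without a performance benefit worth having.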