Energy Recovery from High-frequency Clocks using DC-DC Converters
Mehdi Alimadadi, Samad Sheikhaei, Guy Lemieux, Shahriar Mirabbasi, William Dunford (University of British Columbia, Canada)
Patrick Palmer (University of Cambridge, UK)
Problem
Clock power in high-performance CPUs
• Cause
  • Charge big clock capacitor Cclk with energy
  • Discharge Cclk energy to GND (WASTE IT!!)
  • Repeat every clock cycle
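The charge-and-dump cycle above can be put into numbers with the standard dynamic-power relation P = C·V²·f. The sketch below is illustrative only: the 20 pF and 3 GHz values appear later in these slides, while the 1 V supply and the function name `clock_power` are assumptions made for the example.

```python
def clock_power(c_clk_farads, vdd_volts, f_clk_hz):
    """Power drawn from the supply when c_clk is charged to vdd and then
    fully discharged to GND once per clock cycle (P = C * V^2 * f)."""
    return c_clk_farads * vdd_volts ** 2 * f_clk_hz


# Assumed operating point: 20 pF clock load, 1 V supply, 3 GHz clock.
p = clock_power(20e-12, 1.0, 3e9)
print(f"{p * 1e3:.1f} mW")  # prints "60.0 mW"
```

Half of this energy is lost in the driver while charging Cclk; the other half is stored on Cclk and is the part that a conventional driver throws away on discharge.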
Primary Contribution of This Work
• Primary contribution
  • Discharge Cclk using DC-DC converter instead of GND
  • Use converter to power useful load (Rload)
  • Integrated clock drivers with DC-DC converters
  • Net savings in power
[Figure labels: voltage feedback (for regulation); useful load]
Summary Results
• Explore 3 main DC-DC power converter topologies
  • Buck converter: our previous work [ISSCC 2007]
  • Boost converter: this paper [ISVLSI 2008]
  • Buck-boost converter: this paper [ISVLSI 2008]
• 90nm layouts, 3GHz operation, < 0.3 mm²
Background – Typical Clocking Architecture
[Figure: clock source; Level 1 & Level 2 H-tree; Level 3 gaters & final drivers; final H-tree; bottom mesh]
Background – Typical Clocking Architecture
• Clock distribution
  • Majority of energy used by final drivers
• Levels 1, 2
  • H-trees
  • Tunable delays (CVDs) to eliminate skew
  • Low-swing, differential: low power, noise immunity
  • ~5W of power
• Level 3
  • Gaters reduce clock activity 50-85% (Power6)
  • Can't eliminate all activity: still need a clock to compute
• Final clock drivers
  • Full-rail swing; tapered inverters drive hundreds of latches: high power
  • H-tree with ends shorted by mesh: low skew, high power
  • ~15W to 40W of power
Background – Reducing Clock Power
• Clock distribution
  • Low-swing (differential) signals
    • Final drivers need full-rail
  • Resonant clocking (saves 80%)
    • Final drivers need square clock
• Final clock drivers
  • Adiabatic switching
    • Low-performance, < 100MHz
  • Double-edge clocking
    • Feasible, but complex flip-flops, larger loads
    • Compatible with energy recovery in this paper
Background – Switch Mode Power Supplies
• Basic DC-DC converter topologies
  • Buck
    • Step down
    • 0 ≤ Vout ≤ VDD
  • Boost
    • Step up
    • Vout ≥ VDD
  • Buck-boost
    • Negative step up/down
    • Vout ≤ 0
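The three output ranges above follow from the textbook ideal conversion ratios (lossless, continuous conduction). The sketch below uses those standard ratios, not the clock-driven 0th-order results quoted on the later boost and buck-boost slides; the function name `ideal_vout` and the topology strings are assumptions for illustration.

```python
def ideal_vout(topology, duty, vdd):
    """Textbook ideal output voltage for a duty cycle in [0, 1)."""
    if not 0.0 <= duty < 1.0:
        raise ValueError("duty cycle must satisfy 0 <= duty < 1")
    if topology == "buck":
        return duty * vdd                   # step down: 0 <= Vout <= VDD
    if topology == "boost":
        return vdd / (1.0 - duty)           # step up: Vout >= VDD
    if topology == "buck-boost":
        return -duty * vdd / (1.0 - duty)   # inverting: Vout <= 0
    raise ValueError(f"unknown topology: {topology}")


for name in ("buck", "boost", "buck-boost"):
    print(name, ideal_vout(name, 0.5, 1.0))
```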
Background – Switch Mode Power Supplies
• DC-DC buck converter
  • CMOS inverter as power switches
• Implementation of zero-voltage switching (ZVS)
  • Turn on NMOS when Vinv = 0
  • Turn on PMOS when Vinv = Vdd
Background – ISSCC 2007 Design
[Figure labels: ZVS delay circuit; integrated clock driver / power converter]
Integration of Clock and SMPS
• CPU clock: 3GHz clock and large Cclk
• SMPS: large Mp, Mn drive chain
Integration of Clock and SMPS
• Combine the driver circuits
Key Concept: Energy Recycling
• Benefits
  • Shared driver chain
  • Cclk added to SMPS
• Red path
  • NMOS drains Cclk: wastes charge!
• Blue path
  • Delay NMOS turn-on: recovers clock charge!
  • ZVS (zero voltage switching) in power electronics
ZVS Detailed Operation
• ZVS delay circuit D
  • Delay only rising edge of Vn
  • Implemented inside the clock chain
ZVS Detailed Operation (Mode 1)
• Mode 1 (0 < t < D·Tsw)
  • Mp is ON
  • Current builds up in the inductor
  • Cclk charges up
(D = duty cycle, Tsw = switching period)
ZVS Detailed Operation (Mode 2)
• Mode 2 (D·Tsw < t < D·Tsw + Tzvs)
  • Both power transistors are OFF
  • Inductor current discharges Cclk
  • Cclk charge is recycled to output load
(D = duty cycle, Tsw = switching period, Tzvs = ZVS delay)
ZVS Detailed Operation (Mode 3)
• Mode 3 (D·Tsw + Tzvs < t < Tsw)
  • Mn turns ON when Vclk ≈ 0
  • ZVS for Mn
  • Inductor current decreases linearly
(D = duty cycle, Tsw = switching period, Tzvs = ZVS delay)
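The three time windows above can be summarised as a small lookup. This is a minimal sketch of the timing intervals only, not a circuit simulation; the function name `zvs_mode` and the choice to assign each boundary instant to the later mode are assumptions, since the slides use strict inequalities.

```python
def zvs_mode(t, duty, t_sw, t_zvs):
    """Return the operating mode (1, 2 or 3) at time t.

    Mode 1: Mp on, inductor current builds, Cclk charges.
    Mode 2: both switches off, inductor current discharges Cclk.
    Mode 3: Mn on (zero-voltage switching), inductor current ramps down.
    """
    if t_sw <= 0 or not 0.0 < duty < 1.0 or t_zvs < 0:
        raise ValueError("need t_sw > 0, 0 < duty < 1, t_zvs >= 0")
    if duty * t_sw + t_zvs >= t_sw:
        raise ValueError("ZVS delay leaves no time for mode 3")
    phase = t % t_sw  # position inside the current switching period
    if phase < duty * t_sw:
        return 1
    if phase < duty * t_sw + t_zvs:
        return 2
    return 3


# Hypothetical timing in arbitrary units: period 100, duty 0.5, ZVS delay 10.
print([zvs_mode(t, 0.5, 100, 10) for t in (10, 55, 80, 110)])  # [1, 2, 3, 1]
```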
Detailed Operation
• ZVS delay circuit for Mn
  • Delay rising edge of Vn
Detailed Operation
• ZVS delay circuit for Mn
  • Falling edges of Vp and Vn are synchronized
Effective Efficiency
• How to measure power efficiency after clock drivers are integrated with DC-DC converters?
  • Converter gets "free energy" from clock
  • Effective efficiency: how efficient a regular (standalone) power converter must be to equal the efficiency of the integrated clock/power converter
[Slide labels: raw efficiency; effective efficiency]
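One way to turn the verbal definition above into arithmetic is sketched below. The formula is an assumption derived from that definition, not taken from the slides: a standalone converter plus a separate clock driver would draw Pout/η + Pclk, and setting this equal to the integrated circuit's input power gives η = Pout / (Pin − Pclk). The function names and the example power values are hypothetical.

```python
def raw_efficiency(p_out, p_in):
    """Output power over total input power of the integrated circuit."""
    return p_out / p_in


def effective_efficiency(p_out, p_in, p_clk):
    """Efficiency a standalone converter would need (assumed formula).

    p_clk is the power a separate, conventional clock driver would burn;
    the standalone converter is left with a budget of p_in - p_clk.
    """
    budget = p_in - p_clk
    if budget <= 0:
        raise ValueError("clock power meets or exceeds total input power")
    return p_out / budget


# Hypothetical powers in watts, chosen only to show the effect.
print(raw_efficiency(0.6, 1.0))             # 0.6
print(effective_efficiency(0.6, 1.0, 0.5))  # 1.2
```

Under this reading, an effective efficiency above 100% means no standalone converter could match the integrated design, because part of the output power comes from clock energy that would otherwise be discarded.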
Buck Converter – Simulation Results
• Open loop converter (no regulation)
• Higher efficiency at lowest duty cycle because only a fixed amount of energy is available from Cclk
ISSCC 2007
• 90nm test chip 1 mm², buck converter 0.27 mm²
Buck Converter – Chip Measurement vs. Simulation Results
[Figure panels: chip measurement; simulation (3GHz)]
ISVLSI 2008
New Design 1: Boost Converter
Boost Converter
• Basic operation
  • Vclk provides power & timing
• 0th order result: Vout = D/(1-D)*Vdd
Boost Converter – Simulation Results
• Open loop converter (no regulation)
• Higher efficiency at lowest duty cycle because only a fixed amount of energy is available from Cclk
ISVLSI 2008
New Design 2: Buck-boost Converter
Buck-boost Converter
• Basic operation
  • Vclk provides power & timing
• 0th order result: Vout = -D^2/(1-D)*Vdd
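The 0th-order results quoted on the boost and buck-boost slides can be evaluated directly. The sketch below assumes the flattened "D2" on the buck-boost slide is D squared; both expressions carry an extra factor of D compared with the textbook ratios, which is consistent with the clock node (average value D·Vdd) acting as the converter input. The function names are hypothetical.

```python
def clocked_boost_vout(duty, vdd):
    """0th-order boost output as quoted on the slide: D/(1-D) * Vdd."""
    if not 0.0 <= duty < 1.0:
        raise ValueError("duty cycle must satisfy 0 <= duty < 1")
    return duty / (1.0 - duty) * vdd


def clocked_buck_boost_vout(duty, vdd):
    """0th-order buck-boost output, reading 'D2' as D squared."""
    if not 0.0 <= duty < 1.0:
        raise ValueError("duty cycle must satisfy 0 <= duty < 1")
    return -(duty ** 2) / (1.0 - duty) * vdd


print(clocked_boost_vout(0.5, 1.0))       # 1.0
print(clocked_buck_boost_vout(0.5, 1.0))  # -0.5
```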
Buck-boost Converter
• Open loop converter (no regulation)
• Higher efficiency at lowest duty cycle because only a fixed amount of energy is available from Cclk
Summary Results
• 90nm layouts, 3GHz operation, < 0.3 mm²
Comparative Results
• IBM Power6
  • 100W @ 1V, 341 mm²
  • Cclk = 13 pF/mm²
• Other work: fully on-chip DC-DC buck converter
  • S. Abedinpour, B. Bakkaloglu, and S. Kiaei, "A Multi-Stage Interleaved Synchronous Buck Converter with Integrated Output Filter in a 0.18µm SiGe Process," ISSCC 2006, pp. 356–357
  • 27 mm², 45MHz
  • 65% power efficiency
• This work
  • 0.27, 0.26, 0.20 mm², including 0.1 mm² inductor area, 3GHz
  • Cclk 20pF, equivalent to 1.6 mm² of Power6 area
  • DC-DC converter adds 12.5% area overhead
  • LC filter: 310pH inductor, 350pF capacitor
  • L and C similar and dominate layout area: can stack to cut area in half
  • Buck: 75 – 185% effective power efficiency (50% recovered)
  • Boost: 25 – 110% effective power efficiency (20% recovered)
  • Buck-boost: 20 – 66% effective power efficiency (30% recovered)
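The area figures above can be cross-checked with a few lines of arithmetic. In the sketch below, pairing the 12.5% overhead with the smallest layout (0.20 mm²) and the rounded 1.6 mm² figure is an inference from the numbers, not something the slides state; the LC resonance formula f0 = 1/(2π√(LC)) is the standard one, and the variable names are assumptions.

```python
import math

CCLK_DENSITY_F_PER_MM2 = 13e-12  # Power6 clock capacitance density (slide)
CCLK_F = 20e-12                  # clock load driven in this work (slide)

# Power6 area whose clock load equals 20 pF: about 1.54 mm^2,
# which the slide rounds to 1.6 mm^2.
equiv_area_mm2 = CCLK_F / CCLK_DENSITY_F_PER_MM2

# Assumed pairing: smallest converter layout over the rounded area.
area_overhead = 0.20 / 1.6

# Resonant frequency of the 310 pH / 350 pF output filter.
f0_hz = 1.0 / (2.0 * math.pi * math.sqrt(310e-12 * 350e-12))

print(f"equivalent area: {equiv_area_mm2:.2f} mm^2")
print(f"area overhead:   {area_overhead:.1%}")
print(f"LC resonance:    {f0_hz / 1e9:.2f} GHz")
```

The filter resonance comes out near 0.48 GHz, well below the 3GHz switching frequency, which is what allows such small on-chip L and C values to filter the output.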
Conclusion
• Key concepts
  • High switching frequency: saves area
  • Combined drivers: saves area and switching loss
  • Recycled charge: converter load discharges Cclk
  • ZVS delay circuit: lower power loss
• Limitations
  • Regulation needs variable duty cycle clock
  • May introduce additional clock jitter
  • Mostly suitable for edge-triggered blocks (no latches)
• Future work
  • Lots of improvements to make!
Thank you! Questions?