
Attacking the Power-Wall by Using Near-threshold Cores

Liang Wang, liang@cs.virginia.edu


Presentation Transcript


  1. Attacking the Power-Wall by Using Near-threshold Cores Liang Wang liang@cs.virginia.edu

  2. Power Wall
     • The end of classical scaling.
     • Vdd: almost constant
     • Power density: increases roughly exponentially
     • Utilization: decreases roughly exponentially
     • We can fabricate more cores than we can power up (dark silicon).
     * From Venkatesh et al., ASPLOS'10
     Liang Wang, ECE6332 Final

  3. Near-threshold Cores (NVt. Cores)
     • Pros
       • Low power per core.
       • More cores per chip.
     • Limitations
       • Low per-core frequency, which reduces the throughput gains from parallelization.
       • Variations, which are harmful for performance and functionality.
     Will NVt. cores be a viable solution to push down the power wall?

  4. Outline
     • Performance Model
     • Analyses and Results
     • Conclusion

  5. System Modeling
     • A single core at supply voltage v
       • Area: a
       • Power: p(v), made up of dynamic power and static power
       • Frequency: f(v), fitted to circuit simulation
     • Symmetric multi-core system
       • Area: A
       • Power: P
       • Number of active cores
     • Application modeled with Amdahl's Law, using a given parallel ratio
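The slide's equations did not survive in the transcript, so the following is only a minimal sketch of how a model of this kind is commonly set up, not the author's formulation. It assumes the number of active cores is limited by both the area budget (A / a) and the power budget (P / p(v)), and that throughput follows Amdahl's Law with per-core performance proportional to f(v). The function names and the `per_core_perf` parameter are hypothetical.

```python
def active_cores(total_area, total_power, core_area, core_power):
    """Cores that can run at once (assumption): the smaller of the
    area-limited count A / a and the power-limited count P / p(v)."""
    by_area = int(total_area // core_area)
    by_power = int(total_power // core_power)
    return min(by_area, by_power)


def amdahl_speedup(parallel_ratio, n_cores, per_core_perf=1.0):
    """Amdahl's Law speedup over one baseline core.

    per_core_perf is a hypothetical scale factor, e.g. f(v) / f(v_nominal),
    so a slower near-threshold core has per_core_perf < 1.
    """
    serial_time = (1.0 - parallel_ratio) / per_core_perf
    parallel_time = parallel_ratio / (n_cores * per_core_perf)
    return 1.0 / (serial_time + parallel_time)


# Illustrative numbers only (not taken from the slides).
n = active_cores(total_area=400, total_power=100, core_area=10, core_power=5)
print(n, amdahl_speedup(parallel_ratio=0.99, n_cores=n))
```

Lowering the supply voltage in such a model shrinks p(v), which raises the power-limited core count, but it also shrinks f(v), which lowers `per_core_perf`; the trade-off between the two is what the later slides explore.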

  6. Simulation Setup
     • Circuit
       • A single inverter
       • Ripple-carry adder (32-bit, 16-bit, 8-bit, and 4-bit)
     • Technology Library
       • A modified version of the Predictive Technology Model (PTM)
     • Technology Nodes
       • 45nm, 32nm, 22nm, 16nm
     • Process Variants
       • HKMGS: high-performance process with high-k metal gate and stress effect
       • LP: low-power process
     • CAD Tools
       • RC Compiler
       • Spectre driven by Ocean

  7. Voltage-Frequency Scaling
     • LP shows a much larger frequency drop than HP for the same change in Vdd.
     • 16nm shows a larger frequency drop than 45nm for the same change in Vdd.
     • Figure annotations: ~8x, ~400x, ~15x, ~103x

  8. Design Space Exploration (Area)
     • Setup: 45nm, HKMGS, IO cores, 100W, parallel ratio = 0.99
     • Figure annotations: "Peak is capped by total area", "2x", "Peak from 200 to 6.4K", "saturating"

  9. Cross-technology Study
     • Configurations: 500mm², 80W and 400mm², 100W

  10. Comparison with Dark Silicon
     • NVt. cores alleviate the issue of low utilization.
     • NVt. cores have better performance (up to 2x).
     • Figure labels: 500mm², 80W, HKMGS; available cores on-chip

  11. Variation
     • NVt. cores are very sensitive to variations
       • Functionality (ratioed circuits)
       • Performance (the focus of this project)
     • Monte Carlo simulation
       • Performed for every Vdd setting
       • 100 iterations per Vdd
       • Process and mismatch variation
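The Monte Carlo runs on this slide were done in a circuit simulator. As a rough illustration of the procedure only (100 samples per Vdd, then compare the slowest sample with the nominal case), here is a toy sketch. Everything in it is an assumption: the alpha-power-law frequency model, the constants `ALPHA`, `VTH_NOMINAL`, and `VTH_SIGMA`, and the function names are hypothetical stand-ins, and the alpha-power law is not accurate at or below threshold.

```python
import random

ALPHA = 1.3          # hypothetical velocity-saturation exponent
VTH_NOMINAL = 0.30   # hypothetical nominal threshold voltage (V)
VTH_SIGMA = 0.02     # hypothetical std. dev. of threshold voltage (V)


def frequency(vdd, vth):
    """Toy alpha-power-law frequency in arbitrary units (not the fitted f(v))."""
    overdrive = max(vdd - vth, 1e-3)  # clamp so the toy model stays defined
    return overdrive ** ALPHA / vdd


def worst_case_slowdown(vdd, iterations=100, seed=0):
    """Ratio of nominal frequency to the slowest of `iterations` samples."""
    rng = random.Random(seed)  # seeded so runs are repeatable
    nominal = frequency(vdd, VTH_NOMINAL)
    samples = [
        frequency(vdd, rng.gauss(VTH_NOMINAL, VTH_SIGMA))
        for _ in range(iterations)
    ]
    return nominal / min(samples)


for vdd in (0.5, 0.7, 1.0):
    print(f"Vdd={vdd:.1f} V: worst-case slowdown ~{worst_case_slowdown(vdd):.2f}x")
```

Even in this simplified model, the same threshold-voltage spread costs proportionally more frequency as Vdd approaches the threshold, which is the qualitative effect the slide attributes to near-threshold operation.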

  12. Voltage-Frequency Scaling Revisited
     • HKMGS: up to 5x slowdown
     • LP: up to 10x slowdown
     • HKMGS: up to 10x slowdown
     • LP: up to 100x slowdown

  13. Impact of Variation
     • Setup: 400mm², 100W, IO cores
     • Figure annotations: worse performance, lower utilization, flatten Vdd

  14. Conclusion
     • In terms of performance
       • A simple core (IO) is better.
       • The HP process (HKMGS) is better.
     • Lowering Vdd reduces dark silicon and improves throughput.
     • NVt. cores are vulnerable to process variation.
