Attacking the Power-Wall by Using Near-threshold Cores Liang Wang liang@cs.virginia.edu
Power Wall • The end of classical scaling. • Vdd: almost constant • Power density: increases roughly exponentially • Utilization: decreases roughly exponentially • We can fabricate more cores than we can power up: dark silicon. * From Venkatesh et al., ASPLOS’10 Liang Wang, ECE6332 Final
Near-threshold Cores (NVt. Cores) • Pros • Low power per core. • More cores per chip. • Limitations • Low per-core frequency, which reduces the throughput gains from parallelization. • Variations, which harm performance and functionality. Will NVt. cores be a viable solution to push down the power wall?
Outline • Performance Model • Analyses and Results • Conclusion
System Modeling • Symmetric multi-core system: area A, power P. • A single core at supply voltage v: core area a, power p(v) (dynamic power plus static power), frequency f(v). • Frequency is fitted to circuit simulation. • Number of active cores. • Application modeled with Amdahl’s Law and a parallel ratio.
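The model on this slide can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the author's implementation: the function names, the symbol `alpha` for the parallel ratio, the rule that the number of active cores is the smaller of the area limit and the power limit, and the assumption that per-core performance is proportional to frequency are all illustrative.

```python
def active_cores(chip_area, chip_power, core_area, core_power):
    """Cores that can run at once under an area budget A and a power budget P.

    Assumption: the count is the smaller of "how many fit on the chip"
    and "how many can be powered at the current voltage".
    """
    fit_on_chip = int(chip_area // core_area)
    can_power = int(chip_power // core_power)
    return min(fit_on_chip, can_power)


def amdahl_speedup(alpha, n_cores, rel_freq=1.0):
    """Amdahl's Law speedup for parallel ratio alpha on n_cores cores.

    rel_freq is the per-core frequency relative to the baseline core
    (assumption: per-core performance scales linearly with frequency).
    """
    return rel_freq / ((1.0 - alpha) + alpha / n_cores)
```

Lowering the voltage reduces `core_power`, which raises the number of active cores, but it also lowers `rel_freq`; the trade-off between the two is what the later slides explore.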
Simulation Setup • Circuit • A single inverter • Ripple-carry adder (32, 16, 8, and 4 bits) • Technology Library • A modified version of the Predictive Technology Model (PTM) • Technology Nodes • 45nm, 32nm, 22nm, 16nm • Process Variants • HKMGS: High-performance process with High-K Metal Gate and Stress effect. • LP: Low-power process. • CAD Tools • RC Compiler • Spectre driven by Ocean
Voltage-Frequency Scaling • LP has a much larger frequency drop than HP for the same change in Vdd. • 16nm has a larger frequency drop than 45nm for the same change in Vdd. (Plot annotations: ~8x, ~400x, ~15x, ~103x)
Design Space Exploration (Area) • Setup: 45nm, HKMGS, IO cores, 100W, parallel ratio = 0.99. • Peak is capped by total area. (Plot annotations: 2x; peak from 200 to 6.4K; saturating)
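A design-space sweep of this kind might look like the sketch below. It is illustrative only: `sweep_vdd`, `best_point`, and the caller-supplied per-core models `freq(v)` and `power(v)` are hypothetical names, and the slides obtain the real frequency and power curves from circuit simulation rather than from closed-form functions.

```python
def sweep_vdd(voltages, freq, power, core_area, chip_area, chip_power, alpha):
    """Evaluate Amdahl speedup at each candidate supply voltage.

    freq(v) and power(v) are hypothetical per-core models supplied by the
    caller. Returns a list of (v, active_cores, speedup) tuples.
    """
    max_by_area = int(chip_area // core_area)   # area cap on core count
    f_nominal = freq(max(voltages))             # baseline: highest voltage
    results = []
    for v in voltages:
        max_by_power = int(chip_power // power(v))  # power cap at this voltage
        n = min(max_by_area, max_by_power)
        if n < 1:
            continue  # not even one core fits the power budget
        rel_freq = freq(v) / f_nominal
        speedup = rel_freq / ((1.0 - alpha) + alpha / n)
        results.append((v, n, speedup))
    return results


def best_point(results):
    """Return the (v, active_cores, speedup) entry with the highest speedup."""
    return max(results, key=lambda r: r[2])
```

Once the voltage is low enough that `max_by_power` exceeds `max_by_area`, the core count stops growing while the frequency keeps falling, which is one way to read the "peak is capped by total area" observation.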
Cross-technology Study (Plot panels: 500mm², 80W; 400mm², 100W)
Comparison with Dark Silicon • NVt. cores alleviate the issue of low utilization. • NVt. cores have better performance (up to 2x). (Plot: 500mm², 80W, HKMGS; axis label: available cores on-chip)
Variation • NVt. cores are very sensitive to variations • Functionality (ratioed circuits). • Performance (the focus of this project). • Monte-Carlo simulation • Performed at every Vdd setting • 100 iterations per Vdd • Process and mismatch variation
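The Monte-Carlo procedure above can be illustrated with a small stand-in sketch. It is not the Spectre flow used in the project: here variation is reduced to a Gaussian spread on a single threshold-voltage parameter, and `monte_carlo_slowdown`, `freq(vdd, vth)`, `vth_nominal`, and `vth_sigma` are hypothetical names introduced only for illustration.

```python
import random


def monte_carlo_slowdown(vdds, freq, vth_nominal, vth_sigma,
                         iterations=100, seed=0):
    """Worst-case slowdown per supply voltage under sampled variation.

    freq(vdd, vth) is a hypothetical caller-supplied frequency model that
    must return a positive value. For each Vdd, the threshold voltage is
    sampled `iterations` times from a Gaussian (an assumption standing in
    for process and mismatch variation), and the slowest sample is
    compared against the nominal device.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    slowdown = {}
    for vdd in vdds:
        f_nominal = freq(vdd, vth_nominal)
        samples = [freq(vdd, rng.gauss(vth_nominal, vth_sigma))
                   for _ in range(iterations)]
        slowdown[vdd] = f_nominal / min(samples)
    return slowdown
```

With any model in which frequency depends on the gap between Vdd and the threshold voltage, the same threshold spread costs proportionally more at low Vdd, which matches the sensitivity noted on the slide.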
Voltage-Frequency Scaling Revisited • HKMGS • Up to 5x slowdown • LP • Up to 10x slowdown • HKMGS • Up to 10x slowdown • LP • Up to 100x slowdown
Impact of Variation • Setup: 400mm², 100W, IO cores. (Plot annotations: worse performance; lower utilization; flatten Vdd)
Conclusion • In terms of performance • Simple core (IO) is better. • HP process (HKMGS) is better. • Lowering Vdd reduces dark silicon and improves throughput. • NVt. cores are vulnerable to process variation.