Designing a Processor from the Ground Up to Allow Voltage/Reliability Tradeoffs
Andrew Kahng (UCSD), Seokhyeong Kang (UCSD), Rakesh Kumar (Illinois), John Sartori (Illinois)
Timing Errors
• Power is a first-order design constraint
• Voltage scaling can significantly reduce power
• Voltage scaling may result in timing errors
• [Figure; axis: Operating Voltage]
Research Questions
• How does a conventional processor behave when we fix frequency and scale down voltage?
• How can we reduce the voltage at which timing errors are observed?
• Goal: reduce power while maintaining the same performance level
Limitation of Voltage Scaling
• At some voltage, the circuit breaks down
• Voltage scaling must halt after only 10% scaling
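For a rough sense of what a 10% limit leaves on the table, the sketch below uses the textbook first-order model in which dynamic power is proportional to V² · f. That model is an assumption brought in for illustration (it ignores leakage and is not taken from the slides), and the function names are hypothetical.

```python
def dynamic_power_ratio(voltage_scale: float, frequency_scale: float = 1.0) -> float:
    """Relative dynamic power under the first-order model P ~ C * V^2 * f.

    voltage_scale and frequency_scale are fractions of the nominal values,
    e.g. 0.9 means "scaled down by 10%". Leakage power is ignored.
    """
    return voltage_scale ** 2 * frequency_scale


def dynamic_power_saving(voltage_reduction: float) -> float:
    """Fraction of dynamic power saved at fixed frequency for a given voltage reduction."""
    return 1.0 - dynamic_power_ratio(1.0 - voltage_reduction)


if __name__ == "__main__":
    for reduction in (0.10, 0.20, 0.30):
        saving = dynamic_power_saving(reduction)
        print(f"voltage -{reduction:.0%} -> dynamic power -{saving:.1%}")
```

Under this simplified model, stopping at 10% voltage scaling with frequency fixed caps the dynamic power saving at about 19%.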
Limitation of Voltage Scaling
• What problems are caused by steep error degradation?
Problems with Steep Error Degradation
• Voltage scaling is limited in traditional designs.
Problems with Steep Error Degradation
• Traditional design: no power savings as error rate increases
• No reliability/power tradeoff
Problems with Steep Error Degradation
• With graceful degradation instead, reliability/power tradeoffs are enabled
• Allows switching between error tolerance techniques at different voltages/error rates
• Higher error rate, lower power
• Why do circuits fail catastrophically?
Reason for Steep Error Degradation
• Critical paths are bunched up in traditional designs.
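A toy model can make the "bunched up" point concrete. The sketch below is illustrative only: it assumes path delay grows as 1/V when voltage is scaled, weights every path equally, and uses made-up path delays expressed as a fraction of the clock period. None of these numbers come from the slides.

```python
def failing_fraction(path_delays, voltage_scale):
    """Fraction of paths that miss timing after voltage scaling.

    path_delays: nominal delays as a fraction of the clock period (1.0 = period).
    voltage_scale: fraction of nominal voltage (0.9 = scaled down by 10%).
    Assumption: delay stretches as 1 / voltage_scale (crude first-order model).
    """
    stretched = [d / voltage_scale for d in path_delays]
    return sum(1 for d in stretched if d > 1.0) / len(path_delays)


# Hypothetical designs with similar worst-case delay but different slack spread.
bunched = [0.90 + 0.002 * i for i in range(10)]  # delays from 0.900 to 0.918
gradual = [0.50 + 0.045 * i for i in range(10)]  # delays from 0.500 to 0.905

if __name__ == "__main__":
    for v in (0.93, 0.89, 0.80):
        print(v, failing_fraction(bunched, v), failing_fraction(gradual, v))
```

In this toy setup the bunched design goes from no failing paths to all paths failing within a few percent of voltage, while the gradual design loses paths a few at a time.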
Question…
• How can we change the slack distribution to achieve a graceful failure characteristic?
Design Objectives and Insight
• Power-optimized design: reclaim excess timing slack
• Slack-optimized design: optimize critical paths
• Make the slack distribution gradual by re-distributing slack between paths
• Optimize frequently exercised critical paths
• De-optimize rarely exercised paths
• Both gradual failure and low power can be achieved
Slack Re-distribution Example
• [Figure: paths with negative and positive slack, shown for Error Rate = 1% and Error Rate = 25%]
Proposed Design Flow
• Input: RTL description
• Output: Gradual slack design
• Objective: Minimize voltage for a given error rate over a range of error rates
• Flow stages: Voltage Scaling, Path Optimization, Area Reduction
Iterative Optimization
• Iterative optimization avoids unnecessary swaps.
• Using a fixed target results in over-optimization.
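The loop of scaling voltage, fixing only the paths that now fail, and repeating can be sketched as follows. This is a minimal illustration, not the authors' tool: the 1/V delay model, the fixed 5% delay gain per swap, the toggle-rate-weighted error estimate, and all names are assumptions.

```python
def slack_at(delay, voltage_scale, clock=1.0):
    # Assumption: path delay grows as 1 / voltage_scale.
    return clock - delay / voltage_scale


def error_rate(delays, toggles, voltage_scale):
    # Crude estimate: sum of toggle rates of paths with negative slack.
    return sum(t for d, t in zip(delays, toggles) if slack_at(d, voltage_scale) < 0)


def iterative_flow(delays, toggles, target, v_step=0.02, v_min=0.7,
                   speedup=0.95, max_swaps=20):
    """Lower voltage step by step, upsizing only the paths that fail at each step."""
    delays = list(delays)
    v, swaps = 1.0, 0
    while v - v_step >= v_min:
        trial_v = v - v_step
        trial, trial_swaps = list(delays), swaps
        while error_rate(trial, toggles, trial_v) > target and trial_swaps < max_swaps:
            failing = [i for i, d in enumerate(trial) if slack_at(d, trial_v) < 0]
            if not failing:
                break
            # Fix the failing path that contributes most to the error rate.
            worst = max(failing, key=lambda k: toggles[k])
            trial[worst] *= speedup  # hypothetical effect of one cell upsizing
            trial_swaps += 1
        if error_rate(trial, toggles, trial_v) > target:
            break  # cannot meet the target at this voltage; keep the last good point
        delays, swaps, v = trial, trial_swaps, trial_v
    return v, swaps, delays


if __name__ == "__main__":
    print(iterative_flow([0.9, 0.5], [0.3, 0.0001], target=0.01))
```

Because each step only repairs paths that fail at the current trial voltage, no swap is spent on a path that a fixed, more aggressive target would have optimized needlessly.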
Design-level Methodology
• Library characterization: Cadence SignalStorm (Synopsys Liberty generation for each voltage)
• Functional simulation: Cadence NC Verilog (gate-level simulation)
• Slack optimization: C++ with Synopsys PrimeTime interface
• ECO P&R: Cadence SOCEncounter (placement and routing)
• Benchmark generation: Virtutech Simics (test vector generation)
Gradual Slack Distribution
• Slack optimization achieves a gradual slack distribution.
Processor Module Optimization
• The slack-optimized design has the lowest power for all error rates.
Processor Error Rate and Power
• Alternative designs with comparable error rates have much higher power/area overheads.
Reliability/Power Tradeoff
• The slack-optimized design enjoys continued power reduction as error rate increases.
Enhancing Razor-based Design
• Slack optimization extends the range of voltage scaling and reduces Razor recovery cost.
Summary and Conclusion
• Showed limitations of traditional processors w.r.t. voltage scaling: traditional designs break down
• Presented a design technique that enables voltage/reliability tradeoffs: optimize frequently exercised critical paths and de-optimize rarely exercised paths
• Demonstrated significant power benefits of gradual slack design: power reduced by 29% at a 2% error rate and by 27% on average
Slack Optimization Techniques
• Path Optimization and Power Reduction
Extended Voltage Scaling
• Focus on frequently exercised negative slack paths
• Reduce error rate while minimizing cell swaps (power overhead)
• Swap: rank paths by error rate contribution, then upsize cells in those paths to increase slack
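A minimal sketch of the ranking-and-upsizing idea follows. It is not the authors' implementation: the `Path` record, the use of toggle rate as the error rate contribution, and the fixed slack gain per swap are all hypothetical simplifications.

```python
from dataclasses import dataclass


@dataclass
class Path:
    name: str
    slack: float        # slack at the target voltage; negative means a violation
    toggle_rate: float  # fraction of cycles in which the path is exercised


def estimated_error_rate(paths):
    # Crude estimate: toggle rates of negative-slack paths, summed and capped at 1.
    return min(1.0, sum(p.toggle_rate for p in paths if p.slack < 0))


def upsize_until(paths, target_error_rate, slack_gain=0.05, max_swaps=100):
    """Greedily upsize the failing path with the largest error contribution.

    Stops as soon as the estimated error rate meets the target, so rarely
    exercised failing paths are left alone and cost no extra power.
    Returns the names of the paths that were swapped, in order.
    """
    swaps = []
    while estimated_error_rate(paths) > target_error_rate and len(swaps) < max_swaps:
        failing = [p for p in paths if p.slack < 0]
        if not failing:
            break
        worst = max(failing, key=lambda p: p.toggle_rate)
        worst.slack += slack_gain  # hypothetical slack gained by one upsizing swap
        swaps.append(worst.name)
    return swaps


if __name__ == "__main__":
    design = [Path("a", -0.04, 0.20), Path("b", -0.08, 0.01), Path("c", 0.10, 0.30)]
    print(upsize_until(design, target_error_rate=0.02), estimated_error_rate(design))
```

In the example, only path `a` is upsized: path `b` still violates timing, but it is exercised so rarely that fixing it is not needed to meet the 2% target.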
Power Reduction
• Downsize cells on rarely exercised paths
• Reduce leakage power while leaving error rate unaffected
• Swap: check path slack; toggle rate ≈ 0
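One plausible reading of the swap check is sketched below: a cell may be downsized if every path through it either is essentially never exercised or still meets timing after the swap. This interpretation, the `PathInfo` record, and the threshold values are assumptions for illustration, not details taken from the slides.

```python
from dataclasses import dataclass


@dataclass
class PathInfo:
    slack: float        # current slack of a path through the cell
    toggle_rate: float  # fraction of cycles in which the path is exercised


def can_downsize(paths_through_cell, slack_penalty, toggle_eps=1e-6):
    """Return True if downsizing the cell should leave the error rate unaffected.

    slack_penalty: hypothetical slack lost on each path when the cell is downsized.
    toggle_eps: threshold below which a path counts as "toggle rate ~ 0".
    """
    for path in paths_through_cell:
        rarely_exercised = path.toggle_rate <= toggle_eps
        still_meets_timing = path.slack - slack_penalty >= 0
        if not (rarely_exercised or still_meets_timing):
            return False
    return True


if __name__ == "__main__":
    idle_cell = [PathInfo(slack=0.01, toggle_rate=0.0), PathInfo(slack=0.20, toggle_rate=0.4)]
    busy_cell = [PathInfo(slack=0.01, toggle_rate=0.3)]
    print(can_downsize(idle_cell, 0.05), can_downsize(busy_cell, 0.05))
```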
Error Rate Forecasting
• Error rate contribution of one FF
• Error rate of design
Significance of Processor Power
• Power is a first-order design constraint
• Voltage scaling can significantly reduce power
• Voltage scaling: 50%; power reduction: 80%
DVFS Benefit and Cost
• How effective is voltage scaling when frequency is fixed?
Alternatives – Blueshift
• Goal: Optimize paths that cause errors to enable more frequency overscaling
• Techniques: PCT/OSB
• Uses an iterative simulation loop – infeasible for large designs
Alternatives – Tightly Constrained SP&R
• Goal: Optimize all paths aggressively
• Technique: Traditional SP&R with an aggressive target
• Some paths will not meet the tight constraint, and the slack distribution becomes more gradual
Insight
• Optimize frequently exercised paths at the expense of rarely exercised paths
• Optimizing frequently exercised paths enables deeper voltage scaling
• De-optimizing rarely exercised paths keeps power overhead low
Iterative Optimization Flow
• Scale voltage, optimize paths, iterate
A New Processor Design Goal
• Reshape the slack distribution so the processor fails gracefully
Moore’s Law
• Power consumption of a processor node doubles every 18 months.
Power Scaling
• With current design techniques, processor power will soon be on par with a nuclear power plant
Outline
• Background and Motivation
• Insight
• Power Reduction Techniques
• Design Flow
• Results
• Summary