This research focuses on reducing power consumption in a 32-bit adder circuit by implementing parallelism and lowering the supply voltage. The design and verification of both the standard and low-power versions were conducted, with significant power savings observed in the low-power model.
Power Reduction in a 32-bit Adder Using Parallelism and Reduced Supply Voltage. Clint Patterson, ELEC-6270, April 24, 2009
Why? • Power reduction is a critical design goal in modern chip design. • Power can be reduced through parallelism and a reduced supply voltage. • Reduced VDD means individual components operate with more delay, ideally N times the delay for N parallel components. • Parallel operation is necessary to maintain throughput.
Objectives • Design and verify a 32-bit synchronous adder circuit in VHDL • Design and verify a parallel 32-bit synchronous adder circuit (N=2) in VHDL • Determine voltage used for power analysis • Determine power savings
Standard Block Design • 64-bit input vector (2 x 32-bit operands) • 33-bit output (32-bit sum + 1 carry bit) • 1 cycle delay for result
Component Verification - Adder • …1010 + …0101 = …1111 (C = 0) • …1011 + …0101 = …0000 (C = 1) • …0001 + …1111 = …0000 (C = 1) • …0000 + …1111 = …1111 (C = 0)
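The four checks above can be reproduced with a small behavioural model. This is only a sketch in Python, not the VHDL used in the project. The full 32-bit operands are assumptions chosen to match the visible low bits of each vector (the slide elides the upper bits), and `add32` is a hypothetical helper name.

```python
MASK32 = 0xFFFFFFFF  # keeps a value within 32 bits


def add32(a: int, b: int) -> tuple[int, int]:
    """Return (32-bit sum, carry-out) for two unsigned 32-bit operands."""
    total = (a & MASK32) + (b & MASK32)
    return total & MASK32, total >> 32


# Assumed operands: the slide shows only the low four bits of each vector.
VECTORS = [
    (0xAAAAAAAA, 0x55555555),  # ...1010 + ...0101
    (0xAAAAAAAB, 0x55555555),  # ...1011 + ...0101
    (0x00000001, 0xFFFFFFFF),  # ...0001 + ...1111
    (0x00000000, 0xFFFFFFFF),  # ...0000 + ...1111
]

RESULTS = [add32(a, b) for a, b in VECTORS]

if __name__ == "__main__":
    for (a, b), (s, c) in zip(VECTORS, RESULTS):
        print(f"{a:08X} + {b:08X} = {s:08X} (C = {c})")
```

Under these assumed operands, the sums and carries agree with the four cases listed on the slide.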
Standard Design Verification • Vector 1 clocked in at cycle 1, result at cycle 2 • Vector 2 clocked in at cycle 3, result at cycle 4 • Vector 3 clocked in at cycle 4, result at cycle 5
Low-Power Design • Same external I/O as standard design • Each adder uses a divided down clock of F/2 • 2 cycle delay for result
Component Verification – MPC • Takes in clock of frequency F and outputs a divided down clock with frequency F/2 • Clock In – 100MHz • Clock Out – 50 MHz
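A divide-by-two clock like the one described above is commonly built from a flip-flop that toggles on every rising edge of the input. The Python sketch below models that behaviour on a sampled waveform. It is an assumption about how such a divider could work, not the project's actual MPC implementation, and the function names are hypothetical.

```python
def divide_by_two(clock_in: list[int]) -> list[int]:
    """Toggle the output on every rising edge of the sampled input clock."""
    out, prev, samples = 0, 0, []
    for level in clock_in:
        if prev == 0 and level == 1:  # rising edge on the input
            out ^= 1
        samples.append(out)
        prev = level
    return samples


def rising_edges(samples: list[int]) -> int:
    """Count 0 -> 1 transitions, treating the level before the first sample as 0."""
    previous = [0] + samples[:-1]
    return sum(1 for a, b in zip(previous, samples) if a == 0 and b == 1)


if __name__ == "__main__":
    fast = [1, 0] * 8  # eight periods of the input clock
    slow = divide_by_two(fast)
    print(rising_edges(fast), rising_edges(slow))
```

The output completes one period for every two input periods, which corresponds to the 100 MHz in, 50 MHz out relationship on the slide.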
Low-Power Design Verification • Vector 1 clocked in at cycle 1, result at cycle 3 • Vector 2 clocked in at cycle 3, result at cycle 5 • Vector 3 clocked in at cycle 4, result at cycle 6 • Vector 4 clocked in at cycle 5, result at cycle 7
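The input and result cycles reported for both designs are consistent with a simple scheduling model: with N adders each clocked at F/N, a vector is handed to one adder and its result appears N cycles later. The sketch below is a Python illustration of that idea. The rule that the adder is selected by `cycle % n_adders` is an assumption made for the example, and `schedule` is a hypothetical name.

```python
def schedule(input_cycles: list[int], n_adders: int = 2) -> list[tuple[int, int, int]]:
    """Map each input cycle to (input cycle, adder index, result cycle).

    Assumes each adder runs at F/n_adders, so it is occupied for n_adders
    cycles and the result latency is n_adders cycles.
    """
    busy_until = [0] * n_adders
    plan = []
    for cycle in input_cycles:
        adder = cycle % n_adders  # assumed selection rule (clock phase)
        if cycle < busy_until[adder]:
            raise ValueError(f"adder {adder} is still busy at cycle {cycle}")
        busy_until[adder] = cycle + n_adders
        plan.append((cycle, adder, cycle + n_adders))
    return plan


if __name__ == "__main__":
    print(schedule([1, 3, 4], n_adders=1))     # standard design
    print(schedule([1, 3, 4, 5], n_adders=2))  # low-power design
```

With `n_adders=1` the model gives results at cycles 2, 4, and 5. With `n_adders=2` it gives results at cycles 3, 5, 6, and 7, and back-to-back inputs at cycles 3, 4, and 5 alternate between the two adders, which is how throughput is maintained despite the slower clock.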
Next Steps • VHDL models were optimized in Leonardo Spectrum (Level 3) and converted to Verilog • Verilog files were converted to .myrutmod and then analyzed in Powersim • Null results for the low-power model (0.0000000…) • Segmentation faults • Must use Design Architect / Eldo for analysis • Use the Verilog files • Must determine appropriate voltages for simulation and analysis so that comparisons are meaningful.
Delay Calculation • Supply voltages were determined according to the formula: • F = k*(Vdd - Vt)/Vdd • 165MHz = k*(1.8V - 0.38V)/1.8V, so k = 209.2 MHz • 1.5V (~156MHz) and 0.9V (~120MHz) for the standard supply • 0.75V (~100MHz), 0.65V (~85MHz), and 0.5V (~50MHz) for the low-power supply voltages • Run both models at 100 MHz • Simple periods • Much slack for the 1.5V supply (~3-4ns) • Slack for the low-power model also, since each adder only needs 50MHz for the 2x delay
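The frequency estimates on this slide follow directly from the stated formula once k is fixed by the 165 MHz at 1.8 V reference point. The Python sketch below recomputes them. The constant and function names are hypothetical, and the slack helper simply compares the clock period with the estimated minimum period, which is an assumption about how the slack figure was obtained.

```python
V_T = 0.38          # threshold voltage in volts (from the slide)
V_REF = 1.8         # reference supply in volts
F_REF_MHZ = 165.0   # frequency at the reference supply

# Solve F = k * (Vdd - Vt) / Vdd for k at the reference point.
K_MHZ = F_REF_MHZ * V_REF / (V_REF - V_T)


def max_freq_mhz(vdd: float) -> float:
    """Estimated maximum frequency in MHz at supply voltage vdd."""
    return K_MHZ * (vdd - V_T) / vdd


def slack_ns(vdd: float, clock_mhz: float) -> float:
    """Clock period minus the estimated minimum period, in nanoseconds."""
    return 1000.0 / clock_mhz - 1000.0 / max_freq_mhz(vdd)


if __name__ == "__main__":
    print(f"k = {K_MHZ:.1f} MHz")
    for vdd in (1.5, 0.9, 0.75, 0.65, 0.5):
        print(f"{vdd:.2f} V -> {max_freq_mhz(vdd):.1f} MHz")
    print(f"slack at 1.5 V, 100 MHz clock: {slack_ns(1.5, 100.0):.2f} ns")
```

This first-order model is what the slide uses. It does not include the alpha-power law, so estimates at the lowest voltages should be treated with caution.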
Observations • 1.5V and 0.9V for the standard model both give verified results • 0.9V shows slightly more delay than 1.5V • See next slide for a comparison of the more exaggerated 1.8V and 0.75V results for the standard design • Simulation at 0.5V gives unreasonably low power for the LP model • EZ-Wave won’t load, so correct operation can’t be confirmed; this power figure is assumed not to be legitimate. • It is assumed that 0.5V doesn’t provide results fast enough, thus eliminating dynamic power. • Delay calculation falsely inflated due to not using alpha-power law; use the 0.65V / 0.75V simulations since they are still above the 50MHz requirement
Eldo Verification • Verification results for 1.8V and 0.75V with similar vectors • 0.75V too slow for verified results • 1.8V gives too much slack
Power Reduction • Admittedly, some power savings are seen in the low-power cases simply because so much slack is allowed for the 1.5V supply. • Good power reduction is still seen for the LP model at 0.75V/0.65V when compared to the 0.9V standard supply • ~63% and ~75%, respectively
Conclusions • Notable power reduction can be achieved by using component parallelism and a reduced supply voltage. • 16% more area (~700 gates instead of ~600) • Verification / analysis is a critical part of design • Understanding of the available design tools is key
Future Work • Measure power reduction for a range of parallel scale implementations (N = 3, 4, 5…) to determine optimal implementation for savings. • Determine exact formula for delay vs. voltage through experimental results