A Decimal Floating-Point Adder with Decoded Operands and a Decimal Leading-Zero Anticipator
By Liang-Kai Wang and Michael J. Schulte
Joseph Schneider
March 12, 2010
Objective
• The goal is to reduce the latency of the decimal floating-point (DFP) adder
• Several modifications are made to achieve this, including a new internal format
• The main focus is the design of a decimal leading-zero anticipator (LZA)
Leading-Zero Anticipator
• Anticipates the position of the most significant nonzero digit of a result (the most significant bit, in binary designs)
• Previous designs have been for binary arithmetic, not decimal
• A decimal LZA is expected to reduce latency
New Internal Format
• The exponent field is uncompressed
• The significand is encoded in BCD
• A new field holds the leading zero count (LZC), which removes leading-zero detection from the critical path
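As a rough illustration of the decoded format described above, the following Python sketch models an operand as a record with an uncompressed exponent, a BCD significand, and a precomputed leading zero count. The names `DecodedOperand`, `pack_bcd`, and `decode`, and the 16-digit default precision, are assumptions made for illustration; the slides do not give field widths or signal names.

```python
from dataclasses import dataclass


@dataclass
class DecodedOperand:
    """Hypothetical software model of a decoded DFP operand."""
    sign: int             # 0 = positive, 1 = negative
    exponent: int         # stored uncompressed, as a plain integer
    significand_bcd: int  # packed BCD, 4 bits per decimal digit
    lzc: int              # leading zero count of the significand


def pack_bcd(value: int, digits: int) -> int:
    """Pack a non-negative integer into `digits` BCD nibbles."""
    bcd = 0
    for ch in str(value).zfill(digits):
        bcd = (bcd << 4) | int(ch)
    return bcd


def decode(sign: int, exponent: int, coefficient: int, digits: int = 16) -> DecodedOperand:
    """Build the decoded form once, so later stages can read the LZC directly."""
    text = str(coefficient).zfill(digits)
    lzc = len(text) - len(text.lstrip("0"))
    return DecodedOperand(sign, exponent, pack_bcd(coefficient, digits), lzc)
```

Because the count travels with the operand, a later stage can read `lzc` as an input instead of scanning the significand again.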
Improvements
• The internal format removes the need for forward and backward conversion units
• The pre-correction unit is moved in front of the swapping unit and duplicated, which keeps it off the critical path
• Leading-zero detection is no longer performed in the shift-amount unit; the LZC is now an input signal, and the LZA produces the result's LZC so that later decimal operations do not need to recalculate it
Leading Zero Anticipator
• Needed in addition and subtraction to guarantee that the leading zero count of the output is correct
• Only needed when the result of the addition or subtraction is not rounded; the LZC is always zero when the result is rounded
LZA - Addition
• The preliminary LZC is the smaller of the leading zero counts of the two significands being added
• If the addition produces a carry, the final LZC is the preliminary LZC minus one
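The addition rule above can be sketched in a few lines of Python. This is a software model under stated assumptions, not the hardware design: the function names `lzc` and `lza_add` are hypothetical, significands are modeled as non-negative integers already aligned to `width` digits, and the carry is derived from the sum itself, whereas a real LZA would take it from the adder.

```python
def lzc(value: int, width: int) -> int:
    """Leading zeros of a non-negative integer written with `width` decimal digits."""
    return width if value == 0 else width - len(str(value))


def lza_add(a: int, b: int, width: int) -> int:
    """Leading zero count of a + b, from the operands' counts plus a carry."""
    # Preliminary count: the operand with fewer leading zeros bounds the result.
    prelim = min(lzc(a, width), lzc(b, width))
    # A carry means the sum needs one more digit than the longer operand.
    carry = (a + b) >= 10 ** (width - prelim)
    if not carry:
        return prelim
    # With no leading zero left to absorb the carry, the sum no longer fits in
    # `width` digits; such a result is rounded, and its LZC is zero.
    return max(prelim - 1, 0)
```

For example, with seven-digit significands, `lza_add(123, 45, 7)` gives 4 (the sum is 168), while `lza_add(999, 1, 7)` gives 3 because the carry turns the sum into 1000.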
LZA - Subtraction
• Requires an encoding unit, a correction unit, and a parallel array of decimal digit adders
• Encoding unit
  • Converts BCD digits into strings of zeros and ones
  • Detects the position of the most significant non-zero digit in the string
• Correction unit
  • Flag generation modules and correction trees determine whether a correction needs to be performed on the encoding unit’s result
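Subtraction is harder than addition because digits can cancel: 1000 - 999 leaves a single digit even though neither operand has a leading zero. The Python sketch below illustrates the general two-step idea of a preliminary position followed by a correction of at most one digit. It is not the encoding or correction logic of the design summarized here; the function name `lza_sub`, the use of signed digit-wise differences, and the assumption that `a >= b` are all choices made for illustration.

```python
def lza_sub(a: int, b: int, width: int) -> int:
    """Predict the leading zero count of a - b (with a >= b) from digit differences."""
    if not 0 <= b <= a < 10 ** width:
        raise ValueError("expected 0 <= b <= a < 10**width")
    da = [int(c) for c in str(a).zfill(width)]
    db = [int(c) for c in str(b).zfill(width)]
    diff = [x - y for x, y in zip(da, db)]  # each entry lies in -9..9

    # Encoding step: locate the first nonzero difference, most significant first.
    i = 0
    while i < width and diff[i] == 0:
        i += 1
    if i == width:
        return width  # a == b, so the result is all zeros
    lead = diff[i]  # positive, because a > b
    if lead == 1:
        # A 1 followed by a run of -9s collapses to a 1 at the end of the run,
        # since 10**k - 9 * 10**(k - 1) == 10**(k - 1).
        while i + 1 < width and diff[i + 1] == -9:
            i += 1
    prelim = i

    # Correction step: a leading 1 loses its position if the lower digits borrow,
    # which happens when the next nonzero difference is negative.
    if lead == 1:
        for d in diff[prelim + 1:]:
            if d != 0:
                return prelim + 1 if d < 0 else prelim
    return prelim
```

In this model, `lza_sub(1000, 999, 4)` returns 3 with no correction needed, while `lza_sub(21, 19, 2)` first predicts 0 and is then corrected to 1, since the difference is 2.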
Testing
• IBM decNumber library version 3.56 was used to verify the correctness of the adder
• The sign, the exponent, and the length and value of the significand were randomly generated
• The adder passed numerous random tests and the corner cases of IBM’s test suite
• Both the previous adder and the new adder were implemented in Verilog RTL using TSMC 45 nm bulk technology
Results
• Both designs use the same floorplan, so the area utilization rate reflects how much area each design uses
• The new adder is 14% faster, at the cost of 18% more area
Results – Adder Area Profile
• The LZA takes up a significant amount of area, though the Kogge-Stone adder is still the largest component
Results – LZA
• The LZA was synthesized alone; its critical path has a maximum delay of 24 FO4 inverter delays
• The subtractor takes up over 60% of the LZA area