Future Research in Computer Arithmetic
September 28, 2007
Eric Schwarz, IBM
Topics
• Binary Multiplication
  • Proofs of Overlapped Scanning (foundations)
  • 7:3 Counter Design (future)
• Division
  • Direct Division
  • Remainder Avoidance (active)
• Decimal Floating-Point (extremely active)
  • Pipelining Add
  • Multiply
  • Multiply-Add
Basics
• Recode the multiplier and separate it into digits
• Create multiples of the multiplicand
• Multiplex the multiples
• Sum all partial products in a counter tree
• Reduce the final 2 partial products in a CLA (carry-lookahead adder)
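The steps above can be sketched in software. This is a minimal model, assuming radix-4 Booth recoding and a tree of 3:2 counters; the function names are illustrative and do not come from the slides, and Python's `sum` stands in for the final CLA.

```python
def booth_radix4_digits(multiplier, width):
    """Recode a width-bit two's-complement multiplier into radix-4
    digits in {-2, -1, 0, 1, 2}, least significant digit first."""
    assert width % 2 == 0, "width must be even for radix-4 scanning"
    bits = (multiplier & ((1 << width) - 1)) << 1  # implicit 0 below the LSB
    digits = []
    for i in range(0, width, 2):
        low = (bits >> i) & 1         # overlap bit shared with the previous group
        mid = (bits >> (i + 1)) & 1
        high = (bits >> (i + 2)) & 1
        digits.append(low + mid - 2 * high)
    return digits


def carry_save_3to2(a, b, c):
    """3:2 counter applied bitwise: a + b + c == sum_word + carry_word."""
    sum_word = a ^ b ^ c
    carry_word = ((a & b) | (a & c) | (b & c)) << 1
    return sum_word, carry_word


def multiply(multiplicand, multiplier, width=16):
    """Multiply using the five steps on the slide (software model only)."""
    # Create the multiples of the multiplicand.
    multiples = {d: d * multiplicand for d in (-2, -1, 0, 1, 2)}
    # Recode the multiplier, then multiplex one multiple per digit.
    digits = booth_radix4_digits(multiplier, width)
    partial_products = [multiples[d] << (2 * i) for i, d in enumerate(digits)]
    # Counter tree: reduce three rows to two until only two rows remain.
    while len(partial_products) > 2:
        a, b, c = partial_products[:3]
        partial_products = partial_products[3:] + list(carry_save_3to2(a, b, c))
    # Final carry-propagate addition (the CLA in hardware).
    return sum(partial_products)
```

The multiplier must fit in `width` bits as a signed value; for example, `multiply(7, -3, width=8)` recodes -3 into the digits `[1, -1, 0, 0]`.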
History of Overlapped Scanning
• A.D. Booth in 1951 showed overlapped scanning
• L. Rubinfield in 1975 proved radix-4 Booth
• S. Vassiliadis in 1989 proved Booth for any radix
High-Level Counters
• 7:3 counter: US Patent 5,187,679 (1993)
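A 7:3 counter takes seven input bits of equal weight and outputs their count as a 3-bit number. The sketch below uses the textbook construction from four full adders (3:2 counters); it only illustrates the function being computed and is an assumption, not the circuit described in the patent.

```python
def full_adder(a, b, c):
    """3:2 counter: returns (sum, carry) with a + b + c == sum + 2 * carry."""
    return a ^ b ^ c, (a & b) | (a & c) | (b & c)


def counter_7_3(bits):
    """Count seven equal-weight bits; returns (out2, out1, out0), MSB first."""
    x0, x1, x2, x3, x4, x5, x6 = bits
    s1, c1 = full_adder(x0, x1, x2)      # first group of three inputs
    s2, c2 = full_adder(x3, x4, x5)      # second group of three inputs
    out0, c3 = full_adder(s1, s2, x6)    # weight-1 column
    out1, out2 = full_adder(c1, c2, c3)  # weight-2 column
    return out2, out1, out0
```

For instance, `counter_7_3((1, 1, 1, 0, 1, 0, 1))` counts five ones.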
Power6 and Z6* processors have DFU
[Figure: chip floorplan with blocks labeled Core, L2 Data, L2 Ctrl, L3 Ctrl, and Mem Ctrl]
754R Decimal Floating-Point Format
• IEEE 754R defines 2 encodings of the coefficient:
  • Integer coefficient and DPD (Densely Packed Decimal)
• Formats for 32, 64, and 128 bits.
• The combination field C encodes 2 exponent bits and 1 decimal digit in 5 bits.
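The 5-bit C field can hold 2 exponent bits and a full decimal digit because the two leading exponent bits only take the values 00, 01, and 10, which gives 3 x 10 = 30 finite combinations; the two remaining codes signal infinity and NaN. A minimal decoder for the DPD encoding, as standardized in IEEE 754-2008, might look like this (the function name and return shape are illustrative):

```python
def decode_combination_field(c):
    """Decode the 5-bit combination field of a DPD-encoded decimal number.

    Returns (kind, exponent_msbs, leading_digit); the last two entries
    are None for infinity and NaN.
    """
    assert 0 <= c < 32
    if c == 0b11110:
        return ("infinity", None, None)
    if c == 0b11111:
        return ("nan", None, None)
    if (c >> 3) != 0b11:
        # Bits "eeddd": two exponent bits, then a leading digit 0..7.
        return ("finite", c >> 3, c & 0b111)
    # Bits "11eed": two exponent bits, then a leading digit 8 or 9.
    return ("finite", (c >> 1) & 0b11, 8 + (c & 1))
```

For example, `0b01101` decodes to exponent bits `01` with leading digit 5, and `0b11011` decodes to exponent bits `01` with leading digit 9.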
Power6 Decimal Floating-Point Unit
• Cycle time: approx. 5 GHz, 13 FO4 design
• Hardware executes 64-bit and 128-bit formats.
• 144-bit dataflow can be split into two 72-bit pipes.
• DPD coefficients are decoded into BCD for execution.
• All cases are handled in hardware (no special-case software traps).
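The DPD-to-BCD decoding step mentioned above maps each 10-bit declet to three decimal digits. The sketch below follows the published Densely Packed Decimal decoding table; it is a software illustration only (the function and helper names are made up here), and it says nothing about how the Power6 logic is organized.

```python
def dpd_declet_to_bcd(declet):
    """Decode one 10-bit DPD declet into three decimal digits (d2, d1, d0)."""
    assert 0 <= declet < 1024
    # Name the bits p q r s t u v w x y from most to least significant.
    p, q, r, s, t, u, v, w, x, y = ((declet >> i) & 1 for i in range(9, -1, -1))

    def small(a, b, c):
        """Digit 0..7 built from three bits."""
        return 4 * a + 2 * b + c

    def large(c):
        """Digit 8 or 9 built from one bit."""
        return 8 + c

    if v == 0:  # all three digits are small
        return small(p, q, r), small(s, t, u), small(w, x, y)
    if (w, x) == (0, 0):  # only d0 is large
        return small(p, q, r), small(s, t, u), large(y)
    if (w, x) == (0, 1):  # only d1 is large
        return small(p, q, r), large(u), small(s, t, y)
    if (w, x) == (1, 0):  # only d2 is large
        return large(r), small(s, t, u), small(p, q, y)
    # (w, x) == (1, 1): at least two digits are large; s and t select which.
    if (s, t) == (0, 0):
        return large(r), large(u), small(p, q, y)
    if (s, t) == (0, 1):
        return large(r), small(p, q, u), large(y)
    if (s, t) == (1, 0):
        return small(p, q, r), large(u), large(y)
    return large(r), large(u), large(y)
```

Digits 0 through 7 pass through almost unchanged, so a declet such as `0b0010100011` reads directly as 1, 2, 3.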
Coefficient Dataflow
[Figure: block diagram of the coefficient dataflow, 36 digits wide (144 bits); the 36-digit dataflow splits into two 18-digit dataflows. Labeled blocks: Operand A Hi/Lo and Operand B Hi/Lo registers; Expand DPD to BCD; Prescale Table; A mux and B mux; Multiple Generator; 1X, 3X, 4X registers; Doubler & Quintupler; pipelined 2-cycle rotator; adder segments labeled 4D and 2D, each with a +1 label; adder registers; Q correction and multiplexers; Working Register Hi/Lo; DEC to BIN and BIN to DEC converters; Compress BCD to DPD; Result Register Hi/Lo. Annotated functions: magnitude calculations (A-B, B-A); multiplication (partial product generate, partial product accumulate).]
Power6 DFP Multiplication
• Partial products are formed from two multiples to reduce area.
• 34-digit multiplication on the 36-digit dataflow: 1 digit every 2 cycles.
• 16-digit multiplication on dual 18-digit dataflows: 1 digit every cycle.
Performance of Arithmetic Operations
* N is the number of digits in the first operand, excluding leading zeros.
Future
• Pipelined Adder with Rounding Injection
  • Liang-Kai Wang
• Decimal Multiplication
  • Mark Erle and Michael Schulte: 3:2 counter
  • Tomas Lang and A. Nannarelli: 4:2 counter
  • Alvaro Vazquez et al.: 4221 coding
  • Luigi Dadda: counters
• Decimal Multiply-Add Pipeline with Rounding Injection
• Divide
  • Tomas Lang and A. Nannarelli: base 2 and base 5
• Intel Format
Future of Computer Arithmetic
• Is based on clear proofs and expositions of the fundamental concepts.
  • The easier to understand, the easier to build on.
• Arithmetic is very active
  • IEEE 754R Standard currently in ballot
  • Decimal Floating-Point pipelined designs
  • New adder designs, new multiplier designs
  • Vector processing, image processing, video games