Arithmetic III CPSC 321

Arithmetic III CPSC 321 Andreas Klappenecker

Any Questions?

Today’s Menu
• Addition
• Multiplication
• Floating Point Numbers

Recall: Full Adder
Inputs: a, b, cin. Outputs: s, cout.
3 gate delays for the first adder, 2(n-1) for the remaining adders

Ripple Carry Adders
• Each gate causes a delay
  • our example: 3 gates for carry generation
  • book has example with 2 gates
• Carry might ripple through all n adders
  • O(n) gates causing delay
  • intolerable delay if n is large
• Carry lookahead adders

Faster Adders
Why are they called that?
cout = ab + cin(a xor b)
     = ab + a cin + b cin
     = ab + (a + b) cin
     = g + p cin
Generate: g = ab
Propagate: p = a + b

Fast Adders
Iterate the idea, generate and propagate:
$c_{i+1} = g_i + p_i c_i$
$= g_i + p_i (g_{i-1} + p_{i-1} c_{i-1})$
$= g_i + p_i g_{i-1} + p_i p_{i-1} c_{i-1}$
$= g_i + p_i g_{i-1} + p_i p_{i-1} g_{i-2} + \dots + p_i p_{i-1} \cdots p_1 g_0 + p_i p_{i-1} \cdots p_1 p_0 c_0$
Two-level AND-OR circuit
Carry is known early!
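The expanded carry formula can be checked against a ripple-carry computation with a small sketch. The function names (`lookahead_carries`, `ripple_carries`, `to_bits`) are made up for this illustration, and bits are stored least significant first. Each carry is computed directly from the g, p values and c0, the way a two-level AND-OR circuit would.

```python
def to_bits(x, n):
    """Return the n low bits of x, least significant bit first."""
    return [(x >> i) & 1 for i in range(n)]


def ripple_carries(a_bits, b_bits, c0=0):
    """Carries c_0..c_n computed one full adder at a time."""
    carries = [c0]
    for a, b in zip(a_bits, b_bits):
        c = carries[-1]
        carries.append((a & b) | (c & (a ^ b)))  # cout = ab + cin(a xor b)
    return carries


def lookahead_carries(a_bits, b_bits, c0=0):
    """Carries c_0..c_n computed directly from generate/propagate signals."""
    g = [a & b for a, b in zip(a_bits, b_bits)]  # generate:  g = ab
    p = [a | b for a, b in zip(a_bits, b_bits)]  # propagate: p = a + b
    carries = [c0]
    for i in range(len(g)):
        c = 0
        prod = 1  # running AND of p_i, p_{i-1}, ..., p_{j+1}
        for j in range(i, -1, -1):
            c |= prod & g[j]   # term p_i ... p_{j+1} g_j
            prod &= p[j]
        c |= prod & c0         # term p_i ... p_0 c_0
        carries.append(c)
    return carries


if __name__ == "__main__":
    print(lookahead_carries(to_bits(6, 4), to_bits(3, 4)))
```

The inner loop is sequential only because this is software; in hardware every product term is one AND gate and the terms are combined by a single OR gate.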

A Simple ALU for MIPS
• Need to support the set-on-less-than instruction (slt)
  • remember: slt is an arithmetic instruction
  • produces 1 if rs < rt and 0 otherwise
  • use subtraction: (a-b) < 0 implies a < b
• Need to support test for equality (beq $t5, $t6, label)
  • use subtraction: (a-b) = 0 implies a = b

ALU
Control lines: 000 = and, 001 = or, 010 = add, 110 = subtract, 111 = slt
• Note: zero is a 1 when the result is zero!
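A minimal behavioral sketch of this ALU, assuming 32-bit operands and the control encoding listed on the slide. The names `alu` and `to_signed` are hypothetical. For slt the sketch compares the signed values with Python's unbounded integers, which avoids the overflow case that a plain 32-bit subtraction would have to handle separately.

```python
MASK = 0xFFFFFFFF  # 32-bit word


def to_signed(x):
    """Interpret a 32-bit pattern as a two's complement integer."""
    x &= MASK
    return x - (1 << 32) if x & 0x80000000 else x


def alu(op, a, b):
    """Return (result, zero) for the 3-bit control value op."""
    a &= MASK
    b &= MASK
    if op == 0b000:
        result = a & b
    elif op == 0b001:
        result = a | b
    elif op == 0b010:
        result = (a + b) & MASK
    elif op == 0b110:
        result = (a - b) & MASK
    elif op == 0b111:
        # slt: 1 if a < b as signed numbers, i.e. (a - b) < 0
        result = 1 if to_signed(a) - to_signed(b) < 0 else 0
    else:
        raise ValueError("unsupported ALU control value")
    zero = 1 if result == 0 else 0  # zero is 1 when the result is zero
    return result, zero


if __name__ == "__main__":
    print(alu(0b110, 5, 5))  # beq-style test: subtract and inspect zero
    print(alu(0b111, 3, 7))  # slt-style test
```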

Multipliers

Multiplication
• More complicated than addition
• accomplished via shifting and addition
• Let's look at 3 versions based on the grade school algorithm
      0010   (multiplicand)
    x 1011   (multiplier)
      0010   x 1
     00100   x 1
    001000   x 0
   0010000   x 1
  00010110
• Shift and add if multiplier bit equals 1

Multiplication
      0010   (multiplicand)
    x 1011   (multiplier)
      0010   x 1
     00100   x 1
    001000   x 0
   0010000   x 1
   0010110

Multiplication
• If each step took a clock cycle, this algorithm would use almost 100 clock cycles to multiply two 32-bit numbers.
• Requires 64-bit wide adder
• Multiplicand register 64-bit wide
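The first version can be sketched as follows, assuming unsigned operands. The function name `multiply_v1` and the parameter `n` (operand width) are made up for this illustration. The multiplicand lives in a 2n-bit register and shifts left, the multiplier shifts right, and the adder is 2n bits wide.

```python
def multiply_v1(multiplicand, multiplier, n=32):
    """Unsigned shift-and-add multiply with a 2n-bit multiplicand register."""
    wide_mask = (1 << (2 * n)) - 1
    mcnd = multiplicand & ((1 << n) - 1)  # starts in the right half
    mplr = multiplier & ((1 << n) - 1)
    product = 0
    for _ in range(n):
        if mplr & 1:                      # test multiplier bit 0
            product = (product + mcnd) & wide_mask  # 2n-bit add
        mcnd = (mcnd << 1) & wide_mask    # shift multiplicand left
        mplr >>= 1                        # shift multiplier right
    return product


if __name__ == "__main__":
    print(format(multiply_v1(0b0010, 0b1011, n=4), "08b"))
```

Each loop iteration performs a test, an optional add, and two shifts; with 32 iterations of roughly three steps each, that matches the slide's estimate of almost 100 clock cycles.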

Variations on a Theme
• Product register has to be 64-bit
  • Nothing we can do about that!
• Can we take advantage of that fact?
  • Yes! Add multiplicand to 32 MSBs
  • product = product >> 1
  • Repeat last steps
      0010   (multiplicand)
    x 1011   (multiplier)
      0010   x 1
     00100   x 1
    001000   x 0
   0010000   x 1
   0010110

Second Version

Version 1 versus Version 2

Critique
• Registers needed for
  • multiplicand
  • multiplier
  • product
• Use lower 32 bits of product register:
  • place multiplier in lower 32 bits
  • add multiplicand to higher 32 bits
  • product = product >> 1
  • repeat

Final Version
[Figure: multiplier (shifts right)]

Summary
It was possible to improve upon the well-known grade school algorithm by
• reducing the adder from 64 to 32 bits
• keeping the multiplicand fixed
• shifting the product register
• omitting the multiplier register
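A sketch of the final version, again assuming unsigned operands; `multiply_final` and `n` are hypothetical names. The multiplier starts in the lower half of the product register, the multiplicand is only ever added to the upper half, and the whole register shifts right.

```python
def multiply_final(multiplicand, multiplier, n=32):
    """Unsigned multiply using one 2n-bit product register and an n-bit add."""
    low_mask = (1 << n) - 1
    mcnd = multiplicand & low_mask
    product = multiplier & low_mask  # multiplier sits in the lower n bits
    for _ in range(n):
        if product & 1:
            # Add the multiplicand to the upper half. The sum can carry
            # into bit 2n; the shift below brings it back into range.
            product += mcnd << n
        product >>= 1
    return product


if __name__ == "__main__":
    print(format(multiply_final(0b0010, 0b1011, n=4), "08b"))
```

The multiplier bits are consumed from the right as the product bits arrive from the left, so no separate multiplier register is needed.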

The Booth Multiplier Let’s kick it up a notch!

Runs of 1’s
• $01110_2 = 14 = 8 + 4 + 2 = 16 - 2$
• Runs of 1s (current bit, bit to the right):
  • 10 beginning of run
  • 11 middle of a run
  • 01 end of a run of 1s
  • 00 middle of a run of 0s

Runs of 1’s
• $0111\ 1111\ 1100_2 = 2044$
• How do you get this conversion quickly?
• $0111\ 1111_2 = 128 - 1 = 127$
• $0111\ 1111\ 1111_2 = 2048 - 1$
• $0111\ 1111\ 1100_2 = 2048 - 1 - 3 = 2048 - 4$
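The run-based conversion can be expressed as a recoding into signed digits: -1 where a run of 1s begins, +1 just past where it ends, 0 elsewhere. The sketch below uses made-up names (`booth_digits`, `digits_value`) and treats the input as an n-bit two's complement pattern.

```python
def booth_digits(x, n):
    """Recode the n low bits of x into digits in {-1, 0, +1}, LSB first.

    Digit i is (bit to the right) - (current bit), with an implicit 0
    to the right of bit 0.
    """
    digits = []
    prev = 0
    for i in range(n):
        cur = (x >> i) & 1
        digits.append(prev - cur)  # 10 -> -1, 01 -> +1, 00 and 11 -> 0
        prev = cur
    return digits


def digits_value(digits):
    """Value of a signed-digit string given LSB first."""
    return sum(d * (1 << i) for i, d in enumerate(digits))


if __name__ == "__main__":
    d = booth_digits(0b011111111100, 12)
    print(d, digits_value(d))
```

For 0111 1111 1100 only two digits are nonzero, -1 at weight 4 and +1 at weight 2048, which is the shortcut 2048 - 4 from the slide.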

Example
Booth's algorithm:
        0010
      x 0110
      + 0000   shift
     - 0010    sub
    + 0000     shift
   + 0010      add
    00001100
Conventional shift-and-add:
        0010
      x 0110
      + 0000   shift
     + 0010    add
    + 0010     add
   + 0000      shift
    00001100

Booth Multiplication
Current and previous bit:
• 00: middle of a run of 0s, no action
• 01: end of a run of 1s, add multiplicand
• 10: beginning of a run of 1s, subtract multiplicand
• 11: middle of a run of 1s, no action

Example: 0010 x 0110

Negative Numbers
Booth’s multiplication also works with negative numbers:
2 x -3 = -6
$0010_2 \times 1101_2 = 1111\ 1010_2$

Negative Numbers
$0010_2 \times 1101_2 = 1111\ 1010_2$
Step  Mcnd  Prod          Action
0)    0010  0000 1101,0   initial
1)    0010  1110 1101,0   sub
1)    0010  1111 0110,1   >>
2)    0010  0001 0110,1   add
2)    0010  0000 1011,0   >>
3)    0010  1110 1011,0   sub
3)    0010  1111 0101,1   >>
4)    0010  1111 0101,1   nop
4)    0010  1111 1010,1   >>
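The trace above can be reproduced with a short sketch of Booth's algorithm. The function name `booth_multiply` and its register variables are hypothetical, the default width of 4 bits matches the slide, and the sketch assumes the multiplicand is not the most negative n-bit value (that case needs one extra bit in the upper half).

```python
def booth_multiply(mcnd, mplr, n=4):
    """Signed n-bit Booth multiplication; returns the 2n-bit signed product."""
    mask = (1 << n) - 1
    upper = 0            # upper half of the product register
    lower = mplr & mask  # lower half starts out holding the multiplier
    extra = 0            # the bit to the right of the product register
    for _ in range(n):
        pair = (lower & 1, extra)  # (current bit, previous bit)
        if pair == (1, 0):         # beginning of a run of 1s
            upper = (upper - mcnd) & mask
        elif pair == (0, 1):       # end of a run of 1s
            upper = (upper + mcnd) & mask
        # Arithmetic shift right of upper:lower,extra by one position.
        extra = lower & 1
        lower = (lower >> 1) | ((upper & 1) << (n - 1))
        sign = upper >> (n - 1)
        upper = (upper >> 1) | (sign << (n - 1))
    result = (upper << n) | lower
    if result >> (2 * n - 1):      # reinterpret as two's complement
        result -= 1 << (2 * n)
    return result


if __name__ == "__main__":
    print(booth_multiply(2, -3))
```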

Summary
• Extends the final version of the grade school algorithm
• Simple change, based on the current bit and the previous bit:
  • 0,1: add the multiplicand
  • 1,0: subtract the multiplicand
  • 0,0 or 1,1: do nothing
• $0111\ 1100_2 = 128 - 4 = 1000\ 0000_2 - 0000\ 0100_2$

Floating Point Numbers

Floating Point Numbers
We often use calculations based on real numbers, such as
• e = 2.71828…
• Pi = 3.14159…
We represent approximations to such numbers by floating point numbers
• $1.xxxxxxxxxx_2 \times 2^{yyyy}$

Floating-Point Representation: float
We need to distribute the 32 bits among sign, exponent, and significand
• s eeeeeeee xxxxxxxxxxxxxxxxxxxxxxx
The general form of such a number is
• $(-1)^s \times F \times 2^E$
• s is the sign, F is derived from the significand field, and E is derived from the exponent field

Floating Point Representation: double
• 1 bit sign, 11 bits for exponent, 52 bits for significand
• first word: s eeeeeeeeeee xxxxxxxxxxxxxxxxxxxx
• second word: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Range of float: $2.0 \times 10^{-38}$ … $2.0 \times 10^{38}$
Range of double: $2.0 \times 10^{-308}$ … $2.0 \times 10^{308}$

IEEE 754 Floating-Point Standard
• Makes the leading 1 bit of a normalized binary number implicit: 1 + significand
• If the significand is $s_1 s_2 s_3 s_4 s_5 s_6 \dots$ then the value is $(-1)^s \times (1 + s_1/2 + s_2/4 + s_3/8 + \dots) \times 2^E$
• Design goal of IEEE 754: Integer comparisons should yield meaningful comparisons for floating point numbers

IEEE 754 Standard
• Negative exponents are a difficulty for sorting
• Idea: order the exponent field from most positive (1111 1111) down to most negative (0000 0000)
• IEEE 754 uses a bias of 127 for single precision.
• Exponent -1 is represented by -1 + 127 = 126
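The effect of the bias can be observed with Python's standard `struct` module. The helper names `float_bits` and `biased_exponent` are made up for this sketch, and the ordering check is restricted to positive values, where comparing the raw bit patterns as unsigned integers agrees with comparing the numbers.

```python
import struct


def float_bits(x):
    """Return the 32-bit single precision pattern of x as an unsigned int."""
    return struct.unpack(">I", struct.pack(">f", x))[0]


def biased_exponent(x):
    """Return the 8-bit exponent field (true exponent + 127)."""
    return (float_bits(x) >> 23) & 0xFF


if __name__ == "__main__":
    values = [0.001, 0.5, 0.75, 1.0, 3.5, 1e10]  # increasing, all positive
    patterns = [float_bits(v) for v in values]
    print([biased_exponent(v) for v in values])
    print(patterns == sorted(patterns))  # integer order matches float order
```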

IEEE 754 Example
Represent -0.75 in single precision format.
$-0.75 = -3/4 = -11_2 / 4 = -0.11_2$
In scientific notation: $-0.11_2 \times 2^0 = -1.1_2 \times 2^{-1}$; the latter form is normalized scientific notation.
Value: $(-1)^s \times (1 + \text{significand}) \times 2^{(\text{Expnt} - 127)}$

Example (cont’d)
• $-1.1_2 \times 2^{-1} = (-1)^1 \times (1 + .1000\ 0000\ 0000\ 0000\ 0000\ 000_2) \times 2^{(126 - 127)}$
The single precision representation is
1 0111 1110 1000 0000 0000 0000 0000 000
BAM!
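The worked example can be cross-checked with the standard `struct` module. The helpers `float_fields` and `field_value` are hypothetical names, and `field_value` covers only normalized numbers (it ignores zero, denormals, infinities, and NaN).

```python
import struct


def float_fields(x):
    """Split the single precision encoding of x into (sign, exponent, fraction)."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF  # biased exponent field, 8 bits
    fraction = bits & 0x7FFFFF      # significand field, 23 bits
    return sign, exponent, fraction


def field_value(sign, exponent, fraction):
    """(-1)^s x (1 + significand) x 2^(exponent - 127), normalized case only."""
    return (-1) ** sign * (1 + fraction / 2 ** 23) * 2.0 ** (exponent - 127)


if __name__ == "__main__":
    s, e, f = float_fields(-0.75)
    print(s, format(e, "08b"), format(f, "023b"))
    print(field_value(s, e, f))
```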

Conclusion
• We learned how to multiply
  • Three variations on the grade school algorithm
  • Booth multiplication
• Floating point representation a la IEEE 754
(Photos are courtesy of www.emerils.com, some graphs are due to Patterson and Hennessy)
