ELEC 5200-001/6200-001 Computer Architecture and Design Spring 2007 Computer Arithmetic (Chapter 3)

ELEC 5200-001/6200-001 Computer Architecture and Design, Spring 2007
Computer Arithmetic (Chapter 3)
Vishwani D. Agrawal, James J. Danaher Professor
Department of Electrical and Computer Engineering, Auburn University, Auburn, AL 36849
http://www.eng.auburn.edu/~vagrawal
vagrawal@eng.auburn.edu
ELEC 5200-001/6200-001 Lecture 6

What Goes on Inside ALU?
• Machine instruction: add $t1, $s1, $s2
• What it means to the computer: 000000 10001 10010 01001 00000 100000
[Figure: block diagram showing the control unit, the arithmetic logic unit (ALU), flags, and registers.]

Basic Idea
• Hardware can only deal with binary digits, 0 and 1.
• Must represent all numbers, integers or floating point, positive or negative, by binary digits, called bits.
• Devise electronic circuits to perform arithmetic operations (add, subtract, multiply and divide) on binary numbers.

Positive Integers
• Decimal system: made of 10 digits, {0, 1, 2, . . . , 9}
    41 = 4×10^1 + 1×10^0
    255 = 2×10^2 + 5×10^1 + 5×10^0
• Binary system: made of two digits, {0, 1}
    00101001 = 0×2^7 + 0×2^6 + 1×2^5 + 0×2^4 + 1×2^3 + 0×2^2 + 0×2^1 + 1×2^0 = 32 + 8 + 1 = 41
    11111111 = 255, the largest number with 8 binary digits, 2^8 - 1

Base or Radix
• For the decimal system, 10 is called the base or radix.
• Decimal 41 is also written as 41_10 or 41ten.
• The base (radix) for the binary system is 2.
• Thus, 41ten = 101001_2 or 101001two.
• Also, 111ten = 1101111two and 111two = 7ten.
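The positional rule on the two slides above can be checked with a short Python sketch. The helper names `to_base` and `from_base` are hypothetical, chosen only for illustration, and the sketch assumes non-negative values and bases from 2 to 10.

```python
def to_base(value: int, base: int = 2) -> str:
    """Return the digits of a non-negative integer in the given base (2..10)."""
    if value < 0 or not 2 <= base <= 10:
        raise ValueError("expects value >= 0 and 2 <= base <= 10")
    digits = []
    while True:
        # Repeated division: each remainder is the next digit, LSB first.
        value, remainder = divmod(value, base)
        digits.append(str(remainder))
        if value == 0:
            break
    return "".join(reversed(digits))


def from_base(digits: str, base: int = 2) -> int:
    """Evaluate a digit string positionally (digit times base**position)."""
    total = 0
    for ch in digits:
        total = total * base + int(ch)
    return total


print(to_base(41))            # binary digits of 41ten
print(from_base("11111111"))  # value of eight 1 bits
```

Leading zeros do not change the value, so `from_base("00101001")` and `from_base("101001")` agree.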

Signed Integers – What Not to Do
• Use fixed-length binary representation.
• Use the left-most bit (called the most significant bit or MSB) for sign: 0 for positive, 1 for negative.
• Example: +18ten = 00010010two, -18ten = 10010010two

Why Not Use Sign Bit
• Sign and magnitude bits must be treated differently in arithmetic operations.
• Addition and subtraction require different logic circuits.
• Overflow is difficult to detect.
• “Zero” has two representations: +0ten = 00000000two, -0ten = 10000000two
• Sign-magnitude representation is not used for integers in modern computers.

Integers With Sign – Other Ways
• Use fixed-length representation, but no sign bit.
• 1’s complement: To form a negative number, complement each bit in the given number.
• 2’s complement: To form a negative number, start with the given number, subtract one, and then complement each bit; or first complement each bit, and then add 1.
• 2’s complement is the preferred representation.

1’s-Complement
• To change the sign of a binary integer simply complement (invert) each bit.
• Example: 3 = 0011, -3 = 1100
• n-bit representation: negation is equivalent to subtraction from 2^n - 1.
[Figure: number lines comparing the infinite universe (. . . , -3, 0, 3, . . .) with the modulo-16 universe, where 0, 3, 6, 9, 12, 15 are written 0000, 0011, 0110, 1001, 1100, 1111 and read as 0, 3, 6, -6, -3, -0.]

2’s-Complement
• Add 1 to the 1’s-complement representation.
• Some properties:
    Only one representation for 0
    Exactly as many non-negative numbers (0 included) as negative numbers
    Slight asymmetry: there is one negative number with no positive counterpart

Three Representations
    Bits   Sign-magnitude   1’s complement   2’s complement (preferred)
    000         +0               +0                +0
    001         +1               +1                +1
    010         +2               +2                +2
    011         +3               +3                +3
    100         -0               -3                -4
    101         -1               -2                -3
    110         -2               -1                -2
    111         -3               -0                -1
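As an illustration of the three encodings, the following Python sketch reproduces entries of the 3-bit table. The function name `encode` and the scheme labels are hypothetical. The sketch does not check that the value fits in the given width, and it cannot express -0, since Python integers have no negative zero.

```python
def encode(value: int, bits: int, scheme: str) -> str:
    """Return the bit pattern of value under the named representation."""
    mask = (1 << bits) - 1
    if scheme == "sign-magnitude":
        # MSB carries the sign, remaining bits carry the magnitude.
        sign = (1 << (bits - 1)) if value < 0 else 0
        pattern = sign | abs(value)
    elif scheme == "ones":
        # Negative numbers: complement every bit of the magnitude.
        pattern = abs(value) if value >= 0 else (~abs(value)) & mask
    elif scheme == "twos":
        # Python's & on a negative int already yields the two's-complement bits.
        pattern = value & mask
    else:
        raise ValueError("unknown scheme: " + scheme)
    return format(pattern, "0{}b".format(bits))


for scheme in ("sign-magnitude", "ones", "twos"):
    print(scheme, encode(3, 3, scheme), encode(-3, 3, scheme))
```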

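2’s Complement Numbers
[Figure: 3-bit number wheel: 000 = 0, 001 = +1, 010 = +2, 011 = +3, 100 = -4, 101 = -3, 110 = -2, 111 = -1. The +1 and -1 step directions are marked, positive and negative numbers lie on opposite sides, and overflow occurs across the boundary between 011 (+3) and 100 (-4). Negation is also marked on the wheel.]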

2’s Complement n-bit Numbers
• Range: -2^{n-1} through 2^{n-1} - 1
• Unique zero: 00000000 . . . . . 0
• Negation rule: see slide 8.
• Expansion of bit length: stretch the left-most bit all the way, e.g., 11111101 is still -3.
• Overflow rule: If two numbers with the same sign bit (both positive or both negative) are added, overflow occurs if and only if the result has the opposite sign.
• Subtraction rule: for A - B, add -B to A.

Converting 2’s Complement to Decimal
    a_{n-1} a_{n-2} . . . a_1 a_0 = -2^{n-1} a_{n-1} + Σ_{i=0}^{n-2} 2^i a_i
8-bit conversion box:
    Weight   -128   64   32   16    8    4    2    1
    Bit         1    1    1    1    1    1    0    1
Example: -128 + 64 + 32 + 16 + 8 + 4 + 1 = -128 + 125 = -3
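The conversion formula and the bit-length expansion rule can be sketched in Python as follows. The names `twos_to_int` and `sign_extend` are hypothetical, and bit strings are assumed to be written MSB first, as on the slides.

```python
def twos_to_int(bits: str) -> int:
    """Apply -2^(n-1)*a_(n-1) + sum of 2^i * a_i for i = 0 .. n-2."""
    n = len(bits)
    a = [int(ch) for ch in reversed(bits)]  # a[i] is the bit of weight 2^i
    return -(2 ** (n - 1)) * a[n - 1] + sum((2 ** i) * a[i] for i in range(n - 1))


def sign_extend(bits: str, width: int) -> str:
    """Stretch the left-most bit to reach the requested width."""
    return bits[0] * (width - len(bits)) + bits


print(twos_to_int("11111101"))      # the 8-bit example from the slide
print(sign_extend("101", 8))        # 3-bit pattern widened to 8 bits
```

Sign extension leaves the value unchanged, so `twos_to_int(sign_extend(b, w))` equals `twos_to_int(b)` for any pattern `b` and any width `w` at least as long as `b`.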

For More on 2’s Complement
• Chapter 2 in D. E. Knuth, The Art of Computer Programming: Seminumerical Algorithms, Volume II, Second Edition, Addison-Wesley, 1981.
• A. al’Khwarizmi, Hisab al-jabr w’al-muqabala, 830.
[Photos: Donald E. Knuth (1938 - ); Abu Abd-Allah ibn Musa al’Khwarizmi (~780 – 850)]

MIPS
• MIPS architecture uses 32-bit numbers. What is the range of integers (positive and negative) that can be represented?
    Positive integers (and zero): 0 to 2,147,483,647
    Negative integers: -1 to -2,147,483,648
• What are the binary representations of the extreme positive and negative integers?
    0111 1111 1111 1111 1111 1111 1111 1111 = 2^31 - 1 = 2,147,483,647
    1000 0000 0000 0000 0000 0000 0000 0000 = -2^31 = -2,147,483,648
• What is the binary representation of zero?
    0000 0000 0000 0000 0000 0000 0000 0000

Addition
• Adding bits:
    0 + 0 = 0
    0 + 1 = 1
    1 + 0 = 1
    1 + 1 = (1) 0
• Adding integers (the digit in parentheses is the carry out of that bit position):
      0 0 0 . . . . . . 0 1 1 1 two = 7ten
    + 0 0 0 . . . . . . 0 1 1 0 two = 6ten
    = 0 0 0 . . . . . . 1 (1)1 (1)0 (0)1 two = 13ten

Subtraction
• Direct subtraction:
      0 0 0 . . . . . . 0 1 1 1 two = 7ten
    - 0 0 0 . . . . . . 0 1 1 0 two = 6ten
    = 0 0 0 . . . . . . 0 0 0 1 two = 1ten
• Two’s complement subtraction (the digit in parentheses is the carry out of that bit position):
      0 0 0 . . . . . . 0 1 1 1 two = 7ten
    + 1 1 1 . . . . . . 1 0 1 0 two = -6ten
    = 0 0 0 . . . . . . 0 (1)0 (1)0 (0)1 two = 1ten

Overflow: An Error
• Examples: addition of 3-bit integers (range -4 to +3)
    -2 - 3 = -5:  110 (-2) + 101 (-3) = 1011, and the 3-bit result 011 = 3 (error)
    3 + 2 = 5:    011 (3) + 010 (2) = 101 = -3 (error)
• Overflow rule: If two numbers with the same sign bit (both positive or both negative) are added, overflow occurs if and only if the result has the opposite sign.
[Figure: 3-bit number wheel (000 = 0 through 111 = -1) with the overflow boundary between 011 (+3) and 100 (-4).]
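The overflow rule can be tried out with a minimal Python sketch. The function name `add_with_overflow` is hypothetical, and both operands are assumed to lie within the n-bit two's-complement range.

```python
def add_with_overflow(a: int, b: int, bits: int):
    """Add two signed integers in `bits`-bit two's complement.

    Returns (wrapped_result, overflow_flag).
    """
    mask = (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    raw = (a + b) & mask                      # keep only the low `bits` bits
    result = raw - (1 << bits) if raw & sign_bit else raw
    same_sign = (a < 0) == (b < 0)
    # Overflow iff operands share a sign and the result has the opposite sign.
    overflow = same_sign and ((result < 0) != (a < 0))
    return result, overflow


print(add_with_overflow(-2, -3, 3))  # first slide example
print(add_with_overflow(3, 2, 3))    # second slide example
print(add_with_overflow(3, -2, 3))   # operands of opposite sign
```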

Design Hardware Bit by Bit
• Adding two bits:
    a  b  half_sum  carry_out
    0  0     0          0
    0  1     1          0
    1  0     1          0
    1  1     0          1
• Half-adder circuit: half_sum = a XOR b, carry_out = a AND b

One-bit Full-Adder
• One-bit full-adder truth table:
    a  b  ci  half_sum  carry_out  sum  co
    0  0  0      0          0       0    0
    0  0  1      0          0       1    0
    0  1  0      1          0       1    0
    0  1  1      1          0       0    1
    1  0  0      1          0       1    0
    1  0  1      1          0       0    1
    1  1  0      0          1       0    1
    1  1  1      0          1       1    1

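One-bit Full-Adder Circuit
[Figure: full adder FAi built from two XOR gates, two AND gates and one OR gate, with inputs ai, bi, ci and outputs sumi, ci+1.]
    sumi = (ai XOR bi) XOR ci
    ci+1 = (ai AND bi) OR ((ai XOR bi) AND ci)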

32-bit Ripple-Carry Adder
[Figure: 32 full adders, FA0 through FA31, connected in a chain. FA0 takes a0, b0 and c0; each FAi takes ai, bi and the carry from the previous stage and produces sumi.]
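The half adder, full adder and ripple-carry chain can be modeled bit by bit. The Python sketch below is a behavioral model, not a gate-level netlist. The function names and the LSB-first list layout are assumptions made for illustration.

```python
def half_adder(a: int, b: int):
    """Return (half_sum, carry_out) for two bits."""
    return a ^ b, a & b


def full_adder(a: int, b: int, ci: int):
    """Return (sum, co) built from a half adder, one XOR, one AND and one OR."""
    half_sum, carry_out = half_adder(a, b)
    return half_sum ^ ci, carry_out | (half_sum & ci)


def ripple_carry_add(a_bits, b_bits, c0: int = 0):
    """Add two equal-length bit lists (LSB first); the carry ripples upward."""
    carry = c0
    sums = []
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b, carry)
        sums.append(s)
    return sums, carry


def to_bits(value: int, width: int = 32):
    """Two's-complement bits of value, LSB first."""
    return [(value >> i) & 1 for i in range(width)]


def from_bits(bits):
    """Unsigned value of an LSB-first bit list."""
    return sum(bit << i for i, bit in enumerate(bits))


sums, carry = ripple_carry_add(to_bits(7), to_bits(6))
print(from_bits(sums), carry)
```

Adding `to_bits(7)` and `to_bits(-6)` performs the two's-complement subtraction 7 - 6 with the same chain; the carry out of the last stage is simply discarded.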

How Fast is Ripple-Carry Adder?
• Longest delay path (critical path) runs from cin to sum31.
• Suppose the delay of a full adder is 100ps.
• Critical path delay = 32 × 100ps = 3,200ps
• Clock rate cannot be higher than 1/(3,200×10^-12) Hz = 312.5MHz.
• Must use more efficient ways to handle carry.

Speeding Up the Adder
[Figure: a 16-bit ripple-carry adder adds a0-a15, b0-b15 and cin to produce sum0-sum15. Two more 16-bit ripple-carry adders both add a16-a31 and b16-b31, one with carry-in 0 and the other with carry-in 1. A multiplexer, controlled by the carry out of the low-order adder, selects sum16-sum31 from one of the two.]
This is a carry-select adder.
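A behavioral Python sketch of the carry-select idea follows. It models each 16-bit block with integer addition rather than individual gates, and the names `add_block` and `carry_select_add32` are hypothetical.

```python
def add_block(a: int, b: int, carry_in: int, width: int = 16):
    """Model one `width`-bit adder block: return (sum, carry_out)."""
    mask = (1 << width) - 1
    total = (a & mask) + (b & mask) + carry_in
    return total & mask, total >> width


def carry_select_add32(a: int, b: int, cin: int = 0):
    """Add two 32-bit values using one low block and two speculative high blocks."""
    low_sum, low_carry = add_block(a, b, cin)
    # Both high-order candidates are computed without waiting for low_carry.
    hi_sum0, cout0 = add_block(a >> 16, b >> 16, 0)
    hi_sum1, cout1 = add_block(a >> 16, b >> 16, 1)
    # The multiplexer: the low block's carry out picks the valid candidate.
    hi_sum, cout = (hi_sum1, cout1) if low_carry else (hi_sum0, cout0)
    return (hi_sum << 16) | low_sum, cout


print(hex(carry_select_add32(0x0000FFFF, 1)[0]))
```

In hardware the three blocks work in parallel, so the delay is roughly that of one 16-bit ripple-carry adder plus the multiplexer, at the cost of a third adder block.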

Fast Adders
• In general, any output of a 32-bit adder can be evaluated as a logic expression in terms of all 65 inputs.
• Number of levels of logic can be reduced to log_2 N for an N-bit adder. Ripple-carry has N levels.
• More gates are needed, about log_2 N times that of the ripple-carry design.
• Fastest design is known as the carry lookahead adder.

N-bit Adder Design Options
Reference: J. L. Hennessy and D. A. Patterson, Computer Architecture: A Quantitative Approach, Second Edition, San Francisco, California, 1990.

MIPS Instructions (see p. 175)
• Arithmetic: add, sub, addi, addu, subu, addiu, mfc0
• Data transfer: lw, sw, lhu, sh, lbu, sb, lui
• Logical: and, or, nor, andi, ori, sll, srl
• Conditional branch: beq, bne, slt, slti, sltu, sltiu
• Unconditional jump: j, jr, jal

Exception or Interrupt
• If an overflow is detected while executing add, addi or sub, then the address of that instruction is placed in a register called the exception program counter (EPC).
• Instruction mfc0 can copy $epc to any other register, e.g., mfc0 $s1, $epc
• Unsigned operations addu, addiu and subu do not cause an exception or interrupt.

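Multifunction ALU
[Figure: 32 one-bit ALUs, ALU0 through ALU31, chained like the ripple-carry adder, each taking ai, bi and a carry and producing resulti. Inside ALUi a multiplexer controlled by the operation input selects resulti: 0 = AND, 1 = OR, 2 = NOR, 3 = full adder FAi (inputs ai, bi, ci).]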

Binary Multiplication (Unsigned)
          1 0 0 0 two = 8ten    multiplicand
        × 1 0 0 1 two = 9ten    multiplier
          _______
          1 0 0 0
        0 0 0 0                 partial
      0 0 0 0                   products
    1 0 0 0
    _____________
    1 0 0 1 0 0 0 two = 72ten

Multiplication Flowchart
1. Start: initialize product register to 0; partial product number i = 1.
2. Test LSB of multiplier. If it is 1, add multiplicand to product and place result in product register. If it is 0, skip the addition.
3. Left shift multiplicand register 1 bit.
4. Right shift multiplier register 1 bit.
5. If i < 32, set i = i + 1 and go back to step 2. If i = 32, done.
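The flowchart translates almost line for line into code. The Python sketch below assumes unsigned operands and uses a hypothetical function name, `serial_multiply`; the `bits` parameter defaults to 32 as in the flowchart.

```python
def serial_multiply(multiplicand: int, multiplier: int, bits: int = 32) -> int:
    """Shift-and-add multiplication of two unsigned `bits`-bit integers."""
    wide_mask = (1 << (2 * bits)) - 1
    product = 0                                   # product register starts at 0
    mcand = multiplicand & ((1 << bits) - 1)      # held in a 2*bits-wide register
    mplier = multiplier & ((1 << bits) - 1)
    for _ in range(bits):                         # i = 1 .. bits
        if mplier & 1:                            # test LSB of multiplier
            product = (product + mcand) & wide_mask
        mcand = (mcand << 1) & wide_mask          # left shift multiplicand
        mplier >>= 1                              # right shift multiplier
    return product


print(serial_multiply(8, 9))
```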

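Serial Multiplication
[Figure: the multiplicand (expanded to 64 bits) sits in a register that shifts left, and the 32-bit multiplier in a register that shifts right. A 64-bit ALU adds the multiplicand to the 64-bit product register. The LSB of the multiplier is tested 32 times; a 1 enables the write to the product register.]
• 3 operations per bit: shift right, shift left, add
• Needs a 64-bit ALU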

Serial Multiplication (Improved)
[Figure: the 64-bit product register is initialized with 00000 . . . 00000 in its left half and the 32-bit multiplier in its right half, and it shifts right. A 32-bit ALU adds the 32-bit multiplicand to the left half. The LSB of the product register is tested 32 times; a 1 enables the write.]
• 2 operations per bit: shift right, add

Example: 0010two × 0011two
0010two × 0011two = 0110two, i.e., 2ten × 3ten = 6ten
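The improved scheme, in which the multiplier shares the product register, can be traced with the Python sketch below. The function name `serial_multiply_improved` is hypothetical, and the sketch assumes the adder's carry out is kept and shifted into the left half, which Python's unbounded integers provide for free.

```python
def serial_multiply_improved(multiplicand: int, multiplier: int, bits: int = 4) -> int:
    """Unsigned multiply with a single 2*bits-wide product register."""
    mask = (1 << bits) - 1
    # Left half starts at 0, right half holds the multiplier.
    product = multiplier & mask
    for step in range(bits):
        if product & 1:                                   # test LSB of product register
            product += (multiplicand & mask) << bits      # add to the left half only
        product >>= 1                                     # shift the whole register right
        print("step", step + 1, format(product, "0{}b".format(2 * bits)))
    return product


result = serial_multiply_improved(0b0010, 0b0011)
print(format(result, "04b"))
```

Each iteration needs only an add and a right shift, because the shift both exposes the next multiplier bit and aligns the partial sum.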

Signed Multiplication
• Convert numbers to magnitudes.
• Multiply the two magnitudes through 32 iterations.
• Negate the result if the signs of the multiplicand and multiplier differed.
• Time of serial multiplication: N additions, O(N) clock cycles for N-bit integers.
• A better method: Booth’s Algorithm.
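The magnitude-based procedure above can be sketched in a few lines of Python. The function name `signed_multiply` is hypothetical, and the inner loop is a plain shift-and-add over the multiplier bits standing in for the serial hardware.

```python
def signed_multiply(a: int, b: int, bits: int = 32) -> int:
    """Multiply signed integers by multiplying magnitudes and fixing the sign."""
    negative = (a < 0) != (b < 0)        # signs differ, so the result is negative
    mag_a, mag_b = abs(a), abs(b)        # convert numbers to magnitudes
    product = 0
    for i in range(bits):                # one iteration per multiplier bit
        if (mag_b >> i) & 1:
            product += mag_a << i
    return -product if negative else product


print(signed_multiply(7, -3), signed_multiply(-7, -3))
```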

Booth Multiplier Algorithm
• A. D. Booth, “A Signed Binary Multiplication Technique,” Quarterly Journal of Mechanics and Applied Math., vol. 4, pt. 2, pp. 236-240, 1951.
• Direct multiplication of positive and negative integers using two’s complement addition.

A Multiplication Trick
• Consider decimal multiplication:
                  4 5 7
          × 9 9 9 0 1
          _________
                  4 5 7
          4 1 1 3
        4 1 1 3
      4 1 1 3
    _______________
    4 5 6 5 4 7 5 7
  Three additions
• Operations for each digit of multiplier:
    Do nothing if the digit is 0
    Shift left, i.e., multiply by some power of 10
    Multiply by the digit, i.e., by a number between 1 and 9

What is the Trick?
• Examine multiplier: 99901 = 100000 - 100 + 1
• Multiply as follows:
    457 × 100000 =  45700000
    457 × (-100) =    -45700    subtraction
                    45654300
    457 × 1      =       457    addition
                    45654757
• Reduced from three to two operations.

Booth Algorithm: Basic Idea
• Consider a multiplier, 00011110 (30).
• We can write 30 = 32 - 2, or
      00100000       (32) =  2^5
    + 11111110       (-2) = -2^1
      00011110(0)     30
• Interpret multiplier (scan right to left), check bit-pairs:
    kth bit is 1, (k-1)th bit is 0: multiplier contains the -2^k term
    kth bit is 0, (k-1)th bit is 1: multiplier contains the +2^k term
    kth bit is 1, (k-1)th bit is 1: -2^k is absent in multiplier
    kth bit is 0, (k-1)th bit is 0: +2^k is absent in multiplier
• Product: M×30 = M×2^5 - M×2^1, where M is the multiplicand.
• Multiplication by 2^k means a k-bit left shift.

Booth Algorithm: Example 1
• 7 × 3 = 21
        0111       multiplicand = 7
      × 0011(0)    multiplier = 3
    11111001       bit-pair 10, add -7 in two’s com.
                   bit-pair 11, do nothing
    000111         bit-pair 01, add 7
                   bit-pair 00, do nothing
    00010101       21

Booth Algorithm: Example 2
• 7 × (-3) = -21
        0111       multiplicand = 7
      × 1101(0)    multiplier = -3
    11111001       bit-pair 10, add -7 in two’s com.
    0000111        bit-pair 01, add 7
    111001         bit-pair 10, add -7 in two’s com.
                   bit-pair 11, do nothing
    11101011       -21

Booth Algorithm: Example 3
• -7 × 3 = -21
        1001       multiplicand = -7 in two’s com.
      × 0011(0)    multiplier = 3
    00000111       bit-pair 10, add 7
                   bit-pair 11, do nothing
    111001         bit-pair 01, add -7
                   bit-pair 00, do nothing
    11101011       -21

Booth Algorithm: Example 4
• -7 × (-3) = 21
        1001       multiplicand = -7 in two’s com.
      × 1101(0)    multiplier = -3 in two’s com.
    00000111       bit-pair 10, add 7
    1111001        bit-pair 01, add -7 in two’s com.
    000111         bit-pair 10, add 7
                   bit-pair 11, do nothing
    00010101       21
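The bit-pair rules and the four worked examples can be reproduced with a short Python sketch. The function names `booth_multiply` and `as_twos` are hypothetical, and Python's signed integers stand in for the sign-extended two's-complement additions of the hardware.

```python
def booth_multiply(multiplicand: int, multiplier: int, bits: int = 4):
    """Booth multiplication of two signed `bits`-bit integers.

    Returns (product, number_of_add_or_subtract_operations).
    """
    m = multiplier & ((1 << bits) - 1)   # two's-complement bits of the multiplier
    product = 0
    operations = 0
    previous = 0                         # implied 0 to the right of the LSB
    for k in range(bits):                # scan right to left
        current = (m >> k) & 1
        if current == 1 and previous == 0:      # pair 10: the -2^k term is present
            product -= multiplicand << k
            operations += 1
        elif current == 0 and previous == 1:    # pair 01: the +2^k term is present
            product += multiplicand << k
            operations += 1
        previous = current                      # pairs 00 and 11: do nothing
    return product, operations


def as_twos(value: int, bits: int = 8) -> str:
    """Format a signed value as a `bits`-bit two's-complement string."""
    return format(value & ((1 << bits) - 1), "0{}b".format(bits))


for mcand, mplier in ((7, 3), (7, -3), (-7, 3), (-7, -3)):
    product, ops = booth_multiply(mcand, mplier)
    print(mcand, "x", mplier, "=", product, as_twos(product), "operations:", ops)
```

A run of consecutive 1s in the multiplier costs two operations regardless of its length, which is where the saving over plain shift-and-add comes from.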

Booth Advantage
Serial multiplication:
               00010100     20
             × 00011110     30
               00000000
              00010100
             00010100
            00010100
           00010100
          00000000
         00000000
        00000000
        _______________
        000001001011000     600
    Four partial product additions
Booth algorithm:
                00010100       20
              × 00011110(0)    30
        111111111101100        bit-pair 10, add -20
        00000010100            bit-pair 01, add 20
        ________________
        0000001001011000       600
    Two partial product additions

ACM Announces Turing Award
ACM has named Frances E. Allen the recipient of the 2006 A.M. Turing Award for "pioneering contributions to the theory and practice of optimizing compiler techniques that laid the foundation for modern optimizing compilers and automatic parallel execution." This award marks the first time that a woman has received this honor.

Faster Multiplication
• Using repeated additions, we need as many clocks as there are bits, say n, in the multiplier.
• Multiplication can be done in one clock. Of course, the clock period will have to be longer, but it may not need to be as long as n times the original period.

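A Simple Design
[Figure: a chain of adders that sums the partial products Mplier0·Mcand through Mplier31·Mcand (32 bits each). Each stage adds the next partial product to the 33-bit output of the previous stage and delivers one low-order product bit (p0, p1, p2, . . .). The last stage delivers p31 and p32…p63. The whole chain must settle within one clock period.]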

Adding Partial Products
                                    y3    y2    y1    y0      Multiplicand
                                    x3    x2    x1    x0      Multiplier
            ______________________________________________
                                    x0y3  x0y2  x0y1  x0y0
    carry ←                   x1y3  x1y2  x1y1  x1y0          Partial
    carry ←             x2y3  x2y2  x2y1  x2y0                Products
    carry ←       x3y3  x3y2  x3y1  x3y0
            ______________________________________________
            p7    p6    p5    p4    p3    p2    p1    p0
Notes: Requires three 4-bit adders. Slow.

Array Multiplier: Carry Forward
                                    y3    y2    y1    y0      Multiplicand
                                    x3    x2    x1    x0      Multiplier
            ______________________________________________
                                    x0y3  x0y2  x0y1  x0y0
                              x1y3  x1y2  x1y1  x1y0          Partial
                        x2y3  x2y2  x2y1  x2y0                Products
                  x3y3  x3y2  x3y1  x3y0
            ______________________________________________
            p7    p6    p5    p4    p3    p2    p1    p0
Note: Carry is added to the next partial product. Adding the carry from the final stage needs an extra stage. These additions are faster, but we need four stages.
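The partial-product array shared by the last two slides can be generated and summed with the Python sketch below. It models only the arithmetic, not the carry wiring of either design, and the function names are hypothetical.

```python
def partial_products(y: int, x: int, bits: int = 4):
    """Return rows of AND terms: rows[i][j] = x_i AND y_j, with weight 2^(i+j)."""
    rows = []
    for i in range(bits):
        xi = (x >> i) & 1
        rows.append([xi & ((y >> j) & 1) for j in range(bits)])
    return rows


def sum_partial_products(rows) -> int:
    """Add the rows, shifting row i left by i bit positions."""
    total = 0
    for i, row in enumerate(rows):
        row_value = sum(bit << j for j, bit in enumerate(row))
        total += row_value << i
    return total


rows = partial_products(0b1000, 0b1001)   # y = 8 (multiplicand), x = 9 (multiplier)
for i, row in enumerate(rows):
    print("x{} row:".format(i), row[::-1])  # printed MSB first, as on the slide
print(sum_partial_products(rows))
```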
