Chapter 6-2: Multiplier • Multiplier • Next Lecture: Divider, Floating Point Numbers
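Multiplication of Positive Numbers using the usual algorithm for multiplying integers • The algorithm applies to unsigned numbers and to positive numbers • The product of two n-digit numbers can be accommodated in 2n digits • Binary multiplication of positive operands can be implemented in a purely combinational, two-dimensional logic array
Example (partial products written as 8-bit words, already shifted to their positions):
Multiplicand M: 1101 (13)
Multiplier Q: 1011 (11)
Partial product for q0 = 1: 00001101
Partial product for q1 = 1: 00011010
Partial product for q2 = 0: 00000000
Partial product for q3 = 1: 01101000
Product P: 10001111 (143)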
Multiplier Implementation [Figure: 4 x 4 array of typical cells. The multiplicand bits m3 m2 m1 m0 enter at the top and multiplier bit qi controls row i. The partial products run from PP0 (all 0s) through PP1, PP2 and PP3 to PP4 = p7, p6, ..., p0, the product. Typical cell: the inputs are the bit of incoming partial product PPi, mj, qi and a carry-in; a full adder (FA) produces the carry-out and the bit of outgoing partial product PP(i+1).]
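Array Multiplier [Figure: 4 x 4 array multiplier built from half adders (HA) and full adders (FA). Inputs are the multiplicand bits m3 ... m0 and the multiplier bits q0 ... q3. The three adder rows contain HA FA FA HA, FA FA FA HA and FA FA FA HA. Outputs are the product bits P7 ... P0.]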
Ripple-Carry Array Multiplier [Figure: 4 x 4 ripple-carry array. The top row supplies the bit products m3q0, m2q0, m1q0, m0q0. Each of the three following rows adds the bit products m3qi, m2qi, m1qi, m0qi (i = 1, 2, 3) in four full adders (FA) with a carry-in of 0. Outputs are p7 ... p0.] • For the multiplication operation M × Q = P with 4-bit operands • M: m3m2m1m0 • Q: q3q2q1q0 • P: p7p6p5p4p3p2p1p0 • miqj = mi·qj (the logical AND of mi and qj)
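The M x N Array Multiplier: Critical Path [Figure: array of HA and FA cells (rows HA FA FA HA, FA FA FA HA, FA FA FA HA) with two critical paths marked, Critical Path 1 and Critical Path 2.] • D_mult = [(M - 1) + (N - 2)] D_carry + (N - 1) D_sum + D_and, where M and N are the operand widths in bits and D_carry, D_sum and D_and are the carry, sum and AND-gate delays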
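The delay formula on this slide can be evaluated directly. The sketch below is only an illustration: the function name `array_multiplier_delay` and the gate delays (in nanoseconds) are hypothetical values, not figures from the lecture.

```python
def array_multiplier_delay(m_bits, n_bits, d_carry, d_sum, d_and):
    """Critical-path delay of an M x N array multiplier, per the slide's formula."""
    return ((m_bits - 1) + (n_bits - 2)) * d_carry + (n_bits - 1) * d_sum + d_and


# Hypothetical gate delays in ns, chosen only to make the arithmetic easy to follow.
delay_4x4 = array_multiplier_delay(4, 4, d_carry=1.0, d_sum=1.5, d_and=0.5)
print(delay_4x4)  # 5 carry delays + 3 sum delays + 1 AND delay = 10.0
```

The carry term grows with both operand widths, which is why the critical path of the array lengthens roughly linearly as the operands widen.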
Multiplier Implementation • The main component in each cell is an adder circuit • Each AND gate determines whether a multiplicand bit mj is added to the incoming partial product bit, based on the value of the multiplier bit qi • Each row i (0 ≤ i ≤ 3) where qi = 1 adds the multiplicand, appropriately shifted, to the incoming partial product PPi to generate PP(i+1) • If qi = 0, PPi is passed vertically downward unchanged • PP0 is all 0s • PP4 is the desired product • The multiplicand is shifted left one position per row by the diagonal signal path
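The row-by-row behaviour described above can be sketched in software. This is a behavioural model under stated assumptions, not a gate-level description: the function name `array_multiply` and the use of Python integers for the partial products are choices made for this illustration.

```python
def array_multiply(m, q, n=4):
    """Accumulate PP0..PPn row by row, mirroring the rows of the array."""
    assert 0 <= m < (1 << n) and 0 <= q < (1 << n)
    pp = 0                    # PP0 is all 0s
    rows = [pp]
    for i in range(n):
        q_i = (q >> i) & 1    # multiplier bit that gates row i
        if q_i:
            pp += m << i      # add the multiplicand shifted left i places
        rows.append(pp)       # PP(i+1); equals PPi when q_i == 0
    return pp, rows


product, rows = array_multiply(0b1101, 0b1011)
print(product, rows)  # 143 [0, 13, 39, 39, 143]
```

The repeated 39 in the list is the row where q2 = 0 and the partial product passes down unchanged.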
Another Method of Multiplier Design • The previous algorithm may be impractical for large numbers because it uses many gates • Multiplication can instead be performed using a mixture of combinational array techniques and sequential techniques that requires less combinational logic • In early computers, because of the cost of logic gates, the adder circuitry in the ALU was used to perform multiplication sequentially • This design is called the sequential circuit binary multiplier
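[Figure: register configuration of the sequential multiplier. Carry flip-flop C, register A (initially 0) with bits a(n-1) ... a0, and multiplier register Q with bits q(n-1) ... q0 form one shift-right register. Bit q0 supplies the Add/Noadd control to the control sequencer. A MUX selects either 0 or the multiplicand M (bits m(n-1) ... m0) as one input of the n-bit adder.]
Multiplication example with M = 1101:
Step                      C   A      Q
Initial configuration     0   0000   1011
First cycle, Add          0   1101   1011
First cycle, Shift        0   0110   1101
Second cycle, Add         1   0011   1101
Second cycle, Shift       0   1001   1110
Third cycle, No add       0   1001   1110
Third cycle, Shift        0   0100   1111
Fourth cycle, Add         1   0001   1111
Fourth cycle, Shift       0   1000   1111
Product (A followed by Q): 10001111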
Sequential Circuit Binary Multiplier • This circuit performs multiplication by using a single adder n times to implement the spatial addition performed by the n rows of ripple-carry adders • Registers A and Q combined hold PPi while multiplier bit qi generates the signal Add/Noadd • Add/Noadd controls the addition of the multiplicand M to PPi to generate PP(i+1) • The product is computed in n cycles • The partial product grows in length by 1 bit per cycle from the initial vector PP0 of n 0s in register A • The carry-out from the adder is stored in flip-flop C • At the start, the multiplier is loaded into register Q, the multiplicand into register M, and C as well as A are cleared to 0
Sequential Circuit Binary Multiplier • At the end of each cycle, C, A and Q are shifted right by one bit position to allow for the growth of the partial product as the multiplier is shifted out of register Q • Because of this shifting, multiplier bit qi appears in the LSB position of Q to generate the Add/Noadd signal at the correct time, starting with q0 during the first cycle, q1 during the second cycle, and so on • Suppose the adder has a delay of 10 ns and the control setting and shift operations together take another 10 ns in each cycle, so one cycle takes about 20 ns • A hardwired multiply in a 32-bit word-length computer would then take about 32 × 20 ns = 640 ns • Multiply instructions took much longer to execute than Add instructions in early computers
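The add-and-shift cycle on registers C, A and Q can be simulated as follows. This is a minimal sketch: the function name `sequential_multiply` and the choice to record the register contents after each shift are assumptions made for illustration.

```python
def sequential_multiply(m, q, n=4):
    """Simulate the C/A/Q add-and-shift multiplier for n-bit unsigned operands."""
    mask = (1 << n) - 1
    c, a = 0, 0                      # C and A are cleared at the start
    trace = []
    for _ in range(n):
        if q & 1:                    # LSB of Q gives Add/Noadd
            total = a + m
            c, a = total >> n, total & mask
        # Shift C, A and Q right by one position as a single register.
        q = ((a & 1) << (n - 1)) | (q >> 1)
        a = (c << (n - 1)) | (a >> 1)
        c = 0
        trace.append((c, a, q))      # register contents after the shift
    return (a << n) | q, trace


product, trace = sequential_multiply(0b1101, 0b1011)
print(product)  # 143
for c, a, q in trace:
    print(c, format(a, "04b"), format(q, "04b"))
```

For M = 1101 and Q = 1011 the printed rows are 0 0110 1101, 0 1001 1110, 0 0100 1111 and 0 1000 1111, so A followed by Q ends as 10001111.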
Signed Operand Multiplication • Multiplication of signed operands generates a double length product in the 2's complement number system • Consider the case of a positive multiplier and a negative multiplicand • When we add a negative multiplicand to a partial product, we must extend the sign bit value of the multiplicand to the left as far as the product will extend • The previous hardware can be used for negative multiplicands if it provides for sign extension of the partial products
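Sign Extension of Negative Multiplicand • The negative number must be the multiplicand and the positive number the multiplier
Example (each summand written as a 10-bit word, sign-extended and already shifted to its position):
Multiplicand: 10011 (-13)
Multiplier: 01011 (+11)
Summand for q0 = 1: 1111110011
Summand for q1 = 1: 1111100110
Summand for q2 = 0: 0000000000
Summand for q3 = 1: 1110011000
Summand for q4 = 0: 0000000000
Product: 1101110001 (-143)
The leading 1s in each nonzero summand are the sign extension of the multiplicand.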
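The sign-extension example can be checked with a short sketch. The helper names `sign_extend`, `to_signed` and `multiply_signed_multiplicand` are hypothetical, and the model assumes an n-bit 2's-complement multiplicand and a positive n-bit multiplier, as on the slide.

```python
def sign_extend(pattern, n, width):
    """Extend an n-bit 2's-complement bit pattern to `width` bits."""
    if (pattern >> (n - 1)) & 1:                    # negative: fill upper bits with 1s
        pattern |= ((1 << width) - 1) & ~((1 << n) - 1)
    return pattern


def to_signed(pattern, width):
    """Interpret a `width`-bit pattern as a 2's-complement integer."""
    return pattern - (1 << width) if (pattern >> (width - 1)) & 1 else pattern


def multiply_signed_multiplicand(m_pattern, q_pattern, n):
    """Add the sign-extended multiplicand once for every 1 in the positive multiplier."""
    width = 2 * n
    mask = (1 << width) - 1
    m_ext = sign_extend(m_pattern, n, width)
    product = 0
    for i in range(n):
        if (q_pattern >> i) & 1:
            product = (product + (m_ext << i)) & mask   # keep 2n bits
    return product


p = multiply_signed_multiplicand(0b10011, 0b01011, n=5)
print(format(p, "010b"), to_signed(p, 10))  # 1101110001 -143
```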
Booth Algorithm • A powerful algorithm for signed-number multiplication • Treats positive and negative numbers uniformly • So far, the number of additions equals the number of 1s in the multiplier • Consider a multiplication in which the multiplier is positive and has a single block of 1s (e.g., 0011110_2 = 30_10) • To derive the product, we could add four appropriately shifted versions of the multiplicand (one for each of the four 1s) • We can reduce the number of operations by regarding the multiplier as the difference between two numbers, i.e., 32_10 - 2_10 or 0100000_2 - 0000010_2 • This suggests that the product can be generated by adding 2^5 times the multiplicand to the 2's complement of 2^1 times the multiplicand • The sequence of required operations can be recoded as 0 +1 0 0 0 -1 0
Booth Algorithm • -1 times the shifted multiplicand is selected when the multiplier bits change from 0 to 1 • +1 times the shifted multiplicand is selected when the multiplier bits change from 1 to 0 • The multiplier is scanned from right to left
Normal and Booth Multiplication Schemes • Multiplicand 0101101 (45), multiplier 0011110 (30) • Normal scheme: the multiplier digits are 0 0 +1 +1 +1 +1 0, so four shifted copies of the multiplicand 0101101 are added, at shift positions 1 to 4 • Booth scheme: the recoded multiplier is 0 +1 0 0 0 -1 0, so only two summands are nonzero: the 2's complement of the multiplicand, sign-extended as 1111111010011, at shift position 1, and the multiplicand 0101101 at shift position 5 • Both schemes give the product 00010101000110 (1350)
Booth Recoding of a Multiplier • When the least significant bit is 1, assume an implied 0 lies to its right
Multiplier: 0 0 1 0 1 1 0 0 1 1 1 0 1 0 1 1 0 0
Recoded: 0 +1 -1 +1 0 -1 0 +1 0 0 -1 +1 -1 +1 0 -1 0 0
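The recoding rule can be sketched as a small function that compares each bit with its right-hand neighbour. The names `booth_recode` and `booth_value` are hypothetical, and the multiplier is assumed to be given as a bit string with the most significant bit first.

```python
def booth_recode(bits):
    """Booth-recode a bit string (MSB first) into digits from {-1, 0, +1}."""
    padded = bits + "0"                       # implied 0 to the right of the LSB
    # Digit at each position = (bit on its right) - (bit itself).
    return [int(padded[i + 1]) - int(padded[i]) for i in range(len(bits))]


def booth_value(digits):
    """Weighted sum of the recoded digits (MSB first)."""
    top = len(digits) - 1
    return sum(d * 2 ** (top - i) for i, d in enumerate(digits))


single_block = booth_recode("0011110")
slide_example = booth_recode("001011001110101100")
print(single_block)                # [0, 1, 0, 0, 0, -1, 0]
print(booth_value(single_block))   # 30
```

Applied to the 18-bit multiplier on this slide, the function returns the recoded digit string shown there, and its weighted sum equals the original multiplier value.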
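Booth Multiplication with a Negative Multiplier • Handles both positive and negative multipliers uniformly
Example (each summand written as a 10-bit word, sign-extended and already shifted to its position):
Multiplicand: 01101 (+13)
Multiplier: 11010 (-6), Booth-recoded as 0 -1 +1 -1 0
Summand for digit 0 at position 0: 0000000000
Summand for digit -1 at position 1: 1111100110
Summand for digit +1 at position 2: 0000110100
Summand for digit -1 at position 3: 1110011000
Summand for digit 0 at position 4: 0000000000
Product: 1110110010 (-78)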
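A behavioural sketch of Booth multiplication for this example follows. The function name `booth_multiply` is hypothetical; the model assumes both operands fit in n-bit 2's complement and returns the 2n-bit product pattern.

```python
def booth_multiply(m, q, n):
    """Multiply signed integers m and q (each representable in n bits) by Booth's rule."""
    mask = (1 << (2 * n)) - 1
    q_bits = q & ((1 << n) - 1)       # n-bit 2's-complement pattern of the multiplier
    product = 0
    prev = 0                          # implied 0 to the right of the LSB
    for i in range(n):
        cur = (q_bits >> i) & 1
        digit = prev - cur            # +1 on a 1-to-0 change, -1 on a 0-to-1 change
        product += digit * (m << i)   # add or subtract the shifted multiplicand
        prev = cur
    return product & mask             # 2n-bit 2's-complement pattern


pattern = booth_multiply(13, -6, n=5)
print(format(pattern, "010b"))  # 1110110010, the 10-bit pattern for -78
```

Because the recoded digits already carry the sign information of the multiplier, no separate correction step is needed when the multiplier is negative.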
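Correctness of Booth Technique for Negative Multipliers • Let the leftmost zero of a negative number X be at bit position k • X = 11...10 x(k-1) ... x0 • The value of X is given by V(X) = -2^(k+1) + x(k-1)·2^(k-1) + ... + x0·2^0 • Examples: 11000 (-8) has V(X) = -2^3, and 11001 (-7) has V(X) = -2^3 + 1 • Booth recoding turns the leading 1s into 0 ... 0 -1 with the -1 at position k+1, which contributes exactly -2^(k+1) • For example, 110110_2 (-10_10) is recoded as 0 -1 +1 0 -1 0, whose value is -2^4 + 2^3 - 2^1 = -10_10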
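The term -2^(k+1) in V(X) can be derived from the 2's-complement weights of the leading 1s. In the sketch below, n denotes the word length; that symbol is an assumption introduced here and does not appear on the slide.

```latex
\begin{aligned}
V(X) &= -2^{\,n-1} + \sum_{i=k+1}^{n-2} 2^{\,i} + 0\cdot 2^{\,k} + \sum_{i=0}^{k-1} x_i\,2^{\,i} \\
     &= -2^{\,n-1} + \left(2^{\,n-1} - 2^{\,k+1}\right) + \sum_{i=0}^{k-1} x_i\,2^{\,i} \\
     &= -2^{\,k+1} + \sum_{i=0}^{k-1} x_i\,2^{\,i}
\end{aligned}
```

For X = 110110 (n = 6, k = 3) this gives -2^4 + (2^2 + 2^1) = -10, in agreement with the recoded value.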
Booth Multiplier Recoding Scheme
Bit i   Bit i-1   Version of multiplicand selected by bit i
0       0          0 × M
0       1         +1 × M
1       0         -1 × M
1       1          0 × M
Booth Recoded Multipliers • Achieves some efficiency in the number of additions required when the multiplier has a few large blocks of 1s • Worst-case multiplier: 0101010101010101 is recoded as +1 -1 +1 -1 +1 -1 +1 -1 +1 -1 +1 -1 +1 -1 +1 -1 • Ordinary multiplier: 1100010110111100 is recoded as 0 -1 0 0 +1 -1 +1 0 -1 +1 0 0 0 -1 0 0 • Good multiplier: 0000111110000111 is recoded as 0 0 0 +1 0 0 0 0 -1 0 0 0 +1 0 0 -1
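The three cases can be compared by counting the nonzero digits after recoding, since each nonzero digit costs one addition or subtraction. The function name `booth_additions` is hypothetical; it counts the positions where a bit differs from its right-hand neighbour, with an implied 0 to the right of the LSB.

```python
def booth_additions(bits):
    """Number of nonzero Booth digits for a multiplier given as a bit string (MSB first)."""
    padded = bits + "0"                                   # implied 0 right of the LSB
    return sum(padded[i] != padded[i + 1] for i in range(len(bits)))


multipliers = {
    "worst": "0101010101010101",
    "ordinary": "1100010110111100",
    "good": "0000111110000111",
}
for name, bits in multipliers.items():
    # Compare against the count of 1s, which is the cost without recoding.
    print(name, bits.count("1"), booth_additions(bits))
```

The printed pairs are 8 and 16 for the worst case, 9 and 7 for the ordinary case, and 8 and 4 for the good case, so recoding can double the work when the bits alternate.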
Fast Multiplication: Bit-Pair Recoding of Multipliers • Halves the maximum number of summands • Derived from the Booth algorithm • (+1 -1) is equivalent to (0 +1), because (+1 -1) selects +2M - M = +M, which is what (0 +1) selects • Instead of adding +1 × M at shift position i+1 and -1 × M at shift position i, the same result can be obtained by adding +1 × M at position i • (+1 0) is equivalent to (0 +2) • (-1 +1) is equivalent to (0 -1) • The Booth-recoded multiplier is examined two digits at a time, starting from the right
Multiplier Bit-Pair Recoding
(a) Example of bit-pair recoding derived from Booth recoding
Multiplier 11010 with one bit of sign extension and an implied 0 to the right of the LSB: 1 1 1 0 1 0 (0)
Booth recoding: 0 0 -1 +1 -1 0
Bit-pair recoding: 0 -1 -2
(b) Table of multiplicand selection decisions
Bit i+1   Bit i   Bit i-1 (bit on the right)   Multiplicand selected at position i
0         0       0                             0 × M
0         0       1                            +1 × M
0         1       0                            +1 × M
0         1       1                            +2 × M
1         0       0                            -2 × M
1         0       1                            -1 × M
1         1       0                            -1 × M
1         1       1                             0 × M
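Multiplication Requiring Only n/2 Summands
Example: 01101 (+13) × 11010 (-6), with each summand written as a 10-bit word, sign-extended and already shifted to its position
Booth recoding 0 -1 +1 -1 0 gives five summands:
0000000000
1111100110
0000110100
1110011000
0000000000
Product: 1110110010 (-78)
Bit-pair recoding 0 -1 -2 gives three summands:
1111100110 (-2 × M at position 0)
1111001100 (-1 × M at position 2)
0000000000 (0 × M at position 4)
Product: 1110110010 (-78)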
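The selection table and the example can be reproduced with a short sketch. The names `bit_pair_recode` and `bit_pair_multiply` are hypothetical; the code assumes an n-bit 2's-complement multiplier and adds one bit of sign extension when n is odd, as in the example.

```python
def bit_pair_recode(q, n):
    """Radix-4 digits in {-2..+2}, least significant pair first, for an n-bit signed q."""
    q_bits = q & ((1 << n) - 1)
    if n % 2:                                  # odd width: extend the sign by one bit
        q_bits |= ((q_bits >> (n - 1)) & 1) << n
        n += 1
    digits = []
    prev = 0                                   # implied 0 to the right of the LSB
    for i in range(0, n, 2):
        b0 = (q_bits >> i) & 1
        b1 = (q_bits >> (i + 1)) & 1
        digits.append(-2 * b1 + b0 + prev)     # entry from the selection table
        prev = b1
    return digits


def bit_pair_multiply(m, q, n):
    """Sum one summand per bit pair; each digit selects 0, +-M or +-2M."""
    return sum(d * (m << (2 * j)) for j, d in enumerate(bit_pair_recode(q, n)))


digits = bit_pair_recode(-6, 5)
print(digits[::-1])                  # [0, -1, -2], most significant pair first
print(bit_pair_multiply(13, -6, 5))  # -78
```

Multiplying by 2 is a one-position shift of the multiplicand, so the +2M and -2M selections add no extra adder stages.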
Ripple-Carry Array Disadvantage • Multiplication requires many additions • Using a ripple-carry array is slow • Consider the addition of three n-bit numbers W, X and Y to produce the sum Z • We can first add W to X to generate a number A • Then we can add A to Y to produce Z • This can be done by using two ripple-carry adders
A Different Approach • Instead of adding W to X to produce A in the upper ripple-carry adder, let's introduce the bits of Y into the carry-in inputs of its full adders • This generates the vector S and the saved carries C as the outputs • In the second row, S and C are added in a ripple-carry adder to produce Z • Carry-save addition can speed up this process
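One carry-save step on three numbers can be sketched with bitwise operations. The function name `carry_save_add` is hypothetical, and Python integers stand in for the bit vectors W, X and Y.

```python
def carry_save_add(w, x, y):
    """Reduce three numbers to a sum vector S and a saved-carry vector C."""
    s = w ^ x ^ y                               # per-bit sum output of each full adder
    c = ((w & x) | (w & y) | (x & y)) << 1      # per-bit carry, weighted one position up
    return s, c


s, c = carry_save_add(0b00101101, 0b01011010, 0b10110100)   # 45 + 90 + 180
print(format(s, "08b"), format(c, "08b"))                   # 11000011 01111000
print(s + c)                                                # 315
```

No carry moves sideways inside the step, so every bit of S and C is ready after a single full-adder delay; only the final S + C addition needs carry propagation.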
Carry Save Array [Figure: 4 x 4 carry-save array multiplier. Inputs are the bit products m3qi, m2qi, m1qi, m0qi for i = 0 to 3; three rows of four full adders (FA); outputs are p7 ... p0.] • For the multiplication operation M × Q = P with 4-bit operands • M: m3m2m1m0 • Q: q3q2q1q0 • P: p7p6p5p4p3p2p1p0 • Q: Do you see any saving here?
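Carry-Save Addition Approach
Example: M = 101101 (45), Q = 111111 (63). Every multiplier bit is 1, so the six summands A to F are the multiplicand shifted left by 0 to 5 positions (written here as 12-bit words):
A: 000000101101
B: 000001011010
C: 000010110100
D: 000101101000
E: 001011010000
F: 010110100000
Product: 101100010011 (2,835)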
Complete Example
M = 101101, Q = 111111. All vectors are written as 12-bit words aligned at bit 0, with each carry vector already shifted one position to the left.
Level 1, from A, B, C:
A: 000000101101
B: 000001011010
C: 000010110100
S1: 000011000011
C1: 000001111000
Level 1, from D, E, F:
D: 000101101000
E: 001011010000
F: 010110100000
S2: 011000011000
C2: 001111000000
Level 2, from S1, C1, S2:
S3: 011010100011
C3: 000010110000
Level 3, from S3, C3, C2:
S4: 010111010011
C4: 010101000000
Final addition, S4 + C4:
Product: 101100010011
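Schematic Representation of Carry-Save Addition (C.S.A.) [Figure: tree of carry-save adders for six summands. Level 1: two CSAs reduce A, B, C to S1, C1 and D, E, F to S2, C2. Level 2: one CSA reduces S1, C1, S2 to S3, C3. Level 3: one CSA reduces S3, C3, C2 to S4, C4. A final addition of S4 and C4 produces the product.] • Reducing k summands to two vectors takes approximately 1.7 log2 k - 1.7 carry-save steps, where k is the number of summands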
Example Ripple-Carry vs. Carry-Save • Carry-save addition transforms W, X and Y into S and C • Advantages: all bits of S and C are produced in a short fixed amount of time after W, X, and Y are applied • Each row approximately takes one full-adder delay • Carry propagation takes place only in the last row • Carry lookahead adder could be used effectively to add the S and C vectors because all bits of S and C are available in parallel • Consider the addition of many summands • We can group the summands in threes and perform the carry save addition on each of these groups in parallel to generate S and C • Next, group all the S and C vectors into threes and perform carry save addition on them • Continue this process until there are only two vectors remaining • These remaining vectors can be added in a ripple carry or a carry lookahead adder to produce the sum
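The grouping procedure described above can be sketched as a loop that applies carry-save addition to groups of three until two vectors remain. The names `carry_save_add` and `csa_reduce` are hypothetical, and passing leftover vectors unchanged to the next level is one possible grouping policy, assumed here.

```python
def carry_save_add(w, x, y):
    """One carry-save step: three vectors in, sum and shifted-carry vectors out."""
    return w ^ x ^ y, ((w & x) | (w & y) | (x & y)) << 1


def csa_reduce(summands):
    """Group vectors in threes, level by level, until at most two remain."""
    vectors = list(summands)
    levels = 0
    while len(vectors) > 2:
        full = len(vectors) - len(vectors) % 3       # vectors that form complete groups
        nxt = []
        for i in range(0, full, 3):
            nxt.extend(carry_save_add(*vectors[i:i + 3]))
        nxt.extend(vectors[full:])                   # leftovers pass to the next level
        vectors = nxt
        levels += 1
    return vectors, levels


# Six summands of 45 x 63: the multiplicand shifted left by 0..5 positions.
summands = [45 << i for i in range(6)]
vectors, levels = csa_reduce(summands)
print(vectors, levels)   # [1491, 1344] 3
print(sum(vectors))      # 2835; this last addition is the only carry-propagate step
```

For these six summands the loop runs three carry-save levels and leaves two vectors whose sum is 2,835, the product of 45 and 63.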