CPE 335 Computer Organization
MIPS Arithmetic – Part I
Content from Chapter 3 and Appendix B
Dr. Iyad Jafar
Adapted from Dr. Gheith Abandah's slides: http://www.abandah.com/gheith/Courses/CPE335_S08/index.html
MIPS Number Representations
• Computer programs calculate with both positive and negative numbers.
• One approach is to use the sign-and-magnitude representation.
  • Use a separate bit for the sign.
  • Shortcomings:
    • Where should the sign bit go?
    • There are both a positive and a negative zero.
    • Arithmetic needs complex hardware.
• Alternative: use complement notation, specifically two's complement.
MIPS Number Representations
• 32-bit signed numbers (2's complement), MSB on the left and LSB on the right:
  0000 0000 0000 0000 0000 0000 0000 0000_two = 0_ten
  0000 0000 0000 0000 0000 0000 0000 0001_two = +1_ten
  ...
  0111 1111 1111 1111 1111 1111 1111 1110_two = +2,147,483,646_ten
  0111 1111 1111 1111 1111 1111 1111 1111_two = +2,147,483,647_ten (maxint)
  1000 0000 0000 0000 0000 0000 0000 0000_two = -2,147,483,648_ten (minint)
  1000 0000 0000 0000 0000 0000 0000 0001_two = -2,147,483,647_ten
  ...
  1111 1111 1111 1111 1111 1111 1111 1110_two = -2_ten
  1111 1111 1111 1111 1111 1111 1111 1111_two = -1_ten
• If we use N bits to represent a signed number in two's complement, then
  • The maximum number is 2^(N-1) - 1
  • The minimum number is -2^(N-1)
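The range formulas above can be checked with a short sketch. The helper names `signed_range` and `to_signed` are illustrative assumptions, not part of the slides.

```python
def signed_range(n_bits):
    """Return (min, max) representable in n_bits of two's complement."""
    return -(1 << (n_bits - 1)), (1 << (n_bits - 1)) - 1


def to_signed(pattern, n_bits=32):
    """Interpret an unsigned bit pattern as a two's complement value."""
    pattern &= (1 << n_bits) - 1          # keep only the low n_bits
    if pattern & (1 << (n_bits - 1)):     # sign bit set: value is negative
        return pattern - (1 << n_bits)
    return pattern


print(signed_range(32))
print(to_signed(0x80000000), to_signed(0xFFFFFFFE))
```

The two patterns printed last are the minint row and the -2 row of the table.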
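Review: 2's Complement Binary Representation
• Negate: complement all the bits, then add a 1.
  • Example: 0101 -> complement all the bits -> 1010 -> add a 1 -> 1011
• For 4-bit numbers:
  • -2^3 = 1000
  • -(2^3 - 1) = 1001
  • 2^3 - 1 = 0111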
MIPS Number Representations
• Converting signed numbers to decimal
  • (1111 1110)2 = -1*2^7 + 1*2^6 + 1*2^5 + 1*2^4 + 1*2^3 + 1*2^2 + 1*2^1 + 0*2^0 = -2
• Converting values narrower than 32 bits into 32-bit values
  • Sign extension: copy the most significant bit (the sign bit) into the "empty" bits.
    0010 -> 0000 0010
    1010 -> 1111 1010
  • Zero extension: place zeros in the extended bits.
    0010 -> 0000 0010
    1010 -> 0000 1010
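The two extension rules can be sketched as follows. The function names and the `from_bits`/`to_bits` parameters are assumptions made for illustration; the examples reuse the 4-bit to 8-bit patterns from the slide.

```python
def zero_extend(value, from_bits, to_bits):
    """Keep the low from_bits; the upper bits up to to_bits stay zero."""
    return value & ((1 << from_bits) - 1)


def sign_extend(value, from_bits, to_bits):
    """Copy the sign bit of a from_bits value into the upper bits."""
    value &= (1 << from_bits) - 1
    if value & (1 << (from_bits - 1)):                    # sign bit is 1
        upper = ((1 << to_bits) - 1) & ~((1 << from_bits) - 1)
        value |= upper                                    # fill upper bits with ones
    return value


print(format(sign_extend(0b1010, 4, 8), "08b"))
print(format(zero_extend(0b1010, 4, 8), "08b"))
```

`to_bits` is unused in `zero_extend` and is kept only so both helpers share a signature.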
MIPS Number Representation
• How do we negate a number?
  • There is no special instruction.
  • To compute x = -x, subtract x from zero:
    sub $s0, $zero, $s0
• This is in contrast to complementing a number!
  • x = ~x is the bitwise complement of x.
  • There is no dedicated NOT instruction.
  • Recall the XOR operation: x ⊕ 1 = x'
    addi $t0, $zero, -1    # $t0 = all ones
    xor $s0, $s0, $t0
  • A single nor with $zero produces the same result:
    nor $s0, $s0, $zero
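The difference between negating and complementing can be modeled on 32-bit patterns. This is a minimal sketch; `negate` and `complement` are hypothetical helper names that mimic the instruction sequences above rather than any real API.

```python
MASK32 = 0xFFFFFFFF


def negate(x):
    """Model of sub $s0, $zero, $s0: compute 0 - x, wrapped to 32 bits."""
    return (0 - x) & MASK32


def complement(x):
    """Model of addi $t0, $zero, -1 followed by xor $s0, $s0, $t0."""
    all_ones = -1 & MASK32        # the immediate -1 sign-extends to 32 ones
    return (x & MASK32) ^ all_ones


x = 5
print(hex(negate(x)), hex(complement(x)))
```

For any x, `negate(x)` equals `complement(x) + 1` modulo 2^32, which is the two's complement rule.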
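MIPS Instruction Support for Signed Numbers
• add vs addu, sub vs subu, and addi vs addiu
  • Addition/subtraction is performed in the same manner.
  • The difference is whether an overflow exception is generated: add, sub, and addi raise one on overflow; addu, subu, and addiu do not.
• lb vs lbu
  • lb sign-extends the loaded byte into the additional bits.
  • lbu zero-extends the loaded byte into the additional bits.
• slt vs sltu and slti vs sltiu
  • slt and slti perform a signed comparison (slti compares with a constant).
  • sltu and sltiu perform an unsigned comparison (sltiu compares with a constant).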
Example
• Suppose that
  • $s0 = 1111 1111 1111 1111 1111 1111 1111 1111
  • $s1 = 0000 0000 0000 0000 0000 0000 0000 0001
• What is the value stored in $t0 in the following cases?
  • slt $t0, $s1, $s0
    • Signed comparison
    • $t0 = 0 since $s1 = 1 and $s0 = -1
  • sltu $t0, $s1, $s0
    • Unsigned comparison
    • $t0 = 1 since $s1 = 1 and $s0 = 2^32 - 1
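The example can be reproduced with a small model of the two comparisons. The functions `slt` and `sltu` below are illustrative Python stand-ins for the instructions, operating on 32-bit register patterns.

```python
def to_signed32(x):
    """Interpret a 32-bit pattern as a two's complement value."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x & 0x80000000 else x


def slt(rs, rt):
    """Signed set-on-less-than: 1 if rs < rt as signed values."""
    return 1 if to_signed32(rs) < to_signed32(rt) else 0


def sltu(rs, rt):
    """Unsigned set-on-less-than: 1 if rs < rt as unsigned values."""
    return 1 if (rs & 0xFFFFFFFF) < (rt & 0xFFFFFFFF) else 0


s0 = 0xFFFFFFFF   # -1 signed, 2^32 - 1 unsigned
s1 = 0x00000001
print(slt(s1, s0), sltu(s1, s0))
```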
Binary Addition • Binary addition is simple ! • 0 + 0 = 0 and 0 carry • 0 + 1 = 1 and 0 carry • 1 + 0 = 1 and 0 carry • 1 + 1 = 0 and 1 carry • Add corresponding bits and propagate the carry, if any, to the next bit.
Review: A Full Adder
• A 1-bit full adder takes the inputs A, B, and carry_in and produces S and carry_out:
  • S = A ⊕ B ⊕ carry_in
  • carry_out = A&B | A&carry_in | B&carry_in
• How can we use it to build a 32-bit adder?
• How can we modify it easily to build an adder/subtractor?
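The full-adder equations can be checked exhaustively against ordinary addition. This sketch assumes single-bit integer inputs (0 or 1); the function name `full_adder` is illustrative.

```python
def full_adder(a, b, carry_in):
    """1-bit full adder: returns (sum bit, carry-out bit)."""
    s = a ^ b ^ carry_in
    carry_out = (a & b) | (a & carry_in) | (b & carry_in)
    return s, carry_out


# Print the full truth table.
for a in (0, 1):
    for b in (0, 1):
        for c in (0, 1):
            print(a, b, c, full_adder(a, b, c))
```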
A 32-bit Ripple Carry Adder/Subtractor
[Figure: 32 1-bit full adders (FA) chained from bit 0 to bit 31. Each FA takes Ai and Bi and produces Si; the carries c1, c2, ..., c31 link neighboring stages and c32 = carry_out. The add/sub control line (0 = add, 1 = sub) drives c0 = carry_in and selects the B input of each stage, e.g. B0 if control = 0, !B0 if control = 1.]
• Remember 2's complement is just
  • complement all the bits
  • add a 1 in the least significant bit
• Subtraction is equivalent to adding the negative of the number:
    A    0111        0111
    B  - 0110      + 1001
                   +    1
         0001      1 0001
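A bit-by-bit model of the adder/subtractor may make the role of the control line clearer. This is a sketch under the assumption that the control signal both inverts every B bit and supplies c0; the function name `add_sub` and the `n_bits` parameter are illustrative.

```python
def add_sub(a, b, control, n_bits=32):
    """Ripple-carry adder/subtractor.

    control = 0 computes a + b; control = 1 computes a - b by adding
    the inverted bits of b with a carry-in of 1 (two's complement).
    Returns (n_bits result, final carry-out).
    """
    carry = control                       # c0 = control
    result = 0
    for i in range(n_bits):
        ai = (a >> i) & 1
        bi = ((b >> i) & 1) ^ control     # Bi if control = 0, !Bi if control = 1
        s = ai ^ bi ^ carry
        carry = (ai & bi) | (ai & carry) | (bi & carry)
        result |= s << i
    return result, carry


# The slide's example: 0111 - 0110 on 4 bits.
print(add_sub(0b0111, 0b0110, 1, n_bits=4))
```

The loop visits the bits in order because each stage needs the carry from the previous one, which is exactly what makes the ripple-carry design slow.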
Overflow Detection
• Overflow: the result is too large to represent in 32 bits
• Overflow occurs when
  • adding two positives yields a negative
  • or, adding two negatives gives a positive
  • or, subtracting a negative from a positive gives a negative
  • or, subtracting a positive from a negative gives a positive
• On your own: prove that you can detect overflow by:
  • Carry into MSB XOR carry out of MSB, e.g. for 4-bit signed numbers:
    carries  0 1 1 1            carries  1 0
             0 1 1 1    7                1 1 0 0   -4
           + 0 0 1 1    3              + 1 0 1 1   -5
             1 0 1 0   -6                0 1 1 1    7
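The carry-based test can be tried on the two 4-bit examples above. The sketch below is illustrative only; `add_with_overflow` is a hypothetical name, and it demonstrates the rule rather than proving it.

```python
def add_with_overflow(a, b, n_bits=4):
    """Add two n_bits patterns; flag overflow as carry-in XOR carry-out of the MSB."""
    carry = 0
    carry_into_msb = 0
    result = 0
    for i in range(n_bits):
        if i == n_bits - 1:
            carry_into_msb = carry        # carry entering the sign position
        ai = (a >> i) & 1
        bi = (b >> i) & 1
        result |= (ai ^ bi ^ carry) << i
        carry = (ai & bi) | (ai & carry) | (bi & carry)
    overflow = carry_into_msb ^ carry     # carry now holds the carry out of the MSB
    return result, overflow


print(add_with_overflow(0b0111, 0b0011))   # 7 + 3
print(add_with_overflow(0b1100, 0b1011))   # -4 + -5
```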
MIPS Arithmetic Logic Unit (ALU) • Need to support the logic operations • Need to support arithmetic operations • Need to support the set-on-less-than instruction • Need to support test for equality • Immediates are sign or zero extended outside the ALU with wiring (i.e., no logic needed)
MIPS Arithmetic Logic Unit (ALU)
[Figure: ALU with 32-bit inputs A and B, a 4-bit operation select m, a 32-bit result output, and 1-bit zero and ovf outputs.]
• Must support the Arithmetic/Logic operations of the ISA
  • add, addi, addiu, addu
  • sub, subu, neg
  • mult, multu, div, divu
  • sqrt
  • and, andi, nor, or, ori, xor, xori
  • beq, bne, slt, slti, sltiu, sltu
• With special handling for
  • sign extend – addi, addiu, slti, sltiu
  • zero extend – andi, ori, xori, lbu
  • no overflow detected – addu, addiu, subu, multu, divu, sltiu, sltu
MIPS Arithmetic Logic Unit (ALU)
• Start with a 1-bit ALU.
• The logic instructions AND and OR are easy to implement since they map directly to hardware.
• Perform all possible operations in parallel, then use a multiplexer to select the result based on the instruction type.
• The control signal Operation is issued by the control unit.
MIPS Arithmetic Logic Unit (ALU)
• For the ADD instruction, use a full adder. The CarryIn input will be used later on to expand the 1-bit ALU to n bits.
• Expand the multiplexer inputs and select lines to accommodate the add instruction.
MIPS Arithmetic Logic Unit (ALU)
• For the subtract instruction, we use 2's complement subtraction.
• We need to complement B and add 1.
• Define Binvert to select between B and B', and set CarryIn to 1.
• Combine Binvert and CarryIn into one signal, Bnegate, since they always have the same value.
MIPS Arithmetic Logic Unit (ALU)
• Supporting the NOR operation requires no separate NOR gate.
• Use De Morgan's theorem with the existing AND gate and define the signal Ainvert:
  • (A + B)' = A' . B'
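The De Morgan trick can be checked on single bits. In this sketch `alu_and_path` is a hypothetical model of the AND input of the 1-bit ALU, with Ainvert and Binvert modeled as XORs on the inputs.

```python
def alu_and_path(a, b, ainvert, binvert):
    """AND gate of a 1-bit ALU with optional inversion of each input."""
    a_in = a ^ ainvert      # A' when ainvert = 1, A otherwise
    b_in = b ^ binvert      # B' when binvert = 1, B otherwise
    return a_in & b_in


# With both inverts set, the AND gate produces NOR: A'.B' = (A + B)'
for a in (0, 1):
    for b in (0, 1):
        print(a, b, alu_and_path(a, b, 1, 1))
```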
MIPS Arithmetic Logic Unit (ALU) • Constructing 32-bit ALU • Replicate the 1-bit ALU and connect the CarryIn signals • All cells receive the same control signals
MIPS Arithmetic Logic Unit (ALU)
• Supporting the SLT instruction
  • Expand the multiplexer with one more input.
  • Subtract the two registers and feed the sign bit (the result of bit 31, the MSB) back to the slt input of the least significant bit (LSB).
  • The slt input of the multiplexer is connected to 0 for the remaining bits.
MIPS Arithmetic Logic Unit (ALU) • 32-bit ALU with SLT support.
MIPS Arithmetic Logic Unit (ALU)
• Supporting conditional branch instructions
  • Need to generate a signal that indicates whether the result is zero or not.
  • Simply OR the result bits and take the complement.
  • This signal will be used to make the selection between the branch address and the PC.
• Example of using the Zero signal to select the address for the BEQ instruction
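The Zero signal and its use for BEQ can be sketched as below. The names `zero_flag` and `next_pc`, and the idea of passing the fall-through and branch addresses as plain integers, are assumptions for illustration only.

```python
def zero_flag(result, n_bits=32):
    """OR all result bits together, then complement: 1 only if result is 0."""
    any_bit = 0
    for i in range(n_bits):
        any_bit |= (result >> i) & 1
    return any_bit ^ 1


def next_pc(pc_plus_4, branch_target, rs, rt):
    """BEQ: subtract the registers and pick the branch target when Zero = 1."""
    diff = (rs - rt) & 0xFFFFFFFF
    return branch_target if zero_flag(diff) else pc_plus_4


print(hex(next_pc(0x104, 0x200, 7, 7)))
print(hex(next_pc(0x104, 0x200, 7, 8)))
```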
MIPS Arithmetic Logic Unit (ALU) • Final ALU with overflow detection
MIPS Arithmetic Logic Unit (ALU) • Control signals values and corresponding operations
Improving Addition Performance
• The ripple-carry adder is slow
  • We have to wait until the carry has propagated to the final position before we can read out the addition/subtraction result.
  • Carry generation is associated with two levels of gates at each bit position (Cout_i = A_i.B_i + A_i.Cin_i + B_i.Cin_i).
  • Total delay = gate delay x 2 x number of bits
  • Example: 16-bit adder delay is 32 delay units
Carry-Lookahead Adder • Need fast way to find the carry • Design a separate unit that computes carries for different bits in parallel !
Carry-Lookahead Adder
• In a 4-bit adder, the equations of the carries are
  c1 = (b0 . c0) + (a0 . c0) + (a0 . b0)
  c2 = (b1 . c1) + (a1 . c1) + (a1 . b1)
  c3 = (b2 . c2) + (a2 . c2) + (a2 . b2)
  c4 = (b3 . c3) + (a3 . c3) + (a3 . b3)
• By substitution
  c2 = (a1 . a0 . b0) + (a1 . a0 . c0) + (a1 . b0 . c0)
     + (b1 . a0 . b0) + (b1 . a0 . c0) + (b1 . b0 . c0)
     + (a1 . b1)
  c3 = (b2 . a1 . a0 . b0) + (b2 . a1 . a0 . c0) + (b2 . a1 . b0 . c0)
     + (b2 . b1 . a0 . b0) + (b2 . b1 . a0 . c0) + (b2 . b1 . b0 . c0)
     + (b2 . a1 . b1)
     + (a2 . a1 . a0 . b0) + (a2 . a1 . a0 . c0) + (a2 . a1 . b0 . c0)
     + (a2 . b1 . a0 . b0) + (a2 . b1 . a0 . c0) + (a2 . b1 . b0 . c0)
     + (a2 . a1 . b1)
     + (a2 . b2)
  c4 = ……
• Imagine the equation if the adder is 32 bits!
Carry-Lookahead Adder
• We can reduce the logic cost by factoring the carry equation
  • c_{i+1} = (b_i . c_i) + (a_i . c_i) + (a_i . b_i) = (a_i . b_i) + (a_i + b_i) . c_i = g_i + p_i . c_i
  • g_i = a_i . b_i : carry generate
  • p_i = a_i + b_i : carry propagate
• Carry equations for a 4-bit adder
  • c1 = g0 + p0 . c0
  • c2 = g1 + (p1 . g0) + (p1 . p0 . c0)
  • c3 = g2 + (p2 . g1) + (p2 . p1 . g0) + (p2 . p1 . p0 . c0)
  • c4 = g3 + (p3 . g2) + (p3 . p2 . g1) + (p3 . p2 . p1 . g0) + (p3 . p2 . p1 . p0 . c0)
• The cost is still high for larger adders!
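The expanded generate/propagate equations can be evaluated directly and compared with the rippled carries. The sketch below is a software model only (hardware would compute all carries in parallel); `cla_carries` is an illustrative name.

```python
def cla_carries(a, b, c0, n_bits=4):
    """Return [c0, c1, ..., c_n] from the expanded lookahead equations.

    Each carry depends only on the g/p signals and c0, never on a
    previously computed carry.
    """
    g = [((a >> i) & 1) & ((b >> i) & 1) for i in range(n_bits)]  # generate
    p = [((a >> i) & 1) | ((b >> i) & 1) for i in range(n_bits)]  # propagate
    carries = [c0]
    for i in range(n_bits):
        # c[i+1] = g[i] + p[i].g[i-1] + ... + p[i]...p[1].g[0] + p[i]...p[0].c0
        term = g[i]
        prod = p[i]
        for j in range(i - 1, -1, -1):
            term |= prod & g[j]
            prod &= p[j]
        term |= prod & c0
        carries.append(term)
    return carries


print(cla_carries(0b0111, 0b0011, 0))
```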
Carry-Lookahead Adder- Second level of Abstraction • Assume 16 bit adder that consists of 4 single 4-bit adders with carry-lookahead implementation • We can generate the carries using three levels of gates in parallel • Delay to generate C4 is 3 gates
Carry-Lookahead Adder- Second level of Abstraction
• Need to generate the carry propagate and generate signals at a higher level
• Think of each 4-bit adder block as a single unit that can either generate or propagate a carry.
• Super propagate signals
  • P0 = p3⋅p2⋅p1⋅p0
  • P1 = p7⋅p6⋅p5⋅p4
  • P2 = p11⋅p10⋅p9⋅p8
  • P3 = p15⋅p14⋅p13⋅p12
• Super generate signals
  • G0 = g3+(p3⋅g2)+(p3⋅p2⋅g1)+(p3⋅p2⋅p1⋅g0)
  • G1 = g7+(p7⋅g6)+(p7⋅p6⋅g5)+(p7⋅p6⋅p5⋅g4)
  • G2 = g11+(p11⋅g10)+(p11⋅p10⋅g9)+(p11⋅p10⋅p9⋅g8)
  • G3 = g15+(p15⋅g14)+(p15⋅p14⋅g13)+(p15⋅p14⋅p13⋅g12)
Carry-Lookahead Adder- Second level of Abstraction
• Carry signals at the higher level are
  • C1 = G0 + (P0 ⋅ c0)
  • C2 = G1 + (P1 ⋅ G0) + (P1⋅P0⋅c0)
  • C3 = G2 + (P2 ⋅ G1) + (P2⋅P1⋅G0) + (P2⋅P1⋅P0⋅c0)
  • C4 = G3 + (P3 ⋅ G2) + (P3⋅P2⋅G1) + (P3⋅P2⋅P1⋅G0) + (P3⋅P2⋅P1⋅P0⋅c0)
• Each super carry signal is a two-level implementation in terms of Pi and Gi
• Pi is one level of gates while Gi is two, both expressed in terms of pi and gi
• pi and gi are one level of gates
• Total delay is 2 + 2 + 1 = 5
• The 16-bit CLA is about 6.4 times faster than the 16-bit ripple carry adder (5 gate delays versus 32)
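The delay figures from the slides can be tabulated in a few lines. This sketch only restates the gate-level counts given above (two gate levels per bit for ripple carry, and 1 + 2 + 2 levels for the two-level lookahead); the function names are assumptions.

```python
def ripple_delay(n_bits):
    """Ripple carry: two gate levels per bit position."""
    return 2 * n_bits


def two_level_cla_delay():
    """16-bit two-level carry lookahead, counted in gate levels."""
    pg_levels = 1            # pi and gi
    super_g_levels = 2       # Gi from pi and gi (Pi needs only one level)
    super_carry_levels = 2   # Ci from Pi and Gi
    return pg_levels + super_g_levels + super_carry_levels


ripple = ripple_delay(16)
cla = two_level_cla_delay()
print(ripple, cla, ripple / cla)
```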