Computer Architecture Chapter 3 Instructions: Arithmetic for Computer. Yu-Lun Kuo 郭育倫, Department of Computer Science and Information Engineering, Tunghai University, Taichung, Taiwan R.O.C. sscc6991@gmail.com http://www.csie.ntu.edu.tw/~d95037/
Review: MIPS Organization [Figure: processor and memory. The processor cycles through fetch (PC = PC+4), decode, and execute. It contains a register file of 32 registers ($zero - $ra), each 32 bits wide, with 5-bit src1, src2, and dst addresses; a 32-bit ALU; and a PC with adders for PC+4 and the branch offset. Memory holds \(2^{30}\) words of 32 bits, with a read/write address, read data, and write data; word addresses are the byte addresses 0…0000, 0…0100, 0…1000, 0…1100, …, and bytes within a word are numbered big endian.]
Memory Address Binding • How memory addresses are mapped between high-level language and machine language • Compile-time binding • Load-time binding • Relocate • Execution-time binding • Dynamic linking & dynamic loading
Review: MIPS Addressing Modes • 1. Operand: Register addressing (R-Type: op rs rt rd funct): the word operand is in a register • 2. Operand: Base/Displacement addressing (I-Type: op rs rt offset): the word or byte operand is in memory at base register + offset • 3. Operand: Immediate addressing (I-Type, e.g. addi: op rs rt operand): the operand is inside the instruction • 4. Instruction: PC-relative addressing (op rs rt offset): the branch destination instruction is in memory at Program Counter (PC) + offset • 5. Instruction: Pseudo-direct addressing (J-Type: op jump address): the jump destination instruction is in memory at the address formed by concatenating (||) bits of the Program Counter (PC) with the jump address
Overview • A data type includes a representation and operations • Let’s look at some arithmetic operations: • Addition • Subtraction • Sign Extension • Also look at overflow conditions for addition • Multiplication, division, etc. • Logical operations are also useful: • AND • OR • NOT
Signed and Unsigned Numbers • Humans use base 10, computers use base 2 • Bits are just bits (no inherent meaning) • Conventions define the relationship between bits and numbers • Binary numbers (base 2): 0000 0001 0010 0011 0100 0101 0110 0111 1000 1001 ... • decimal: 0 ... \(2^n - 1\) • Of course it gets more complicated: • Numbers are finite (overflow) • Negative numbers • e.g., no MIPS subi instruction; addi can add a negative number
Signed and Unsigned Numbers • Computer programs calculate both positive and negative numbers • We need a representation that distinguishes the positive from the negative • Obvious solution • Add a separate sign • Conveniently, the sign can be represented in a single bit • The name of this representation is sign and magnitude
Signed and Unsigned Numbers • Sign and magnitude representation has several shortcomings • Not obvious where to put the sign bit • Right or left? • Adders may need an extra step to set the sign • Has both a positive and a negative zero • This leads to problems for inattentive programmers
Signed and Unsigned Numbers • How do we represent negative numbers? • i.e., which bit patterns will represent which numbers? • Do not use sign and magnitude • Use two’s complement representation • Leading 0s mean positive • Leading 1s mean negative • Two’s complement advantage • All negative numbers have a 1 in the most significant bit • Hardware needs to test only this bit (+ or -)
Signed and Unsigned Numbers • 32-bit signed numbers (2’s complement); the leftmost bit is the MSB, the rightmost bit is the LSB:
0000 0000 0000 0000 0000 0000 0000 0000 (two) = 0 (ten)
0000 0000 0000 0000 0000 0000 0000 0001 (two) = +1 (ten)
...
0111 1111 1111 1111 1111 1111 1111 1110 (two) = +2,147,483,646 (ten)
0111 1111 1111 1111 1111 1111 1111 1111 (two) = +2,147,483,647 (ten) (maxint)
1000 0000 0000 0000 0000 0000 0000 0000 (two) = -2,147,483,648 (ten) (minint)
1000 0000 0000 0000 0000 0000 0000 0001 (two) = -2,147,483,647 (ten)
...
1111 1111 1111 1111 1111 1111 1111 1110 (two) = -2 (ten)
1111 1111 1111 1111 1111 1111 1111 1111 (two) = -1 (ten)
• Negating 2: (2) ten = (0000 0000 0000 0000 0000 0000 0000 0010) two; complementing all the bits gives (1111 1111 1111 1111 1111 1111 1111 1101) two; adding 1 gives (-2) ten = (1111 1111 1111 1111 1111 1111 1111 1110) two
Signed and Unsigned Numbers • Converting values narrower than 32 bits into 32-bit values • Copy the most significant bit (the sign bit) into the “empty” bits • 0010 -> 0000 0010 • 1010 -> 1111 1010 (sign extend) • sign extend versus zero extend (lb vs. lbu) • slt vs. slti (set on less than immediate) • sltu (set on less than unsigned) vs. sltiu • Represent positive and negative numbers: the bit pattern x31 x30 … x1 x0 has the value \((x_{31} \times -2^{31}) + (x_{30} \times 2^{30}) + (x_{29} \times 2^{29}) + \dots + (x_{1} \times 2^{1}) + (x_{0} \times 2^{0})\)
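The weighted-sum formula on this slide can be checked with a short Python sketch. Python is used only for illustration, and the function name `twos_complement_value` is a hypothetical helper, not something from the slides.

```python
def twos_complement_value(bits: str) -> int:
    """Interpret a string of '0'/'1' characters as a two's complement number."""
    n = len(bits)
    # The most significant bit has weight -2^(n-1); all other bits are positive.
    value = -int(bits[0]) * (1 << (n - 1))
    for i, b in enumerate(bits[1:], start=1):
        value += int(b) * (1 << (n - 1 - i))
    return value


print(twos_complement_value("0" + "1" * 31))  # maxint
print(twos_complement_value("1" + "0" * 31))  # minint
print(twos_complement_value("1" * 32))        # all ones
```

With 32-bit inputs the three calls correspond to the maxint, minint, and -1 rows of the table of 32-bit signed numbers.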
Sign Extension • To add two numbers, we must represent them with the same number of bits • If we just pad with zeroes on the left:
4-bit 0100 (4) -> 8-bit 00000100 (still 4)
4-bit 1100 (-4) -> 8-bit 00001100 (12, not -4)
• Instead, replicate the MS bit (the sign bit):
4-bit 0100 (4) -> 8-bit 00000100 (still 4)
4-bit 1100 (-4) -> 8-bit 11111100 (still -4)
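A minimal Python sketch of the two ways to widen a value, reproducing the 4-bit to 8-bit examples on this slide. The helper names `sign_extend` and `zero_extend` are assumptions made for illustration.

```python
def sign_extend(value: int, from_bits: int, to_bits: int) -> int:
    """Widen a from_bits-wide pattern to to_bits by replicating the sign bit."""
    low_mask = (1 << from_bits) - 1
    value &= low_mask
    if (value >> (from_bits - 1)) & 1:
        # Sign bit is 1: fill every new upper bit with 1.
        value |= ((1 << to_bits) - 1) & ~low_mask
    return value


def zero_extend(value: int, from_bits: int, to_bits: int) -> int:
    """Widen by padding with zeroes; to_bits is kept only for symmetry."""
    return value & ((1 << from_bits) - 1)


for pattern in (0b0100, 0b1100):
    print(format(pattern, "04b"),
          "zero:", format(zero_extend(pattern, 4, 8), "08b"),
          "sign:", format(sign_extend(pattern, 4, 8), "08b"))
```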
Signed versus Unsigned Comparison • Suppose register $s0 holds the binary number 1111 1111 1111 1111 1111 1111 1111 1111 • Register $s1 holds the binary number 0000 0000 0000 0000 0000 0000 0000 0001 • What are the values of registers $t0 and $t1 after these instructions? • slt $t0, $s0, $s1 # signed • sltu $t1, $s0, $s1 # unsigned • Ans: $t0 = 1 and $t1 = 0, because $s0 is -1 as a signed number but 4,294,967,295 as an unsigned number
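The same comparison can be mimicked in Python as a rough sketch. The functions `slt` and `sltu` below only model the comparison result of the MIPS instructions on 32-bit patterns; `to_signed32` is a hypothetical helper.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(x: int) -> int:
    """Reinterpret a 32-bit pattern as a two's complement signed value."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def slt(a: int, b: int) -> int:
    """Set on less than, treating both patterns as signed."""
    return 1 if to_signed32(a) < to_signed32(b) else 0


def sltu(a: int, b: int) -> int:
    """Set on less than, treating both patterns as unsigned."""
    return 1 if (a & MASK32) < (b & MASK32) else 0


s0 = 0xFFFFFFFF  # 1111 ... 1111
s1 = 0x00000001  # 0000 ... 0001
print("slt :", slt(s0, s1))
print("sltu:", sltu(s0, s1))
```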
MIPS Arithmetic Logic Unit (ALU) [Figure: ALU with 32-bit inputs A and B, a 4-bit operation select m, a 32-bit result, and 1-bit zero and ovf outputs] • Must support the arithmetic/logic operations of the ISA: add, addi, addiu, addu; sub, subu; mult, multu, div, divu; sqrt; and, andi, nor, or, ori, xor, xori; beq, bne, slt, slti, sltiu, sltu • With special handling for • sign extend: addi, addiu, slti, sltiu • zero extend: andi, ori, xori, lbu • no overflow detected: addu, addiu, subu, multu, divu, sltiu, sltu
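Review: 2’s Complement Representation (4-bit) • \(-2^3\) = 1000 • \(-(2^3 - 1)\) = 1001 • \(2^3 - 1\) = 0111 • Negate: complement all the bits, then add a 1 • e.g., for 0101 (5), complementing all the bits gives 1010, and adding a 1 gives 1011 (-5)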
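The negate rule (complement all the bits, then add a 1) can be sketched in Python for an n-bit word. The function name `negate` and the default width of 4 bits are assumptions chosen to match the 4-bit patterns on this slide.

```python
def negate(x: int, n: int = 4) -> int:
    """Two's complement negation of an n-bit pattern: invert, then add 1."""
    mask = (1 << n) - 1
    complemented = ~x & mask          # complement all the bits
    return (complemented + 1) & mask  # and add a 1, discarding any carry out


for pattern in (0b0101, 0b0111, 0b1000):
    print(format(pattern, "04b"), "->", format(negate(pattern), "04b"))
```

Note that negating 1000 yields 1000 again: the most negative number has no positive counterpart in the same number of bits.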
A 32-bit Ripple Carry Adder/Subtractor • Remember 2’s complement is just • complement all the bits • add a 1 in the least significant bit [Figure: 32 1-bit full adders (FA) in a chain. Adder i takes Ai, the carry ci, and either Bi (if control = 0) or !Bi (if control = 1), and produces Si and the carry c(i+1). The add/sub control line (0 = add, 1 = sub) also drives c0 = carry_in; c32 = carry_out.] • Example: A - B with A = 0111 and B = 0110 is computed as 0111 + 1001 + 1 = 1 0001 (carry out 1, result 0001)
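A bit-level Python sketch of the adder/subtractor described on this slide. It assumes the control line is wired both to the B-input inverters and to c0, so that subtraction adds the complement of B plus 1. The names `full_adder` and `add_sub` are hypothetical.

```python
def full_adder(a: int, b: int, carry_in: int):
    """1-bit full adder: returns (sum, carry_out)."""
    s = a ^ b ^ carry_in
    carry_out = (a & b) | (a & carry_in) | (b & carry_in)
    return s, carry_out


def add_sub(a: int, b: int, control: int, n: int = 32):
    """Ripple-carry add (control = 0) or subtract a - b (control = 1)."""
    carry = control  # c0 = carry_in supplies the "+1" when subtracting
    result = 0
    for i in range(n):
        a_i = (a >> i) & 1
        b_i = ((b >> i) & 1) ^ control  # B_i if control = 0, !B_i if control = 1
        s, carry = full_adder(a_i, b_i, carry)
        result |= s << i
    return result, carry  # carry is c_n = carry_out


diff, carry_out = add_sub(0b0111, 0b0110, control=1, n=4)
print(format(diff, "04b"), carry_out)
```

The carry ripples from bit 0 upward, which is why each stage has to wait for the previous one.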
Addition & Subtraction • Just like in grade school (carry/borrow 1s): 0111 + 0110, 0111 - 0110, 0110 - 0101 • Two's complement operations are easy • subtraction using addition of negative numbers: 0111 - 0110 becomes 0111 + 1010 • Overflow (result too large for finite computer word): • e.g., adding two n-bit numbers does not yield an n-bit number: 0111 + 0001 = 1000 • note that the term overflow is somewhat misleading; it does not mean a carry “overflowed”
Overflow • If operands are too big, then the sum cannot be represented as an n-bit 2’s complement number • 01000 (8) + 01001 (9) = 10001 (-15) • 11000 (-8) + 10111 (-9) = 01111 (+15)
Overflow Detection • Overflow • the result is too large to represent in 32 bits • Overflow occurs when • adding two positives yields a negative • adding two negatives gives a positive • subtracting a negative from a positive gives a negative • subtracting a positive from a negative gives a positive • 4-bit examples: 0111 (7) + 0011 (3) = 1010 (-6) • 1100 (-4) + 1011 (-5) = 0111 (7)
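The sign-based rule for addition can be written as a small Python sketch: overflow is flagged when both operands have the same sign and the result has the opposite sign. The function name `add_with_overflow` is an assumption; the examples use 4-bit words.

```python
def add_with_overflow(a: int, b: int, n: int = 32):
    """Add two n-bit two's complement patterns; return (result, overflow)."""
    mask = (1 << n) - 1
    sign_bit = 1 << (n - 1)
    a &= mask
    b &= mask
    result = (a + b) & mask
    # Overflow: the result's sign differs from the sign of both operands.
    overflow = ((a ^ result) & (b ^ result) & sign_bit) != 0
    return result, overflow


for a, b in ((0b0111, 0b0011), (0b1100, 0b1011), (0b0111, 0b1010)):
    result, overflow = add_with_overflow(a, b, n=4)
    print(format(a, "04b"), "+", format(b, "04b"), "=",
          format(result, "04b"), "overflow" if overflow else "ok")
```

Adding a positive and a negative number, as in the third case, can never overflow.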
Tailoring the ALU to the MIPS ISA • Need to support the logic operations (and, nor, or, xor) • Bitwise operations (no carry operation involved) • Need a logic gate for each function and a mux to choose the output • Need to support the set-on-less-than instruction (slt) • Use subtraction to determine if (a - b) < 0 (implies a < b) • Copy the sign bit into the low order bit of the result, set remaining result bits to 0 • Need to support test for equality (bne, beq) • Again use subtraction: (a - b) = 0 implies a = b • Additional logic to “nor” all result bits together • Immediates are sign extended outside the ALU with wiring (i.e., no logic needed)
Logic Operation (*) • Shift Operations • Shifts move all the bits in a word left or right • R-format fields: op, rs, rt, rd, shamt, funct • sll $t2, $s0, 8 # $t2 = $s0 << 8 bits • srl $t2, $s0, 8 # $t2 = $s0 >> 8 bits • The shift operation is implemented by hardware separate from the ALU
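Logic Operation (*) • sll (R-format) • Field values shown: op = 0, rs = unused, rt = 16 ($s0), rd = 10, shamt = 8, funct = 0 (rd = 10 is register $t2, i.e. the encoding of sll $t2, $s0, 8; $s2 would be register 18) • Ex. If register $s0 is 0000 0000 0000 0000 0000 0000 0000 1101 and we execute sll $s2, $s0, 8, what is the value of $s2? • Ans: 0000 0000 0000 0000 0000 1101 0000 0000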
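The shift example can be reproduced in Python as a sketch. Python integers are unbounded, so the result is masked to 32 bits to imitate a register. The function names `sll` and `srl` mirror the MIPS mnemonics but are ordinary helper functions here.

```python
MASK32 = 0xFFFFFFFF


def sll(x: int, shamt: int) -> int:
    """Shift left logical: bits shifted past bit 31 are lost."""
    return (x << shamt) & MASK32


def srl(x: int, shamt: int) -> int:
    """Shift right logical: zeroes enter from the left."""
    return (x & MASK32) >> shamt


s0 = 0b1101
s2 = sll(s0, 8)
print(format(s2, "032b"))
print(format(srl(s2, 8), "032b"))
```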
Arithmetic Logic Unit (ALU) • Built using 4 kinds of hardware components
Logical Operations • Operations on logical TRUE or FALSE • two states -- takes one bit to represent: TRUE=1, FALSE=0
Logic Operation (*) • And (and) instruction • Or (or) instruction • Ex. If register $t2 is 0000 0000 0000 0000 0000 1101 0000 0000 and register $t1 is 0000 0000 0000 0000 0011 1100 0000 0000 • Then after executing and $t0, $t1, $t2 the value of $t0 is 0000 0000 0000 0000 0000 1100 0000 0000
Examples of Logical Operations • AND • useful for clearing bits • AND with zero = 0 • AND with one = no change • OR • useful for setting bits • OR with zero = no change • OR with one = 1 • NOT • unary operation (one argument) • flips every bit • 11000101 AND 00001111 = 00000101 • 11000101 OR 00001111 = 11001111 • NOT 11000101 = 00111010
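These bitwise examples map directly onto Python's `&`, `|`, and `~` operators. The sketch below is illustrative only; the 8-bit mask on NOT is needed because Python integers are not fixed width, and the variable names are assumptions.

```python
value = 0b11000101
mask = 0b00001111

cleared = value & mask    # AND clears the bits where the mask is 0
set_bits = value | mask   # OR sets the bits where the mask is 1
flipped = ~value & 0xFF   # NOT flips every bit; keep only 8 bits

print(format(cleared, "08b"))
print(format(set_bits, "08b"))
print(format(flipped, "08b"))

# The register example: and $t0, $t1, $t2
t1 = 0b0011110000000000
t2 = 0b0000110100000000
t0 = t1 & t2
print(format(t0, "032b"))
```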
Adder (1-bit) • 1-bit full adder: inputs a, b, CarryIn; outputs CarryOut, Sum
a  b  CarryIn | CarryOut  Sum
0  0  0       | 0         0
0  0  1       | 0         1
0  1  0       | 0         1
0  1  1       | 1         0
1  0  0       | 0         1
1  0  1       | 1         0
1  1  0       | 1         0
1  1  1       | 1         1
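The truth table can be regenerated from the usual full-adder equations (Sum is the XOR of the three inputs; CarryOut is 1 when at least two inputs are 1). This is a sketch with a hypothetical function name `full_adder`, not the gate-level design from the slides.

```python
from itertools import product


def full_adder(a: int, b: int, carry_in: int):
    """1-bit full adder: returns (carry_out, sum) in truth-table column order."""
    total = a ^ b ^ carry_in
    carry_out = (a & b) | (a & carry_in) | (b & carry_in)
    return carry_out, total


rows = [(a, b, c) + full_adder(a, b, c) for a, b, c in product((0, 1), repeat=3)]

print("a b CarryIn | CarryOut Sum")
for a, b, c, carry_out, total in rows:
    print(f"{a} {b} {c}       | {carry_out}        {total}")
```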
Adder (4-bit ripple carry adder) Each full adder inputs a Cin, which is the Cout of the previous adder
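Adder (32 bit ALU) [Figure: 32-bit ALU with inputs a, b, CarryIn, and Operation; outputs Result and CarryOut] • 32-bit adder requires 31 carry computations
ALU notation [Figure: ALU symbol with inputs a and b and an ALU operation select; outputs Zero, Result, Overflow, and CarryOut]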
3.4 Multiplication (1/4) • Binary multiplication is just a bunch of right shifts and adds [Figure: an n-bit multiplicand times an n-bit multiplier forms a partial product array, which sums to a 2n-bit double precision product] • The partial products can be formed in parallel and added in parallel for faster multiplication
Multiplication (2/4) • More complicated than addition • accomplished via shifting and addition • More time and more area • Ex. Unsigned multiplication: multiplicand \((1000)_2\) x multiplier \((1011)_2\)
Multiplication (3/4) • The length of the multiplication • n-bit multiplicand • m-bit multiplier • Product is n + m bits long • The n + m bits are required to represent all possible products • We must cope with overflow • Because we frequently want a 32-bit product as the result of multiplying two 32-bit numbers
Multiplication (4/4) • The design mimics the algorithm • We learned in grammar school • Assume • Multiplier: in the 32-bit Multiplier register • 64-bit Product register is initialized to 0 • Write new values into the Product register • 64-bit Multiplicand register • Need to move the multiplicand left one digit each step • Over 32 steps a 32-bit multiplicand would move 32 bits to the left
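The First Multiplication Algorithm • Product register is initialized to 0 • 1. Test Multiplier0: if it is 1, go to step 1a; if it is 0, go to step 2 • 1a. Add the multiplicand to the product and place the result in the Product register • 2. Shift the Multiplicand register left 1 bit • 3. Shift the Multiplier register right 1 bit • 32nd repetition? No: go back to step 1; Yes: done • If each step takes a clock cycle, the algorithm requires almost 100 clock cycles (3 steps x 32 repetitions)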
First Multiply Algorithm (ex. p.180) • Using 4-bit numbers to save space • Multiply \(2_{ten} \times 3_{ten}\) (0010 x 0011)
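A Python sketch of the first multiplication algorithm, run on the 4-bit example 2 x 3. It models the registers as plain integers: a 2n-bit Multiplicand register, an n-bit Multiplier register, and a 2n-bit Product register. The function name `multiply_v1` is an assumption.

```python
def multiply_v1(multiplicand: int, multiplier: int, n: int = 32) -> int:
    """Unsigned shift-and-add multiply: shift multiplicand left, multiplier right."""
    product_mask = (1 << (2 * n)) - 1
    multiplicand &= (1 << n) - 1   # starts in the right half of a 2n-bit register
    multiplier &= (1 << n) - 1
    product = 0                    # Product register is initialized to 0
    for _ in range(n):
        if multiplier & 1:                                     # 1.  test Multiplier0
            product = (product + multiplicand) & product_mask  # 1a. add to product
        multiplicand = (multiplicand << 1) & product_mask      # 2.  shift left 1 bit
        multiplier >>= 1                                       # 3.  shift right 1 bit
    return product


print(format(multiply_v1(0b0010, 0b0011, n=4), "08b"))
```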
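The Second Multiplication Algorithm • Multiplicand register, ALU, and Multiplier register are all 32 bits wide • Only the Product register is 64 bits (initial = 0) • The multiplier is placed instead in the right half of the Product register
The Second Multiplication Algorithm • 1. Test Multiplier0: if it is 1, go to step 1a; if it is 0, go to step 2 • 1a. Add the multiplicand to the left half of the product and place the result in the left half of the Product register • 2. Shift the Product register right 1 bit • 3. Shift the Multiplier register right 1 bit • 32nd repetition? No: go back to step 1; Yes: done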
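A Python sketch of the refined algorithm, assuming the variant in which the multiplier sits in the right half of the Product register, so that one right shift of the product also exposes the next multiplier bit. The function name `multiply_v2` is hypothetical, and the carry out of the left-half addition is kept in an extra bit before the shift.

```python
def multiply_v2(multiplicand: int, multiplier: int, n: int = 32) -> int:
    """Unsigned multiply with an n-bit adder and a shifting 2n-bit product."""
    mask = (1 << n) - 1
    multiplicand &= mask
    product = multiplier & mask  # left half = 0, right half = multiplier
    for _ in range(n):
        if product & 1:                   # 1.  test the current multiplier bit
            product += multiplicand << n  # 1a. add into the left half (carry kept)
        product >>= 1                     # 2.  shift the Product register right
    return product


print(format(multiply_v2(0b0010, 0b0011, n=4), "08b"))
```

Compared with the first version, the multiplicand never moves, so only an n-bit adder is needed.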
Multiply in MIPS • MIPS has two instructions • Multiply: mult • Multiply unsigned: multu • MIPS multiply instructions ignore overflow • Up to the software to check to see if the product is too big to fit in 32 bits
3.5 Division • Division is just a bunch of quotient digit guesses and left shifts and subtracts [Figure: an n-bit divisor is divided into the dividend; a partial remainder array produces an n-bit quotient and an n-bit remainder]
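The slide does not spell out a division algorithm, so the following is only one possible sketch of "guess a quotient digit, shift left, subtract" for unsigned n-bit operands. The function name `divide_unsigned` and the choice of this particular shift-subtract scheme are assumptions, not the method from the lecture.

```python
def divide_unsigned(dividend: int, divisor: int, n: int = 32):
    """Unsigned shift-and-subtract division; returns (quotient, remainder)."""
    if divisor == 0:
        raise ZeroDivisionError("software must check the divisor first")
    quotient = 0
    remainder = 0
    for i in range(n - 1, -1, -1):
        # Shift the partial remainder left and bring down the next dividend bit.
        remainder = (remainder << 1) | ((dividend >> i) & 1)
        quotient <<= 1
        if remainder >= divisor:  # guess quotient digit 1 if the divisor fits
            remainder -= divisor
            quotient |= 1
    return quotient, remainder


print(divide_unsigned(0b0111, 0b0010, n=4))
```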
Division • MIPS has two instructions • Divide: div • Divide unsigned: divu • As with multiply, divide ignores overflow • Software must determine if the quotient is too large • Software must also check the divisor to avoid division by 0
3.6 Floating Point (p.189) • We need a way to represent • numbers with fractions, e.g., 3.14159265 (π) • very small numbers, e.g., .000000001 • very large numbers, e.g., \(3.15576 \times 10^{9}\)
Floating Point • Representation • sign, exponent, significand: \((-1)^{\text{sign}} \times \text{significand} \times 2^{\text{exponent}}\) • more bits for significand gives more accuracy • more bits for exponent increases range • IEEE 754 floating point standard: • single precision: 8 bit exponent, 23 bit significand • double precision: 11 bit exponent, 52 bit significand
Representing Big (and Small) Numbers • What if we want to encode the approx. age of the earth? 4,600,000,000 or \(4.6 \times 10^{9}\) • or the weight in kg of one a.m.u. (atomic mass unit)? 0.0000000000000000000000000166 or \(1.6 \times 10^{-27}\) • Floating point representation: \((-1)^{\text{sign}} \times F \times 2^{E}\) • Still have to fit everything in 32 bits (single precision): s (1 bit) | E (exponent, 8 bits) | F (fraction, 23 bits)
Floating Point Form • Generally of the form \((-1)^{S} \times F \times 2^{E}\) • Single precision (32 bits): S (1 bit) | Exponent (8 bits) | Significand (23 bits) • Double precision (two 32-bit words): S (1 bit) | Exponent (11 bits) | Significand (20 bits), followed by Significand continued (32 bits)
Floating Point Form • Generally of the form \((-1)^{S} \times F \times 2^{E}\) • Single precision: S (1 bit) | Exponent (8 bits) | Significand (23 bits) • S: the sign of the floating-point number • Exponent: the value of the 8-bit exponent field • Fraction (significand): the 23-bit number • Sign and magnitude • The sign has a separate bit from the rest of the number
Overflow & Underflow • Overflow • Means that the positive exponent is too large to be represented in the exponent field • Underflow • Means that the negative exponent is too large in magnitude to be represented in the exponent field
IEEE 754 floating-point standard (1/2) • These formats go beyond MIPS • They are part of the IEEE 754 floating-point standard • To pack even more bits into the significand • IEEE 754 makes the leading 1 bit of normalized binary numbers implicit • The significand is 24 bits long in single precision • Implied 1 and a 23-bit fraction • The significand is 53 bits long in double precision • Implied 1 and a 52-bit fraction
IEEE 754 floating-point standard (2/2) • 0 has no leading 1 • It is given the reserved exponent value 0 • 000…00 (two) represents 0 • The representation of the rest of the numbers: \((-1)^{S} \times (1 + \text{Fraction}) \times 2^{E}\) • F is stored in normalized form where the msb of the significand is 1 (so there is no need to store it!), called the hidden bit • E specifies the value in the exponent field • If we number the bits of the fraction from left to right s1, s2, s3, …, the value is \((-1)^{S} \times (1 + (s_1 \times 2^{-1}) + (s_2 \times 2^{-2}) + (s_3 \times 2^{-3}) + \dots) \times 2^{E}\)
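The value formula with the hidden bit can be evaluated with a small Python sketch. It takes the sign bit S, the fraction bits s1, s2, s3, … as a string, and the exponent E as an already-decoded integer; how E is encoded in the 8-bit exponent field is outside this sketch. The function name `fp_value` is an assumption.

```python
def fp_value(sign: int, fraction_bits: str, exponent: int) -> float:
    """Evaluate (-1)^S x (1 + s1*2^-1 + s2*2^-2 + ...) x 2^E."""
    significand = 1.0  # the implied (hidden) leading 1
    for i, bit in enumerate(fraction_bits, start=1):
        significand += int(bit) * 2.0 ** -i
    return (-1) ** sign * significand * 2.0 ** exponent


# -1.1 (two) x 2^-1
print(fp_value(1, "1", -1))
# +1.01 (two) x 2^2
print(fp_value(0, "01", 2))
```

Trailing fraction bits that are 0 do not change the value, so a full 23-bit fraction can be passed or only its leading bits.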