Number Representations
• 32-bit signed numbers (2’s complement); the MSB is the leftmost bit and the LSB the rightmost:
    0000 0000 0000 0000 0000 0000 0000 0000_two = 0_ten
    0000 0000 0000 0000 0000 0000 0000 0001_two = +1_ten
    ...
    0111 1111 1111 1111 1111 1111 1111 1110_two = +2,147,483,646_ten
    0111 1111 1111 1111 1111 1111 1111 1111_two = +2,147,483,647_ten   (maxint)
    1000 0000 0000 0000 0000 0000 0000 0000_two = -2,147,483,648_ten   (minint)
    1000 0000 0000 0000 0000 0000 0000 0001_two = -2,147,483,647_ten
    ...
    1111 1111 1111 1111 1111 1111 1111 1110_two = -2_ten
    1111 1111 1111 1111 1111 1111 1111 1111_two = -1_ten
• Converting <32-bit values into 32-bit values
  • copy the most significant bit (the sign bit) into the “empty” bits
      0010 -> 0000 0010
      1010 -> 1111 1010
  • sign extend versus zero extend (lb vs. lbu)
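The two extension rules can be sketched in a few lines of Python. This is only an illustration: the function names and the `from_bits`/`to_bits` parameters are made up for this example and are not MIPS definitions.

```python
def zero_extend(value, from_bits, to_bits=32):
    """Keep the low from_bits bits; the upper bits stay 0 (lbu-style)."""
    return value & ((1 << from_bits) - 1)

def sign_extend(value, from_bits, to_bits=32):
    """Copy the sign bit into every bit above from_bits (lb-style)."""
    value &= (1 << from_bits) - 1
    sign_bit = 1 << (from_bits - 1)
    if value & sign_bit:
        # Set bits from_bits .. to_bits-1 to 1.
        value |= ((1 << to_bits) - 1) & ~((1 << from_bits) - 1)
    return value

# The two 4-bit to 8-bit examples from the slide, plus the zero-extended case.
print(format(sign_extend(0b0010, 4, 8), "08b"))
print(format(sign_extend(0b1010, 4, 8), "08b"))
print(format(zero_extend(0b1010, 4, 8), "08b"))
```

With `to_bits=32` the same functions model what happens to a byte loaded into a 32-bit register.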
MIPS Arithmetic Logic Unit (ALU)
[Figure: ALU with 32-bit inputs A and B, a 4-bit operation select m, a 32-bit result, and 1-bit zero and ovf outputs]
• Must support the Arithmetic/Logic operations of the ISA
    add, addi, addiu, addu
    sub, subu
    mult, multu, div, divu
    sqrt
    and, andi, nor, or, ori, xor, xori
    beq, bne, slt, slti, sltiu, sltu
• With special handling for
  • sign extend: addi, addiu, slti, sltiu
  • zero extend: andi, ori, xori
  • overflow detection: add, addi, sub
Dealing with Overflow
• Overflow occurs when the result of an operation cannot be represented in 32 bits, i.e., when the sign bit contains a value bit of the result and not the proper sign bit
• When adding operands with different signs or when subtracting operands with the same sign, overflow can never occur
• MIPS signals overflow with an exception (aka interrupt): an unscheduled procedure call where the EPC contains the address of the instruction that caused the exception
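The sign-bit rule for addition can be checked in software. The sketch below is a Python model, not the MIPS hardware; `add_with_overflow` and `MASK32` are names chosen for this example.

```python
MASK32 = 0xFFFFFFFF

def add_with_overflow(a, b):
    """Add two 32-bit values; return (32-bit result pattern, overflow flag)."""
    a &= MASK32
    b &= MASK32
    result = (a + b) & MASK32
    sign_a, sign_b, sign_r = a >> 31, b >> 31, result >> 31
    # Overflow is only possible when both operands have the same sign,
    # and it shows up as a result whose sign differs from theirs.
    overflow = sign_a == sign_b and sign_r != sign_a
    return result, overflow

print(add_with_overflow(0x7FFFFFFF, 1))   # maxint + 1
print(add_with_overflow(-1, 1))           # different signs, never overflows
```

Subtraction can reuse the same check by negating the second operand first, which is why operands with the same sign cannot overflow when subtracted.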
Two’s Complement Arithmetic
• Addition is accomplished by adding the codes, ignoring any final carry
• Subtraction: change the sign and add
• 16 + (-23) = ?
• 16 - (-23) = ?
• -23 - (-16) = ?
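The three exercises can be worked mechanically. The sketch below assumes an 8-bit word (the slide does not fix a width) and uses helper names invented for this example.

```python
BITS = 8                      # assumed word size for the illustration
MASK = (1 << BITS) - 1

def encode(n):
    """Two's complement code of a signed integer."""
    return n & MASK

def decode(code):
    """Signed value of a two's complement code."""
    return code - (1 << BITS) if code & (1 << (BITS - 1)) else code

def negate(code):
    """Change the sign: invert all bits and add 1."""
    return (~code + 1) & MASK

def add(a, b):
    """Add the codes and ignore any final carry."""
    return (a + b) & MASK

def sub(a, b):
    """Subtraction: change the sign of b and add."""
    return add(a, negate(b))

for label, code in [
    ("16 + (-23)", add(encode(16), encode(-23))),
    ("16 - (-23)", sub(encode(16), encode(-23))),
    ("-23 - (-16)", sub(encode(-23), encode(-16))),
]:
    print(label, "=", format(code, "08b"), "=", decode(code))
```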
Multiply
• Binary multiplication is just a bunch of right shifts and adds
[Figure: an n-bit multiplicand and an n-bit multiplier form a partial product array that sums to a 2n-bit double precision product]
• The partial products can be formed in parallel and added in parallel for faster multiplication
Multiplication Example
        1011    Multiplicand (11 dec)
      x 1101    Multiplier (13 dec)
        1011    Partial products
       0000
      1011
     1011
    10001111    Product (143 dec)
• Note: if the multiplier bit is 1, copy the multiplicand (at its place value); otherwise the partial product is zero
• Note: need double length result
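Add and Right Shift Multiplier Hardware
[Figure: the multiplicand feeds a 32-bit ALU; the product register, whose low half initially holds the multiplier, is shifted right under Control]
Example with 4-bit operands: multiplicand 0110 = 6, multiplier 0101 = 5
    0000 0101    initial product register (multiplier = 5 in the low half)
    0110 0101    add
    0011 0010    shift right
    0011 0010    (low bit is 0, no add)
    0001 1001    shift right
    0111 1001    add
    0011 1100    shift right
    0011 1100    (low bit is 0, no add)
    0001 1110    shift right, = 30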
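A small Python simulation of the add and right shift scheme, reproducing the 6 x 5 trace. It is a sketch for unsigned operands only; the function name and the `n` parameter (operand width) are assumptions of this example.

```python
def shift_add_multiply(multiplicand, multiplier, n=4):
    """Unsigned n-bit multiply using one 2n-bit product register.

    Returns the product and the register contents after each right shift.
    """
    product = multiplier & ((1 << n) - 1)   # multiplier starts in the low half
    after_shift = []
    for _ in range(n):
        if product & 1:                      # low bit set: add multiplicand
            product += multiplicand << n     # ... into the high half
        product >>= 1                        # shift the whole register right
        after_shift.append(product)
    return product, after_shift

product, trace = shift_add_multiply(0b0110, 0b0101)
print(product)
print([format(t, "08b") for t in trace])
```

Python integers are unbounded, so the carry out of the high half is kept automatically and shifted back in; real hardware needs one extra bit for it.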
Multiplying Negative Numbers
• This does not work!
• Solution 1
  • Convert to positive if required
  • Multiply as above
  • If signs were different, negate answer
• Solution 2
  • Booth’s algorithm
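Solution 1 can be sketched as a thin wrapper around an unsigned multiplier. The code below is illustrative only: `unsigned_multiply` is a stand-in shift-and-add loop written for this example, not the hardware described earlier.

```python
def unsigned_multiply(a, b):
    """Shift-and-add multiply for non-negative integers."""
    product = 0
    while b:
        if b & 1:
            product += a
        a <<= 1
        b >>= 1
    return product

def signed_multiply(a, b):
    """Solution 1: multiply magnitudes, then fix the sign."""
    signs_differ = (a < 0) != (b < 0)
    magnitude = unsigned_multiply(abs(a), abs(b))   # convert to positive
    return -magnitude if signs_differ else magnitude

print(signed_multiply(-6, 5))
print(signed_multiply(-6, -5))
```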
MIPS Multiply Instruction
• Multiply (mult and multu) produces a double precision product
    mult $s0, $s1    # hi||lo = $s0 * $s1
    Instruction fields (op | rs | rt | rd | shamt | funct): 0 | 16 | 17 | 0 | 0 | 0x18
• Low-order word of the product is left in processor register lo and the high-order word is left in register hi
• Instructions mfhi rd and mflo rd are provided to move the product to (user accessible) registers in the register file
• Multiplies are usually done by fast, dedicated hardware and are much more complex (and slower) than adders
MIPS Multiplication
• Two 32-bit registers for product
  • HI: most-significant 32 bits
  • LO: least-significant 32 bits
• Instructions
  • mult rs, rt / multu rs, rt
    • 64-bit product in HI/LO
  • mfhi rd / mflo rd
    • Move from HI/LO to rd
    • Can test HI value to see if product overflows 32 bits
  • mul rd, rs, rt
    • Least-significant 32 bits of product -> rd
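The HI/LO split and the "test HI" overflow check can be modelled as follows. This is a Python sketch of the signed `mult` behaviour; the function names are invented for the example.

```python
MASK32 = 0xFFFFFFFF

def to_signed32(pattern):
    """Interpret a 32-bit pattern as a two's complement integer."""
    pattern &= MASK32
    return pattern - (1 << 32) if pattern & 0x80000000 else pattern

def mult(rs, rt):
    """Signed 32 x 32 multiply; returns the (HI, LO) register patterns."""
    product = to_signed32(rs) * to_signed32(rt)
    product64 = product & 0xFFFFFFFFFFFFFFFF
    return product64 >> 32, product64 & MASK32

def fits_in_32_bits(hi, lo):
    """The signed product fits in LO alone only if HI is LO's sign extension."""
    expected_hi = MASK32 if lo & 0x80000000 else 0
    return hi == expected_hi

hi, lo = mult(0x10000, 0x10000)       # 65536 * 65536 = 2**32
print(hex(hi), hex(lo), fits_in_32_bits(hi, lo))
```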
Division
• Division is just a bunch of quotient digit guesses and left shifts and subtracts
• dividend = quotient x divisor + remainder
[Figure: an n-bit divisor is subtracted from successive partial remainders of the dividend, forming a partial remainder array that yields an n-bit quotient and an n-bit remainder]
Division of Unsigned Binary Integers
                    00001101    Quotient
    Divisor 1011 )  10010011    Dividend
                     1011
                    001110      Partial remainders
                      1011
                      001111
                        1011
                         100    Remainder
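Left Shift and Subtract Division Hardware
[Figure: the divisor feeds a 32-bit ALU that subtracts it from the high half of a register holding the remainder and the dividend/quotient, which is shifted left under Control]
Example with 4-bit operands: dividend 0110 = 6, divisor 0010 = 2
    0000 0110    initial register (dividend = 6 in the low half)
    0000 1100    shift left
    1110 1100    sub: remainder negative, so quotient bit = 0
    0000 1100    restore remainder
    0001 1000    shift left
    1111 1000    sub: remainder negative, so quotient bit = 0
    0001 1000    restore remainder
    0011 0000    shift left
    0001 0001    sub: remainder positive, so quotient bit = 1
    0010 0010    shift left
    0000 0011    sub: remainder positive, so quotient bit = 1
    Result: quotient 0011 = 3 with 0 remainder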
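The left shift and subtract scheme with restoring can be simulated directly. This is a sketch for unsigned operands; `restoring_divide` and its `n` parameter (operand width) are names chosen for this example.

```python
def restoring_divide(dividend, divisor, n=4):
    """Unsigned n-bit restoring division; returns (quotient, remainder)."""
    if divisor == 0:
        raise ZeroDivisionError("software must check the divisor")
    low_mask = (1 << n) - 1
    reg = dividend & low_mask          # high half: remainder, low half: dividend
    for _ in range(n):
        reg <<= 1                      # shift left; new low bit is 0
        trial = (reg >> n) - divisor   # subtract divisor from the high half
        if trial >= 0:
            # Remainder positive: keep it and set the quotient bit to 1.
            reg = (trial << n) | (reg & low_mask) | 1
        # Otherwise "restore" by leaving reg unchanged (quotient bit stays 0).
    return reg & low_mask, reg >> n

print(restoring_divide(0b0110, 0b0010))           # 6 / 2
print(restoring_divide(0b10010011, 0b1011, n=8))  # the long-division example
```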
MIPS Divide Instruction
• Divide (div and divu) generates the remainder in hi and the quotient in lo
    div $s0, $s1    # lo = $s0 / $s1
                    # hi = $s0 mod $s1
    Instruction fields (op | rs | rt | rd | shamt | funct): 0 | 16 | 17 | 0 | 0 | 0x1A
• Instructions mfhi rd and mflo rd are provided to move the quotient and remainder to (user accessible) registers in the register file
• As with multiply, divide ignores overflow, so software must determine if the quotient is too large. Software must also check the divisor to avoid division by 0.
MIPS Division
• Use HI/LO registers for result
  • HI: 32-bit remainder
  • LO: 32-bit quotient
• Instructions
  • div rs, rt / divu rs, rt
  • No overflow or divide-by-0 checking
    • Software must perform checks if required
  • Use mfhi, mflo to access result
Representation of Fractions
• “Binary Point”, like the decimal point, signifies the boundary between integer and fractional parts
• Example 6-bit representation xx.yyyy, with bit weights $2^{1}, 2^{0}$ before the binary point and $2^{-1}, 2^{-2}, 2^{-3}, 2^{-4}$ after it:
    $10.1010_2 = 1 \times 2^{1} + 1 \times 2^{-1} + 1 \times 2^{-3} = 2.625_{10}$
• If we assume a “fixed binary point”, the range of 6-bit representations with this format is 0 to 3.9375 (almost 4)
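Reading a fixed-point pattern amounts to treating the bits as an integer and scaling by the number of fraction bits. A minimal sketch, with a function name made up for this example:

```python
def fixed_point_value(bits, frac_bits):
    """Value of an unsigned bit string with frac_bits bits after the binary point."""
    return int(bits, 2) / (1 << frac_bits)

print(fixed_point_value("101010", 4))   # the 10.1010 example
print(fixed_point_value("111111", 4))   # largest 6-bit value in xx.yyyy format
```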
Fractional Powers of 2
    i     2^-i
    0     1.0                  1
    1     0.5                  1/2
    2     0.25                 1/4
    3     0.125                1/8
    4     0.0625               1/16
    5     0.03125              1/32
    6     0.015625
    7     0.0078125
    8     0.00390625
    9     0.001953125
    10    0.0009765625
    11    0.00048828125
    12    0.000244140625
    13    0.0001220703125
    14    0.00006103515625
    15    0.000030517578125
Example: 0.828125 and 0.1640625 (done in class)
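One way to work these two examples is the repeated doubling method: multiply the fraction by 2, take the integer part as the next bit, and continue with what is left. The sketch below is an illustration (the in-class method is not recorded in the slides); it uses exact rational arithmetic so no rounding creeps in.

```python
from fractions import Fraction

def fraction_to_binary(x, max_bits=32):
    """Binary expansion of a decimal fraction 0 <= x < 1, given as a string."""
    x = Fraction(x)                 # exact value of the decimal string
    digits = []
    while x and len(digits) < max_bits:
        x *= 2
        if x >= 1:                  # integer part is the next bit
            digits.append("1")
            x -= 1
        else:
            digits.append("0")
    return "0." + "".join(digits)

print(fraction_to_binary("0.828125"))
print(fraction_to_binary("0.1640625"))
```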
Representation of Fractions
• So far, in our examples we used a “fixed” binary point. What we really want is to “float” the binary point. Why?
• A floating binary point makes the most effective use of our limited bits (and thus gives more accuracy in our number representation)
• Example: put 0.1640625 into binary. Represent it in 5 bits, choosing where to put the binary point.
    ... 000000.001010100000 ...
  Store these bits (10101) and keep track of the binary point 2 places to the left of the MSB
• Any other solution would lose accuracy!
• With floating point representation, each numeral carries an exponent field recording the whereabouts of its binary point
• The binary point can be outside the stored bits, so very large and small numbers can be represented
Scientific Notation (in Decimal)
    $6.02_{ten} \times 10^{23}$
  (significand 6.02, with its decimal point; radix (base) 10; exponent 23)
• Normalized form: no leading 0s (exactly one digit to left of decimal point)
• Alternatives to representing 1/1,000,000,000
  • Normalized: $1.0 \times 10^{-9}$
  • Not normalized: $0.1 \times 10^{-8}$, $10.0 \times 10^{-10}$
Scientific Notation (in Binary)
    $1.0_{two} \times 2^{-1}$
  (significand 1.0, with its “binary point”; radix (base) 2; exponent -1)
• Computer arithmetic that supports it is called floating point, because it represents numbers where the binary point is not fixed, as it is for integers
• Declare such a variable in C as float
Floating Point Representation
• Normal format: $+1.xxxxxxxxxx_{two} \times 2^{yyyy_{two}}$
• 32-bit version (C “float”)
    bits:     31       30 ... 23     22 ... 0
    field:    S        Exponent      Significand
    width:    1 bit    8 bits        23 bits
• S represents Sign, Exponent represents y’s, Significand represents x’s
Floating Point Representation
• What if result too large?
  • Overflow! Exponent larger than represented in 8-bit Exponent field
• What if result too small?
  • Underflow! Negative exponent larger than represented in 8-bit Exponent field
[Figure: number line with -1, 0, and 1 marked; overflow regions at both far ends and an underflow region around 0]
• What would help reduce chances of overflow and/or underflow?
Double Precision Fl. Pt. Representation
• 64-bit version (C “double”)
    first word:
      bits:     31       30 ... 20     19 ... 0
      field:    S        Exponent      Significand
      width:    1 bit    11 bits       20 bits
    second word:
      Significand (cont’d), 32 bits
• Double Precision (vs. Single Precision)
  • C variable declared as double
  • But primary advantage is greater accuracy due to larger significand
QUAD Precision Fl. Pt. Representation
• Next Multiple of Word Size (128 bits)
  • Unbelievable range of numbers
  • Unbelievable precision (accuracy)
• Currently being worked on (IEEE 754r)
  • Current version has 15 exponent bits and 112 significand bits (113 precision bits)
• Oct-Precision?
  • Some have tried, no real traction so far
• Half-Precision?
  • Yep, that’s for a short (16 bit)
en.wikipedia.org/wiki/Quad_precision
en.wikipedia.org/wiki/Half_precision
IEEE 754 Floating Point Standard
Single Precision (DP similar): S (bit 31, 1 bit) | Exponent (bits 30-23, 8 bits) | Significand (bits 22-0, 23 bits)
• Sign bit: 1 means negative, 0 means positive
• Significand:
  • To pack more bits, leading 1 implicit for normalized numbers
  • 1 + 23 bits single, 1 + 52 bits double
  • always true: 0 ≤ Significand < 1 (for normalized numbers)
• Note: 0 has no leading 1, so reserve exponent value 0 just for number 0
IEEE 754 Floating Point Standard
• IEEE 754 uses “biased exponent” representation
  • Designers wanted FP numbers to be used even if no FP hardware; e.g., sort records with FP numbers using integer compares
  • Wanted bigger (integer) exponent field to represent bigger numbers
  • 2’s complement poses a problem (because negative numbers look bigger)
• Example: $1.0 \times 2^{-1}$ and $1.0 \times 2^{1}$ (done in class)
IEEE 754 Floating Point Standard
Single precision fields: S (bit 31, 1 bit) | Exponent (bits 30-23, 8 bits) | Significand (bits 22-0, 23 bits)
• Called Biased Notation, where bias is number subtracted to get real number
  • IEEE 754 uses bias of 127 for single precision
  • Subtract 127 from Exponent field to get actual value for exponent
  • 1023 is bias for double precision
• Summary (single precision):
    $(-1)^{S} \times (1 + \text{Significand}) \times 2^{(\text{Exponent} - 127)}$
  • Double precision identical, except with exponent bias of 1023 (half, quad similar)
Single-Precision Range
• Exponents 00000000 and 11111111 reserved
• Smallest value
  • Exponent: 00000001, so actual exponent = 1 - 127 = -126
  • Fraction: 000…00, so significand = 1.0
  • $\pm 1.0 \times 2^{-126} \approx \pm 1.2 \times 10^{-38}$
• Largest value
  • Exponent: 11111110, so actual exponent = 254 - 127 = +127
  • Fraction: 111…11, so significand ≈ 2.0
  • $\pm 2.0 \times 2^{+127} \approx \pm 3.4 \times 10^{+38}$
Double-Precision Range
• Exponents 0000…00 and 1111…11 reserved
• Smallest value
  • Exponent: 00000000001, so actual exponent = 1 - 1023 = -1022
  • Fraction: 000…00, so significand = 1.0
  • $\pm 1.0 \times 2^{-1022} \approx \pm 2.2 \times 10^{-308}$
• Largest value
  • Exponent: 11111111110, so actual exponent = 2046 - 1023 = +1023
  • Fraction: 111…11, so significand ≈ 2.0
  • $\pm 2.0 \times 2^{+1023} \approx \pm 1.8 \times 10^{+308}$
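The range figures for both formats follow from the field widths alone. The sketch below recomputes them; `normalized_range` and its parameters are names invented for this example, and the largest value uses the exact all-ones fraction rather than the rounded 2.0.

```python
import struct
import sys

def normalized_range(exp_bits, frac_bits):
    """Smallest and largest positive normalized values for a given format."""
    bias = (1 << (exp_bits - 1)) - 1            # 127 single, 1023 double
    min_exp = 1 - bias                          # exponent field 00...01
    max_exp = ((1 << exp_bits) - 2) - bias      # exponent field 11...10
    smallest = 2.0 ** min_exp                   # significand 1.0
    largest = (2.0 - 2.0 ** -frac_bits) * 2.0 ** max_exp   # fraction all ones
    return smallest, largest

print(normalized_range(8, 23))     # single precision
print(normalized_range(11, 52))    # double precision

# Cross-check against the bit patterns 0x00800000 and 0x7F7FFFFF,
# and against Python's own double-precision limits.
single_min = struct.unpack(">f", bytes.fromhex("00800000"))[0]
single_max = struct.unpack(">f", bytes.fromhex("7f7fffff"))[0]
print(single_min, single_max, sys.float_info.min, sys.float_info.max)
```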
Floating-Point Precision
• Relative precision
  • all fraction bits are significant
  • Single: approx $2^{-23}$
    • Equivalent to $23 \times \log_{10} 2 \approx 23 \times 0.3 \approx 6$ decimal digits of precision
  • Double: approx $2^{-52}$
    • Equivalent to $52 \times \log_{10} 2 \approx 52 \times 0.3 \approx 16$ decimal digits of precision
Floating-Point Example
• Represent -0.75
  • $-0.75 = (-1)^{1} \times 1.1_2 \times 2^{-1}$
  • S = 1
  • Fraction = $1000…00_2$
  • Exponent = -1 + Bias
    • Single: -1 + 127 = 126 = $01111110_2$
    • Double: -1 + 1023 = 1022 = $01111111110_2$
• Single: 1 01111110 1000…00
• Double: 1 01111111110 1000…00
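The encoding of -0.75 can be confirmed by asking the machine for the bit pattern. This sketch uses Python's standard `struct` module; `float_fields` is a helper name made up for the example.

```python
import struct

def float_fields(x, double=False):
    """Return (sign, exponent field, fraction field) of x as IEEE 754 bits."""
    if double:
        (bits,) = struct.unpack(">Q", struct.pack(">d", x))
        exp_bits, frac_bits = 11, 52
    else:
        (bits,) = struct.unpack(">I", struct.pack(">f", x))
        exp_bits, frac_bits = 8, 23
    sign = bits >> (exp_bits + frac_bits)
    exponent = (bits >> frac_bits) & ((1 << exp_bits) - 1)
    fraction = bits & ((1 << frac_bits) - 1)
    return sign, exponent, fraction

s, e, f = float_fields(-0.75)
print(s, format(e, "08b"), format(f, "023b"))    # single precision
s, e, f = float_fields(-0.75, double=True)
print(s, format(e, "011b"), format(f, "052b"))   # double precision
```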
Floating-Point Example
• What number is represented by the single-precision float 1 10000001 01000…00?
  • S = 1
  • Fraction = $01000…00_2$
  • Exponent = $10000001_2$ = 129
• $x = (-1)^{1} \times (1 + .01_2) \times 2^{(129 - 127)} = (-1) \times 1.25 \times 2^{2} = -5.0$
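The decoding steps can be written out as code. The sketch below handles normalized single-precision patterns only and applies the summary formula directly; `decode_single` is a name chosen for this example.

```python
import struct

def decode_single(bits):
    """Value of a normalized single-precision bit pattern (32-bit integer)."""
    sign = (bits >> 31) & 1
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    if exponent in (0, 255):
        raise ValueError("reserved exponent: not a normalized number")
    significand = 1 + fraction / 2**23          # restore the implicit leading 1
    return (-1) ** sign * significand * 2.0 ** (exponent - 127)

pattern = 0b1_10000001_01000000000000000000000  # S=1, Exponent=129, Fraction=.01
print(decode_single(pattern))

# Cross-check against the machine's own interpretation of the same bits.
print(struct.unpack(">f", pattern.to_bytes(4, "big"))[0])
```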
Example: Converting Binary FP to Decimal (done in class)
    0 0110 1000 101 0101 0100 0011 0100 0010
Example: Converting Decimal to FP
    $-2.828125 \times 10^{1}$ (done in class)
Representation for 0
• Represent 0?
  • exponent all zeroes
  • significand all zeroes
  • What about sign? Both cases valid
      +0: 0 00000000 00000000000000000000000
      -0: 1 00000000 00000000000000000000000
Special Numbers
• What have we defined so far? (Single Precision)
    Exponent    Significand    Object
    0           0              0
    0           nonzero        ???
    1-254       anything       +/- fl. pt. #
    255         0              +/- ∞
    255         nonzero        ???
Representation for Not a Number
• What do I get if I calculate sqrt(-4.0) or 0/0?
  • If ∞ is not an error, these shouldn’t be either
  • Called Not a Number (NaN)
  • Exponent = 255, Significand nonzero
• Why is this useful?
  • Hope NaNs help with debugging?
  • They contaminate: op(NaN, X) = NaN
Infinities and NaNs
• Exponent = 111...1, Fraction = 000...0
  • ±Infinity
  • Can be used in subsequent calculations, avoiding need for overflow check
• Exponent = 111...1, Fraction ≠ 000...0
  • Not-a-Number (NaN)
  • Indicates illegal or undefined result
    • e.g., 0.0 / 0.0
  • Can be used in subsequent calculations
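The exponent/fraction rules for the special values can be turned into a small classifier. This is a sketch for single precision; `classify_single` is a name invented for the example, and the exponent-0, nonzero-fraction case is only labelled as unassigned because it is not defined in this section. Note that Python raises an exception for `0.0 / 0.0` instead of returning NaN, so the demonstration produces a NaN with `inf - inf`.

```python
import math

def classify_single(bits):
    """Classify a 32-bit single-precision pattern by its exponent and fraction."""
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    if exponent == 255:
        return "infinity" if fraction == 0 else "NaN"
    if exponent == 0:
        return "zero" if fraction == 0 else "unassigned so far"
    return "normalized"

for pattern in (0x00000000, 0x80000000, 0x3F800000, 0x7F800000, 0x7FC00000):
    print(hex(pattern), classify_single(pattern))

# Infinities and NaNs keep flowing through later calculations.
nan = math.inf - math.inf
print(math.inf + 1.0, nan, nan + 1.0)
```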