Computer Science 210 Computer Organization

Computer Science 210 Computer Organization
Floating Point Representation

Real Numbers
Format 1: <whole part>.<fractional part>
Examples: 0.25, 3.1415 …
Format 2 (normalized form): <digit>.<fractional part> × 10^<exponent>
Example: 2.5 × 10^-1
In mathematics, real numbers have infinite range and infinite precision ("uncountably infinite").

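math.pi
>>> import math
>>> math.pi
3.141592653589793
>>> print(math.pi)
3.14159265359
>>> print("%.50f" % math.pi)
3.14159265358979311599796346854418516159057617187500
Looks like about 48 places of precision (in base 10), but only the first 16 significant digits agree with π. The remaining digits are the exact decimal expansion of the binary value that is actually stored.
(The shortened print output is Python 2 behavior; Python 3 prints 3.141592653589793.)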

IEEE Standard
Single precision: 32 bits
Double precision: 64 bits
2.5 × 10^-1
Reserve some bits for the significand (the digits to the left of ×) and some for the exponent (the power to the right of ×).
Double precision uses one sign bit, 11 bits for the exponent, and 52 stored bits for the significand (53 bits of precision once the implicit leading 1 is counted).
Approximate double precision range is 10^-308 to 10^308.
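The double precision figures above can be checked from Python itself. This is a minimal sketch, assuming a CPython build on a platform whose `float` is an IEEE 754 double (true on essentially all current hardware, but still an assumption); it only reads the standard `sys.float_info` record.

```python
import sys

info = sys.float_info

# Bits of precision in the significand, counting the implicit leading 1.
print(info.mant_dig)   # 53 on IEEE 754 platforms

# Largest finite double and smallest positive normalized double.
print(info.max)        # about 1.8 x 10^308
print(info.min)        # about 2.2 x 10^-308
```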

IEEE Single Precision Format

  Field:  S | Exponent | Significand
  Bits:   1 |    8     |     23

• 32 bits
• Roughly (-1)^S × F × 2^E
• F is related to the significand
• E is related to the exponent
• Rough range
  • Small fractions: 2 × 10^-38
  • Large values: 2 × 10^38

Fractions in Binary
In general, 2^-N = 1/2^N
0.1₂ = 1 × 2^-1 = 1 × ½ = 0.5₁₀
0.01₂ = 1 × 2^-2 = 1 × ¼ = 0.25₁₀
0.11₂ = 1 × ½ + 1 × ¼ = 0.75₁₀
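The place-value rule for binary fractions can be sketched in a few lines of Python. The function name `binary_fraction_value` is a hypothetical helper introduced for illustration; it takes the digits to the right of the binary point as a string.

```python
def binary_fraction_value(bits):
    """Return the value of the binary fraction 0.<bits> as a float."""
    total = 0.0
    for position, bit in enumerate(bits, start=1):
        if bit == "1":
            # The bit in position N after the point contributes 2^-N.
            total += 2.0 ** -position
    return total

print(binary_fraction_value("1"))    # 0.5
print(binary_fraction_value("01"))   # 0.25
print(binary_fraction_value("11"))   # 0.75
```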

Decimal to Binary Conversion (Whole Numbers)
• While N > 0 do
    Set N to N/2 (whole part)
    Record the remainder (1 or 0)
• Set A to remainders in reverse order

Decimal to Binary - Example
• Example: Convert 324₁₀ to binary

  N     Rem
  324
  162   0
  81    0
  40    1
  20    0
  10    0
  5     0
  2     1
  1     0
  0     1

• 324₁₀ = 101000100₂
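The repeated-division procedure for whole numbers can be written as a short Python function. This is one possible sketch; the name `to_binary` is hypothetical, and the built-in `bin` already does the same job in practice.

```python
def to_binary(n):
    """Convert a non-negative integer to a binary digit string by repeated division."""
    if n == 0:
        return "0"
    remainders = []
    while n > 0:
        # The new N is the whole part of N/2; the remainder is 1 or 0.
        n, rem = divmod(n, 2)
        remainders.append(str(rem))
    # The remainders come out least significant first, so reverse them.
    return "".join(reversed(remainders))

print(to_binary(324))   # 101000100
```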

Decimal to Binary - Fractions
• While N > 0 (or until enough bits) do
    Set N to N*2
    Record the whole number part (1 or 0)
    Set N to the fraction part
• Set bits to the sequence of whole number parts (in the order obtained)

Decimal fraction to binary - Example
• Example: Convert .65625₁₀ to binary

  N         Whole Part
  .65625
  1.31250   1
  0.6250    0
  1.250     1
  0.50      0
  1.0       1

• .65625₁₀ = .10101₂

Decimal fraction to binary - Example
• Example: Convert .45₁₀ to binary

  N      Whole Part
  .45
  0.9    0
  1.8    1
  1.6    1
  1.2    1
  0.4    0
  0.8    0
  1.6    1

• .45₁₀ = .011100110011…₂
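The repeated-doubling procedure for fractions can be sketched as follows. The function name `fraction_to_binary` and the `max_bits` cutoff are hypothetical choices for illustration. The sketch uses `fractions.Fraction` so that the decimal input is held exactly; running the same loop on a Python `float` would instead produce the bits of the already-rounded binary value.

```python
from fractions import Fraction

def fraction_to_binary(decimal_text, max_bits=12):
    """Convert a decimal fraction given as text (e.g. "0.45") to binary digits."""
    n = Fraction(decimal_text)          # exact value of the decimal string
    bits = []
    while n > 0 and len(bits) < max_bits:
        n *= 2
        whole = int(n)                  # whole number part, 1 or 0
        bits.append(str(whole))
        n -= whole                      # keep only the fraction part
    return "." + "".join(bits)

print(fraction_to_binary("0.65625"))    # .10101
print(fraction_to_binary("0.45"))       # .011100110011
```

The first call stops on its own because the fraction part reaches zero; the second stops only because of the `max_bits` limit, since .45 has no finite binary expansion.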

Round-Off Errors
>>> 0.1
0.1
>>> print("%.48f" % 0.1)
0.100000000000000005551115123125782702118158340454
>>> print("%.48f" % 0.25)
0.250000000000000000000000000000000000000000000000
>>> print("%.48f" % 0.3)
0.299999999999999988897769753748434595763683319092
Caused by conversion of decimal fractions to binary: 0.1 and 0.3 have no finite binary expansion, so the nearest representable value is stored, while 0.25 = 2^-2 is stored exactly.
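A short sketch of how these round-off errors surface in ordinary arithmetic, assuming IEEE 754 doubles. `decimal.Decimal`, when given a float, shows the exact value that the float stores.

```python
from decimal import Decimal

# Exact decimal expansion of the double nearest to 0.1.
print(Decimal(0.1))

# Sums of inexact values need not match the inexact value of the sum.
print(0.1 + 0.2 == 0.3)      # False

# Values that are sums of powers of two are stored exactly.
print(0.25 + 0.5 == 0.75)    # True
```

This is why floating point results are usually compared with a tolerance (for example with `math.isclose`) rather than with `==`.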

Scientific Notation - Decimal

  Number           Normalized Scientific
  0.000000001      1.0 × 10^-9
  5,326,043,000    5.326043 × 10^9
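Python's `"e"` format code produces the same normalized decimal form, which gives a quick way to check the table. The helper name `normalized` is hypothetical.

```python
def normalized(value):
    """Format a number as <digit>.<fraction>e<exponent> with six fraction digits."""
    return format(value, ".6e")

print(normalized(0.000000001))     # 1.000000e-09
print(normalized(5326043000.0))    # 5.326043e+09
```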

Floating Point

  Field:  S | Exponent | Significand
  Bits:   1 |    8     |     23

• IEEE Single Precision Standard (32 bits)
• Roughly (-1)^S × F × 2^E
• F is related to the significand
• E is related to the exponent
• Rough range
  • Small fractions: 2 × 10^-38
  • Large values: 2 × 10^38

Floating Point – Exponent Field
• The exponent field comes before the significand for sorting purposes.
• With an 8-bit two's complement exponent, the range would be –128 to 127.
• Note: -1 would be 11111111 and with simple (unsigned) sorting would appear largest.
• For this reason, we take the exponent, add 127, and store the result as an unsigned number. This is called bias 127.
• Then exponent field 11111111 (255) would represent 255 – 127 = 128, and 00000000 (0) would represent 0 – 127 = -127.
• Range of exponents is –127 to 128. (IEEE 754 reserves the all-zeros and all-ones fields for special values such as zero, infinity, and NaN, so normalized numbers use –126 to 127.)
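The bias-127 encoding can be sketched as a pair of small functions. The names `encode_exponent` and `decode_exponent` are hypothetical, and the sketch follows the slide's simplified model in which every field value from 0 to 255 is an ordinary exponent.

```python
BIAS = 127

def encode_exponent(e):
    """Return the 8-bit exponent field (as a bit string) for actual exponent e."""
    field = e + BIAS
    if not 0 <= field <= 255:
        raise ValueError("exponent does not fit in 8 bits with bias 127")
    return format(field, "08b")

def decode_exponent(field_bits):
    """Return the actual exponent stored in an 8-bit exponent field."""
    return int(field_bits, 2) - BIAS

print(encode_exponent(8))             # 10000111
print(encode_exponent(-1))            # 01111110
print(decode_exponent("10000001"))    # 2

# With the bias applied, unsigned ordering of the fields matches exponent ordering.
print(encode_exponent(-1) < encode_exponent(1))   # True
```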

Floating Point – Significand
• Normalized form: 1.1011… × 2^E
• Hidden bit trick: Since the bit to the left of the binary point is always 1, why store it?
• We don't.
• Number = (-1)^S × (1 + Significand) × 2^(E-127)

Floating Point Example: Convert 312.875 to IEEE
Step 1. Convert to binary: 100111000.111
Step 2. Normalize: 1.00111000111 × 2^8
Step 3. Compute biased exponent in binary: 8 + 127 = 135 → 10000111
Step 4. Write the floating point representation: 0 10000111 00111000111000000000000, or 439C7000 in hexadecimal
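The hand conversion can be cross-checked with Python's standard `struct` module, which packs a value as a 32-bit IEEE single. The helper name `float_to_fields` is hypothetical; the sketch assumes the platform's `struct` "f" format is IEEE 754 binary32, which is how the module defines its standard-size formats.

```python
import struct

def float_to_fields(x):
    """Split the single precision encoding of x into sign, exponent, significand, hex."""
    # Pack as a big-endian 32-bit float, then reread the same bytes as an unsigned int.
    (raw,) = struct.unpack(">I", struct.pack(">f", x))
    bits = format(raw, "032b")
    return bits[0], bits[1:9], bits[9:], format(raw, "08X")

sign, exponent, significand, hex_word = float_to_fields(312.875)
print(sign)          # 0
print(exponent)      # 10000111
print(significand)   # 00111000111000000000000
print(hex_word)      # 439C7000
```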

Floating Point Example: Convert IEEE 11000000101000… to decimal
Step 1. Sign bit is 1, so the number is negative
Step 2. Exponent field is 10000001 or 129, so the actual exponent is 129 – 127 = 2
Step 3. Significand is 010000…, so 1 + Significand is 1.0100…
Step 4. Number = (-1)^S × (1 + Significand) × 2^(E-127) = (-1)^1 × 1.010₂ × 2^2 = -101₂ = -5
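The decoding formula can be applied directly in Python. The function name `fields_to_float` is hypothetical. It handles normalized numbers only, and any significand bits not supplied are treated as zeros, which is an assumption about the bits the slide elides (it is consistent with the slide's answer of -5).

```python
def fields_to_float(sign_bit, exponent_bits, significand_bits):
    """Apply (-1)^S x (1 + Significand) x 2^(E-127) to the three fields."""
    s = int(sign_bit)
    e = int(exponent_bits, 2)
    # Read the significand bits as a binary fraction 0.<bits>.
    fraction = int(significand_bits, 2) / 2 ** len(significand_bits)
    return (-1) ** s * (1 + fraction) * 2.0 ** (e - 127)

# Sign 1, exponent field 10000001, significand beginning 0100...
print(fields_to_float("1", "10000001", "0100"))   # -5.0
```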
