Understanding Number Representations in Computer Science: Fixed-Point vs. Floating-Point

Fixed-point and floating-point numbers CS370 Fall 2003

Representations of numbers • Unsigned integers • Signed integers – 1’s and 2’s complement representation • To represent very large and very small numbers, and real numbers in general: • Fixed-point numbers • Floating-point numbers

Base-10 (decimal) arithmetic • Uses the ten digits 0 through 9 • Each column represents a power of 10

Standard binary representation • Uses the two digits 0 and 1 • Each column represents a power of 2

Fixed-point representation • Uses the two digits 0 and 1 • Each column represents a power of 2 • Columns to the right of the binary point represent negative powers of 2 (1/2, 1/4, 1/8, ...)
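The column rule can be sketched in a few lines of Python. This is only an illustration: the helper name fixed_point_value and the use of a text string with an explicit binary point are assumptions made for the example, not part of the course material.

```python
def fixed_point_value(bits: str) -> float:
    """Evaluate a binary string such as '10.01' column by column."""
    whole, _, frac = bits.partition(".")
    value = 0.0
    # Columns left of the binary point are worth 2^0, 2^1, 2^2, ...
    for power, digit in enumerate(reversed(whole)):
        value += int(digit) * 2 ** power
    # Columns right of the binary point are worth 2^-1, 2^-2, ...
    for power, digit in enumerate(frac, start=1):
        value += int(digit) * 2 ** -power
    return value


print(fixed_point_value("10.01"))  # 2.25
```

In a real fixed-point format the position of the binary point is fixed by convention rather than stored, so only the bits themselves occupy memory.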

Addition • Base-10 • Base-2

Range of values in a byte
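As a rough sketch, the ranges an 8-bit byte can cover under the integer representations listed earlier can be computed directly. The function names byte_ranges and twos_complement_value are illustrative assumptions.

```python
def byte_ranges(bits: int = 8) -> dict:
    """Smallest and largest value for each integer representation."""
    half = 2 ** (bits - 1)
    return {
        "unsigned": (0, 2 ** bits - 1),
        # 1's complement has both +0 and -0, so it loses one negative value.
        "ones_complement": (-(half - 1), half - 1),
        "twos_complement": (-half, half - 1),
    }


def twos_complement_value(byte: int) -> int:
    """Interpret an unsigned byte (0..255) as an 8-bit 2's complement number."""
    return byte - 256 if byte & 0x80 else byte


print(byte_ranges()["unsigned"])         # (0, 255)
print(byte_ranges()["twos_complement"])  # (-128, 127)
print(twos_complement_value(0xFF))       # -1
```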

Scientific notation (1) • One billion • 1,000,000,000 • 1 x 10^9 • significand or mantissa: 1 • base or radix: 10 • exponent: 9

Scientific notation (2) • 1999 • 1.999 x 10^3 • significand or mantissa: 1.999 • base or radix: 10 • exponent: 3 • Equivalent forms: 19.99 x 10^2 • 199.9 x 10^1

Practice (base 10) • 258 = 2.58 x 10^2 Mantissa = 2.58 Radix = 10 Exponent = 2 • 24.25 = 2.425 x 10^1 Mantissa = 2.425 Radix = 10 Exponent = 1
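The practice items can be checked with a short sketch that splits a decimal number into a significand with one digit before the decimal point and a power-of-ten exponent. The helper name scientific_base10 is an assumption for illustration. It uses Python's decimal module so that digits such as 24.25 are kept exactly.

```python
from decimal import Decimal


def scientific_base10(text: str):
    """Return (significand, exponent) with one digit before the decimal point."""
    d = Decimal(text)
    exponent = d.adjusted()             # position of the leading digit
    significand = d.scaleb(-exponent)   # shift the decimal point, keep the digits
    return significand, exponent


print(scientific_base10("258"))    # (Decimal('2.58'), 2)
print(scientific_base10("24.25"))  # (Decimal('2.425'), 1)
```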

Base-2 scientific notation • 2.25_ten • 10.01_two • 10.01_two x 2^0 • 1.001_two x 2^1 (normalized) • Numbers are usually normalized, which means that the leading bit is always a 1.
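Normalization can be sketched with the standard library's math.frexp, which returns a fraction in [0.5, 1) and a power of two. Doubling the fraction and lowering the exponent by one gives the "leading bit is 1" form. The helper names normalize_base2 and binary_significand are assumptions made for this example.

```python
import math


def normalize_base2(x: float):
    """Return (significand, exponent) with 1 <= |significand| < 2."""
    if x == 0:
        raise ValueError("zero has no normalized form")
    fraction, exponent = math.frexp(x)  # x == fraction * 2**exponent
    return fraction * 2, exponent - 1


def binary_significand(significand: float, places: int = 3) -> str:
    """Render a significand in [1, 2) as binary text, e.g. '1.001'."""
    frac = significand - 1
    digits = []
    for _ in range(places):
        frac *= 2
        bit = int(frac)          # next bit after the binary point
        digits.append(str(bit))
        frac -= bit
    return "1." + "".join(digits)


sig, exp = normalize_base2(2.25)
print(binary_significand(sig), exp)  # 1.001 1
```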

8-bit floating point format (1)

Improvements • Bias the exponent • Always subtract a fixed amount, e.g., 3, from the stored exponent • Allows representation of negative exponents • Implicit one • The leading one in a phone number such as 1-619-556-0231 is redundant • Why use a bit for the leading one?

8-bit floating-point format (2) • Exponent (3 bits) is biased by 3 • The leading one of significand is implicit • Zero is represented by all zeros
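A decoder for this toy format might look like the sketch below. The 3-bit exponent, the bias of 3, the implicit leading one, and all-zeros meaning zero come from the slide. The bit layout is an assumption made for the example: 1 sign bit, then the 3 exponent bits, then 4 significand bits. The function name decode_float8 is likewise hypothetical.

```python
EXP_BITS = 3
FRAC_BITS = 4   # assumption: the remaining bits after sign and exponent
BIAS = 3


def decode_float8(byte: int) -> float:
    """Decode a byte laid out as sign | exponent (3) | significand (4)."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected a value in 0..255")
    if byte == 0:
        return 0.0                                    # all zeros means zero
    sign = -1.0 if (byte >> 7) & 1 else 1.0
    stored_exponent = (byte >> FRAC_BITS) & (2 ** EXP_BITS - 1)
    exponent = stored_exponent - BIAS                 # remove the bias
    fraction = byte & (2 ** FRAC_BITS - 1)
    significand = 1 + fraction / 2 ** FRAC_BITS       # restore the implicit one
    return sign * significand * 2.0 ** exponent


# 2.25 = 1.001_two x 2^1: stored exponent 1 + 3 = 100_two, fraction bits 0010
print(decode_float8(0b0_100_0010))  # 2.25
```

With a bias of 3, the stored exponents 0 through 7 stand for the actual exponents -3 through 4, which is how the format reaches values smaller than 1.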

IEEE standard floating-point • Single precision: 32 bits • sign: 1 bit • exponent: 8 bits • significand: 23 bits • bias: 127 • Double precision: 64 bits • sign: 1 bit • exponent: 11 bits • significand: 52 bits • bias: 1023
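The single-precision field widths can be inspected with Python's struct module, which packs a float into its 32-bit IEEE encoding. This is a minimal sketch. The function name float32_fields is an assumption, and special values (zero, infinities, NaN, denormals) are not handled separately.

```python
import struct


def float32_fields(x: float):
    """Split a value's single-precision encoding into (sign, exponent, fraction)."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                 # 1 bit
    exponent = (bits >> 23) & 0xFF    # 8 bits, stored with a bias of 127
    fraction = bits & 0x7FFFFF        # 23 bits, leading one is implicit
    return sign, exponent, fraction


sign, exponent, fraction = float32_fields(1.25)
print(sign, exponent - 127, bin(fraction))  # 0 0 0b1000000000000000000000
```

Subtracting 127 from the stored exponent recovers the actual power of two, and the fraction bits are the digits after the binary point of the normalized significand.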

Practice (base 10) • 13 = 1.3 x 10^1 = 1.101_two x 2^3 • 1.25 = 1.25 x 10^0 = 1.010_two x 2^0
