Floating Point in computers

fionan
Presentation Transcript


  1. Floating Point in computers • Comply with standards: IEEE 754, ISO/IEC 559

  2. Timeline • Introduction quite short • Binary review not so long • Integer Arithmetic 1/3 • Floating Point 1/3 • Floating Point Arithmetic 1/3 • Other issues extra short

  3. Introduction • Who does computer arithmetic? • Intel’s spare money • How is it done in hardware? • How Integer relates to Floating point • Now, we go back to “computer structure”

  4. Binary numbers • What is 1001011.00101 ? • Weights of the 1 bits: 64, 8, 2, 1 (integer part) and 1/8, 1/32 (fraction part)
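The positional rule on this slide can be checked with a few lines of Python. This is only a sketch; the helper name `binary_to_value` is made up for illustration and is not part of the presentation.

```python
def binary_to_value(bits: str) -> float:
    """Evaluate a binary numeral that may contain a binary point."""
    whole, _, frac = bits.replace(" ", "").partition(".")
    value = 0.0
    # Integer part: weights 1, 2, 4, ... counted from the right.
    for i, b in enumerate(reversed(whole)):
        value += int(b) * 2 ** i
    # Fraction part: weights 1/2, 1/4, ... counted from the left.
    for i, b in enumerate(frac, start=1):
        value += int(b) * 2 ** -i
    return value


# 64 + 8 + 2 + 1 + 1/8 + 1/32 = 75.15625
print(binary_to_value("1001011.00101"))
```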

  5. Signed Binary Integers • Sign-magnitude • 2’s complement • 1’s complement • biased

  6. Sign-Magnitude • High order bit = Sign • 0101 = 5 • 1101 = -5 • 2 zeros (0000 and 1000)

  7. 2’s complement • Number + Negative = 2^n • 0101 = 5 • 1011 = -5 • Easy addition (drop carry) • Formula: -a_{n-1}·2^{n-1} + a_{n-2}·2^{n-2} + … + a_1·2^1 + a_0

  8. 1’s Complement • Negative = complement of every bit • 0101 = 5 • 1010 = -5 • 2 zeros • Number + Negative = 2^n - 1

  9. Biased • Binary = Number + Bias • Bias = 5: 1010 = 5 (5+5 = 10), 0000 = -5 ((-5)+5 = 0) • Relative order remains
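The four signed encodings from slides 5 to 9 can be compared side by side. The sketch below assumes a 4-bit word and a bias of 5 as on the slides; the function names are illustrative, and no range checking is done.

```python
def sign_magnitude(x: int, n: int = 4) -> str:
    # High-order bit is the sign, the remaining bits hold |x|.
    sign = 1 if x < 0 else 0
    return format((sign << (n - 1)) | abs(x), f"0{n}b")


def twos_complement(x: int, n: int = 4) -> str:
    # A number and its negative add up to 2^n.
    return format(x % (1 << n), f"0{n}b")


def ones_complement(x: int, n: int = 4) -> str:
    # A number and its negative add up to 2^n - 1.
    pattern = x if x >= 0 else (1 << n) - 1 + x
    return format(pattern, f"0{n}b")


def biased(x: int, bias: int = 5, n: int = 4) -> str:
    # Stored pattern = number + bias.
    return format(x + bias, f"0{n}b")


for name, enc in [("sign-magnitude", sign_magnitude),
                  ("2's complement", twos_complement),
                  ("1's complement", ones_complement),
                  ("biased (5)", biased)]:
    print(f"{name:15s} +5 -> {enc(5)}   -5 -> {enc(-5)}")
```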

  10. Integer Arithmetic

  11. Adding (unsigned) Integers • Elementary school: 11001101 + 10000110 = 101010011 (carries out of bit positions 2, 3 and 7) • Result has n+1 bits!

  12. Adding Integers - hardware • Half Adder: inputs a, b; outputs s, Cout • Full Adder: inputs a, b, Cin; outputs s, Cout • 2 logical levels

  13. Ripple carry Adder • [Figure: chain of n full adders; inputs a0 … an-1, b0 … bn-1 and Cin; outputs s0 … sn-1 and Cout] • Slow - 2n logical levels • Small constant (CMOS) • Other ways exist
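A ripple carry adder is easy to simulate bit by bit. The following is a sketch, not the circuit from the slide: the gate equations are the usual textbook ones, and the helpers `to_bits` and `from_bits` are added only for the demonstration.

```python
def full_adder(a: int, b: int, cin: int):
    """One-bit full adder: returns (sum bit, carry out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout


def ripple_carry_add(a_bits, b_bits, cin=0):
    """Add two equal-length bit lists (least significant bit first)."""
    s_bits = []
    carry = cin
    for a, b in zip(a_bits, b_bits):
        # The carry out of each stage feeds the next stage.
        s, carry = full_adder(a, b, carry)
        s_bits.append(s)
    return s_bits, carry


def to_bits(x: int, n: int):
    return [(x >> i) & 1 for i in range(n)]


def from_bits(bits):
    return sum(b << i for i, b in enumerate(bits))


s, cout = ripple_carry_add(to_bits(205, 8), to_bits(134, 8))
# The carry out is the (n+1)-th bit of the result.
print(from_bits(s) + (cout << 8))
```

Because each stage waits for the carry of the previous one, the delay grows linearly with n, which is the "slow" remark on the slide.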

  14. Adding Signed Integers • In 2’s complement: • b + (-a) = b + (2^n - a) = 2^n + (b-a) • (-b) + (-a) = (2^n - b) + (2^n - a) = (2^n - (b+a)) + 2^n • hence - add as integers, discard carry out • Example: 0011 + 1100 = ?

  15. Subtracting Integers • Add the negation • Negating 2’s complement: 11010100101011000110000 = ? • Answer: 001010110101001110 1 0000 (invert every bit to the left of the lowest 1; keep that 1 and the zeros after it)
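Negation in 2's complement is usually described as "invert all bits, then add 1", which gives the same result as the shortcut of keeping the lowest 1 and the zeros after it. A sketch with illustrative function names:

```python
def negate(p: int, n: int) -> int:
    """2's complement negation of an n-bit pattern: invert, then add 1."""
    mask = (1 << n) - 1
    return ((p ^ mask) + 1) & mask


def subtract(p: int, q: int, n: int) -> int:
    """p - q computed as p + (-q), with the carry out discarded."""
    mask = (1 << n) - 1
    return (p + negate(q, n)) & mask


x = 0b11010100101011000110000  # the 23-bit pattern from the slide
print(format(negate(x, 23), "023b"))
print(format(subtract(0b0101, 0b0011, 4), "04b"))  # 5 - 3
```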

  16. Integer (unsigned) Multiplication • Elementary school: 1101 * 1001 • Partial products (each shifted one more place left): 1101, 0000, 0000, 1101 • Sum: 01110101 • Result is 2n bits!

  17. Hardware Multiplier • Registers: P (n bits, plus Carry), A (n bits), B (n bits); Carry, P and A shift together • P=0 • loop: (i) if A0=1, add B to P (ii) right-shift P & A
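The shift-and-add loop can be simulated directly. This sketch assumes the loop runs n times and that the final 2n-bit product is read from P (high half) and A (low half); the function name is illustrative.

```python
def hw_multiply(a: int, b: int, n: int) -> int:
    """Unsigned shift-and-add multiply of two n-bit numbers."""
    p = 0
    for _ in range(n):
        if a & 1:
            p += b  # p may briefly need n+1 bits (the carry)
        # Right-shift the (carry, P, A) chain by one position:
        # the low bit of P moves into the top bit of A.
        a = (a >> 1) | ((p & 1) << (n - 1))
        p >>= 1
    return (p << n) | a  # P holds the high half, A the low half


print(format(hw_multiply(0b1101, 0b1001, 4), "08b"))  # 13 * 9
```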

  18. Integer (unsigned) Division • Elementary school: 1101 / 11 • Result: 0100, Rem 1 • Dec: 13/3=4, Rem 1

  19. Hardware Divider • Registers: P (n+1 bits), A (n bits), B (n+1 bits, leading 0); P and A shift together • P=0 • loop: (i) left-shift P & A (ii) Subtract B from P: positive: a0=1; negative: a0=0, restore P (add B)

  20. Example • 13 / 3 = 4 (remainder 1) • n=4 • A=1101 B=00011 P=00000

  21. Start: P = 00000, A = 1101, B = 00011

  22. End: P = 00001 (Remainder), A = 0100 (Quotient), B = 00011
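The restoring division loop from slide 19 can be traced in software. This is a sketch under the assumption that the loop runs n times; the function name and the explicit zero check are additions for illustration.

```python
def hw_divide(a: int, b: int, n: int):
    """Restoring division of n-bit a by b; returns (quotient, remainder)."""
    if b == 0:
        raise ZeroDivisionError("divider must check for 0")
    mask = (1 << n) - 1
    p = 0
    for _ in range(n):
        # Left-shift the (P, A) pair: the top bit of A moves into P.
        p = (p << 1) | ((a >> (n - 1)) & 1)
        a = (a << 1) & mask
        p -= b
        if p >= 0:
            a |= 1   # positive: quotient bit a0 = 1
        else:
            p += b   # negative: a0 stays 0, restore P
    return a, p      # A holds the quotient, P the remainder


q, r = hw_divide(0b1101, 0b0011, 4)
print(format(q, "04b"), format(r, "05b"))
```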

  23. Division - remarks • Non-restoring Algorithm • Load P only if positive • Check for 0 • (Total) Result is 2n bits!

  24. Integer arithmetic - remarks • Signed Multiply and Division • Algorithms exist • We will not use them • What to do with extra bits? • Faster methods

  25. Floating Point

  26. Non Integers - Other Methods • Fixed Point • example: # # # . # • Binary point shifted • Integer arithmetic (extra shifting) • Small number magnitude • Rational • a/b (a, b ∈ Z)

  27. Floating Point • Exponent + Significand (= Mantissa) • x = s × 2^e • Example: s=101, e=011 (binary) • x = 101 × 2^11 (binary) = 5 × 2^3 = 40 = 101000

  28. Uniqueness • Denormal Numbers: 123.456 × 10^7, 0.123 × 10^4 • Normalized: #.### × 10^#, e.g. 1.123 × 10^4 • What about 0 ?

  29. Floating Point Standard • Why Standardize? • Hardware accelerators • Software compatibility • Build Software Libraries • etc. • IEEE 754-1985, ISO/IEC 559 • Includes: Structure, Arithmetic results

  30. Float Types • 4 Precision Types: • Single • Single extended • Double • Double extended

  31. Single Precision • 32 bits: Sign (1) | Exponent (8) | Significand (23) • Exponent (e): Biased (+127) • Significand (f): Fixed fraction: 0.###… • Number: 1.f × 2^(e-127)

  32. Single Precision - Example • 1 10000001 01000000000000000000000 • 10000001 = 129 → 129-127 = 2 • 01000… = 0.01000… → 1.01 (binary) = 1.25 • X = -1.25 × 2^2 • X = -5
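The field extraction in this example can be reproduced with shifts and masks. The sketch below handles normalized numbers only; `decode_single` is an illustrative name, and 0xC0A00000 is the same 32-bit pattern as on the slide written in hexadecimal. The standard `struct` module is used as a cross-check.

```python
import struct


def decode_single(bits: int):
    """Split a 32-bit pattern into sign, e, f and evaluate 1.f * 2^(e-127).

    Only valid for normalized numbers (1 <= e <= 254).
    """
    sign = bits >> 31
    e = (bits >> 23) & 0xFF
    f = bits & 0x7FFFFF
    value = (-1) ** sign * (1 + f / 2 ** 23) * 2.0 ** (e - 127)
    return sign, e, f, value


bits = 0xC0A00000  # 1 10000001 01000000000000000000000
print(decode_single(bits))
# Cross-check against the machine's own interpretation of the same bytes.
print(struct.unpack(">f", bits.to_bytes(4, "big"))[0])
```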

  33. Single Precision - Range • Emax = 127 (e = 254) • Emin = -126 (e = 1) • Why |Emin| < |Emax|? • 1/2^Emin does not overflow • Why Biased notation? • What about 0 and 255 ?

  34. Floating Point Precision

  35. Examples • We shall use base 10 sometimes: • f will have 3 digits • Emax will be 98 • Emin will be -97 • Ex: 5.34 × 10^70

  36. NaN • Not a Number • Result of illegal computation: • Any computation involving a NaN • e = Emax + 1 & f ≠ 0 • # 11111111 ####################### • Many NaN’s (different f’s)

  37. NaN’s in use • Zero finder outside domain • f(x) = sqrt(x) - 1 • Works since all computations return NaN • No exception caused !
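NaN propagation can be tried out in Python, with one caveat: `math.sqrt` raises `ValueError` for negative input instead of returning NaN, so the sketch below uses a hypothetical wrapper `ieee_sqrt` that imitates the IEEE behaviour described on the slide.

```python
import math
import struct


def ieee_sqrt(x: float) -> float:
    """Hypothetical helper: NaN outside the domain, not a Python exception."""
    return math.sqrt(x) if x >= 0 else math.nan


def f(x: float) -> float:
    return ieee_sqrt(x) - 1


y = f(-4.0)
print(math.isnan(y))   # the NaN survives the subtraction
print(y == y)          # a NaN never compares equal, not even to itself

# Bit pattern in single precision: exponent all ones, fraction non-zero.
bits = struct.unpack(">I", struct.pack(">f", y))[0]
print(format(bits, "032b"))
```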

  38. Zero’s • 0 00000000 00000000000000000000000 ? • this is NOT 1.0 × 2^Emin • 1 00000000 00000000000000000000000 ? • 0 is signed! +0 and -0 both exist! • What is the difference?

  39. Signed 0’s • +0 = -0 BUT: • Multiply/Divide keep sign rules: • Motivation: • Using inf correctly (described later) • log(x): log(0) = -inf, log(negative) = NaN, log(x) if x → (-0) ?
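Signed zeros are visible from Python as well. The sketch below sticks to multiplication and `math.atan2`, because Python raises `ZeroDivisionError` on float division by zero rather than returning an infinity.

```python
import math

pz, nz = 0.0, -0.0

print(pz == nz)                        # the two zeros compare equal
print(math.copysign(1.0, nz))          # but the sign bit is still there
print(nz * 2.0, pz * -3.0)             # multiplication follows the sign rules
# A function with a branch cut tells the two zeros apart:
print(math.atan2(pz, -1.0), math.atan2(nz, -1.0))
```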

  40. ± inf • More logic: • e = Emax + 1 & f =0 • # 11111111 00000000000000000000000

  41. Inf usage Example (If tan^-1 is defined properly)

  42. More on 0’s and inf’s • General Rule for 0/inf arithmetic: • Take appropriate limit: • 1/(1/x) where x=0 or inf • Why not Max # instead?

  43. Zero’s and inf’s - yet again • x/(x^2+1) is bad! Why? • 1/(x + x^-1) is better • Do we need to check for x=0? • Using 2 zero’s and inf’s saves some special case checks.
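The difference between the two formulas shows up for large x, where x^2 overflows. A sketch in double precision follows; the names `bad` and `better` are illustrative. Note that Python itself raises `ZeroDivisionError` for `1 / 0.0`, so the x = 0 case from the slide cannot be run as written here without an explicit check.

```python
import math


def bad(x: float) -> float:
    # x * x overflows to inf long before x itself does.
    return x / (x * x + 1)


def better(x: float) -> float:
    # 1/x underflows harmlessly, so the result stays meaningful.
    return 1 / (x + 1 / x)


big = 1e200
print(bad(big), better(big))            # 0.0 versus roughly 1e-200
print(bad(math.inf), better(math.inf))  # NaN versus 0.0
```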

  44. Denormalized numbers • Example: • x = 1.23 × 10^-98, y = 1.11 × 10^-98 • x-y = 1.20 × 10^-99 = 0 (flushed to zero) • so: x-y = 0 but: x ≠ y • think of: if (x ≠ y) then z = 1/(x-y) • Solution: • use denormalized numbers!

  45. Denormal Numbers • Smallest normal: 1.0 × 2^Emin • Below, use denormal: 0.f × 2^Emin • e = Emin - 1 & f ≠ 0 • # 00000000 ####################### • Gradual underflow: 1.23 × 10^-4 → (/10) → 0.12 × 10^-4 → (/10) → 0.01 × 10^-4 → (/10) → 0

  46. Denormal Numbers • Back to our Example: • x = 1.23 × 10^-98, y = 1.11 × 10^-98 • x-y = 0.12 × 10^-98 • and this is not 0 !
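The same effect can be observed in binary double precision, where `sys.float_info.min` is the smallest normalized number (2^-1022). The values of `x` and `y` below are chosen for illustration and are not taken from the slides.

```python
import sys

tiny = sys.float_info.min   # smallest normalized double, 2**-1022
x = 1.5 * tiny
y = 1.25 * tiny
d = x - y                   # 0.25 * tiny, below the normalized range

print(x != y)               # the operands differ ...
print(d != 0.0)             # ... and so the difference is not zero
print(d < tiny)             # it is stored as a denormal number

# Gradual underflow: keep halving until the value finally reaches zero.
v, steps = tiny, 0
while v > 0.0:
    v /= 2
    steps += 1
print(steps)
```

With flush to zero, `d` would have been 0 even though `x != y`, which is exactly the trap described in the example.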

  47. Flush to 0 Vs Gradual Underflow • [Figure: two number lines with ticks at 0, 2^-4, 2^-3, 2^-2, 2^-1, one for flush to 0 and one for gradual underflow]

  48. Special Values - Summary • Exponent | Fraction | Represents • e = Emin-1 | f = 0 | ±0 • e = Emin-1 | f ≠ 0 | 0.f × 2^Emin • Emin ≤ e ≤ Emax | ---- | 1.f × 2^e • e = Emax+1 | f = 0 | ±inf • e = Emax+1 | f ≠ 0 | NaN
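The classification by exponent and fraction fields (all-zero exponent for zeros and denormals, all-ones exponent for infinities and NaN's, as on slides 36, 40 and 45) can be written as a short function for single precision. A sketch: `classify_single` is an illustrative name, and `struct.pack` raises `OverflowError` for finite doubles too large for single precision, which is not handled here.

```python
import math
import struct


def classify_single(x: float) -> str:
    """Classify x by the exponent and fraction fields of its 32-bit encoding."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    e = (bits >> 23) & 0xFF      # stored (biased) exponent
    f = bits & 0x7FFFFF          # 23-bit fraction
    if e == 0:                   # stored exponent 0 means Emin - 1
        return "zero" if f == 0 else "denormal"
    if e == 0xFF:                # stored exponent 255 means Emax + 1
        return "infinity" if f == 0 else "NaN"
    return "normal"


for v in (0.0, -0.0, 1e-40, 1.0, -math.inf, math.nan):
    print(v, classify_single(v))
```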

  49. Rounding • Why is rounding needed? • Infinitely many numbers → Finite representation • Integers only overflow • Almost all operations need rounding • IEEE - specifies algorithms for arithmetic

  50. Numbers need rounding • Out of range: • x > 2 × 2^Emax, x < 1 × 2^Emin • Between 2 floats: • 0.1 (base 10) = 0.00011001100… (base 2) = 1.1001100… × 2^-4 • 1.1001 × 2^-4
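That 0.1 falls between two floats can be confirmed from Python, which stores floats in double precision. The sketch uses only the standard `float.hex` method and the `fractions` module.

```python
from fractions import Fraction

# Hexadecimal significand 1.999...a times 2^-4: the repeating 1001 1001 ...
# pattern is cut off and rounded in the last digit.
print((0.1).hex())

# The stored value is a nearby binary fraction, not exactly 1/10.
print(Fraction(0.1) == Fraction(1, 10))

# Rounding errors surface in ordinary arithmetic.
print(0.1 + 0.2 == 0.3)
```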
