1 / 16

CSE 246: Computer Arithmetic Algorithms and Hardware Design

CSE 246: Computer Arithmetic Algorithms and Hardware Design. Winter 2004 Lecture 9. Instructor: Prof. Chung-Kuan Cheng. Topics:. Floating Point Numbers (IEEE P754) Standard Operations Exceptional Situations Rounding Modes. Standard. 2 32  Typically. Goal: Dynamic Range:

Download Presentation

CSE 246: Computer Arithmetic Algorithms and Hardware Design

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. CSE 246: Computer Arithmetic Algorithms and Hardware Design Winter 2004 Lecture 9 Instructor: Prof. Chung-Kuan Cheng

  2. Topics: • Floating Point Numbers (IEEE P754) • Standard • Operations • Exceptional Situations • Rounding Modes

  3. Standard 232  Typically • Goal: Dynamic Range: largest #/ smallest # • If too large, holes between #’s

  4. Standard • ulp (unit in the last place) • Difference between two consecutive values of the significand. 3 Parts  x =  s  be Sign Bit Significand 8-bit exponent

  5. Standard • a1a2a3a4a5a6a7a8b1b2b3b22b23 • 1.* normalized number • 0.* denormalized number 0 0.b1b2b3b22b23  2-126 1 --------------------------------- 1. b1b2b3b22b23  2-126 2 . . . 253 254 ------------------------------- 1.b1b2b3b22b23  2127 •   if bi = 0 for all i = 1,2,…,23, NaN otherwise NaN  Not a Number

  6. Standard 0.01x2-3 = 0.00x2-2 • Same number, so normalize to remove redundancy • Smallest Number 0.00…01x2-126 = 1.0x2-23x2-126 = 1x2-149 • 1.1101111001110011100101 • Difference between 2 #’s small for normalized 0.0001 2 times compared to magnitudes 0.0010

  7. Standard - Example s. eeeeeeee nnnnnnnnnnnnnnnnnnnnnnn 0.00000000 00000000000000000000000 = 0.000…0x2-126 1.00000000 00000000000000000000000 = 0 0.00000001 00000000000000000000000 = 1.000…0x2-126 - minimal normalized # 0.00000001 00000000000000000000001 = 1.000…1x2-126 . . . 0.01111111 00000000000000000000001 = 1.000…1x20 0.10000000 00000000000000000000001 = 1.000…0x21

  8. Standard – Example Cont. 0.11111110 00000000000000000000001 = 1.000…1x2127 0.11111110 11111111111111111111111 = 1.111…1x2127 - Normalized Maximum 0.11111111 00000000000000000000000 =  Nmin = 1.0 x 2-126 Nmax = (2 – 2-23)2127

  9. Double Floating Point • a1a2…a11b1b2…b52 000…00 0. b1b2…b52 x 2-1023 000…01 1. b1b2…b52 x 2-1022 . . . 011…11 1. b1b2…b52 x 20 100…00 1. b1b2…b52 x 21 . . . 111…10 1. b1b2…b52 x 21023 111…11 =  if bi = 0 for all i = 1,2,…,52

  10. Overflow/Underflow Underflow Denser Sparser Nmin Nmax Overflow

  11. Addition/Multiplication • s1xbe1 + (s2xbe2) = sxbe = s1xbe1 + s2/be1-e2 x be1 = (s1  s2/be1-e2) x be1 • (s1xbe1) x (s2xbe2) = (s1xs2)be1+e2

  12. Exceptions a/0 =  if a > 0 a/ = 0 if a != 0 a·0 = 0 a· =  if a > 0 0· = invalid operation (NaN) 0/0 = invalid operation (NaN) NaP op a = NaN a +  =   -  = NaN

  13. Rounding Mode • Adder Output = Cout z1z0.z-1z-2…z-l GRS Guard Bit Round Bit Sticky Bit, OR of all bits below bit R 1.101 x 23 +1.110 x 23 11.011 x 23 1.1011x24 Normalize – need to round or

  14. Rouding 1.110 x 23 - 1.101 x 23 0.001 x 23 1.000 x 20 normalize 1.101 x 23 - 1.111 x 22 1.101 x 23 - 0.1111 x 23 0.1101 x 23 1.101 x 22 Guard bit

  15. Rounding • Round to the nearest even • toward 0 1.1011 • Toward + 1.1100 • Toward - 1.1011

  16. Conventional Rounding Error Rounding Error 1.10100  1.101 = 0 1.10101  1.101 = -0.25 1.10110  1.110 = +0.5 1.10111  1.110 = +0.25 Average Error = 0.5/4 = 0.125

More Related