
CSE 246: Computer Arithmetic Algorithms and Hardware Design

Winter 2004, Lecture 10, Thursday 02/19/02. Instructor: Prof. Chung-Kuan Cheng. Topics: rounding F.P. numbers (why we need the sticky bit, round bit, and guard bit); Ch. 11 (all).


Presentation Transcript


  1. CSE 246: Computer Arithmetic Algorithms and Hardware Design. Winter 2004, Lecture 10, Thursday 02/19/02. Instructor: Prof. Chung-Kuan Cheng

  2. Topics:
     • Rounding F.P. Numbers
     • Ch. 11 (all)

  3. Rounding the numbers
     • Why we need the
       • Sticky bit
       • Round bit
       • Guard bit

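  4. Example 1
     $1.00000 \times 2^{4} - 1.10000 \times 2^{-3}$
     Normalize according to exponent (align both operands to $2^{4}$):
     $1.00000000 \times 2^{4} - 0.00000011 \times 2^{4} = 0.11111101 \times 2^{4}$
     Renormalize: $1.1111101 \times 2^{3}$
     Take 5 bits after the binary point. Of the two bits left over, the first (0) is the round bit and the second (1) is the sticky bit. The round bit is 0, so the result is not rounded up.
     Result $= 1.11111 \times 2^{3}$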

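  5. Example 2
     $1.00001 \times 2^{3} - 1.01011 \times 2^{-1}$
     Normalize according to exponent (align both operands to $2^{3}$):
     $1.000010000 \times 2^{3} - 0.000101011 \times 2^{3} = 0.111100101 \times 2^{3}$
     Renormalize: $1.11100101 \times 2^{2}$
     Take 5 bits after the binary point. The bit on the boundary (1) is the round bit, and the bits below it (01) are non-zero => round up.
     Result $= 1.11101 \times 2^{2}$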
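
The rounding step in Examples 1 and 2 can be sketched in a few lines of Python. This is only an illustration: the function name `round_fraction` and the string representation of the fraction are hypothetical, and the behavior on an exact tie (round to even) is an assumption, since the slides only show the non-tie cases.

```python
def round_fraction(frac_bits, keep=5):
    """Round a binary fraction (the bits after the binary point, as a string)
    to `keep` bits. Returns (carry_out, rounded_bits)."""
    kept = frac_bits[:keep].ljust(keep, "0")
    rest = frac_bits[keep:]
    round_bit = rest[:1] == "1"   # first discarded bit
    sticky = "1" in rest[1:]      # OR of every bit below the round bit
    value = int(kept, 2)
    # Round up when the discarded part is above one half. On an exact tie,
    # round to even (assumed here; not stated on the slides).
    if round_bit and (sticky or value & 1):
        value += 1
    carry_out = value >> keep     # 1 if the fraction overflowed (.11111 + 1)
    return carry_out, format(value & ((1 << keep) - 1), f"0{keep}b")


print(round_fraction("1111101"))   # Example 1, 1.1111101  -> (0, '11111')
print(round_fraction("11100101"))  # Example 2, 1.11100101 -> (0, '11101')
```

A carry out of 1 would mean the significand has to be renormalized once more, which neither example triggers.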

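  6. Theory behind it
     • When shifting right, we don't need to remember anything more than 3 bits below the kept precision: the guard bit (g), the round bit (r), and the sticky bit, which is the OR of all the other bits shifted out
     • This is a necessary and sufficient condition
     • The most we ever normalize after a subtraction is by just 1 bit (when the smaller operand was shifted right by two or more positions during alignment), since all numbers are exponent-normalized before the operation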
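
As a rough sketch of the idea on this slide, the alignment shift can keep three extra low-order bits and OR everything that falls off into the last one. The names `EXTRA` and `align`, and the use of plain integers for significands, are assumptions made for illustration only.

```python
EXTRA = 3  # guard, round, and sticky positions below the 5-bit fraction


def align(significand, shift):
    """Shift an integer significand right by `shift` places, keeping EXTRA
    low-order bits. Any 1 shifted out is OR-ed into the last bit (sticky)."""
    extended = significand << EXTRA
    kept = extended >> shift
    lost = extended & ((1 << shift) - 1)
    if lost:
        kept |= 1
    return kept


# Example 2: 1.00001 x 2^3 - 1.01011 x 2^-1, so the exponents differ by 4.
a = 0b100001 << EXTRA
b = align(0b101011, 4)
print(format(a - b, "09b"))  # 011110011, i.e. 0.11110011 x 2^3
```

The exact difference is 0.111100101; the three-bit version gives 0.11110011. After renormalizing to 1.1110011 and rounding to 5 fraction bits, both yield 1.11101, which is the point of the sticky bit.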

  7. Chapter 11 • Polynomial Approximation of Functions

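  8. Taylor Series
     $f(x) = f(x_0) + f'(x_0)(x - x_0) + \frac{f''(x_0)}{2!}(x - x_0)^2 + \dots = \sum_{i=0}^{\infty} \frac{f^{(i)}(x_0)}{i!}(x - x_0)^i$
     Example: $\sin(x) = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \dots$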

  9. Taylor Series
     Given: $P_N(x) = \sum_{i=0}^{N} c_i x^i = c_0 + x(c_1 + x(c_2 + \dots + x(c_{N-1} + x c_N)))$
     How to calculate the value of the function? Recursively: group common factors.
     $R(N) = c_N$
     $R(i-1) = c_{i-1} + x R(i)$
     …
     $P_N(x) = R(0)$
     This takes N multiplies and N adds.
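
The recurrence R(i-1) = c_{i-1} + x R(i) is Horner's rule. A minimal Python sketch follows; the function name `horner` and the coefficient ordering (c_0 first) are choices made here for illustration.

```python
def horner(coeffs, x):
    """Evaluate c0 + c1*x + ... + cN*x^N with N multiplies and N adds.
    `coeffs` is [c0, c1, ..., cN]."""
    result = 0
    for c in reversed(coeffs):
        # After the first pass result == cN, matching R(N) = cN;
        # each later pass computes R(i-1) = c_{i-1} + x * R(i).
        result = c + x * result
    return result


print(horner([1, 2, 3], 2))  # 1 + 2*2 + 3*4 = 17
```

Each step depends on the previous one, so with a single multiplier and adder the evaluation is strictly serial.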

  10. Taylor Series
      • 1 adder => do it in series
      • Given more components => can we go faster?
      • Take N = 7 as an example: $c_7x^7 + c_6x^6 + c_5x^5 + c_4x^4 + c_3x^3 + c_2x^2 + c_1x + c_0$
      How to accelerate?

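  11. Taylor Series
      $c_7x^7 + c_6x^6 + c_5x^5 + c_4x^4 + c_3x^3 + c_2x^2 + c_1x + c_0$
      [Diagram: multipliers generate the powers $x^2$ through $x^7$ from x; the terms are then summed by carry-save adders (constant time each) arranged in a tree of depth log n.]
      • But this is not much better. We still have an overhead of 3 stages to generate $x^7$.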

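  12. Taylor Series
      $c_7x + c_6$, $c_5x + c_4$, $c_3x + c_2$, $c_1x + c_0$
      $x^2(c_7x + c_6) + c_5x + c_4$, $x^2(c_3x + c_2) + c_1x + c_0$
      $x^4[x^2(c_7x + c_6) + c_5x + c_4] + x^2(c_3x + c_2) + c_1x + c_0$
      [Diagram: the powers x, $x^2$, and $x^4$ feed the successive levels.]
      • This is a bit faster. Only 2 stages are needed to generate the powers ($x^2$, then $x^4$).
      • But what is the fastest way to produce the result? And the most energy efficient? => minimize [# of multiplies]
      • All this uses +'s and x's. We need to get rid of them. => Let's try table look-up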
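
The grouping on this slide (pair the terms, then combine pairs with $x^2$, then with $x^4$) can be written out for N = 7 as below. This is a sketch only; the name `grouped_eval7` is hypothetical, and in software the three levels still run one after another, whereas in hardware the operations within a level can run in parallel.

```python
def grouped_eval7(c, x):
    """Evaluate c[0] + c[1]*x + ... + c[7]*x^7 using the pairwise grouping."""
    x2 = x * x
    x4 = x2 * x2
    # Level 1: four independent linear terms.
    p01 = c[1] * x + c[0]
    p23 = c[3] * x + c[2]
    p45 = c[5] * x + c[4]
    p67 = c[7] * x + c[6]
    # Level 2: combine neighbouring pairs with x^2.
    low = x2 * p23 + p01
    high = x2 * p67 + p45
    # Level 3: combine the two halves with x^4.
    return x4 * high + low


coeffs = [1, 2, 3, 4, 5, 6, 7, 8]
print(grouped_eval7(coeffs, 2) == sum(c * 2**i for i, c in enumerate(coeffs)))  # True
```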

  13. Taylor Series – Table look-up
      • SRAM/DRAM => eat power
      • ROM => better option
      • Suppose there is a table as a binary tree.
      • Let $x = x_H + x_L$ and expand about $x_0 = x_H$:
        $f(x_H + x_L) = f(x_H) + x_L f'(x_H) + \frac{x_L^2}{2!} f''(x_H) + \dots$
      Example: $x = 110101$, $x_H = 110000$, $x_L = 000101$

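  14. Taylor Series – Table look-up
      • 1st order: $f(x_H + x_L) \approx f(x_H) + x_L f'(x_H)$ => Only 1 multiplication!
      [Diagram: $x_H$ addresses Table-1, which returns $f(x_H)$, and Table-2, which returns $f'(x_H)$. A multiplier forms $x_L f'(x_H)$ and an adder produces $f(x_H + x_L)$.]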
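
A first-order look-up of this kind, f(xH + xL) ≈ f(xH) + xL f'(xH), might be sketched as follows. Everything concrete here is an assumption chosen for illustration: the function (sin), the 12-bit fixed-point input in [0, 1), the 6/6 split between xH and xL, and the names `TABLE_F`, `TABLE_DF`, and `lookup_first_order`.

```python
import math

N_BITS = 12                 # total fraction bits of x (assumed)
H_BITS = 6                  # bits in x_H (assumed)
L_BITS = N_BITS - H_BITS    # bits in x_L

# Table-1 holds f(x_H); Table-2 holds f'(x_H). Both are addressed by x_H.
TABLE_F = [math.sin(h / 2**H_BITS) for h in range(2**H_BITS)]
TABLE_DF = [math.cos(h / 2**H_BITS) for h in range(2**H_BITS)]


def lookup_first_order(x_int):
    """Approximate sin(x_int / 2**N_BITS) with two look-ups, one multiply,
    and one add."""
    h = x_int >> L_BITS                    # index of x_H
    x_low = (x_int & ((1 << L_BITS) - 1)) / 2**N_BITS
    return TABLE_F[h] + x_low * TABLE_DF[h]


worst = max(abs(lookup_first_order(i) - math.sin(i / 2**N_BITS))
            for i in range(2**N_BITS))
print(worst < 2**-13)  # True: the dropped term is below x_L^2 / 2
```

Changing the function only means refilling the two tables; the multiplier and adder stay the same.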

  15. Taylor Series
      • With each extra order => 1 extra table and 1 extra multiplier
      • If you wish to change the function, all you have to do is change the content of the table
      • Problem? => Now it's the size of the table! An L-bit address needs $2^L$ entries.

  16. Taylor Series
      • Let's split x into 3 sections (instead of the previous 2, High and Low):
        $x = x_1 + x_2 2^{-k} + x_3 2^{-2k}$
        $\Rightarrow f(x) = f(x_1 + x_2 2^{-k}) + x_3 2^{-2k} f'(x_1) + \varepsilon$, with $\varepsilon \approx 2^{-3k}$
      • A direct table for f(x) has size $2^n \times v$, where n = # of bits of x and v = # of bits of f(x)
      • 32-bit x with 32-bit f(x) => $2^{32} \times 32 = 2^{37}$ bits -> HUGE!! -> but do we really need all those #'s in the table??
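
The three-way split f(x) ≈ f(x1 + x2·2^-k) + x3·2^-2k·f'(x1) replaces the multiplier with a second table, because the correction term depends only on x1 and x3. The sketch below assumes sin as the function, k = 4, and a 3k-bit fixed-point input in [0, 1); the names `TABLE_A`, `TABLE_B`, and `approx_sin` are hypothetical.

```python
import math

K = 4               # bits per section (assumed)
SCALE = 1 << K      # 2^k

# Table A: f(x1 + x2 * 2^-k), addressed by the top 2k bits of x.
TABLE_A = [math.sin(i / SCALE**2) for i in range(SCALE**2)]
# Table B: x3 * 2^-2k * f'(x1), addressed by the pair (x1, x3).
TABLE_B = [[(i3 / SCALE**3) * math.cos(i1 / SCALE) for i3 in range(SCALE)]
           for i1 in range(SCALE)]


def approx_sin(x_int):
    """Approximate sin(x_int / 2**(3*K)) with two look-ups and one add."""
    i3 = x_int & (SCALE - 1)    # lowest k bits
    i12 = x_int >> K            # top 2k bits (x1 and x2 together)
    i1 = i12 >> K               # top k bits
    return TABLE_A[i12] + TABLE_B[i1][i3]


worst = max(abs(approx_sin(i) - math.sin(i / SCALE**3))
            for i in range(SCALE**3))
print(worst < 2**(-3 * K) + 2**(-4 * K))  # True: error is on the order of 2^-3k
```

Here the two tables hold 2^2k entries each, against 2^3k entries for a single direct table.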

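  17. Taylor Series
      Let $\varepsilon \in \{0, 1\}$ be the bit dropped when halving, and let $\lfloor \cdot \rfloor$ denote the lower limit (floor).
      $x \cdot y = \frac{(x+y)^2}{4} - \frac{(x-y)^2}{4}$
      $= \left( \lfloor (x+y)/2 \rfloor + \varepsilon/2 \right)^2 - \left( \lfloor (x-y)/2 \rfloor + \varepsilon/2 \right)^2$
      $= \lfloor (x+y)/2 \rfloor^2 - \lfloor (x-y)/2 \rfloor^2 + \varepsilon y$
      [Diagram: a table of $x^2$ addressed by x.]
      The content of the lower bits of x determines the lower bits of the result, but not the other bits!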
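
Multiplication through a table of squares can be checked with a short sketch using the identity x·y = ⌊(x+y)/2⌋² − ⌊(x−y)/2⌋² + ε·y, where ε is the low bit of x + y. The names `make_square_table` and `multiply_by_squares`, the 4-bit operand width, and the swap that keeps x − y non-negative are all choices made here for illustration.

```python
def make_square_table(n_bits):
    """Squares of 0 .. 2^n - 1; floor((x+y)/2) never exceeds 2^n - 1."""
    return [i * i for i in range(2**n_bits)]


def multiply_by_squares(x, y, squares):
    """Multiply two unsigned n-bit integers with two look-ups."""
    if x < y:
        x, y = y, x             # keep x - y non-negative
    total = x + y
    diff = x - y
    eps = total & 1             # x + y and x - y always share the same parity
    return squares[total >> 1] - squares[diff >> 1] + eps * y


squares = make_square_table(4)
print(multiply_by_squares(3, 2, squares))   # 4 - 0 + 1*2 = 6
print(multiply_by_squares(13, 11, squares)) # 144 - 1 + 0 = 143
```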

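  18. Taylor Series
      • Compare one $2^n \times v$ table with a $2^n \times (v-w)$ table plus a $2^L \times w$ table:
        $2^n (v-w) + 2^L w = 2^n v - (2^n w - 2^L w) = 2^n v - w(2^n - 2^L)$
      • Size of table is reduced by $w(2^n - 2^L)$ bits.
      [Diagram: on one side, an n-bit x addresses a single $2^n \times v$ table that returns the v-bit f(x). On the other, the n-bit x addresses a $2^n \times (v-w)$ table for the upper $v-w$ bits, while L bits of x address a $2^L \times w$ table for the lower w bits.]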
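
The size comparison can be checked numerically. The values of n, v, w, and L below are illustrative only and do not come from the lecture; the function names are likewise hypothetical.

```python
def full_table_bits(n, v):
    """One table with 2^n entries of v bits each."""
    return (2**n) * v


def split_table_bits(n, v, w, low_bits):
    """A 2^n x (v - w) table plus a 2^L x w table (L = low_bits)."""
    return (2**n) * (v - w) + (2**low_bits) * w


n, v, w, low_bits = 16, 16, 4, 8    # assumed example values
saving = full_table_bits(n, v) - split_table_bits(n, v, w, low_bits)
print(saving == w * (2**n - 2**low_bits))  # True
print(saving)                              # 261120 bits
```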

  19. End of Ch. 11
      • Some parts of Ch. 11 (e.g. log) will be covered as part of the Ch. 12 discussion
