230 likes | 245 Views
Lecture 2 Addendum Rounding Techniques. Required Reading. Parhami, Chapter 17.5, Rounding schemes Rounding Algorithms 101 http://www.diycalculator.com/popup-m-round.shtml. Rounding.
E N D
Lecture 2AddendumRounding Techniques ECE 645 – Computer Arithmetic
Required Reading Parhami, Chapter 17.5, Rounding schemes Rounding Algorithms 101 http://www.diycalculator.com/popup-m-round.shtml
Rounding Rounding occurs when we want to approximate a more precise number (i.e. more fractional bits L) with a less precise number (i.e. fewer fractional bits L') Example 1: old: 000110.11010001 (K=6, L=8) new: 000110.11 (K'=6, L'=2) Example 2: old: 000110.11010001 (K=6, L=8) new: 000111. (K'=6, L'=0) The following pages show rounding from L>0 fractional bits to L'=0 bits, but the mathematics hold true for any L' < L Usually, keep the number of integral bits the same K'=K
Whole part Fractional part xk–1xk–2 . . . x1x0.x–1x–2 . . . x–lyk–1yk–2 . . . y1y0 Round Rounding Equation • y = round(x)
Rounding Techniques • There are different rounding techniques: • 1) truncation • results in round towards zero in signed magnitude • results in round towards -∞ in two's complement • 2) round to nearest number • 3) round to nearest even number (or odd number) • 4) round towards +∞ • Other rounding techniques • 5) jamming or von Neumann • 6) ROM rounding • Each of these techniques will differ in their error depending on representation of numbers i.e. signed magnitude versus two's complement • Error = round(x) – x
xk–1xk–2 . . . x1x0.x–1x–2 . . . x–lxk–1xk–2 . . . x1x0 trunc ulp 1) Truncation • Truncation in signed-magnitude results in a number chop(x) that is always of smaller magnitude than x. This is called round towards zero or inward rounding • 011.10 (3.5)10 011 (3)10 • Error = -0.5 • 111.10 (-3.5)10 111 (-3)10 • Error = +0.5 • Truncation in two's complement results in a number chop(x) that is always smaller than x. This is called round towards -∞ or downward-directed rounding • 011.10 (3.5)10 011 (3)10 • Error = -0.5 • 100.01 (-3.5)10 100 (-4)10 • Error = -0.5 The simplest possible rounding scheme: chopping or truncation
x chop( ) 4 4 3 3 2 2 1 1 x x – 4 – 3 – 2 – 1 1 2 3 4 – 4 – 3 – 2 – 1 1 2 3 4 – 1 – 1 – 2 – 2 – 3 – 3 – 4 – 4 Truncation Function Graph: chop(x) x chop( ) Fig. 17.5 Truncation or chopping of a signed-magnitude number (same as round toward 0). Fig. 17.6 Truncation or chopping of a 2’s-complement number (same as round to -∞).
Bias in two's complement truncation • Assuming all combinations of positive and negative values of x equally possible, average error is -0.375 • In general, average error = -(2-L'-2-L )/2, where L' = new number of fractional bits
Implementation truncation in hardware • Easy, just ignore (i.e. truncate) the fractional digits from L to L'+1 xk-1 xk-2 .. x1 x0. x-1 x-2 .. x-L = yk-1 yk-2 .. y1 y0. ignore (i.e. truncate the rest)
2) Round to nearest number • Rounding to nearest number what we normally think of when say round • 010.01 (2.25)10 010 (2)10 • Error = -0.25 • 010.11 (2.75)10 011 (3)10 • Error = +0.25 • 010.00 (2.00)10 010 (2)10 • Error = +0.00 • 010.10 (2.5)10 011 (3)10 • Error = +0.5 [round-half-up (arithmetic rounding)] • 010.10 (2.5)10 010 (2)10 • Error = -0.5 [round-half-down]
Round-half-up: dealing with negative numbers • Rounding to nearest number what we normally think of when say round • 101.11 (-2.25)10 110 (-2)10 • Error = +0.25 • 101.01 (-2.75)10 101 (-3)10 • Error = -0.25 • 110.00 (-2.00)10 110 (-2)10 • Error = +0.00 • 101.10 (-2.5)10 110 (-2)10 • Error = +0.5 [asymmetric implementation] • 101.10 (-2.5)10 101 (-3)10 • Error = -0.5 [symmetric implementation]
Round to Nearest Function Graph: rtn(x)Round-half-up version Symmetric implementation Asymmetric implementation
Bias in two's complement round to nearestRound-half-up asymmetric implementation • Assuming all combinations of positive and negative values of x equally possible, average error is +0.125 • Smaller average error than truncation, but still not symmetric error • We have a problem with the midway value, i.e. exactly at 2.5 or -2.5 leads to positive error bias always • Also have the problem that you can get overflow if only allocate K' = K integral bits • Example: rtn(011.10) overflow • This overflow only occurs on positive numbers near the maximum positive value, not on negative numbers
Implementing round to nearest (rtn) in hardware Round-half-up asymmetric implementation • Two methods • Method 1: Add '1' in position one digit right of new LSB (i.e. digit L'+1) and keep only L' fractional bits xk-1 xk-2 .. x1 x0. x-1 x-2 .. x-L + 1 = yk-1 yk-2 .. y1 y0. y-1 • Method 2: Add the value of the digit one position to right of new LSB (i.e. digit L'+1) into the new LSB digit (i.e. digit L) and keep only L' fractional bits xk-1 xk-2 .. x1 x0. x-1 x-2 .. x-L + x-1 yk-1 yk-2 .. y1 y0. ignore (i.e. truncate the rest) ignore (i.e truncate the rest)
Fig. 17.9 R* rounding or rounding to the nearest odd number. Round to Nearest Even Function Graph: rtne(x) • To solve the problem with the midway value we implement round to nearest-even number (or can round to nearest odd number) Fig. 17.8 Rounding to the nearest even number.
Bias in two's complement round to nearest even (rtne) • average error is now 0 (ignoring the overflow) • cost: more hardware
4) Rounding towards infinity • We may need computation errors to be in a known direction • Example: in computing upper bounds, larger results are acceptable, but results that are smaller than correct values could invalidate upper bound • Use upward-directed rounding (round toward +∞) • up(x) always larger than or equal to x • Similarly for lower bounds, use downward-directed rounding (round toward -∞) • down(x) always smaller than or equal to x • We have already seen that round toward -∞ in two's complement can be implemented by truncation
Rounding Toward Infinity Function Graph: up(x) and down(x) up(x) down(x) down(x) can be implemented by chop(x) intwo's complement
x inward( ) 4 3 2 1 x – 4 – 3 – 2 – 1 1 2 3 4 – 1 – 2 – 3 – 4 Two's Complement Round to Zero • Two's complement round to zero (inward rounding) also exists
Other Methods • Note that in two's complement round to nearest (rtn) involves an addition which may have a carry propagation from LSB to MSB • Rounding may take as long as an adder takes • Can break the adder chain using the following two techniques: • Jamming or von Neumann • ROM-based
5) Jamming or von Neumann Chop and force the LSB of the result to 1 Simplicity of chopping, with the near-symmetry or ordinary rounding Max error is comparable to chopping (double that of rounding)
xk–1 . . . x4x3x2x1x0.x–1x–2 . . . x–lxk–1 . . . x4y3y2y1y0 ROM ROM address ROM data 6) ROM Rounding Fig. 17.11 ROM rounding with an 8 2 table. Example: Rounding with a 32 4 table Rounding result is the same as that of the round to nearest scheme in 31 of the 32 possible cases, but a larger error is introduced when x3 = x2 = x1 = x0 = x–1 = 1
ANSI/IEEE Floating-Point Standard ANSI/IEEE standard includes four rounding modes: Round to nearest even [default rounding mode] Round toward zero (inward) Round toward + (upward) Round toward – (downward)