
Function Evaluation Using Tables and Small Multipliers

Function Evaluation Using Tables and Small Multipliers. CS252A, Spring 2005, Jason Fong. Overview: the goal is to obtain values of elementary functions such as sin(x), cos(x), and e^x. A full lookup table would be too large, so bipartite and multipartite tables are used instead.


Presentation Transcript


  1. Function Evaluation Using Tables and Small Multipliers • CS252A, Spring 2005 • Jason Fong

  2. Overview • Want to obtain values of elementary functions: sin(x), cos(x), e^x • Full lookup table would be too large • Bipartite and multipartite tables: split into multiple smaller tables and add values to obtain an approximation

  3. Table Method With Small Multipliers • Similar to the multipartite method • Approximate using a 5th-order Taylor expansion • Use a set of smaller tables and some small multipliers • Better precision for the same amount of hardware when compared to bipartite and multipartite methods

  4. Taylor Series • Approximates the value of f(x) near x = a • More terms give a better approximation • But not directly applicable for table values
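
For reference, the expansion this slide refers to is the standard Taylor series of f about a. The truncation order m is a symbol introduced here for illustration and does not appear on the slides.

```latex
f(x) \approx \sum_{i=0}^{m} \frac{f^{(i)}(a)}{i!}\,(x-a)^{i}
     = f(a) + f'(a)\,(x-a) + \frac{f''(a)}{2!}\,(x-a)^{2} + \cdots + \frac{f^{(m)}(a)}{m!}\,(x-a)^{m}
```

One reading of the last bullet: every term depends on the full value of x - a, so tabulating the terms directly would need a table indexed by all of x, which is the full lookup table again.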

  5. Making a Taylor Series Useful • Split n-bit input x into x0, x1, x2, x3, x4: x0, x1, x2, x3 are each k bits wide; x4 is p bits wide; 4k+p = n; p < k • Use first 5 terms, and set a = x0 • Rearrange terms into groups that depend on only two parts of x: this reduces the number of possible values for each group, and therefore the number of rows in each group's table of values
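
The input split described on this slide can be sketched in a few lines of Python. This is only an illustration, not the author's code: the function name `split_input` is hypothetical, and the sketch assumes x0 holds the most significant bits and x4 the least significant, which the slide does not state explicitly.

```python
def split_input(x: int, n: int = 23, k: int = 5, p: int = 3):
    """Split an n-bit integer into four k-bit fields and one p-bit field.

    Assumption: x0 is the most significant field, x4 the least significant.
    """
    if 4 * k + p != n:
        raise ValueError("field widths must satisfy 4k + p = n")
    if not 0 <= x < (1 << n):
        raise ValueError("x must fit in n bits")
    x4 = x & ((1 << p) - 1)          # lowest p bits
    rest = x >> p                    # remaining 4k bits: x0 x1 x2 x3
    fields = []
    for _ in range(4):               # peel off k bits at a time, low to high
        fields.append(rest & ((1 << k) - 1))
        rest >>= k
    x3, x2, x1, x0 = fields
    return x0, x1, x2, x3, x4


# Example with the slide's sizes (n = 23, k = 5, p = 3).
x = 0b10101_00001_11111_00000_101
print(split_input(x))  # (21, 1, 31, 0, 5)
```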

  6. Resulting Formula • Each term depends on only two parts of x • Compute all possible values of each term and create a lookup table with those values • Lookup table row number obtained by concatenating input values • Some terms require small multiplications • Add together all terms to get the function value
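
The "row number obtained by concatenating input values" step can be illustrated as follows. The helper name `row_index` is hypothetical, and which field goes on the left is an assumption made here; either order works as long as the table is built and read the same way.

```python
def row_index(hi: int, lo: int, lo_bits: int) -> int:
    """Concatenate two bit fields into one table row number (hi field on the left)."""
    if not 0 <= lo < (1 << lo_bits):
        raise ValueError("lo does not fit in lo_bits")
    return (hi << lo_bits) | lo


k, p = 5, 3
# A table addressed by two k-bit fields has 2**(2k) rows; the last row is:
print(row_index(31, 31, k))  # 1023
# A table addressed by a k-bit field and the p-bit field has 2**(k+p) rows:
print(row_index(31, 7, p))   # 255
```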

  7. Input Restrictions • x is in a fixed-point format • x is in the range [0,1) • Range reduction is common in approximation methods: apply a transformation to reduce the range of the input, obtain the approximation, then apply another transformation to obtain the final value
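
As an illustration of the three range-reduction steps, here is a minimal sketch for e^x using the identity e^x = 2^q · e^r with r = x - q·ln 2. The slides do not say which reduction was used, so this particular identity is an assumption, and `core` is a hypothetical stand-in for the table-based evaluator (`math.exp` is plugged in only so the sketch runs on its own).

```python
import math


def exp_with_range_reduction(x: float, core=math.exp) -> float:
    """Evaluate e**x with a core routine that only needs inputs in [0, 1)."""
    ln2 = math.log(2)
    q = math.floor(x / ln2)          # step 1: reduce, x = q*ln2 + r
    r = max(x - q * ln2, 0.0)        # r is in [0, ln 2); clamp tiny negative rounding error
    y = core(r)                      # step 2: approximate on the reduced range
    return math.ldexp(y, q)          # step 3: reconstruct, e**x = 2**q * e**r


print(exp_with_range_reduction(10.25), math.exp(10.25))
```

Since ln 2 is about 0.693, the reduced argument always lands inside the [0,1) range required above.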

  8. Block Diagram

  9. Area Reduction in Tables • n = 23, k = 5, p = 3 (g is the number of guard bits) • Full lookup table: 2^n entries, each 4k+p bits; ~8 million rows • Smaller tables: 2^(2k) entries of 4k+p+g bits (Table A); 2^(2k) entries of 2k+p+g bits (Table B); 2 × 2^(2k) entries of k+p+g bits (Tables C and E); 2^(p+k) entries of p+g bits (Table D); ~5000 rows
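
The row counts on this slide can be checked with a short calculation. The entry counts and widths are taken from the slide; the function name `total_bits` and the sample guard-bit values passed to it are illustrative only.

```python
n, k, p = 23, 5, 3

# (rows, width without guard bits) for each table, as listed on the slide
tables = {
    "A": (2 ** (2 * k), 4 * k + p),
    "B": (2 ** (2 * k), 2 * k + p),
    "C": (2 ** (2 * k), k + p),
    "E": (2 ** (2 * k), k + p),
    "D": (2 ** (p + k), p),
}

full_rows = 2 ** n
small_rows = sum(rows for rows, _ in tables.values())


def total_bits(g: int) -> int:
    """Total storage of the five tables when each entry carries g guard bits."""
    return sum(rows * (width + g) for rows, width in tables.values())


print(full_rows)      # 8388608
print(small_rows)     # 4352
print(total_bits(0))  # 54016
```

The exact sum is 4352 rows, the same order as the slide's rounded "~5000", against roughly 8.4 million rows for the full table.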

  10. Multipliers • Two small multipliers: k × (k+p+g) bits and k × (p+g) bits • One operand is less than ¼ the width of the input • Modern FPGAs include small multipliers
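
To make the multiplier sizes concrete, the sketch below computes operand and product widths. The operand widths are the slide's; the value g = 4 is a hypothetical guard-bit count chosen only for the example, and the product width uses the standard fact that an a-bit by b-bit unsigned product fits in a+b bits.

```python
def multiplier_shapes(k: int, p: int, g: int):
    """Return (narrow operand bits, wide operand bits, product bits) for both multipliers."""
    shapes = []
    for wide in (k + p + g, p + g):
        shapes.append((k, wide, k + wide))
    return shapes


n, k, p = 23, 5, 3
print(multiplier_shapes(k, p, g=4))  # [(5, 12, 17), (5, 7, 12)]
print(k < n / 4)                     # True: the k-bit operand is under a quarter of n
```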

  11. Implementation • Java program calculates values of tables • Function evaluator implemented using Altera Quartus II • Size and delay measurements for Altera Stratix II FPGA

  12. Building Table Values • Java program generates Verilog code implementing each lookup table • Iterate through each combination of (x0, x1), (x0, x2), etc. and calculate the corresponding value of the table • Check correctness by iterating through all values of x and comparing with the function's true value
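
The build-and-check loop can be sketched on a much smaller problem. The example below is not the five-table formula from the slides (which is not reproduced in this transcript) and not the author's Java/Verilog flow. It swaps in a simpler two-table, first-order decomposition, f(x0 + x1 + x2) ≈ A(x0, x1) + B(x0, x2) with A = f(x0 + x1) and B = f'(x0)·x2, and the sizes K and N are hypothetical.

```python
import math

K = 4          # bits per input field (hypothetical small size)
N = 3 * K      # total input bits, so x takes 2**12 values in [0, 1)
SIZE = 1 << K


def build_tables(f, df):
    """Fill both tables by iterating over every (x0, x1) and (x0, x2) pair."""
    table_a = [0.0] * (SIZE * SIZE)
    table_b = [0.0] * (SIZE * SIZE)
    for i0 in range(SIZE):
        x0 = i0 / SIZE
        for i1 in range(SIZE):
            table_a[(i0 << K) | i1] = f(x0 + i1 / SIZE ** 2)
        for i2 in range(SIZE):
            table_b[(i0 << K) | i2] = df(x0) * (i2 / SIZE ** 3)
    return table_a, table_b


def evaluate(x_bits, table_a, table_b):
    """Look up both tables using concatenated fields and add the results."""
    i0 = x_bits >> (2 * K)
    i1 = (x_bits >> K) & (SIZE - 1)
    i2 = x_bits & (SIZE - 1)
    return table_a[(i0 << K) | i1] + table_b[(i0 << K) | i2]


def max_error(f, table_a, table_b):
    """Exhaustive check: compare against f for every possible input."""
    worst = 0.0
    for x_bits in range(1 << N):
        exact = f(x_bits / (1 << N))
        worst = max(worst, abs(evaluate(x_bits, table_a, table_b) - exact))
    return worst


table_a, table_b = build_tables(math.exp, math.exp)  # f = f' = exp
worst = max_error(math.exp, table_a, table_b)
print(len(table_a) + len(table_b), "rows instead of", 1 << N)
print("max abs error:", worst)
```

For e^x on [0,1) the error of this simplified scheme is bounded by e·2^(-3K), so the exhaustive check should report a value below about 6.7e-4 for K = 4.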

  13. Guard Bits • Can find the worst-case number of guard bits required based on logic structure • May not actually need all the guard bits • Adjust guard bit value and find the minimum needed for a particular function
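
The "adjust and find the minimum" step amounts to a small search. The sketch below is illustrative rather than the author's procedure: it uses a simplified two-table scheme for e^x (A = f(x0 + x1), B = f'(x0)·x2) instead of the slides' five tables, and the sizes K and OUT_BITS and the accuracy goal TARGET are assumptions made for the example.

```python
import math

K = 4                        # bits per input field (hypothetical)
N = 3 * K                    # input width
SIZE = 1 << K
OUT_BITS = 10                # fractional bits the result should be good to (hypothetical)
TARGET = 2.0 ** -OUT_BITS    # accuracy goal for this sketch


def quantize(value, frac_bits):
    """Round value to a multiple of 2**-frac_bits, as a stored table entry would be."""
    scale = 1 << frac_bits
    return round(value * scale) / scale


def max_error_with_guard_bits(g):
    """Worst error over all inputs when entries keep OUT_BITS + g fractional bits."""
    frac_bits = OUT_BITS + g
    table_a = [[quantize(math.exp(i0 / SIZE + i1 / SIZE ** 2), frac_bits)
                for i1 in range(SIZE)] for i0 in range(SIZE)]
    table_b = [[quantize(math.exp(i0 / SIZE) * i2 / SIZE ** 3, frac_bits)
                for i2 in range(SIZE)] for i0 in range(SIZE)]
    worst = 0.0
    for x_bits in range(1 << N):
        i0 = x_bits >> (2 * K)
        i1 = (x_bits >> K) & (SIZE - 1)
        i2 = x_bits & (SIZE - 1)
        approx = table_a[i0][i1] + table_b[i0][i2]
        worst = max(worst, abs(approx - math.exp(x_bits / (1 << N))))
    return worst


def minimum_guard_bits(max_g=8):
    """Try g = 0, 1, 2, ... and return the first value that meets TARGET."""
    for g in range(max_g + 1):
        if max_error_with_guard_bits(g) <= TARGET:
            return g
    raise ValueError("no guard-bit count up to max_g meets the target")


g_min = minimum_guard_bits()
print("minimum guard bits:", g_min)
```

A worst-case analysis of this toy setup guarantees that two guard bits suffice; the exhaustive search reports whether fewer are enough for this function, which is the gap the slide points at.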

  14. Results • Synthesized for an Altera Stratix II with 12480 ALUTs and 96 DSP blocks (used as multipliers) • f(x) = e^x, n = 23: 2143 ALUTs (17%), 4 DSP blocks (4%), 23 ns delay

  15. In Comparison...

  16. Possible Improvements • Optimize final adder: currently using a generic parallel adder, but not all operands are the same width, so a custom adder can be used instead • Merge multiplications into the final adder: move partial product arrays into the adder • Change splitting of the x input: improves table size, but gives more complicated formulas for table values

  17. References • D. Defour, F. de Dinechin, and J.-M. Muller, "A New Scheme for Table-Based Evaluation of Functions," Proc. 36th Asilomar Conf. Signals, Systems, and Computers, Nov. 2002 • F. de Dinechin and A. Tisserand, "Multipartite Table Methods," IEEE Transactions on Computers, March 2005 • M. Ercegovac and T. Lang, Digital Arithmetic, Ch. 10
