Verilog Transcendental Functions for Numerical Testbenches
Mark G. Arnold, University of Manchester Institute of Science & Technology, UK
Colin Walter, University of Manchester Institute of Science & Technology, UK
Freddy Engineer, Xilinx, Inc., San Jose, CA
Numeric-intensive embedded hardware systems use transcendental functions
• FFT: sin(x), cos(x)
• Neural nets: 1/(1+exp(-x))
• Graphics: sqrt(x*x+y*y+z*z), trig
How to test such designs in Verilog?
• Need a testbench aware of math functions
• Problem: 1993 Verilog lacks these functions
• Problem: Access to C functions needs PLI
• Problem: Synthesizable Verilog uses reg
• Numeric algorithm uses real
Ways to use reg as real
• Fixed point (FX): cheap but hard to design
• Floating point (FP): easier but more expensive
• Logarithmic number system (LNS): cheaper than FX, easy as FP, but weird
• See www.xlnsresearch.com for more
We need log(x) and exp(x) for LNS, so why not design something more general: a transcendental package for Verilog?
• Useful for testbenches that verify hardware involving transcendental functions
• Available: www.cs.uwyo.edu/~marnold/verilogmath.html
module math;
  `include "math.v"
endmodule
Black-Box Testbench
• Testbench Verilog does not know about sin(x), log(x)
• Can only compare against expected overall behaviour
[Diagram: testbench connected to the embedded hardware, which contains sin(x), log(x), etc. units and other hardware]
White-Box Testbench
• Testbench has Verilog sin(x) and Verilog log(x)
• Can test individual function units
[Diagram: testbench connected to the embedded hardware, which contains sin(x), log(x), etc. units and other hardware]
sin(x)
• Syntax: math.sin(x)
• Computed by: c1 x + c3 x^3 + c5 x^5 + c7 x^7
• in the range: -π/2 < x < π/2
• Range reduction: x < 0, sin(x) = -sin(-x)
• Range reduction: x > π/2, sin(x) = -sin(x - π)
• Errors: none
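The range reduction and odd polynomial above can be sketched as follows. This is a minimal illustration, not the package's code: the module name `math_sketch` is hypothetical, and because the slide does not list c1, c3, c5 and c7, the Taylor coefficients 1, -1/6, 1/120 and -1/5040 are assumed in their place. Taylor coefficients lose accuracy near ±π/2, so this sketch is not expected to reach 23-bit precision.

```verilog
// Sketch only: Taylor coefficients stand in for the unlisted c1..c7.
module math_sketch;
  localparam real PI = 3.14159265358979;

  function real sin;
    input real x;
    real r, r2, sign;
    begin
      r = x;
      sign = 1.0;
      // x < 0: sin(x) = -sin(-x)
      if (r < 0.0) begin
        r = -r;
        sign = -sign;
      end
      // x > pi/2: sin(x) = -sin(x - pi), repeated until r <= pi/2
      while (r > PI / 2.0) begin
        r = r - PI;
        sign = -sign;
      end
      r2 = r * r;
      // c1*x + c3*x^3 + c5*x^5 + c7*x^7 in Horner form (assumed coefficients)
      sin = sign * r * (1.0 + r2 * (-1.0/6.0 + r2 * (1.0/120.0 - r2 / 5040.0)));
    end
  endfunction
endmodule

// Hypothetical testbench calling the function by hierarchical name
module tb_sin;
  real x;
  initial begin
    x = 2.0;
    $display("sin(%f) ~= %f", x, math_sketch.sin(x));
  end
endmodule
```

The testbench reaches the function through a hierarchical name, which mirrors the `math.sin(x)` syntax shown on the slide.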
cos(x)
• Syntax: math.cos(x)
• Computed by: sin(x + π/2)
• Range reduction: like sin(x)
• Errors: none
tan(x)
• Syntax: math.tan(x)
• Computed by: sin(x)/cos(x)
• Errors: x = π/2
tan^-1(x)
• Syntax: math.atan(x)
• Computed by: b0 + a1/(x^2 + b1 - a2/(x^2 + b2 - a3/(x^2 + b3)))
• in the range: 0 < x < 1
• Range reduction: x < 0, tan^-1(x) = -tan^-1(-x)
• Range reduction: x > 1, tan^-1(x) = π/2 - tan^-1(1/x)
• Errors: none
cos^-1(x)
• Syntax: math.acos(x)
• Computed by: tan^-1( sqrt(1.0 - x^2) / x )
• in the range: 0 < x < 1
• Errors: x < 0, x > 1
sin^-1(x)
• Syntax: math.asin(x)
• Computed by: tan^-1( x / sqrt(1.0 - x^2) )
• in the range: 0 < x < 1
• Errors: x < 0, x > 1
e^x
• Syntax: math.exp(x)
• Computed by: 2^(x/ln(2))
• Range reduction: e^(-x) = 1/e^x
• Errors: x > 177
x^y
• Syntax: math.pow(x,y)
• Computed by: e^(y ln(x))
• in the range: x > 0
• Errors: x <= 0, y ln(x) > 177
sqrt(x)
• Syntax: math.sqrt(x)
• Computed by: e^(ln(x)/2)
• in the range: x > 0
• Errors: x < 0
ln(x)
• Syntax: math.log(x)
• Computed by: log2(x) / log2(e)
• in the range: x > 0
• Errors: x <= 0
2^x
• Syntax: N/A
• Computed by: products of rootof2(i)
• Errors: x > 255
For k bits of precision, rootof2(-k) starts with the 2^k-th root of two
• Our function uses k = 23 bits of precision: rootof2(-23) = 2^(1/8388608) = 1.000000082629586
• Simpler example: k = 2, rootof2(-2) = 2^(1/4) = 1.1892
• When the ith bit of x is one, multiply the product by the corresponding rootof2(i)
rootof2(i) squares on each iteration
Example: x = 5.75 = 101.11 (base 2), with k = 2
• i=-2: (2^(1/4))^1 = 1.1892; bit is 1, 2^x = 1.1892
• i=-1: (2^(1/4))^2 = 1.1892^2 = 1.4142; bit is 1, 2^x = 1.4142*1.1892
• i= 0: (2^(1/4))^4 = 1.4142^2 = 2.0; bit is 1, 2^x = 2.0*1.4142*1.1892
• i=+1: (2^(1/4))^8 = 2.0^2 = 4.0; bit is 0, 2^x = 2.0*1.4142*1.1892 (unchanged)
• i=+2: (2^(1/4))^16 = 4.0^2 = 16.0; bit is 1, 2^x = 16.0*2.0*1.4142*1.1892 = 53.8165
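Code that computes 2^x1
  prod = 1.0;
  power = 128.0;                    // weight of bit i = 7
  for (i = 7; i >= -23; i = i-1)
    begin
      if (x1 >= power)              // bit i of x1 is one (>= so an exact match counts)
        begin
          prod = prod * rootof2(i);
          x1 = x1 - power;
        end
      power = power / 2.0;          // weight of the next lower bit
    end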
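To make the loop above runnable on its own, it can be wrapped in a function together with a stand-in for rootof2(i). This is a sketch under assumptions: the module name `exp2_sketch` and the function name `exp2` are hypothetical, and the slides do not show how rootof2 is implemented, so the version here simply squares upward from the quoted constant 2^(1/8388608) for negative i and from 2.0 for non-negative i.

```verilog
// Sketch only: rootof2 here is an assumed implementation, not the package's.
module exp2_sketch;
  // Returns 2^(2^i) for -23 <= i <= 7
  function real rootof2;
    input integer i;
    integer j;
    real r;
    begin
      if (i >= 0) begin
        r = 2.0;                       // 2^(2^0), exact
        for (j = 0; j < i; j = j + 1)
          r = r * r;                   // each squaring doubles the exponent
      end else begin
        r = 1.000000082629586;         // 2^(2^-23), value quoted on the slide
        for (j = -23; j < i; j = j + 1)
          r = r * r;
      end
      rootof2 = r;
    end
  endfunction

  // Returns 2^x, assuming 0 <= x < 256
  function real exp2;
    input real x;
    integer i;
    real x1, prod, power;
    begin
      x1 = x;
      prod = 1.0;
      power = 128.0;
      for (i = 7; i >= -23; i = i - 1) begin
        if (x1 >= power) begin
          prod = prod * rootof2(i);
          x1 = x1 - power;
        end
        power = power / 2.0;
      end
      exp2 = prod;
    end
  endfunction

  initial $display("exp2(5.75) ~= %f", exp2(5.75));
endmodule
```

Recomputing rootof2(i) by repeated squaring on every call is wasteful. A lookup table would be the obvious alternative, but the squaring form keeps the sketch close to the "squares on each iteration" description.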
log2(x)
• Syntax: N/A
• Computed by: iteration involving rootof2(i)
• Errors: x < 0
Example: x = 53.8165
• 53.8165 >= 16.0? yes, 53.8165/16.0 = 3.3635; log2(x) = 1??.?? (base 2)
• 3.3635 >= 4.0? no; log2(x) = 10?.??
• 3.3635 >= 2.0? yes, 3.3635/2.0 = 1.68176; log2(x) = 101.??
• 1.68176 >= 1.4142? yes, 1.68176/1.4142 = 1.1892; log2(x) = 101.1?
• 1.1892 >= 1.1892? yes; log2(x) = 101.11 (base 2) = 5.75
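Code that computes log2(re)
  log2 = 0.0;
  for (i=7; i>=-23; i=i-1)
    begin
      if (re >= rootof2(i))         // >= matches the worked example
        begin
          re = re/rootof2(i);
          log2 = 2.0*log2 + 1.0;    // shift in a one bit
        end
      else
        log2 = log2*2;              // shift in a zero bit
    end
  // the 31 bits are shifted in as an integer, so log2 is scaled by 2^23 here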
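The loop above can be exercised in isolation with the sketch below. Several details are assumptions rather than the package's code: the module name `log2_sketch` is hypothetical, rootof2 is an assumed repeated-squaring implementation, the input is assumed to satisfy 1 <= x < 2^256, and the final division by 2^23 is added here because the loop shifts its 31 result bits in as an integer.

```verilog
// Sketch only: rootof2 and the final rescaling are assumptions.
module log2_sketch;
  // Returns 2^(2^i) for -23 <= i <= 7
  function real rootof2;
    input integer i;
    integer j;
    real r;
    begin
      if (i >= 0) begin
        r = 2.0;
        for (j = 0; j < i; j = j + 1)
          r = r * r;
      end else begin
        r = 1.000000082629586;         // 2^(2^-23), value quoted on the slide
        for (j = -23; j < i; j = j + 1)
          r = r * r;
      end
      rootof2 = r;
    end
  endfunction

  // Returns log2(x), assuming 1 <= x < 2^256
  function real log2;
    input real x;
    integer i;
    real re, acc;
    begin
      re = x;
      acc = 0.0;
      for (i = 7; i >= -23; i = i - 1) begin
        if (re >= rootof2(i)) begin
          re = re / rootof2(i);        // remove the factor 2^(2^i)
          acc = 2.0 * acc + 1.0;       // shift in a one bit
        end else
          acc = 2.0 * acc;             // shift in a zero bit
      end
      log2 = acc / 8388608.0;          // undo the 2^23 scaling
    end
  endfunction

  initial $display("log2(53.8165) ~= %f", log2(53.8165));
endmodule
```

Because the rootof2 values in this sketch carry small rounding errors, a comparison that is an exact tie on paper (such as 1.1892 >= 1.1892 in the worked example) may fall either way, so the result can differ from 5.75 in the low-order bits.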
Conclusions
• We provide 23-bit functions for testbench use
• We have trig functions useful for FFT and graphics
• We have exp, log, pow and sqrt functions useful for LNS and neural nets
• Our functions are compact and portable, don't need PLI, and encourage white-box testing
IEEE 754 standard for Floating Point (FP):
• Mantissa: reg with significant bits
• Exponent: reg that determines scaling
• Each has a separate datapath
• Simulator supports real type: real p,t,a,b;
•   OK: t=a+b; OK: p=a*b;
• Synthesis does not support real
•   WRONG: p=a*b; WRONG: t=a+b;
• Have to design by hand: both + and * are expensive
• Most embedded systems avoid FP
• Debugging: $display($bitstoreal(a));
Fixed point (FX)
• reg scaled by a fixed power of two
• Simulator supports FX only partially: OK: t=a+b; WRONG: p=a*b;
• Synthesis is the same
• Have to design by hand: rounding for *, scaling for *
• + is the same as integer: cheap
• Scaling is difficult: FX delays time to market
• But FX is less expensive than FP
• Most embedded systems use FX
• Debugging: $display(`SCALE * a); where `SCALE is a constant the designer determines
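A small sketch may make the hand-designed scaling, the rounding for *, and the `SCALE debugging idiom concrete. Everything specific here is an assumption for illustration: the module name `fx_sketch`, the unsigned format with 8 fraction bits (so `SCALE is 1/256), and the round-to-nearest choice.

```verilog
// Sketch only: assumes unsigned fixed point with 8 fraction bits.
`define SCALE (1.0/256.0)

module fx_sketch;
  reg [15:0] a, b, t, p;
  reg [31:0] full;

  initial begin
    a = 16'h0180;                      // 1.5  = 384/256
    b = 16'h0240;                      // 2.25 = 576/256
    t = a + b;                         // + is the same as integer addition
    full = a * b;                      // raw product has 16 fraction bits
    p = (full + 32'h00000080) >> 8;    // round to nearest, rescale to 8 fraction bits
    $display("t = %f", `SCALE * t);    // 1.5 + 2.25 = 3.75
    $display("p = %f", `SCALE * p);    // 1.5 * 2.25 = 3.375
  end
endmodule
```

Writing `p = a * b;` directly would keep the low 16 bits of a product that has 16 fraction bits, which is why the slide marks it WRONG for FX.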
LNS
• reg has fixed-point logarithm
• Neither synthesis nor simulation supports it
• Multiply is as cheap as +
• Automatic scaling like FP
• Lower cost and power than FX
• LNS is seldom used
• Debugging: 1993 Verilog lacks log/exp