Radar Pulse Compression Using the NVIDIA CUDA SDK. Stephen Bash, David Carpman, and David Holl. HPEC 2008 September 23-25, 2008.
This work is sponsored by the Air Force Research Laboratory under Air Force contract FA8721-05-C-0002. Opinions, interpretations, conclusions and recommendations are those of the author and not necessarily endorsed by the United States Government.
NVIDIA Compute Unified Device Architecture SDK
HPEC 07:
• Create custom kernels that run on GPU
• Extension of C language
• Provides driver- and runtime-level APIs
• Includes numerical libraries
  • CUFFT
  • CUBLAS
• $/GFLOP: GPU = $1.27, CPU = $29.18
Radar Pulse Compression
• Waveform design and processing to achieve higher range resolution and sensitivity*
• Processing consists of convolution with FIR filter
• Doppler tolerant (top): traditional frequency-domain convolution
• Doppler intolerant (bottom): additional FFT and Doppler correction required
[Figure: block diagrams of the two processing chains (Doppler tolerant on top, Doppler intolerant on bottom). Block labels: Replica, Fast Time FFT, Fast Time IFFT, Slow Time FFT, Doppler Correction.]
* Skolnik, Radar Handbook, Second Edition. McGraw Hill Publishing, Boston, MA, 1990.
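The Doppler-tolerant chain above (fast-time FFT, multiply by the replica spectrum, fast-time IFFT) can be sketched in a few lines. This is a minimal illustration, not the authors' CUDA implementation: a pure-Python radix-2 FFT stands in for CUFFT, and the chirp waveform, its length `N`, the echo `delay`, and all function names are assumptions chosen for the example.

```python
import cmath

def fft(x, inverse=False):
    """Recursive radix-2 FFT (unscaled); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    if n % 2:
        raise ValueError("length must be a power of two")
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        tw = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + tw
        out[k + n // 2] = even[k] - tw
    return out

def ifft(spectrum):
    """Inverse FFT with the 1/n scaling applied once at the top level."""
    n = len(spectrum)
    return [v / n for v in fft(spectrum, inverse=True)]

def next_pow2(n):
    p = 1
    while p < n:
        p *= 2
    return p

def fft_convolve(x, h):
    """Linear convolution via FFT -> pointwise multiply -> IFFT."""
    n_out = len(x) + len(h) - 1
    n_fft = next_pow2(n_out)  # zero-pad so circular wrap-around cannot occur
    X = fft(list(x) + [0j] * (n_fft - len(x)))
    H = fft(list(h) + [0j] * (n_fft - len(h)))
    return ifft([a * b for a, b in zip(X, H)])[:n_out]

def direct_convolve(x, h):
    """Time-domain FIR convolution, used only as a cross-check."""
    y = [0j] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y

# Hypothetical test signal: a unit-amplitude chirp echo at a known delay.
N, delay = 16, 20
chirp = [cmath.exp(1j * cmath.pi * k * k / N) for k in range(N)]
echo = [0j] * delay + chirp + [0j] * 12

# Matched filter = conjugated, time-reversed replica of the transmitted pulse.
matched_filter = [c.conjugate() for c in reversed(chirp)]

compressed = fft_convolve(echo, matched_filter)
peak_index = max(range(len(compressed)), key=lambda i: abs(compressed[i]))
```

With a unit-amplitude pulse, the compressed output peaks at index `delay + N - 1` with magnitude `N`, which is the sensitivity gain the slide refers to. On a GPU the three transforms would typically be batched across pulses rather than run one at a time as here.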
GPU vs. CPU Comparison
• CPU vs GPU comparison in real-world conditions
• 2 GHz dual quad-core AMD Opterons vs EVGA GeForce 8800 Ultra
• Memory transfer to and from GPU included in timing
[Figure: 1D FFT. GPU speedup as a function of FFT size (1024 to 65536) and batch size (1 to 256).]
[Figure: Processing Time Per Stage. Stage time (ms), CPU and GPU. Stage labels: Doppler Window, Doppler FFT, Fast Time FFT, Multiply Replica 1, Multiply Replica 2, Multiply Replica 3, Fast Time IFFT (three), Doppler Correction, Extract Range Region (three).]
Reference: $/GFLOP
As of July 2007, these products represent the top-of-the-line consumer CPU and graphics card according to floating-point computational power:
• Kentsfield Core 2 Extreme QX6800
  37.7 GFLOPS – fastest CPU as of 7/16/2007
  http://www.tomshardware.com/2007/07/16/cpu_charts_2007/page36.html
  $1100 – price as of March 10, 2008
  http://www.google.com/products?q=Kentsfield+Core+2+Extreme+QX6800
  $/GFLOPS = $29.18
  Notes: Price excludes motherboard + power supply + memory + GPU
• EVGA GeForce 8800 Ultra Superclocked (NVIDIA)
  576 GFLOPS – theoretical peak
  http://en.wikipedia.org/wiki/GeForce_8_Series
  $730 – price as of March 10, 2008
  http://www.google.com/products?q=768-P2-N887-AR&scoring=p
  $/GFLOPS = $1.27
  Notes: Price includes 768 MB GDDR3 memory, but excludes: motherboard + power supply + CPU
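The two $/GFLOPS figures follow directly from the quoted prices and throughput numbers. The short check below reproduces them; the function and variable names are illustrative only, and note that the source labels the GPU figure as a theoretical peak.

```python
def dollars_per_gflops(price_usd, gflops):
    """Cost efficiency: price divided by floating-point throughput."""
    return price_usd / gflops

# Figures quoted on the reference slide (prices as of March 10, 2008).
cpu_cost = dollars_per_gflops(1100, 37.7)  # Core 2 Extreme QX6800
gpu_cost = dollars_per_gflops(730, 576)    # GeForce 8800 Ultra, theoretical peak

# How many times more the CPU costs per GFLOPS than the GPU.
cost_ratio = cpu_cost / gpu_cost

print(f"CPU: ${cpu_cost:.2f}/GFLOPS, GPU: ${gpu_cost:.2f}/GFLOPS")
```

On these numbers the CPU costs roughly 23 times as much per GFLOPS as the GPU.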