Implementation of Fast Fourier Transform on General Purpose Computers Tianxiang Yang
FFT Formulation • Basically a matrix-vector product:
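The equation on this slide did not survive, so the following is only a minimal sketch of the matrix-vector view. It assumes the standard DFT definition X[k] = sum over n of x[n] * W^(n*k) with W = exp(-2*pi*i/N); the sign convention and the names `dft_matrix` and `dft` are illustrative assumptions, not taken from the slides.

```python
import cmath

def dft_matrix(n):
    """Return the n-by-n DFT matrix F with F[k][j] = W**(k*j), W = exp(-2*pi*i/n)."""
    w = cmath.exp(-2j * cmath.pi / n)
    return [[w ** (k * j) for j in range(n)] for k in range(n)]

def dft(x):
    """Compute the DFT of x as a plain matrix-vector product (O(N^2) operations)."""
    f = dft_matrix(len(x))
    return [sum(f_kj * x_j for f_kj, x_j in zip(row, x)) for row in f]

# Usage: a unit impulse transforms to a constant spectrum (all entries equal to 1).
spectrum = dft([1, 0, 0, 0])
```

Every output entry touches every input entry, which is where the N^2 operation count of the direct method comes from.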
FFT - What do we already have? • A history of theoretical ideas: • Gauss (1805). First, but largely unnoticed. • Cooley-Tukey (1965). Reduces the order of the number of operations from N^2 to N log2(N). Also suitable for any composite FFT length. • Yavne (1968). Requires the lowest known number of multiplications and additions for length-2^n FFTs. • Countless others.
Motivation: Divide and Conquer • Map the original problem into several sub-problems in such a way that the following inequality is satisfied: sum(cost(sub-problems)) + cost(mapping) < cost(original problem)
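The inequality above can be illustrated with a radix-2 decimation-in-time split, the textbook special case of the Cooley-Tukey idea. This is a sketch under stated assumptions: the input length is a power of two, the forward transform uses exp(-2*pi*i/N), and the function name `fft_radix2` is illustrative.

```python
import cmath

def fft_radix2(x):
    """Recursive radix-2 decimation-in-time FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return list(x)
    even = fft_radix2(x[0::2])  # sub-problem 1: even-indexed samples
    odd = fft_radix2(x[1::2])   # sub-problem 2: odd-indexed samples
    half = n // 2
    out = [0j] * n
    for k in range(half):
        # Mapping cost: one twiddle multiplication and two additions per output pair.
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + half] = even[k] - t
    return out
```

Two half-length transforms plus an O(N) recombination cost less than one full-length direct transform, and applying the split recursively gives the N log2(N) order. For N = 1024, that is on the order of 10,240 operations instead of 1,048,576.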
Main Categories of FFT Algorithms • Original Cooley-Tukey. • Split-radix. • Prime factor. • Winograd FFT algorithms. Many supporting techniques were developed, such as computing the DFT as a convolution and fast computation of cyclic convolutions.
Implementation Issues • General Purpose Computers • Digital Signal Processors • Vector and Multi-Processors • VLSI
FFT implementations on GPP • Algorithms under survey include: • FFTPACK, Temperton, SUNPERF, Sorensen, Bailey, Ooura, Krukar, QFT, Green, Singleton, NRF, FFTW • Special interest: FFTW (Fastest Fourier Transform in the West)
Overview of FFTW • Planner + Executor • FFTW includes a large collection of small, combinable programs called “codelets”. • The planner tries to minimize the actual execution time, not the number of floating point operations. • A dedicated FFTW compiler (genfft) generates the codelets, scheduling register and memory usage to take advantage of the processor pipeline; the executor combines the codelets according to the plan.
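The planner/executor split can be sketched with a toy example that chooses between two hand-written transforms by measured run time. This is not FFTW's API or its actual planning algorithm: the names `CANDIDATES`, `make_plan`, and `execute` are hypothetical, the candidate set is a stand-in for codelet compositions, and the input length is assumed to be a power of two.

```python
import cmath
import time

def naive_dft(x):
    """Direct O(N^2) DFT."""
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]

def recursive_fft(x):
    """Radix-2 recursive FFT; assumes len(x) is a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even, odd = recursive_fft(x[0::2]), recursive_fft(x[1::2])
    tw = [cmath.exp(-2j * cmath.pi * k / n) * odd[k] for k in range(n // 2)]
    return [e + t for e, t in zip(even, tw)] + [e - t for e, t in zip(even, tw)]

# Hypothetical candidate set; FFTW's real planner composes codelets instead.
CANDIDATES = {"naive_dft": naive_dft, "recursive_fft": recursive_fft}

def make_plan(n, repeats=3):
    """Toy planner: time each candidate on a length-n input and keep the fastest."""
    if n < 1 or n & (n - 1):
        raise ValueError("this sketch only handles power-of-two lengths")
    sample = [complex(i % 7, 0) for i in range(n)]
    best_name, best_time = None, float("inf")
    for name, func in CANDIDATES.items():
        start = time.perf_counter()
        for _ in range(repeats):
            func(sample)
        elapsed = time.perf_counter() - start
        if elapsed < best_time:
            best_name, best_time = name, elapsed
    return best_name

def execute(plan, x):
    """Toy executor: run the transform that the planner selected."""
    return CANDIDATES[plan](x)

plan = make_plan(64)                # plan once for a given size...
result = execute(plan, [1.0] * 64)  # ...then reuse the plan for many inputs
```

The selection criterion is wall-clock time on the current machine rather than an operation count, which mirrors the design goal stated above. The winner may differ between machines and runs.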
FFTW • Generates code specifically optimized for the current machine: an adaptive approach. • Performance results: • Significantly faster than most of the surveyed implementations. • Faster than or comparable to some machine-specific optimized libraries. • The best-performing FFT on GPP among those surveyed.
References • A. V. Oppenheim and R. W. Schafer, Discrete-Time Signal Processing. Englewood Cliffs, NJ: Prentice-Hall, 1989. • P. Duhamel and M. Vetterli, “Fast Fourier Transforms: A Tutorial Review and a State of the Art,” Signal Processing, vol. 19, Apr. 1990. • http://www.fftw.org (official FFTW site).