130 likes | 353 Views
FFT Survey. Eric Jackowski ECE 734. Fourier Transforms. The Discrete Fourier Transform (DFT) converts time domain data to frequency domain data. where the Twiddle Factor is The Inverse DFT (IDFT) coverts frequency domain data to the time domain
E N D
FFT Survey Eric Jackowski ECE 734
Fourier Transforms • The Discrete Fourier Transform (DFT) converts time domain data to frequency domain data. where the Twiddle Factor is • The Inverse DFT (IDFT) coverts frequency domain data to the time domain • The DFT/IDFT requires N2 complex multiplies and N(N-1) complex additions
Fast Fourier Transform • The FFT provides a fast method for computing the DFT. • Break an N-point DFT into two N/2-Point FFTs plus N/2 butterfly operations • This process continues by recursively dividing the DFT • Reduces complexity of O(N2) to O(NlogrN)
Radix-2 FFT • An N-point radix-2 FFT requires log2(N) x (N/2) butterfly operations • With the DIT FFT, the inputs are in bit-reversed order and outputs are in sequential order • Sequential order: 000, 001, 010, 011, 100, 101, 110, 111 0 1 2 3 4 5 6 7 • Bit-reversed order: 000, 100, 010, 110, 001, 101, 011, 111 0 4 2 6 1 5 3 7 • With the decimation in frequency (DIT) FFT, the inputs are in sequential order and outputs are in bit-reversed order
Butterfly Operations • A radix-2 DIT butterfly operations is represented as: • where A, B, W, X, and Y are all complex numbers. • The complex operations can be performed as:
DIT FFT Radix-2 Decimation-in-time (DIT) FFT with N = 8
DIF FFT • Coefficients change • Data accesses change • Butterflies change • Outputs bit reversed • Number of butterflies and amount of memory same
Higher-Radix FFTs • Radix-r FFT divides an N-point DFT into (N/r)*logr(N) r-point FFTs • Reduces the number of butterflies, but each butterfly is more complex • Reduces total number of complex multiplies • Can improve latency and throughput • Complicates overall design Radix-4 DIT butterfly
FFT Implementations • Sequential FFT Processor - only has one butterfly unit • Requires (N/r)*logr(N) butterfly • Relatively low area, but long latency and low throughput • Pipelined (Cascaded) FFT Processor – one butterfly unit per stage or column (logrN butterfly units total) • One butterfly does not need complex multiplier • Higher area, but lower latency and better throughput • May be easier to access memory and support multiple FFT sizes • Parallel-iterative FFT Processor – one butterfly unit per row (N/r butterfly units total) • Array FFT Processor – fully parallel implementation ((N/r)*logr(N) butterfly units total)
Memory Address Generation • Handling bit reversed input or output • Sequential FFT Processor – Handle by how you read from or write to memory • Pipelined FFT Processor – May need an extra memory buffer that is written to and then read from in a different order • Multiple Banks for low power • Minimize twiddle memory accesses • Reorder butterflies
Other Research • Additions and multiplications can increase the size of the data • May want to use larger internal datapath • May want to saturate results of additions • May want to round results of multiplies • Supporting different FFT sizes • Sequential FFT processor – change address generation based on FFT size • Pipelined FFT processor – skip stages that aren’t needed • Decreasing Twiddle Rom size
IFFT • Implementing the IFFT using the FFT • Swap the real and imaginary inputs • Perform the FFT • Swap the real and imaginary outputs • Scale output by 1/N