Martijn v/d Horst M.G.v.d.Horst@tue.nl

Recursive Filtering on a Vector DSP with Linear Speedup Martijn v/d Horst M.G.v.d.Horst@tue.nl

Outline • Introduction • Vector DSP • Linear Speedup • Recursive (IIR) Filters • Implementation • Generalization • Improvement • Conclusion • Future Work

Introduction • Moore’s Law: The processing power of a microchip doubles every 18 months. • Gilder’s Law: The total bandwidth of communication systems triples every 12 months. • Corollary: Without parallelism, our communication systems will run out of processing power.

Vector DSP • SIMD processor with vector length P • Operations: basic element-wise operations, strided memory access, intra-add operation • One operation per clock cycle • Why? Flexibility, parallelism, low cost
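The three operation classes listed above can be modelled in a few lines of plain Python. This is only an illustrative sketch: the function names (`vadd`, `vmul`, `strided_load`, `intra_add`) and the vector length `P = 4` are assumptions made for the example, not the instruction set of any particular DSP.

```python
P = 4  # vector length; a small value assumed here for illustration


def vadd(a, b):
    """Element-wise addition of two vectors."""
    return [x + y for x, y in zip(a, b)]


def vmul(a, b):
    """Element-wise multiplication of two vectors."""
    return [x * y for x, y in zip(a, b)]


def strided_load(memory, base, stride, length=P):
    """Load `length` elements starting at `base`, `stride` addresses apart."""
    return [memory[base + i * stride] for i in range(length)]


def intra_add(a):
    """Sum all elements of one vector into a single scalar."""
    return sum(a)


memory = list(range(16))
v = strided_load(memory, base=1, stride=3)  # elements at addresses 1, 4, 7, 10
dot_product = intra_add(vmul(v, [1, 1, 1, 1]))
```

On a real vector DSP each of these calls would correspond to roughly one instruction operating on all P lanes at once.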

Linear Speedup • If you pay twice the cost you get twice the performance (No diminishing returns) • Measure of performance: Throughput (Outputs per clock cycle) • Measure of cost: vector size of the DSP • Approach: produce a number (depending on the vector size) of outputs in constant time.
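One way to write the approach down, as a sketch: the symbols T (throughput), C (clock cycles per block of outputs) and k (a constant factor) are assumptions introduced here, not notation from the slides.

```latex
% Throughput when L(P) outputs are produced in a constant number of cycles C
T(P) = \frac{L(P)}{C}

% If the number of outputs grows proportionally to the vector size, L(P) = kP,
% then throughput is linear in P:
T(P) = \frac{k}{C}\,P
\quad\Longrightarrow\quad
T(2P) = 2\,T(P)
```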

FIR Filters • The output of an N-th order FIR filter is the weighted sum of the current input and the N previous inputs. • [Figure: block diagram labeled Input and Output]
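A minimal sketch of that definition in Python, assuming zero initial conditions (inputs before the first sample are taken as 0). The name `fir_filter` and the coefficient list `b` are illustrative choices.

```python
def fir_filter(b, u):
    """N-th order FIR filter: y[n] = sum over k = 0..N of b[k] * u[n - k].

    `b` holds the N + 1 weights; samples before u[0] are assumed to be zero.
    """
    y = []
    for n in range(len(u)):
        acc = 0.0
        for k, bk in enumerate(b):
            if n - k >= 0:
                acc += bk * u[n - k]
        y.append(acc)
    return y


# First-order moving average of a constant input
smoothed = fir_filter([0.5, 0.5], [1.0, 1.0, 1.0])
```

Every output depends only on inputs, so all outputs can be computed independently of each other.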

IIR Filters • The output of an N-th order IIR filter is the weighted sum of the current input, the N previous inputs and the N previous outputs. • [Figure: block diagram labeled Input and Output]
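The same definition as a Python sketch, again assuming zero initial conditions. The sign convention (feedback weights subtracted, `a[0]` fixed to 1) and the name `iir_filter` are assumptions made for this example.

```python
def iir_filter(b, a, u):
    """N-th order IIR filter with zero initial conditions.

    y[n] = sum over k = 0..N of b[k] * u[n - k]
         - sum over k = 1..N of a[k] * y[n - k]
    `a[0]` is assumed to be 1 and is not used.
    """
    y = []
    for n in range(len(u)):
        acc = 0.0
        for k, bk in enumerate(b):
            if n - k >= 0:
                acc += bk * u[n - k]
        for k in range(1, len(a)):
            if n - k >= 0:
                acc -= a[k] * y[n - k]  # feedback: needs earlier outputs
        y.append(acc)
    return y


# First-order example: y[n] = u[n] + 0.5 * y[n - 1], driven by an impulse
impulse_response = iir_filter([1.0], [1.0, -0.5], [1.0, 0.0, 0.0, 0.0])
```

The feedback term is what makes the filter recursive: each output must wait for the previous ones, which is the dependency that stands in the way of straightforward vectorization.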

Describing Filters • Transfer Function: • Difference Equation: • State space form:
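The equations for this slide did not survive extraction. For reference, the standard textbook forms of the three descriptions are given below; the symbols (input u, output y, state x, coefficients b_k and a_k, matrices A, B, C, D) are assumed notation and may differ from what the original slide used.

```latex
% Transfer function of an N-th order filter
H(z) = \frac{\sum_{k=0}^{N} b_k z^{-k}}{1 + \sum_{k=1}^{N} a_k z^{-k}}

% Difference equation
y[n] = \sum_{k=0}^{N} b_k\, u[n-k] \;-\; \sum_{k=1}^{N} a_k\, y[n-k]

% State space form (x[n] is the N-element state vector)
\mathbf{x}[n+1] = A\,\mathbf{x}[n] + B\,u[n], \qquad
y[n] = C\,\mathbf{x}[n] + D\,u[n]
```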

Block-State The state space form can be rewritten into block state space form:
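The equation for this slide is missing from the extracted text. A standard way to write the block form, obtained by applying the state update L times, is sketched below. It assumes a state space form x[n+1] = A x[n] + B u[n], y[n] = C x[n] + D u[n] with block size L; this notation is an assumption and may not match the original slide.

```latex
% Input and output blocks of L consecutive samples
\mathbf{u}_n = \begin{bmatrix} u[n] & u[n+1] & \cdots & u[n+L-1] \end{bmatrix}^{T},
\qquad
\mathbf{y}_n = \begin{bmatrix} y[n] & y[n+1] & \cdots & y[n+L-1] \end{bmatrix}^{T}

% State update: one recursive step per block instead of one per sample
\mathbf{x}[n+L] = A^{L}\,\mathbf{x}[n]
  + \begin{bmatrix} A^{L-1}B & \cdots & AB & B \end{bmatrix} \mathbf{u}_n

% Output block: depends only on the state at the start of the block and the inputs
\mathbf{y}_n =
\begin{bmatrix} C \\ CA \\ \vdots \\ CA^{L-1} \end{bmatrix} \mathbf{x}[n]
+
\begin{bmatrix}
D         &        &        &   \\
CB        & D      &        &   \\
\vdots    & \ddots & \ddots &   \\
CA^{L-2}B & \cdots & CB     & D
\end{bmatrix} \mathbf{u}_n
```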

Block-State Architecture

Block-State Architecture • State of the art (2004) in SIMD • A better VLSI implementation has existed since 1987
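To make the block-state idea concrete, the sketch below computes L outputs per step from the state at the start of the block and compares the result with a sample-by-sample reference. It is a plain-Python illustration under assumed notation (x[n+1] = A x[n] + B u[n], y[n] = C x[n] + D u[n], single input and output), not the architecture from the slides or the EVP16 implementation; all function names are hypothetical.

```python
def mat_vec(M, v):
    """Matrix-vector product for lists of lists."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def state_space_filter(A, B, C, D, u):
    """Reference: x[n+1] = A x[n] + B u[n], y[n] = C x[n] + D u[n]."""
    x = [0.0] * len(B)
    y = []
    for un in u:
        y.append(dot(C, x) + D * un)
        x = [ax + bi * un for ax, bi in zip(mat_vec(A, x), B)]
    return y


def block_state_filter(A, B, C, D, u, L):
    """Block-state form: only the update x[n] -> x[n+L] is recursive."""
    if len(u) % L != 0:
        raise ValueError("len(u) must be a multiple of the block size L")
    n_states = len(B)
    # Precompute A^k B and C A^k for k = 0..L-1.
    AkB = [B[:]]
    for _ in range(L - 1):
        AkB.append(mat_vec(A, AkB[-1]))
    A_t = [list(col) for col in zip(*A)]  # transpose, so (C A^k)^T = (A^T)^k C^T
    CAk = [C[:]]
    for _ in range(L - 1):
        CAk.append(mat_vec(A_t, CAk[-1]))
    h = [dot(C, v) for v in AkB]  # h[k] = C A^k B
    x = [0.0] * n_states
    y = []
    for start in range(0, len(u), L):
        blk = u[start:start + L]
        # All L outputs depend only on x and the inputs of this block.
        for k in range(L):
            yk = dot(CAk[k], x) + D * blk[k]
            for j in range(k):
                yk += h[k - 1 - j] * blk[j]
            y.append(yk)
        # x[n+L] = A^L x[n] + sum over j of A^(L-1-j) B u[n+j]
        x_new = x
        for _ in range(L):
            x_new = mat_vec(A, x_new)
        for j in range(L):
            for i in range(n_states):
                x_new[i] += AkB[L - 1 - j][i] * blk[j]
        x = x_new
    return y


# Second-order example with arbitrarily chosen coefficients
A = [[0.5, -0.25], [1.0, 0.0]]
B = [1.0, 0.0]
C = [0.5, 0.25]
D = 1.0
u = [1.0, 2.0, -1.0, 0.5, 0.0, 3.0, 1.0, -2.0]
reference = state_space_filter(A, B, C, D, u)
blocked = block_state_filter(A, B, C, D, u, L=4)
```

Within one block, the L output computations are mutually independent, which is what makes them candidates for vector lanes. Note that the work for the output block grows quadratically with L in this plain formulation.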

Incremental Block-State • Linear dependency between block size and hardware • Problem: How to map it onto SIMD?

Incremental Block-State • Choose L = I P • Remove dependencies with pipelining • Assign each stage to a SIMD slice

Philips EVP16 • VLIW SIMD processor with vector length 16 • Strided access was simulated • We implemented a second-order filter • Speedup is measured relative to a VLIW DSP

Generalization

Improvement • No intra-add operation needed • Achieved by applying our method to a matrix-vector multiplication (MVM)

Conclusion • Recursive filtering on vector DSPs with linear speedup is possible, provided that the DSP supports strided memory access • This speedup is not bounded by the order of the filter • This speedup holds for filters of any order • The method can be applied to other cases as well

Future Work • Implementation on Vector DSPs without strided memory access • Adaptive Filters • Other signal processing algorithms

Questions?
