FIR Tap Filter Optimization
CE222 Final Project, Spring 2003
Soleste Hilberg, Nicole Starr

FIR Tap Filter: Finite Impulse Response Filter
• FIR filters are one of the two primary types of digital filters used in Digital Signal Processing (DSP) applications
• To implement the filter:
  1. Put the input sample into the delay line
  2. Multiply each sample in the delay line by its corresponding coefficient and accumulate the result
  3. Shift the delay line by one sample to make room for the next input sample

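The three steps above can be sketched in software as follows. This is a minimal illustration, not the project's Verilog: the function name `fir_filter` and the example coefficients are hypothetical, and the delay line is assumed to start at zero.

```python
from collections import deque


def fir_filter(samples, coeffs):
    """Direct-form FIR filter: one output per input sample."""
    # Delay line with the newest sample at index 0, assumed to start zeroed.
    delay_line = deque([0] * len(coeffs), maxlen=len(coeffs))
    outputs = []
    for x in samples:
        # Steps 1 and 3: pushing on the left stores the new sample and,
        # because maxlen is fixed, drops the oldest one (the shift).
        delay_line.appendleft(x)
        # Step 2: multiply each stored sample by its coefficient and accumulate.
        outputs.append(sum(c * s for c, s in zip(coeffs, delay_line)))
    return outputs


# Hypothetical coefficients chosen so each tap's contribution is easy to read.
print(fir_filter([1, 2, 3, 4], [1, 10, 100]))  # [1, 12, 123, 234]
```

Here the shift happens as part of inserting the sample, whereas the slide lists it as a separate final step; the outputs are the same either way.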
FIR Tap Filter
• An FIR filter produces an output, y_n, that is the weighted sum of the current and past inputs, x_n, x_{n-1}, ...
• A 3-tap filter uses 3 inputs: the current sample and the 2 previous samples

  y_n = b_0 x_n + b_1 x_{n-1} + b_2 x_{n-2} = \sum_{i=0}^{2} b_i x_{n-i}

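FIR Tap Filter
• Input: Sample [23:0] (24 bits)
• Output: Result [47:0] (48 bits)
• Clock frequency: 20 MHz
• Other signals: Input_valid, output_ready, rst, clk
[Block diagram: 24-bit Sample input, 48-bit Result output, with Input_valid, output_ready, rst, and clk]
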
FIR Tap Filter
Executes processes in the following order:
• (Load) Load input data into rin
• (Calc1) Multiply rin by coefficient c0, store in acc
• (Calc2) Multiply rs0 by coefficient c1, add to acc, store in acc
• (Calc3) Multiply rs1 by coefficient c2, add to acc, store in acc
• (Shift) Store acc in result, move rs0 to rs1, move rin to rs0

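A rough software model of this register sequence is sketched below. The register names rin, rs0, rs1, and acc come from the slide; everything else is an assumption made for illustration: the function name, one clock cycle per step, registers cleared at reset, and wrapping the accumulator to the 48-bit Result width.

```python
MASK48 = (1 << 48) - 1  # Result is 48 bits wide (Result[47:0]); wrap is assumed


def fir3_five_state(samples, c0, c1, c2):
    """Model the Load/Calc1/Calc2/Calc3/Shift sequence for each input."""
    rs0 = rs1 = 0  # delay-line registers, assumed cleared by rst
    results, cycles = [], 0
    for sample in samples:
        rin = sample                     # Load
        acc = (rin * c0) & MASK48        # Calc1
        acc = (acc + rs0 * c1) & MASK48  # Calc2
        acc = (acc + rs1 * c2) & MASK48  # Calc3
        results.append(acc)              # Shift: acc -> result,
        rs1, rs0 = rs0, rin              #        rs0 -> rs1, rin -> rs0
        cycles += 5                      # one clock per step (assumed)
    return results, cycles


# Small hypothetical coefficients, not the project's values.
print(fir3_five_state([1, 2, 3], 2, 3, 5))  # ([2, 7, 17], 15)
```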
Testbench Input / Output

INPUT    OUTPUT
000001   000000702a78
000002   0000016054f0
000003   000002a054af
000004   000003e0546e
000005   00000520542d
000006   0000066053ec
000007   000007a053ab
000008   000008e0536a
000009   00000a205329
00000a   00000b6052e8

INPUT    OUTPUT
000009   00000bbffdb7
000008   00000b1fa886
000007   000009dfa8c7
000006   0000089fa908
000005   0000075fa949
000004   0000061fa98a
000003   000004dfa9cb
000002   0000039faa0c
000001   0000025faa4d
000000   0000011faa8e

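The table can be cross-checked against the 3-tap sum. The slides do not state the coefficient values, so the three constants below are an assumption: they were inferred by solving the first three table rows, treating all values as unsigned and the two columns as one continuous 20-sample run starting from a cleared delay line. The sketch then checks whether the remaining rows follow from those values.

```python
# Coefficients inferred from the table itself (assumption, not given in the slides).
C0, C1, C2 = 0x702A78, 0x800000, 0x4FD547

# (input, output) pairs copied from the testbench table, in order.
TABLE = [
    (0x000001, 0x000000702A78), (0x000002, 0x0000016054F0),
    (0x000003, 0x000002A054AF), (0x000004, 0x000003E0546E),
    (0x000005, 0x00000520542D), (0x000006, 0x0000066053EC),
    (0x000007, 0x000007A053AB), (0x000008, 0x000008E0536A),
    (0x000009, 0x00000A205329), (0x00000A, 0x00000B6052E8),
    (0x000009, 0x00000BBFFDB7), (0x000008, 0x00000B1FA886),
    (0x000007, 0x000009DFA8C7), (0x000006, 0x0000089FA908),
    (0x000005, 0x0000075FA949), (0x000004, 0x0000061FA98A),
    (0x000003, 0x000004DFA9CB), (0x000002, 0x0000039FAA0C),
    (0x000001, 0x0000025FAA4D), (0x000000, 0x0000011FAA8E),
]


def fir3(samples):
    """3-tap FIR: y_n = C0*x_n + C1*x_(n-1) + C2*x_(n-2), zero initial state."""
    prev1 = prev2 = 0
    outputs = []
    for x in samples:
        outputs.append(C0 * x + C1 * prev1 + C2 * prev2)
        prev2, prev1 = prev1, x
    return outputs


computed = fir3([x for x, _ in TABLE])
mismatches = [i for i, (_, y) in enumerate(TABLE) if computed[i] != y]
print(f"{len(mismatches)} mismatching rows out of {len(TABLE)}")
```

If the inferred values are right, every row matches, which also supports reading the two column groups as a single sequence of 20 inputs.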
First Attempt: 5-cycle latency
3 calculation states
[State diagram: states wait, load, calc1, calc2, calc3, shift; transition labels Input_valid = 0, Input_valid = 1, output_ready = 0]

Results: 5-cycle latency
Verilog simulation results:
• Testbench simulation complete
• Required time to complete: 10250
• Number of inputs processed: 20
• Cycles to complete: 102.5
• Time-to-input ratio: 5.125

Second Attempt: 3-cycle latency
Condense 3 calculation states into 1 state
[State diagram: states wait, load, calc1, shift; transition labels Input_valid = 0, Input_valid = 1, output_ready = 0]

Results: 3-cycle latency
Verilog simulation results:
• Testbench simulation complete
• Required time to complete: 6450
• Number of inputs processed: 20
• Cycles to complete: 64.5
• Time-to-input ratio: 3.225

Third Attempt: 2-stage pipeline
• Stage 1: Shift registers, load new input value
• Stage 2: Calculate results
[Pipeline timing diagram: instructions 1 to 5 against time steps 1 to 6; each instruction passes through Stage 1 and then Stage 2, with its Stage 2 overlapping the next instruction's Stage 1]

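The benefit of overlapping the two stages can be seen in an idealized cycle count. The sketch below is an assumption-laden model, not the project's testbench: it ignores reset and handshake overhead, and both function names are hypothetical.

```python
def state_machine_cycles(num_inputs, cycles_per_input):
    """Non-overlapped design: each input occupies the datapath for all its states."""
    return num_inputs * cycles_per_input


def pipeline_cycles(num_inputs, num_stages):
    """Pipelined design: fill the pipe once, then finish one input per cycle."""
    if num_inputs == 0:
        return 0
    return num_stages + (num_inputs - 1)


n = 20  # number of inputs processed in the testbench runs
print(state_machine_cycles(n, 5))  # 100
print(state_machine_cycles(n, 3))  # 60
print(pipeline_cycles(n, 2))       # 21
```

The simulated figures in the slides (102.5, 64.5, and 21.5 cycles) sit slightly above these idealized counts, which is consistent with a small fixed overhead the model leaves out.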
Results: 2-stage pipeline
Verilog simulation results:
• Testbench simulation complete
• Required time to complete: 2150
• Number of inputs processed: 20
• Cycles to complete: 21.5
• Time-to-input ratio: 1.075

Speedup: Pipeline over State Machine
Measured as the reduction in cycles to complete the 20-input testbench:
• 3-state machine (3-cycle latency):
  • 37% faster than the 5-state machine
• 2-stage pipeline:
  • 67% faster than the 3-state machine
  • 79% faster than the 5-state machine

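These percentages can be reproduced from the "Cycles to complete" figures reported for each design. The sketch below assumes "faster" means the relative reduction in cycle count; the helper name `reduction_percent` is hypothetical.

```python
# Cycles to complete 20 inputs, as reported in the simulation results.
CYCLES = {
    "5-state machine": 102.5,
    "3-state machine": 64.5,
    "2-stage pipeline": 21.5,
}


def reduction_percent(old, new):
    """Relative reduction in cycle count, in percent."""
    return (old - new) / old * 100


pairs = [
    ("3-state machine", "5-state machine"),
    ("2-stage pipeline", "3-state machine"),
    ("2-stage pipeline", "5-state machine"),
]
for new, old in pairs:
    pct = reduction_percent(CYCLES[old], CYCLES[new])
    print(f"{new} vs {old}: {pct:.0f}% fewer cycles")
```

Expressed as a throughput ratio instead, the pipeline completes the run in 102.5 / 21.5, or roughly 4.8 times fewer cycles than the 5-state machine.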
Results Comparison: Area

                        5-state machine   2-stage pipeline
Combinational area      63983.300781      19166.716797
Noncombinational area   13199.155273      4071.513672
Total cell area         77182.453125      23238.230469

Results Comparison: Timing

                     5-state machine   2-stage pipeline
data required time   9.77              9.00
data arrival time    -9.77             -9.00
slack (MET)          0.00              0.84