Parallel accelerator project
Final presentation, Summer 2008
• Student: Vitaly Zakharenko
• Supervisor: Inna Rivkin
• Duration: semester
System functionality: Large picture
• Multiple signal sources share the same medium.
• Each source produces a periodic pulse sequence in the medium.
• An observer of the medium senses the superposed pulse sequences with added noise.
• A preprocessor detects pulses in the signal and stores each pulse as a pulse TOA (time of arrival).
• The pulse TOA array produced by the preprocessor is conveyed to the system.
• The system separates the pulses into the original signals (i.e. into periodic pulse sequences).
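The signal model described above can be sketched in a few lines. This is a minimal illustration, not the project's simulator: the function name `make_toa_array`, the integer time units, and the `missing` parameter (used to imitate dropped pulses) are all assumptions, and noise is left out.

```python
def make_toa_array(sources, duration, missing=frozenset()):
    """Superpose periodic pulse sequences into one sorted TOA array.

    sources  -- list of (period, phase) pairs in integer time units (assumed model)
    duration -- pulses with TOA >= duration are not generated
    missing  -- set of (source_index, pulse_index) pairs to drop,
                a hypothetical way to imitate missing pulses
    """
    toas = []
    for src_index, (period, phase) in enumerate(sources):
        pulse_index = 0
        t = phase
        while t < duration:
            if (src_index, pulse_index) not in missing:
                toas.append(t)
            pulse_index += 1
            t += period
    # The observer sees only arrival times, not which source emitted them.
    return sorted(toas)


# Two sources: period 10 starting at 0, and period 7 starting at 2.
observed = make_toa_array([(10, 0), (7, 2)], duration=30)
print(observed)
```

The system's task is the inverse of this function: given only the merged array, recover which pulses belong to which periodic sequence.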
[Figure: signal produced by source #1, signal produced by source #2, and the signal as seen by the observer, with the missing pulse effect marked. Observed pulses are labeled TOA1 to TOA11. System output: pulses separated by source.]
[Figure: data structure for signal representation, shown as an array of pulse TOAs (TOA1 to TOA9).]
System components
• Simulator: runs on a PC and constructs datagrams.
• Datagram switch: runs on the FPGA and manages the flow of datagrams between the simulator and the processing units.
• Data processing units: run on the FPGA; each unit processes datagrams.
Main system components
[Diagram: the Simulator (PC) connects to the Switch, which connects to six processing units. The Switch and the processing units are on the FPGA.]
Data processing units
• Each unit contains a Nios II processor and C2H-generated H/W accelerators.
[Diagram: Nios II embedded processor connected through the Avalon switch fabric to two C2H-generated accelerators: the histogram builder and the sequence search.]
Data processing algorithm
for {level} := 1 up to {maximum level} do
    1. Build histogram of differences (SDIF) of level := {level}.
    2. Add SDIF to cumulative histogram (CDIF).
    3. Find lowest periodicity column of CDIF above threshold.
    4. if {column found} = TRUE then
        4.1. Detect all pulse sequences of the periodicity.
        4.2. Mark pulses as associated.
       end if
    5. Check whether to break the loop.
end for
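The loop above can be sketched as runnable code. This is an illustrative reading of the pseudocode, not the project's C implementation. Several things are assumptions: the function names, integer TOAs with exact spacing (no jitter tolerance), the caller-supplied `threshold` function, the minimum sequence length `min_len`, and the break rule (stop when too few unassociated pulses remain).

```python
from collections import Counter


def build_sdif(toas, free, level):
    """Step 1: histogram of TOA differences between unassociated pulses
    that are `level` positions apart."""
    active = [t for t, is_free in zip(toas, free) if is_free]
    return Counter(active[i + level] - active[i]
                   for i in range(len(active) - level))


def find_sequences(toas, free, period, min_len):
    """Steps 4.1 and 4.2: chain free pulses spaced exactly `period` apart
    and mark them as associated. Assumes sorted, distinct integer TOAs."""
    index_of = {t: i for i, t in enumerate(toas)}
    found = []
    for i, t in enumerate(toas):
        if not free[i]:
            continue
        chain = [i]
        nxt = t + period
        while nxt in index_of and free[index_of[nxt]]:
            chain.append(index_of[nxt])
            nxt += period
        if len(chain) >= min_len:
            for j in chain:
                free[j] = False
            found.append([toas[j] for j in chain])
    return found


def deinterleave(toas, max_level, threshold, min_len=3):
    free = [True] * len(toas)
    cdif = Counter()
    sequences = []
    for level in range(1, max_level + 1):
        cdif.update(build_sdif(toas, free, level))           # steps 1 and 2
        candidates = sorted(p for p, n in cdif.items()
                            if n >= threshold(p))            # step 3
        if candidates:                                       # step 4
            period = candidates[0]
            for seq in find_sequences(toas, free, period, min_len):
                sequences.append((period, seq))
        if sum(free) < min_len:                              # step 5 (assumed rule)
            break
    return sequences, free


# Three sources with the same period (10) and different phases (0, 1, 3).
toas = [0, 1, 3, 10, 11, 13, 20, 21, 23, 30, 31, 33, 40, 41, 43]
sequences, free = deinterleave(toas, max_level=4, threshold=lambda period: 6)
for period, seq in sequences:
    print(period, seq)
```

With this input the level 1 and level 2 histograms stay below the constant threshold, and only the level 3 column at period 10 crosses it, which matches the shape of the worked example on the following slides. A constant threshold is used only to keep the sketch short; the slides do not specify the threshold function.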
Data processing example: input signals
[Figure: Source 1 signal, Source 2 signal, Source 3 signal, and the observed signal. The intervals between adjacent pulses of the observed signal are labeled a, b, c.]
Data processing example: SDIF (level = 1) and cumulative histogram (CDIF) update
[Figure: the level 1 differences of the observed signal form SDIF columns at a, b, and c, which are added to the CDIF.]
Data processing example: threshold crossing check
[Figure: CDIF columns a, b, c plotted against the threshold function.]
• No periodicity candidate
• No sequence search
Data processing example: SDIF (level = 2) and cumulative histogram (CDIF) update
[Figure: the level 2 differences of the observed signal are a+b, b+c, and c+a. The CDIF now holds columns a, b, c, a+b, b+c, c+a.]
Data processing example: threshold crossing check
[Figure: CDIF columns a, b, c, a+b, b+c, c+a plotted against the threshold function.]
• No periodicity candidate
• No sequence search
Data processing example: SDIF (level = 3) and cumulative histogram (CDIF) update
[Figure: the level 3 differences of the observed signal are all a+b+c. The CDIF now holds columns a, b, c, a+b, b+c, c+a, a+b+c.]
Data processing example: threshold crossing check
[Figure: CDIF columns a, b, c, a+b, b+c, c+a, a+b+c plotted against the threshold function.]
• Threshold satisfied by periodicity (a+b+c)
• Search for all sequences of periodicity (a+b+c)
Data processing example: sequence search results (final results)
[Figure: detected sequence #1, detected sequence #2, detected sequence #3.]
Input datagram format
ID | Control Bits | Len | TOA 1 | TOA 2 | ... | TOA N
[Figure annotation: 64 bits]
Output datagram format
Field name                                  Size (bytes)
Control fields set                          2
Length                                      2
ID                                          4
Total pulses associated                     2
Total sequences detected                    2
Association of pulse 1                      1
Association of pulse 2                      1
…                                           …
Association of pulse N                      1
Total pulses associated with sequence 1     4
PRI of sequence 1                           4
Jitter of sequence 1                        4
Confidence level 1 of sequence 1            4
Confidence level 3 of sequence 1            4
PRI of sequence 2                           4
…                                           …
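As an illustration of how a receiver might decode the layout in the table above, here is a minimal sketch. The field order and sizes come from the table. Everything else is an assumption: little-endian byte order, unsigned integer fields, no padding between fields, a per-sequence record that repeats the five fields listed for sequence 1, and the pulse count N being known to the caller. The names `HEADER`, `SEQ_RECORD`, and `parse_output_datagram` are hypothetical.

```python
import struct

# Control fields set (2), Length (2), ID (4),
# Total pulses associated (2), Total sequences detected (2).
HEADER = struct.Struct("<HHIHH")      # assumed little-endian, unsigned

# Assumed per-sequence record: five 4-byte fields, as listed for sequence 1.
SEQ_RECORD = struct.Struct("<5I")
SEQ_FIELDS = ("pulses", "pri", "jitter", "confidence_1", "confidence_3")


def parse_output_datagram(data, n_pulses):
    """Decode one output datagram; n_pulses (N) must be supplied by the caller."""
    control, length, dgram_id, n_assoc, n_seqs = HEADER.unpack_from(data, 0)
    offset = HEADER.size

    # One association byte per pulse.
    associations = list(data[offset:offset + n_pulses])
    offset += n_pulses

    sequences = []
    for _ in range(n_seqs):
        values = SEQ_RECORD.unpack_from(data, offset)
        sequences.append(dict(zip(SEQ_FIELDS, values)))
        offset += SEQ_RECORD.size

    return {
        "control": control,
        "length": length,
        "id": dgram_id,
        "pulses_associated": n_assoc,
        "associations": associations,
        "sequences": sequences,
    }


# Round trip on a made-up datagram: 4 pulses, 1 detected sequence.
raw = (HEADER.pack(1, 0, 42, 3, 1)
       + bytes([1, 1, 1, 0])
       + SEQ_RECORD.pack(3, 100, 2, 90, 80))
print(parse_output_datagram(raw, n_pulses=4))
```

The values in the round trip are invented for the example. How the Length field is counted, and whether jitter and confidence are stored as fixed-point numbers, is not stated in the slides, so the sketch returns them as raw integers.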
Implementation for Nios II: Testing and profiling
• In Visual Studio (VS), floating-point calculations were replaced by fixed-point calculations.
• The C code of the algorithm was ported from VS to the Nios II IDE.
• The algorithm was profiled on Nios II.
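The floating-point to fixed-point replacement mentioned above can be illustrated with a small sketch. The slides do not say which fixed-point format was used; the Q16.16 format (16 fractional bits) and the helper names below are assumptions chosen for the example.

```python
FRAC_BITS = 16            # assumed Q16.16 format
ONE = 1 << FRAC_BITS      # fixed-point representation of 1.0


def to_fixed(x):
    """Convert a float to fixed point (round to nearest)."""
    return int(round(x * ONE))


def to_float(q):
    """Convert a fixed-point value back to a float."""
    return q / ONE


def fx_mul(a, b):
    """Multiply two fixed-point values; the raw product has 2*FRAC_BITS
    fractional bits, so shift back down by FRAC_BITS."""
    return (a * b) >> FRAC_BITS


def fx_div(a, b):
    """Divide two fixed-point values; pre-shift the numerator so the
    quotient keeps FRAC_BITS fractional bits."""
    return (a << FRAC_BITS) // b


pri = to_fixed(1.5)
scale = to_fixed(2.0)
print(to_float(fx_mul(pri, scale)))
print(to_float(fx_div(pri, scale)))
```

Python integers do not overflow, so this sketch hides one practical concern: in C, the intermediate product in `fx_mul` and the shifted numerator in `fx_div` need a wider type (for example 64 bits for 32-bit operands).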
SoPC system generation
• The H/W design was generated in the Altera SoPC Builder environment.
SoPC system generation
• Different SoPC system configurations were compared.
• The SoPC system was optimized:
    • multiple clock domains were provided for
    • interconnect was minimized
    • different processor types were compared
C2H acceleration
• C2H H/W accelerators were generated for two blocks of the algorithm:
    • Sequence search function (FindSeqs)
    • Histogram builder function (BuildHist)
C2H accelerators: Performance optimization
• Sequence search (FindSeqs) function acceleration
    • Accelerator results were unsatisfactory
    • Consumes a large amount of FPGA logic
    • Low acceleration gain (4x at most)
    • Discarded after considerable optimization effort
C2H accelerators: Performance optimization
• Histogram builder (BuildHist) function acceleration
    • Good acceleration results
    • 50x acceleration gain
    • Moderate FPGA logic consumption
Design performance: FPGA resources
• 6% logic consumption
• 5% memory consumption
Design performance: Timing
• 1 to 7 ms processing time
• Three Nios II systems significantly outperform a Pentium 4 processor