Hongtao Du

Hongtao Du, AICIP Research, ECE Department, University of Tennessee, Feb 23, 2005

Background
• Blind Source Separation (BSS)
  • Motivation: “cocktail party problem”
• BSS Model: (Mixing), (Unmixing)
  • the observed signal (pixel)
  • the source signal (pure pixel or noise)
  • weight matrix or unmixing matrix
• BSS Algorithms
  • ICA
  • LCNN
• Pixel level processing
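A minimal sketch of the BSS model, assuming the standard linear formulation in which the observed signal is x = A s (mixing) and the recovered signal is y = W x (unmixing). The names A, W, matvec and inverse_2x2 and all numeric values are illustrative assumptions, not taken from the slides. In real BSS the mixing matrix is unknown and W is estimated by an algorithm such as ICA or LCNN; the inversion below only shows what a perfect unmixing matrix would achieve.

```python
# Assumed linear BSS model: x = A s (mixing), y = W x (unmixing).

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

def inverse_2x2(matrix):
    """Invert a 2x2 matrix; raises ValueError if it is singular."""
    (a, b), (c, d) = matrix
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[0.8, 0.2], [0.3, 0.7]]  # hypothetical mixing matrix
s = [100.0, 20.0]             # hypothetical source signals (pure pixel, noise)

x = matvec(A, s)              # observed signal (pixel)
W = inverse_2x2(A)            # ideal unmixing matrix, known here only because A is known
y = matvec(W, x)              # recovered sources, equal to s up to rounding
```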

Synthesis Structures
• Serial Processing
  • Processing pixel-by-pixel in a serial sequence
• Parallel Processing
  • Using SIMD structure
  • Multiple pixels in, multiple pixels out
  • Depending on hardware constraints
• Segment Processing
  • Pipeline structure
  • Parallel processing
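The three structures can be sketched in software to show how they differ in data flow while producing the same result. This is a behavioral model only, not the RTL from the slides: the per-pixel operation (8-bit inversion), the lane count, and the two-stage split are all assumptions chosen for illustration.

```python
def process_pixel(p):
    """Placeholder per-pixel operation (8-bit inversion), for illustration only."""
    return 255 - p

def serial(pixels):
    # One pixel in, one pixel out, in sequence.
    return [process_pixel(p) for p in pixels]

def parallel(pixels, lanes=4):
    # SIMD-style: each step takes up to `lanes` pixels and emits as many.
    out = []
    for i in range(0, len(pixels), lanes):
        group = pixels[i:i + lanes]
        out.extend(process_pixel(p) for p in group)
    return out

def pipeline(pixels, stages):
    # regs[k] models the register at the output of stage k.
    regs = [None] * len(stages)
    out = []
    # Extra empty ticks at the end flush the pixels still in flight.
    feed = list(pixels) + [None] * len(stages)
    for p in feed:
        # On each tick the last register's value leaves the pipeline...
        if regs[-1] is not None:
            out.append(regs[-1])
        # ...and every other value advances by one stage, last stage first.
        for k in range(len(stages) - 1, 0, -1):
            regs[k] = stages[k](regs[k - 1]) if regs[k - 1] is not None else None
        regs[0] = stages[0](p) if p is not None else None
    return out

pixels = [0, 10, 128, 200, 255]
# Inversion split into two hypothetical stages: negate, then add 255.
two_stages = [lambda p: -p, lambda p: p + 255]

serial_out = serial(pixels)
parallel_out = parallel(pixels, lanes=4)
pipeline_out = pipeline(pixels, two_stages)
```

All three functions return the same pixel values; what differs in hardware is how many pixels are handled per clock and how much logic sits between registers.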

Contrast Stretching
• s, r: grey level of the input pixel and the output pixel
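A minimal sketch of contrast stretching, assuming a common linear stretch with clipping between two thresholds. The transfer function used in the actual design is not given in the text, so the function name, the thresholds `low` and `high`, and the 8-bit range are hypothetical. Integer arithmetic is used because it maps naturally onto fixed-width hardware.

```python
def contrast_stretch(pixel_in, low=64, high=192, max_level=255):
    """Map grey levels in [low, high] linearly onto [0, max_level].

    Levels at or below `low` clip to 0; levels at or above `high`
    clip to `max_level`. The thresholds are illustrative assumptions.
    """
    if high <= low:
        raise ValueError("high must be greater than low")
    if pixel_in <= low:
        return 0
    if pixel_in >= high:
        return max_level
    # Integer (floor) division keeps the result an integer grey level.
    return (pixel_in - low) * max_level // (high - low)

stretched = [contrast_stretch(p) for p in (0, 64, 128, 192, 255)]
```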

Component Contrast

Component Contrast - RTL

Component Contrast - Schematic

Top-level - Schematic

Pre-layout Simulation

Pre-layout Simulation – Small Signal

Pre-layout Simulation – Reset

Pre-layout Simulation – Write Enable

Contrast Stretching (32-bit) – FPGA layout

Contrast Stretching (8-bit) – FPGA layout

Comparison 32-bit vs. 8-bit
• Device utilization summary:
  • 32-bit
    • Number of External IOBs: 132 out of 158 (83%)
    • Number of Occupied SLICEs: 605 out of 12288 (4%)
  • 8-bit
    • Number of External IOBs: 36 out of 158 (22%)
    • Number of Occupied SLICEs: 53 out of 12288 (1%)
• Clock Report (constraint: TS_Clk = PERIOD TIMEGRP "Clk" 100 nS HIGH 50 nS)
  • 32-bit: requested 100.000 ns, actual 14.464 ns, frequency 69.14 MHz, estimated delay 16.63 ns
  • 8-bit: requested 100.000 ns, actual 7.450 ns, frequency 134.23 MHz, estimated delay 11.33 ns

Parallel Contrast - Schematic

Parallel Contrast Stretching – FPGA layout

• Device utilization summary:
  • 32-bit
    • Number of External IOBs: 580 out of 158 (367%)
    • Number of Occupied SLICEs: 4838 out of 12288 (39%)
    • Too many required IOBs, exceeding the target FPGA capacity
  • 8-bit
    • Number of External IOBs: 148 out of 158 (93%)
    • Number of Occupied SLICEs: 422 out of 12288 (3%)
• Clock Report (constraint: TS_Clk = PERIOD TIMEGRP "Clk" 100 nS HIGH 50 nS)
  • 32-bit: requested 100.000 ns, actual /, frequency /, estimated delay 16.63 ns
  • 8-bit: requested 100.000 ns, actual 6.748 ns, frequency 148.19 MHz, estimated delay 11.63 ns
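The IOB figures explain why the 32-bit parallel design cannot be placed. The sketch below redoes that arithmetic using the IOB counts reported for the serial and parallel designs and the 158 IOBs available on the target device. The function names are hypothetical, and the use of integer truncation for the percentage is an assumption that happens to reproduce the reported IOB percentages.

```python
AVAILABLE_IOBS = 158  # external IOBs available on the target device, per the reports

# Required external IOBs, keyed by (structure, data width in bits).
required_iobs = {
    ("serial", 32): 132,
    ("serial", 8): 36,
    ("parallel", 32): 580,
    ("parallel", 8): 148,
}

def utilization_percent(required, available=AVAILABLE_IOBS):
    """IOB utilization as a truncated integer percentage."""
    return 100 * required // available

def fits(required, available=AVAILABLE_IOBS):
    """True when the design needs no more IOBs than the device offers."""
    return required <= available

for (structure, width), needed in required_iobs.items():
    status = "fits" if fits(needed) else "exceeds device capacity"
    print(f"{structure} {width}-bit: {needed} IOBs, "
          f"{utilization_percent(needed)}%, {status}")
```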

Pipeline Contrast - Schematic

Top-level - Schematic

Pre-layout Simulation - threshold

Pipeline Contrast Stretching – FPGA layout

Synthesis Performance (8-bit)
Device: Xilinx V1000EHQ-6
• Device utilization summary:
  • Number of External IOBs: 156 out of 158 (98%)
  • Number of Occupied SLICEs: 586 out of 12288 (4%)
  • Total equivalent gate count for design: 13,474
• Clock Report (constraint: TS_Clk = PERIOD TIMEGRP "Clk" 100 nS HIGH 50 nS)
  • requested 100.000 ns, actual 20.944 ns, frequency 47.75 MHz, estimated delay 11.63 ns

Serial vs. Parallel
• Clock report by structure (8-bit):
  • Serial: requested 100.000 ns, actual 7.450 ns, frequency 134.23 MHz, estimated delay 11.33 ns
  • Parallel: requested 100.000 ns, actual 6.748 ns, frequency 148.19 MHz, estimated delay 11.63 ns
  • Pipeline: requested 100.000 ns, actual 20.944 ns, frequency 47.75 MHz, estimated delay 11.63 ns
• Serial processing should have the minimum delay, but in practice it does not.
• Parallel processing is the fastest structure.
• Pipeline is the most efficient structure, but it is very slow.
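The reported frequencies follow directly from the actual clock periods, since the maximum frequency is the reciprocal of the period. A short check, using the 8-bit periods from the clock reports (the dictionary and function names are illustrative):

```python
# Actual clock periods in nanoseconds for the 8-bit designs, from the clock reports.
actual_period_ns = {
    "serial": 7.450,
    "parallel": 6.748,
    "pipeline": 20.944,
}

def max_frequency_mhz(period_ns):
    """Convert a clock period in ns to a frequency in MHz (f = 1000 / T)."""
    return 1000.0 / period_ns

frequencies = {name: round(max_frequency_mhz(t), 2)
               for name, t in actual_period_ns.items()}

# The structure with the shortest period runs at the highest clock rate.
fastest = min(actual_period_ns, key=actual_period_ns.get)
```

Clock rate alone does not give pixel throughput: a parallel structure also handles several pixels per clock, so its advantage over the serial design is larger than the ratio of the two frequencies suggests.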

Serial Parallel Pipeline
