Neuromorphic Image Processing. Ralph Etienne-Cummings, The Johns Hopkins University. Collaborators: Kwabena Boahen, Gert Cauwenberghs, Timothy Horiuchi, M. Anthony Lewis, Philippe Pouliquen. Students: Eugenio Culurciello, Viktor Gruev, Udayan Mallik. Sponsors: NSF, ONR, ARL.
An Alternative Style of Neuromorphic Image Processing • Traditional image processing uses pixel-serial image access, digitization and sequential processing • Discrete levels, discrete time • High fidelity images, large vocabulary of functions (GP) • High power, high latency, small sensor/processing area ratio • Traditional neuromorphic vision systems typically use pixel-parallel processing • Continuous and/or discrete levels, continuous time • Low fidelity images, large pixels, small vocabulary of functions (ASICs) • Low power, low latency • Computation-On-Readout (COR) vision systems use block-serial-pixel-parallel image processing • Continuous levels, discrete time • High fidelity images, medium vocabulary of functions (pseudo-GP) • Low power, medium/low latency, computation for “free”
Adaptive SpatioTEmpoRal Imaging (ASTERIx) Architecture • Digitally controlled analog processing • Image acts as memory • Parallel execution of multiple filters • Temporal evolution of results • Standard Fetch-Decode-Compute-Store (RISC) architecture possible • Competition/recurrence possible
Foveated Tracking Chip • Spatially variant layout of sensors and processing elements • Dynamically controllable spatial acuity • Velocity measurement capabilities • Combined high-resolution imaging and focal-plane processing
VLSI Implementation of Robotic Vision System: Single Chip Micro-Stereo System Single Chip Stereo Optics VLSI Algorithm Chip Layout • A single chip stereo vision system has been implemented • Contains two 128 x 128 imagers • Computes full frame disparity in parallel • Provides a confidence measure on computation • Uses a vertical template to reduce noise and computation • Operates at 20 fps • Uses ~30mW @ 5V (can be reduced) Measured data: Line is disparate on the imagers Matlab Simulation of VLSI Algorithm
VLSI Implementation of Robotic Vision System: Spatiotemporal Focal Plane Image Processing Parallel Processed Images Biological Inspiration of the GIP Chip: Orientation Detection Spatiotemporal receptive fields • Implemented CMOS Imagers with focal plane spatiotemporal filters • Realized high resolution imaging and high speed processing • Consumes milliwatts of power • Performs image processing at GOPS/mW (unmatched by any other technology) • Used for optical flow measurement, object recognition and adaptive optics. Spatially Processed: Orientation Selectivity Temporally Processed: Motion Detection
Color-Based Object Recognition on a Chip Skin-tone Identification “Learned” templates Fruit Identification • Implemented chip that contains a camera and a recognition engine • Decomposes the image into Hue, Saturation and Intensity (HSI) • Creates an HSI template for each learned object • Identifies parts of the scene that match a template • Used by interactive toys, aids to the blind and robots Neurons Synapses Smart Camera Chip Coke or Pepsi?
Technology | 0.5µm 3M CMOS
Array Size | APS: 120 (H) x 36 (V)
Pixel Size | APS: 14.7µm x 14.7µm
Fill Factor | APS: 16%
Power Consumption (with 3.3V supply) | 3.2mW
FPN (APS) – Dark (Std. Dev./Full Scale) | Pixel-Pixel (within column): 0.6%; Column-Column: 0.7%
FPN (APS) – Half Scale (Std. Dev./Full Scale) | Pixel-Pixel (within column): 0.7%; Column-Column: 1.2%
VLSI Implementation of Robotic Vision System: Visual Tracking Target Tracking Sample Image Low Noise Imaging and Motion Tracking Chip • Implemented CMOS Imager with active pixel sensor and motion tracking • Obtains low noise image • Tracks multiple targets simultaneously • Consumes milliwatts of power • Used for optical flow measurement, target tracking, 3D mouse and robot assisted surgical systems.
VLSI Implementation of Robotic Vision System: Ultrasonic Imaging and Tracking Bearing Estimation with Spatiotemporal Filters Ultrasonic Array Processing Bearing Estimation Algorithm Bearing/Range Mapping and Novelty Detection • Implemented ultrasonic bearing estimation chip and change detection chip • Uses sonic flow across microphone array to measure bearing of target • Creates internal map of environment • Detects changes in the structure of the environment • Operates on milliwatts of power • Used for surveillance and navigation MEMS Front-End Bearing Estimation Chip Bearing Change Detection Range Change Detection
Descending signals VLSI Implementation of Central Pattern Generators (CPG) for Legged Locomotion Biologically Inspired Locomotion controller Silicon Integrate-and-fire Neuron New Biped: Snappy • Implemented a general purpose CPG chip • Contains 10 Neurons • Allows 10 fully connected neurons • Allows 10 inputs from off-chip • Allows Spike and Graded neuron inputs • Allows digitally programmable synapses • Operates on microwatts of power • Used to control legged locomotion Adaptive Locomotion Controller Neurons Synapses 10 Neuron CPG Chip
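The integrate-and-fire behavior named on this slide can be illustrated with a minimal software sketch. It models a single neuron only, not the 10-neuron CPG network or its synapses; the function name, threshold, leak term and time step are assumptions chosen for illustration and do not describe the chip's circuit.

```python
def integrate_and_fire(input_current, threshold=1.0, leak=0.0, steps=1000, dt=1e-3):
    """Simulate one (optionally leaky) integrate-and-fire neuron.

    Returns the list of step indices at which the neuron spiked.
    All parameters are illustrative assumptions, not chip values.
    """
    v = 0.0          # membrane state
    spikes = []
    for step in range(steps):
        # Integrate the input, minus an optional leak proportional to v
        v += dt * (input_current - leak * v)
        if v >= threshold:
            spikes.append(step)
            v = 0.0  # reset after each spike
    return spikes


slow = integrate_and_fire(10.0)
fast = integrate_and_fire(20.0)
```

A stronger input drives the membrane to threshold sooner, so `fast` contains more spikes than `slow`; with zero input the neuron never fires.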
Outline • Photo-transduction: • Active Pixel Sensors • Dynamic Range Enhancement • Current Mode • Spatial Processing: • Image Filtering • Spatiotemporal Processing: • Change Detection • Motion Detection • Spectral Processing: • Color-Based Object Recognition
Conventional CMOS Cameras: Integrative Photo-detection Simple 3-T APS: Fossum, 1992 Integrative Imagers: Voltage domain; Dense arrays (1.25-T); Low Noise; Low dynamic range (~45 – 60dB), Not ideal for computation
Conventional CMOS Cameras: Integrative Photo-detection • 150 million sold in 2004, 55% annual growth rate to 700 million by 2008 • Power consumption is relatively low ( ~ 10’s of mW for VGA) • 2 Mega Pixels is probably the limit of usefulness • Download bandwidth is a problem (service providers would like more people to download their pictures) • There is a fear that it will represent the next technology bubble …. So much hype, legal problems … • Small (~ 100 x 100 pixels) imagers, with smarts (e.g. motion, color processing) have market in toys, sensor networks, computer mouse … Camera phones are driving the CMOS camera market
Spike-Based CMOS Cameras: Octopus Imaging Concept • Other approaches: • W. Yang, “Oscillator in a Pixel,” 1994 • J. Harris, “Time to first Spike,” 2002 Sample Image Culurciello, Etienne-Cummings & Boahen, 2003
Front-End of Vision Chips: Photoreception Adaptation After Normann & Werblin, 1974 • Time adaptive (band-pass) • Voltage domain • Large dynamic range (9 orders) • Can be large pixels (Caps) • Can have mismatch? Adaptive Phototransduction (Delbruck, 1994)
Front-End of Vision Chips: Photoreception • Current Domain Imaging • (Mead et al, 1988) • Wide dynamic range (9 orders) • Simple to implement (2 Trans.) • Ideal for computation (KCL) • Poor matching (10 – 15%) • Slow turn-off • Transfer function is non-linear • Photo sensitive elements: • Phototransistors: ~100pA/um2 • Photodiodes: ~1pA/um2
How Can We Improve Current Mode Imagers - Linear Current Mode APS • Photodiode discharges linearly with light intensity • Amplified linear current output from the APS - Incorporate noise correction techniques at the focal plane • Current mode Correlated Double Sampling (CDS) • Improves the noise characteristics of the image • Easy integration with processing units – convolution, ADC, others.
Complete Imaging System Pixel Vt variations are eliminated from the final current output!
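Why correlated double sampling removes per-pixel threshold variation can be sketched numerically. The model below assumes each pixel's output current is a linear function of (V - Vt), in the spirit of the linear current-mode APS described above; the function names, gain and voltage values are hypothetical and not taken from the chip.

```python
def pixel_current(voltage, vt, gain=1.0):
    """Assumed linear pixel model: output current proportional to (V - Vt)."""
    return gain * (voltage - vt)


def cds_readout(v_reset, v_drop, vts, gain=1.0):
    """Current-mode CDS sketch: subtract the signal sample from the reset sample.

    v_drop is the light-induced photodiode discharge (same for every pixel here,
    i.e. a uniformly lit scene); vts holds each pixel's threshold voltage.
    """
    corrected = []
    for vt in vts:
        i_reset = pixel_current(v_reset, vt, gain)
        i_signal = pixel_current(v_reset - v_drop, vt, gain)
        # The Vt term appears in both samples and cancels in the difference
        corrected.append(i_reset - i_signal)
    return corrected


# Hypothetical threshold mismatch across four pixels
vts = [0.68, 0.70, 0.73, 0.69]
raw = [pixel_current(3.0 - 0.5, vt) for vt in vts]   # single sample: shows FPN
corrected = cds_readout(3.0, 0.5, vts)               # CDS: FPN cancels
```

Under this linear assumption the spread of `raw` tracks the spread of `vts`, while every entry of `corrected` equals `gain * v_drop` up to floating-point rounding.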
Measured FPN figure • Image quality has been improved • Non-linearity due to mobility degradation degrades performance under bright light
Spatial Processing: Image Filtering
Architectural Concept: Visual Receptive Fields High resolution Imaging array Programmable Scanning Registers Parallel Processed Images Spatiotemporal receptive fields Etienne-Cummings, 2001
Results – Spatial Image Processing Enhanced Imaging • 1. Vertical Edge Detection (3x3) • 2. Horizontal Edge Detection (3x3) • 3. Laplacian Filter (3x3) • 4. Intensity Image • 6. Vertical Edge Detection (5x5) • 7. Horizontal Edge Detection (5x5) • 8. Laplacian Filter (5x5) • 9. Gaussian Filter (5x5) 1. Intensity Image 2. Horizontal Edges 3. Enhanced Image = Intensity + Horizontal Edge Image
Results – Spatial Image Processing 3 x 3 Kernels 5 x 5 Kernels
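The 3 x 3 filters listed in these results can be mimicked in software. The sketch below is a plain-Python stand-in for what the chip computes in analog during readout; the kernel coefficients are common textbook choices, assumed here for illustration, not the values programmed on the chip.

```python
# Textbook 3x3 kernels (assumed; the chip's programmed coefficients are not given)
VERTICAL_EDGE = [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]]
HORIZONTAL_EDGE = [[-1, -1, -1], [0, 0, 0], [1, 1, 1]]
LAPLACIAN = [[0, -1, 0], [-1, 4, -1], [0, -1, 0]]


def apply_kernel(image, kernel):
    """Correlate a 2D image with a square kernel over the valid region (no padding)."""
    k = len(kernel)
    rows, cols = len(image), len(image[0])
    out = []
    for r in range(rows - k + 1):
        out_row = []
        for c in range(cols - k + 1):
            acc = 0.0
            for i in range(k):
                for j in range(k):
                    acc += kernel[i][j] * image[r + i][c + j]
            out_row.append(acc)
        out.append(out_row)
    return out


def enhance(image, kernel=HORIZONTAL_EDGE):
    """Enhanced image = intensity + edge image, cropped to the valid region."""
    off = len(kernel) // 2
    edges = apply_kernel(image, kernel)
    return [[image[r + off][c + off] + edges[r][c]
             for c in range(len(edges[0]))]
            for r in range(len(edges))]


# A 4x4 test image with a vertical step edge between columns 1 and 2
step = [[0, 0, 1, 1] for _ in range(4)]
vertical = apply_kernel(step, VERTICAL_EDGE)
horizontal = apply_kernel(step, HORIZONTAL_EDGE)
```

On the step image the vertical-edge kernel responds at every valid position while the horizontal-edge kernel returns zero, which is the orientation selectivity the slides refer to.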
Summary
Parameter | GIP version 1 | GIP version 2
Technology | 1.2 µm Nwell CMOS | 1.5 µm Nwell CMOS
No. Transistors | 6K | 13K
Array Size | 16 x 16 | 42 x 35
Pixel Size | 30 µm x 30 µm | 20 µm x 20 µm
FPN (STD/Mean) | 2.5% (Average) | 2.1% (Average)
Fill Factor | 20% | 35%
Dynamic Range | 1 – 6000 Lux | 1 – 6000 Lux
Frame Rate | DC – 400KHz | DC – 400KHz
Kernel Sizes | 2x2 - whole array | 2x2 - whole array
Kernel Coefficients | +/- 3.75 by 0.25 | +/- 3.75 by 0.25
Coeff. Precision | Intra-processor: <0.5%, Inter-processor: <2.5%
Temporal Delay | 1% decay in 150ms @ 800Lux | NA
Power | 5 x 5: 1mW @ 20 kfps | 5 x 5: ~1mW @ 20 kfps
Computation Rate (Add and Multiply) | 5 x 5: 1 GOPS/mW @ 20 kfps | 5 x 5: 1 GOPS/mW @ 20 kfps
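A rough back-of-the-envelope check of the quoted computation rate, using the GIP version 2 figures from the summary (42 x 35 array, 5 x 5 kernel, 20 kfps, ~1mW). The counting convention (one multiply and one add per kernel tap, one kernel evaluated at every pixel per frame) is an assumption made here; the slides do not state how operations were counted.

```python
def ops_per_second(rows, cols, kernel_size, frames_per_second):
    """Estimate add+multiply operations per second for one kernel over the array.

    Assumes one multiply and one add per kernel tap at every pixel (hypothetical
    counting convention, not stated in the slides).
    """
    taps = kernel_size * kernel_size
    ops_per_pixel = 2 * taps
    return rows * cols * ops_per_pixel * frames_per_second


rate = ops_per_second(42, 35, 5, 20_000)   # operations per second
power_mw = 1.0                             # ~1mW, from the summary
gops_per_mw = rate / 1e9 / power_mw
```

Under these assumptions the estimate comes to about 1.47 GOPS at roughly 1mW, the same order as the quoted 1 GOPS/mW.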
Motivation • High speed, high resolution, high accuracy, pitch-matched Temporal Difference Imager (TDI) • Flexible control of exposure, inter-frame delay and read-out synchronization • Low fixed pattern noise on current and previous image • Pipelined readout mechanism for improved read-out rate and temporal difference accuracy
On-Set and Off-Set Imaging Narrow Rejection Band Wide Rejection Band
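On-set and off-set imaging from a temporal difference can be sketched as follows. The reading of "rejection band" as a dead zone around zero difference, inside which changes are suppressed, is an assumption made for this illustration; the function name and pixel values are hypothetical.

```python
def temporal_difference(current, previous, rejection_band):
    """Split the frame difference into on-set and off-set images.

    Differences whose magnitude is within the rejection band are set to zero
    (assumed interpretation of the band shown in the slides).
    """
    on_image, off_image = [], []
    for cur_row, prev_row in zip(current, previous):
        on_row, off_row = [], []
        for cur, prev in zip(cur_row, prev_row):
            diff = cur - prev
            on_row.append(diff if diff > rejection_band else 0)     # brightening
            off_row.append(-diff if diff < -rejection_band else 0)  # darkening
        on_image.append(on_row)
        off_image.append(off_row)
    return on_image, off_image


previous = [[10, 10, 10]]
current = [[12, 10, 3]]
narrow_on, narrow_off = temporal_difference(current, previous, rejection_band=1)
wide_on, wide_off = temporal_difference(current, previous, rejection_band=5)
```

With the narrow band both the small brightening and the large darkening are reported; with the wide band only the large darkening survives.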
Spectral Processing: Color Object Recognition
RGB to HSI: Why? Etienne-Cummings et al., 2002
Examples: Chroma-Based Object Identification Skin Identification Fruit Identification “Learned” templates
Chip Block Diagram -Block addressable color imager -White correction and R,G,B scaling -R,G,B normalization -R,G,B to HSI conversion -HSI histogramming for an image block -Stored “learned” HSI templates -SAD template matching
HSI Histogramming -Filters Saturation and Intensity Values -Non-linear RGB->Hue transformation using analog-to-digital look-up -Hue histogram constructed by counting number of pixels in a block mapping to each Hue bin -36 x 12b Template per block -Programmable bin assignment in next version
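The histogramming and template-matching steps above can be sketched in software. The 36 hue bins and 12-bit counts follow the slides; the threshold values, function names and template labels are assumptions, and the hue is computed with Python's `colorsys` (hexagonal hue model) as a stand-in for the chip's analog-to-digital look-up.

```python
import colorsys

NUM_BINS = 36            # 36 hue bins (from the slides)
MAX_COUNT = 2**12 - 1    # 12-bit counter per bin (from the slides)


def hue_histogram(pixels, sat_threshold=0.2, int_threshold=0.1):
    """Build a hue histogram for a block of (r, g, b) pixels in [0, 1].

    Pixels whose saturation or intensity fall below the (assumed) thresholds
    are filtered out before counting.
    """
    hist = [0] * NUM_BINS
    for r, g, b in pixels:
        intensity = (r + g + b) / 3.0
        if intensity <= int_threshold:
            continue
        saturation = 1.0 - min(r, g, b) / intensity
        if saturation <= sat_threshold:
            continue
        hue, _, _ = colorsys.rgb_to_hsv(r, g, b)          # hue in [0, 1)
        bin_index = min(int(hue * NUM_BINS), NUM_BINS - 1)
        hist[bin_index] = min(hist[bin_index] + 1, MAX_COUNT)  # saturating count
    return hist


def sad(a, b):
    """Sum of absolute differences between two histograms."""
    return sum(abs(x - y) for x, y in zip(a, b))


def best_template(hist, templates):
    """Return the name of the stored template with the smallest SAD."""
    return min(templates, key=lambda name: sad(hist, templates[name]))


# Hypothetical block: three red pixels, one gray pixel, one very dark pixel
block = [(1.0, 0.0, 0.0)] * 3 + [(0.5, 0.5, 0.5), (0.05, 0.0, 0.0)]
hist = hue_histogram(block)
templates = {
    "red": hue_histogram([(1.0, 0.1, 0.0)] * 3),
    "green": hue_histogram([(0.1, 1.0, 0.0)] * 3),
}
match = best_template(hist, templates)
```

The gray pixel is rejected by the saturation threshold and the dark pixel by the intensity threshold, so only the three red pixels are counted and the block matches the "red" template.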
Summary
Technology | 0.5µm 3M CMOS
Array Size (R,G,B) | 128 (H) x 64 (V)
Chip Area | 4.25mm x 4.25mm
Pixel Size | 24.85µm x 24.85µm
Fill Factor | 20%
FPN | ~5%
Dynamic Range | >120 dB (current mode)
Region-Of-Interest Size | 1 x 1 to 128 x 64
Color Current Scaling | 4bits
Hue Bins | 36, each 10 degree wide
Saturation | Analog (~5bits) one threshold
Intensity | Analog (~5bits) one threshold
Histogram Bin Counts | 12bits/bin
Template Size | 432bits (12 x 36bits)
No. Stored Template | 32 (13.8Kbits SRAM)
Template Matching (SAD) | 4 Parallel SAD, 18bits results
Frame Rate | Array Scan: ~2K fps; HSI Comp: ~30 fps
Power Consumption | ~1mW @ 30 fps on 3.3V Supplies
Some Conclusions • Block-Serial-Pixel-Parallel Focal-Plane Computation-on-Readout (COR) is another style of neuromorphic image processing • Computation for “free”, high fidelity images, compact, low-power, high-speed, reconfigurable, multiple parallel kernels, can be iterated • Although COR can be used for both voltage- and current-mode imagers, current-mode image processing is better suited to focal-plane implementation • Linearize the photo-current, perform CDS to remove FPN • Many different algorithms can be implemented with COR that are compatible with standard machine vision