Image Recognition using Hierarchical Temporal Memory Radoslav Škoviera Ústav merania SAV Fakulta matematiky, fyziky a informatiky UK
Image Recognition • Applications: Digital image databases, surveillance, industry, medicine • Tasks: Object recognition, automatic annotation, content-based image search • Input: Digital image • Single object • Scene (multiple objects – clutter, occlusion, merging) • Output: Description of the input image • Keywords, scene semantics, similar images • Subtasks: Image segmentation, feature extraction, classification
Motivation • Image recognition • Very easy for humans (and other animals) • Computers cannot yet do it quickly or accurately enough • Good motivation for researchers in the field of AI – bio-inspired models
Hierarchical Temporal Memory (HTM) • Developed by Jeff Hawkins and Dileep George (Numenta) • Hierarchical tree-shaped network • Bio-inspired – based on a large-scale model of the neocortex • Consists of basic operational units – nodes • Each node uses the same two-stage learning algorithm: 1) Spatial Learning (Pooling) 2) Temporal Learning (Pooling) • Learning is performed layer by layer • Nodes have receptive fields – each node (except for the top node) can look only at a portion of the input image
Spatial Learning • Observe common patterns in the input space (training images) • Group them into clusters of spatially similar patterns • Use only one representative of each cluster • Generate a "codebook" • Reduces the input space and spatial noise
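The spatial learning steps above can be sketched as a greedy clustering pass. This is a minimal illustration, not the HTM implementation used in the talk: the function name `learn_codebook`, the Euclidean distance, and the threshold `max_dist` are all assumptions made for the example.

```python
import numpy as np

def learn_codebook(patterns, max_dist):
    """Greedy spatial pooling: keep one representative per cluster.

    patterns: array of shape (n_patterns, n_features)
    max_dist: assumed threshold; a pattern closer than this to an
              existing representative counts as the same cluster.
    """
    codebook = []
    for p in np.asarray(patterns, dtype=float):
        if codebook:
            dists = np.linalg.norm(np.array(codebook) - p, axis=1)
            if dists.min() <= max_dist:
                continue  # already represented, so it adds nothing new
        codebook.append(p)  # first member of a new cluster
    return np.array(codebook)

# Six noisy 2-D patterns that form two tight clusters.
patterns = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                     [5.0, 5.0], [5.1, 5.0], [5.0, 4.9]])
codebook = learn_codebook(patterns, max_dist=0.5)
print(codebook)
```

Six input patterns collapse to two codebook entries, which is the input space and noise reduction named on the slide.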
Temporal Learning • Uses time sequences to learn correlations of spatial patterns
Temporal Learning • In each training step, the time adjacency matrix (TAM) is increased at the locations corresponding to the co-occurring codebook patterns, according to the update function defined as follows:
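The update function itself is not preserved in this transcript. As an illustration only, the sketch below assumes the simplest possible rule: a unit increment of the time adjacency matrix (TAM) entry for each pair of codebook patterns seen in consecutive training steps. The names `update_tam` and `learn_tam` and the `increment` parameter are hypothetical; the update used in the talk may weight the increment differently.

```python
import numpy as np

def update_tam(tam, prev_idx, curr_idx, increment=1.0):
    """Strengthen the link between two patterns seen in consecutive steps.

    The constant increment is an assumption, not the slide's formula.
    """
    tam[prev_idx, curr_idx] += increment
    return tam

def learn_tam(sequence, n_patterns):
    """Build a TAM from a sequence of codebook indices (one per time step)."""
    tam = np.zeros((n_patterns, n_patterns))
    for prev_idx, curr_idx in zip(sequence, sequence[1:]):
        update_tam(tam, prev_idx, curr_idx)
    return tam

# Patterns 0 and 1 alternate, then patterns 2 and 3 alternate.
sequence = [0, 1, 0, 1, 2, 3, 2, 3]
tam = learn_tam(sequence, n_patterns=4)
print(tam)
```

Entries linking patterns 0 and 1, and patterns 2 and 3, grow large while cross-links stay small. That is the kind of structure a temporal pooler can use to form temporal groups.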
Inference & Classification • Uses a dataflow similar to that of learning • Two stages of inference in each node: • Spatial inference – find the closest pattern in the codebook • Temporal inference – calculate membership in temporal groups • Classification – HTM itself does not classify images; it only transforms the input space into another (hopefully more invariant) space • An external classifier must be used
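The two inference stages and the external classifier can be sketched for a single node as follows. This is a simplified illustration under stated assumptions: temporal inference is reduced to a hard one-hot group assignment, whereas an actual HTM node may output graded memberships. The function names, the `group_of_pattern` mapping, and the toy data are all hypothetical.

```python
import numpy as np

def spatial_inference(x, codebook):
    """Return the index of the codebook pattern closest to x."""
    return int(np.argmin(np.linalg.norm(codebook - x, axis=1)))

def temporal_inference(pattern_idx, group_of_pattern, n_groups):
    """Return a one-hot membership vector over temporal groups (assumed hard assignment)."""
    membership = np.zeros(n_groups)
    membership[group_of_pattern[pattern_idx]] = 1.0
    return membership

def knn_classify(feature, train_features, train_labels, k=1):
    """External k-NN classifier on node outputs, using a majority vote."""
    dists = np.linalg.norm(train_features - feature, axis=1)
    nearest = np.argsort(dists)[:k]
    labels, counts = np.unique(train_labels[nearest], return_counts=True)
    return labels[np.argmax(counts)]

# Toy node: four codebook patterns, grouped into two temporal groups.
codebook = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 5.0], [6.0, 5.0]])
group_of_pattern = [0, 0, 1, 1]

idx = spatial_inference(np.array([4.8, 5.2]), codebook)
membership = temporal_inference(idx, group_of_pattern, n_groups=2)

# The classifier is trained on HTM outputs, not on raw pixels.
train_features = np.array([[1.0, 0.0], [0.0, 1.0]])
train_labels = np.array(["normal", "anomalous"])
label = knn_classify(membership, train_features, train_labels, k=1)
print(idx, membership, label)
```

The classifier only ever sees the group membership vector, which is why any standard classifier (k-NN, SVM) can sit on top of the network.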
ATM Security • Semi-automatic fraud detection system for ATMs (automatic teller machines) • Detection of masked individuals interacting with the ATM through the ATM's camera – possibility of illegal activity • Pilot system implemented and tested in an experimental environment • Using Kinect as an input device
Kinect • RGB camera developed for the Xbox game console • Capable of providing a depth image of the scene and a "skeleton" if a person is detected in the scene
Face Image Segmentation using Kinect • Two image classes: normal and anomalous faces
ATM Security – Results • Image set inflated with translated, rotated and mirrored copies of the original images • A k-NN classifier in the input space was compared with HTM combined with k-NN and with HTM combined with an SVM classifier • Scenario 1: The whole data set was used • Scenario 2: Translated images were excluded from the training set
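Inflating the image set with transformed copies can be sketched as below. The slides do not give the shift sizes or rotation angles used, so the one-pixel wrap-around shift and the 90-degree rotation are assumptions chosen to keep the example short. The function name `augment` is hypothetical.

```python
import numpy as np

def augment(image, shift=1):
    """Return translated, rotated and mirrored copies of a 2-D image."""
    return [
        np.roll(image, shift, axis=1),  # shifted right, wrapping at the border
        np.rot90(image),                # rotated 90 degrees counter-clockwise
        np.fliplr(image),               # mirrored left to right
    ]

image = np.arange(6).reshape(2, 3)
copies = augment(image)
for c in copies:
    print(c)
```

For Scenario 2, the translated copy would simply be left out of the training set and kept only for testing.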
New features and algorithms for the HTM • New temporal pooler • Images transformed to different image spaces • Different image features • Various settings for the temporal pooler • SOM as spatial pooler
Testing of new image features • Dataset: selected images from Caltech-256 • 10 classes, 30 testing and 30 training images per class • Single-layer network • With a 1-NN classifier as the top node • Image features extracted from image patches corresponding to the receptive fields of nodes
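Extracting features from patches that match the nodes' receptive fields can be sketched as follows. The non-overlapping square grid and the mean/standard-deviation feature vector are assumptions made for illustration; the features actually tested in the talk are not listed in this transcript. The names `extract_patches` and `patch_features` are hypothetical.

```python
import numpy as np

def extract_patches(image, patch_size):
    """Split a 2-D image into non-overlapping square patches, one per node."""
    h, w = image.shape
    patches = []
    for top in range(0, h - patch_size + 1, patch_size):
        for left in range(0, w - patch_size + 1, patch_size):
            patches.append(image[top:top + patch_size, left:left + patch_size])
    return patches

def patch_features(patch):
    """Placeholder feature vector: mean and standard deviation of the patch."""
    return np.array([patch.mean(), patch.std()])

# An 8x8 image covered by four 4x4 receptive fields.
image = np.arange(64, dtype=float).reshape(8, 8)
patches = extract_patches(image, patch_size=4)
features = np.array([patch_features(p) for p in patches])
print(features.shape)
```

Each row of `features` would be the input to one node of the single-layer network, and the node outputs would then go to the 1-NN classifier at the top.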