Object Recognition Using a Neural Network and Invariant Zernike Features Kumar Srijan (200602015) Syed Ahsan(200601096)
Problem Statement • To create a Neural Network-based multiclass object classifier which can perform rotation-, scale- and translation-invariant object recognition.
Solution • Normalize the image so that scaled and translated images look the same. • Extract features from the images which are invariant to rotation. • Create a classifier based on these features.
All these images are the same if we consider scale and translation invariance; after normalization they all reduce to the same canonical image.
This is accomplished by normalization with respect to the first two orders of geometric moments. • The geometric moment of order (p + q) of an image f(x,y) is: Mpq = Σx Σy x^p · y^q · f(x,y) • The zeroth order moment, M00, represents the total mass of the image. • The two first order moments, (M10, M01), provide the position of the center of mass.
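A minimal NumPy sketch of these moments (the function name `geometric_moment` and the assumption that the image f is a 2-D grayscale or binary array are mine, not from the slides):

```python
import numpy as np

def geometric_moment(f, p, q):
    """Geometric moment M_pq = sum_x sum_y x^p * y^q * f(x, y) of a 2-D
    image array f (rows indexed by y, columns by x)."""
    y, x = np.mgrid[0:f.shape[0], 0:f.shape[1]]
    return np.sum((x ** p) * (y ** q) * f)

# m00 = geometric_moment(f, 0, 0)        # total mass of the image
# xc  = geometric_moment(f, 1, 0) / m00  # x coordinate of the centre of mass
# yc  = geometric_moment(f, 0, 1) / m00  # y coordinate of the centre of mass
```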
Translation Invariance • Translation invariance is achieved by transforming the image into a new one whose first order moments, M10 and M01, are both equal to zero. • So we transform the original image f(x,y) into f(x + x', y + y'), where x' = M10/M00 and y' = M01/M00 are the coordinates of the center of mass.
Scale Invariance • Enlarge or reduce the object so that its zeroth order moment, M00, equals a predetermined value β. • This is done after making the image translation invariant. • It is done by changing the image to a new function f(x/a, y/a), where a = sqrt(β/M00).
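A hedged sketch of both normalization steps, assuming the image is a NumPy array. The object is re-centred on the array centre rather than the coordinate origin (so indices stay non-negative); the value of β and the use of scipy.ndimage.map_coordinates for resampling are my choices, not from the slides:

```python
import numpy as np
from scipy.ndimage import map_coordinates

def normalize_translation_scale(f, beta=4000.0):
    """Resample f so that its centre of mass sits at the image centre and
    its zeroth order moment becomes (approximately) beta, i.e. the
    translated and rescaled image f(x/a + x', y/a + y') with a = sqrt(beta/M00)."""
    h, w = f.shape
    y, x = np.mgrid[0:h, 0:w]
    m00 = f.sum()
    xbar = (x * f).sum() / m00            # centre of mass
    ybar = (y * f).sum() / m00
    a = np.sqrt(beta / m00)               # scale factor

    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    # Output pixel (u, v) takes its value from the point of f that maps onto it.
    src_x = (x - cx) / a + xbar
    src_y = (y - cy) / a + ybar
    return map_coordinates(f, [src_y, src_x], order=1, mode='constant', cval=0.0)
```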
Rotation Invariance • The Zernike moments of an image are defined as: Znm = ((n + 1)/π) · Σ f(x,y) · Rnm(ρ) · exp(−jmθ), summed over the unit disc x² + y² ≤ 1, where ρ = sqrt(x² + y²), θ = arctan(y/x) and Rnm is the Zernike radial polynomial. • Here, n represents the order and m the repetition; |m| ≤ n and n − |m| must be even. • For n = 5, the valid values of m are: -5, -3, -1, 1, 3 and 5.
Rotation Invariance • Now suppose the image is rotated by an angle φ. The Zernike moments of the rotated image are then Z'nm = Znm · exp(−jmφ). • Since this phase factor leaves the magnitude unchanged, |Znm| can be taken as a rotation-invariant feature of the underlying image function.
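A direct, unoptimized sketch of the Zernike moment computation from its standard definition; the image is assumed to be mapped onto the unit disc centred on the array, and the function names are mine (a constant pixel-area factor is omitted, since it only rescales every feature by the same amount):

```python
import numpy as np
from math import factorial

def radial_poly(rho, n, m):
    """Zernike radial polynomial R_nm(rho); requires |m| <= n and n - |m| even."""
    m = abs(m)
    R = np.zeros_like(rho)
    for s in range((n - m) // 2 + 1):
        c = ((-1) ** s * factorial(n - s) /
             (factorial(s) * factorial((n + m) // 2 - s) * factorial((n - m) // 2 - s)))
        R += c * rho ** (n - 2 * s)
    return R

def zernike_moment(f, n, m):
    """Z_nm = (n+1)/pi * sum over the unit disc of f(x,y) * R_nm(rho) * exp(-j*m*theta)."""
    h, w = f.shape
    y, x = np.mgrid[0:h, 0:w]
    xn = (2.0 * x - (w - 1)) / (w - 1)        # map pixel coords into [-1, 1]
    yn = (2.0 * y - (h - 1)) / (h - 1)
    rho = np.hypot(xn, yn)
    theta = np.arctan2(yn, xn)
    inside = rho <= 1.0                        # keep only the unit disc
    kernel = radial_poly(rho, n, m) * np.exp(-1j * m * theta)
    return (n + 1) / np.pi * np.sum(f[inside] * kernel[inside])

# Only |Z_nm| is used as a feature: rotating the image by phi multiplies
# Z_nm by exp(-j*m*phi), which leaves the magnitude unchanged.
```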
Feature Extraction • Binarize the image first according to some threshold. • Normalize it to make it translation and scale invariant, giving g(x,y). • Calculate the Zernike moments of g(x,y) from 2nd order up to nth order (the 0th order moment equals β/π and the first order moments equal 0 for all images after scale and translation normalization, so they carry no discriminative information).
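Putting the steps together, one possible feature extraction pipeline; it reuses the helper functions sketched above, and the threshold and β values are illustrative assumptions. Only non-negative repetitions are kept, since |Zn,−m| = |Zn,m|; for orders 2 to 12 this yields the 47 features used in Experiment I.

```python
import numpy as np

def extract_features(image, max_order=12, threshold=128, beta=4000.0):
    """Binarize, normalize for translation and scale, then collect |Z_nm|
    for orders 2..max_order (orders 0 and 1 are skipped because they are
    the same for every normalized image)."""
    f = (image >= threshold).astype(float)           # binarize
    g = normalize_translation_scale(f, beta=beta)    # translation + scale invariance
    features = []
    for n in range(2, max_order + 1):
        for m in range(0, n + 1):
            if (n - m) % 2 == 0:                     # valid repetitions only
                features.append(abs(zernike_moment(g, n, m)))
    return np.array(features)                        # 47 features for max_order = 12
```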
Classification • Classification is done using a multilayer neural network. • In this case we used only one hidden layer. • Backpropagation of error is used for learning.
Classifier Details • The activation function used at the output layer and at the hidden layer is the symmetric sigmoid function, a rescaled version of the standard sigmoid.
Classifier Details • The symmetric sigmoid is simply the standard sigmoid stretched so that its range spans 2 and then shifted down by 1, so that it ranges between -1 and 1. • If f(x) is the standard sigmoid, the symmetric sigmoid is g(x) = 2*f(x) – 1, which is symmetric about the origin.
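In code, the two activations could look like this (a small sketch; note that 2*f(x) − 1 is algebraically the same as tanh(x/2)):

```python
import numpy as np

def sigmoid(x):
    """Standard logistic sigmoid, range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def symmetric_sigmoid(x):
    """g(x) = 2*f(x) - 1: the sigmoid stretched to a range of 2 and shifted
    down by 1, so it runs from -1 to 1 and is symmetric about the origin
    (equivalently, tanh(x / 2))."""
    return 2.0 * sigmoid(x) - 1.0
```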
Classifier Details • 26 nodes in the output layer, one per letter of the alphabet. • The number of hidden layer nodes can be varied. • The number of input layer nodes equals the length of the feature vector.
Training of the Classifier • We have used the Back Propagation Algorithm. • Initialize all Wij's to small random values. • Present an input from class m and specify the desired output: -1 for all output nodes except the mth node, which is 1. • Calculate the actual outputs of all nodes using the present values of the Wij's, by mapping the total input at each node through the symmetric sigmoid function.
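A sketch of the target encoding and the forward pass, assuming two weight matrices W_hidden and W_output and omitting bias terms for brevity (the names are mine, and symmetric_sigmoid is the function sketched earlier):

```python
import numpy as np

def target_for_class(m, n_classes=26):
    """Desired output vector: -1 for every output node except the m-th, which is +1."""
    d = -np.ones(n_classes)
    d[m] = 1.0
    return d

def forward(x, W_hidden, W_output):
    """One forward pass: the total input at each node is a weighted sum of the
    layer below, mapped through the symmetric sigmoid."""
    h = symmetric_sigmoid(W_hidden @ x)    # hidden layer outputs
    y = symmetric_sigmoid(W_output @ h)    # 26 output activations in (-1, 1)
    return h, y
```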
Training of the Classifier • Find an error term, δj, for every node. • If dj and yj stand for the desired and actual outputs of a node, then for an output node δj = f'(netj)·(dj − yj), and for a hidden layer node δj = f'(netj)·Σk δk·Wjk, where k runs over all nodes in the layer above node j and f'(netj) is the derivative of the activation function (for the symmetric sigmoid, f'(netj) = (1 − yj²)/2).
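A sketch of the error-term computation under the standard backpropagation rule, using the fact that the derivative of the symmetric sigmoid at output y is (1 − y²)/2; the variable names are mine:

```python
def error_terms(d, y, h, W_output):
    """Backpropagated error terms; (1 - y**2) / 2 is the derivative of the
    symmetric sigmoid expressed in terms of the node's output."""
    delta_out = 0.5 * (1.0 - y ** 2) * (d - y)                    # output nodes
    delta_hid = 0.5 * (1.0 - h ** 2) * (W_output.T @ delta_out)   # hidden nodes
    return delta_out, delta_hid
```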
Training of the Classifier • Now we adjust the weights by Wij(n+1) = Wij(n) + α·δj·xi + ζ·(Wij(n) − Wij(n−1)), where xi is the output of node i feeding the connection, and (n+1), (n), and (n-1) index next, present, and previous iterations respectively. α is a learning rate similar to step size in gradient search algorithms. • ζ is a constant between 0 and 1 which determines the effect of past weight changes on the current direction of movement in weight space. • All the training inputs are presented cyclically until the weights stabilize.
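A sketch of this momentum update; the values of α and ζ are placeholders, not the ones used in the experiments. Training then loops over the patterns cyclically, applying forward, error_terms, and update_weights to both weight matrices until the weight changes become negligible.

```python
import numpy as np

def update_weights(W, delta, layer_input, prev_dW, alpha=0.1, zeta=0.5):
    """W(n+1) = W(n) + alpha * delta * input^T + zeta * (W(n) - W(n-1));
    the returned dW is remembered as the 'past weight change' for the next step."""
    dW = alpha * np.outer(delta, layer_input) + zeta * prev_dW
    return W + dW, dW
```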
Experimentation • We trained the classifier using 4 images of each letter of the alphabet. • A sample of the training data is shown on this slide. • Zernike moments from 2nd to nth order were calculated for all images and treated as feature vectors.
Testing • Similar images with translation, scale and rotation variations were used for testing. • Some of the images used in testing are shown on this slide. • A total of 104 images, 4 for each letter of the alphabet with various scale, translation and rotation variations, were used as testing data.
Results • EXPERIMENT – I • Keeping the number of features fixed at 47 (2nd – 12th order Zernike moments), the number of hidden layer nodes was varied.
Results • EXPERIMENT – II • Keeping the number of hidden nodes fixed at 50, the length of the feature vector was varied.
Inferences • Good classification can be achieved even with a single hidden layer. • Using very few hidden layer nodes may lead to a very bad classifier. • Beyond a point, the performance gained by adding more hidden layer nodes saturates. • For very good classification, one must have a sufficient number of features. • Using too many features does not guarantee a better classifier.