Object Recognition Using a Neural Network and Invariant Zernike Features Kumar Srijan (200602015) Syed Ahsan(200601096)
Problem Statement • To create a Neural Network-based multiclass object classifier which can perform rotation-, scale- and translation-invariant object recognition.
Solution • Normalize the image so that scaled and translated images look the same. • Extract features from the images which are invariant to rotation. • Create a classifier based on these features.
All these images are the same if we consider scale and translation invariance; after normalization they all reduce to the same canonical image.
This is accomplished by normalization with respect to the first two orders of geometric moments. • The geometric moment of order (p + q) of an image f(x,y) is: Mpq = Σx Σy x^p · y^q · f(x,y) • The zeroth order moment, M00, represents the total mass of the image. • The two first order moments, (M10, M01), provide the position of the center of mass.
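A minimal NumPy sketch of these moments (the function name `geometric_moment` and the assumption that the image f is a 2-D grayscale or binary array are mine, not from the slides):

```python
import numpy as np

def geometric_moment(f, p, q):
    """Geometric moment M_pq = sum_x sum_y x^p * y^q * f(x, y) of a 2-D
    image array f (rows indexed by y, columns by x)."""
    y, x = np.mgrid[0:f.shape[0], 0:f.shape[1]]
    return np.sum((x ** p) * (y ** q) * f)

# m00 = geometric_moment(f, 0, 0)        # total mass of the image
# xc  = geometric_moment(f, 1, 0) / m00  # x coordinate of the centre of mass
# yc  = geometric_moment(f, 0, 1) / m00  # y coordinate of the centre of mass
```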
Translation Invariance • Translation invariance is achieved by transforming the image into a new one whose first order moments, M10 and M01, are both equal to zero. • So we transform the original image f(x,y) into f(x + x', y + y'), where x' = M10/M00 and y' = M01/M00 are the coordinates of the center of mass.
Scale Invariance • Enlarge or reduce the object so that its zeroth order moment, M00, equals a predetermined value β. • This is done after making the image translation invariant. • It is done by changing the image to a new function f(x/a, y/a), where a = sqrt(β/M00).
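A hedged sketch of both normalization steps, assuming the image is a NumPy array. The object is re-centred on the array centre rather than the coordinate origin (so indices stay non-negative); the value of β and the use of scipy.ndimage.map_coordinates for resampling are my choices, not from the slides:

```python
import numpy as np
from scipy.ndimage import map_coordinates

def normalize_translation_scale(f, beta=4000.0):
    """Resample f so that its centre of mass sits at the image centre and
    its zeroth order moment becomes (approximately) beta, i.e. the
    translated and rescaled image f(x/a + x', y/a + y') with a = sqrt(beta/M00)."""
    h, w = f.shape
    y, x = np.mgrid[0:h, 0:w]
    m00 = f.sum()
    xbar = (x * f).sum() / m00            # centre of mass
    ybar = (y * f).sum() / m00
    a = np.sqrt(beta / m00)               # scale factor

    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    # Output pixel (u, v) takes its value from the point of f that maps onto it.
    src_x = (x - cx) / a + xbar
    src_y = (y - cy) / a + ybar
    return map_coordinates(f, [src_y, src_x], order=1, mode='constant', cval=0.0)
```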
Rotation Invariance • The Zernike moments of an image are defined as: Znm = ((n + 1)/π) · Σ f(x,y) · Rnm(ρ) · exp(−jmθ), summed over the unit disc x² + y² ≤ 1, where ρ = sqrt(x² + y²), θ = arctan(y/x) and Rnm is the Zernike radial polynomial. • Here, n represents the order and m the repetition; |m| ≤ n and n − |m| must be even. • For n = 5, the valid values of m are: -5, -3, -1, 1, 3 and 5.
Rotation Invariance • Now suppose the image is rotated by an angle φ. The Zernike moments of the rotated image are then Z'nm = Znm · exp(−jmφ). • Since this phase factor leaves the magnitude unchanged, |Znm| can be taken as a rotation-invariant feature of the underlying image function.
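A direct, unoptimized sketch of the Zernike moment computation from its standard definition; the image is assumed to be mapped onto the unit disc centred on the array, and the function names are mine (a constant pixel-area factor is omitted, since it only rescales every feature by the same amount):

```python
import numpy as np
from math import factorial

def radial_poly(rho, n, m):
    """Zernike radial polynomial R_nm(rho); requires |m| <= n and n - |m| even."""
    m = abs(m)
    R = np.zeros_like(rho)
    for s in range((n - m) // 2 + 1):
        c = ((-1) ** s * factorial(n - s) /
             (factorial(s) * factorial((n + m) // 2 - s) * factorial((n - m) // 2 - s)))
        R += c * rho ** (n - 2 * s)
    return R

def zernike_moment(f, n, m):
    """Z_nm = (n+1)/pi * sum over the unit disc of f(x,y) * R_nm(rho) * exp(-j*m*theta)."""
    h, w = f.shape
    y, x = np.mgrid[0:h, 0:w]
    xn = (2.0 * x - (w - 1)) / (w - 1)        # map pixel coords into [-1, 1]
    yn = (2.0 * y - (h - 1)) / (h - 1)
    rho = np.hypot(xn, yn)
    theta = np.arctan2(yn, xn)
    inside = rho <= 1.0                        # keep only the unit disc
    kernel = radial_poly(rho, n, m) * np.exp(-1j * m * theta)
    return (n + 1) / np.pi * np.sum(f[inside] * kernel[inside])

# Only |Z_nm| is used as a feature: rotating the image by phi multiplies
# Z_nm by exp(-j*m*phi), which leaves the magnitude unchanged.
```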
Feature Extraction • Binarize the image first according to some threshold. • Normalize it to make it translation and scale invariant, giving g(x,y). • Calculate the Zernike moments of g(x,y) from 2nd order up to nth order (the 0th order moment equals β/π and the first order moments equal 0 for all images after scale and translation normalization, so they carry no discriminative information).
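Putting the steps together, one possible feature extraction pipeline; it reuses the helper functions sketched above, and the threshold and β values are illustrative assumptions. Only non-negative repetitions are kept, since |Zn,−m| = |Zn,m|; for orders 2 to 12 this yields the 47 features used in Experiment I.

```python
import numpy as np

def extract_features(image, max_order=12, threshold=128, beta=4000.0):
    """Binarize, normalize for translation and scale, then collect |Z_nm|
    for orders 2..max_order (orders 0 and 1 are skipped because they are
    the same for every normalized image)."""
    f = (image >= threshold).astype(float)           # binarize
    g = normalize_translation_scale(f, beta=beta)    # translation + scale invariance
    features = []
    for n in range(2, max_order + 1):
        for m in range(0, n + 1):
            if (n - m) % 2 == 0:                     # valid repetitions only
                features.append(abs(zernike_moment(g, n, m)))
    return np.array(features)                        # 47 features for max_order = 12
```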
Classification • Classification is done using a multilayer neural network. • In this case we used only one hidden layer. • Backpropagation of error is used for learning.
Classifier Details • The activation function used at the output layer and at the hidden layer is the symmetric sigmoid function, a rescaled version of the standard sigmoid.
Classifier Details • The symmetric sigmoid is simply the standard sigmoid stretched so that its range spans 2 and then shifted down by 1, so that it ranges between -1 and 1. • If f(x) is the standard sigmoid, the symmetric sigmoid is g(x) = 2*f(x) – 1, which is symmetric about the origin.
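In code, the two activations could look like this (a small sketch; note that 2*f(x) − 1 is algebraically the same as tanh(x/2)):

```python
import numpy as np

def sigmoid(x):
    """Standard logistic sigmoid, range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def symmetric_sigmoid(x):
    """g(x) = 2*f(x) - 1: the sigmoid stretched to a range of 2 and shifted
    down by 1, so it runs from -1 to 1 and is symmetric about the origin
    (equivalently, tanh(x / 2))."""
    return 2.0 * sigmoid(x) - 1.0
```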
Classifier Details • 26 nodes in the output layer, one per letter of the alphabet. • The number of hidden layer nodes can be varied. • The number of input layer nodes equals the length of the feature vector.
Training of the Classifier • We have used the Back Propagation Algorithm. • Initialize all Wij's to small random values. • Present an input from class m and specify the desired output: -1 for all output nodes except the mth node, which is 1. • Calculate the actual outputs of all nodes using the present values of the Wij's, by mapping the total input at each node through the symmetric sigmoid function.
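A sketch of the target encoding and the forward pass, assuming two weight matrices W_hidden and W_output and omitting bias terms for brevity (the names are mine, and symmetric_sigmoid is the function sketched earlier):

```python
import numpy as np

def target_for_class(m, n_classes=26):
    """Desired output vector: -1 for every output node except the m-th, which is +1."""
    d = -np.ones(n_classes)
    d[m] = 1.0
    return d

def forward(x, W_hidden, W_output):
    """One forward pass: the total input at each node is a weighted sum of the
    layer below, mapped through the symmetric sigmoid."""
    h = symmetric_sigmoid(W_hidden @ x)    # hidden layer outputs
    y = symmetric_sigmoid(W_output @ h)    # 26 output activations in (-1, 1)
    return h, y
```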
Training of the Classifier • Find an error term, δj, for every node. • If dj and yj stand for the desired and actual outputs of a node, then for an output node δj = f'(netj)·(dj − yj), and for a hidden layer node δj = f'(netj)·Σk δk·Wjk, where k runs over all nodes in the layer above node j and f'(netj) is the derivative of the activation function (for the symmetric sigmoid, f'(netj) = (1 − yj²)/2).
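A sketch of the error-term computation under the standard backpropagation rule, using the fact that the derivative of the symmetric sigmoid at output y is (1 − y²)/2; the variable names are mine:

```python
def error_terms(d, y, h, W_output):
    """Backpropagated error terms; (1 - y**2) / 2 is the derivative of the
    symmetric sigmoid expressed in terms of the node's output."""
    delta_out = 0.5 * (1.0 - y ** 2) * (d - y)                    # output nodes
    delta_hid = 0.5 * (1.0 - h ** 2) * (W_output.T @ delta_out)   # hidden nodes
    return delta_out, delta_hid
```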
Training of the Classifier • Now we adjust the weights by Wij(n+1) = Wij(n) + α·δj·xi + ζ·(Wij(n) − Wij(n−1)), where xi is the output of node i feeding the connection, and (n+1), (n), and (n-1) index next, present, and previous iterations respectively. α is a learning rate similar to step size in gradient search algorithms. • ζ is a constant between 0 and 1 which determines the effect of past weight changes on the current direction of movement in weight space. • All the training inputs are presented cyclically until the weights stabilize.
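A sketch of this momentum update; the values of α and ζ are placeholders, not the ones used in the experiments. Training then loops over the patterns cyclically, applying forward, error_terms, and update_weights to both weight matrices until the weight changes become negligible.

```python
import numpy as np

def update_weights(W, delta, layer_input, prev_dW, alpha=0.1, zeta=0.5):
    """W(n+1) = W(n) + alpha * delta * input^T + zeta * (W(n) - W(n-1));
    the returned dW is remembered as the 'past weight change' for the next step."""
    dW = alpha * np.outer(delta, layer_input) + zeta * prev_dW
    return W + dW, dW
```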
Experimentation • We trained the classifier using 4 images of each letter of the alphabet. • A sample of the training data is shown on this slide. • Zernike moments from 2nd to nth order were calculated for all images and treated as feature vectors.
Testing • Similar images with translation, scale and rotation variations were used for testing. • Some of the images used in testing are shown on this slide. • A total of 104 images, 4 for each letter of the alphabet with various scale, translation and rotation variations, were used as testing data.
Results • EXPERIMENT – I • Keeping the number of features fixed at 47 (2nd – 12th order Zernike moments), the number of hidden layer nodes was varied.
Results • EXPERIMENT – II • Keeping the number of hidden nodes fixed at 50, the length of the feature vector was varied.
Inferences • Good classification can be achieved even with a single hidden layer. • Using very few hidden layer nodes may lead to a very bad classifier. • Beyond a point, the performance gained by adding more hidden layer nodes saturates. • For very good classification, one must have a sufficient number of features. • Using too many features does not guarantee a better classifier.