A Project on Generic Object Recognition -- by Yatharth Saraf
Problem Definition and Background
• Recognizing the generic class or category of a given object, as opposed to recognizing specific, individual objects
  • Humans are much better at generic recognition; machines are more competitive at specific object recognition
• Early work by Marr led to the ‘reconstruction school’
  • Advocates 3-D reconstruction and modeling before further reasoning about a scene
• Current work in object categorization tends to fall in the ‘recognition school’
  • Works in the 2-D domain, with 2-D image features and descriptors
  • e.g. bag-of-features approaches, and spatial 2-D geometry approaches as in the ‘constellation model’
Applications
• Image database annotation and retrieval
• Video surveillance
• Driver assistance, autonomous robots
• Cognitive support for disabled people
Related Work
• Discriminative approaches
  • SVM, subspace methods
• Bag of features
  • Representation of objects with point descriptors
• Constellation model
  • Representations that take into account the spatial geometry (2-D) of key points
Assumptions
• Images are scale-normalized
• Images are clean, i.e. no background clutter or occlusion
  • (-) Implies segmentation is necessary as a pre-processing step
  • (+) Avoids the problem of exponential search
Outline of the Method (Training)
• Detect salient regions in all training images using the Kadir-Brady feature detector
• Extract X,Y coordinates, scale and 11x11 intensity patches around the detected features
• Reduce the dimensionality of the appearance patches from 121 to 16 using PCA
• Estimate model parameters
  • A single full Gaussian for location; one Gaussian per part for appearance
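The PCA and Gaussian-estimation steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it assumes NumPy, uses random synthetic patches in place of real Kadir-Brady detections, and the function names (`fit_pca`, `project`, `fit_gaussian`, `gaussian_log_pdf`) and the small `ridge` term are hypothetical choices made for the example.

```python
import numpy as np

def fit_pca(patches, n_components=16):
    """Fit PCA on flattened patches of shape (n_samples, 121)."""
    mean = patches.mean(axis=0)
    centered = patches - mean
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:n_components]

def project(patches, mean, basis):
    """Map 121-D patches to their 16-D PCA coefficients."""
    return (patches - mean) @ basis.T

def fit_gaussian(samples, ridge=1e-6):
    """Estimate the mean and full covariance of a Gaussian.

    The ridge term is an assumption added here to keep the covariance
    invertible when few training samples are available.
    """
    mean = samples.mean(axis=0)
    cov = np.cov(samples, rowvar=False) + ridge * np.eye(samples.shape[1])
    return mean, cov

def gaussian_log_pdf(x, mean, cov):
    """Log-density of a single vector x under a full-covariance Gaussian."""
    d = x - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = d @ np.linalg.solve(cov, d)
    return -0.5 * (len(mean) * np.log(2.0 * np.pi) + logdet + maha)

# Synthetic stand-in for the 11x11 patches of one part across training images.
rng = np.random.default_rng(0)
patches = rng.integers(0, 256, size=(200, 121)).astype(float)

pca_mean, pca_basis = fit_pca(patches)
codes = project(patches, pca_mean, pca_basis)      # shape (200, 16)
part_mean, part_cov = fit_gaussian(codes)          # appearance Gaussian for one part
score = gaussian_log_pdf(codes[0], part_mean, part_cov)
```

The same `fit_gaussian` helper could be applied to the concatenated X,Y coordinates of all parts to obtain the single full Gaussian for location, again under the assumptions stated above.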
Outline of the Method (Testing)
• Extract features of test images in the same manner as in the training phase
• Use the learnt model to estimate the probability of detection
• Use Bayes’ Decision Rule to classify
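As a rough sketch of the classification step, the decision can be made in the log domain. The slides do not say what the object model is compared against, so the background likelihood, the equal default prior, and the function names below are assumptions for illustration. Adding the location and appearance log-probabilities also assumes the two are modeled independently.

```python
import math

def total_log_prob(location_log_prob, appearance_log_prob):
    """Combine the two model terms, assuming location and appearance are independent."""
    return location_log_prob + appearance_log_prob

def is_object(log_lik_object, log_lik_background, prior_object=0.5):
    """Bayes' decision rule: pick the class with the larger posterior.

    The background likelihood and the prior are hypothetical inputs here.
    """
    log_post_object = log_lik_object + math.log(prior_object)
    log_post_background = log_lik_background + math.log(1.0 - prior_object)
    return log_post_object > log_post_background

# Made-up numbers, only to show the call pattern.
score = total_log_prob(-40.0, -55.0)
decision = is_object(score, -100.0)
```

Working with log-probabilities avoids numerical underflow, since the product of many small Gaussian densities quickly falls below floating-point range.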
Experiments
• Careful tweaking of detector parameters is needed
• A single set of parameter settings may not be suitable for all categories
[Figure: detected features with starting scale 3 and with starting scale 23]
Experiments (contd.)
• 47 clean motorbike images were used to train the motorbike model
• Sorting the extracted patches by X-coordinate helped (as opposed to sorting by saliency)
• The appearance model did not perform as well as the location model
[Figure: Log-probabilities of the 9 test images from the location model. Panels: features sorted by X-coordinate; features sorted by saliency. Example images: Image 5, Image 9.]
[Figure: Appearance log-probabilities of the 9 test images. Panels: features sorted by saliency; features sorted by X-coordinate.]
[Figure: Total log-probabilities of the 9 test images]
Experiments (contd.)
• Using a Mixture of Gaussians for the appearances of parts didn’t make much difference
  • 3 mixture components per part (EM initialized with k-means and sample covariances)
Experiments (contd.)
• Levenshtein distances on the appearance patches worked quite nicely
  • Each appearance patch is a single character
  • Matching cost was computed using a straight SSD
  • Cost of inserting a gap = matching cost of the patch with a canonical 11x11 patch having uniform intensity of 128
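One plausible reading of this patch-level Levenshtein distance is the standard edit-distance recurrence with SSD as the substitution cost and the SSD against a uniform 128 patch as the gap cost. The sketch below follows that reading. The function names are hypothetical, patches are flattened 121-element lists, and the slides do not say how the sequences are ordered or normalized.

```python
def ssd(a, b):
    """Sum of squared differences between two flattened patches."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def patch_edit_distance(seq_a, seq_b, patch_len=121, gap_value=128):
    """Edit distance between two sequences of patches.

    Substituting one patch for another costs their SSD; inserting a gap
    opposite a patch costs that patch's SSD against a uniform patch of
    intensity gap_value.
    """
    blank = [gap_value] * patch_len
    gap_a = [ssd(p, blank) for p in seq_a]
    gap_b = [ssd(p, blank) for p in seq_b]
    n, m = len(seq_a), len(seq_b)

    # dist[i][j] = cheapest alignment of the first i patches of seq_a
    # with the first j patches of seq_b.
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = dist[i - 1][0] + gap_a[i - 1]
    for j in range(1, m + 1):
        dist[0][j] = dist[0][j - 1] + gap_b[j - 1]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dist[i][j] = min(
                dist[i - 1][j - 1] + ssd(seq_a[i - 1], seq_b[j - 1]),  # match
                dist[i - 1][j] + gap_a[i - 1],   # gap opposite a patch of seq_a
                dist[i][j - 1] + gap_b[j - 1],   # gap opposite a patch of seq_b
            )
    return dist[n][m]

# Toy sequences of uniform patches, only to show the call pattern.
model_seq = [[100] * 121, [200] * 121]
test_seq = [[100] * 121]
cost = patch_edit_distance(model_seq, test_seq)
```

Because the gap cost depends on the patch itself, dropping a mid-gray patch is cheap while dropping a high-contrast patch is expensive.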
Conclusions and Future Work
• Strong dependence on the feature detector
• Appearance model doesn’t seem to be working too well
• Levenshtein distances could be more promising
• Experiments with more clean training and test data, multiple categories
• Exponential search for dealing with clutter and occlusion
Questions? -- Thank You