Invariant Local Feature for Object Recognition Presented by Wyman 2/05/2006
Introduction • Object Recognition • The task of finding 3D objects in 2D images (or even video) and classifying them into one of many known object types • Closely related to the success of many computer vision applications • robotics, surveillance, registration, etc. • A difficult problem: no general and comprehensive solution has been found yet
Introduction • Two main streams of approaches: • Model-Based Object Recognition • View-Based Object Recognition • Uses 2D representations of the same object viewed at different angles and distances, when available • Extract features (as the representation of the object) and compare them with those in the feature database
Matching with Local Features • One possible solution • Matching with invariant local features • Robust to occlusion and cluttered backgrounds • cf. global features • Three phases: • Detection (repeatedly detected) • Description (distinctive, invariant) • Matching (accurate, fast)
Research Direction • Study and improve invariant local features • Detection, description and matching • Study and improve object recognition / matching using invariant local features • Areas to improve • Distinctiveness • Invariance • Efficiency
Outline • State-of-the-art techniques • Descriptor • Performance evaluation • Current extension using color • Possible way to improve – Color Orientation • Matching • Cross-bin distance • Performance evaluation • Possible way to improve – Aggregation of Content • Conclusion & Future Work
Performance Evaluation of Descriptors • We aim to compare the performance of three state-of-the-art local feature descriptors: SIFT, PCA-SIFT and GLOH • Same experimental setup as that used in “A Performance Evaluation of Local Descriptors”, TPAMI 2005 • Different evaluation criterion • Different result • In each experiment, each descriptor describes features from • Harris corner detector • Harris-affine covariant detector • Outputs regions that are invariant to viewpoint change
SIFT – Scale Invariant Feature Transform • Descriptor overview: • Find the local orientation as the dominant gradient direction (rotation invariant) • Compute gradient orientation histograms of several small windows (128 values for each point) relative to the local orientation (viewpoint invariant) • Normalize the descriptor to make it invariant to intensity change (illumination invariant) • D. Lowe. “Distinctive Image Features from Scale-Invariant Keypoints”. IJCV 2004
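The three descriptor steps above can be sketched in a few lines of NumPy. This is a simplified illustration, not Lowe's implementation: the 4x4 grid of cells with 8 orientation bins (giving 128 values), the 36-bin orientation histogram and the function name `sift_like_descriptor` are assumptions made for the example, and the sampling grid itself is not rotated, so only the histogram angles are expressed relative to the dominant direction.

```python
import numpy as np

def sift_like_descriptor(patch, grid=4, n_bins=8):
    """Simplified SIFT-style descriptor for a square grayscale patch (hypothetical helper)."""
    patch = patch.astype(float)
    gy, gx = np.gradient(patch)          # derivatives along rows (y) and columns (x)
    mag = np.hypot(gx, gy)               # gradient magnitude
    ang = np.arctan2(gy, gx)             # gradient orientation in [-pi, pi]

    # Step 1: dominant orientation = peak of a magnitude-weighted histogram.
    hist36, edges = np.histogram(ang, bins=36, range=(-np.pi, np.pi), weights=mag)
    peak = np.argmax(hist36)
    dominant = 0.5 * (edges[peak] + edges[peak + 1])

    # Step 2: orientation histograms relative to the dominant direction,
    # one histogram per spatial cell.
    rel = (ang - dominant) % (2 * np.pi)
    obin = np.minimum((rel / (2 * np.pi) * n_bins).astype(int), n_bins - 1)
    h, w = patch.shape
    rows = np.minimum(np.arange(h) * grid // h, grid - 1)
    cols = np.minimum(np.arange(w) * grid // w, grid - 1)
    cell = rows[:, None] * grid + cols[None, :]
    desc = np.zeros(grid * grid * n_bins)
    np.add.at(desc, (cell * n_bins + obin).ravel(), mag.ravel())

    # Step 3: normalize to unit length to cancel intensity scaling.
    norm = np.linalg.norm(desc)
    return desc / norm if norm > 0 else desc
```

Because the descriptor is built from gradients and then normalized, an intensity change of the form aI + b with a > 0 should leave it essentially unchanged: the offset b disappears in the gradient and the gain a disappears in the normalization.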
PCA-SIFT • Rotate the feature region to the dominant gradient direction, as in SIFT • Pre-compute an eigenspace for local gradient patches of size 41x41 • 2x39x39=3042 elements • Only keep 20 components • A more compact descriptor • Sensitive to viewpoint change • Y. Ke and R. Sukthankar. PCA-SIFT: A More Distinctive Representation for Local Image Descriptors. CVPR 2004
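A minimal sketch of the projection step follows, assuming the 41x41 patch has already been rotated to its dominant orientation. The helper names (`gradient_vector`, `fit_eigenspace`, `pca_sift`) are hypothetical, and the eigenspace here is fitted with a plain SVD on whatever training patches are supplied, which may differ from how the original eigenspace was built.

```python
import numpy as np

def gradient_vector(patch41):
    """41x41 patch -> 2*39*39 = 3042 gradient values, normalized to unit length."""
    p = patch41.astype(float)
    gx = p[1:-1, 2:] - p[1:-1, :-2]    # horizontal central differences, 39x39
    gy = p[2:, 1:-1] - p[:-2, 1:-1]    # vertical central differences, 39x39
    v = np.concatenate([gx.ravel(), gy.ravel()])
    n = np.linalg.norm(v)
    return v / n if n > 0 else v

def fit_eigenspace(train_patches, n_components=20):
    """Pre-compute the mean and the top principal directions from training patches."""
    X = np.stack([gradient_vector(p) for p in train_patches])
    mean = X.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def pca_sift(patch41, mean, basis):
    """Project a patch's gradient vector onto the pre-computed eigenspace."""
    return basis @ (gradient_vector(patch41) - mean)
```

The training set must contain at least as many patches as components kept, otherwise fewer than 20 directions are available.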
GLOH (Gradient location-orientation histogram) • Differs from SIFT in its sampling method • 17 log-polar location bins • 16 orientation bins • Analyze the 17x16=272 dimensions • Apply PCA, keep 128 components • GLOH applies PCA to the orientation histogram, whereas PCA-SIFT applies PCA to the gradient patch • K. Mikolajczyk and C. Schmid. A Performance Evaluation of Local Descriptors. TPAMI 2005
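The log-polar sampling can be illustrated with a small binning function. The layout assumed here (three rings with radii 6, 11 and 15 pixels, 8 angular sectors in the two outer rings, and an undivided central bin, giving 1 + 8 + 8 = 17 location bins) is how I recall the TPAMI 2005 description and should be treated as an assumption; `gloh_location_bin` is a hypothetical name.

```python
import numpy as np

RADII = (6.0, 11.0, 15.0)   # ring boundaries in pixels (assumed values)
N_ANGULAR = 8               # angular sectors in each outer ring (assumed)
N_ORIENT = 16               # gradient orientation bins per location bin

def gloh_location_bin(dx, dy):
    """Return the location bin 0..16 for an offset from the patch centre, or -1 if outside."""
    r = np.hypot(dx, dy)
    if r >= RADII[2]:
        return -1
    if r < RADII[0]:
        return 0                                  # central bin, not split by angle
    ring = 0 if r < RADII[1] else 1
    theta = np.arctan2(dy, dx) % (2 * np.pi)
    sector = min(int(theta / (2 * np.pi) * N_ANGULAR), N_ANGULAR - 1)
    return 1 + ring * N_ANGULAR + sector

# Histogram size before PCA: location bins x orientation bins.
n_location = len({gloh_location_bin(x, y)
                  for x in range(-15, 16) for y in range(-15, 16)} - {-1})
raw_dimension = n_location * N_ORIENT
```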
Performance Evaluation • Data Set • From the Visual Geometry Group • Image pairs shown: scale + rotation (bark), viewpoint change (graf), illumination change (leuven), viewpoint change (wall), blurring (bikes)
Performance Evaluation • Evaluation Criteria • Match features from the first image to the second one based on the nearest neighbor distance ratio • That is, two features are matched if the first nearest neighbor is much closer than the second nearest neighbor • This is different from the threshold-based criterion used in “A Performance Evaluation of Local Descriptors”, TPAMI 2005 • Count the number of correct matches and the number of false matches obtained for an image pair • The results are plotted in the form of recall versus 1-precision curves • Recall is measured against the total # of possible matches
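The evaluation criterion above can be sketched as follows. The ratio threshold of 0.8 is only a placeholder (sweeping it would trace out the recall versus 1-precision curve), the function names are hypothetical, and the ground-truth dictionary stands in for the correspondences that the data set's known transformations would provide. Recall is taken as correct matches over total possible matches, and 1-precision as false matches over all returned matches.

```python
import numpy as np

def nndr_matches(desc_a, desc_b, ratio=0.8):
    """Match rows of desc_a to rows of desc_b by nearest neighbor distance ratio."""
    # Pairwise Euclidean distances, shape (len(desc_a), len(desc_b)).
    d = np.linalg.norm(desc_a[:, None, :] - desc_b[None, :, :], axis=2)
    order = np.argsort(d, axis=1)
    matches = []
    for i in range(len(desc_a)):
        first, second = order[i, 0], order[i, 1]
        # Accept only if the nearest neighbor is much closer than the second nearest.
        if d[i, first] < ratio * d[i, second]:
            matches.append((i, int(first)))
    return matches

def recall_and_one_minus_precision(matches, ground_truth):
    """ground_truth maps a feature index in image A to its true match in image B."""
    correct = sum(1 for i, j in matches if ground_truth.get(i) == j)
    false = len(matches) - correct
    recall = correct / len(ground_truth) if ground_truth else 0.0
    one_minus_precision = false / len(matches) if matches else 0.0
    return recall, one_minus_precision
```

The second image needs at least two descriptors, since the ratio test compares the first and second nearest neighbors.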
Performance Evaluation • [Result plots: viewpoint change (graf), viewpoint change (wall), scale + rotation (bark), blurring (bikes), illumination change (leuven)]
Performance Evaluation Result • For accuracy: SIFT • For speed: PCA-SIFT • In a large database: ?
Start from Scratch • Comparison of my descriptor with SIFT • Simply designed vs. carefully designed • Result • SIFT is a carefully designed descriptor; it remains robust as the degree of transformation increases • [Plots: increasing illumination change, increasing affine change (two plots), increasing blur]
Extension using Color • van de Weijer extends local feature descriptors with color information by concatenating a color descriptor, K, to the shape descriptor, S, according to $B = [\hat{S}, \lambda \hat{K}]$ • where B is the combined color and shape descriptor, $\lambda$ is a weighting parameter, and ^ indicates that the vector is normalized. J. van de Weijer and C. Schmid. Coloring Local Feature Extraction. ECCV 2006.
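As a small illustration of the concatenation, assuming the combined descriptor is the normalized shape vector followed by the normalized color vector scaled by a weight (called `lam` here), a sketch could look like this. The function name and the example descriptor lengths are made up for the illustration, and both inputs are assumed to be non-zero.

```python
import numpy as np

def combine_shape_color(shape_desc, color_desc, lam=1.0):
    """Concatenate normalized shape and color descriptors; lam weights the color part."""
    s = np.asarray(shape_desc, dtype=float)
    k = np.asarray(color_desc, dtype=float)
    s_hat = s / np.linalg.norm(s)      # normalized shape descriptor
    k_hat = k / np.linalg.norm(k)      # normalized color descriptor
    return np.concatenate([s_hat, lam * k_hat])

# Example: a 128-value shape descriptor and a 36-value color descriptor (arbitrary sizes).
combined = combine_shape_color(np.ones(128), np.arange(1, 37), lam=0.5)
```

With `lam = 0` the color part contributes nothing to Euclidean distances, and larger values shift the balance toward color.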
Proposed Extension using Color • Problem statement • The orientation of a local feature patch is obtained from the monochrome intensity image • The color feature patches on the right have the same grayscale patches as those shown on the left, so they are assigned the same orientation histogram • If we can generate a distinct orientation histogram for each of them, we can further improve the distinctiveness of the shape descriptor, SIFT …
Feature Matching • The original distance metric used for SIFT, PCA-SIFT and GLOH is the bin-to-bin Euclidean distance • Problems: • Sensitive to quantization effects • Sensitive to distortion due to deformation, illumination change and noise
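A toy example of the quantization problem: with a bin-to-bin distance, a histogram whose mass slips into the neighbouring bin looks exactly as different as one whose mass moves far away. The 8-bin histograms below are made up for illustration.

```python
import numpy as np

def bin_to_bin_distance(h1, h2):
    """Euclidean distance that compares each bin only with the same bin."""
    return float(np.linalg.norm(np.asarray(h1, float) - np.asarray(h2, float)))

ref = np.zeros(8);  ref[2] = 1.0     # all mass in bin 2
near = np.zeros(8); near[3] = 1.0    # mass shifted by one bin
far = np.zeros(8);  far[7] = 1.0     # mass shifted by five bins

d_near = bin_to_bin_distance(ref, near)
d_far = bin_to_bin_distance(ref, far)
print(d_near, d_far)                 # the two distances are identical
```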
Feature Matching – Diffusion Distance • Ling and Okada proposed a new distance metric for histogram-based descriptors, called diffusion distance • Sums the values in all layers of a distance pyramid whose layers decrease exponentially in size • 1D case: Gaussian blur in 1 direction • 3D case: Gaussian blur in 3 directions • H. Ling and K. Okada. Diffusion Distance for Histogram Comparison. CVPR 2006.
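A minimal 1D sketch of the pyramid idea: take the difference of the two histograms, then repeatedly blur it with a Gaussian and halve its size, summing the absolute values of every layer. The kernel width (sigma = 0.5, radius 1), the zero padding at the borders and the use of the L1 norm per layer are assumptions of this sketch rather than details confirmed by the slides.

```python
import numpy as np

def gaussian_kernel(sigma=0.5, radius=1):
    """Small normalized Gaussian kernel of length 2 * radius + 1."""
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2 * sigma**2))
    return k / k.sum()

def diffusion_distance_1d(h1, h2, sigma=0.5):
    """Sum of L1 norms over all layers of the blurred, downsampled difference."""
    d = np.asarray(h1, float) - np.asarray(h2, float)
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    total = np.abs(d).sum()                          # layer 0: plain difference
    while d.size > 1:
        blurred = np.convolve(np.pad(d, radius), kernel, mode="valid")
        d = blurred[::2]                             # halve the layer size
        total += np.abs(d).sum()
    return float(total)
```

Unlike a bin-to-bin distance, a one-bin shift now costs less than a shift across many bins, because nearby positive and negative differences cancel after blurring.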
Feature Matching – Performance Evaluation • Same setup as the previous experiment • Recall vs. 1-precision curve for an image pair with affine transformation
Feature Matching – Performance Evaluation • Data set: the synthetic deformation data set from Haibin Ling • The images in the data set and the evaluation method need to be improved
Proposed Extension • Robust aggregations of the histogram, such as the average orientation direction and the center of mass of derivatives, can also be used in the comparison • Diffusion distance can be viewed as a form of comparison using aggregate information • Its aggregation of histogram bins is obtained by repeatedly convolving the histogram with Gaussian kernels • Summing the distances between each pair of aggregations of the two histograms gives the diffusion distance • [Figure: Histogram A and Histogram B compared at 128, 64 and 32 bins] • Aggregation: 1. Average of gradient magnitude over location bins 2. Bin reduction in orientation bins
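One possible reading of the "bin reduction in orientation bins" idea is sketched below. It is only an interpretation of the slide: it assumes a 128-value descriptor laid out as 16 location bins with 8 orientation bins each, merges adjacent orientation bins to obtain 64 and then 32 values, and sums the L1 differences over the three levels. The function names are hypothetical.

```python
import numpy as np

def orientation_pyramid(desc, n_loc=16, n_orient=8):
    """Return the descriptor at 128, 64 and 32 values by merging orientation bins."""
    h = np.asarray(desc, float).reshape(n_loc, n_orient)
    levels = [h]
    while levels[-1].shape[1] > 2:
        prev = levels[-1]
        levels.append(prev[:, 0::2] + prev[:, 1::2])   # merge adjacent orientation bins
    return levels

def aggregated_distance(desc_a, desc_b):
    """Sum of L1 differences between the two descriptors at every aggregation level."""
    pa, pb = orientation_pyramid(desc_a), orientation_pyramid(desc_b)
    return float(sum(np.abs(a - b).sum() for a, b in zip(pa, pb)))
```

Under this scheme, mass that moves between two orientation bins that are merged at the next level is penalized only at the finest level, which is the kind of tolerance to quantization that the aggregation is meant to provide.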
Conclusion and Future Work • Presented • Results of the performance evaluation of some state-of-the-art descriptors and feature matching distance metrics • Possible ways to improve the description and matching steps • TODO • Incorporate color information into local features • Improve the features’ distinctiveness • Design a distance metric for comparing SIFT feature histograms • Invariant to deformation (like diffusion distance) • Improve the features’ distinctiveness
Q & A Thank you very much!
Models of Image Change • Geometry • Rotation • Similarity (rotation + uniform scale) • Affine (scale dependent on direction); valid for: orthographic camera, locally planar object • Photometry • Affine intensity change (I → aI + b)
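The geometric models can be written as 2x2 matrices, which makes the difference between them easy to check through their singular values: a rotation has both equal to 1, a similarity has both equal to the uniform scale, and an affine map has two different values. The parameterization of the affine map below (anisotropic scaling along axes rotated by `phi`, followed by a rotation) is one common choice and is an assumption of this sketch.

```python
import numpy as np

def rotation(theta):
    """2x2 rotation matrix."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def similarity(theta, scale):
    """Rotation plus uniform scale."""
    return scale * rotation(theta)

def affine(theta, phi, sx, sy):
    """Rotation by theta after scaling by (sx, sy) along axes rotated by phi."""
    return rotation(theta) @ rotation(-phi) @ np.diag([sx, sy]) @ rotation(phi)

def photometric(intensity, a, b):
    """Affine intensity change: I -> a * I + b."""
    return a * np.asarray(intensity, float) + b

# Singular values expose how much each model stretches in each direction.
sv_rot = np.linalg.svd(rotation(0.3), compute_uv=False)
sv_sim = np.linalg.svd(similarity(0.3, 2.0), compute_uv=False)
sv_aff = np.linalg.svd(affine(0.3, 0.7, 2.0, 0.5), compute_uv=False)
```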