Invariant Local Feature for Object Recognition

Invariant Local Feature for Object Recognition Presented by Wyman 2/05/2006

Introduction • Object Recognition • The task of finding 3D objects in 2D images (or even video) and classifying them into one of many known object types • Closely tied to the success of many computer vision applications • Robotics, surveillance, registration, etc. • A difficult problem: no general and comprehensive solution has been found yet

Introduction • Two main streams of approaches: • Model-Based Object Recognition • View-Based Object Recognition • 2D representations of the same object viewed at different angles and distances when available • Extract features (as the representations of object) and compare them to those in the feature database

Matching with Local Features • One possible solution • Matching with invariant local features • Robust to occlusion and cluttered background • cf. global features • Three phases: • Detection (features should be repeatedly detected) • Description (distinctive, invariant) • Matching (accurate, fast)

Research Direction • Study and improve invariant local features • Detection, description and matching • Study and improve object recognition / matching using invariant local features • Areas to improve • Distinctiveness • Invariance • Efficiency

Outline • State-of-the-art techniques • Descriptor • Performance evaluation • Current extension using color • Possible way to improve – Color Orientation • Matching • Cross-bin distance • Performance evaluation • Possible way to improve – Aggregation of Content • Conclusion & Future Work

Performance Evaluation of Descriptors • We aim to compare the performance of three state-of-the-art local feature descriptors: SIFT, PCA-SIFT and GLOH • Same experimental setup as that used in “A Performance Evaluation of Local Descriptors”, TPAMI 2005 • Different evaluation criterion • Different result • In each experiment, each descriptor describes features found by • Harris corner detector • Harris-affine covariant detector • Outputs regions that are invariant to viewpoint change

SIFT – Scale Invariant Feature Transform • Descriptor overview: • Find the local orientation as the dominant gradient direction → Rotation invariant • Compute gradient orientation histograms of several small windows (128 values for each point) relative to the local orientation → Viewpoint invariant • Normalize the descriptor to make it invariant to intensity change → Illumination invariant • D. Lowe. “Distinctive Image Features from Scale-Invariant Keypoints”. IJCV 2004
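
The three descriptor steps above can be sketched in a few lines of plain Python. This is a simplified illustration, not Lowe's implementation: it assumes an 18x18 intensity patch, a 4x4 grid of cells with 8 orientation bins each (4x4x8 = 128 values), and it omits Gaussian weighting, interpolation between bins and rotation of the sampling grid. The function names are made up for this example; the 36-bin orientation histogram and the 0.2 clamping threshold follow the values given in Lowe's paper.

```python
import math

def _gradient(patch, y, x):
    """Central-difference gradient (dx, dy) at an interior pixel."""
    return patch[y][x + 1] - patch[y][x - 1], patch[y + 1][x] - patch[y - 1][x]

def dominant_orientation(patch, bins=36):
    """Centre angle (radians) of the strongest magnitude-weighted orientation bin."""
    hist = [0.0] * bins
    for y in range(1, len(patch) - 1):
        for x in range(1, len(patch[0]) - 1):
            dx, dy = _gradient(patch, y, x)
            angle = math.atan2(dy, dx) % (2 * math.pi)
            hist[int(angle / (2 * math.pi) * bins) % bins] += math.hypot(dx, dy)
    best = max(range(bins), key=lambda b: hist[b])
    return (best + 0.5) * 2 * math.pi / bins

def _unit(vec):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)

def sift_like_descriptor(patch, grid=4, obins=8, clamp=0.2):
    """128-value descriptor from an 18x18 patch (16x16 interior gradients)."""
    size = len(patch) - 2
    cell = size // grid
    theta0 = dominant_orientation(patch)
    desc = [0.0] * (grid * grid * obins)
    for y in range(1, size + 1):
        for x in range(1, size + 1):
            dx, dy = _gradient(patch, y, x)
            # Orientation is measured relative to the dominant direction.
            rel = (math.atan2(dy, dx) - theta0) % (2 * math.pi)
            o = int(rel / (2 * math.pi) * obins) % obins
            cy, cx = (y - 1) // cell, (x - 1) // cell
            desc[(cy * grid + cx) * obins + o] += math.hypot(dx, dy)
    # Normalize, clamp large values, then normalize again.
    desc = [min(v, clamp) for v in _unit(desc)]
    return _unit(desc)
```

Because only gradients enter the histogram and the result is normalized, a patch and a copy of it with intensities changed as aI + b (a > 0) should give the same descriptor up to rounding.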

PCA-SIFT • Rotate the feature region to the dominant gradient direction, same as SIFT • Pre-compute an eigenspace for local gradient patches of size 41x41 • 2x39x39=3042 elements • Only keep 20 components • A more compact descriptor • Sensitive to viewpoint change • Y. Ke and R. Sukthankar. “PCA-SIFT: A More Distinctive Representation for Local Image Descriptors”. CVPR 2004
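
The element count on this slide can be checked with a short sketch: a 41x41 patch has a 39x39 interior, and each interior pixel contributes one horizontal and one vertical gradient. The `project` helper, the zero `mean` and the `basis` below are placeholders for illustration only; a real eigenspace would be learned beforehand from many training patches, which is not shown here.

```python
def gradient_vector(patch):
    """Flatten horizontal and vertical gradients over the patch interior."""
    n = len(patch)
    vec = []
    for y in range(1, n - 1):
        for x in range(1, n - 1):
            vec.append(patch[y][x + 1] - patch[y][x - 1])  # horizontal gradient
            vec.append(patch[y + 1][x] - patch[y - 1][x])  # vertical gradient
    return vec

def project(vec, mean, basis):
    """Project a mean-centred vector onto the rows of a pre-computed basis."""
    centred = [v - m for v, m in zip(vec, mean)]
    return [sum(b * c for b, c in zip(row, centred)) for row in basis]

# Placeholder inputs (assumptions, not data from the presentation).
patch = [[float((x * y) % 7) for x in range(41)] for y in range(41)]
vec = gradient_vector(patch)   # 2 * 39 * 39 values
mean = [0.0] * len(vec)        # hypothetical training mean
# Toy basis that just selects the first 20 coordinates; NOT a real eigenspace.
basis = [[1.0 if j == i else 0.0 for j in range(len(vec))] for i in range(20)]
descriptor = project(vec, mean, basis)
```

With a learned basis in place of the toy one, `descriptor` would be the compact 20-component representation mentioned above.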

GLOH (Gradient Location-Orientation Histogram) • Differs from SIFT in its sampling method • 17 log-polar location bins • 16 orientation bins • Giving 17x16=272 dimensions • Apply PCA, keep 128 components • PCA on orientation histogram (GLOH) vs. PCA on gradient patch (PCA-SIFT) • [Figure: 17 log-polar location bins] • K. Mikolajczyk and C. Schmid. “A Performance Evaluation of Local Descriptors”. TPAMI 2005
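
One way to arrive at 17 location bins is a central disc plus two rings of 8 angular sectors each (1 + 2x8 = 17). The sketch below assumes that layout with ring radii of 6, 11 and 15 pixels, which is how the grid is described in Mikolajczyk and Schmid's paper as far as I recall; the slide itself does not give the radii, and the function name is made up for this example.

```python
import math

RADII = (6.0, 11.0, 15.0)  # assumed ring boundaries in pixels

def gloh_location_bin(dx, dy, angular_bins=8):
    """Map an offset from the patch centre to one of 17 log-polar bins.

    Returns None for offsets outside the outer radius.
    """
    r = math.hypot(dx, dy)
    if r > RADII[2]:
        return None
    if r <= RADII[0]:
        return 0  # the central bin is not split into sectors
    ring = 1 if r <= RADII[1] else 2
    angle = math.atan2(dy, dx) % (2 * math.pi)
    sector = int(angle / (2 * math.pi) * angular_bins) % angular_bins
    return 1 + (ring - 1) * angular_bins + sector
```

Each location bin then holds a 16-bin orientation histogram, which gives the 272 values that PCA reduces to 128.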

Performance Evaluation • Data Set • From Visual Geometry Group • Scale + rotation (bark) • Viewpoint change (graf) • Viewpoint change (wall) • Blurring (bikes) • Illumination change (leuven)

Performance Evaluation • Evaluation Criteria • Match features from the first image to the second one based on the nearest neighbor distance ratio • That is, two features are matched if the first nearest neighbor is much closer than the second nearest neighbor • This is different from the threshold-based criterion used in “A Performance Evaluation of Local Descriptors”, TPAMI 2005 • Count the number of correct matches and the number of false matches obtained for an image pair • The results are plotted as recall versus 1-precision curves • recall = # correct matches / total # possible matches • 1-precision = # false matches / (# correct matches + # false matches)
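
A minimal sketch of the matching criterion and the two plotted quantities is given below. The ratio threshold of 0.8 is an assumption (it is the value suggested in Lowe's paper; the slides do not state the threshold used), and `ground_truth` stands for the set of correct correspondences, which in the actual experiments would come from the known transformation between the images.

```python
import math

def euclidean(a, b):
    """Bin-to-bin Euclidean distance between two descriptors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def match_nndr(desc_a, desc_b, ratio=0.8):
    """Match each descriptor in desc_a to desc_b by nearest neighbor distance ratio."""
    matches = []
    for i, d in enumerate(desc_a):
        dists = sorted((euclidean(d, e), j) for j, e in enumerate(desc_b))
        # Accept only if the nearest neighbor is clearly closer than the second.
        if len(dists) >= 2 and dists[0][0] < ratio * dists[1][0]:
            matches.append((i, dists[0][1]))
    return matches

def recall_and_one_minus_precision(matches, ground_truth):
    """ground_truth: set of (i, j) pairs regarded as correct correspondences."""
    correct = sum(1 for m in matches if m in ground_truth)
    false = len(matches) - correct
    recall = correct / len(ground_truth) if ground_truth else 0.0
    one_minus_precision = false / len(matches) if matches else 0.0
    return recall, one_minus_precision
```

Sweeping `ratio` over a range of values and recording the pair returned by `recall_and_one_minus_precision` for each value would trace out one recall versus 1-precision curve.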

Performance Evaluation • [Figures: result plots for viewpoint change (graf), viewpoint change (wall), scale + rotation (bark), blurring (bikes) and illumination change (leuven)]

Performance Evaluation Result • For accuracy → SIFT • For speed → PCA-SIFT • In a large database → ?

Start from Scratch • Comparison of my descriptor with SIFT • Simply designed vs. carefully designed • Result • SIFT is a carefully designed descriptor; it remains robust as the degree of transformation increases • [Figures: results under increasing illumination change, increasing affine change (two plots) and increasing blur]

Extension using Color • Van de Weijer and Schmid extend local feature descriptors with color information by concatenating a color descriptor, K, to the shape descriptor, S, according to $B = (\hat{S}, \lambda \hat{K})$ • where B is the combined color and shape descriptor, $\lambda$ is a weighting parameter and ^ indicates that the vector is normalized • J. van de Weijer and C. Schmid. “Coloring Local Feature Extraction”. ECCV 2006
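
The concatenation described on this slide can be sketched as follows: both descriptors are normalized to unit length, the color part is scaled by the weighting parameter, and the two are joined. The function names and the sample values are made up for illustration, and unit-length (L2) normalization is an assumption about what "normalized" means here.

```python
import math

def unit(vec):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)

def combine(shape, color, lam):
    """Concatenate the normalized shape descriptor with the weighted, normalized color descriptor."""
    return unit(shape) + [lam * k for k in unit(color)]

# Toy 2-value descriptors, for illustration only.
combined = combine([3.0, 4.0], [0.0, 2.0], lam=0.5)
```

A larger weighting parameter lets the color part contribute more to the distance between two combined descriptors; a value of 0 reduces the comparison to shape alone.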

Proposed Extension using Color • Problem statement • The orientation of a local feature patch is obtained from the monochrome intensity image • The color feature patches on the right have the same grayscale patches as those shown on the left, so they are assigned the same orientation histogram • If we can generate a distinct orientation histogram for each of them, we can further improve the distinctiveness of the shape descriptor, SIFT …

Feature Matching • The original distance metric used with SIFT, PCA-SIFT and GLOH is the bin-to-bin Euclidean distance • Problems: • Sensitive to quantization effects • Sensitive to distortion caused by deformation, illumination change and noise

Feature Matching – Diffusion Distance • Haibin Ling proposed a new distance metric for histogram-based descriptors, called diffusion distance • Sum the values in all layers of a distance pyramid whose layers decrease exponentially in size • [Figure: Gaussian blur in 1 direction for the 1D case; Gaussian blur in 3 directions for the 3D case] • H. Ling and K. Okada. “Diffusion Distance for Histogram Comparison”. CVPR 2006
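
For the 1D case, the pyramid can be sketched like this: start from the bin-wise difference of the two histograms, then repeatedly smooth it, keep every second value, and add up the L1 norm of every layer. The 3-tap kernel (0.25, 0.5, 0.25) standing in for the Gaussian and the zero padding at the borders are assumptions made for this example, not details taken from the slides.

```python
def smooth(values, kernel):
    """Filter a 1D sequence with a small symmetric kernel, zero-padded at the borders."""
    half = len(kernel) // 2
    out = []
    for i in range(len(values)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + k - half
            if 0 <= j < len(values):
                acc += w * values[j]
        out.append(acc)
    return out

def diffusion_distance(h1, h2, kernel=(0.25, 0.5, 0.25)):
    """Sum of L1 norms over all layers of the smoothed, downsampled difference."""
    d = [a - b for a, b in zip(h1, h2)]
    total = sum(abs(v) for v in d)
    while len(d) > 1:
        d = smooth(d, kernel)[::2]  # blur, then halve the number of bins
        total += sum(abs(v) for v in d)
    return total

def l1_distance(h1, h2):
    """Plain bin-to-bin L1 distance, for comparison."""
    return sum(abs(a - b) for a, b in zip(h1, h2))

# Three 8-bin histograms with a single peak at bins 2, 3 and 6.
h_a = [0, 0, 1, 0, 0, 0, 0, 0]
h_near = [0, 0, 0, 1, 0, 0, 0, 0]
h_far = [0, 0, 0, 0, 0, 0, 1, 0]
```

On these toy histograms the bin-to-bin distance cannot tell a one-bin shift from a four-bin shift, whereas `diffusion_distance(h_a, h_near)` comes out smaller than `diffusion_distance(h_a, h_far)`, which is the cross-bin behavior the slide is after.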

Feature Matching – Performance Evaluation • Same setup as the previous experiment • Recall vs. 1-precision curve for an image pair with affine transformation

Feature Matching – Performance Evaluation • Data set: the synthetic deformation data set from Haibin Ling • The images in the data set and the evaluation method need to be improved

Proposed Extension • Robust aggregations of the histogram, such as the average orientation direction and the center of mass of derivatives, can also be used in the comparison • Diffusion distance can be viewed as a form of comparison using aggregate information • Its aggregation of histogram bins is obtained by repeatedly convolving the histogram with Gaussian kernels • Summing the distances between each pair of aggregations of the two histograms gives the diffusion distance • [Figure: Histogram A and Histogram B, each reduced from 128 bins to 64 bins to 32 bins] • Aggregation: 1. Average of gradient magnitude over location bins 2. Bin reduction in orientation bins

Conclusion and Future Work • Presented • Results of a performance evaluation of some state-of-the-art descriptors and feature matching distance metrics • Possible ways to improve the description and matching steps • TODO • Incorporate color information into local features • Improve feature distinctiveness • Design a distance metric for comparing SIFT feature histograms • Invariant to deformation (like diffusion distance) • Improve feature distinctiveness

Q & A Thank you very much!

Models of Image Change • Geometry • Rotation • Similarity (rotation + uniform scale) • Affine (scale dependent on direction), valid for: orthographic camera, locally planar object • Photometry • Affine intensity change (I → aI + b)
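
For the photometric model, one common way to cancel an affine intensity change is to subtract the patch mean and divide by the standard deviation: the offset b drops out with the mean and the gain a (for a > 0) drops out with the standard deviation. The sketch below illustrates that idea only; it is not claimed to be the normalization used in the experiments, and the function name is made up.

```python
def normalize_intensity(patch):
    """Return the patch with zero mean and unit standard deviation."""
    values = [v for row in patch for v in row]
    mean = sum(values) / len(values)
    std = (sum((v - mean) ** 2 for v in values) / len(values)) ** 0.5
    if std == 0:
        # A flat patch carries no structure; return zeros.
        return [[0.0 for _ in row] for row in patch]
    return [[(v - mean) / std for v in row] for row in patch]

# Toy patch and a copy changed as I -> aI + b with a = 3, b = 5.
original = [[1.0, 2.0], [4.0, 7.0]]
changed = [[3.0 * v + 5.0 for v in row] for row in original]
```

Up to rounding, `normalize_intensity(original)` and `normalize_intensity(changed)` give the same values.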
