Learning Local Image Descriptors • Matthew Brown, University of British Columbia, (prev.) Microsoft Research [ Collaborators: †Simon Winder, *Gang Hua, †Rick Szeliski; † = MS Research, * = MS Live Labs ]
Applications @MSFT • Panoramic Stitching • Digital Image Pro, Windows Live Photo Gallery, Expression, HDView • 3D Modelling • Photosynth • Virtual Earth • Location Recognition • Image Search • Lincoln [ yellow = product, white = technology preview, grey = research ]
Photosynth [ http://labs.live.com/photosynth ]
Photo Tourism [ http://photour.cs.washington.edu ] • Input photographs → Scene reconstruction → Photo Explorer • Scene reconstruction outputs: relative camera positions and orientations, point cloud, sparse correspondence [ Slide credit: Noah Snavely ] • Photosynth is based on Photo Tourism [ Snavely, Seitz, Szeliski SIGGRAPH 2006 ] • Photo Tourism uses SIFT for correspondence
multiview stereo = training data [ Seitz et al CVPR 2006, Goesele et al ICCV 2007 ]
Learning Image Features 3D Point Cloud [ Photo Tourism – Snavely, Seitz, Szeliski - SIGGRAPH 2006 ]
Problem Statement • Find a function of a local image patch, descriptor = f(patch), s.t. a nearest neighbour classifier† is optimal* • † = for simplicity + efficiency • * = measured by ROC curve • Q: Form of the descriptor function f(.)?
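The problem statement can be made concrete with a small sketch. The descriptor below is only a hypothetical stand-in for f(.) (a flattened, unit-length patch), not one of the descriptors from the talk; the point is how a distance threshold turns descriptor distances into one point on the ROC curve (correct match fraction vs incorrect match fraction).

```python
import math

def descriptor(patch):
    """Hypothetical stand-in for f(.): flatten the patch, scale to unit length."""
    flat = [float(v) for row in patch for v in row]
    norm = math.sqrt(sum(v * v for v in flat)) or 1.0
    return [v / norm for v in flat]

def distance(d1, d2):
    """Euclidean distance between two descriptor vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(d1, d2)))

def is_match(patch_a, patch_b, threshold):
    """Declare a match when the descriptors are closer than the threshold."""
    return distance(descriptor(patch_a), descriptor(patch_b)) < threshold

def roc_point(pairs, labels, threshold):
    """Return (correct match fraction, incorrect match fraction) at one threshold.

    pairs  : list of (patch_a, patch_b)
    labels : list of bools, True when the pair shows the same 3D point
    """
    tp = fp = pos = neg = 0
    for (a, b), same in zip(pairs, labels):
        predicted = is_match(a, b, threshold)
        if same:
            pos += 1
            tp += predicted
        else:
            neg += 1
            fp += predicted
    return tp / pos, fp / neg
```

Sweeping the threshold from small to large traces out the full ROC curve; a better f(.) pushes that curve towards high correct match rates at low incorrect match rates.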
Descriptor Algorithms • Normalized Image Patch → Algorithm → Descriptor Vector • Algorithm: Gradients Quantized to k Orientations → Summation → Normalize [ SIFT – Lowe ICCV 1999 ]
Descriptor Algorithms • Normalized Image Patch → Algorithm → Descriptor Vector • Algorithm: Gradients Quantized to k Orientations → Summation → Normalize (plus PCA) [ GLOH – Mikolajczyk, Schmid PAMI 2005 ]
Descriptor Algorithms • Normalized Image Patch → Algorithm → Descriptor Vector • Algorithm: Create Edge Map → Summation → Normalize [ Shape Context – Belongie, Malik, Puzicha NIPS 2000 ]
Descriptor Algorithms • Normalized Image Patch → Algorithm → Descriptor Vector • Algorithm: Feature Detector (T) → Summation (S) → Normalize (N) [ Geometric Blur – Berg, Malik CVPR 2001 ]
Our Contribution • Normalized Image Patch → T → S → N → Descriptor Vector (blocks controlled by Parameters) • Propose a framework for descriptor algorithms • Learn parameters to find best performance • Train on a ground truth data set based on accurate 3D matches
T-Blocks • Normalized Image Patch (w x h) → T → (w x h x k) → S → N → Descriptor Vector • Transformation block • Output: one length k vector per source pixel • Options: Local gradients • Steerable filters • Isotropic filters • Haar wavelets • Local classifier • Quantized intensities
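As an illustration of a T-block, here is a minimal sketch of the "local gradients" option: each pixel's gradient is assigned to one of k orientation bins, giving one length k vector per source pixel. The function name, the central-difference gradient, and the hard bin assignment (no interpolation between neighbouring bins) are assumptions made for brevity, not details taken from the slides.

```python
import math

def t_block_gradients(patch, k=4):
    """Map a w x h patch to a w x h x k array of quantized gradient responses.

    patch : list of rows of pixel intensities
    k     : number of orientation bins covering 0..2*pi
    """
    h, w = len(patch), len(patch[0])
    out = [[[0.0] * k for _ in range(w)] for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Central differences, clamped at the patch border.
            gx = patch[y][min(x + 1, w - 1)] - patch[y][max(x - 1, 0)]
            gy = patch[min(y + 1, h - 1)][x] - patch[max(y - 1, 0)][x]
            mag = math.hypot(gx, gy)
            if mag == 0.0:
                continue  # flat region: all k responses stay zero
            angle = math.atan2(gy, gx) % (2 * math.pi)
            bin_index = int(angle / (2 * math.pi) * k) % k
            out[y][x][bin_index] = mag  # hard assignment to a single bin
    return out
```

For a 3 x 3 horizontal intensity ramp, every pixel's response lands in bin 0 (gradient pointing along +x), and the output has shape 3 x 3 x k.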
S-Blocks • Normalized Image Patch (w x h) → T → (w x h x k) → S → (m x k) → N → Descriptor Vector • Spatial summation block with m regions (variants S1, S2, S3, S4) • Output: m length k vectors
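A minimal sketch of an S-block, assuming the simplest layout: a rectangular n x n grid of non-overlapping cells (so m = n * n), with every pixel's length k vector added to exactly one cell. The polar and Gaussian-weighted variants mentioned in the results are not reproduced here; the function name and grid layout are illustrative assumptions.

```python
def s_block_grid(features, n=2):
    """Sum a w x h x k feature array over an n x n grid, giving m = n*n vectors.

    features : list (rows) of lists (pixels) of length-k lists
    returns  : list of m lists, each of length k
    """
    h, w = len(features), len(features[0])
    k = len(features[0][0])
    sums = [[0.0] * k for _ in range(n * n)]
    for y in range(h):
        for x in range(w):
            # Integer arithmetic picks the grid cell that contains pixel (x, y).
            region = (y * n // h) * n + (x * n // w)
            for c in range(k):
                sums[region][c] += features[y][x][c]
    return sums
```

With a 4 x 4 input and n = 2, each of the 4 regions sums 4 pixels, so constant features [1, 2] produce [4, 8] in every region.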
N-Blocks • Normalized Image Patch (w x h) → T → (w x h x k) → S → (m x k) → N → (m x k) Descriptor Vector • Normalization block • Options: Unit normalization • SIFT normalization with clipping
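Both N-block options can be sketched in a few lines, treating the m x k output of the S-block as one flat vector. The clipping threshold of 0.2 is the value from Lowe's SIFT paper and is an assumption here, since the slides do not state which threshold was used.

```python
import math

def unit_normalize(vec):
    """Scale a vector to unit Euclidean length (zero vectors are returned as-is)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)

def sift_normalize(vec, clip=0.2):
    """SIFT-style normalization: unit length, clip large entries, renormalize.

    clip : threshold applied after the first normalization (assumed 0.2)
    """
    clipped = [min(v, clip) for v in unit_normalize(vec)]
    return unit_normalize(clipped)
```

Clipping limits the influence of a few very large entries, so a single strong gradient cannot dominate the descriptor distance.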
Learning Descriptors T S N
Learning Descriptors • Training Pairs → T1a → S2 → N2 → Descriptor Distances → ROC curve (Correct Match % vs Incorrect Match %) → Update Parameters (Powell) → Parameters
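The learning loop evaluates the error for the current parameters and then updates them without using derivatives. The sketch below is not Powell's method itself: it is a simpler coordinate-wise search that stands in for it, to show the shape of the loop. The `objective` callable is a placeholder for "compute descriptors with these parameters, measure distances on the training pairs, return the error".

```python
def coordinate_search(objective, x0, step=1.0, shrink=0.5, tol=1e-4):
    """Derivative-free minimization by probing one parameter at a time.

    objective : callable mapping a parameter list to a scalar error
    x0        : initial parameter list
    step      : initial probe size; multiplied by `shrink` when no probe helps
    returns   : (best parameters, best objective value)
    """
    x = list(x0)
    best = objective(x)
    while step > tol:
        improved = False
        for i in range(len(x)):
            for delta in (step, -step):
                trial = list(x)
                trial[i] += delta
                value = objective(trial)
                if value < best:
                    # Accept the first probe that lowers the error.
                    x, best, improved = trial, value, True
                    break
        if not improved:
            step *= shrink  # refine the search around the current point
    return x, best
```

With a toy quadratic such as `lambda p: (p[0] - 1) ** 2 + (p[1] + 2) ** 2` started from `[0.0, 0.0]`, the search ends at `[1.0, -2.0]`. Powell's method additionally builds new search directions from previous steps, which this sketch omits.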
Testing Descriptors • Test Pairs → T1a → S2 → N2 (learned Parameters) → Descriptor Distances → ROC curve (Correct Match % vs Incorrect Match %) → Final Error Rate = Incorrect Match % at 95% Correct Match %
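A sketch of how a single error figure can be read off the ROC curve, assuming the 95% label on the slide refers to the operating point at which 95% of true matches are accepted. The function name and the tie handling (`<=` at the threshold) are illustrative choices.

```python
import math

def error_at_correct_rate(match_dists, nonmatch_dists, correct_rate=0.95):
    """Fraction of non-matching pairs accepted when `correct_rate` of matches are.

    match_dists    : descriptor distances for pairs of the same 3D point
    nonmatch_dists : descriptor distances for pairs of different 3D points
    """
    ranked = sorted(match_dists)
    # Smallest threshold that accepts at least correct_rate of the true matches.
    needed = math.ceil(correct_rate * len(ranked))
    threshold = ranked[needed - 1]
    wrong = sum(1 for d in nonmatch_dists if d <= threshold)
    return wrong / len(nonmatch_dists)
```

For example, with 20 match distances 1..20 the threshold falls at the 19th smallest distance, and any non-match distance at or below it counts as an incorrect match.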
Results: Changing T-Blocks (k=4) • Polar lattice S2 always has lower error rate than rectangular S1 • Gradient and DoG with S2 beat our SIFT reference (4% vs 6% error)
Results: Changing T-Blocks (k=16) • Steerable filters produce great results if phase information is kept
Results • SIFT normalization is important • Best result: 4th order steerable filters with phase information combined with polar S4-25 Gaussian summation block (2% error vs SIFT at 6%) • Very large numbers of dimensions
Results: LDA on patches [ figure panels: Normalised patches, Gradient patches ] • LDA on pixels ≈ SIFT (6%) • PCA gave small improvement
Results: LDA on patches [ plot: Effect of # of Training Pairs ] • LDA on pixels ≈ SIFT (6%) • PCA gave small improvement • Need ~100,000 training examples
Results: LDA on T blocks [ figure panels: T1, T3 ] • LDA on T1–T3: error < 4.5% • Optimal #dimensions ~20–30 • Post-normalisation important
Results: LDA on T blocks [ plot: LDA using T blocks T1–T4 ] • LDA on T1–T3: error < 4.5% • Optimal #dimensions ~20–30 • Post-normalisation important
Results: LDA on descriptors [ plot: LDA using CVPR 07 descriptors ] • Overall best results • #dimensions reduced from 100’s to 10’s • Need more challenging dataset!
Discussion: Image Descriptors • Normalized Image Patch → Algorithm → Descriptor Vector • Algorithm: Feature Detector (T) → Summation (S) → Normalize (N) [ diagram labels: “simple”, “complex” ]
Conclusions • Used learning to obtain good descriptors • Achieved error rates 1/3 of SIFT • Produced useful ground truth data set • Future Work • Use multi-view stereo ground truth • Multi-level simple-complex architecture • + non-parametric T blocks • Learn interest point detectors [ refs: 1) Winder, Brown CVPR 2007 2) Hua, Brown, Winder ICCV 2007 ] mbrown@cs.ubc.ca
HDView [http://research.microsoft.com/ivm/hdview.htm ]