
Learning Local Image Descriptors

Matthew Brown, University of British Columbia (prev. Microsoft Research). [Collaborators: †Simon Winder, *Gang Hua, †Rick Szeliski; † = MS Research, * = MS Live Labs]

Presentation Transcript


  1. Learning Local Image Descriptors. Matthew Brown, University of British Columbia (prev. Microsoft Research). [Collaborators: †Simon Winder, *Gang Hua, †Rick Szeliski; † = MS Research, * = MS Live Labs]

  2. Applications @MSFT • Panoramic Stitching • Digital Image Pro, Windows Live Photogallery, Expression, HDView • 3D Modelling • Photosynth • Virtual Earth • Location Recognition • Image Search • Lincoln [yellow = product, white = technology preview, grey = research ]

  3. Photosynth [ http://labs.live.com/photosynth ]

  4. Photo Tourism [ http://photour.cs.washington.edu ] Input photographs → Scene reconstruction (sparse correspondence, relative camera positions and orientations, point cloud) → Photo Explorer [ Slide credit: Noah Snavely ]. Photosynth is based on Photo Tourism [ Snavely, Seitz, Szeliski SIGGRAPH 2006 ]. Photo Tourism uses SIFT for correspondence.

  5. multiview stereo = training data [ Seitz et al CVPR 2006, Goesele et al ICCV 2007 ]

  6. Learning Image Features 3D Point Cloud [ Photo Tourism – Snavely, Seitz, Szeliski - SIGGRAPH 2006 ]

  7. Learning Image Features 3D Point Cloud [ Photo Tourism – Snavely, Seitz, Szeliski - SIGGRAPH 2006 ]

  8. Learning Image Features 3D Point Cloud [ Photo Tourism – Snavely, Seitz, Szeliski - SIGGRAPH 2006 ]

  9. Learning Image Features 3D Point Cloud [ Photo Tourism – Snavely, Seitz, Szeliski - SIGGRAPH 2006 ]

  10. Problem Statement: Find a function of a local image patch, descriptor = f(patch), s.t. a nearest neighbour classifier† is optimal*. † = for simplicity + efficiency; * = measured by ROC curve. Q: Form of the descriptor function f(.)?

  11. Descriptor Algorithms Algorithm Normalized Image Patch Descriptor Vector Gradients Quantized to k Orientations Normalize Summation [ SIFT – Lowe ICCV 1999 ]

  12. Descriptor Algorithms Algorithm Normalized Image Patch Descriptor Vector Gradients Quantized to k Orientations Normalize (plus PCA) Summation [ GLOH – Mikolajczyk, Schmid PAMI 2005 ]

  13. Descriptor Algorithms Algorithm Normalized Image Patch Descriptor Vector Create Edge Map Normalize Summation [ Shape Context – Belongie, Malik, Puzicha NIPS 2000 ]

  14. Descriptor Algorithms Algorithm Normalized Image Patch Descriptor Vector T S N Feature Detector Normalize Summation [ Geometric Blur – Berg Malik CVPR 2001 ]

  15. Our Contribution T S N Normalized Image Patch Descriptor Vector Parameters Propose a framework for descriptor algorithms Learn parameters to find best performance Train on a ground truth data set based on accurate 3D matches

  16. T-blocks: Normalized Image Patch (w x h) → T → (w x h x k) → S → N → Descriptor Vector. Transformation block. Output: one length k vector per source pixel. Options: • Local gradients • Steerable filters • Isotropic filters • Haar wavelets • Local classifier • Quantized intensities
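
To make the "one length k vector per source pixel" output concrete, here is a minimal sketch of a local-gradient T-block. It is an illustration only: the function name `t_block_gradients`, the central-difference gradient, and the linear split between neighbouring orientation bins are assumptions, not details given on the slides.

```python
import math

def t_block_gradients(patch, k=4):
    """Map an h x w patch (list of rows) to one length-k vector per pixel.

    Hypothetical gradient T-block: each pixel's gradient magnitude is split
    linearly between the two nearest of k orientation bins.
    """
    h, w = len(patch), len(patch[0])
    out = [[[0.0] * k for _ in range(w)] for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Central differences, clamped at the patch border.
            gx = patch[y][min(x + 1, w - 1)] - patch[y][max(x - 1, 0)]
            gy = patch[min(y + 1, h - 1)][x] - patch[max(y - 1, 0)][x]
            mag = math.hypot(gx, gy)
            if mag == 0.0:
                continue
            # Orientation expressed as a position in [0, k] along the bins.
            pos = (math.atan2(gy, gx) % (2 * math.pi)) / (2 * math.pi) * k
            lo = int(pos) % k
            frac = pos - int(pos)
            out[y][x][lo] += mag * (1.0 - frac)
            out[y][x][(lo + 1) % k] += mag * frac
    return out

# A horizontal intensity ramp: the gradient points along +x everywhere.
ramp = [[0.0, 1.0, 2.0] for _ in range(3)]
features = t_block_gradients(ramp, k=4)
```

For the ramp above, every pixel's response lands in the first orientation bin, and the result has shape 3 x 3 x 4 (w x h x k).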

  17. S-Blocks: Normalized Image Patch (w x h) → T → (w x h x k) → S → (m x k) → N → Descriptor Vector. Spatial summation block with m regions (variants S1, S2, S3, S4). Output: m length k vectors.
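
The reduction from (w x h x k) to (m x k) can be sketched as follows. This is a simplified stand-in, assuming a rectangular n x n grid with hard assignment of each pixel to one region; the slides do not spell out how S1 to S4 weight their regions, and the name `s_block_grid` is hypothetical.

```python
def s_block_grid(features, n=2):
    """Sum per-pixel length-k vectors over an n x n grid of regions.

    Hypothetical rectangular S-block: returns m = n * n vectors of length k.
    Each pixel contributes to exactly one region (no soft weighting).
    """
    h, w, k = len(features), len(features[0]), len(features[0][0])
    regions = [[0.0] * k for _ in range(n * n)]
    for y in range(h):
        for x in range(w):
            # Integer arithmetic maps the pixel to its grid cell.
            r = (y * n // h) * n + (x * n // w)
            for c in range(k):
                regions[r][c] += features[y][x][c]
    return regions

# A 4 x 4 map with k = 1 and every response equal to 1.
ones = [[[1.0] for _ in range(4)] for _ in range(4)]
summed = s_block_grid(ones, n=2)
```

Here each of the m = 4 regions covers four pixels, so each summed vector is `[4.0]`.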

  18. N-Blocks: Normalized Image Patch (w x h) → T → (w x h x k) → S → (m x k) → N → (m x k) → Descriptor Vector. Normalization block. Options: • Unit normalization • SIFT normalization with clipping
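
The two normalization options can be sketched as below, assuming a flattened descriptor with non-negative entries. The clipping threshold of 0.2 is the value from Lowe's SIFT description and is used here only as a default; the slides do not state which threshold was used.

```python
import math

def unit_normalize(v):
    """Scale v to unit Euclidean length (all-zero vectors are returned as-is)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)

def sift_normalize(v, clip=0.2):
    """SIFT-style normalization: unit-normalize, clip large entries, renormalize.

    Assumes non-negative entries; `clip` is an illustrative default.
    """
    clipped = [min(x, clip) for x in unit_normalize(v)]
    return unit_normalize(clipped)

raw = [10.0, 1.0, 1.0, 1.0]
plain = unit_normalize(raw)
robust = sift_normalize(raw)
```

Clipping reduces the influence of a single dominant entry: the first component of `robust` is smaller than that of `plain`, while both vectors have unit length.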

  19. Learning Descriptors T S N

  20. Learning Descriptors (T1a, S2, N2): Training Pairs + Parameters → Descriptor Distances → Correct Match % vs. Incorrect Match % → Update Parameters (Powell)
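
The update step relies on a derivative-free optimizer, which suits an objective like match error that is not differentiable in the descriptor parameters. The sketch below illustrates that kind of loop with a plain coordinate search, used here as a simpler stand-in for Powell's direction-set method rather than an implementation of it. The objective is a toy quadratic, not the real error rate, and all names are hypothetical.

```python
def coordinate_search(objective, params, step=0.5, shrink=0.5, rounds=20):
    """Derivative-free minimization by probing each parameter in turn.

    A simplified stand-in for Powell's method: only objective values are
    used, never gradients. The step is shrunk when a round finds no gain.
    """
    best = list(params)
    best_val = objective(best)
    for _ in range(rounds):
        improved = False
        for i in range(len(best)):
            for delta in (step, -step):
                trial = list(best)
                trial[i] += delta
                val = objective(trial)
                if val < best_val:
                    best, best_val, improved = trial, val, True
        if not improved:
            step *= shrink
    return best, best_val

# Toy objective standing in for "error rate as a function of parameters".
def toy_error(p):
    return (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2

learned, err = coordinate_search(toy_error, [0.0, 0.0])
```

In the real pipeline, `toy_error` would be replaced by a function that builds descriptors with the candidate parameters, computes distances over the training pairs, and returns the resulting error.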

  21. Testing Descriptors (T1a, S2, N2): Test Pairs + learned Parameters → Descriptor Distances → Correct Match % vs. Incorrect Match % → Final Error Rate (Incorrect Match % at 95% Correct Match %)
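
A minimal sketch of how a single error figure can be read off the descriptor distances, taking the "95%" on the slide to be the correct-match rate at which the incorrect-match rate is reported (an assumption about the slide's meaning). The function name and the sample distances are made up for illustration.

```python
def error_at_correct_rate(match_d, nonmatch_d, correct_rate=0.95):
    """Fraction of non-matching pairs accepted at a given correct-match rate.

    The threshold is the smallest matching-pair distance that accepts at
    least `correct_rate` of the true matches.
    """
    ranked = sorted(match_d)
    n = len(ranked)
    threshold = next(d for i, d in enumerate(ranked, start=1)
                     if i / n >= correct_rate)
    false_pos = sum(1 for d in nonmatch_d if d <= threshold)
    return false_pos / len(nonmatch_d)

# Illustrative distances: 20 matching pairs and 4 non-matching pairs.
match_distances = [float(i) for i in range(1, 21)]
nonmatch_distances = [5.0, 25.0, 30.0, 40.0]
error = error_at_correct_rate(match_distances, nonmatch_distances)
```

With these numbers the threshold falls at distance 19.0, which accepts 19 of the 20 matches and one of the four non-matches, giving an error of 0.25.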

  22. Example of Parameter Learning

  23. Results: Changing T-Blocks (k=4). Polar lattice S2 always has lower error rate than rectangular S1. Gradient and DOG with S2 beat our SIFT reference (4% vs 6% error).

  24. Results: Changing T-Blocks (k=8)

  25. Results: Changing T-Blocks (k=16). Steerable filters produce great results if phase information is kept.

  26. Results: Changing S-Blocks

  27. Results: SIFT normalization is important. Best result: 4th order steerable filters with phase information combined with polar S4-25 Gaussian summation block (2% error vs SIFT at 6%). Very large numbers of dimensions.

  28. Dimension Reduction: PCA wPCA

  29. Dimension Reduction: LDA wLDA

  30. Dimension Reduction: LDA wLDA

  31. Dimension Reduction: LDA wLDA

  32. Results: LDA on patches Normalised patches Gradient patches • LDA on pixels ≈ SIFT (6%) • PCA gave small improvement

  33. Results: LDA on patches Effect of # of Training Pairs • LDA on pixels ≈ SIFT (6%) • PCA gave small improvement • Need ~100,000 training examples

  34. Results: LDA on T blocks T1 T3 • LDA on T1-T3 < 4.5% • Optimal #dimensions ~20-30 • Post-normalisation important

  35. Results: LDA on T blocks LDA using T blocks T1–T4 • LDA on T1-T3 < 4.5% • Optimal #dimensions ~20-30 • Post-normalisation important

  36. Results: LDA on descriptors LDA using CVPR 07 descriptors • Overall best results • #dimensions reduced from 100’s to 10’s • Need more challenging dataset!

  37. Discussion: Image Descriptors Algorithm Normalized Image Patch Descriptor Vector “simple” “complex” T S N Feature Detector Normalize Summation

  38. Conclusions: • Used learning to obtain good descriptors • Achieved error rates 1/3 of SIFT • Produced useful ground truth data set. Future Work: • Use multi-view stereo ground truth • Multi-level simple-complex architecture + non-parametric T blocks • Learn interest point detectors [ refs: 1) Winder, Brown CVPR 2007 2) Hua, Brown, Winder ICCV 2007 ] mbrown@cs.ubc.ca

  39. HDView [http://research.microsoft.com/ivm/hdview.htm ]
