
Manifold Learning in the Wild: A New Manifold Modeling and Learning Framework for Image Ensembles. Aswin C. Sankaranarayanan (Rice University), Richard G. Baraniuk, Chinmay Hegde.



Presentation Transcript


  1. Manifold Learning in the Wild: A New Manifold Modeling and Learning Framework for Image Ensembles. Aswin C. Sankaranarayanan (Rice University), Richard G. Baraniuk, Chinmay Hegde

  2. Sensor Data Deluge

  3. Internet Scale Databases • Tremendous size of the corpus of available data • A Google Image Search for “Notre Dame Cathedral” yields 3 million results, roughly 3 TB of data

  4. Concise Models • Efficient processing / compression requires concise representation • Our interest in this talk: Collections of images

  5. Concise Models • Our interest in this talk: collections of images parameterized by q ∈ Q • translations of an object • q: x-offset and y-offset • wedgelets • q: orientation and offset • rotations of a 3D object • q: pitch, roll, yaw

  6. Concise Models • Our interest in this talk: collections of images parameterized by q ∈ Q • translations of an object • q: x-offset and y-offset • wedgelets • q: orientation and offset • rotations of a 3D object • q: pitch, roll, yaw • Image articulation manifold

  7. Image Articulation Manifold • N-pixel images: I_q ∈ R^N • K-dimensional articulation space: q ∈ Q ⊂ R^K • Then {I_q : q ∈ Q} is a K-dimensional manifold in the ambient space R^N • Very concise model • Can be learned using nonlinear dimensionality reduction (figure: images and their articulation parameter space)

  8. Ex: Manifold Learning • LLE, ISOMAP, LE, HE, diffusion geometry, … • K = 1: rotation

  9. Ex: Manifold Learning • K = 2: rotation and scale
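The rotation examples above can be reproduced in miniature. The sketch below is an assumption of this write-up, not code from the talk: it builds a K = 1 rotation manifold from synthetic blob images and runs a minimal ISOMAP (k-nearest-neighbor graph, geodesic distances via shortest paths, classical MDS); the recovered 1-D coordinate tracks the rotation angle.

```python
import numpy as np

def make_image(theta, n=16):
    """Render an n x n image of a Gaussian blob revolving around the center."""
    cx = n / 2 + 5 * np.cos(theta)
    cy = n / 2 + 5 * np.sin(theta)
    yy, xx = np.mgrid[0:n, 0:n]
    return np.exp(-((xx - cx) ** 2 + (yy - cy) ** 2) / 4.0).ravel()

# K = 1 articulation (rotation angle), sampled on an arc so that a
# 1-D embedding can unroll the resulting curve in R^256.
angles = np.linspace(0, 2 * np.pi / 3, 40)
X = np.stack([make_image(t) for t in angles])

# Minimal ISOMAP: kNN graph -> geodesic distances -> classical MDS.
D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
k = 3
G = np.full_like(D, np.inf)
for i in range(len(X)):
    nbrs = np.argsort(D[i])[1:k + 1]          # k nearest neighbors (self excluded)
    G[i, nbrs] = D[i, nbrs]
    G[nbrs, i] = D[i, nbrs]
np.fill_diagonal(G, 0.0)
for m in range(len(X)):                        # Floyd-Warshall shortest paths
    G = np.minimum(G, G[:, m:m + 1] + G[m:m + 1, :])
H = np.eye(len(X)) - 1.0 / len(X)              # double-centering for MDS
B = -0.5 * H @ (G ** 2) @ H
w, V = np.linalg.eigh(B)
embedding = V[:, -1] * np.sqrt(w[-1])          # top eigenpair = 1-D coordinates
```

The dense, even sampling is what makes this work; the next slides show how it breaks down in the wild.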

  10. Smooth IAMs • N-pixel images: I_q ∈ R^N • Local isometry: image distance ≈ articulation parameter space distance • Linear tangent spaces are close approximations locally • Low-dimensional articulation space (figure: articulation parameter space)


  13. Theory/Practice Disconnect: Isometry • Ex: translation manifold: all blue images in the figure are equidistant from the red image • Local isometry is satisfied only when the sampling is dense
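The translation example can be checked numerically. In this sketch (illustrative, not from the talk), an image with a small sharp feature is translated; once the shift exceeds the feature width, the Euclidean distance to the original saturates at a constant, so all sufficiently shifted copies are equidistant and local isometry holds only under dense sampling.

```python
import numpy as np

n = 32
base = np.zeros((n, n))
base[16, 4:8] = 1.0                      # 4-pixel-wide bright patch: sharp edges

def shift(img, dx):
    """Translate the image dx pixels to the right (circularly)."""
    return np.roll(img, dx, axis=1)

# Euclidean distance from the base image to each of its translates.
dists = [np.linalg.norm(base - shift(base, dx)) for dx in range(1, 12)]
```

For shifts of 1 to 3 pixels the patches still overlap and the distance grows; from 4 pixels on there is no overlap and every translate is equally far from the original.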

  14. Theory/Practice Disconnect: Nuisance Articulations • Unsupervised data invariably has additional, undesired articulations • Illumination • Background clutter, occlusions, … • The image ensemble is no longer low-dimensional

  15. Image Representations • Conventional representation for an image: a vector of pixels • Inadequate!

  16. Image Representations • Replace the vector of pixels with an abstract bag of features • Ex: SIFT (Scale-Invariant Feature Transform) selects keypoint locations in an image and computes a keypoint descriptor for each keypoint • Very popular in many vision problems

  17. Image Representations • Replace the vector of pixels with an abstract bag of features • Ex: SIFT (Scale-Invariant Feature Transform) selects keypoint locations in an image and computes a keypoint descriptor for each keypoint • Keypoint descriptors are local; it is very easy to make them robust to nuisance imaging parameters
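A toy stand-in for the bag-of-features idea (hypothetical code, far simpler than real SIFT: no scale or rotation invariance) treats bright local maxima as keypoint locations and raw pixel patches as descriptors.

```python
import numpy as np

def detect_keypoints(img, patch=3, thresh=0.5):
    """Toy bag of features: keypoints = local maxima above a threshold;
    descriptor = the flattened patch x patch pixel window around each one."""
    h, w = img.shape
    r = patch // 2
    kps, descs = [], []
    for y in range(r, h - r):
        for x in range(r, w - r):
            win = img[y - 1:y + 2, x - 1:x + 2]
            if img[y, x] >= thresh and img[y, x] == win.max():
                kps.append((y, x))
                descs.append(img[y - r:y + r + 1, x - r:x + r + 1].ravel())
    return np.array(kps), np.array(descs)

img = np.zeros((20, 20))
img[5, 5] = img[14, 12] = 1.0            # two isolated "features"
kps, descs = detect_keypoints(img)
```

Unlike the pixel vector, this representation is a *set* of (location, descriptor) pairs, which is what the kernels on the following slides operate on.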

  18. Loss of Geometrical Info • Bag-of-features representations hide potentially useful image geometry (figure: image space vs. keypoint space) • Goal: make salient image geometrical info more explicit for exploitation

  19. Key idea • Keypoint space can be endowed with a rich low-dimensional structure in many situations

  20. Key idea • Keypoint space can be endowed with a rich low-dimensional structure in many situations • Mechanism: define kernels between keypoint locations and between keypoint descriptors

  21. Keypoint Kernel • Keypoint space can be endowed with a rich low-dimensional structure in many situations • Mechanism: define kernels between keypoint locations and between keypoint descriptors • The joint keypoint kernel between two images combines these two kernels over all pairs of keypoints

  22. Many Possible Kernels • Euclidean kernel • Gaussian kernel • Polynomial kernel • Pyramid match kernel [Grauman et al. ’07] • Many others

  23. Keypoint Kernel • The joint keypoint kernel between two images combines the location and descriptor kernels over all pairs of keypoints • Using a combination of the Euclidean and Gaussian kernels yields the E/G keypoint kernel of the next slide
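One plausible reading of the joint keypoint kernel, sketched here under the assumption that it sums a location kernel times a descriptor kernel over all keypoint pairs (Gaussian kernels are used for both below; the talk's exact E/G combination may differ):

```python
import numpy as np

def gaussian(a, b, sigma):
    """Gaussian kernel between two vectors."""
    return np.exp(-np.sum((np.asarray(a) - np.asarray(b)) ** 2) / (2 * sigma ** 2))

def joint_kernel(kps1, descs1, kps2, descs2, sigma_loc=5.0, sigma_desc=1.0):
    """Joint keypoint kernel between two images, each given as a set of
    keypoint locations and descriptors: sum over all keypoint pairs of
    (location kernel) * (descriptor kernel)."""
    total = 0.0
    for x1, d1 in zip(kps1, descs1):
        for x2, d2 in zip(kps2, descs2):
            total += gaussian(x1, x2, sigma_loc) * gaussian(d1, d2, sigma_desc)
    return total
```

Because it compares sets of keypoints rather than pixel vectors, the kernel degrades gracefully under occlusion and clutter: missing keypoints remove terms from the sum instead of corrupting every coordinate.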

  24. From Kernel to Metric • Lemma: the E/G keypoint kernel is a Mercer kernel • enables algorithms such as SVMs • Lemma: the E/G keypoint kernel induces a metric on the space of images • an alternative to the conventional L2 distance between images • the keypoint metric is robust to nuisance imaging parameters, occlusion, clutter, etc.
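The second lemma uses the standard construction of a distance from a Mercer kernel: the feature-space distance ||phi(I1) - phi(I2)|| = sqrt(K(I1,I1) + K(I2,I2) - 2 K(I1,I2)). A minimal helper:

```python
import numpy as np

def kernel_metric(K11, K22, K12):
    """Distance induced by a Mercer kernel K: the distance between the
    feature-space embeddings of I1 and I2, given the three kernel values
    K(I1,I1), K(I2,I2), K(I1,I2). Clamped at zero against round-off."""
    return np.sqrt(max(K11 + K22 - 2.0 * K12, 0.0))
```

This is the drop-in replacement for the L2 image distance in everything that follows (manifold learning, nearest-neighbor organization).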

  25. Keypoint Geometry • Theorem: under the metric induced by the kernel, certain ensembles of articulating images form smooth, isometric manifolds • The keypoint representation is compact, efficient, and … • robust to illumination variations, non-stationary backgrounds, clutter, occlusions

  26. Keypoint Geometry • Theorem: under the metric induced by the kernel, certain ensembles of articulating images form smooth, isometric manifolds • In contrast: the conventional approach to image fusion via image articulation manifolds (IAMs) is fraught with non-differentiability (due to sharp image edges) • not smooth • not isometric

  27. Application: Manifold Learning • 2D Translation

  28. Application: Manifold Learning • 2D translation • (figure: IAM vs. KAM embeddings)

  29. Manifold Learning in the Wild • Rice University’s Duncan Hall lobby • 158 images • 360° panorama captured with a handheld camera • Varying brightness, clutter

  30. Manifold Learning in the Wild • Duncan Hall lobby • Ground truth obtained using state-of-the-art structure-from-motion software • (figure: ground truth vs. IAM vs. KAM embeddings)

  31. Manifold Learning in the Wild • Rice University’s Brochstein Pavilion • 400 outdoor images of a building • occlusions, movement in foreground, varying background

  32. Manifold Learning in the Wild • Brochstein Pavilion • 400 outdoor images of a building • Occlusions, movement in the foreground, varying background • (figure: IAM vs. KAM embeddings)

  33. Internet Scale Imagery • Notre-Dame cathedral • 738 images collected from Flickr • Large variations in illumination (night/day/saturation), clutter (people, decorations), and camera parameters (focal length, field of view, …) • Non-uniform sampling of the space

  34. Organization • k-nearest neighbors
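Organizing an ensemble by k-nearest neighbors only needs the pairwise distance matrix (here it would come from the keypoint metric). A minimal sketch, with toy 1-D points standing in for images:

```python
import numpy as np

def knn_graph(D, k):
    """For each point, return the indices of its k nearest neighbors
    under a precomputed pairwise distance matrix D (self excluded)."""
    return np.argsort(D, axis=1)[:, 1:k + 1]

# Toy example: 5 evenly spaced points on a line; each point's nearest
# neighbors are simply its neighbors along the line.
pts = np.array([[0.0], [1.0], [2.0], [3.0], [4.0]])
D = np.abs(pts - pts.T)
nbrs = knn_graph(D, 2)
```

Only local neighborhoods are needed, which is why this scales to Internet-size collections without any global optimization.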

  35. Organization • “Geodesics”: “zoom out”, “walk closer”, 3D rotation

  36. Summary • Challenges for manifold learning in the wild are both theoretical and practical • Need for novel image representations • Sparse features • Robustness to outliers, nuisance articulations, etc. • Learning in the wild: unsupervised imagery • Promise lies in fast methods that exploit only neighborhood properties • No complex optimization required
