1 / 37

Object Recognition from Local Scale-Invariant Features

Object Recognition from Local Scale-Invariant Features. David G. Lowe Presented by Ashley L. Kapron. Introduction. Object Recognition Recognize known objects in unknown configurations. Previous Work. Zhang et al Harris Corner Detection Detect peaks in local image variation

tucker
Download Presentation

Object Recognition from Local Scale-Invariant Features

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Object Recognition from Local Scale-Invariant Features David G. Lowe Presented by Ashley L. Kapron

  2. Introduction • Object Recognition • Recognize known objects in unknown configurations

  3. Previous Work • Zhang et al • Harris Corner Detection • Detect peaks in local image variation • Schmid and Mohr • Harris Corner Detection • Local image descriptor at each interest pt from an orientation-invariant vector of derivative-of-Gaussian image measurements

  4. Motivation • Limitations of previous work: • Examine image only on a single scale • Current paper addresses this concern by identifying stable key locations in scale space • Identify features that are invariant

  5. Invariance • Illumination • Scale • Rotation • Affine

  6. Scale Space • Different scales are appropriate for describing different objects in the image, and we may not know the correct scale/size ahead of time.

  7. Difference of Gaussian • A = Convolve image with vertical and horizontal 1D Gaussians, σ=sqrt(2) • B = Convolve A with vertical and horizontal 1D Gaussians, σ=sqrt(2) • DOG (Difference of Gaussian) = A – B • Downsample B with bilinear interpolation with pixel spacing of 1.5 (linear combination of 4 adjacent pixels)

  8. B1 A1 Difference of Gaussian Pyramid A3-B3 Blur B3 DOG3 A3 Downsample A2-B2 B2 Blur DOG2 A2 Input Image Downsample A1-B1 Blur DOG1 Blur

  9. Pyramid Example A3 DOG3 B3 A2 B2 DOG3 A1 B1 DOG1

  10. Feature detection • Find maxima and minima of scale space • For each point on a DOG level: • Compare to 8 neighbors at same level • If max/min, identify corresponding point at pyramid level below • Determine if the corresponding point is max/min of its 8 neighbors • If so, repeat at pyramid level above • Repeat for each DOG level • Those that remain are key points

  11. Identifying Max/Min DOG L+1 DOG L DOG L-1

  12. Refining Key List: Illumination • For all levels, use the “A” smoothed image to compute • Gradient Magnitude • Threshold gradient magnitudes: • Remove all key points with MIJ less than 0.1 times the max gradient value • Motivation: Low contrast is generally less reliable than high for feature points

  13. Assigning Canonical Orientation • For each remaining key point: • Choose surrounding N x N window at DOG level it was detected DOG image

  14. Assigning Canonical Orientation • For all levels, use the “A” smoothed image to compute • Gradient Orientation + Gradient Orientation Gradient Magnitude Gaussian Smoothed Image

  15. Assigning Canonical Orientation • Gradient magnitude weighted by 2D gaussian = * Gradient Magnitude 2D Gaussian Weighted Magnitude

  16. Assigning Canonical Orientation • Accumulate in histogram based on orientation • Histogram has 36 bins with 10° increments Weighted Magnitude Sum of Weighted Magnitudes Gradient Orientation Gradient Orientation

  17. Assigning Canonical Orientation • Identify peak and assign orientation and sum of magnitude to key point * Peak Weighted Magnitude Sum of Weighted Magnitudes Gradient Orientation Gradient Orientation

  18. Refining Key List: Rotation • The user may choose a threshold to exclude key points based on their assigned sum of magnitudes.

  19. Example of Refinement Filter for illumination Max/mins from DOG pyramid Filter for edge orientation

  20. Local Image Description • SIFT keys each assigned: • Location • Scale (analogous to level it was detected) • Orientation (assigned in previous canonical orientation steps) • Now: Describe local image region invariant to the above transformations

  21. SIFT key example

  22. Local Image Description For each key point: • Identify 8x8 neighborhood (from DOG level it was detected) • Align orientation to x-axis

  23. Local Image Description • Calculate gradient magnitude and orientation map • Weight by Gaussian

  24. Local Image Description • Calculate histogram of each 4x4 region. 8 bins for gradient orientation. Tally weighted gradient magnitude.

  25. Local Image Description • This histogram array is the image descriptor. (Example here is vector, length 8*4=32. Best suggestion: 128 vector for 16x16 neighborhood)

  26. Database Creation • Index all key points of reference model image(s) • Store key point descriptor vectors in database

  27. Image Matching • Find all key points identified in target image • Each key point will have 2d location, scale and orientation, as well as invariant descriptor vector • For each key point, find similar descriptor vectors in reference image database. • Descriptor vector may match more than one reference image database • The key point “votes” for image(s) • Use best-bin-first algorithm

  28. Hough Transform Clustering • Create 4D Hough Transform (HT) Space for each reference image • Orientation bin = 30° bin • Scale bin = 2 • X location bin = 0.25*ref image width • Y location bin = 0.25*ref image height • If key point “votes” for reference image, tally its vote in 4D HT Space. • This gives estimate of location and pose • Keep list of which key points vote for a bin

  29. Verification • Identify bins with largest votes (must have at least 3). • Using list of key points which voted for a cell, compute affine transformation parameters (m, t) • Use corresponding coordinates of reference model (x,y) and target image (u,v).

  30. Verification • If more than three points, solve in least-squares sense

  31. Verification: Remove Outliers • After applying affine transformation to key points, determine difference between calculated location and actual target image location • Throw out if: • Orientation different by 15° • Scale off by sqrt(2) • X,Y location by 0.2*model size • Repeat least-squares solution until no points are thrown out

  32. SIFT Example

  33. SIFT Example

  34. SIFT example

  35. Advantages of SIFT • Numerous keys can be generated for even small objects • Partial occlusion/image clutter ok because dozens of SIFT keys may be associated with an object, but only need to find 3 • Object models can undergo limited affine projection. • Planar shapes can be recognized at 60 degree rotation away from camera. • Individual features can be matched to a large database of objects

  36. Limitations of SIFT • Fully affine tranformations require additional steps • Many parameters “engineered” for specific application. May need to be evaluated on case-to-case basis

  37. Thank you!

More Related