This text provides an overview of the keypoint-based object recognition process, including the steps of specifying object models, generating hypotheses, scoring hypotheses, and resolving matches. It also covers techniques such as affine-variant point locations and geometric verification.
Keypoint-based Recognition. Computer Vision CS 543 / ECE 549, University of Illinois. Derek Hoiem. 03/04/10
Today • HW 1 • HW 2 (p3) • Recognition
General Process of Object Recognition: Specify Object Model → Generate Hypotheses → Score Hypotheses → Resolution
General Process of Object Recognition. Example: Keypoint-based Instance Recognition (last class). Specify Object Model → Generate Hypotheses → Score Hypotheses → Resolution. [Figure: matched keypoints A1-A3 and B1-B3 between model and test image]
Overview of Keypoint Matching: 1. Find a set of distinctive keypoints. 2. Define a region around each keypoint. 3. Extract and normalize the region content. 4. Compute a local descriptor from the normalized N x N pixel region (e.g., color). 5. Match local descriptors. [Figure: corresponding regions A1-A3 and B1-B3 in two images] K. Grauman, B. Leibe
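A minimal sketch of these five steps, assuming OpenCV's SIFT implementation (cv2.SIFT_create) is available; the image filenames are placeholders.

```python
import cv2

# Placeholder filenames: a training image of the object and a test image.
img_a = cv2.imread("object.jpg", cv2.IMREAD_GRAYSCALE)
img_b = cv2.imread("scene.jpg", cv2.IMREAD_GRAYSCALE)

# Steps 1-4: detect distinctive keypoints, normalize the surrounding regions,
# and compute a local descriptor for each (SIFT does all of this internally).
sift = cv2.SIFT_create()
kp_a, desc_a = sift.detectAndCompute(img_a, None)
kp_b, desc_b = sift.detectAndCompute(img_b, None)

# Step 5: match local descriptors, keeping the two nearest neighbors per query
# descriptor so the ratio test (next slide) can be applied.
matcher = cv2.BFMatcher(cv2.NORM_L2)
knn_matches = matcher.knnMatch(desc_a, desc_b, k=2)
```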
General Process of Object Recognition. Example: Keypoint-based Instance Recognition (this class). Specify Object Model: affine-variant point locations. Generate Hypotheses: affine parameters. Score Hypotheses: # inliers. Resolution: choose the hypothesis with the maximum score above a threshold. [Figure: matched keypoints A1-A3 and B1-B3]
Matching Keypoints • Want to match keypoints between: • Training images containing the object • Database images • Given descriptor x0, find its two nearest neighbors x1, x2 with distances d1, d2 • x1 matches x0 if d1/d2 < 0.8 • This ratio test rejects about 90% of false matches while discarding only about 5% of true matches in Lowe's study
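A small NumPy sketch of the ratio test on this slide; the 0.8 threshold comes from the slide, while the function name and the brute-force nearest-neighbor search are illustrative.

```python
import numpy as np

def ratio_test_matches(desc_query, desc_db, ratio=0.8):
    """For each query descriptor x0, find its two nearest neighbors x1, x2
    (distances d1 <= d2) and accept x1 only if d1/d2 < ratio."""
    matches = []
    for i, x0 in enumerate(desc_query):
        dists = np.linalg.norm(desc_db - x0, axis=1)  # distance to every database descriptor
        j1, j2 = np.argsort(dists)[:2]                # indices of the two nearest neighbors
        if dists[j1] / dists[j2] < ratio:
            matches.append((i, j1))                   # keep (query index, matched database index)
    return matches
```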
Affine Object Model • Accounts for 3D rotation of a surface under orthographic projection • Maps model points to image points by a 2x2 scaling/skew matrix plus a translation • How many matched points do we need?
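A reconstruction of the affine model the slide labels "scaling/skew" and "translation"; the parameter names m1..m4, tx, ty are notational choices, not from the slide.

```latex
\begin{equation*}
\begin{bmatrix} x' \\ y' \end{bmatrix}
=
\underbrace{\begin{bmatrix} m_1 & m_2 \\ m_3 & m_4 \end{bmatrix}}_{\text{scaling/skew}}
\begin{bmatrix} x \\ y \end{bmatrix}
+
\underbrace{\begin{bmatrix} t_x \\ t_y \end{bmatrix}}_{\text{translation}}
\end{equation*}
```

Six unknowns and two equations per correspondence, so at least three matched points are needed to solve for the affine parameters.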
Finding the objects • Get matched points in the database image • Get location/scale/orientation using Hough voting • In training, each point has a known position/scale/orientation w.r.t. the whole object • Bins for x, y, scale, orientation • Loose bins (0.25 object length in position, factor of 2 in scale, 30 degrees in orientation) • Vote for the two closest bin centers in each dimension (16 votes total) • Geometric verification • For each bin with at least 3 keypoints • Iterate between a least-squares fit and checking for inliers and outliers • Report the object if > T inliers (T is typically 3, or can be set by a probabilistic method)
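A rough sketch of the geometric-verification loop above (least-squares affine fit alternated with inlier checking); the function names, residual threshold, and iteration count are assumptions, not values from the slides.

```python
import numpy as np

def fit_affine(model_pts, image_pts):
    """Least-squares fit of [m1 m2 m3 m4 tx ty] mapping model points (x, y)
    to image points (x', y'); both inputs are (N, 2) NumPy arrays."""
    A, b = [], []
    for (x, y), (xp, yp) in zip(model_pts, image_pts):
        A.append([x, y, 0, 0, 1, 0]); b.append(xp)   # x' = m1*x + m2*y + tx
        A.append([0, 0, x, y, 0, 1]); b.append(yp)   # y' = m3*x + m4*y + ty
    params, *_ = np.linalg.lstsq(np.array(A), np.array(b), rcond=None)
    return params

def verify_bin(model_pts, image_pts, n_iters=5, tol=5.0, min_inliers=3):
    """Iterate between fitting the affine parameters and re-checking which
    correspondences are inliers; return the parameters if enough inliers remain."""
    inliers = np.arange(len(model_pts))
    params = None
    for _ in range(n_iters):
        if len(inliers) < 3:                          # need >= 3 matches for 6 unknowns
            return None
        params = fit_affine(model_pts[inliers], image_pts[inliers])
        M, t = params[:4].reshape(2, 2), params[4:]
        residuals = np.linalg.norm(model_pts @ M.T + t - image_pts, axis=1)
        inliers = np.where(residuals < tol)[0]        # reclassify inliers/outliers
    return params if len(inliers) > min_inliers else None
```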
View interpolation • Training • Given images of different viewpoints • Cluster similar viewpoints using feature matches • Link features in adjacent views • Recognition • Feature matches may be spread over several training viewpoints • Use the known links to “transfer votes” to other viewpoints [Lowe01] Slide credit: David Lowe
Applications • Sony Aibo (Evolution Robotics) • SIFT usage • Recognize docking station • Communicate with visual cards • Other uses • Place recognition • Loop closure in SLAM K. Grauman, B. Leibe; slide credit: David Lowe
Location Recognition [Lowe04] [Figure: training images] Slide credit: David Lowe
Fast visual search • “Video Google”, Sivic and Zisserman, ICCV 2003 • “Scalable Recognition with a Vocabulary Tree”, Nister and Stewenius, CVPR 2006
110,000,000 Images in 5.8 Seconds. Slide credit: Nister
Key Ideas • Visual Words • Cluster descriptors (e.g., K-means) • Inverted file index • Quick lookup of files given keypoints • tf-idf (Term Frequency – Inverse Document Frequency) weight of word i in document d: t_i = (n_id / n_d) × log(N / n_i), where n_id = # times word i appears in document d, n_d = # words in document d, N = # documents, and n_i = # documents that contain word i
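A short NumPy sketch of the tf-idf weighting written out above, assuming the visual-word counts per document have already been computed (e.g., by quantizing descriptors against K-means cluster centers).

```python
import numpy as np

def tfidf_weights(word_counts):
    """word_counts: (num_documents, num_words) array, entry (d, i) = n_id.
    Returns t_i = (n_id / n_d) * log(N / n_i) for every document/word pair."""
    N = word_counts.shape[0]                               # number of documents
    n_d = word_counts.sum(axis=1, keepdims=True)           # words per document
    n_i = np.count_nonzero(word_counts, axis=0)            # documents containing each word
    tf = word_counts / np.maximum(n_d, 1)                  # term frequency
    idf = np.log(N / np.maximum(n_i, 1))                   # inverse document frequency
    return tf * idf
```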
Recognition with K-tree Following slides by David Nister (CVPR 2006)
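A rough sketch of the hierarchical k-means ("vocabulary tree") idea behind these slides, using scikit-learn's KMeans; the branch factor, depth, and dictionary-based node layout are illustrative assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans

def build_vocab_tree(descriptors, branch_factor=10, depth=3):
    """Recursively split the descriptors into branch_factor clusters until
    `depth` levels are reached; each leaf acts as one visual word."""
    node = {"centers": None, "children": []}
    if depth == 0 or len(descriptors) < branch_factor:
        return node
    km = KMeans(n_clusters=branch_factor, n_init=10).fit(descriptors)
    node["centers"] = km.cluster_centers_
    node["children"] = [
        build_vocab_tree(descriptors[km.labels_ == c], branch_factor, depth - 1)
        for c in range(branch_factor)
    ]
    return node

def quantize(node, desc):
    """Descend the tree, picking the nearest center at each level; the path of
    chosen branches identifies the visual word for this descriptor."""
    path = []
    while node["children"]:
        c = int(np.argmin(np.linalg.norm(node["centers"] - desc, axis=1)))
        path.append(c)
        node = node["children"][c]
    return tuple(path)
```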
[Performance plots: “Improves Retrieval”, “Improves Speed”; more words is better; comparison across branch factors]
Video Google System (Sivic & Zisserman, ICCV 2003) • Collect all words within the query region • Inverted file index to find relevant frames • Compare word counts • Spatial verification • Demo online at: http://www.robots.ox.ac.uk/~vgg/research/vgoogle/index.html [Figure: query region and retrieved frames] K. Grauman, B. Leibe
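A minimal sketch of the inverted-file lookup and word-count comparison steps listed above; spatial verification of the top-ranked frames would follow. The data structures and function names are assumptions, not the authors' implementation.

```python
from collections import Counter, defaultdict

def build_inverted_index(frame_words):
    """frame_words: dict frame_id -> list of visual-word ids seen in that frame."""
    index = defaultdict(set)
    for frame_id, words in frame_words.items():
        for w in words:
            index[w].add(frame_id)
    return index

def query(index, frame_words, query_words, top_k=10):
    """Collect candidate frames sharing any word with the query region,
    then rank them by overlapping word counts (histogram intersection)."""
    candidates = set()
    for w in set(query_words):
        candidates |= index.get(w, set())
    q = Counter(query_words)
    scores = {}
    for f in candidates:
        fc = Counter(frame_words[f])
        scores[f] = sum(min(q[w], fc[w]) for w in q)
    return sorted(scores, key=scores.get, reverse=True)[:top_k]
```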
Application: Large-Scale Retrieval [Philbin CVPR’07] [Figure: query image and retrieval results on a 5K-image dataset (demo available for 100K)] K. Grauman, B. Leibe
Example Applications • Mobile tourist guide • Self-localization • Object/building recognition • Photo/video augmentation [Figure: Aachen Cathedral] B. Leibe [Quack, Leibe, Van Gool, CIVR’08]