This article explores the use of feature triplets for efficient object recognition in large databases. It discusses the importance of spatial relationships between neighboring triplets and proposes a triplet tree model for inference efficiency. It also explores different approaches for creating vocabulary and expanding the tree structure. Examples and discussions are provided to demonstrate the effectiveness of the proposed methods.
Feature triplets for object recognition Larry Zitnick
Text analogies • Words : Topics = Features : Categories • Example words: wood, the, should, beam, doorway, be, reinforce, placed, structure, above, to, blue, for, car, jump, shampoo, this, bag • Example topics: Politics, Biology, Mathematics, Nature • (Baeza-Yates and B. Ribeiro-Neto, 99; Squire et al. 00; Sivic et al. 03; Sivic et al. 05)
Text analogies • “A smaller window is desirable to avoid unwanted smoothing in a disparity map.” • Home improvement or computer vision?
Background clutter • Real images have background clutter… • [Figure: the sentence “a smaller window is desirable to avoid unwanted smoothing in a disparity map” surrounded by unrelated words: year, for, hat, blue, a, as, sky, in, of, nail, up, we, forest, draw] • Words shown alongside: stereo, object, solve, matrix, rectify, train, edge
Spatial relationships • Real images are 2D… • [Figure: the words of the example sentence and the clutter words scattered over a 2D layout]
N-gram model for vision? • Local model: • Global model: Constellation model (Weber et al. 00, Fergus et al. 03) • [Figure label: 1-gram]
“Words” should be: • Discriminative: belong to only a few objects/categories • Predictive: informative of the occurrence and position of neighboring words
Possible words • SIFT: Harris/Hessian-Affine, MSER, etc. (Lowe, 2004; Mikolajczyk and Schmid, 2004; Matas et al., 2004) • Two features: Rotation + scale = affine • Doublets (Sivic et al., 2005)
Feature triplets • Group neighboring features into sets of 3 (Lazebnik et al., 2003) • 100 features ≈ 1,000 triplets
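A minimal sketch of the grouping step, assuming each feature is combined with pairs drawn from its k nearest neighbors. The function name `neighbor_triplets` and the parameter `k` are hypothetical; the slides do not say how neighborhoods are chosen.

```python
from itertools import combinations
from math import dist


def neighbor_triplets(points, k=5):
    """Return a set of index triplets (i, a, b), sorted within each triplet.

    Each feature i is grouped with every pair (a, b) taken from its k
    nearest neighbors. The choice of k is an assumption for illustration.
    """
    triplets = set()
    for i, p in enumerate(points):
        # Indices of the k nearest other features, closest first.
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: dist(p, points[j]),
        )[:k]
        for a, b in combinations(others, 2):
            # Sorting the indices makes the same triplet found from
            # different starting features collapse to one entry.
            triplets.add(tuple(sorted((i, a, b))))
    return triplets
```

With k = 5, each feature proposes C(5, 2) = 10 triplets before duplicates are merged, which is the same order of magnitude as the ratio quoted on the slide.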
Advantages • More discriminative: 3x more descriptors; “computer vision” is more discriminative than “computer” and “vision” • More predictive: robust affine transformation
Outline • Object instance recognition in large databases • Jie Sun (Georgia Tech) • Object category recognition • Xiangyang Lan (Cornell)
Object instance recognition J. Sun, C. L. Zitnick, R. Szeliski • Efficient recognition with large databases (> million objects) • Image centric Query image Training image
Feature invariance • [Figure: SIFT feature space; labels: Affine, Rotation + Scale]
Sampling descriptor patches • [Figure: Image to Canonical frame] • Similar to Brown and Lowe, 2002
Triplet centric • Scale & rotation invariant • [Figure: feature pairs labeled f0 and f1]
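One way a pair of features can give scale and rotation invariance is to express other points in a frame where f0 sits at the origin and f1 at (1, 0). The sketch below is an assumption about how such a frame could be built, not the construction used in the talk; `canonical_coords` is a hypothetical name.

```python
def canonical_coords(f0, f1, p):
    """Coordinates of p in the frame where f0 -> (0, 0) and f1 -> (1, 0).

    Points are treated as complex numbers, so dividing by (f1 - f0)
    removes translation, rotation, and uniform scale in one step.
    """
    z0, z1, zp = complex(*f0), complex(*f1), complex(*p)
    if z0 == z1:
        raise ValueError("f0 and f1 must be distinct")
    w = (zp - z0) / (z1 - z0)
    return (w.real, w.imag)
```

Rotating, scaling, or translating all three points together leaves the returned coordinates unchanged, which is the property the slide names.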
Triplet centric • Feature invariance • [Figure: SIFT feature space; labels: Affine, Rotation + Scale]
Feature vocabulary • K-means clustering • 1,000 clusters = 1,000,000,000 possible triplets • Increased redundancy = higher computational cost • Each feature has a different descriptor for each triplet • Verification: geometric hashing technique
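The triplet count above is consistent with combining three cluster labels, 1,000^3 = 10^9, if the three labels are treated as an ordered tuple. A short sketch of that arithmetic follows; both function names are hypothetical, and the ordered-tuple encoding is an assumption.

```python
def triplet_vocabulary_size(num_clusters, group_size=3):
    """Number of distinct ordered label tuples of length group_size."""
    return num_clusters ** group_size


def triplet_word_id(labels, num_clusters):
    """Pack an ordered tuple of cluster labels into one integer id.

    Treats the labels as digits of a base-num_clusters number, so every
    ordered triplet of labels gets a unique id in [0, num_clusters**3).
    """
    word = 0
    for label in labels:
        if not 0 <= label < num_clusters:
            raise ValueError("label out of range")
        word = word * num_clusters + label
    return word
```

A single integer id per triplet is convenient as a hash-table key when looking triplets up in a large database.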
Object category recognition X. Lan, C. L. Zitnick, R. Szeliski • Feature-based approach to object category recognition • Model based • Bag of words model (Sivic et al. 03) • Constellation model (Weber et al. 00, Fergus et al. 03) • 3D model from features (Rothganger et al. 03)
Our Approach • Use local representation • Model spatial relationships between neighboring triplets • Allows global deformations • [Figure: neighboring triplets t1 and t2]
Spatial Relations: Triplet Tree • Design decision - use triplet tree for inference efficiency.
Two Essential Problems • Compute probability of objects given tree (find the “topic”) • Find tree structure (find the “grammatically correct” structure)
Scoring the object • G: tree graph • O_k: object • t_i: triplet • P(t_i): triplet’s parent
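The scoring formula itself is not in the text. As a sketch only, one common form for a tree-structured model is an object prior times, for each triplet, the probability of that triplet given its parent, summed in log space. This factorization and all names below are assumptions, not the talk's equation. Note that P(t_i) on the slide denotes the parent of t_i, not a probability.

```python
import math


def log_object_score(parent, log_prior, root_prob, transition_prob):
    """Log score of one object O_k for a triplet tree G (assumed form).

    parent:          dict mapping each triplet to its parent (None for root)
    log_prior:       log of the assumed prior for the object
    root_prob:       dict triplet -> probability of the root triplet
    transition_prob: dict (triplet, parent) -> probability of the triplet
                     given its parent, for this object
    """
    total = log_prior
    for triplet, par in parent.items():
        if par is None:
            total += math.log(root_prob[triplet])
        else:
            total += math.log(transition_prob[(triplet, par)])
    return total
```

Because each triplet depends only on its parent, the score is a single pass over the tree, which fits the efficiency argument for choosing a tree.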
Transition Probability • [Figure labels: t_i, t_j, f_ij, Canonical, ?]
Creating Vocabulary • K-means (Sivic et al. 03) • Two features correspond if they are assigned to the same cluster.
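The correspondence rule can be sketched as nearest-centroid assignment followed by a label comparison. The sketch assumes the K-means centroids are already computed; `assign_word` and `corresponding` are hypothetical names.

```python
from math import dist


def assign_word(descriptor, centroids):
    """Index of the centroid closest to the descriptor (Euclidean)."""
    return min(range(len(centroids)), key=lambda c: dist(descriptor, centroids[c]))


def corresponding(desc_a, desc_b, centroids):
    """Two features correspond if they fall in the same cluster."""
    return assign_word(desc_a, centroids) == assign_word(desc_b, centroids)
```

For example, with centroids at (0, 0) and (10, 10), descriptors (1, 1) and (0, 2) correspond, while (1, 1) and (9, 9) do not.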
Expanding tree • Rank potential triplets in queue. • Find most likely triplet given the probability of all the objects. • [Figure: candidate triplets marked “?”]
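The two steps above resemble a greedy best-first expansion with a priority queue. The sketch below assumes a caller-supplied `score(triplet, parent)` that already combines the probabilities over all objects, and a `candidates_of(triplet)` function listing neighboring triplets; both are hypothetical placeholders.

```python
import heapq


def expand_tree(root, candidates_of, score, max_size):
    """Greedily grow a triplet tree; returns a dict triplet -> parent."""
    parent = {root: None}
    heap = []
    counter = 0  # tie-breaker so the heap never compares triplets directly

    def push_candidates(p):
        nonlocal counter
        for t in candidates_of(p):
            if t not in parent:
                # heapq is a min-heap, so negate the score to pop the best first.
                heapq.heappush(heap, (-score(t, p), counter, t, p))
                counter += 1

    push_candidates(root)
    while heap and len(parent) < max_size:
        _, _, t, p = heapq.heappop(heap)
        if t in parent:
            continue  # already attached through a better-scoring parent
        parent[t] = p
        push_candidates(t)
    return parent
```

Each triplet is attached once, to the parent under which it scored highest at the time it was popped, so the result is a tree rather than a general graph.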
Example: Known man Caltech 101
Example: Unknown woman Caltech 101
Example: Harder classes Caltech 101
Example: 3D object instance UIUC database
Example: 3D object instance UIUC database
Discussion • Local constellation model • Triplet approach = tri-gram model? • Tree object models (greedy and efficient) • Alternative: Model interactions between neighboring triplets using MRF and BP. • Model subclasses of objects • How to model background? • Further experiments needed