Foreground Focus: Finding Meaningful Features in Unlabeled Images
Yong Jae Lee and Kristen Grauman
University of Texas at Austin
Supervised learning methods yield good recognition performance in practice. But…
• Supervision is expensive
  • collect training examples, perform labeling, segmentation, etc.
• Supervision has bias
  • variability of the target data may not be captured (i.e., not general enough)
We propose an unsupervised foreground detection and category learning method based on image clustering.
Related Work
• Unsupervised Category Discovery
  • Topic models (pLSA, LDA): Fergus et al., Sivic et al., Quelhas et al., ICCV 2005; Fei-Fei & Perona, CVPR 2005; Liu & Chen, ICCV 2007
  • Image clustering: Grauman & Darrell, CVPR 2006; Dueck & Frey, ICCV 2007
  • Image clustering with localization: Kim et al., CVPR 2008
• Supervised Feature Selection / Part Discovery
  • Discriminative feature selection: Dorko & Schmid, ICCV 2003; Quack et al., ICCV 2007
  • Weakly supervised learning: Weber et al., ECCV 2000; Fergus et al., CVPR 2003; Chum & Zisserman, CVPR 2007…
  • Query expansion: Chum et al., ICCV 2007
Mutual Relationship between Foreground Features and Clusters
• If we have good clusters, we can detect the foreground…
• If we have only foreground features, we can form good clusters…
[Figure: clusters formed from full image matches vs. clusters formed from foreground matches]
Our Approach
• Unsupervised task that iteratively seeks the mutual support between discovered objects and their defining features:
  • Refine feature weights given current clusters
  • Update clusters based on weighted semi-local feature matches
[Figure: feature weights plotted against feature index]
Optimal Partial Matching
Sets of local features:
X = {(f1(X),w1),(f2(X),w2),…,(fn(X),wn)}
Y = {(f1(Y),w1),(f2(Y),w2),…,(fm(Y),wm)}
Earth Mover’s Distance [Rubner et al., IJCV 2000], in its standard form:
\[ \mathrm{EMD}(X, Y) = \frac{\sum_{i}\sum_{j} F_{ij}\, D(f_i(X), f_j(Y))}{\sum_{i}\sum_{j} F_{ij}} \]
• f_i(X), f_j(Y): features from sets X and Y
• D(f_i(X), f_j(Y)): distance between the descriptors
• F_ij: scalars giving the amount of weight mapped from f_i(X) to f_j(Y)
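For reference, the flow in the Earth Mover's Distance is usually obtained by solving a small transportation problem. The block below states the standard constraints from Rubner et al. The flow symbol F_ij and the superscripted weights w_i^{(X)}, w_j^{(Y)} are notation assumed here to keep the two sets apart; they are not taken from the slide.

```latex
\begin{aligned}
\min_{F}\quad & \sum_{i=1}^{n}\sum_{j=1}^{m} F_{ij}\, D\big(f_i(X), f_j(Y)\big) \\
\text{s.t.}\quad & F_{ij} \ge 0, \\
& \sum_{j=1}^{m} F_{ij} \le w_i^{(X)}, \qquad \sum_{i=1}^{n} F_{ij} \le w_j^{(Y)}, \\
& \sum_{i=1}^{n}\sum_{j=1}^{m} F_{ij} = \min\Big(\sum_{i=1}^{n} w_i^{(X)},\ \sum_{j=1}^{m} w_j^{(Y)}\Big)
\end{aligned}
```

Because the total flow equals the smaller of the two total weights, part of the heavier set can stay unmatched, which is what makes the matching partial.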
Feature Contribution to Match
Weight computation is influenced by both the flow (amount of mass transferred) and the distance D(fi(X), fj(Y)) between the matching features:
Contribution = weight / distance
[Figure: features f1(X), f2(X), f3(X) of image X matched to features f1(Y), f2(Y) of image Y; plot of contribution to match against feature index]
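The rule "contribution = weight / distance" can be illustrated with a small sketch. It assumes the flow and distance matrices are already available from the matching step, and it sums a feature's contributions over all features it sends flow to. The function name, the `eps` guard, and the summation over partners are assumptions of this example, not details given on the slide.

```python
def feature_contributions(flow, dist, eps=1e-9):
    """Per-feature contribution of image X's features to a match with Y.

    flow[i][j]: weight moved from feature i of X to feature j of Y.
    dist[i][j]: descriptor distance between those two features.
    eps is a hypothetical guard against division by zero.
    """
    contributions = []
    for flow_row, dist_row in zip(flow, dist):
        # Only pairs that actually exchange weight contribute.
        total = sum(f / (d + eps) for f, d in zip(flow_row, dist_row) if f > 0)
        contributions.append(total)
    return contributions


# Toy match: 3 features in X, 2 features in Y (made-up numbers).
flow = [[0.4, 0.0],
        [0.0, 0.3],
        [0.0, 0.0]]  # the third feature of X is left unmatched
dist = [[0.5, 2.0],
        [2.0, 1.5],
        [3.0, 3.0]]
contrib = feature_contributions(flow, dist)
```

In this toy case the first feature moves the most weight over the shortest distance, so it contributes the most, while the unmatched third feature contributes nothing.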
Computing Feature Weights
[Figure: contribution to match plotted against feature index, giving the new feature weights]
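One plausible way to turn match contributions into new feature weights is sketched below. The slide does not spell out the rule, so the choices here are assumptions: contributions are averaged over the matches to the other images in the same cluster, then normalized to sum to one. The function name is hypothetical.

```python
def update_feature_weights(contribs_per_match):
    """Combine one image's contributions over matches to its cluster mates.

    contribs_per_match: list of lists; row k holds the per-feature
    contributions from matching this image against cluster mate k.
    Averaging and sum-to-one normalization are assumptions of this sketch.
    """
    n_matches = len(contribs_per_match)
    n_features = len(contribs_per_match[0])
    mean = [sum(row[i] for row in contribs_per_match) / n_matches
            for i in range(n_features)]
    total = sum(mean)
    if total == 0:
        # No feature matched anything: fall back to uniform weights.
        return [1.0 / n_features] * n_features
    return [m / total for m in mean]


# Contributions of three features against two cluster mates (made-up numbers).
contribs = [[0.8, 0.2, 0.0],
            [0.6, 0.4, 0.0]]
new_weights = update_feature_weights(contribs)
```

Features that match well across the cluster end up with high weight, and a feature that never matches drops to zero.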
Computing Image Similarity
• Matching features have high weights and high similarity: high contribution to match score
• Matching features have low weights and low similarity: low (negligible) contribution to match score
• One matching feature has a low weight and the other a high weight, with high similarity: low contribution to match score. The amount of weight that is matched is always the smaller of the two feature weights.
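The three cases can be checked numerically for a single pair of matched features. The sketch assumes the pair is considered in isolation, so the matched weight is the smaller of the two weights, and it scores the pair as matched weight divided by descriptor distance. The function name and all numbers are illustrative assumptions.

```python
def pair_contribution(w_x, w_y, distance):
    """Contribution of one matched feature pair to the match score.

    The matched weight is capped by the smaller of the two feature weights;
    a small distance stands for high similarity.
    """
    matched = min(w_x, w_y)
    return matched / distance


high_high = pair_contribution(0.40, 0.40, distance=0.5)  # high weights, similar
low_low = pair_contribution(0.05, 0.05, distance=2.0)    # low weights, dissimilar
low_high = pair_contribution(0.05, 0.40, distance=0.5)   # mixed weights, similar
```

Even with a very similar descriptor, the mixed-weight pair scores far below the high-weight pair, because only the smaller weight can be matched.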
Forming Clusters
• Compute pair-wise partial-matching image similarities
• Normalized Cuts clustering
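To make the clustering criterion concrete, the sketch below evaluates the two-way Normalized Cuts objective of Shi and Malik for a given partition of a small affinity matrix. It only scores a partition; the actual algorithm finds a good partition through an eigenvector relaxation, which is not shown. The affinity values are made up, and treating the pair-wise image similarities directly as affinities is an assumption.

```python
def ncut_value(affinity, part_a):
    """Two-way Normalized Cuts objective: cut/assoc(A,V) + cut/assoc(B,V)."""
    n = len(affinity)
    a = set(part_a)
    b = set(range(n)) - a
    # Total affinity crossing the partition.
    cut = sum(affinity[i][j] for i in a for j in b)
    # Total affinity from each side to all nodes.
    assoc_a = sum(affinity[i][j] for i in a for j in range(n))
    assoc_b = sum(affinity[i][j] for i in b for j in range(n))
    return cut / assoc_a + cut / assoc_b


# Four images: 0 and 1 match each other well, and so do 2 and 3.
similarity = [[1.0, 0.9, 0.1, 0.1],
              [0.9, 1.0, 0.1, 0.1],
              [0.1, 0.1, 1.0, 0.8],
              [0.1, 0.1, 0.8, 1.0]]
good_split = ncut_value(similarity, {0, 1})
bad_split = ncut_value(similarity, {0, 2})
```

The partition that keeps well-matched images together gets the lower objective value, which is the behavior the clustering step relies on.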
Mutual Relationship between Foreground Features and Clusters
• If we have good clusters, we can detect the foreground…
• If we have only foreground features, we can form good clusters…
• Now we have the pieces to do both…
Cluster and Feature Weight Refinement: Iteration 1
• Images as local feature sets → pair-wise partial matching → Normalized Cuts clustering → initial set of clusters
• Compute feature weights → new feature weights
Cluster and Feature Weight Refinement: Iteration 2
• Images as local feature sets with new weights → pair-wise partial matching (noticeable change in matching) → Normalized Cuts clustering → new set of clusters
• Compute feature weights → new feature weights
Cluster and Feature Weight Refinement: Iteration 3
• Pair-wise partial matching + Normalized Cuts → final set of clusters
• New feature weights
[Figure, each iteration: feature weights plotted against feature index]
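The alternation walked through above can be summarized as a short loop skeleton. This is a sketch of the control flow only: the three callables are hypothetical placeholders for the matching, clustering, and re-weighting steps, and the uniform initial weights and the fixed iteration count are assumptions, since the slides do not state how weights are initialized or when to stop.

```python
def foreground_focus_loop(images, n_iters, similarity_fn, cluster_fn, reweight_fn):
    """Alternate between clustering images and re-weighting their features.

    images: list of feature sets (opaque to this function).
    similarity_fn(x, wx, y, wy) -> float   (placeholder for partial matching)
    cluster_fn(sim_matrix) -> list of cluster labels
    reweight_fn(images, weights, labels) -> new per-image weight lists
    """
    # Assumption: start from uniform weights over each image's features.
    weights = [[1.0 / len(img)] * len(img) for img in images]
    labels = None
    for _ in range(n_iters):
        # Pair-wise similarities under the current feature weights.
        sim = [[similarity_fn(x, wx, y, wy) for y, wy in zip(images, weights)]
               for x, wx in zip(images, weights)]
        labels = cluster_fn(sim)
        weights = reweight_fn(images, weights, labels)
    return labels, weights


# Smoke run with trivial stand-ins, only to show the call shape.
images = [["a", "b"], ["c", "d", "e", "f"]]
labels, weights = foreground_focus_loop(
    images, n_iters=3,
    similarity_fn=lambda x, wx, y, wy: 1.0,
    cluster_fn=lambda sim: [0] * len(sim),
    reweight_fn=lambda imgs, w, lab: w,
)
```

Passing the steps in as functions keeps the skeleton independent of any particular matching or clustering implementation.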
Local features may not produce good matches…
[Figure panels: local features; semi-local features; our proximity distribution descriptor]
Lazebnik et al., BMVC 2004; Sivic & Zisserman, CVPR 2004; Agarwal & Triggs, ECCV 2006; Pantofaru et al., Beyond Patches Workshop 2006; Quack et al., ICCV 2007
Experiments
• Goals:
  • Unsupervised foreground discovery
  • Unsupervised category discovery
  • Comparison with related methods
• Datasets: Caltech-101, Microsoft Research Cambridge, Caltech-4
• Semi-local features: densely sampled SIFT, DoG SIFT, Hessian-Affine SIFT
• Number of clusters: number of classes
Quality of Foreground Detection
• Object categories with highest clutter were chosen
• 2 supervised classifiers built: 1) trained on all features, 2) trained on foreground features
• Ranked categories for which segmentation most helped supervised classification
Quality of Foreground Detection
[Figure: 10-class subset, highly weighted features]
Quality of Clusters Formed
• Cluster quality for the 4-class and 10-class sets of Caltech-101
• Quality measure: F-measure
• Black dotted lines indicate the best possible quality that could be obtained if the ground truth segmentation were known
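The slide does not say which variant of the F-measure is used for cluster quality. A common choice for clustering, assumed in the sketch below, matches each true class to its best cluster by the harmonic mean of precision and recall and averages these scores weighted by class size. The function name and the toy labels are illustrative.

```python
from collections import Counter


def clustering_f_measure(true_labels, cluster_labels):
    """Class-size-weighted best-match F-measure (one common clustering variant)."""
    n = len(true_labels)
    class_sizes = Counter(true_labels)
    cluster_sizes = Counter(cluster_labels)
    overlap = Counter(zip(true_labels, cluster_labels))
    total = 0.0
    for cls, n_cls in class_sizes.items():
        best = 0.0
        for clu, n_clu in cluster_sizes.items():
            n_both = overlap[(cls, clu)]
            if n_both == 0:
                continue
            precision = n_both / n_clu
            recall = n_both / n_cls
            best = max(best, 2 * precision * recall / (precision + recall))
        # Each class counts in proportion to its size.
        total += (n_cls / n) * best
    return total


truth = ["A", "A", "A", "B", "B", "B"]
perfect = clustering_f_measure(truth, [0, 0, 0, 1, 1, 1])
imperfect = clustering_f_measure(truth, [0, 0, 1, 1, 1, 1])
```

A clustering that reproduces the classes exactly scores 1, and moving a single image into the wrong cluster lowers the score.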
Comparison with clustering methods
• Affinity Propagation: message-passing algorithm that identifies good exemplars by propagating non-metric affinities [Dueck & Frey, ICCV 2007]
• Partial Match Clusters: forms groups with partial-match spectral clustering but does not iteratively improve foreground feature weights and cluster assignments [Grauman & Darrell, CVPR 2006]
[Results: Caltech-101 subsets, 7-class (N=441) and 20-class (N=1230); Caltech-4 dataset (N=3188), 10 runs with 400 randomly selected images]
Comparison with topic models
• Comparison of accuracy of foreground discovery
• Positive class: Caltech motorcycle class (826 images)
• Negative class: Caltech background class (900 images)
• Foreground detection rate: threshold varied among top 20% most confident features
[1] Correspondence-based pLSA variant [Liu & Chen, ICCV 2007]
[2] pLSA with spatial information [Liu & Chen, CVPR Workshop, 2006]
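A foreground detection rate over the most confident features could be computed along the lines of the sketch below. The exact definition used in the comparison is not given on the slide, so this assumes it is the fraction of the top-weighted features that fall on the ground-truth foreground. The function name and the toy data are hypothetical.

```python
def foreground_detection_rate(weights, is_foreground, top_fraction):
    """Fraction of the top-weighted features that lie on the foreground.

    weights: confidence (feature weight) per feature.
    is_foreground: ground-truth flag per feature.
    top_fraction: share of features to keep, e.g. 0.2 for the top 20%.
    """
    n_keep = max(1, int(round(top_fraction * len(weights))))
    ranked = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    kept = ranked[:n_keep]
    return sum(1 for i in kept if is_foreground[i]) / n_keep


# Ten features with made-up weights and foreground flags.
weights = [0.30, 0.25, 0.15, 0.10, 0.05, 0.05, 0.04, 0.03, 0.02, 0.01]
is_fg = [True, True, False, True, False, False, False, False, False, False]
rate_top20 = foreground_detection_rate(weights, is_fg, 0.2)
rate_top40 = foreground_detection_rate(weights, is_fg, 0.4)
```

Sweeping `top_fraction` over a range of thresholds gives one rate per threshold, which is how a curve over the most confident features could be traced.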
Assumptions and Limitations
• Support of the pattern among multiple examples in the dataset
• Some support must be detected in the initial iteration
• Background can be consistently reoccurring: introduce semi-supervision
Contributions
• Unsupervised foreground feature selection from unlabeled images
• Automatic object category learning
• Mutual reinforcement of foreground and category discovery benefits both
• Novel semi-local descriptor
Future Work
• Incremental updates to unlabeled dataset
• Extension to multi-label cluster assignments
• Automatic model selection: k
• Automatically construct summaries of unstructured image collections
Quality of Foreground Detection and Clusters Formed
• Microsoft Research Cambridge (MSRC) v1 dataset
Proximity Distribution Descriptor
• p: base feature
• Ellipses denote features, their patterns indicate the visual word types, and numbers indicate rank order of spatial proximity to the base feature
• Motivated by Proximity Distribution Kernels [Ling & Soatto, ICCV 2007]
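A descriptor of this kind can be sketched as a table of visual-word counts over the nearest neighbors of the base feature, ordered by spatial rank. The cumulative form used below (row r counts the words among the r nearest neighbors) follows the general idea of proximity distributions and is an assumption, as are the function name, the parameters, and the toy coordinates.

```python
import math


def proximity_distribution(base_xy, neighbors, n_words, max_rank):
    """Cumulative visual-word counts around a base feature.

    neighbors: list of ((x, y), word_id) for the other features in the image.
    Returns up to max_rank rows of length n_words; row r-1 counts the visual
    words among the r spatially nearest neighbors of the base feature.
    """
    # Rank neighbors by Euclidean distance to the base feature.
    ranked = sorted(neighbors, key=lambda item: math.dist(base_xy, item[0]))
    table = []
    counts = [0] * n_words
    for _, word in ranked[:max_rank]:
        counts[word] += 1
        table.append(list(counts))
    return table


# Base feature at the origin with three neighbors (made-up positions and words).
neighbors = [((1.0, 0.0), 2), ((0.0, 3.0), 0), ((2.0, 2.0), 2)]
descriptor = proximity_distribution((0.0, 0.0), neighbors, n_words=3, max_rank=3)
```

Each row adds the next-nearest neighbor's visual word, so the table records both which words surround the base feature and how close they are in rank order.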