Semantic Kernel Forests from Multiple Taxonomies
Sung Ju Hwang (University of Texas at Austin), Fei Sha (University of Southern California), and Kristen Grauman (University of Texas at Austin)
Limitation of status quo recognition
Until recently, most categorization methods relied solely on category labels, treating each instance as an isolated entity.
[Figure: visual world vs. semantic space; classes such as Cat, Dog, Wolf, and Zebra are treated as unrelated points.]
Limitation of status quo recognition
However, semantic entities exist in relation to one another.
[Figure: semantic space in which Cat and Dog are similar (Pet), Wolf is a related Wild Canine, and Zebra is dissimilar.]
Larger and finer-grained datasets → more meaningful relations.
How can we exploit such relations for improved categorization?
[Fergus10] Semantic Label Sharing for Learning with Many Categories, R. Fergus, H. Bernal, Y. Weiss, A. Torralba, ECCV 2010
[Zhao11] Large Scale Category Structure Aware Image Classification, B. Zhao, L. Fei-Fei, E. P. Xing, NIPS 2011
Motivation
Our focus: a semantic taxonomy. But there are potentially two snags:
1) The taxonomy may be only partially aligned with the visual distribution.
2) There is no single 'optimal' taxonomy.
[Figure: three taxonomies over the same four classes (Dalmatian, Wolf, Siamese cat, Leopard): Biological (Animal → Canine/Feline), Appearance (Texture → Spotted/Pointy Corner), and Habitat (Tameness → Domestic/Wild).]
What information should we exploit from multiple taxonomies, and how should we leverage it?
Idea
Exploit multiple semantic taxonomies for visual feature learning:
- Taxonomies encode human merge/split criteria.
- Each taxonomy provides complementary information.
[Figure: the Biological, Appearance, and Habitat taxonomies over Dalmatian, Wolf, Siamese cat, and Leopard, with associated visual cues such as spots, pointy corners, dog-like shape, cat face, indoor settings with people, and woods.]
How do we then
1. learn granularity- and view-specific features on each taxonomy, and
2. combine the learned features across taxonomies for object recognition?
Overview
Goal: Learn and combine features across multiple taxonomies.
[Figure: the three taxonomies (Biological, Appearance, Habitat) with the visual cues associated with each node.]
1. Learn view- and granularity-specific features at each taxonomy.
2. Optimally combine the learned features in a categorization model.
[Hwang11] Learning a Tree of Metrics with Disjoint Visual Features, S. J. Hwang, F. Sha, and K. Grauman, NIPS 2011
Tree of Metrics
How do we learn granularity- and view-specific features?
- Exploit the parent-child relationship to isolate the features used at each node.
[Figure: taxonomy with Carnivore → Canine/Feline; Canine → Dalmatian/Wolf; Feline → Domestic cat/Big cat; Domestic cat → Siamese cat/Persian cat.]
Intuition: Features useful for discriminating the superclasses are less useful for subcategory discrimination.
[Hwang11] Learning a Tree of Metrics with Disjoint Visual Features, S. J. Hwang, F. Sha, and K. Grauman, NIPS 2011
Tree of Metrics
Approach the feature learning problem as hierarchical metric learning with disjoint regularization.
[Figure: taxonomy as above; at the Carnivore node, the learned metric pulls a Canine point x_i toward a same-subclass neighbor x_j and pushes a Feline point x_l away by a margin. In the metric matrix visualization, lighter entries have higher values.]
Given a taxonomy, we learn a metric for each internal (superclass) node n to discriminate between its subclasses.
[Hwang11] Learning a Tree of Metrics with Disjoint Visual Features, S. J. Hwang, F. Sha, and K. Grauman, NIPS 2011
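As a concrete sketch, the per-node metric can be trained with large-margin triplet constraints; the x_i, x_j, x_l and margin in the figure suggest this standard LMNN-style form, though the exact loss in [Hwang11] may differ in details:

d_{M_n}(x, x') = (x - x')^\top M_n (x - x'), \qquad M_n \succeq 0,

d_{M_n}(x_i, x_l) \;\ge\; d_{M_n}(x_i, x_j) + 1

for every triplet where x_j shares x_i's subclass under node n and x_l does not.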
Tree of Metrics
Further, we learn all metrics simultaneously, with two regularizers:
- A sparsity-based regularizer to identify informative features.
- A disjoint regularizer to learn features exclusive to each granularity.
[Figure: taxonomy as above, with a metric attached to each internal node.]
[Hwang11] Learning a Tree of Metrics with Disjoint Visual Features, S. J. Hwang, F. Sha, and K. Grauman, NIPS 2011
Regularization Terms to Learn Compact, Discriminative Metrics
Sparsity regularization: How can we select a few informative features at each node?
- Minimize the sum of the diagonal entries of the metric (see the sketch below).
→ Competition between features within a single metric.
[Hwang11] Learning a Tree of Metrics with Disjoint Visual Features, S. J. Hwang, F. Sha, and K. Grauman, NIPS 2011
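In symbols, with M_n the metric at node n and M_n(d,d) its d-th diagonal entry, the slide's "sum of the diagonal entries" is simply the trace:

\Omega_{\mathrm{sparse}}(M_n) \;=\; \sum_d M_n(d,d) \;=\; \operatorname{tr}(M_n),

which is linear (hence convex) in M_n and, because M_n(d,d) \ge 0 for a PSD matrix, acts like an L1 penalty that drives uninformative diagonal entries to zero.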
Regularization Terms to Learn Compact, Discriminative Metrics
Disjoint regularization: How can we regularize each metric to use features disjoint from those of its ancestors?
- Penalize two metrics for placing large values on the same feature at the same time (see the sketch below).
→ Competition between ancestors and descendants.
Both regularizers are convex.
[Hwang11] Learning a Tree of Metrics with Disjoint Visual Features, S. J. Hwang, F. Sha, and K. Grauman, NIPS 2011
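One convex penalty with exactly this competition property (a plausible reading of the slide; consult [Hwang11] for the exact form used) squares the sum of the nonnegative diagonal entries along each ancestor-descendant path:

\Omega_{\mathrm{disjoint}} \;=\; \sum_d \Big( \sum_{n \in \mathrm{path}} M_n(d,d) \Big)^{2}.

Expanding the square produces cross terms 2\,M_n(d,d)\,M_{n'}(d,d), which stay small only when at most one node on the path uses feature d; convexity holds because each term is a squared linear function of the M_n.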
Overview
Goal: Learn and combine features across multiple taxonomies.
[Figure: the three taxonomies and their associated visual cues, as before.]
1. Learn view- and granularity-specific features at each taxonomy.
2. Optimally combine the learned features in a categorization model.
[Hwang12] S. J. Hwang, F. Sha, K. Grauman, Semantic Kernel Forest from Multiple Taxonomies, NIPS 2012
Semantic Kernel Forest
From multiple ToMs, we obtain a semantic kernel forest: a set of non-linear, view- and granularity-specific feature spaces.
[Figure: the Biological, Appearance, and Habitat taxonomies, with a kernel attached to each internal node.]
We compute an RBF kernel on the distance given by each learned metric (see the sketch below).
[Hwang12] S. J. Hwang, F. Sha, K. Grauman, Semantic Kernel Forest from Multiple Taxonomies, NIPS 2012
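A minimal sketch of this step in Python/NumPy, assuming the per-node metrics have already been learned; the function name, variable names, and the bandwidth gamma are illustrative, not from the paper:

import numpy as np

def semantic_kernel(X, M, gamma=1.0):
    """RBF kernel over the Mahalanobis distance induced by a learned metric M.

    X: (num_samples, num_features) data matrix.
    M: (num_features, num_features) PSD metric learned at one taxonomy node.
    Returns K with K[i, j] = exp(-gamma * (x_i - x_j)^T M (x_i - x_j)).
    """
    XM = X @ M                                          # project once: (n, d)
    sq = np.sum(XM * X, axis=1)                         # x_i^T M x_i for each i
    d2 = sq[:, None] + sq[None, :] - 2.0 * (XM @ X.T)   # pairwise squared distances
    return np.exp(-gamma * np.maximum(d2, 0.0))         # clamp tiny negative rounding errors

# One kernel per internal node of every taxonomy yields the "kernel forest":
# kernels = [semantic_kernel(X, M_n) for M_n in learned_metrics]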
Semantic Kernel Forest
How do we combine the learned kernel forest for optimal discrimination?
[Figure: the three taxonomies with per-node kernels.]
- Obtain a class-specific kernel by linearly combining the kernels on the tree paths to that class, via multiple kernel learning (MKL); a sketch follows.
- Each class considers only a small fraction of the relevant kernels: O(T log N).
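In symbols, writing β for the learned kernel weights (notation assumed here for illustration), the class-specific kernel for class c combines only the kernels on c's root-to-leaf paths across the T taxonomies:

K_c \;=\; \sum_{t=1}^{T} \;\sum_{n \in \mathrm{path}_t(c)} \beta_{c,tn}\, K_{tn}, \qquad \beta_{c,tn} \ge 0,

so with balanced taxonomies of depth O(log N) over N classes, each class touches only O(T log N) of the kernels in the forest.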
Proposed Sparse Hierarchical Regularization
Multiple taxonomies may provide some redundant kernels → interleaved selection of kernels.
[Figure: the Biological and Habitat taxonomies over the same four classes.]
The usual L1 regularization selects a few useful kernels. But are all kernels equal?
Proposed Sparse Hierarchical Regularization
Multiple taxonomies provide redundant kernels, and higher-level kernels discriminate among more categories.
[Figure: the Biological and Habitat taxonomies; root-level kernels ranked above leaf-level kernels.]
- Hierarchical regularization: the weight of a node must be larger than its children's (see the sketch below).
- This implicitly enforces a hierarchical structure among the kernels.
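One standard convex way to encode "a parent's weight should dominate its children's" is a hinge penalty on each parent-child pair; this is a sketch of the idea, and the exact regularizer in [Hwang12] may differ:

\Omega_{H}(\beta) \;=\; \sum_{(p,\,n)\,:\; p = \mathrm{parent}(n)} \big[\, \beta_{c,n} - \beta_{c,p} \,\big]_{+},

where [\cdot]_+ = \max(0, \cdot); each term is zero whenever the parent's weight is at least the child's, and grows linearly otherwise.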
Optimization for Semantic Kernel Forest
We minimize the sum of the MKL objective and the regularization terms: MKL objective + sparsity regularization + hierarchical regularization.
The objective is nonsmooth due to the hierarchical regularization term, so we optimize with the projected subgradient method (a sketch follows).
[Hwang12] S. J. Hwang, F. Sha, K. Grauman, Semantic Kernel Forest from Multiple Taxonomies, NIPS 2012
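A minimal sketch of the projected subgradient loop on the kernel weights, assuming a subgradient oracle for the full objective; the helper `subgradient` and the step-size schedule are illustrative assumptions, not the paper's implementation:

import numpy as np

def projected_subgradient(beta0, subgradient, num_iters=500, step0=0.1):
    """Minimize a convex nonsmooth objective over nonnegative kernel weights.

    beta0:       initial weight vector (one entry per kernel in the forest).
    subgradient: function mapping beta -> a subgradient of the
                 MKL + sparsity + hierarchical objective at beta.
    """
    beta = np.maximum(beta0, 0.0)
    for t in range(1, num_iters + 1):
        g = subgradient(beta)
        beta = beta - (step0 / np.sqrt(t)) * g   # diminishing step size
        beta = np.maximum(beta, 0.0)             # project onto beta >= 0
    return beta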
Datasets
Constructed on different attribute groups:
- AWA-10: 6,180 images, 10 animal classes, fine-grained. Taxonomies: (a) Wordnet, (b) Appearance, (c) Behavior, (d) Habitat.
- Imagenet-20: 28,957 images, 20 non-animal classes, coarser-grained. Taxonomies: (a) Wordnet, (b) Visual, (c) Attributes.
Multiclass Classification Results
We compare to three baselines:
- Raw feature kernel: RBF kernel computed on the original image features.
- Raw feature kernel + MKL: MKL over RBF kernels with different bandwidths.
- Perturbed semantic kernel tree: semantic kernel forest on a randomly permuted taxonomy.
Multiclass Classification Results
Semantic kernel tree (ToM) > perturbed kernel tree:
- Semantic kernel tree + Avg: averaged semantic kernels on a single taxonomy.
- Semantic kernel tree + MKL: MKL on a single taxonomy with only the sparsity regularization.
- Semantic kernel tree + MKL-H: MKL with both sparsity and hierarchical regularization.
→ The true taxonomy gives more meaningful groupings/splits for object categorization.
Multiclass Classification Results
Multiple taxonomies > a single taxonomy:
- Semantic kernel forest + MKL: MKL with kernels learned on multiple taxonomies, with only the sparsity regularization.
- Semantic kernel forest + MKL-H: MKL with both sparsity and hierarchical regularization.
→ Each taxonomy provides complementary information.
Multiclass Classification Results
Hierarchical regularizer > standard L1 regularization:
- It pays to consider the structure of the feature spaces.
- The regularizer's effect is minimal on the single-taxonomy semantic kernel tree, which lacks redundancy.
Confusion matrices on 4 animal classes
[Figure: confusion matrices for the Biological, Appearance, and Habitat taxonomies over Dalmatian, Wolf, Siamese cat, and Leopard; blue = low confusion, red = high confusion.]
Each taxonomy is suboptimal on its own, but provides complementary information that can be optimally leveraged with MKL.
Effect of hierarchical regularization
The hierarchical regularizer avoids overfitting through the implicit structure it enforces among the kernels.
[Figure: learned kernel weights across the Wordnet, Appearance, Behavior, and Habitat taxonomies; higher-level splits (e.g. land/aquatic, predator/prey) receive larger weights than lower-level ones (e.g. feline, procyonid, racoon/rat).]
Accuracy: sparsity regularization only, 34.33; sparsity + hierarchical, 35.67.
Summary
Key message: semantic taxonomies for visual feature learning.
Intuition:
- Features compete between parent and child classes.
- Different semantic views are complementary.
Learning methods (novel regularizers that exploit category relations):
- Tree of Metrics: exploits disjoint sparsity between parent and child classes in a taxonomy, via a disjoint regularizer that learns disjoint parent/child features.
- Semantic Kernel Forest: leverages complementary information from multiple semantic taxonomies, via MKL with a hierarchical regularizer that favors upper-level kernels.
[Hwang11] Learning a Tree of Metrics with Disjoint Visual Features, S. J. Hwang, F. Sha, and K. Grauman, NIPS 2011
[Hwang12] S. J. Hwang, F. Sha, K. Grauman, Semantic Kernel Forest from Multiple Taxonomies, NIPS 2012
Per-class results
A single taxonomy often improves performance on some classes at the expense of others; each individual taxonomy is suboptimal.
[Figure: per-class accuracies. Habitat: better for h. whale, worse for panda. Wordnet: better for panda, worse for h. whale. All taxonomies combined: better for both.]
The semantic kernel forest takes the best of both through the learned combination.
Idea
Learn a non-linear feature space for each view and granularity that splits the categories according to each merge/split criterion:
- Canine vs. Feline
- Spot vs. Pointy corner
- Domestic vs. Wild
[Figure: each split shown with its visual cues, e.g. dog-like shape vs. cat face, spots vs. pointy corners, indoor settings with people vs. woods.]
Idea
Then, combine the feature spaces to obtain an optimally discriminative space for categorization.
[Figure: the per-split feature spaces (Canine vs. Feline, Spot vs. Pointy corner, Domestic vs. Wild) merged into a combined feature space.]
How do we then
- learn such features, and
- optimally combine them?