Towards Bridging Semantic Gap and Intention Gap in Image Retrieval. Attribute-augmented Semantic Hierarchy. Hanwang Zhang 1, Zheng-Jun Zha 2, Yang Yang 1, Shuicheng Yan 1, Yue Gao 1, Tat-Seng Chua 1. 1: National University of Singapore.
Towards Bridging Semantic Gap and Intention Gap in Image Retrieval: Attribute-augmented Semantic Hierarchy. Hanwang Zhang 1, Zheng-Jun Zha 2, Yang Yang 1, Shuicheng Yan 1, Yue Gao 1, Tat-Seng Chua 1. 1: National University of Singapore. 2: Institute of Intelligent Machines, Chinese Academy of Sciences.
What happened? Data Search Engine Query User Large-scale Unstructured
What happened? Semantic Gap Data Search Engine Query User Intention Gap
Bridging Semantic Gap High-level Semantic Semantic Gap Bridged? ontological No! semantic Low-level Visual Feature
Bridging Intention Gap User Intention Intention Gap Bridged? No! Low-level Visual Feature
Challenges Semantics Search Intent Low-level Feature
Solution: Attributes Semantics Search Intent Attributes Low-level Feature
Attributes • Component: snout, ear, etc. • Appearance: furry, brown, etc. • Discriminability: cat or dog?, etc.
Solution: Attribute-augmented Semantic Hierarchy (A2SH) 1 • Semantic hierarchy • Pool of attributes • Concept classifiers • Attribute classifiers 2 • Hierarchical Semantics • Hierarchical Semantic Similarity [Figure: hierarchy nodes Root, Animal, Vehicle, Cat, Dog, Corgi, Pug; path Root, Animal, Dog, Pug; attributes metal, head, wheel, wet, leg, glass, furry, shiny, brown] General framework for Content-based Image Retrieval
A Prototype of A2SH ILSVRC2012 ImageNet • Concepts: 1322 (958 leaves) • Depth: 3 ~ 11 • Images: 1.23 million, 50% training, 50% testing • 95,800 images are manually labeled with 33 attributes • Automatically discovered 2-26 attributes for each concept node • 15 ~ 58 attributes per concept [Figure labels: Tail, Leg]
Why A2SH? • Attributes bridge the semantic gap 1 Smaller Variance (concept, attribute) 2 Descriptive, Transferable (glass, wing, wheel)
Why A2SH? • A2SH defines attributes in the context of concepts, which makes them more informative. Which “Wing”?
Why A2SH? • A2SH bridges the intention gap 1 Intention as attributes through attribute and image feedback [Figure labels: Leg, Skin, Attribute Feedback, Image Feedback] 2 Feedback is automatically digested into multiple levels [Figure labels: Leg, Tail]
Concept Classifiers • Predict whether an image belongs to concept c • Hierarchical one-vs.-all • Exploit hierarchical relation • Alleviate error propagation
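One possible reading of "hierarchical one-vs.-all" is that each concept's classifier is trained against its siblings only, not against every other concept. The sketch below builds such training sets on a toy hierarchy. The sibling-negatives rule, the toy hierarchy, the image ids, and the function name are all assumptions for illustration; the slides do not spell out the procedure.

```python
from typing import Dict, List, Tuple

# Hypothetical toy hierarchy (parent -> children) and per-concept image ids.
CHILDREN: Dict[str, List[str]] = {
    "Root": ["Animal", "Vehicle"],
    "Animal": ["Cat", "Dog"],
    "Dog": ["Corgi", "Pug"],
}
IMAGES: Dict[str, List[str]] = {
    "Animal": ["c1", "p1", "g1"], "Vehicle": ["v1"],
    "Cat": ["c1"], "Dog": ["p1", "g1"],
    "Corgi": ["g1"], "Pug": ["p1"],
}

def sibling_training_set(concept: str,
                         children: Dict[str, List[str]],
                         images: Dict[str, List[str]]) -> Tuple[List[str], List[str]]:
    """Assumed rule: positives are the concept's images, negatives its siblings' images."""
    parent = next(p for p, kids in children.items() if concept in kids)
    positives = list(images[concept])
    negatives = [img for sib in children[parent] if sib != concept
                 for img in images[sib]]
    return positives, negatives

dog_pos, dog_neg = sibling_training_set("Dog", CHILDREN, IMAGES)
pug_pos, pug_neg = sibling_training_set("Pug", CHILDREN, IMAGES)
```

Under this assumption, the "Dog" classifier only has to separate dogs from cats, and the "Pug" classifier only pugs from corgis, which is one way the hierarchical relation could be exploited.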
Attribute Classifiers • Predict the presence of an attribute a of concept c • Nameable attributes: human nameable, hierarchical supervised learning • Unnameable attributes: human unnameable, hierarchical unsupervised learning • Together they offer a comprehensive description of the multiple facets of a concept
Unnameable Attribute Classifiers • Nameable attributes are not discriminative enough. • Discover new attributes for concepts that share many nameable attributes. • 2-26 for each concept. [Figure labels: Ear, Snout, Eye, Furry] D. Parikh, K. Grauman. “Interactively Building a Discriminative Vocabulary of Nameable Attributes”, CVPR 2011.
What do we have now? • Concept classifiers • Semantic path prediction • Attribute classifiers • Image representation along the semantic path Hierarchical Semantic Representation
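Semantic path prediction can be sketched as a greedy top-down descent: starting at the root, follow the child whose concept classifier scores highest until a leaf is reached. The greedy rule, the toy hierarchy, and the made-up scores are assumptions; the slides only state that concept classifiers are used to predict the path.

```python
from typing import Callable, Dict, List

# Hypothetical toy hierarchy (parent -> children), not the ILSVRC2012 one.
HIERARCHY: Dict[str, List[str]] = {
    "Root": ["Animal", "Vehicle"],
    "Animal": ["Cat", "Dog"],
    "Dog": ["Corgi", "Pug"],
}

def predict_semantic_path(score: Callable[[str], float],
                          hierarchy: Dict[str, List[str]],
                          root: str = "Root") -> List[str]:
    """Greedy descent: at each node, move to the highest-scoring child."""
    path = [root]
    node = root
    while hierarchy.get(node):          # stop when the node has no children
        node = max(hierarchy[node], key=score)
        path.append(node)
    return path

# Made-up concept-classifier scores for a single image.
toy_scores = {"Animal": 0.9, "Vehicle": 0.1, "Cat": 0.3,
              "Dog": 0.8, "Corgi": 0.2, "Pug": 0.7}
path = predict_semantic_path(lambda c: toy_scores[c], HIERARCHY)
print(path)
```

The attribute classifiers of each concept on the returned path would then describe the image at that level, giving the hierarchical semantic representation.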
Hierarchical Semantic Similarity Images are represented by attributes in the context of concepts Hierarchical semantic similarity
Local Semantic Metric Same concept close, different concepts far
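A rough sketch of how a hierarchical semantic similarity could be assembled from per-concept attribute representations: compare two images only at the concepts their semantic paths share, and add up the local similarities. Cosine similarity stands in for the learned local semantic metric and the plain sum is an assumed aggregation; neither is taken from the slides, and all names and numbers are hypothetical.

```python
import math
from typing import Dict, List, Sequence

def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity, used here as a stand-in for the learned local metric."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def common_path(p: List[str], q: List[str]) -> List[str]:
    """Shared prefix of two root-to-leaf semantic paths."""
    shared = []
    for a, b in zip(p, q):
        if a != b:
            break
        shared.append(a)
    return shared

def hierarchical_similarity(path_a: List[str], attrs_a: Dict[str, List[float]],
                            path_b: List[str], attrs_b: Dict[str, List[float]]) -> float:
    """Sum of local similarities over the concepts both paths share (assumed rule)."""
    return sum(cosine(attrs_a[c], attrs_b[c])
               for c in common_path(path_a, path_b))

# Toy attribute scores per concept on each image's path.
pug_path = ["Root", "Animal", "Dog", "Pug"]
pug_attrs = {"Root": [1.0, 0.0], "Animal": [1.0, 0.0],
             "Dog": [1.0, 0.0], "Pug": [1.0, 0.0]}
corgi_path = ["Root", "Animal", "Dog", "Corgi"]
corgi_attrs = {"Root": [1.0, 0.0], "Animal": [1.0, 0.0],
               "Dog": [0.0, 1.0], "Corgi": [1.0, 0.0]}
car_path = ["Root", "Vehicle"]
car_attrs = {"Root": [0.0, 1.0], "Vehicle": [1.0, 0.0]}

sim_pug_corgi = hierarchical_similarity(pug_path, pug_attrs, corgi_path, corgi_attrs)
sim_pug_car = hierarchical_similarity(pug_path, pug_attrs, car_path, car_attrs)
```

In this toy setup the pug and corgi images share three concepts and therefore score higher than the pug and car images, which share only the root.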
What do we have now? • Concept classifiers • Semantic path prediction • Attribute classifiers • Image representation along the semantic path • Hierarchical Semantic Similarity Function • Semantic similarity between images Hierarchical Semantic Representation
Automatic Retrieval • Hierarchical semantic similarity • Candidate images are retrieved by semantic indexing. Low complexity! Efficient! [Figure labels: Ic, c, child(c), candidate images]
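Semantic indexing can be pictured as an inverted index from concepts to gallery images, so that a query only has to be compared against images filed under concepts on its own semantic path. The sketch below is one assumed realization: the back-off from the deepest concept to its ancestors, the `min_candidates` parameter, and the toy gallery are hypothetical and not taken from the slides.

```python
from collections import defaultdict
from typing import Dict, List

def build_index(gallery_paths: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """File every gallery image under each concept on its predicted path."""
    index: Dict[str, List[str]] = defaultdict(list)
    for image_id, path in gallery_paths.items():
        for concept in path:
            index[concept].append(image_id)
    return index

def candidates(query_path: List[str], index: Dict[str, List[str]],
               min_candidates: int) -> List[str]:
    """Start at the deepest concept and back off to ancestors until enough images."""
    for concept in reversed(query_path):
        found = index.get(concept, [])
        if len(found) >= min_candidates:
            return list(found)
    return list(index.get(query_path[0], []))  # fall back to everything under the root

# Toy gallery with made-up predicted paths.
gallery = {
    "img1": ["Root", "Animal", "Dog", "Pug"],
    "img2": ["Root", "Animal", "Dog", "Corgi"],
    "img3": ["Root", "Animal", "Cat"],
    "img4": ["Root", "Vehicle"],
}
index = build_index(gallery)
query_path = ["Root", "Animal", "Dog", "Pug"]
one = candidates(query_path, index, min_candidates=1)
two = candidates(query_path, index, min_candidates=2)
three = candidates(query_path, index, min_candidates=3)
```

Only the returned candidates would then be ranked with the hierarchical semantic similarity, which keeps the per-query cost well below a scan of the whole gallery.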
Evaluation • A2SH: our method • hBilinear: retrieves images by bilinear semantic metric (Deng et al., CVPR 2011) • hPath: length (confidence) of the common semantic path of an image and the query • hVisual: hPath + visual similarity • fSemantic: flat semantic feature similarity • fVisual: visual feature similarity Training: 50%, Gallery: 50% (95,800 queries)
Evaluation: Automatic Retrieval Effective! Efficient!
Case Study: Automatic Retrieval [Legend: matched, semantically similar; methods: fVisual, hBilinear, A2SH]
Interactive Retrieval • Image-level Feedback Query
Interactive Retrieval • Attribute-level Feedback Query Leg Cloth Zhang et al. “Attribute Feedback”, MM 2012
Evaluation: Interactive Retrieval 2-min fixed time
Case Study: Interactive Retrieval [Legend: initial, matched, semantically similar; methods: QPM, HF, A2SH]
Summary: Attribute-augmented Semantic Hierarchy (A2SH) • SH with Attributes • Framework for CBIR • Gaps bridging • Effectiveness Verified • 1.23 M Images
Q & A
Nameable Attribute Classifiers selected base
Unnameable Attribute Classifiers confusion matrix
Data Set • Only leaves have images; each concept’s images are merged bottom-up from its descendants • 50%/50% split into training and testing (gallery) • 100 random images per leaf from testing are used as queries • 100 random images from each leaf’s training images are annotated with attributes • Features: color, texture, edge and multi-scale dense SIFT; LLC with max-pooling, 2-level spatial pyramid; 35,903-d feature vector
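The bottom-up merge and the query count can be illustrated on a toy hierarchy. The recursive helper, the hierarchy, and the image ids below are hypothetical; only the figures 958 leaves, 100 queries per leaf, and 95,800 queries come from the slides.

```python
from typing import Dict, List, Set

def merge_bottom_up(node: str,
                    children: Dict[str, List[str]],
                    leaf_images: Dict[str, List[str]],
                    merged: Dict[str, Set[str]]) -> Set[str]:
    """An internal concept's image set is the union of its children's sets."""
    kids = children.get(node, [])
    if not kids:                               # leaf: keeps its own images
        merged[node] = set(leaf_images.get(node, ()))
    else:
        merged[node] = set()
        for kid in kids:
            merged[node] |= merge_bottom_up(kid, children, leaf_images, merged)
    return merged[node]

# Hypothetical toy hierarchy; only leaves carry images.
children = {"Root": ["Animal", "Vehicle"], "Animal": ["Cat", "Dog"],
            "Dog": ["Corgi", "Pug"]}
leaf_images = {"Cat": ["c1"], "Corgi": ["g1"], "Pug": ["p1", "p2"],
               "Vehicle": ["v1"]}
merged: Dict[str, Set[str]] = {}
merge_bottom_up("Root", children, leaf_images, merged)

# Query count implied by the slides: 100 queries for each of the 958 leaves.
num_queries = 958 * 100
```

With 958 leaves and 100 queries per leaf, the product matches the 95,800 queries quoted on the evaluation slide.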
Concept Classifiers 0.93
Nameable Attributes Classifiers 0.92 0.77