Content-based Image Retrieval: Approaches and Trends of the New Age. Ritendra Datta, Jia Li, James Z. Wang. The Pennsylvania State University – University Park. ACM International Conference on Multimedia, November 2005.
CBIR: Open Problems • What is an acceptable similarity measure?
CBIR: Open Problems • Can the images be automatically annotated with meaningful labels? [Example image labels: Elegance, Love, Symmetry, Flower, Petals, Tower, France, Corolla, Rose, Australian Floribunda Rose, Eiffel Tower, Paris]
CBIR: Basic Approach [Diagram: Image Database and Query → Extract Features → Compute Similarity → Rank images, with a Relevance Feedback loop]
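The pipeline on this slide (extract features, compute similarity, rank) can be sketched in a few lines. This is only an illustrative sketch, not any system cited in the talk: it assumes a global RGB color histogram as the feature and Euclidean distance as the similarity measure, and the names `color_histogram`, `rank_images`, and the toy `database` are hypothetical.

```python
from math import sqrt

def color_histogram(pixels, bins=4):
    """Global feature (assumed for illustration): normalized joint RGB histogram.
    `pixels` is a list of (r, g, b) tuples with values in 0..255."""
    hist = [0.0] * (bins ** 3)
    step = 256 // bins
    for r, g, b in pixels:
        # Clamp so that bin indices stay valid when bins does not divide 256.
        ri, gi, bi = (min(c // step, bins - 1) for c in (r, g, b))
        hist[ri * bins * bins + gi * bins + bi] += 1.0
    total = sum(hist) or 1.0
    return [h / total for h in hist]

def euclidean(u, v):
    """Similarity measure (assumed): plain Euclidean distance between features."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def rank_images(query_pixels, database, distance=euclidean):
    """Return database image names ordered from most to least similar."""
    q = color_histogram(query_pixels)
    scored = [(distance(q, color_histogram(px)), name)
              for name, px in database.items()]
    return [name for _, name in sorted(scored)]

# Toy database of three 16-pixel "images" (hypothetical data).
database = {
    "red": [(250, 10, 10)] * 16,
    "blue": [(10, 10, 250)] * 16,
    "pink": [(250, 10, 10)] * 12 + [(250, 200, 200)] * 4,
}
ranking = rank_images([(240, 20, 20)] * 16, database)
print(ranking)
```

Relevance feedback would close the loop by adjusting the query feature or the distance function from user judgments; that step is left out of this sketch.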
CBIR: Approaches in the past • Past = prior to 2000 • Systems: • Commercial • IBM QBIC (1995) • Virage (1997) • Academic • MIT Photobook (1994) • Columbia VisualSEEk (1996) • UCSB NeTra (1997) • Stanford WBIIS (1998) • Berkeley Blobworld (1999) • Summary: A. W. Smeulders, M. Worring, S. Santini, A. Gupta, R. Jain, Content-Based Image Retrieval at the End of the Early Years, IEEE Trans. Pattern Analysis and Machine Intelligence, 22(12):1349-1380, 2000.
CBIR: Current Trends • Less than 20% concerned with applications/real-world systems • No consensus on any one approach to image understanding • Unlikely to reach one in the near future • Focus may change towards • Useful systems • Domain-specific systems • Synonymous with text-based search engines
CBIR: Approaches of the New Age • Feature Extraction • Global Features • Local Features • Search and Retrieval • Similarity measures • Region-based approaches • Classification / Clustering • Annotation • Classification Models • Joint models • Relevance Feedback and Learning • Hardware Support • Interface and visualization [Diagram: Extract Features → Compute Similarity → Rank images, with Relevance Feedback]
Feature Extraction: Color and Texture • Region-based dominant color descriptor + percentage cover [B.S. Manjunath et al., 2001] • More efficient than color histograms • Gets around drawbacks of color moment descriptors • Multi-resolution histogram [S.K. Nayar et al., 2004] • Captures spatial information as well • Works well on textured images • Has typical advantages of a histogram • Gaussian Mixture Vector Quantization [R.M. Gray et al., 2001] • Used to extract color histograms • Shown to be better than uniform, vector quantization • Color + Texture descriptors for the MPEG-7 Standard [B.S. Manjunath et al., 2001] • Histogram + Spatial color descriptors • Wavelet-based texture descriptors
Feature Extraction: Shape • Shape Context for shape matching [S. Belongie et al., 2002] • Compact representation • Robust to many geometric transformations • Dynamic Programming for shape matching [E.G.M. Petrakis et al., 2002] • Involves computing Fourier descriptors – slow • Dynamic Time Warping distance for shape matching [I. Bartolini et al., 2005] • Alternative to Euclidean distance • Exploits amplitude and phase of Fourier descriptors
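To make the contrast with Euclidean distance concrete, here is a minimal sketch of the classic dynamic time warping recurrence. It is the textbook formulation on real-valued sequences, not necessarily the exact variant of [I. Bartolini et al., 2005]; treating a shape as a 1-D sequence of descriptor values is an assumption made for illustration, and `dtw_distance` is a hypothetical name.

```python
def dtw_distance(a, b):
    """Classic DTW between two real-valued sequences.
    cost[i][j] is the cheapest alignment of a[:i] with b[:j]."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the best of: advance in a only, in b only, or in both.
            cost[i][j] = d + min(cost[i - 1][j],
                                 cost[i][j - 1],
                                 cost[i - 1][j - 1])
    return cost[n][m]

# A locally stretched copy of a sequence aligns at zero cost,
# whereas Euclidean distance is not even defined for unequal lengths.
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
```

Because the warping path can repeat elements, DTW tolerates local stretching or compression along a contour that would be penalized by a point-by-point Euclidean comparison.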
Feature Extraction: Segmentation • Normalized Cuts for image segmentation [J. Shi et al., 2000] • Based on Spectral Graph partitioning • Widely used technique • Extended for textured images, known grouping priors • Hidden Markov Random Fields for 3-D segmentation of brain MR images [T.S. Huang et al., 2002] • Multi-resolution segmentation of low DOF images [J.Z. Wang et al., 2001] • Bayesian Segmentation involving Markov Chain Monte Carlo [S.-C. Zhu et al., 2002] • Gaussian Mixture Model based segmentation [C. Carson et al., 2002]
Feature Extraction: Other Features • Two-dimensional Multiresolution Hidden Markov Models for characterizing spatial arrangements of color and texture [Li et al., 2000] • “Segmentation free” approach • Scale and affine-invariant interest points [C. Schmid et al., 2004] • Shown effective for image retrieval • Robust to viewpoint and illumination changes • Wavelet-based salient points [Q. Tian et al., 2001] • Performance evaluation of various interest point detectors [C. Schmid et al., 2003] • Semantics-sensitive feature selection [J.Z. Wang, 2001] • Different features for graph vs. non-graph images
Approaches to Retrieval: SIMPLIcity • Region-based approach [J.Z. Wang et al., 2001] • Integrated Region Matching (IRM) distance used • Semantics-sensitive feature selection • Scalable – real-time performance on very large datasets • Tested on Airliners.net – over 1 million images • Advantages: • Reduces the influence of inaccurate segmentation • Helps to clarify the semantics of a particular region given its neighbors • Provides the user with a simple interface • Speed • 800 MHz Pentium PC running Linux • Databases: • 200,000-image general-purpose DB (60,000 photographs + 140,000 hand-drawn art images) • 70,000 pathology image segments • Image indexing time: 1 second per image • Image retrieval time: 0.15 second per query
Approaches to Retrieval: SIMPLIcity [Figure: Original Image → Wavelet and RGB→LUV features → K-means clustering in feature space (axes: Feature 1, Feature 2) → Segmentation Result; (Centroid, Inertia) = region features]
Approaches to Retrieval: SIMPLIcity • Partition an image into 4×4 blocks • Extract wavelet-based features from each block • Use k-means algorithm to cluster feature vectors into ‘regions’ • Compute the shape feature by normalized inertia
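The first three steps above can be sketched as follows. This is a simplified illustration rather than the SIMPLIcity implementation: it assumes a grayscale image, a one-level Haar transform per 4×4 block, and a deterministic k-means initialization, and all function names are hypothetical. The normalized-inertia shape feature is not covered.

```python
from math import sqrt

def haar_block_features(block):
    """Features of one 4x4 gray block (assumed layout): the mean intensity plus
    the root-mean-square of each one-level Haar detail band."""
    mean = sum(sum(row) for row in block) / 16.0
    horiz, vert, diag = [], [], []
    for i in (0, 2):
        for j in (0, 2):
            a, b = block[i][j], block[i][j + 1]
            c, d = block[i + 1][j], block[i + 1][j + 1]
            horiz.append((a - b + c - d) / 2.0)  # left/right differences
            vert.append((a + b - c - d) / 2.0)   # top/bottom differences
            diag.append((a - b - c + d) / 2.0)   # diagonal differences
    bands = (horiz, vert, diag)
    return [mean] + [sqrt(sum(x * x for x in band) / 4.0) for band in bands]

def sq_dist(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))

def kmeans(points, k, iterations=20):
    """Plain k-means. Initial centroids are spread over the sorted points,
    a deterministic choice made only to keep this sketch reproducible."""
    ordered = sorted(points)
    centroids = [ordered[(len(ordered) - 1) * i // max(k - 1, 1)]
                 for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                  for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

def segment(image, k=2):
    """Label every 4x4 block of `image` (list of rows) with a region index."""
    rows, cols = len(image) // 4, len(image[0]) // 4
    feats = []
    for r in range(rows):
        for c in range(cols):
            block = [image[4 * r + i][4 * c:4 * c + 4] for i in range(4)]
            feats.append(haar_block_features(block))
    labels, _ = kmeans(feats, k)
    return [labels[r * cols:(r + 1) * cols] for r in range(rows)]

# Toy 8x8 image: dark left half, bright right half (hypothetical data).
image = [[10] * 4 + [200] * 4 for _ in range(8)]
regions = segment(image, k=2)
print(regions)
```

Each cluster of block features is then treated as one region, so the number of regions per image follows from the choice of k.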
Approaches to Retrieval: SIMPLIcity IRM: Integrated Region Matching • IRM defines an image-to-image distance as a weighted sum of region-to-region distances • Integrate point-wise distance by linear combination • Different names in different fields (e.g., Mallows Distance) • Weighting matrix is determined based on significance constraints and the ‘MSHP’ (most similar highest priority) greedy algorithm
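A minimal sketch of the greedy weighting step, as commonly described for IRM: region pairs are visited from most to least similar, and each pair receives as much weight as the remaining significance of both regions allows. The image distance is then the sum of weight times region distance over all pairs. The data layout (a list of `(feature, significance)` pairs with significances summing to 1) and the name `irm_distance` are assumptions for illustration.

```python
def irm_distance(regions_a, regions_b, region_distance):
    """Greedy 'most similar highest priority' matching.
    Each region is (feature, significance); significances sum to 1 per image."""
    remaining_a = [p for _, p in regions_a]
    remaining_b = [p for _, p in regions_b]
    # Visit all region pairs in order of increasing region-to-region distance.
    pairs = sorted(
        (region_distance(fa, fb), i, j)
        for i, (fa, _) in enumerate(regions_a)
        for j, (fb, _) in enumerate(regions_b)
    )
    total = 0.0
    for d, i, j in pairs:
        # Assign the largest weight that both regions can still afford.
        s = min(remaining_a[i], remaining_b[j])
        if s > 0:
            total += s * d
            remaining_a[i] -= s
            remaining_b[j] -= s
    return total

def abs_diff(u, v):
    """Toy region distance on 1-D features (an assumption for this example)."""
    return abs(u - v)

two_regions = [(0.0, 0.5), (10.0, 0.5)]
one_region = [(0.0, 1.0)]
print(irm_distance(two_regions, two_regions, abs_diff))
print(irm_distance(two_regions, one_region, abs_diff))
```

Because every region contributes in proportion to its significance, a single badly segmented region shifts the total only by its own share, which is the sense in which the approach reduces the influence of inaccurate segmentation.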
Approaches to Retrieval: SIMPLIcity • Extensions • Scalable version [J.Z. Wang et al., 2001] • Statistical clustering of the feature space • Fuzzy feature matching [Y. Chen et al., 2002] • Fuzzy shape feature • Fuzzy region matching • Reduces sensitivity to number of region segments generated
Approaches to Retrieval: Other methods • Blobworld [C. Carson et al., 2002] • Region-based approach – regions called “blobs” • Querying also region-based, e.g. “tiger” • Anchoring-based image retrieval [J.R. Smith et al., 2002] • Wald-Wolfowitz statistical testing [S. Fotopoulos et al., 2005] • Used for comparing non-parametric multivariates • Multiple-instance learning [Goldman et al., 2002, Y. Chen, 2004] • Probabilistic frameworks for image retrieval [N. Vasconcelos et al., 2000] • Vector-quantization for generating codebooks [R. Srihari, 2000]
Approaches to Annotation: ALIP • Observations: • Human beings are able to build models about objects or concepts by mining visual scenes • The learned models are stored in the brain and used in the recognition process • Hypothesis: • It is achievable for computers to mine and learn a large collection of concepts by 2D or 3D image-based training
Automatic Image Annotation: ALIP • Concepts • Basic building blocks in determining the semantic meanings of images • Training concepts can be categorized as: • Basic Object: flower, beach • Object composition: building+grass+sky+tree • Location: Asia, Venice • Time: night sky, winter frost • Abstract: sports, sadness [Categories are ordered from low-level to high-level]
Automatic Image Annotation: ALIP 2D-MHMM: Two-dimensional multi-resolution hidden Markov model
Automatic Image Annotation: ALIP Annotation Process • Classification results form the basis • Salient words appearing in the classification favored more [Example annotations: (1) Food, indoor, cuisine, dessert; (2) Building, sky, lake, landscape, Europe, tree; (3) Snow, animal, wildlife, sky, cloth, ice, people]
Automatic Image Annotation: Other Approaches • Hierarchical models for associating words with image regions [K. Barnard et al., 2003] • Translation of set of images to a set of words [P. Duygulu et al., 2002] • Soft annotation using Bayes point machines [E. Chang et al., 2003] • Ensemble of SVM-based classifiers for annotation [B. Li et al., 2003] • Generative Language Models for annotation [R. Manmatha et al., 2003, R. Jin et al., 2004] • Latent Semantic Analysis on visual-textual space for annotation [F. Monay et al., 2003]
CBIR: Interface • Ways to interact with the system • Random browsing • Search by example • Search by sketch • Linguistic search • Hierarchical search • Relevance feedback
CBIR: Visualization Visual Structures for Image Browsing [R.S. Torres et al., 2003]
Visualization: An Overview Time Quilt [D. F. Huynh et al., 2005]
Visualization: An Overview Does Organisation by Similarity Assist Image Browsing? [K. Rodden et al., 2001]
Conclusion • Expect/hope to see a paradigm shift in research focus • Application-oriented CBIR • Domain-specific systems leveraging domain knowledge • Impact the average Internet user the way Google or Yahoo! do • Greater stress on • Interface and visualization • Scalability • Evaluation • Yet … • Continue the search for algorithms to narrow the semantic gap!