A hierarchical system that automatically assigns keywords to digital images using a holistic approach. It replaces the labor-intensive and time-consuming manual annotation process.
HANOLISTIC: a hierarchical automatic image annotation system using holistic approach Özge Öztimur Karadağ & Fatoş T. Yarman Vural Department of Computer Engineering Middle East Technical University, Ankara, Turkey
Automatic Image Annotation • Image Annotation : Assigning keywords to digital images. • Labor intensive • Time consuming • Need a system that automatically annotates images.
Image Annotation Literature • The annotation problem has been popular since the 1990s. • Related to CBIR. • CBIR processes visual information • Annotation processes visual and semantic information • Relate visual content information to semantic context information.
Problems About Automatic Image Annotation • Human subjectivity • Semantic Gap • Availability of datasets
Image Annotation Approaches in the Literature… • Segmental Approaches • Segment or partition the image into regions • Extract features from the regions • Quantize features into blobs • Model the relation between the image regions and annotation words • Holistic Approaches • Features are extracted from the whole image.
The Proposed System: HANOLISTIC • Introducing semantic information as supervision. • Each word is considered as a class label, • An image belongs to one or more classes • Holistic Approach: multiple visual features are extracted from the whole image. • Multiple feature spaces
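Treating each word as a class label, with an image belonging to one or more classes, amounts to a multi-label target. A minimal sketch of one possible encoding follows; the function name `encode_words` and the toy vocabulary are illustrative assumptions, not taken from the slides.

```python
def encode_words(words, vocabulary):
    """Return a 0/1 vector with one entry per vocabulary word."""
    return [1 if word in words else 0 for word in vocabulary]

# Hypothetical vocabulary and annotation, for illustration only.
vocabulary = ["grass", "sea", "sky", "tiger"]
target = encode_words({"sky", "sea"}, vocabulary)
print(target)  # [0, 1, 1, 0]
```

An image annotated with several words simply has several entries set to 1.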
Description of an Image • Content Description by Visual Features of MPEG-7 • Color Layout • Color Structure • Scalable Color • Homogeneous Texture • Edge Histogram • Context Description by Semantic Words • Annotation words
System Architecture of HANOLISTIC • Level-0 : consists of level-0 annotators, one annotator for each visual description space. • Meta-level : consists of a meta-annotator
Level-0 Annotator • refers to the features of the i-th image in the j-th description space • refers to the membership value of the l-th word for the i-th image in the j-th description space.
Meta-Level • The results of level-0 annotators are aggregated. • is a vector, referring to the final word membership values for the i-th image.
Experimental studies • Realization of HANOLISTIC • Instance based realization of Level-0 • Eager realization of Level-0 • Realization of Meta-level • Performance criteria • Results
Experimental Setup • Data set: A subset of the Corel Stock Photo Collection, consisting of 5000 images. • Training set: 4500 images (500 images for validation) • Testing set: 500 images • Each image is annotated with 1 to 5 words.
Instance-based Realization of Level-0 Annotator by Fuzzy-knn • Level-0 annotators are realized by fuzzy-knn. • For each description space, the k nearest neighbors of the image are determined. • Word membership values are estimated from the neighbors' words and their distances from the image. • High membership values are assigned to words that appear in the close neighborhood.
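The fuzzy-knn steps above can be sketched as follows for a single description space. This is a minimal illustration under stated assumptions, not the authors' implementation: the slides do not give the distance measure or the weighting, so Euclidean distance and inverse squared-distance weights are assumed here, and all names (`fuzzy_knn_memberships`, `train_features`, `train_words`) are hypothetical.

```python
import math

def fuzzy_knn_memberships(query, train_features, train_words, vocabulary, k=3, eps=1e-9):
    """Estimate word memberships for one image in one description space.

    Assumptions (not from the slides): Euclidean distance and
    inverse squared-distance weighting of the k nearest neighbors.
    """
    # Distance from the query image to every training image.
    dists = [(math.dist(query, feats), idx) for idx, feats in enumerate(train_features)]
    neighbors = sorted(dists)[:k]
    # Closer neighbors get larger weights; eps avoids division by zero.
    weights = [1.0 / (d * d + eps) for d, _ in neighbors]
    total = sum(weights)
    memberships = {}
    for word in vocabulary:
        score = sum(w for w, (_, idx) in zip(weights, neighbors) if word in train_words[idx])
        memberships[word] = score / total
    return memberships

# Toy data, for illustration only.
train_features = [(0.0, 0.0), (1.0, 0.0), (10.0, 10.0)]
train_words = [{"sky", "sea"}, {"sky"}, {"tiger"}]
result = fuzzy_knn_memberships((0.0, 1.0), train_features, train_words,
                               ["sky", "sea", "tiger"], k=2)
print(result)
```

In the toy call, "sky" appears in both nearest neighbors and so receives the highest membership, while "tiger" belongs only to a distant image and receives zero.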
Eager Realization of Level-0 by ANN • For a given image I_i, the ANN receives the visual description of the image as input and the semantic annotation words of the image as target. • Each ANN is trained with backpropagation; a randomly selected set of images is used for validation to determine when to stop training. K-fold cross-validation is applied.
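Using a validation set to decide when to stop training is commonly implemented as early stopping. The sketch below shows only that control loop; the stopping rule (a `patience` counter), the function name, and the two callbacks are assumptions for illustration, since the slides do not specify them.

```python
def train_with_early_stopping(train_epoch, validation_loss, max_epochs=100, patience=5):
    """Train until the validation loss has not improved for `patience` epochs.

    `train_epoch` runs one pass of backpropagation over the training set;
    `validation_loss` returns the current loss on the held-out images.
    Both are hypothetical callbacks supplied by the caller.
    Returns (best_epoch, best_loss).
    """
    best_loss = float("inf")
    best_epoch = -1
    for epoch in range(max_epochs):
        train_epoch()
        loss = validation_loss()
        if loss < best_loss:
            # A real system would also save the network weights here.
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            break
    return best_epoch, best_loss

# Simulated validation losses, for illustration only.
losses = iter([0.9, 0.7, 0.6, 0.65, 0.66, 0.7, 0.5])
print(train_with_early_stopping(lambda: None, lambda: next(losses),
                                max_epochs=7, patience=2))  # (2, 0.6)
```

With a patience of 2, the simulated run stops at the fifth epoch and keeps the result from the third, so the later value 0.5 is never reached.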
Realization of Meta-Level by Majority Voting • Adds the membership values returned by level-0 annotators using the formula • where P_{i,j} is a vector containing the word membership values returned from the j-th level-0 annotator. • For each word, select the maximum of the five word membership values estimated by the level-0 annotators.
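The slide mentions two ways of combining the level-0 outputs: adding the membership vectors and taking the per-word maximum. A minimal sketch of both follows. The function names are illustrative, and picking the top-n words as the final annotation is an assumption, since the slides do not state how words are selected from the aggregated memberships.

```python
def aggregate_sum(level0_outputs):
    """Element-wise sum of the membership vectors, one vector per level-0 annotator."""
    return [sum(values) for values in zip(*level0_outputs)]

def aggregate_max(level0_outputs):
    """Element-wise maximum of the membership vectors."""
    return [max(values) for values in zip(*level0_outputs)]

def top_words(memberships, vocabulary, n=5):
    """Pick the n words with the highest aggregated membership (assumed selection rule)."""
    order = sorted(range(len(vocabulary)), key=lambda i: -memberships[i])
    return [vocabulary[i] for i in order[:n]]

# Toy outputs from three annotators over a three-word vocabulary, for illustration only.
vocabulary = ["sky", "sea", "tiger"]
outputs = [[0.2, 0.5, 0.0],
           [0.4, 0.1, 0.0],
           [0.1, 0.1, 0.9]]
print(aggregate_max(outputs))                            # [0.4, 0.5, 0.9]
print(top_words(aggregate_max(outputs), vocabulary, 2))  # ['tiger', 'sea']
```

The two rules can rank words differently: summing rewards words that several description spaces agree on, while the maximum lets a single confident annotator decide.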
Performance Criteria • Precision • Recall • F-score
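The slide lists the criteria by name only. A minimal sketch of how they are often computed per word in the annotation literature follows; the per-word definition and all names here are assumptions, not confirmed by the slides.

```python
def word_precision_recall(word, predicted, ground_truth):
    """Per-word precision and recall over a test set.

    `predicted` and `ground_truth` are parallel lists of word sets,
    one pair per test image (hypothetical data layout).
    """
    retrieved = sum(1 for p in predicted if word in p)
    relevant = sum(1 for g in ground_truth if word in g)
    correct = sum(1 for p, g in zip(predicted, ground_truth) if word in p and word in g)
    precision = correct / retrieved if retrieved else 0.0
    recall = correct / relevant if relevant else 0.0
    return precision, recall

def f_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Toy predictions and ground truth, for illustration only.
predicted = [{"sky", "sea"}, {"sky"}, {"tiger"}]
truth = [{"sky"}, {"sea"}, {"tiger", "grass"}]
p, r = word_precision_recall("sky", predicted, truth)
print(p, r)  # 0.5 1.0
```

Averaging these per-word values over the vocabulary gives a single figure for comparing systems.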
Performance of Level-0 Annotators • Performance of Level-0 annotators with fuzzy-knn
Performance of HANOLISTIC • Comparison of HANOLISTIC with other systems in the literature:
Conclusion • We proposed a hierarchical automatic image annotation system using a holistic approach. • We tested the system with both an instance-based and an eager method. • We observed that instance-based methods are more promising in the considered problem domain.
Conclusion… • The power of the proposed system comes from the following main principles: • Simplicity • Fuzziness • Simultaneous processing of content and context information • Holistic view of image through different perspectives
Future Work • Conduct experiments on other descriptors • Test other algorithms at level-0 conforming to the principle of least commitment • Apply the holistic approach followed by a segmentation process, for annotation or intelligent segmentation.
Thank you • Questions and Comments
References • Duygulu, Barnard, de Freitas and Forsyth, ‘Object recognition as machine translation: Learning a lexicon for a fixed image vocabulary’, in ECCV’02: Proceedings of the 7th European Conference on Computer Vision, 2002 • Jeon, Lavrenko and Manmatha, ‘Automatic image annotation and retrieval using cross-media relevance models’, in SIGIR’03 • Monay and Gatica-Perez, ‘PLSA-based image auto-annotation: constraining the latent space’, in MULTIMEDIA’04 • Akbaş and Vural, ‘Automatic image annotation by ensemble of visual descriptors’, in CVPR’07 • Feng, Manmatha and Lavrenko, ‘Multiple Bernoulli relevance models for image and video annotation’, in CVPR’04 • Tang and Lewis, ‘Image auto-annotation using ‘easy’ and ‘more challenging’ training sets’, in 7th International Workshop on Image Analysis for Multimedia Interactive Services, 2006