Cross media indexing of images on the web P. Mulhem, I. Dioletti, M. Belkhatir CLIPS – IMAG Grenoble TRUMA 2005 20 December 2005
Outline • Introduction • Related works • Proposal • Prototype • Conclusion
Introduction • Image retrieval systems • [Figure: architecture of an image retrieval system. Labels: User, Interface, Queries, Interpretation, Query representation, Matching, Results, Images, Collection of data, Analysis of data, Indexing, Image representation, Knowledge base]
Introduction • How to index images? • Information need: « yellow cathedral » • Existing approaches to image indexing are not satisfactory: • Signal (colours, textures, shapes) • Automatic semantic: « building » • Automatic signal/semantic: « yellow building » • Solution: extended automatic signal/semantic: « yellow cathedral »
Introduction • Objectives • To enrich an automatic signal/semantic index of images with more specific terms AND to fix errors in automatically generated indexes • Extracted elements: • From the textual context in which images occur in web pages • Terms that describe the visible elements of the image (ex. « cathedral ») • Terms that describe visual attributes of these elements (ex. « yellow cathedral ») • Terms that enrich the concepts extracted from the image • Ex. building
Related works - Signal • Indexing and retrieval of images based on visual content • QBIC [Flickner et al. 95] • (colour, texture, shape) • Queries on visual attributes • No semantic queries • Other systems: • Web-Wise [Gang et al. 98] • DrawSearch [Di Sciascio et al. 99]
Related works - Automatic semantic • Indexing and retrieval of images based on symbols related to their semantic content • On the Web using context (Google, Altavista) • Semantic content: VK [Lim 00] • [Figure labels: Grass, Hut, Sky]
Related works - Automatic signal/semantic • Indexing and retrieval based on visual content AND semantic content • Two ways: • Loosely coupled: • « adding » symbolic terms and visual attributes • Strongly coupled: • « association » of symbolic terms with the visual attributes of the visible objects of the image
Related works - Automatic signal/semantic • Loosely coupled approaches • Image Rover [Sclaroff et al. 97, 98] • Indexing: histograms [colour, texture, terms] • Querying: query by image examples, relevance feedback • Terms not used in queries • Other systems • IFIND [Lu et al.] • WebSeek [Smith et al. 97] • WebMars [Ortega et al. 99]
Related works - Automatic signal/semantic • Strongly coupled approaches • SIR [Belkhatir et al. 04] based on EMIR2 [Mechkour 95] • [Figure: SIR description of two image objects, Oi1 and Oi2. Labels: sct, q_c, ind_tx, Spatial agent1, Spatial agent2] • Semantic visual facet: Grass: 0.2, Hut: 0.3 • Colour facet: < White: 10, Blue: 0, Cyan: 0, Grey: 0, Yellow: 35, Black: 0, Orange: 0, Skin: 0, Red: 0, Green: 55, Violet: 0 > • Texture facet: < Lined: 1, Bumpy: 0, Cracked: 0, Smeared: 0, Disordered: 0, Netlike: 0, Whirly: 0, Interlaced: 0, Marbled: 0, Spotted: 0, Uniform: 0 > • Spatial facet: < Touch: 0, Inside: 0, Disconnected: 0, Ontop: 1, Under: 0, Left: 0, Right: 0 >
Related works • Overview
Proposal • Data • The content of each image is described by a conceptual graph GI [SOWA 84] according to the SIR model [BELKHATIR et al. 04, 05] • Hypotheses • The text that surrounds an image describes its content • Some elements of the web page text make it possible to correct or specify the image content • Objectives • Integration of structured terms extracted from a web page into GI to enrich and improve the representation of the image content
Proposal • Image • Generation of an initial graph GI according to SIR • Graph GI • Visual semantic facet • Signal colour facet • Signal texture facet • Lattice of visual concepts • Textual elements • Localization of extraction zones • Extraction of structured terms • Common representation • Integration in the lattice • Translation in graphs • Integration in GI
Proposal - 1. Localization of extraction zone • Where to extract the potential text • alt attribute (brief description of an image) • src attribute (location of the image resource) • Text surrounding the image (before and after) • Impact of words • Estimation of the relevance of a word to describe the visual content of an image • imp(t): probability that a word t extracted from a specific zone of the web page describes a region of the image
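The three extraction zones listed on this slide can be sketched with Python's standard `html.parser` module. This is only an illustration, not the prototype's crawler: the class name `ImageZoneExtractor`, the helper `extract_zones` and the `window` parameter are hypothetical. The default window of 100 words mirrors the "100 words before or after the image" mentioned on the prototype slide.

```python
from html.parser import HTMLParser


class ImageZoneExtractor(HTMLParser):
    """Collect, for each <img>, its alt text, its src and the surrounding words."""

    def __init__(self, window=100):
        super().__init__()
        self.window = window  # number of words kept before and after an image
        self.words = []       # running list of the words of the page text
        self.images = []      # one record per <img> tag

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            a = dict(attrs)
            self.images.append({
                "alt": a.get("alt") or "",
                "src": a.get("src") or "",
                "pos": len(self.words),  # position of the image in the word stream
            })

    def handle_data(self, data):
        # Naive tokenization on whitespace (script/style text is not filtered).
        self.words.extend(data.split())

    def zones(self):
        """Return the extraction zones once the whole page has been fed."""
        result = []
        for img in self.images:
            p = img["pos"]
            result.append({
                "alt": img["alt"],
                "src": img["src"],
                "before": self.words[max(0, p - self.window):p],
                "after": self.words[p:p + self.window],
            })
        return result


def extract_zones(html, window=100):
    parser = ImageZoneExtractor(window)
    parser.feed(html)
    parser.close()
    return parser.zones()


page = ('<p>The yellow cathedral of the town</p>'
        '<img src="img/cathedral.jpg" alt="cathedral">'
        '<p>seen from the square</p>')
zones = extract_zones(page, window=3)
```

Each zone could then carry its own imp(t) estimate, since the slide defines imp(t) per zone of the web page.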
Proposal - 2. Extraction of structured terms • A structured term • A set of couples (word, probability) associated to an image • Extraction • Definition of morphological patterns for structured terms • Set of patterns: • SC: visual semantic, Co: colour, Tx: texture • Application of the patterns on the text with imp(t) • Result: structured terms • Ex. PCo2 = [adjective_Co noun_SC] gives {(red, 0.3), (house, 0.32)}
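As a rough sketch of how a pattern such as PCo2 could be applied, the code below scans a sequence of tagged words for an adjective naming a colour followed by a noun naming a visual concept. The word lists, the token layout `(word, part of speech, imp)` and the function names are assumptions made for the illustration. The prototype relies on a syntactic analyser, which is not reproduced here.

```python
# Hypothetical lexicons: words naming colours (Co) and visual semantic concepts (SC).
COLOUR_WORDS = {"red", "yellow", "blue"}
CONCEPT_WORDS = {"house", "cathedral", "building"}

# A pattern is a sequence of (part of speech, facet) slots.
P_CO2 = [("adjective", "Co"), ("noun", "SC")]


def facet(word):
    """Return the facet a word belongs to, or None (lookup in the toy lexicons)."""
    if word in COLOUR_WORDS:
        return "Co"
    if word in CONCEPT_WORDS:
        return "SC"
    return None


def apply_pattern(tokens, pattern):
    """Return one structured term per match of the pattern in the token sequence.

    tokens: list of (word, part_of_speech, imp) triples.
    A structured term is a list of (word, imp) couples.
    """
    terms = []
    n = len(pattern)
    for start in range(len(tokens) - n + 1):
        window = tokens[start:start + n]
        if all(pos == p_pos and facet(word) == p_facet
               for (word, pos, _), (p_pos, p_facet) in zip(window, pattern)):
            terms.append([(word, imp) for word, _, imp in window])
    return terms


tokens = [("a", "determiner", 0.1), ("red", "adjective", 0.3), ("house", "noun", 0.32)]
terms = apply_pattern(tokens, P_CO2)
```

With the values of the slide's example, the single match is the structured term made of (red, 0.3) and (house, 0.32).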
Proposal - 3. Common representation • Structured terms into conceptual graphs • Enriched lattice for SC • [Figure: fragment of the lattice. Labels: E, TSC, Image entity, Io, people, individual, crowd, man, policeman]
Proposal - 3. Common representation • Structured terms into conceptual graphs • Visual semantic facet • Colour signal facet • Texture signal facet • Result: A = {αi}, the set of graphs generated from structured terms • Example translation
Proposal - 4. Integration in a GC • A) Aggregation of text and image elements • Integration of • The relevance of a visual semantic concept tsci from αi: imp(tsci) • Vtsc = {tsci} (universe of visual semantic concepts from text) • The certainty of recognition of a visual semantic concept sck from gk: rk • Vsc = {sck} (universe of visual semantic concepts from the image) • The semantic similarity between tsci and sck • [Figure: graphs αi and gk]
Proposal - 4. Integration in a GC • A) Aggregation of text and image elements • For each c ∈ Usc, define a value tot(c) that uses • The dependencies between the elements of Usc • The impact (for text) or certainty of recognition (for image) • tot(c) is the plausibility (fuzzy value) that c belongs to the description of an image region • Example • Vtsc = {cathedral, man, building} • Vsc = {individual, building} • Usc = {cathedral, man, building, individual} • [Figure: graphs from text: IO srt (cathedral, 0.35), IO srt (building, 0.3), IO srt (man, 0.21); graphs from image: IO srt (building, 0.15), IO srt (individual, 0.11)]
Proposal - 4. Integration in a GC • A) Aggregation of text and image elements • Effect of semantic concepts from the whole text • Example using the t-conorm S(a,b) = S2(a,b) = a + b - a·b • [Figure: graphs from text: IO srt (cathedral, 0.35), IO srt (building, 0.3), IO srt (man, 0.21)]
Proposal - 4. Integration in a GC • A) Aggregation of text and image elements • Effect of semantic concepts from the whole image • Example • [Figure: graphs from image: IO srt (building, 0.15), IO srt (individual, 0.11)]
Proposal - 4. Integration in a GC • A) Aggregation of text and image elements • Example
Proposal - 4. Integration in a GC • B) Similarity simik between αi and gk • [Figure: graphs αi and gk]
Proposal - 4. Integration in a GC • B) Similarity simik between αi and gk • Generation of a weighted graph • Choice of the pairs of matching graphs (αi, gk) that maximize simik, with simik above a threshold
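One simple reading of this selection step is sketched below: for each text graph αi, keep the image graph gk with the highest similarity, provided that similarity exceeds the threshold. The slide does not say how the weighted graph is actually processed (for instance whether the matching must be one-to-one), so the greedy per-αi choice, the dictionary layout of `sim` and the function name `choose_matches` are all assumptions.

```python
def choose_matches(sim, threshold):
    """Pick, for each text graph, its most similar image graph above the threshold.

    sim: dict mapping (text_graph_id, image_graph_id) to a similarity in [0, 1].
    Returns a dict text_graph_id -> image_graph_id.
    Assumption: several text graphs may match the same image graph.
    """
    best = {}
    for (i, k), s in sim.items():
        if s > threshold and (i not in best or s > best[i][1]):
            best[i] = (k, s)
    return {i: k for i, (k, _) in best.items()}


# Hypothetical similarities between two text graphs and two image graphs.
sim = {
    ("alpha1", "g1"): 0.8, ("alpha1", "g2"): 0.4,
    ("alpha2", "g1"): 0.2, ("alpha2", "g2"): 0.1,
}
matches = choose_matches(sim, threshold=0.5)
```

Here alpha2 is left unmatched because none of its similarities exceeds the threshold.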
Proposal - 4. Integration in a GC • Fusion of matching graphs • Fusion of semantic visual concepts • Two cases • Semantic match • Keep the more specific concept between tsci and sck, with the corresponding value • No semantic match • Keep the concept with the larger membership value • Result: g’k
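The two fusion cases can be sketched as follows. The `PARENT` table is a hypothetical fragment of the concept lattice, built from concept names that appear in the slides. Treating "semantic match" as "one concept is an ancestor of the other", and falling back to the larger value when both concepts are identical, are assumptions of this sketch.

```python
# Hypothetical fragment of the concept lattice: child -> parent ("is a").
PARENT = {
    "cathedral": "building",
    "policeman": "man",
    "man": "individual",
    "individual": "people",
}


def ancestors(concept):
    """Return the list of concepts more general than the given one."""
    found = []
    while concept in PARENT:
        concept = PARENT[concept]
        found.append(concept)
    return found


def fuse(text_concept, image_concept):
    """Fuse a (concept, value) couple from text with one from the image."""
    (t, t_val), (s, s_val) = text_concept, image_concept
    if s in ancestors(t):  # semantic match, the text concept is more specific
        return text_concept
    if t in ancestors(s):  # semantic match, the image concept is more specific
        return image_concept
    # No semantic match (or identical concepts): keep the larger membership value.
    return text_concept if t_val >= s_val else image_concept


specialised = fuse(("cathedral", 0.35), ("building", 0.15))
unrelated = fuse(("man", 0.21), ("building", 0.3))
```

In the first call the text refines "building" into "cathedral"; in the second the two concepts are unrelated in the toy lattice, so the one with the larger value is kept.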
Prototype • Estimation of impact • 100 web pages • Relevant words • alt and src attributes • 100 words before or after the image
Prototype • Processing of text • Gathering of web pages (crawler) • Analysis of web pages to extract good candidate images (size) • Processing of images • Use of a VK-like approach [LIM00] • Segmentation into square blocks • Colour + texture: mean + std • Generation of GCs in SIR • Cross-media processing • Syntactical analysis • XIP (Xerox) • Similarity between concepts • WordNet distances between synsets
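The block-based image processing step can be illustrated with a minimal sketch that cuts an image into square blocks and computes the mean and standard deviation of each block. It works on a single grey-level channel as a stand-in for the colour and texture features of the prototype, whose exact features are not given in the slides. The function name `block_features` and the choice to ignore incomplete border blocks are assumptions.

```python
from statistics import mean, pstdev


def block_features(image, block):
    """Compute (mean, std) of the pixel values for each square block.

    image: list of rows, each row a list of grey levels.
    block: side length of a block in pixels; incomplete border blocks are ignored.
    Returns a dict (block_row, block_col) -> (mean, population std).
    """
    features = {}
    height, width = len(image), len(image[0])
    for top in range(0, height - block + 1, block):
        for left in range(0, width - block + 1, block):
            pixels = [image[r][c]
                      for r in range(top, top + block)
                      for c in range(left, left + block)]
            features[(top // block, left // block)] = (mean(pixels), pstdev(pixels))
    return features


# Toy 2 x 4 image: a uniform left block and a varied right block.
image = [
    [0, 0, 10, 20],
    [0, 0, 30, 40],
]
features = block_features(image, block=2)
```

A uniform block yields a standard deviation of zero, while a block with varied pixel values yields a larger one.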
Conclusion • Results • Modeling of textual context of image occurrence in web page to enrich description • Definition of specialization and correction of image indexing • Prototype • Perspectives • Qualitative and quantitative tests • Use of logical structure of web pages ([Wang & al. 04]) • Use of typology of web pages ([Tate Al. et al. 96]) • Evaluation in the DIESKAU/SIR image retrieval system.