Content-based image retrieval integrated into Fedora
Pierre-Yves Burgi, IT Division, University of Geneva, Switzerland
Patrick Monbaron & Nastaran Fatemi, University of Applied Sciences, Yverdon, Switzerland
Presentation outline
• Context
• MPEG-7
• Data migration
• Indexing
• Image retrieval
• Demo
• Conclusions
• Perspectives
Context of the project
• Paradigm shift
• Migration of image collections from an Oracle DB to Fedora
  • First step: data synchronization
  • Second step: user interface for data retrieval
  • Third step: user interface for ingesting images (through Valet)
• Why Fedora?
  • Conceptually rich (objects, datastreams, SOA, etc.)
  • Based on open standards (e.g. XML)
  • Convenient for adding datastreams such as MPEG-7
Old (relational) object model
[Figure: Oracle object model]
New object model
[Figure: Fedora object model, Fedora's view]
Why MPEG-7? (and not DC)
• Eases migration from the DB
  • The DB's 21 fields can be mapped to MPEG-7
• Better suited to image description
  • dc:creator versus the Creator DS with <Role> & <Agent>
  • <VisualDescriptor>, <MediaFormat>, etc.
  • Exif metadata
  • Etc.
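
The contrast between dc:creator and a Creator with <Role> & <Agent> can be illustrated with a small sketch. It is a simplified illustration, not schema-valid MPEG-7: namespaces and attributes are left out of the MPEG-7 side, and the function names and sample person are hypothetical.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"


def dc_creator(name: str) -> ET.Element:
    """Dublin Core: the creator is one flat string."""
    elem = ET.Element(f"{{{DC_NS}}}creator")
    elem.text = name
    return elem


def mpeg7_creator(given: str, family: str, role: str) -> ET.Element:
    """Simplified MPEG-7-style Creator: a Role plus a structured Agent.

    Assumption: element nesting is reduced to the parts named on the slide;
    real MPEG-7 documents carry namespaces and type attributes omitted here.
    """
    creator = ET.Element("Creator")
    role_elem = ET.SubElement(creator, "Role")
    ET.SubElement(role_elem, "Name").text = role
    agent = ET.SubElement(creator, "Agent")
    name = ET.SubElement(agent, "Name")
    ET.SubElement(name, "GivenName").text = given
    ET.SubElement(name, "FamilyName").text = family
    return creator


if __name__ == "__main__":
    # Hypothetical sample data.
    print(ET.tostring(dc_creator("Jane Doe"), encoding="unicode"))
    print(ET.tostring(mpeg7_creator("Jane", "Doe", "photographer"), encoding="unicode"))
```

The structured form keeps the role and the parts of the name as separate, queryable elements, which a single dc:creator string cannot do.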
What is MPEG-7?
MPEG-7: feature extraction
• Scalable color descriptor
• Edge histogram descriptor
• Color layout descriptor
• Caliph and Emir: http://sourceforge.net/projects/caliph-emir/
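
To give a feel for what such a descriptor captures, here is a minimal sketch of the first step of a color-layout-style feature: averaging the colors over a coarse grid. It is not the Caliph and Emir implementation and not the full MPEG-7 color layout descriptor, which goes on to convert the grid to YCbCr and apply a DCT with quantization. The function name and the nested-list image representation are assumptions made for the example.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]
Color = Tuple[float, float, float]


def grid_average(image: List[List[Pixel]], grid: int = 8) -> List[List[Color]]:
    """Split an RGB image into grid x grid cells and average each cell.

    `image` is a list of rows, each row a list of (r, g, b) tuples.
    """
    height, width = len(image), len(image[0])
    if height < grid or width < grid:
        raise ValueError("image must be at least grid x grid pixels")
    cells: List[List[Color]] = []
    for gy in range(grid):
        # Integer cell boundaries; every cell holds at least one pixel row.
        y0, y1 = gy * height // grid, (gy + 1) * height // grid
        row: List[Color] = []
        for gx in range(grid):
            x0, x1 = gx * width // grid, (gx + 1) * width // grid
            count = (y1 - y0) * (x1 - x0)
            sums = [0, 0, 0]
            for y in range(y0, y1):
                for x in range(x0, x1):
                    for channel in range(3):
                        sums[channel] += image[y][x][channel]
            row.append((sums[0] / count, sums[1] / count, sums[2] / count))
        cells.append(row)
    return cells
```

The result is a small, fixed-size summary of where colors sit in the picture, independent of the image resolution.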
MPEG-7: encoding
MPEG-7: retrieving
• The match is expressed as a number
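
One common way to express a match between two descriptors as a single number is a distance between their feature vectors. The sketch below uses a plain L1 (sum of absolute differences) distance; this is an assumption chosen for illustration, and the slides do not state which measure the project actually uses.

```python
from typing import Sequence


def l1_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of absolute differences; 0.0 means the two vectors are identical."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return float(sum(abs(x - y) for x, y in zip(a, b)))


if __name__ == "__main__":
    # Hypothetical 3-bin histograms for two images.
    print(l1_distance([4, 0, 2], [1, 0, 6]))
```

With a distance, smaller numbers mean a closer match; a similarity score would simply invert that ordering.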
Data migration
[Architecture diagram; components shown: Oracle DB, record reading, application server, XML, XSLT, MPEG-7 + FOXML, image, API-M, Fedora, messaging, gSearch, XSLT, FOXML, Lucene, file system. The data migration and indexing stages are marked.]
Data synchronisation
• 3 phases
  • Delete obsolete objects: « delete » in Fedora the objects whose records no longer exist in the Oracle DB
  • Update objects: update in Fedora the objects corresponding to records modified in the Oracle DB
  • Create new objects: create in Fedora the objects corresponding to new records in the Oracle DB
• Algorithm based on the date of the last batch
  • Date saved in a configuration file
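
The date-based selection can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's code: the `Record` fields, the `[sync]` section and `last_batch` key of the configuration file, and all function names are hypothetical.

```python
import configparser
from dataclasses import dataclass
from datetime import datetime
from pathlib import Path
from typing import Iterable, List, Tuple


@dataclass
class Record:
    """Hypothetical view of one database row: id plus two timestamps."""
    record_id: str
    created: datetime
    modified: datetime


def load_last_batch(path: Path) -> datetime:
    """Read the date of the last batch; fall back to the earliest date."""
    config = configparser.ConfigParser()
    if not config.read(path) or not config.has_option("sync", "last_batch"):
        return datetime.min
    return datetime.fromisoformat(config.get("sync", "last_batch"))


def save_last_batch(path: Path, when: datetime) -> None:
    """Persist the date of the batch that just finished."""
    config = configparser.ConfigParser()
    config["sync"] = {"last_batch": when.isoformat()}
    with open(path, "w", encoding="utf-8") as handle:
        config.write(handle)


def select_changed(records: Iterable[Record], since: datetime) -> Tuple[List[str], List[str]]:
    """Return (ids to create, ids to update) relative to the last batch."""
    to_create: List[str] = []
    to_update: List[str] = []
    for record in records:
        if record.created > since:
            to_create.append(record.record_id)
        elif record.modified > since:
            to_update.append(record.record_id)
    return to_create, to_update
```

A batch would call `load_last_batch`, process the two lists, and finish with `save_last_batch` so that the next run only looks at later changes.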
Migration: delete objects
• Read the ids of all Fedora objects
• Read the ids of all Oracle records
• Compare both lists and « delete » from Fedora the elements missing in Oracle
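
The comparison of the two id lists amounts to a set difference. A minimal sketch, assuming both systems expose the same identifier for an image (the function name and sample ids are hypothetical):

```python
from typing import Iterable, List


def obsolete_ids(fedora_ids: Iterable[str], oracle_ids: Iterable[str]) -> List[str]:
    """Ids present in Fedora but no longer present in the Oracle DB."""
    return sorted(set(fedora_ids) - set(oracle_ids))


if __name__ == "__main__":
    fedora = ["img:1", "img:2", "img:3"]
    oracle = ["img:1", "img:3", "img:4"]
    print(obsolete_ids(fedora, oracle))
```

The sketch only computes which objects are obsolete; how they are then marked or removed in Fedora is left open here.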
Migration: create new objects
• Read the date of the last object creation
• Search for the ids of the elements created since the last update
• For new objects: read the metadata and analyse the image
• Convert the data into XML, then into MPEG-7 and FOXML through XSLT
• Ingest FOXML and images into Fedora
Migration: update objects
• Read the date of the last object creation
• Search for the ids of the elements updated since the last update
• Read the metadata and analyse the image
• Convert the data into XML, then into MPEG-7
• Ingest MPEG-7 into Fedora
Indexing
• Indexed
  • DC
  • Textual MPEG-7 metadata
• Not indexed but saved in the index
  • Technical MPEG-7 metadata
  • Image attributes
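
The distinction between fields that are indexed (searchable) and fields that are only saved in the index (retrievable but not searchable) can be shown with a toy in-memory index. This is a conceptual sketch, not the Lucene or gSearch API; the class, field names and sample values are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set


class ToyIndex:
    """Every field is stored; only the fields in `indexed_fields` are searchable."""

    def __init__(self, indexed_fields: Iterable[str]) -> None:
        self.indexed_fields = set(indexed_fields)
        self.postings: Dict[str, Set[str]] = defaultdict(set)
        self.stored: Dict[str, Dict[str, str]] = {}

    def add(self, doc_id: str, fields: Dict[str, str]) -> None:
        # Keep a copy of all fields so they can be read back later.
        self.stored[doc_id] = dict(fields)
        for name, value in fields.items():
            if name in self.indexed_fields:
                for term in value.lower().split():
                    self.postings[term].add(doc_id)

    def search(self, term: str) -> List[str]:
        return sorted(self.postings.get(term.lower(), set()))


if __name__ == "__main__":
    index = ToyIndex(indexed_fields=["dc.title"])
    index.add("img:1", {"dc.title": "Old building", "edge_histogram": "3 1 4"})
    print(index.search("building"))                 # found through the indexed title
    print(index.search("3"))                        # image attributes are not searchable
    print(index.stored["img:1"]["edge_histogram"])  # but they can be read back
```

Storing the image attributes next to the text fields lets the retrieval step fetch them from the same index without making them part of keyword search.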
Indexing
• Indexing process within Lucene
[Architecture diagram; components shown: Oracle DB, record reading, application server, XML, XSLT, MPEG-7 + FOXML, image, API-M, Fedora, messaging, gSearch, XSLT, FOXML, Lucene, file system. The data migration and indexing stages are marked.]
Image retrieval
[Architecture diagram; components shown: user interface, application server, request parsing, search by image matching, gSearch connector, XSLT, HTML page generation, Fedora, API-A, resolver, gSearch, XSLT, Lucene, file system. The indexing stage is marked.]
Retrieval by image matching
• Image reference id
• Retrieval of the attributes of this image from Lucene
• Match of each image's attributes with those of the reference and computation of a matching score
• Selection (first 10)
• Reading of all data from Lucene and conversion into XML
• XML output for display
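
The scoring and selection steps can be sketched as a ranking over stored feature vectors. This is a minimal sketch under stated assumptions: the score is a weighted sum of L1 distances (lower means closer), the reference image is excluded from its own results, and the descriptor names, weights and function names are hypothetical rather than taken from the project.

```python
import heapq
from typing import Dict, List, Sequence, Tuple

Features = Dict[str, Sequence[float]]


def score(reference: Features, candidate: Features, weights: Dict[str, float]) -> float:
    """Weighted sum of per-descriptor L1 distances; lower means more similar."""
    total = 0.0
    for name, weight in weights.items():
        ref, cand = reference[name], candidate[name]
        total += weight * sum(abs(r - c) for r, c in zip(ref, cand))
    return total


def most_similar(
    reference_id: str,
    index: Dict[str, Features],
    weights: Dict[str, float],
    limit: int = 10,
) -> List[Tuple[str, float]]:
    """Score every other image against the reference and keep the best `limit`."""
    reference = index[reference_id]
    scored = (
        (score(reference, features, weights), image_id)
        for image_id, features in index.items()
        if image_id != reference_id
    )
    return [(image_id, value) for value, image_id in heapq.nsmallest(limit, scored)]


if __name__ == "__main__":
    # Hypothetical stored attributes for three images.
    images = {
        "a": {"color": [0, 0], "edge": [0]},
        "b": {"color": [1, 0], "edge": [5]},
        "c": {"color": [4, 4], "edge": [0]},
    }
    print(most_similar("a", images, {"color": 1.0, "edge": 0.0}))
    print(most_similar("a", images, {"color": 0.0, "edge": 1.0}))
```

Because every image is scored on each query, the cost grows linearly with the collection size. Changing the weights shifts the ranking between color-driven and contour-driven matches.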
Demo: Photothèque
Conclusions
• Applying the MPEG-7 standard to retrieval by image matching gives satisfactory results … but this is not terribly semantic!
• For a large image bank, the indexing method might not be optimal (currently about 1,000 images)
• Parameter tuning of image attributes is not easy
• Image corpus too small to establish benchmarks
• Migration procedure from a DB to Fedora tested
• Data synchronization remains a difficulty
• Integration within Fedora is otherwise a success
• Culture change (paradigm shift) in progress
Perspectives
• Exploiting more of Fedora's features
  • Using a disseminator for watermarking images on the fly
  • Applying XACML policies
• Improving the user interface
  • Play with search parameters (shape versus color)
• Applying the methodology to other image data banks
  • Medicine, architecture, etc.
• Other media (audio, video)
Questions
Complete report available in French at: http://www.unige.ch/dinf/ntice/accueil/MembresProjet/PMonbaronRapportFinal.pdf
« building » (keyword)
Retrieval results
Display of chosen image
Retrieval by image matching
Retrieval results (by contour matching)
Display of another chosen image
Retrieval results (by color matching)