IMEDIA Image and Multimedia Indexing, Browsing and Retrieval

IMEDIA Image and Multimedia Indexing, Browsing and Retrieval Evaluation 2001-2005 14 November 2005 INRIA Rocquencourt http://www-rocq.inria.fr/imedia/

The Team (November 2005) Senior members • INRIA personnel • Nozha Boujemaa (DR2) • Anne Verroust-Blondet (CR1) • Jean-Paul Chièze, Research Engineer [part-time] • Laurence Bourcier, Team Assistant • Scientific Adviser • Donald Geman (1/2 time, Pr. Johns Hopkins) • External collaborators • Michel Crucianu (Pr. CNAM) [3 years mob. IMEDIA] • Valérie Gouet-Brunet (MdC CNAM) [2 years mob. IMEDIA] • Jean-Philippe Tarel (CR1 LCPC) [2 years mob. IMEDIA] • Olivier Buisson, INA Researcher (Institut National de l’Audiovisuel) • Marie-Luce Viaud, INA Researcher

The Team Non-permanent members Present team members Former team members Post-docs / Expert engineers • Sabri Boughorbel • Marin Ferecatu • Alexis Joly • Itheri Yahiaoui PhD students • Olfa Besbes • Mohamed Chaouch • Nizar Grira • Nicolas Hervé • Hichem Houissa • Julien Law-To

Overview • Objectives • Results and Contributions • Applications and Grants • Positioning • Future objectives

Objectives Design and Develop New Methods for Visual Information Retrieval by Content • Visual content indexing • Visual appearance modeling • Constructing efficient indexes for minimizing query cost • Interactive browsing, querying and retrieval • Similarity learning • Clustering techniques • Relevance feedback: learning from user interaction • Combine keyword-annotation search (when available) with visual-content search

Key Issues • Fidelity of physical-content descriptors to visual appearance • Numerical gap vs. semantic gap • Rich user expression: • Partial visual query formulation focused on user interest (region-based or point-based) • Subjective preference by relevance feedback mechanism • Mental image search and “page zero” problem • Smart navigation • Cross-media indexing and retrieval

General Methodological Issues • Image content description: • analysis, segmentation; • considering specific and generic content • Learning from few examples: • Active learning for efficient personalization mechanism • Semi-supervised clustering • Adaptive Clustering (interactive SVM-based refinement) • Information theory: Mental Image search

Overview • Objectives • Results and Contributions • Visual Content Description • Clustering Methods • Relevance Feedback Mechanism • Mental Image Search • Applications and Grants • Positioning • Future Objectives

Visual Content Description • Generic content: • Global image signature: combined color-structure signature (MMCBIR 01, LNCS 05), shape signature (ICIP 05), 3D signature • Local image description: region-based (JVLC 04), color point-based (CBAIVL/CVPR 01) • Specific content: • Face detection (IJCV 01, JMLR 05) • Face recognition (Biometric WS/ECCV 02) • Fingerprint recognition (ACCV 02) • IKONA search engine demo available: http://www-rocq.inria.fr/imedia/ikona.html

Numerical Gap / Fidelity vs. Weakness of Signatures • Basic color histogram • Local color activity descriptor (before combination with shape and texture descriptors)

Specific Content Image Database • Face recognition: dynamic programming on local entropy map features, WBA/ECCV 2002 (LNCS)

Coarse-to-Fine Strategy for Face Detection • Nested partitions of the set of possible poses (IJCV 01) • Hierarchy of SVM classifiers (JMLR 05)

Local Description of the Image • Point of interest extraction • Region segmentation • Region-based query • Point-based query

Region-based Indexing and Retrieval • User interest selection (visual query): lavender regions regardless of the background information • New coarse segmentation + fine region description • Introduction of the ADCS signature + generalized quadratic distance (JVLC 2004): $d_{quad}(X,Y)^2 = \sum_{i,j=1}^{n_X} x_i x_j \, a_{c_i^X c_j^X} + \sum_{i,j=1}^{n_Y} y_i y_j \, a_{c_i^Y c_j^Y} - 2 \sum_{i=1}^{n_X} \sum_{j=1}^{n_Y} x_i y_j \, a_{c_i^X c_j^Y}$
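As a rough illustration of the generalized quadratic distance named on this slide, the Python sketch below compares two region signatures that each carry their own list of colors and weights. The similarity a(c, c') = 1 - ||c - c'|| / d_max is an assumption borrowed from common quadratic-form histogram distances, not necessarily the definition used with the ADCS signature, and all function and variable names are hypothetical.

```python
import math

def color_similarity(c1, c2, d_max):
    """Assumed similarity a = 1 - dist/d_max between two colors (illustrative choice)."""
    return 1.0 - math.dist(c1, c2) / d_max

def quadratic_distance_sq(sig_x, sig_y, d_max):
    """Squared generalized quadratic distance between two signatures.

    Each signature is a list of (color, weight) pairs; the two signatures
    may use different colors and different numbers of colors.
    """
    def cross(sa, sb):
        # Weighted sum of color similarities over all pairs of entries
        return sum(wa * wb * color_similarity(ca, cb, d_max)
                   for ca, wa in sa for cb, wb in sb)
    return cross(sig_x, sig_x) + cross(sig_y, sig_y) - 2.0 * cross(sig_x, sig_y)

# Toy signatures in RGB (values chosen only for the example)
d_max = math.dist((0, 0, 0), (255, 255, 255))
lavender = [((150, 120, 200), 0.7), ((60, 140, 60), 0.3)]
same = list(lavender)
other = [((200, 50, 50), 1.0)]

print(quadratic_distance_sq(lavender, same, d_max))
print(quadratic_distance_sq(lavender, other, d_max))
```

Because each signature lists only the colors actually present in its region, the cross term runs over n_X times n_Y pairs instead of a full fixed-size histogram.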

Precise Search by Local Color Invariant Descriptors (CVPR/CBAIVL 01) • Optimal order of color differential invariants • Robustness to JPEG coding • Color constancy

Overview • Objectives • Results and Contributions • Visual Content Description • Clustering Methods • Relevance Feedback Mechanism • Mental Image Search • Applications and Grants • Positioning • Future Objectives

Clustering Methods • Context: unknown number of clusters, competitive agglomeration approaches • Application: image database categorization, image segmentation • Contributions: • Adaptive robust clustering (ICPR 02): noise cluster and cluster density/shape adapting • Entropy regularization and extension to non-linearly separable data (IEEE Fuz. Sys. 05) • Active semi-supervised learning (MIR05, IEE VISP 05)

Active Semi-Supervised Categorization • Learning from few examples: • Fully automatic categories may not reflect user expectations • User constraints indicate how the similarity space differs from the feature space • New clustering objective function that takes into account the violation cost of “must-link” and “cannot-link” constraints (IEE Vision, Image & Signal Processing, to appear)
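The idea of charging a cost for violated pairwise constraints can be sketched as follows. This is a simplified illustration with hard cluster assignments; it is not the objective function of the cited paper, which this slide does not spell out. All names and the `weight` parameter are hypothetical.

```python
def constraint_violation_cost(labels, must_link, cannot_link, weight=1.0):
    """Penalty for violated pairwise constraints under a hard assignment.

    labels: dict item -> cluster id
    must_link / cannot_link: iterables of (item_a, item_b) pairs
    """
    # A must-link pair is violated when its items fall in different clusters
    violated_must = sum(1 for a, b in must_link if labels[a] != labels[b])
    # A cannot-link pair is violated when its items share a cluster
    violated_cannot = sum(1 for a, b in cannot_link if labels[a] == labels[b])
    return weight * (violated_must + violated_cannot)

def penalized_objective(sq_dists, labels, must_link, cannot_link, weight=1.0):
    """Sum of squared distances to prototypes plus the constraint penalty.

    sq_dists: dict item -> squared distance to its assigned cluster prototype
    """
    return sum(sq_dists.values()) + constraint_violation_cost(
        labels, must_link, cannot_link, weight)

labels = {"img1": 0, "img2": 0, "img3": 1}
must_link = [("img1", "img3")]    # violated: the items are in different clusters
cannot_link = [("img1", "img2")]  # violated: the items are in the same cluster
cost = constraint_violation_cost(labels, must_link, cannot_link, weight=0.5)
print(cost)
```

The weight sets the trade-off between fitting the feature space and respecting the user's constraints.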

Active Semi-Supervised Categorization • Active selection of constraints: identifying the ambiguous data items with weak membership • Supervision effort • Identifying clusters that are not compact and are poorly separated from their neighbors • Identifying the frontier of the least well separated cluster using the fuzzy hypervolume, where C_k is the covariance matrix of cluster k

Illustration (figure: must-link and cannot-link constraints between Class 1, Class 2, Class 3 and Class 4) • Scientific databases: gene expression studies • Plants with long stems and round leaves • Textured plants, … • Generalist databases: applicable to video keyframes for smart video abstracts

Overview • Objectives • Results and Contributions • Visual Content Description • Clustering Methods • Relevance Feedback Mechanism • Mental Image Search • Applications and Grants • Positioning • Future Objectives

Relevance Feedback Mechanism Online Personalization of Retrieval Results • Example: search for Cézanne paintings (positive examples, negative examples) • Selection strategy? • Most informative images • Most similar images

Active Relevance Feedback Framework Contribution to Components of RF Mechanism: • Learner: kernels inducing insensitivity to the scale of the data in the feature vector space • Selector: active learning selection criterion that minimizes the redundancy between the samples • SVM-based decision function • select least redundant (orthogonal) items among the most ambiguous items • User: consistent annotation? Extensive study of user strategies [MIR04, MIR05, AVIVDiLib'05] [ACM Multimedia journal (under revision)]
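A selector of this kind might look like the sketch below: it keeps the items whose decision values are closest to zero (most ambiguous), then greedily picks those least redundant with the ones already chosen. The greedy strategy, the use of plain cosine similarity in place of a kernel, and all names are assumptions made for illustration, not the team's published criterion. Feature vectors are assumed to be non-zero.

```python
import math

def cosine(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def select_for_feedback(items, scores, pool_size, n_select):
    """Pick images to show the user for annotation.

    items: dict id -> feature vector
    scores: dict id -> decision value f(x) of the current classifier
    """
    # 1. Keep the most ambiguous items (decision value closest to 0)
    pool = sorted(items, key=lambda i: abs(scores[i]))[:pool_size]
    selected = [pool[0]]
    # 2. Greedily add the candidate that is closest to orthogonal
    #    to everything already selected
    while len(selected) < min(n_select, len(pool)):
        rest = [i for i in pool if i not in selected]
        best = min(rest, key=lambda i: max(abs(cosine(items[i], items[s]))
                                           for s in selected))
        selected.append(best)
    return selected

items = {"a": (1.0, 0.0), "b": (0.9, 0.1), "c": (0.0, 1.0), "d": (1.0, 1.0)}
scores = {"a": 0.05, "b": -0.1, "c": 0.2, "d": 1.5}
picked = select_for_feedback(items, scores, pool_size=3, n_select=2)
print(picked)
```

In this toy example, "b" is more ambiguous than "c" but nearly parallel to "a", so the selector prefers "c" as the second item.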

Overview • Objectives • Results and Contributions • Visual Content Description • Clustering Methods • Relevance Feedback Mechanism • Mental Image Search • Applications and Grants • Positioning • Future Objectives

Mental Picture Retrieval • Context: No starting image example or keyword • A person has a picture “in mind”, e.g., a • face • painting • Scene • Problem: How to reach the target? • Bayesian framework • Composition from Visual Thesaurus

Bayesian Framework • Components: • Answer model: discover answer models that match human behavior • Display model (optimization problem): discover approximations to the optimal display • Each display should capture as much information as possible about the target from the user => the idea is to maximize the mutual information between target and answer • Mutual information: reduction in uncertainty of r.v. Y due to r.v. X
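The quantity being maximized can be made concrete with a small sketch. It computes the mutual information I(X;Y) = H(Y) - H(Y|X) from a joint probability table, where X could stand for the target and Y for the user's answer. The toy tables and names are hypothetical and do not come from the IMEDIA system.

```python
import math

def mutual_information(joint):
    """I(X;Y) in bits from a joint probability table joint[x][y]."""
    px = [sum(row) for row in joint]          # marginal of X (rows)
    py = [sum(col) for col in zip(*joint)]    # marginal of Y (columns)
    mi = 0.0
    for x, row in enumerate(joint):
        for y, p in enumerate(row):
            if p > 0:  # zero-probability cells contribute nothing
                mi += p * math.log2(p / (px[x] * py[y]))
    return mi

# The answer fully reveals which of two targets is meant
informative = [[0.5, 0.0], [0.0, 0.5]]
# The answer is independent of the target
uninformative = [[0.25, 0.25], [0.25, 0.25]]

print(mutual_information(informative))
print(mutual_information(uninformative))
```

A display whose answers are independent of the target yields zero bits, which is why the choice of display is posed as an optimization problem.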

Mental face retrieval: Complications • Mental matching involves human memory, perception and opinions. • Images are not indexed by semantic content, but rather by low-level features (“semantic gap”). • Face recognition is easier, yet unsolved. • Sparse literature. Best Paper Award A-V-based Biometric Person Authentication (AVBA'2005) Joint work with Sagem Corp.

Query by “Visual Words” Composition • Visual Thesaurus: set of categories of similar regions • Example query: “Cityscape”? • Retrieved images • Rejected images: landscapes

Query by “Visual Words” Composition • Query composition interface (e.g., Category 48, Category 23) => The Visual Thesaurus = summary of region categories (set of cluster prototypes) [MTAP 05]

Symbolic Indexing “Inverted visual files” in MTAP 05
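One way to picture an inverted visual file is as a map from each region category (visual word) to the images that contain it, so that a composed query reduces to set operations. The sketch below is a generic inverted-index illustration under that assumption, not the structure described in MTAP 05; the category numbers and image ids are made up, and `query` expects at least one category in `present`.

```python
from collections import defaultdict

def build_inverted_file(image_categories):
    """Map each region category to the set of images containing it."""
    inverted = defaultdict(set)
    for image_id, categories in image_categories.items():
        for cat in categories:
            inverted[cat].add(image_id)
    return inverted

def query(inverted, present, absent=()):
    """Images containing every category in `present` and none in `absent`."""
    result = set.intersection(*(inverted.get(c, set()) for c in present))
    for c in absent:
        result -= inverted.get(c, set())
    return result

# Hypothetical index: image id -> region categories found in the image
image_categories = {
    "img1": {48, 23},
    "img2": {48},
    "img3": {23, 7},
}
inverted = build_inverted_file(image_categories)
hits = query(inverted, present=[48], absent=[23])
print(hits)
```

Query cost then depends on the sizes of the posting sets involved rather than on a scan of every image descriptor.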

Additional Results • Cross-modal Indexing and Retrieval • Copy detection and more generally semantic behavior of local descriptors for selective video content retrieval • Kernels for similarity learning • Extensive study of user strategies in relevance feedback. • 3D model indexing and retrieval, 2D shape descriptors

3D model retrieval

Applications and Grants • Scientific content collections: • Remote sensing images (ACI QuerySat – CNES, IGN) • Biodiversity images (ACI Biotim – INRA/NASC, IRD) • Audio-visual content: • TV news (RIAM Mediaworks – TF1 Tv; INA) • Personal and prof. content (IP-FP6 AceMedia) • Art and Design: Alinari collection • Security application: • Pedophilia images (Central Judiciary Police Dep. Europ. STOP) • Biometry (Face - Sagem, fingerprints – Thales)

Other Grants • NoE-FP6 Muscle • Important involvement (WP leader, NoE deputy scientific coordinator, steering committee) • NoE-FP6 Delos • PAI Galilée (recognition for video-surveillance with Modena Univ.) • Associated-Team ViMining with NII • RNRT - RECIS (FT R&D, INSA, NF)

IKONA Search Engine Images courtesy of Alinari (Oldest private European art photo archive)

keyword: “building” [MIR05] Relevance of hybrid signatures: visual + semantic information

Coastal area with visible boats Starting point for RF

Gene expression studies on “Arabidopsis” Jointly with INRA Images courtesy of NASC (Nottingham Arabidopsis Stock Centre)

Leaf Identification Smithsonian database Shape descriptor [ICIP05] Images courtesy of Peter Belhumeur (Columbia Univ. NY)

Copy detection Detected copy False Alarm

Security Application: Criminal Investigation within Pedophilia Images Ikona prototype for “Ministère de l’Intérieur” Central Judiciary Police Department within EC « STOP »

USER INTERFACE Annotate display given a target face

INRIA Positioning Wrt. INRIA’s strategic goals (2nd): Developing multimedia data and information processing INRIA projects: • ARIANA: probabilistic and variational image analysis for earth observation, joint ACI QuerySat on remote sensing image indexing, Muscle NoE • LEAR: focus on object recognition involving offline learning methods (learning datasets) while we work on information retrieval and develop different learning methods from few examples (on-line) for image clustering and search personalization - complementary, joint AceMedia FP6 • VISTA: Video indexing – complementary, NoE Muscle, MediaWorks • TEXMEX (SymC): Pluri-disciplinary project (NLP, ImageP., DB), we have a joint interest in feature space structuring and hybrid indexing. (Texmex: audio, video, NLP, visual…); AceMedia and NoE Muscle

National Positioning • Telecom Paris – SIP: Remote sensing indexing, partner within ACI QuerySat, 3D indexing • INT ARTEMIS: 2D and 3D indexing • Ecole Centrale Lyon (L. Chen): face detection and recognition, TechnoVision IV2 • INSA Lyon IRIS (J-M Jolion): local descriptors • ENSEA ETIS: Relevance feedback, Muscle NoE • Ecole des Mines (JP. Vert): kernel design

International Positioning Very active domain, non-exhaustive list below • T. Huang (Urbana-Champaign), Ed. Chang (U.Cal.Santa-Barbara): relevance feedback • A. Smeulders (ISIS group U. Amsterdam), D. Lowe (Univ. BC), A. Zisserman (Oxford), H. Bischof (Tech. Univ Graz): point-based features • J. Wang (Penn State Univ.): region-based retrieval • P. Belhumeur (Columbia Univ.): leaf species identification and shape descriptors • S. Satoh (NII – Japan): Associated-team “ViMining”, saliency detection, face detection, image and text-based retrieval • R. Cucchiara (Univ. Modena): PAI Galilée, biometry and video surveillance, 3D indexing • A. Del Bimbo (Univ. Florence): NoE Delos, 3D indexing • H. Frigui (Univ. NSF-INRIA): semi-supervised clustering • T. Tan (CASIA): Liama project

Overview • Objectives • Results and Contributions • Applications and Grants • Positioning • Future Objectives

Future Scientific Objectives • Visual content description • Saliency investigation for selective content retrieval • Geometric consistency of local descriptors • Specific content: 2D/3D shape (biodiversity), extension of face detection methods to be invariant to viewpoint • Efficient search in large collections of images: multidimensional data structure indexing (example: multiple queries processing)

Future Scientific Objectives (cont.) • Mental image search: • improved models for perceptual similarity for a higher degree of coherence between system models and actual human behavior • More efficient visual thesaurus construction methods (hierarchical description with relational clustering) • Toward scalable methods: semi-supervised clustering, Relevance Feedback • Hybrid image and text indexing and retrieval: • extension to semi-annotated databases, • dynamic weighting of text and visual rankings
