Matthew Simpson, Md Mahmudur Rahman, Dina Demner-Fushman, Sameer Antani, George R. Thoma

Text- and Content-based Approaches to Image Retrieval for the ImageCLEF2009 Medical Retrieval Track
Matthew Simpson, Md Mahmudur Rahman, Dina Demner-Fushman, Sameer Antani, George R. Thoma
Lister Hill National Center for Biomedical Communications, National Library of Medicine, NIH, Bethesda, MD, USA
CLEF 2009

Retrieval tasks and approaches
• ITI project long-term goal
  • Find a way to combine image and text features so that the whole is greater than the sum of its parts
• Ad-hoc image retrieval
  • Text-based
  • Image content-based
  • Automatic mixed
  • Relevance feedback mixed
• Case-based document retrieval
  • Text-based

Text-based approach
• Indexing:
  • Create image documents for ad-hoc image retrieval
  • Create surrogate documents for case-based retrieval
  • Index using Essie
    • term normalization using the SPECIALIST Lexicon
    • query expansion based on UMLS synonymy
    • term weighting based on location in the document
• Phrase-based search

Text documents
• Image document
  • Title and caption provided by organizers
  • Mention extracted from paper
  • MEDLINE citation (abstract + MeSH)
  • PICO frame of the caption + image modality (structured caption summary)
• Surrogate document
  • MEDLINE citation
  • Caption, mention, and structured caption summary of each image contained in the article

Text retrieval
• PICO-based structured query and case representation
  <topicID>19</topicID>
  <description>Crohn's disease CT</description>
  <modality essieExp="false">ct</modality>
  <modSyn>c.a.t.</modSyn>
  <modSyn>cat</modSyn>
  <modSyn>computerised axial tomography</modSyn>
  …
  <cond essieExp="true">Crohn's disease</cond>
  <condPN>crohn disease</condPN>
  <condSyn>Regional enteritis</condSyn>
  <condSyn>eleocolitis</condSyn>
  <condSyn>Cicatrizing enterocolitis</condSyn>
  <condSyn>granulomatous enteritis</condSyn>
  <condSyn>INFLAMMATORY BOWEL DISEASE</condSyn>
  <condSyn>regional enterocolitis</condSyn>
  …
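To make the structured query concrete, here is a minimal sketch of how such a topic could be read into modality and condition term lists. The `<topic>` wrapper element, the function names, and the generic boolean output are assumptions made for illustration; the slides show only the child elements and do not describe how the query is passed to Essie, so the string built below is not Essie syntax.

```python
import xml.etree.ElementTree as ET

# Hypothetical <topic> wrapper: the slide shows only the child elements,
# and the synonym lists are shortened here.
TOPIC_XML = """
<topic>
  <topicID>19</topicID>
  <description>Crohn's disease CT</description>
  <modality essieExp="false">ct</modality>
  <modSyn>c.a.t.</modSyn>
  <modSyn>cat</modSyn>
  <cond essieExp="true">Crohn's disease</cond>
  <condPN>crohn disease</condPN>
  <condSyn>Regional enteritis</condSyn>
</topic>
"""


def parse_topic(xml_text):
    """Collect the modality and condition terms of one structured topic."""
    root = ET.fromstring(xml_text)

    def texts(tag):
        return [el.text.strip() for el in root.findall(tag) if el.text]

    return {
        "topic_id": root.findtext("topicID"),
        "modality_terms": texts("modality") + texts("modSyn"),
        "condition_terms": texts("cond") + texts("condPN") + texts("condSyn"),
        # essieExp flags whether the search engine should expand the term itself.
        "expand": {
            el.tag: el.get("essieExp") == "true"
            for el in root
            if el.get("essieExp") is not None
        },
    }


def to_boolean_query(topic):
    """Build a generic boolean string (illustrative only, not Essie syntax)."""
    groups = []
    for key in ("modality_terms", "condition_terms"):
        quoted = ['"%s"' % term for term in topic[key]]
        groups.append("(" + " OR ".join(quoted) + ")")
    return " AND ".join(groups)


topic = parse_topic(TOPIC_XML)
query = to_boolean_query(topic)
```

Grouping synonyms with OR inside a frame slot and joining slots with AND is one simple way to use the pre-expanded synonym lists; the real system may combine them differently.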

CBIR - Image feature representation
• Concepts - color and texture patches from local image regions
• Low-level global features
  • Color (Color Layout Descriptor, MPEG-7)
  • Edge (histogram of local edge distribution and direction)
  • Texture (grey-level co-occurrence matrix)
  • Average grey level (256-dimensional vector of blocks in image normalized to grey-level 64x64)
  • Lucene (LIRE)-based Color Edge Direction Descriptor and Fuzzy Color Texture Histogram
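As an illustration of the average grey-level feature, the sketch below computes a 256-dimensional vector from a 64x64 grey-level image. It assumes the image is divided into a 16x16 grid of 4x4-pixel blocks, which is one way to arrive at 256 dimensions; the slides do not state the block size, and the function name is hypothetical.

```python
def block_mean_feature(image, block=4):
    """Mean grey level of each block of a square grey-level image.

    `image` is a list of rows of pixel intensities. With a 64x64 image and
    4x4 blocks (an assumption, not stated in the slides) the result has
    16 * 16 = 256 entries, in row-major block order.
    """
    size = len(image)
    if size % block != 0 or any(len(row) != size for row in image):
        raise ValueError("expected a square image divisible into blocks")
    feature = []
    for top in range(0, size, block):
        for left in range(0, size, block):
            total = sum(
                image[top + i][left + j]
                for i in range(block)
                for j in range(block)
            )
            feature.append(total / (block * block))
    return feature


# Synthetic 64x64 gradient image standing in for a rescaled medical image.
image = [[(r + c) % 256 for c in range(64)] for r in range(64)]
feature = block_mean_feature(image)
```

Rescaling every image to the same small size first makes the vectors directly comparable regardless of the original resolution.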

Image similarity computation
• Category-specific
  • Determine image category (training set of 5000 images manually assigned to 32 mutually exclusive categories)
  • Use category-specific weights in linear similarity matching
• Relevance feedback
  • Feature weights updated using images judged relevant
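The linear similarity matching above can be sketched as a weighted sum of per-feature similarities, with the weights chosen by the predicted image category. Everything specific below is an assumption for illustration: the category names, the weight values, the distance-to-similarity mapping, and the feedback update rule (the slides say only that weights are updated from images judged relevant).

```python
import math

# Hypothetical per-category feature weights; each set sums to 1.
CATEGORY_WEIGHTS = {
    "xray": {"color": 0.1, "edge": 0.5, "texture": 0.4},
    "microscopy": {"color": 0.5, "edge": 0.2, "texture": 0.3},
}


def feature_similarity(a, b):
    """Map Euclidean distance to a similarity in (0, 1]."""
    return 1.0 / (1.0 + math.dist(a, b))


def combined_similarity(query, candidate, weights):
    """Weighted linear combination of per-feature similarities."""
    return sum(
        w * feature_similarity(query[name], candidate[name])
        for name, w in weights.items()
    )


def update_weights(query, relevant_images, feature_names):
    """One simple feedback heuristic (an assumption, not the authors' rule):
    weight each feature by its mean similarity to the relevant images,
    then normalize so the weights sum to 1."""
    mean_sims = {
        name: sum(feature_similarity(query[name], img[name]) for img in relevant_images)
        / len(relevant_images)
        for name in feature_names
    }
    total = sum(mean_sims.values())
    return {name: s / total for name, s in mean_sims.items()}


query = {"color": [0.0, 0.0], "edge": [1.0, 1.0], "texture": [2.0, 2.0]}
candidate = {"color": [3.0, 4.0], "edge": [1.0, 1.0], "texture": [2.0, 2.0]}

score = combined_similarity(query, candidate, CATEGORY_WEIGHTS["xray"])
new_weights = update_weights(query, [candidate], ["color", "edge", "texture"])
```

In this toy example the candidate differs from the query only in color, so the feedback step shifts weight away from color and toward edge and texture.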

Combining text and image
• Based on text search results,
  • Compute mean vector of top 5 retrieved images, use as input to category-specific retrieval
  • Select 3-5 relevant images manually, use as input to category-specific retrieval
  • Re-rank text retrieval results using visual retrieval scores
• Provide feedback using all retrieval results,
  • Expand query using image documents
  • Pad selected relevant images with new retrieval results
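Two of the combination steps above lend themselves to a short sketch: building a visual query from the mean feature vector of the top text hits, and re-ranking the text results by visual score. The function names and the tie-breaking rule are assumptions for illustration; the slides do not specify how the re-ranking is computed, so this version simply sorts the text hits by visual score and keeps the original text order for ties.

```python
def mean_vector(vectors):
    """Element-wise mean of equal-length feature vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def build_visual_query(text_ranked_ids, features, k=5):
    """Mean feature vector of the top-k text-retrieved images."""
    top = [features[image_id] for image_id in text_ranked_ids[:k]]
    return mean_vector(top)


def rerank_by_visual_score(text_ranked_ids, visual_scores):
    """Reorder text hits by visual score, highest first.

    Images without a visual score are treated as 0.0. Python's sort is
    stable, so ties keep their original text ranking.
    """
    return sorted(
        text_ranked_ids,
        key=lambda image_id: -visual_scores.get(image_id, 0.0),
    )


# Toy data: three text hits with 2-dimensional feature vectors.
text_hits = ["img3", "img1", "img2"]
features = {"img1": [1.0, 2.0], "img2": [3.0, 4.0], "img3": [5.0, 6.0]}
visual_scores = {"img1": 0.9, "img2": 0.4, "img3": 0.4}

visual_query = build_visual_query(text_hits, features, k=5)
reranked = rerank_by_visual_score(text_hits, visual_scores)
```

Restricting the re-ranking to images already returned by the text search keeps the result set anchored to the text retrieval and lets the visual scores change only the order.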

Relevance Feedback

Results
• Chart labels: category-specific, RF, text, re-ranked, BRF, RF, RF+QE, case-based, visual, mixed

Image-text search engine

Thank you! Questions?
