This participation report highlights the usage of graph-based segmentation and biclustering for image retrieval in the Photo and Object Retrieval tasks of ImageCLEF. The system features a 3-level segmentation, a 35-feature segment vector, and various post-processing techniques to enhance retrieval accuracy. The biclustering algorithm is applied to improve feature matching and classification for both photo and object retrieval.
Participation report: Photo and Object Retrieval task
Bálint Daróczy
joint work with András Benczúr, István Bíró, Mátyás Brendel, Károly Csalogány, Dávid Siklósi
Data Mining and Web Search Group, Computer and Automation Research Institute, Hungarian Academy of Sciences
Overview
• Common CBIR system
  - 3-level segmentation, 35-feature segment vector
  - Finding similar segments
• ImageCLEF Photo Task
  - Cross-modal retrieval by text and image feature biclustering
• ImageCLEF Object Retrieval Task
  - Re-segmented pre-classified images for object retrieval
CBIR Segmentation
• Pre-segmentation
  - Resize to 1024x1024 (OpenCV)
  - Smooth to eliminate noise (OpenCV)
  - Downsizing with Gaussian kernel (three-level Gaussian-Laplacian pyramid)
  - Intra- and inter-level thresholds for joining pixels
• Graph-based method [Felzenszwalb, Huttenlocher]
  - Undirected weighted graph over neighboring pixels
  - Bottom-up clustering with dynamic thresholds
  - Efficient heuristic solution, better than min-cut, close to normalized cut
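The bottom-up clustering with dynamic thresholds can be sketched as follows. This is a minimal pure-Python illustration in the style of Felzenszwalb and Huttenlocher on a single grayscale grid, not the system's own code: the name `segment_graph`, the 4-neighbour grid, the absolute-difference edge weight and the constant `k` are assumptions, and the OpenCV pre-processing and pyramid levels are left out.

```python
def segment_graph(image, k=1.0):
    """Merge pixels of a 2-D list of gray values into segments.

    Hypothetical helper: edge weight = absolute intensity difference,
    k = assumed scale constant of the dynamic threshold.
    """
    h, w = len(image), len(image[0])
    # Undirected weighted graph over 4-connected neighbouring pixels.
    edges = []
    for y in range(h):
        for x in range(w):
            if x + 1 < w:
                edges.append((abs(image[y][x] - image[y][x + 1]),
                              y * w + x, y * w + x + 1))
            if y + 1 < h:
                edges.append((abs(image[y][x] - image[y + 1][x]),
                              y * w + x, (y + 1) * w + x))
    edges.sort()

    parent = list(range(h * w))   # union-find forest
    size = [1] * (h * w)          # pixels per component
    internal = [0.0] * (h * w)    # largest edge weight inside a component

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for weight, a, b in edges:
        ra, rb = find(a), find(b)
        if ra == rb:
            continue
        # Dynamic threshold: internal difference plus k / component size.
        limit = min(internal[ra] + k / size[ra], internal[rb] + k / size[rb])
        if weight <= limit:
            parent[rb] = ra
            size[ra] += size[rb]
            # Edges arrive in ascending order, so this is the new maximum.
            internal[ra] = weight

    return [[find(y * w + x) for x in range(w)] for y in range(h)]
```

A small `k` corresponds to the low-threshold regime (many small segments); a large `k` lets components keep growing across weak edges.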
CBIR Post-processing
• Graph-based segmentation scenario:
  - Low initial thresholds: small relevant segments disappear
  - High initial thresholds: too many segments
• Solution:
  - Sobel gradient image for selecting important edges
• Result:
  - Number of graph-based segments: on average 1000+ per image
  - After post-processing: 100-
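One way the Sobel image could drive post-processing is sketched below. The slides only say that the Sobel gradient selects important edges; scoring the shared boundary of two segments by its mean gradient magnitude, and the names `sobel_magnitude` and `boundary_strength`, are assumptions made for illustration.

```python
import math


def sobel_magnitude(image):
    """Sobel gradient magnitude of a 2-D list of gray values (border left at 0)."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (image[y - 1][x + 1] + 2 * image[y][x + 1] + image[y + 1][x + 1]
                  - image[y - 1][x - 1] - 2 * image[y][x - 1] - image[y + 1][x - 1])
            gy = (image[y + 1][x - 1] + 2 * image[y + 1][x] + image[y + 1][x + 1]
                  - image[y - 1][x - 1] - 2 * image[y - 1][x] - image[y - 1][x + 1])
            out[y][x] = math.hypot(gx, gy)
    return out


def boundary_strength(labels, grad, a, b):
    """Mean gradient over pixel pairs where segment a touches segment b.

    Hypothetical criterion: a weak boundary suggests the two segments
    could be merged, a strong one marks an important edge.
    """
    h, w = len(labels), len(labels[0])
    total, count = 0.0, 0
    for y in range(h):
        for x in range(w):
            for ny, nx in ((y, x + 1), (y + 1, x)):
                if ny < h and nx < w and {labels[y][x], labels[ny][nx]} == {a, b}:
                    total += grad[y][x] + grad[ny][nx]
                    count += 2
    return total / count if count else 0.0
```

Merging neighbouring segments whose boundary strength falls below a chosen cut-off would reduce the segment count while keeping segments separated by strong edges.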
[Figure panels: Original picture; Sobel image; After graph-based method; After post-processing]
CBIR Feature vectors
• 35-dimensional real-valued vector
  - Size in pixels (every picture must be resized)
  - Average color in RGB space, 8 bits for every channel, 24 bits overall
  - RGB color histogram, five samples for every 8-bit channel, 8x3x5 bits overall
  - Shape representation: a grayscale image with a resolution of 4x4
• Vector layout: Size | Average RGB | Histogram R | Histogram G | Histogram B | Shape 4x4
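The 35 dimensions add up as 1 (size) + 3 (average RGB) + 15 (three 5-bin histograms) + 16 (4x4 shape). A minimal sketch of assembling such a vector follows; the name `segment_features`, the input format, the histogram normalisation and the way the 4x4 shape image is derived (pixel counts in a grid over the bounding box, scaled by the fullest cell) are assumptions, since the slides do not specify them.

```python
def segment_features(pixels):
    """Build a 35-value vector for one segment.

    pixels: list of (x, y, r, g, b) tuples with 8-bit color values
    (hypothetical input format).
    """
    n = len(pixels)
    feats = [float(n)]                       # 1 value: size in pixels

    for c in (2, 3, 4):                      # 3 values: average R, G, B
        feats.append(sum(p[c] for p in pixels) / n)

    for c in (2, 3, 4):                      # 15 values: 5 bins per channel
        hist = [0.0] * 5
        for p in pixels:
            hist[min(p[c] * 5 // 256, 4)] += 1.0 / n
        feats.extend(hist)

    # 16 values: 4x4 grayscale shape image over the bounding box.
    xs = [p[0] for p in pixels]
    ys = [p[1] for p in pixels]
    x0, y0 = min(xs), min(ys)
    bw, bh = max(xs) - x0 + 1, max(ys) - y0 + 1
    grid = [0.0] * 16
    for p in pixels:
        gx = (p[0] - x0) * 4 // bw
        gy = (p[1] - y0) * 4 // bh
        grid[gy * 4 + gx] += 1.0
    peak = max(grid)
    feats.extend(v / peak for v in grid)
    return feats
```

In practice the components would probably need rescaling to comparable ranges before a distance is taken over the whole vector; the slides do not say how this was done.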
Biclustering for ImageCLEF Photo
• Matrix of image segments and annotation terms
[Figure: matrix of segments against tf.idf term weights (sea, sky, …, water, tower, building) and visual features]
Biclustering for ImageCLEF Photo
• Result of the biclustering procedure
[Figure: the same matrix after biclustering; term labels: sea, building, …, water, tower, building; visual features]
Biclustering for ImageCLEF Photo
• Term and segment cluster pair weights
[Figure: term clusters and segment clusters with cluster-cluster correspondence; term labels: sea, building, …, water, tower, building; visual features]
Settings for the Biclustering algorithm
• Kullback-Leibler distance on tf.idf, an information-theoretic distance between distributions
• Euclidean distance over the visual features
• Row-column iterated EM, 16 iterations, 1000 segment clusters and 500 word clusters
Photo Retrieval method
• Query term clusters selected
• Corresponding image segment clusters determined
• Query image segments with low correspondence weight discarded
• CBIR run with remaining segments
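The two distances named in the settings can be written down as below. This is a sketch only: the helper names, the `eps` guard against zero probabilities and the generic `nearest_cluster` assignment step are assumptions, and the full row-column EM iteration is not reproduced.

```python
import math


def normalize(weights):
    """Turn non-negative tf.idf weights into a probability distribution."""
    total = sum(weights)
    return [w / total for w in weights]


def kl_divergence(p, q, eps=1e-12):
    """Kullback-Leibler divergence D(p || q); eps (assumed) avoids log of 0."""
    return sum(pi * math.log(pi / max(qi, eps))
               for pi, qi in zip(p, q) if pi > 0)


def euclidean(a, b):
    """Euclidean distance between two visual feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_cluster(row, centroids, distance):
    """Index of the centroid closest to row under the given distance."""
    return min(range(len(centroids)), key=lambda i: distance(row, centroids[i]))
```

For example, `nearest_cluster(normalize(tfidf_row), term_centroids, kl_divergence)` would assign a row on the text side, and the same call with `euclidean` would do so on the visual side; `tfidf_row` and `term_centroids` are placeholder names.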
ImageCLEF Photo Results
• Term match counts the query words that occur in the annotation (description downweighted, location upweighted)
• tf.idf breaks ties; it improves results when no other information is available
• Tie breaking by image biclustering is superior to tf.idf
• No improvement for other combinations of text and image (no query expansion, no feedback)
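Ranking by term match with a secondary tie-breaking score might look like the sketch below. The field names, the concrete weights and the document structure are hypothetical; the slides only state that the description field is downweighted, the location field upweighted, and that tf.idf or the biclustering score breaks ties.

```python
def term_match_score(fields, query_terms, field_weights):
    """Weighted count of query terms found in each annotation field."""
    score = 0.0
    for name, text in fields.items():
        words = set(text.lower().split())
        hits = sum(1 for term in query_terms if term in words)
        score += field_weights.get(name, 1.0) * hits
    return score


def rank(docs, query_terms, field_weights, tie_break):
    """Sort by term match first, then by the tie-breaking score.

    tie_break maps a document id to a secondary score, e.g. tf.idf or
    an image-based biclustering score (both assumed to be precomputed).
    """
    return sorted(
        docs,
        key=lambda d: (-term_match_score(d["fields"], query_terms, field_weights),
                       -tie_break[d["id"]]),
    )
```

Because the term match score takes few distinct values, many images tie, which is why the choice of secondary score matters.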
ImageCLEF Object Retrieval Task
[Figure labels: Class of Query Image; Pre-classified Images; VOC2007; Query Images; Original Training Set]
ImageCLEF Object
• Basic assumptions on pre-classified images
  - Sample objects can have different shape and color
  - Pre-segmentation made by humans
  - Abstract classes
• Method
  - Feature vector of pre-classified objects (VOC2007 dataset)
  - Search for the most similar object
• Key idea
  - Re-segment objects to improve similarity
ImageCLEF Object: budapest-acad315
• Segment the query image
• Classify into the class of the most similar sample segment
• Drawback
  - Granularity of automatic and human segmentation is different
• MAP results
  - AP results with completely annotated database
  - MAP: 0.020
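Classification by the most similar sample segment amounts to a nearest-neighbour search, sketched below. Euclidean distance over the feature vectors and the rule that the image takes the class of its single best-matching segment are assumptions; the function names are hypothetical.

```python
import math


def classify_segment(query_vec, samples):
    """Return (label, distance) of the nearest sample segment.

    samples: list of (class_label, feature_vector) pairs.
    """
    best_label, best_dist = None, math.inf
    for label, vec in samples:
        dist = math.dist(query_vec, vec)  # Euclidean distance (assumed)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label, best_dist


def classify_image(query_segments, samples):
    """Class of the query segment that matches some sample most closely."""
    matches = (classify_segment(q, samples) for q in query_segments)
    return min(matches, key=lambda m: m[1])[0]
```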
ImageCLEF Object: budapest-acad314
• Re-segment the pre-classified images
• New segments from classified objects
• Class representatives formed by segments with over 80% overlap of the training objects
• Higher similarities, more adequate classification
• MAP results
  - AP results with completely annotated database
  - MAP: 0.031
  - For the class of bicycles: MAP: 0.283
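Selecting class representatives by overlap can be sketched as follows. Measuring overlap as the fraction of a segment's own pixels that fall inside the training object, and representing both as sets of pixel coordinates, are assumptions; the slides give only the 80% figure.

```python
def overlap_fraction(segment, obj):
    """Share of the segment's pixels that lie inside the training object.

    segment, obj: sets of (x, y) pixel coordinates (assumed representation).
    """
    return len(segment & obj) / len(segment) if segment else 0.0


def class_representatives(segments, obj, threshold=0.8):
    """Keep the automatic segments that overlap the object by more than threshold."""
    return [s for s in segments if overlap_fraction(s, obj) > threshold]
```

Segments kept this way have the same granularity as the automatically segmented query images, which is the mismatch noted as a drawback of the budapest-acad315 run.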