Abstract This paper empirically compares nine image dissimilarity measures that are based on distributions of color and texture features, summarizing over 1,000 CPU hours of computational experiments. Ground truth is collected via a novel random sampling scheme for color and via an image partitioning method for texture. Quantitative performance evaluations are given for classification, image retrieval, and segmentation tasks, and for a wide variety of dissimilarity measures. The results demonstrate that selecting a measure on the basis of large-scale evaluation substantially improves the quality of classification, retrieval, and unsupervised segmentation of color and texture images.
Goals • compare distribution-based image dissimilarity measures • evaluate dependency on parameter settings • develop generic and statistically sound benchmarking methodology • examine influence in different applications: classification, retrieval, annotation, and unsupervised segmentation
Image Representation Distributions: • adaptive binning (multivariate histograms) • marginal histograms • cumulative marginal histograms
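To make the marginal and cumulative marginal representations concrete, the following minimal sketch builds one normalized histogram per feature dimension and its running sum. It assumes fixed, equally spaced bins over a shared value range; it does not implement the adaptive binning used for the multivariate histograms. The function names and the `lo`/`hi` range parameters are illustrative, not taken from the slides.

```python
from itertools import accumulate


def marginal_histograms(features, num_bins, lo, hi):
    """Return one normalized histogram per feature dimension.

    features: list of equal-length tuples (one tuple per pixel or sample).
    lo, hi:   assumed value range shared by all dimensions (hypothetical).
    """
    dims = len(features[0])
    width = (hi - lo) / num_bins
    hists = [[0.0] * num_bins for _ in range(dims)]
    for vec in features:
        for r, value in enumerate(vec):
            # Clamp the bin index so out-of-range values land in the end bins.
            b = int((value - lo) / width)
            b = max(0, min(b, num_bins - 1))
            hists[r][b] += 1.0 / len(features)
    return hists


def cumulative(hist):
    """Cumulative histogram: running sum of the bin masses."""
    return list(accumulate(hist))


# Four two-dimensional feature vectors, two bins per dimension.
feats = [(0.1, 0.9), (0.3, 0.9), (0.6, 0.2), (0.9, 0.1)]
marginals = marginal_histograms(feats, num_bins=2, lo=0.0, hi=1.0)
cumulatives = [cumulative(h) for h in marginals]
```

Marginal histograms need only one small histogram per dimension, which is why they remain usable when the feature space is large and the sample is small.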
Image Representation • dissimilarity measures: for multivariate (full) distributions and for marginal distributions • Color: CIELab color space • Texture: Gabor filter responses
Heuristic Dissimilarity Measures • Minkowski distance, e.g. p = 1 [see 8] (Histogram Intersection), [see 9] • Weighted-Mean-Variance (WMV) [see 4]
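The heuristic measures can be sketched as follows. The Minkowski distance is the standard L_p form; the comparison with histogram intersection relies on both histograms being normalized to unit mass, in which case the intersection dissimilarity equals half the L1 distance. The WMV function follows the commonly cited form (absolute differences of per-channel means and standard deviations, each scaled by its spread over the database); the argument names and the way the spreads are supplied are assumptions made for illustration.

```python
def minkowski(f, g, p=1):
    """Minkowski (L_p) distance between two histograms of equal length."""
    return sum(abs(a - b) ** p for a, b in zip(f, g)) ** (1.0 / p)


def histogram_intersection(f, g):
    """Intersection dissimilarity; assumes f and g each sum to 1."""
    return 1.0 - sum(min(a, b) for a, b in zip(f, g))


def wmv(mu_i, sigma_i, mu_j, sigma_j, mu_spread, sigma_spread):
    """Weighted-Mean-Variance distance over filter channels (sketch).

    mu_*/sigma_*: per-channel means and standard deviations of two images.
    mu_spread/sigma_spread: assumed per-channel standard deviations of those
    statistics over the whole database (hypothetical inputs).
    """
    return sum(
        abs(mi - mj) / abs(ms) + abs(si - sj) / abs(ss)
        for mi, si, mj, sj, ms, ss in zip(
            mu_i, sigma_i, mu_j, sigma_j, mu_spread, sigma_spread
        )
    )


f = [0.5, 0.25, 0.25]
g = [0.25, 0.25, 0.5]
l1_distance = minkowski(f, g, p=1)
intersection = histogram_intersection(f, g)
wmv_distance = wmv([1.0], [2.0], [3.0], [2.5], [2.0], [0.5])
```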
Statistical Dissimilarity Measures • Kolmogorov-Smirnov distance (KS) [see 2] • Cramér/von Mises (CvM) • χ²-statistic [see 6]
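As an illustration, the three statistical measures can be written for a single pair of one-dimensional histograms. This is a sketch under stated assumptions: KS and CvM are computed on cumulative histograms (maximum and summed squared discrepancy, respectively), and the χ² statistic compares the first histogram with the bin-wise mean of the two. How the per-dimension values are combined across marginals is not shown, and the function names are illustrative.

```python
from itertools import accumulate


def kolmogorov_smirnov(f, g):
    """Maximum absolute difference between the cumulative histograms."""
    return max(abs(a - b) for a, b in zip(accumulate(f), accumulate(g)))


def cramer_von_mises(f, g):
    """Sum of squared differences between the cumulative histograms."""
    return sum((a - b) ** 2 for a, b in zip(accumulate(f), accumulate(g)))


def chi_square(f, g):
    """Chi-square statistic of f against the mean histogram (f + g) / 2."""
    total = 0.0
    for a, b in zip(f, g):
        mean = (a + b) / 2.0
        if mean > 0:  # empty bins in both histograms contribute nothing
            total += (a - mean) ** 2 / mean
    return total


f = [0.5, 0.5, 0.0]
g = [0.0, 0.5, 0.5]
ks_value = kolmogorov_smirnov(f, g)
cvm_value = cramer_von_mises(f, g)
chi_value = chi_square(f, g)
```

Because KS and CvM operate on cumulative histograms, they are only meaningful for ordered, one-dimensional bins, which is why they apply to marginal rather than full multivariate distributions.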
Information-Theoretic Measures • Kullback-Leibler divergence (KL)[see 5] • Jeffrey divergence (JD)[see 6]
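A minimal sketch of the two information-theoretic measures, assuming normalized histograms. It highlights the practical difference between them: KL is asymmetric and becomes infinite when the second histogram has an empty bin where the first does not, whereas the Jeffrey divergence, written here in its symmetrized form against the mean histogram, stays finite. The function names are illustrative.

```python
import math


def kullback_leibler(f, g):
    """KL divergence of f from g; infinite if g is empty where f is not."""
    total = 0.0
    for a, b in zip(f, g):
        if a > 0:
            if b == 0:
                return math.inf
            total += a * math.log(a / b)
    return total


def jeffrey(f, g):
    """Jeffrey divergence: both histograms compared with their mean."""
    total = 0.0
    for a, b in zip(f, g):
        mean = (a + b) / 2.0
        if a > 0:
            total += a * math.log(a / mean)
        if b > 0:
            total += b * math.log(b / mean)
    return total


f = [0.5, 0.5]
g = [0.25, 0.75]
kl_fg = kullback_leibler(f, g)
kl_gf = kullback_leibler(g, f)
jd_fg = jeffrey(f, g)
jd_gf = jeffrey(g, f)
```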
Ground Distance Measures • Quadratic Form (QF): uses a similarity matrix A to incorporate similarities between bins [see 3] • Earth Mover's Distance (EMD): solves the transportation problem for the optimal admissible flow g_ij between the two distributions, where d_ij is the dissimilarity between bins i and j [see 7]
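The sketch below illustrates both ground distance measures on small one-dimensional histograms. The quadratic form is computed directly from a supplied similarity matrix A. For EMD, the general case requires solving a transportation problem; the code instead uses the special case of two one-dimensional histograms with equal total mass and ground distance d_ij = |i - j|, where the optimal cost equals the L1 distance between the cumulative histograms. The function names and the example matrix are assumptions for illustration.

```python
import math
from itertools import accumulate


def quadratic_form(f, g, A):
    """sqrt((f - g)^T A (f - g)) for a bin-similarity matrix A."""
    d = [a - b for a, b in zip(f, g)]
    n = len(d)
    q = sum(d[i] * A[i][j] * d[j] for i in range(n) for j in range(n))
    return math.sqrt(max(q, 0.0))  # guard against tiny negative round-off


def emd_1d(f, g):
    """EMD for 1-D histograms of equal mass with ground distance |i - j|."""
    return sum(abs(a - b) for a, b in zip(accumulate(f), accumulate(g)))


identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
# Neighboring bins are treated as half similar (hypothetical choice of A).
neighbors = [[1.0, 0.5, 0.0], [0.5, 1.0, 0.5], [0.0, 0.5, 1.0]]

f = [1.0, 0.0, 0.0]
near = [0.0, 1.0, 0.0]
far = [0.0, 0.0, 1.0]

qf_identity = quadratic_form(f, near, identity)
qf_neighbors = quadratic_form(f, near, neighbors)
emd_near = emd_1d(f, near)
emd_far = emd_1d(f, far)
```

Both measures distinguish a shift into a neighboring bin from a shift into a distant one, which a bin-by-bin measure such as the L1 distance cannot do.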
Methodology • quality measure: separating into different tasks (classification, retrieval, segmentation) • parameters: select best possible for every measure by exhaustive evaluation • evaluate processing steps separately: such as representation, dissimilarity measures, application • ground truth: collected by sampling given images
Parameter Settings • exhaustive search over parameter values: K nearest neighbors (k = 1, 3, 5, 7); sample size (color: 4, 8, 16, 32, 64 pixels; texture: 8², 16², 32², 64², 128², 256² pixels); number of bins (4, 8, 16, 32, 64, 128, 256; for EMD only 4, 8, 16, 32); number of Gabor filters (12, 24, 40) • quality measures: classification: K-NN classifier with leave-one-out; image retrieval: precision vs. number of retrieved images; unsupervised segmentation: pixel-wise error
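The classification and retrieval quality measures can be sketched as follows, assuming each sample is a histogram with a known class label (for example, the image it was drawn from). Any dissimilarity function can be plugged in; the L1 distance is used here only as a stand-in. Ties in the vote are resolved in favor of the nearest neighbor's class, which is an assumption of this sketch, as are all function names.

```python
from collections import Counter


def l1(f, g):
    """Stand-in dissimilarity: L1 distance between two histograms."""
    return sum(abs(a - b) for a, b in zip(f, g))


def leave_one_out_knn_accuracy(samples, labels, dissimilarity, k=1):
    """Classify each sample from its k nearest neighbors among the rest."""
    correct = 0
    for i, query in enumerate(samples):
        others = [
            (dissimilarity(query, s), labels[j])
            for j, s in enumerate(samples)
            if j != i  # leave the query itself out
        ]
        others.sort(key=lambda pair: pair[0])
        votes = Counter(label for _, label in others[:k])
        predicted = votes.most_common(1)[0][0]
        correct += predicted == labels[i]
    return correct / len(samples)


def precision_at(query_index, samples, labels, dissimilarity, n):
    """Fraction of the n best-ranked samples sharing the query's label."""
    query = samples[query_index]
    ranked = sorted(
        (j for j in range(len(samples)) if j != query_index),
        key=lambda j: dissimilarity(query, samples[j]),
    )
    hits = sum(labels[j] == labels[query_index] for j in ranked[:n])
    return hits / n


samples = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
labels = ["a", "a", "b", "b"]
accuracy = leave_one_out_knn_accuracy(samples, labels, l1, k=1)
precision_1 = precision_at(0, samples, labels, l1, n=1)
precision_3 = precision_at(0, samples, labels, l1, n=3)
```

An exhaustive parameter search then amounts to repeating these evaluations for every combination of k, sample size, number of bins, and dissimilarity measure, and keeping the best setting per measure.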
Results: Texture Segmentation • Unsupervised grouping by normalized pairwise clustering [see 6]
Results: Color Classification 94 images from Corel Database, 16 samples from each image Full distributions:
Results: Color Classification Marginal distributions:
Results: Texture Classification 94 images from Brodatz Album, 16 samples from each image Full distributions:
Results: Texture Classification Marginal distributions:
Conclusion • no overall best measure, but different tools for different tasks • marginal histograms and aggregate measures good for large feature spaces and small samples • multivariate histograms effective on large sample sizes and/or well-adapted binning • EMD attractive for moderate similarities
Literature [1] M. Flickner et al. Query by image and video content: The QBIC system. IEEE Computer 1995. [2] D. Geman et al. Boundary detection by constrained optimization. PAMI 1990. [3] J. Hafner et al. Efficient color histogram indexing for quadratic form distance functions. PAMI 1995. [4] B. Manjunath and W. Ma. Texture features for browsing and retrieval of image data. PAMI 1996. [5] T. Ojala et al. A comparative study of texture measures with classification based on feature distributions. Pattern Recognition 1996. [6] J. Puzicha et al. Non-parametric similarity measures for unsupervised texture segmentation and image retrieval. CVPR 1997. [7] Y. Rubner et al. A metric for distributions with applications to image databases. ICCV 1998. [8] M. Swain and D. Ballard. Color indexing. IJCV 1991. [9] H. Voorhees and T. Poggio. Computing texture boundaries from images. Nature 1988.
Results: Texture Segmentation • Evaluated over database of 100 Brodatz images