Empirical Evaluation of Dissimilarity Measures for Color and Texture Jan Puzicha, Joachim M. Buhmann, Yossi Rubner & Carlo Tomasi Presented by: Dave Kauchak Department of Computer Science University of California, San Diego dkauchak@cs.ucsd.edu
The Problem: Image Dissimilarity • D(image 1, image 2) = ?
Where does this problem arise in computer vision? • Image Classification • Image Retrieval • Image Segmentation
Classification • [Figure: example images with unknown class labels, marked "?"]
Retrieval • Example from: Jeremy S. De Bonet, Paul Viola (1997). Structure Driven Image Database Retrieval. Advances in Neural Information Processing Systems 10.
Segmentation http://vizlab.rutgers.edu/~comanici/segm_images.html
Histograms for image dissimilarity • Examine the distribution of features, rather than the features themselves • General purpose (i.e. any distribution of features) • Resilient to variations (shadowing, changes in illumination, shading, etc.) • Can use previous work in statistics, etc.
Histogramming Image Features • Color • Texture • Shape • Others… • Create histogram through binning or some procedure to get a distribution
Color Which is more similar? L*a*b* was designed to be uniform in that perceptual “closeness” corresponds to Euclidean distance in the space.
L*a*b* • L – lightness (white to black) • a – redness-greenness • b – yellowness-blueness
Texture • Texture is not pointwise like color • Texture involves a local neighborhood • Gabor Filters are commonly used to identify texture features
Gabor Filters • Gabor filters are Gaussians modulated by sinusoids • They can be tuned in both the scale (size) and the orientation • A filter is applied to a region and is characterized by some feature of the energy distribution (often mean and standard deviation)
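The "Gaussian modulated by a sinusoid" description can be sketched as a small kernel generator. This is only an illustrative sketch: the function name `gabor_kernel` and the parameters `wavelength`, `sigma` and `gamma` are assumptions made here, not the filter bank used in the paper, and `size` is assumed to be odd.

```python
import math

def gabor_kernel(size, wavelength, theta, sigma, gamma=1.0):
    """Return a size x size real-valued Gabor kernel as a list of rows.

    theta is the orientation in radians, sigma the width of the Gaussian
    envelope (scale), wavelength the period of the cosine carrier.
    """
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates so the sinusoid runs along direction theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr * xr + (gamma * yr) ** 2) / (2.0 * sigma ** 2))
            carrier = math.cos(2.0 * math.pi * xr / wavelength)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel

# Hypothetical settings: a 9x9 filter oriented at 72 degrees.
kernel = gabor_kernel(size=9, wavelength=4.0, theta=math.radians(72), sigma=2.0)
```

Convolving an image region with such a kernel and taking the mean and standard deviation of the response magnitude would give the texture features the slide mentions.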
Examples of Gabor Filters Scale: 3 at 72° Scale: 4 at 108° Scale: 5 at 144°
Creating Histograms from Features • Regular Binning • Simple • Choosing bins important. Bins may be too large or too small • Adaptive Binning • Bins are adapted to the distribution (usually using some form of K-means)
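Regular binning can be sketched in a few lines. The helper name `regular_histogram` and the value range `[lo, hi]` are assumptions for illustration; the adaptive (K-means style) variant is not shown.

```python
def regular_histogram(values, num_bins, lo, hi):
    """Bin scalar feature values into num_bins equal-width bins on [lo, hi].

    Returns a normalized histogram (entries sum to 1).
    """
    if not values:
        raise ValueError("need at least one value")
    counts = [0] * num_bins
    width = (hi - lo) / num_bins
    for v in values:
        idx = int((v - lo) / width)
        # Clamp so that v == hi (or out-of-range values) land in an edge bin.
        idx = min(max(idx, 0), num_bins - 1)
        counts[idx] += 1
    total = float(len(values))
    return [c / total for c in counts]

hist = regular_histogram([0.1, 0.2, 0.25, 0.7, 0.95], num_bins=4, lo=0.0, hi=1.0)
```

With too few bins, distinct feature values collapse together; with too many, most bins hold very few samples, which is the trade-off the slide points at.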
Marginal Histograms • Marginal histograms deal with only a single feature at a time • [Figure: normal binning of a two-feature distribution versus marginal binning, which results in 2 histograms]
Cumulative Histogram • [Figure: a normal histogram and the corresponding cumulative histogram]
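A small sketch of how marginal and cumulative histograms could be derived from a two-feature histogram. The function names and the toy counts are assumptions for illustration only.

```python
from itertools import accumulate

def marginal_histograms(joint):
    """joint[i][j] is a 2-D histogram over two features.

    Returns one 1-D histogram per feature by summing out the other one.
    """
    first = [sum(row) for row in joint]
    second = [sum(col) for col in zip(*joint)]
    return first, second

def cumulative_histogram(hist):
    """Running sum of a 1-D histogram."""
    return list(accumulate(hist))

# Toy 2x2 histogram of raw counts (hypothetical data).
joint = [[1, 2],
         [3, 4]]
first, second = marginal_histograms(joint)
cumulative_first = cumulative_histogram(first)
```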
Dissimilarity Measures Using the Histograms • Heuristic Histogram Distances • Non-parametric Test Statistics • Information-Theoretic Divergences • Ground Distance Measures
Notation • D(I,J) is the dissimilarity of images I and J • f(i;J) is entry i of the histogram of image J • fr(i;J) is entry i of the marginal histogram of image J for feature r • Fr(i;J) is the corresponding cumulative histogram
Heuristic Histogram Distances • Minkowski-form distance Lp • Special cases: • L1: absolute, cityblock, or Manhattan distance • L2: Euclidean distance • L∞: maximum value distance
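The Minkowski-form distance between two histograms with the same bins is the p-th root of the sum over bins of |f(i;I) − f(i;J)|^p. A minimal sketch, with the function name chosen here for illustration:

```python
def minkowski_distance(f_i, f_j, p):
    """L_p distance between two histograms with identical binning.

    p = 1 gives the Manhattan distance, p = 2 the Euclidean distance,
    and p = float("inf") the maximum value distance.
    """
    if len(f_i) != len(f_j):
        raise ValueError("histograms must have the same number of bins")
    diffs = [abs(a - b) for a, b in zip(f_i, f_j)]
    if p == float("inf"):
        return max(diffs)
    return sum(d ** p for d in diffs) ** (1.0 / p)

f_i = [0.5, 0.5, 0.0]
f_j = [0.0, 0.5, 0.5]
l1 = minkowski_distance(f_i, f_j, 1)
l2 = minkowski_distance(f_i, f_j, 2)
l_inf = minkowski_distance(f_i, f_j, float("inf"))
```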
More heuristic distances • Weighted-Mean-Variance (WMV) • Only includes minimal information about distribution
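A hedged sketch of WMV for a single feature, assuming the commonly used form |μ(I) − μ(J)| / |σ(μ)| + |σ(I) − σ(J)| / |σ(σ)|, where σ(μ) and σ(σ) are the standard deviations of the means and of the standard deviations over the whole database. The argument names `std_of_means` and `std_of_stds` are assumptions; they stand for those precomputed database-wide normalizers.

```python
import statistics

def wmv_distance(samples_i, samples_j, std_of_means, std_of_stds):
    """Weighted-Mean-Variance dissimilarity for one feature.

    Only the mean and standard deviation of each sample set are used,
    so both can be precomputed offline for every image.
    """
    mu_i = statistics.mean(samples_i)
    mu_j = statistics.mean(samples_j)
    sd_i = statistics.pstdev(samples_i)
    sd_j = statistics.pstdev(samples_j)
    return (abs(mu_i - mu_j) / abs(std_of_means)
            + abs(sd_i - sd_j) / abs(std_of_stds))

# Hypothetical feature samples and unit normalizers.
d = wmv_distance([1.0, 2.0, 3.0], [2.0, 4.0, 6.0], std_of_means=1.0, std_of_stds=1.0)
```

For several features, the per-feature values would be summed.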
Non-parametric Test Statistics • Kolmogorov-Smirnov distance (K-S) • Cramer/von Mises type (CvM)
Cumulative Difference Example • [Figure: Histogram 1 − Histogram 2 = Difference; K-S and CvM are computed from the cumulative difference]
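Both statistics operate on cumulative marginal histograms: K-S takes the maximum absolute difference between the two cumulative histograms, and the CvM-type statistic sums the squared differences. A minimal sketch, with function names chosen here for illustration:

```python
from itertools import accumulate

def ks_distance(h_i, h_j):
    """Kolmogorov-Smirnov distance: largest gap between cumulative histograms."""
    return max(abs(a - b) for a, b in zip(accumulate(h_i), accumulate(h_j)))

def cvm_distance(h_i, h_j):
    """Cramer/von Mises type statistic: sum of squared cumulative gaps."""
    return sum((a - b) ** 2 for a, b in zip(accumulate(h_i), accumulate(h_j)))

h_i = [0.5, 0.5, 0.0]
h_j = [0.0, 0.5, 0.5]
ks = ks_distance(h_i, h_j)
cvm = cvm_distance(h_i, h_j)
```

Here the cumulative histograms are [0.5, 1.0, 1.0] and [0.0, 0.5, 1.0], so the cumulative differences are 0.5, 0.5 and 0.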
Non-parametric Test Statistics (cont.) • χ²-statistic (chi-square) • Simple statistical measure to decide if two samples came from the same underlying distribution
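A sketch of the χ² statistic in the form that compares one histogram against the bin-wise mean of the two, i.e. the sum over bins of (f(i;I) − m(i))² / m(i) with m(i) = (f(i;I) + f(i;J)) / 2. Skipping bins that are empty in both histograms is a choice made here to avoid division by zero.

```python
def chi_square_distance(f_i, f_j):
    """Chi-square statistic between two histograms with identical binning."""
    total = 0.0
    for a, b in zip(f_i, f_j):
        mean = (a + b) / 2.0
        if mean > 0.0:  # bins empty in both histograms contribute nothing
            total += (a - mean) ** 2 / mean
    return total

chi2 = chi_square_distance([0.5, 0.5, 0.0], [0.0, 0.5, 0.5])
```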
Information-Theoretic Divergences • How well can one distribution be coded using the other as a codebook? • Kullback-Leibler divergence (KL) • Jeffrey divergence (JD)
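A sketch of both divergences for normalized histograms. KL is the sum of f(i;I) log(f(i;I) / f(i;J)); the Jeffrey divergence is written here in its symmetric form, which compares each histogram against the bin-wise mean. Returning infinity when a bin is empty only in the second histogram is a convention adopted for this sketch.

```python
import math

def kl_divergence(f_i, f_j):
    """Kullback-Leibler divergence of f_i relative to f_j (not symmetric)."""
    total = 0.0
    for a, b in zip(f_i, f_j):
        if a > 0.0:
            if b == 0.0:
                return math.inf  # f_j cannot encode mass that it never sees
            total += a * math.log(a / b)
    return total

def jeffrey_divergence(f_i, f_j):
    """Symmetric, numerically stable variant based on the bin-wise mean."""
    total = 0.0
    for a, b in zip(f_i, f_j):
        mean = (a + b) / 2.0
        if a > 0.0:
            total += a * math.log(a / mean)
        if b > 0.0:
            total += b * math.log(b / mean)
    return total

kl = kl_divergence([0.5, 0.5], [0.25, 0.75])
jd = jeffrey_divergence([0.5, 0.5, 0.0], [0.0, 0.5, 0.5])
```

The second example shows the practical difference: KL would be infinite for those two histograms, while the Jeffrey divergence stays finite.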
Ground Distance Measure • Based on some metric of distance between individual features • Earth Mover's Distance (EMD) • Minimal cost to transform one distribution into the other • The only measure listed here that works on distributions with a different number of bins
EMD • One distribution can be seen as a mass of earth properly spread in space, the other as a collection of holes in that same space • Distributions are represented as a set of clusters and an associated weight • Computing the dissimilarity then becomes the transportation problem
Transportation Problem • Some number of suppliers with goods • Some other number of consumers wanting goods • Each consumer-supplier pair has an associated cost to deliver one unit of the goods • Find least expensive flow of goods from supplier to consumer
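The general EMD needs a transportation (linear programming) solver, which is not shown here. As a sketch of the idea, the special case below assumes one-dimensional histograms over the same ordered bins, equal total mass, and ground distance |i − j| between bins i and j; under those assumptions the optimal transport cost reduces to the sum of absolute differences of the cumulative histograms. The function name `emd_1d` is chosen for illustration.

```python
from itertools import accumulate

def emd_1d(h_i, h_j):
    """Earth Mover's Distance for 1-D histograms with ground distance |i - j|.

    Each unit of mass that has to cross the boundary between bin k and
    bin k + 1 costs 1; the amount crossing is the cumulative difference.
    """
    if len(h_i) != len(h_j):
        raise ValueError("this special case needs identical bins")
    if abs(sum(h_i) - sum(h_j)) > 1e-9:
        raise ValueError("this special case needs equal total mass")
    return sum(abs(a - b) for a, b in zip(accumulate(h_i), accumulate(h_j)))

shift_half = emd_1d([0.5, 0.5, 0.0], [0.0, 0.5, 0.5])
shift_all = emd_1d([1.0, 0.0, 0.0], [0.0, 0.0, 1.0])
```

In supplier and consumer terms, the bins of the first histogram supply mass, the bins of the second demand it, and the cost per unit is the bin distance.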
Various properties of the metrics • K-S, CvM and WMV are only defined for marginal distributions • Lp, WMV, K-S, CvM and, under constraints, EMD all obey the triangle inequality • WMV is particularly quick because the calculation is quick and the values can be pre-computed offline • EMD is the most computationally expensive
Key Components for Good Comparison • Meaningful quality measure • Subdivision into various tasks/applications (classification, retrieval and segmentation) • Wide range of parameters should be measured • An uncontroversial “ground truth” should be established
Data Set: Color • Randomly chose 94 images from set of 2000 • 94 images represent separate classes • Randomly select disjoint set of pixels from the images • Set size of 4, 8, 16, 32, 64 pixels • 16 disjoint samples per set per image
Data Set: Texture • Brodatz album • Collection of wide range of texture (e.g. cork, lawn, straw, pebbles, sand, etc.) • Each image is considered a class (as in color) • Extract sets of 16 non-overlapping blocks • sizes 8x8, 16x16,…, 256x256
Setup: Classification • k-Nearest Neighbor classifier is used • Nearest Neighbor classification: given a collection of labeled points S and a query point q, which point belonging to S is closest to q? • k-nearest-neighbor takes a majority vote of the k closest points • k = 1, 3, 5 and 7 • Performance is the average misclassification rate (in percent) using leave-one-out
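The evaluation loop can be sketched as follows, with any dissimilarity measure plugged in as a function. The toy histograms, labels and helper names are hypothetical; ties in the vote are broken by whichever label was encountered first, which is a choice made for this sketch.

```python
from collections import Counter

def knn_predict(query, examples, k, dissimilarity):
    """examples is a list of (histogram, label) pairs."""
    ranked = sorted(examples, key=lambda ex: dissimilarity(query, ex[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

def leave_one_out_error(examples, k, dissimilarity):
    """Misclassification rate in percent, holding out one example at a time."""
    errors = 0
    for idx, (hist, label) in enumerate(examples):
        rest = examples[:idx] + examples[idx + 1:]
        if knn_predict(hist, rest, k, dissimilarity) != label:
            errors += 1
    return 100.0 * errors / len(examples)

def l1(f_i, f_j):
    return sum(abs(a - b) for a, b in zip(f_i, f_j))

toy = [([0.9, 0.1], "sky"), ([0.8, 0.2], "sky"),
       ([0.1, 0.9], "grass"), ([0.2, 0.8], "grass")]
error_k1 = leave_one_out_error(toy, k=1, dissimilarity=l1)
error_k3 = leave_one_out_error(toy, k=3, dissimilarity=l1)
```

With only two examples per class, holding one out leaves a single same-class neighbor, so k = 3 is always outvoted by the other class in this toy set.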
Setup: Classification (cont.) • Bins: {4, 8, 16, 32, 64, 128, 256} • In the texture case, three sets of filters were used, with 12, 24 and 40 filters • 1000 CPU hours of computation
Results: Classification • For small sample sizes, the WMV measure performs best in the texture case. • WMV only estimates means and variances • Less sensitive to sampling noise • EMD also performs well for small sample sizes • Local binning provides additional information • For large sample sizes, the χ² test performs best
Results: Classification (cont.) • For texture classification, marginal distributions do better than multidimensional distributions except for very large sample sizes (256x256) • Binning is not well adapted to the data since it is fixed for all the 94 classes • EMD, which uses local adaptation, does much better • For multidimensional histograms, the more bins the better the performance • For texture, 12 filters are usually enough
Setup: Image Retrieval • Vary sample size • Vary number of images retrieved • Performance measured based on precision (i.e. percent correct of the images retrieved) vs. the number of images retrieved
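The precision measure can be sketched as below, assuming the database has already been ranked by increasing dissimilarity to the query. The function name and the toy labels are hypothetical.

```python
def precision_at(query_label, ranked_labels, num_retrieved):
    """Percent of the first num_retrieved results that share the query's class."""
    top = ranked_labels[:num_retrieved]
    if not top:
        raise ValueError("nothing retrieved")
    correct = sum(1 for label in top if label == query_label)
    return 100.0 * correct / len(top)

# Class labels of database images, ordered from most to least similar.
ranked = ["a", "a", "b", "a", "b"]
curve = [precision_at("a", ranked, n) for n in (2, 4, 5)]
```

Plotting such values against the number of images retrieved gives the kind of curve the slide describes.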
Results: Image Retrieval (cont.) • Similar to classification • EMD, WMV, CvM and K-S performed well for small sample sizes • JD, χ² and KL perform better for larger sizes
Setup: Segmentation • 100 images • Each image consists of 5 different Brodatz textures • For multivariate, the bins are adapted to specific image
Setup: Segmentation (cont.) • Image is divided into 16384 sites (128 x 128 grid) • A histogram is calculated for each site • Each site histogram is then compared with 80 randomly selected sites • Image sites with high average similarity are then grouped
Results: Segmentation (cont.) • Binning can be adapted to image • Increased accuracy in representing multidimensional distributions • Adaptive multivariate outperforms marginal • Best results were obtained by χ² • EMD suffers from high computational complexity
Conclusions • No measure is overall best • Marginal histograms and aggregate measures are best for large feature spaces • Multivariate histograms perform well with large sample sizes • EMD performs generally well for both classification and retrieval, but with a high computational cost • χ² is a good overall metric (particularly for larger sample sizes)