
Image Similarity and the Earth Mover’s Distance

Empirical Evaluation of Dissimilarity Measures for Color and Texture (Y. Rubner, J. Puzicha, C. Tomasi and J.M. Buhmann); The Earth Mover’s Distance as a Metric for Image Retrieval (Y. Rubner, C. Tomasi and L.J. Guibas)


Presentation Transcript


  1. Image Similarity and the Earth Mover’s Distance • Learning-Based Methods in Vision, Spring 2007 • Frederik Heger (with graphics from last year’s slides), 1 February 2007 • Papers covered: “Empirical Evaluation of Dissimilarity Measures for Color and Texture”, Y. Rubner, J. Puzicha, C. Tomasi and J.M. Buhmann; “The Earth Mover’s Distance as a Metric for Image Retrieval”, Y. Rubner, C. Tomasi and L.J. Guibas; “The Earth Mover’s Distance is the Mallows Distance: Some Insights from Statistics”, E. Levina and P.J. Bickel

  2. How Similar Are They? Images from Caltech 256

  3. Similarity is Important for … • Image classification • Is there a penguin in this picture? • This is a picture of a penguin. • Image retrieval • Find pictures with a penguin in them. • Image as search query • Find more images like this one. • Image segmentation • Something that looked like this was called penguin before.

  4. Image Representations: Histograms (images from Dave Kauchak; example image: Space Shuttle Cargo Bay) • Normal histogram • Cumulative histogram • Generalize to arbitrary dimensions • Represent distribution of features: color, texture, depth, …

  5. Image Representations: Histograms (images from Dave Kauchak) • Joint histogram: requires lots of data; loss of resolution to avoid empty bins • Marginal histogram: requires independent features; more data per bin than joint histogram

  6. Image Representations: Histograms (images from Dave Kauchak; example image: Space Shuttle Cargo Bay) • Adaptive binning: better data/bin distribution, fewer empty bins; can adapt available resolution to relative feature importance

  7. Image Representations: Histograms (images from Dave Kauchak; example images: EASE Truss Assembly, Space Shuttle Cargo Bay) • Clusters / Signatures: “super-adaptive” binning; does not require discretization along any fixed axis

  8. Distance Metrics (figures with x and y axes omitted) • [figure] − [figure] = Euclidean distance of 5 units • [figure] − [figure] = gray-value distance of 50 values • [figure] − [figure] = ?

  9. Issue: How to Compare Histograms? • Bin-by-bin comparison: sensitive to bin size; could use wider bins, but at a loss of resolution • Cross-bin comparison: how much cross-bin influence is necessary/sufficient?

  10. Overview: Similarity Measures • Heuristic Histogram Distance: • Minkowski-form distance (Lp) • Special Cases: • L1: Manhattan distance • L2: Euclidean distance • L∞: Maximum value distance
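
The Minkowski-form distance compares two histograms bin by bin. The following is a minimal sketch, assuming both histograms are plain Python lists over the same bins; the function name `minkowski_distance` and the sample histograms are illustrative and not taken from the slides.

```python
import math

def minkowski_distance(h, k, p):
    """L_p distance between two histograms defined over the same bins."""
    if len(h) != len(k):
        raise ValueError("histograms must have the same number of bins")
    diffs = [abs(a - b) for a, b in zip(h, k)]
    if math.isinf(p):
        # Limit p -> infinity: the largest single-bin difference.
        return max(diffs)
    return sum(d ** p for d in diffs) ** (1.0 / p)

# Hypothetical 3-bin histograms, for illustration only.
h = [0.2, 0.5, 0.3]
k = [0.1, 0.4, 0.5]
l1 = minkowski_distance(h, k, 1)            # Manhattan distance
l2 = minkowski_distance(h, k, 2)            # Euclidean distance
linf = minkowski_distance(h, k, math.inf)   # maximum value distance
```

Because only matching bins are compared, shifting mass to a neighboring bin costs as much as shifting it to a distant one, which is the bin-by-bin weakness raised on slide 9.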

  11. Overview: Similarity Measures • Heuristic Histogram Distance: • Weighted-Mean-Variance (WMV) • Info: • Per-feature similarity measure • Based on Gabor filter image representation • Shown to outperform several parametric models for texture-based image retrieval

  12. Overview: Similarity Measures • Nonparametric Test Statistic: • Kolmogorov-Smirnov distance (KS) • Info: • Defined for only one dimension • Maximum discrepancy between cumulative distributions • Invariant to arbitrary monotonic feature transformations
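
As a rough illustration of the "maximum discrepancy between cumulative distributions", here is a sketch for one-dimensional histograms, assuming both are lists over the same ordered bins. The name `ks_distance` is an assumption made for this example.

```python
from itertools import accumulate

def ks_distance(h, k):
    """Kolmogorov-Smirnov distance between two 1-D histograms:
    the largest absolute gap between their cumulative histograms."""
    if len(h) != len(k):
        raise ValueError("histograms must have the same number of bins")
    cum_h = accumulate(h)  # running sums = cumulative histogram
    cum_k = accumulate(k)
    return max(abs(a - b) for a, b in zip(cum_h, cum_k))

# Hypothetical histograms: k is h shifted one bin to the right.
d = ks_distance([0.5, 0.5, 0.0], [0.0, 0.5, 0.5])
```

Only the order of the bins matters here, not their positions on the feature axis, which is why the measure is unaffected by monotonic transformations of the feature.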

  13. Overview: Similarity Measures • Nonparametric Test Statistic: • Cramér/von Mises type statistic (CvM) • Info: • Squared Euclidean distance between cumulative distributions • Defined for single dimension
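
A Cramér/von Mises type statistic can be sketched in the same way as the KS distance, replacing the maximum gap with a sum of squared gaps. This is a minimal sketch for binned one-dimensional data; the name `cvm_distance` is illustrative.

```python
from itertools import accumulate

def cvm_distance(h, k):
    """Cramer/von Mises type statistic for 1-D histograms:
    sum of squared differences between the cumulative histograms."""
    if len(h) != len(k):
        raise ValueError("histograms must have the same number of bins")
    return sum((a - b) ** 2 for a, b in zip(accumulate(h), accumulate(k)))

# Hypothetical histograms: k is h shifted one bin to the right.
d = cvm_distance([0.5, 0.5, 0.0], [0.0, 0.5, 0.5])
```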

  14. Overview: Similarity Measures • Nonparametric Test Statistic: • χ² statistic • Info: • Very commonly used

  15. Overview: Similarity Measures • Information-theory Divergence: • Kullback-Leibler divergence (KL) • Info: • Code one histogram using the other as true distribution • How inefficient would it be? • Also widely used.
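
The coding-inefficiency idea corresponds to the standard definition of the Kullback-Leibler divergence, the sum over bins of h_i log(h_i / k_i). A minimal sketch, assuming both histograms are normalized lists over the same bins (the name `kl_divergence` is illustrative):

```python
import math

def kl_divergence(h, k):
    """Kullback-Leibler divergence sum_i h_i * log(h_i / k_i), in nats."""
    if len(h) != len(k):
        raise ValueError("histograms must have the same number of bins")
    total = 0.0
    for a, b in zip(h, k):
        if a == 0:
            continue          # 0 * log(0 / b) is taken as 0
        if b == 0:
            return math.inf   # h has mass where k has none
        total += a * math.log(a / b)
    return total

# Hypothetical normalized histograms.
h = [0.5, 0.5]
k = [0.25, 0.75]
forward = kl_divergence(h, k)
backward = kl_divergence(k, h)
```

The two directions generally differ, and an empty bin in the second histogram makes the value infinite. Both problems motivate the symmetric variant on the next slide.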

  16. Overview: Similarity Measures • Information-theory Divergence: • Jeffrey-divergence (JD) • Info: • Similar to KL divergence • But symmetric and numerically stable
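
One way to obtain the symmetry and numerical stability mentioned here is to compare each histogram with the mean of the two. The sketch below assumes that form of the Jeffrey divergence; the name `jeffrey_divergence` is illustrative.

```python
import math

def jeffrey_divergence(h, k):
    """Jeffrey divergence, assuming the form
    sum_i h_i*log(h_i/m_i) + k_i*log(k_i/m_i) with m_i = (h_i + k_i) / 2."""
    if len(h) != len(k):
        raise ValueError("histograms must have the same number of bins")
    total = 0.0
    for a, b in zip(h, k):
        m = (a + b) / 2.0
        # m >= a/2 and m >= b/2, so the ratios below never divide by zero.
        if a > 0:
            total += a * math.log(a / m)
        if b > 0:
            total += b * math.log(b / m)
    return total

# Hypothetical histograms with no overlap: KL would be infinite here.
d = jeffrey_divergence([1.0, 0.0], [0.0, 1.0])
```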

  17. Overview: Similarity Measures • Ground Distance Measure: • Quadratic Form (QF) • Info: • Heuristic approach • Matrix A incorporates cross-bin information
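
To make the role of the matrix A concrete, here is a small sketch of the quadratic-form distance sqrt((h − k)ᵀ A (h − k)). It assumes A is given as a nested list of bin-to-bin similarities; the function name and the example matrix are illustrative.

```python
import math

def quadratic_form_distance(h, k, A):
    """Quadratic-form distance sqrt((h - k)^T A (h - k)).
    A[i][j] encodes how similar bin i is to bin j."""
    if len(h) != len(k):
        raise ValueError("histograms must have the same number of bins")
    d = [a - b for a, b in zip(h, k)]
    n = len(d)
    total = sum(d[i] * A[i][j] * d[j] for i in range(n) for j in range(n))
    return math.sqrt(max(total, 0.0))  # clamp tiny negative rounding errors

# With the identity matrix the distance reduces to plain L2.
identity = [[1.0, 0.0], [0.0, 1.0]]
# Hypothetical similarity matrix: the two bins are 50% similar.
similar = [[1.0, 0.5], [0.5, 1.0]]
d_l2 = quadratic_form_distance([1.0, 0.0], [0.0, 1.0], identity)
d_qf = quadratic_form_distance([1.0, 0.0], [0.0, 1.0], similar)
```

With cross-bin similarity in A, moving mass between similar bins is penalized less than under the plain L2 distance.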

  18. Overview: Similarity Measures • Ground Distance Measure • Earth Mover’s Distance (EMD) • Info: • Based on solution of linear optimization problem (transportation problem) • Minimal cost to transform one distribution to the other • Total cost = sum of costs for individual features

  19. Summary: Similarity Measures

  20. Earth Mover’s Distance

  21. Earth Mover’s Distance

  22. Earth Mover’s Distance =

  23. Earth Mover’s Distance = (amount moved) * (distance moved)

  24. How EMD Works (P: m clusters, Q: n clusters) • Total work = sum over all movements of (distance moved) * (amount moved)

  25. How EMD Works (P: m clusters, Q: n clusters) • Move earth only from P to Q

  26. How EMD Works (P: m clusters, Q: n clusters) • P cannot send more earth than there is

  27. How EMD Works (P: m clusters, Q: n clusters) • Q cannot receive more earth than it can hold

  28. How EMD Works (P: m clusters, Q: n clusters) • As much earth as possible must be moved
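
The four rules on slides 25 to 28 can be written as the constraints of a transportation problem. The formulation below is a sketch in commonly used notation. The symbols are assumptions that do not appear in the transcript: f_ij is the amount of earth moved from cluster i of P to cluster j of Q, d_ij is the ground distance between those clusters, and w_{p_i}, w_{q_j} are the cluster weights.

```latex
\[
\begin{aligned}
\min_{f_{ij}} \quad & \sum_{i=1}^{m} \sum_{j=1}^{n} f_{ij}\, d_{ij}
  && \text{(total work: amount moved times distance moved)} \\
\text{subject to} \quad
& f_{ij} \ge 0
  && \text{(move earth only from } P \text{ to } Q\text{)} \\
& \sum_{j=1}^{n} f_{ij} \le w_{p_i}
  && \text{(} P \text{ cannot send more earth than there is)} \\
& \sum_{i=1}^{m} f_{ij} \le w_{q_j}
  && \text{(} Q \text{ cannot receive more earth than it can hold)} \\
& \sum_{i=1}^{m} \sum_{j=1}^{n} f_{ij}
  = \min\Big( \sum_{i=1}^{m} w_{p_i},\ \sum_{j=1}^{n} w_{q_j} \Big)
  && \text{(as much earth as possible must be moved)}
\end{aligned}
\]
\[
\mathrm{EMD}(P, Q) = \frac{\sum_{i,j} f_{ij}\, d_{ij}}{\sum_{i,j} f_{ij}}
\]
```

Dividing the optimal work by the total flow normalizes the result, so the distance does not favor the smaller of two signatures when their total weights differ.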

  29. Color-based Image Retrieval • Measures compared: L1 distance, Jeffrey divergence, χ² statistic, quadratic form distance, Earth Mover’s Distance

  30. Red Car Retrievals (Color-based)

  31. Zebra Retrieval (Texture-based)

  32. EMD with Position Encoding • Panels: without position, with position

  33. Issues with EMD • High computational complexity • Prohibitive for texture segmentation • Feature ordering needs to be known • Open eyes / closed eyes example • Distance can be set by very few features • E.g., in a partial match between distributions of unequal total weight, EMD = 0 no matter how many features follow

  34. Help From Statisticians • For even-mass distributions, EMD is equivalent to Mallows distance • (for uneven mass distributions, the two distances behave differently) • Trick to compute Mallows distance • 1-D marginals give better classification results than joint distributions (experimental results) • Get marginals from empirical distribution by sorting feature vectors
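
The sorting trick can be illustrated for a single one-dimensional marginal. This is a minimal sketch, assuming two samples of equal size in which every point carries the same mass and the ground distance is |x − y|; the name `mallows_distance_1d` is illustrative.

```python
def mallows_distance_1d(xs, ys, p=1):
    """Mallows distance between two equal-size 1-D samples.
    After sorting, the i-th smallest x is matched to the i-th smallest y;
    for p = 1 this equals the EMD of the two empirical distributions."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("samples must be non-empty and of equal size")
    pairs = zip(sorted(xs), sorted(ys))
    mean_cost = sum(abs(x - y) ** p for x, y in pairs) / len(xs)
    return mean_cost ** (1.0 / p)

# Hypothetical feature values: ys is xs shifted by 5 units.
d = mallows_distance_1d([3.0, 0.0, 1.0], [8.0, 5.0, 6.0])
```

Sorting costs O(n log n), which is far cheaper than solving the general transportation problem. The shortcut applies only to equal-mass one-dimensional marginals.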

  35. EMD Summary / Conclusions • Ground distance metric for image similarity • Uses signatures for best adaptive binning and to lessen impact of prohibitive complexity • Can deal with partial matches • Good performance for color/texture classification • Statistical grounding

  36. Last Slide • Comments? • Questions?
