
Mode-detection via Median-shift

Lior Shapira, Tel-Aviv University. Mode-detection via Median-shift. Joint work with Shai Avidan, Adobe Inc., and Ariel Shamir, Interdisciplinary Center, Herzliya. Clustering is an important problem in vision and image processing: segmentation, tracking, classification (images, textures, etc.).


Presentation Transcript


  1. Lior Shapira, Tel-Aviv University • Mode-detection via Median-shift • Joint work with Shai Avidan, Adobe Inc., and Ariel Shamir, Interdisciplinary Center, Herzliya

  2. Clustering An important problem • Vision and Image processing • Segmentation • Tracking • Classification (images, textures etc.) • Reducing visual clutter • Image retrieval • Other • 3D shape and part retrieval • Data mining • Search result grouping • etc.

  3. Clustering A challenging problem • Large volume of data • High-dimensional • Continuously changing • Example: Flickr • Over 4 billion photos • Over 5 million added daily (image credit: Flickr, tsevis)

  4. Non-parametric Clustering • Mean-shift is a popular mode-seeking algorithm • Does not require a-priori knowledge on number of clusters • Does not place restrictions on cluster size

  5. Non-parametric Clustering • Mean-shift algorithm • Starting from each point in the data set • Move towards the mean of the local neighborhood • Repeat until convergence to a mode • Breaking it down: N points • Find nearest neighbors, calculate mean • All points converging to mode X define a cluster
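The mean-shift loop on this slide can be sketched in a few lines. This is a minimal illustration, not the presenters' implementation: it assumes a flat kernel (every point within `bandwidth` counts equally) and a brute-force neighbor search, and the names `mean_shift`, `bandwidth`, `max_iter` and `tol` are hypothetical.

```python
import math

def mean_shift(points, bandwidth, max_iter=100, tol=1e-6):
    """Flat-kernel mean-shift: return the mode each point converges to."""
    modes = []
    for start in points:
        x = list(start)
        for _ in range(max_iter):
            # Local neighborhood: all data points within `bandwidth` of x.
            nbrs = [p for p in points if math.dist(p, x) <= bandwidth]
            mean = [sum(c) / len(nbrs) for c in zip(*nbrs)]
            shift = math.dist(mean, x)
            x = mean
            if shift < tol:  # converged to a mode
                break
        modes.append(tuple(round(c, 3) for c in x))
    return modes

points = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3),
          (5.0, 5.0), (5.2, 4.9), (4.9, 5.1)]
modes = mean_shift(points, bandwidth=1.0)

# All points that converge to the same mode form one cluster.
labels = {m: i for i, m in enumerate(dict.fromkeys(modes))}
clusters = [labels[m] for m in modes]
```

Note that the modes found this way are averages and generally do not coincide with any data point, which is one of the differences the median-shift approach addresses.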

  6. Our Approach Median-shift Mode Seeking • For each point in the data set • Find local neighborhood of the point • Shift towards the median of the local neighborhood • Iterate until convergence Key differences • Median – more robust than Mean • The Mode is a point in the data set

  7. Related Work

  8. Defining a High-dimensional Median • Using the Tukey depth

  9. Tukey Depth • Pass all possible hyper-planes through a point

  10. Tukey Depth • Pass all possible hyper-planes through a point

  11. Tukey Depth • Pass all possible hyper-planes through a point

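  12. Tukey Depth • Pass all possible hyper-planes through a point • The Tukey depth equals the minimum, over all such hyper-planes, of the number of data points lying on one side of the hyper-plane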

  13. Tukey Median • Is the point which achieves the maximum depth

  14. Random Tukey Depth • We approximate the Tukey Median using random projections* • All points are projected onto K random vectors • We sort the points along each of the K projections and calculate their depth • The Tukey Median is the point which achieves the maximum depth *The random Tukey depth, Cuesta-Albertos and Nieto-Reyes, 2008
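A rough sketch of the random-projection approximation described on this slide follows. It assumes Gaussian random directions and counts a point itself when measuring its one-dimensional depth; the function names and the parameters `k` and `seed` are illustrative and not taken from the talk.

```python
import random

def random_tukey_depth(points, k=50, seed=0):
    """Approximate the Tukey depth of every point with k random projections."""
    rng = random.Random(seed)
    n, d = len(points), len(points[0])
    depth = [n] * n
    for _ in range(k):
        # Random direction (Gaussian entries give a uniformly random orientation).
        u = [rng.gauss(0.0, 1.0) for _ in range(d)]
        proj = [sum(ui * xi for ui, xi in zip(u, p)) for p in points]
        order = sorted(range(n), key=proj.__getitem__)
        for rank, i in enumerate(order):
            # 1-D depth: points on the smaller side, counting the point itself.
            depth[i] = min(depth[i], rank + 1, n - rank)
    return depth

def random_tukey_median(points, k=50, seed=0):
    """Return the data point with the largest approximate depth."""
    depth = random_tukey_depth(points, k, seed)
    return points[max(range(len(points)), key=depth.__getitem__)]

# A center point, four points around it, and one far outlier.
points = [(0.0, 0.0), (1.0, 0.0), (-1.0, 0.0),
          (0.0, 1.0), (0.0, -1.0), (10.0, 10.0)]
depth = random_tukey_depth(points)
median = random_tukey_median(points)
```

In this example the center always sits at a middle rank of every projection, so it attains the largest possible depth. Because the result is always one of the input points, a single far outlier cannot drag it away the way it would drag a mean.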

  15. Median comparison

  16. Back to our approach Median-shift Mode Seeking • For each point in the data set • Find local neighborhood of the point • Shift towards the median of the local neighborhood • Iterate until convergence • After one step, all modes are points in the set • We can now work on reduced set with a weighted median calculation But finding a local neighborhood is still highly challenging!
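The steps on this slide, including the move to a reduced, weighted set after the first iteration, might look roughly like the sketch below. It is an assumption-laden illustration: neighborhoods are found by brute force rather than with LSH, the median is a weighted random Tukey median, and the names `weighted_median`, `median_shift`, `radius`, `k` and `max_iter` are hypothetical.

```python
import math
import random

def weighted_median(cands, weights, points, dirs):
    """Index in `cands` with the largest weighted random Tukey depth."""
    total = sum(weights[i] for i in cands)
    depth = {i: total for i in cands}
    for u in dirs:
        proj = {i: sum(a * x for a, x in zip(u, points[i])) for i in cands}
        below = 0
        for i in sorted(cands, key=proj.__getitem__):
            # Weight on each side, counting the point itself on both sides.
            at_or_below = below + weights[i]
            at_or_above = total - below
            depth[i] = min(depth[i], at_or_below, at_or_above)
            below = at_or_below
    return max(cands, key=depth.__getitem__)

def median_shift(points, radius, k=30, seed=0, max_iter=50):
    """Return, for every point, the index of the data point it converges to."""
    rng = random.Random(seed)
    d = len(points[0])
    dirs = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(k)]
    weights = {i: 1 for i in range(len(points))}  # active representatives
    assign = list(range(len(points)))             # representative of each point
    for _ in range(max_iter):
        step = {}
        for i in weights:
            nbrs = [j for j in weights
                    if math.dist(points[i], points[j]) <= radius]
            step[i] = weighted_median(nbrs, weights, points, dirs)
        if all(step[i] == i for i in weights):
            break  # every representative is its own median
        assign = [step[a] for a in assign]
        # Reduced set: merge the weights of points that moved to the same median.
        merged = {}
        for i, w in weights.items():
            merged[step[i]] = merged.get(step[i], 0) + w
        weights = merged
    return assign

cross = [(0.0, 0.0), (1.0, 0.0), (-1.0, 0.0), (0.0, 1.0), (0.0, -1.0)]
points = cross + [(x + 10.0, y + 10.0) for x, y in cross]
modes = median_shift(points, radius=2.5)
```

Here `modes[i]` is an index into `points`, which reflects the slide's remark that every mode is a point of the data set. The `max_iter` cap is a guard for this sketch only; the talk does not say how convergence is detected.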

  17. Locality-sensitive Hashing • An algorithm for approximate nearest-neighbor search in high-dimensional spaces • Published by Andoni et al. • Locality-sensitive hashing using stable distributions [2006] • Near-optimal hashing algorithms for near neighbor problems in high dimensions [2008]

  18. Intuition • Construct hash functions g: R^d → U such that for any points p, q: • If ||p-q|| ≤ r, then Pr[g(p)=g(q)] is “not so small” • If ||p-q|| > cr, then Pr[g(p)=g(q)] is “small”

  19. c-Approximate r-range Query • If there is at least one p ∈ S with d(p,q) ≤ r, return some p' with d(q,p') ≤ cr • c-Approximate NN query: return some p' with d(p',q) ≤ c·rnn, where rnn = min over p ∈ S of d(p,q) • (Figure: query point q, points p and p', radii r and cr)

  20. Building Hash Functions • A family H of functions h: R^d → U is called (c,r,P1,P2)-sensitive if for any p, q: • If ||p-q|| < r then Pr[h(p)=h(q)] > P1 • If ||p-q|| > cr then Pr[h(p)=h(q)] < P2 • Example: Hamming distance • LSH functions: h(p) = pi, i.e., the i-th bit of p • Probabilities: Pr[h(p)=h(q)] = 1 - D(p,q)/d • p = 10010010, q = 11010110
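The Hamming example on this slide can be checked directly. The snippet below enumerates all d bit-sampling functions h_i(x) = x[i] for the two strings shown and compares the fraction that collide with 1 - D(p,q)/d; the variable names are illustrative.

```python
p = "10010010"
q = "11010110"
d = len(p)

# Hamming distance D(p, q): number of positions where the bits differ.
hamming = sum(a != b for a, b in zip(p, q))

# h_i(x) = x[i]; count the choices of i for which h_i(p) == h_i(q).
collisions = sum(1 for i in range(d) if p[i] == q[i])
prob = collisions / d
```

For these two strings the bits differ in two of the eight positions, so a uniformly chosen bit collides with probability 6/8, which matches 1 - D(p,q)/d.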

  21. Projection-based LSH (2004) • For the ls norm, we define hash functions using 1-dimensional projections

  22. Projection-based LSH (2004) • p-stable distributions • A distribution D over R is called p-stable if there exists p ≥ 0 such that for any n real numbers v1,…,vn and i.i.d. variables X1,…,Xn with distribution D, the random variable ∑i vi Xi has the same distribution as the variable (∑i |vi|^p)^(1/p) X, where X is a random variable with distribution D • Stable distributions exist for any p ∈ (0,2], in particular: • A Cauchy distribution is 1-stable • A Gaussian distribution is 2-stable

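  23. The Idea • Draw a = (a1,…,ad) with i.i.d. entries from a p-stable distribution • For a vector v = (v1,…,vd), the dot product a·v is distributed as (∑i |vi|^p)^(1/p) X • It follows from s-stability that for two vectors p, q the difference (a·p − a·q) is distributed as ||p−q||s X, where X is a random variable with an s-stable distribution • Hash function: h(v) = ⌊(a·v + b) / w⌋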
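To make the projection idea concrete, here is a small sketch of a hash of the commonly used form h(v) = floor((a·v + b) / w). It assumes the 2-stable (Gaussian) case with a uniformly drawn offset b; the bucket width `w`, the trial count and the helper names are illustrative choices, not values from the talk.

```python
import math
import random

def make_hash(d, w, rng):
    """Draw one hash h(v) = floor((a.v + b) / w) with Gaussian (2-stable) a."""
    a = [rng.gauss(0.0, 1.0) for _ in range(d)]
    b = rng.uniform(0.0, w)  # random offset within one bucket width
    def h(v):
        return math.floor((sum(ai * vi for ai, vi in zip(a, v)) + b) / w)
    return h

def collision_rate(p, q, w=4.0, trials=1000, seed=0):
    """Fraction of independently drawn hashes on which p and q collide."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        h = make_hash(len(p), w, rng)
        hits += h(p) == h(q)
    return hits / trials

p = (1.0, 2.0, 3.0)
same = collision_rate(p, p)                   # identical points
near = collision_rate(p, (1.1, 2.0, 3.0))     # distance 0.1
far = collision_rate(p, (60.0, -40.0, 80.0))  # distance of roughly 100
```

Since a·p − a·q scales with ||p−q||, nearby points usually fall into the same bucket of width `w` while distant points rarely do, which is the locality-sensitive behavior the earlier slides ask for.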

  24. Integrating LSH and Median-shift • LSH is used to find local neighborhood • Spatial bandwidth = radius of NN queries • Both LSH and Median use random projections • Updating modes • LSH is easily updated with new points • Still requires running Median-shift again

  25. Mode-detection via Median-shift Observation: some applications require only finding the modes • Most points lie within a small number of bins • Modes are most likely to fall within these bins (areas of high density)

  26. Mode-detection via Median-shift • Construct LSH structure • Detect significant bins • Bins holding at least 0.1%-1% of the points • Select representative from each bin (Median) • Run Median-shift on weighted representatives • If necessary, propagate modes to form clusters
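The first stages of this pipeline (build the bins, keep the significant ones, pick a weighted representative per bin) could be sketched as follows. Several things here are assumptions made for illustration: a single hash table built from `k` Gaussian projection hashes, a 1% significance threshold, a random Tukey median as the representative, and all function and parameter names. The final median-shift pass over the weighted representatives is left out.

```python
import math
import random
from collections import defaultdict

def lsh_key(v, hashes):
    """Concatenate k projection hashes floor((a.v + b) / w) into one bin key."""
    return tuple(math.floor((sum(ai * vi for ai, vi in zip(a, v)) + b) / w)
                 for a, b, w in hashes)

def tukey_median(pts, dirs):
    """Point of `pts` with the largest random Tukey depth over `dirs`."""
    n = len(pts)
    depth = [n] * n
    for u in dirs:
        proj = [sum(a * x for a, x in zip(u, p)) for p in pts]
        for rank, i in enumerate(sorted(range(n), key=proj.__getitem__)):
            depth[i] = min(depth[i], rank + 1, n - rank)
    return pts[max(range(n), key=depth.__getitem__)]

def significant_representatives(points, min_frac=0.01, k=2, w=4.0, seed=0):
    """Return (representative, weight) for every bin holding >= min_frac of the data."""
    rng = random.Random(seed)
    d = len(points[0])
    hashes = [([rng.gauss(0.0, 1.0) for _ in range(d)], rng.uniform(0.0, w), w)
              for _ in range(k)]
    dirs = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(20)]
    bins = defaultdict(list)
    for p in points:
        bins[lsh_key(p, hashes)].append(p)
    min_count = min_frac * len(points)
    return [(tukey_median(members, dirs), len(members))
            for members in bins.values() if len(members) >= min_count]

# One dense cluster plus widely scattered noise.
data_rng = random.Random(1)
cluster = [(data_rng.uniform(-0.1, 0.1), data_rng.uniform(-0.1, 0.1))
           for _ in range(200)]
noise = [(data_rng.uniform(-500.0, 500.0), data_rng.uniform(-500.0, 500.0))
         for _ in range(100)]
points = cluster + noise
reps = significant_representatives(points)
```

Each entry of `reps` pairs a data point with the number of points in its bin. Those weights are what a weighted median-shift pass would consume in the next step, so the expensive iteration runs on a handful of representatives rather than on every point.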

  27. Experiments

  28. Experiments

  29. Performance • LSH Construction • Mode-detection vs. Mode-seeking

  30. Applications • Image Segmentation (figure panels: Median-shift, Mean-shift)

  31. Applications Chromatic Noise Filtering

  32. Applications Chromatic Noise Filtering

  33. Future Work • Patent pending • Hierarchical LSH construction • Enable multiple range queries (adaptive bandwidth) • More applications • Image collections • Video frame classification

  34. Thank you
