A Space-Optimal Data-Stream Algorithm for Coresets in the Plane

A Space-Optimal Data-Stream Algorithm for Coresets in the Plane. Pankaj K. Agarwal Joint work with Hai Yu. Extent Measures. Diameter, width, convex hull Simple shapes that enclose point sets Min-radius ball, min-volume box, min-radius cylinder


Presentation Transcript


  1. A Space-Optimal Data-Stream Algorithm for Coresets in the Plane Pankaj K. Agarwal Joint work with Hai Yu

  2. Extent Measures • Diameter, width, convex hull • Simple shapes that enclose point sets • Min-radius ball, min-volume box, min-radius cylinder • Min-width cylindrical shell, min-width spherical shell • Maintaining shape descriptors for moving points • Applications • Computer graphics • Solid modeling • Machine learning • Data mining • Sensor networks

  3. Approximation • ε-Kernel [A., Har-Peled, Varadarajan, 04]: a small subset that ε-approximates the directional width of the input points in all directions

  4. Approximation • ε-Kernel [A., Har-Peled, Varadarajan, 04]: a small subset that ε-approximates the directional width of the input points in all directions • An ε-kernel of size 1/ε^(d/2) is computable in O(n + 1/ε^(d-3/2)) time [A. et al., Chan] • Results in linear-time approximation algorithms for many extent measures • This talk: maintaining an ε-kernel Q of a stream S of points [Figure labels: direction u, directional width w_u]
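The ε-kernel definition on the slide above can be made concrete with a small check. This is only a sketch: it uses the standard condition that the subset's width is at least (1 − ε) times the full set's width in every direction, and it tests a finite sample of directions, so it can miss violations between samples. The names `directional_width`, `is_eps_kernel`, and `num_dirs` are illustrative and do not come from the talk.

```python
import math

def directional_width(points, theta):
    """Width of `points` in direction u = (cos theta, sin theta)."""
    ux, uy = math.cos(theta), math.sin(theta)
    dots = [x * ux + y * uy for x, y in points]
    return max(dots) - min(dots)

def is_eps_kernel(subset, points, eps, num_dirs=360):
    """Check w_u(subset) >= (1 - eps) * w_u(points) on sampled directions only."""
    for k in range(num_dirs):
        # Width is the same for u and -u, so half a turn is enough.
        theta = math.pi * k / num_dirs
        full = directional_width(points, theta)
        if directional_width(subset, theta) < (1 - eps) * full - 1e-12:
            return False
    return True

corners = [(0, 0), (1, 0), (0, 1), (1, 1)]
points = corners + [(0.5, 0.5)]
print(is_eps_kernel(corners, points, 0.0))          # corners preserve every width
print(is_eps_kernel([(0, 0), (1, 0)], points, 0.1))  # loses the vertical width
```

Because the convex hull vertices determine every directional width, any subset containing them passes the check with ε = 0; the interesting case is a much smaller subset with ε > 0.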

  5. Results • Previous results • Our main result: a space-optimal data-stream algorithm in R^2 • O(1/ε^(1/2)) space, O(log(1/ε)) amortized update time

  6. Algorithm Overview • Problem is easy if point set is fat

  7. Algorithm Overview • Problem is easy if point set is fat

  8. Algorithm Overview • Problem is easy if point set is fat • Keep track of nearest neighbor of each grid point • Can be implemented efficiently in O(log(1/ε)) time per point • Apply an affine transform if the point set is not fat
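The fat case above can be sketched as follows. The sketch assumes the points lie in the square [-1, 1]^2 and that the "grid points" are anchors spaced roughly sqrt(ε) apart on an enclosing circle; the class name `FatStreamKernel`, the radius 2, and the spacing constant are all assumptions made for illustration, not details taken from the slides. Each insertion scans all anchors, which costs O(1/ε^(1/2)) per point; the O(log(1/ε)) update time quoted on the slide needs a more careful data structure that is not shown here.

```python
import math

class FatStreamKernel:
    """Keeps, for each anchor on a circle, the nearest stream point seen so far."""

    def __init__(self, eps, radius=2.0):
        # Assumed spacing ~ sqrt(eps) between anchors: O(1/sqrt(eps)) anchors.
        count = max(4, math.ceil(2 * math.pi * radius / math.sqrt(eps)))
        self.anchors = [
            (radius * math.cos(2 * math.pi * k / count),
             radius * math.sin(2 * math.pi * k / count))
            for k in range(count)
        ]
        self.nearest = [None] * count
        self.best = [math.inf] * count

    def insert(self, p):
        # Linear scan for clarity; not the efficient update from the talk.
        for k, (ax, ay) in enumerate(self.anchors):
            d = math.hypot(p[0] - ax, p[1] - ay)
            if d < self.best[k]:
                self.best[k], self.nearest[k] = d, p

    def kernel(self):
        """Distinct stored points; at most one per anchor."""
        return sorted(set(q for q in self.nearest if q is not None))

sketch = FatStreamKernel(eps=0.01)
stream = [(0.0, 0.0), (1.0, 1.0), (-1.0, 1.0), (-1.0, -1.0), (1.0, -1.0), (0.5, -0.2)]
for p in stream:
    sketch.insert(p)
print(sketch.kernel())
```

The stored set never exceeds the number of anchors, which is what bounds the space independently of the stream length.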

  9. Algorithm Overview • Epochs and subepochs [Figure labels: o]

  10. Algorithm Overview • Epochs and subepochs [Figure labels: o, x_i]

  11. Algorithm Overview • Epochs and subepochs [Figure labels: o, x_i, 2·||ox_i||, an epoch]

  12. Algorithm Overview • Epochs and subepochs [Figure labels: o, x_i, y_j, 2·||ox_i||]

  13. Algorithm Overview • Epochs and subepochs [Figure labels: o, x_i, y_j, 2·||ox_i||, 2·d(y_j, ox_i)]

  14. Algorithm Overview • Epochs and subepochs [Figure labels: o, x_i, y_j, 2·||ox_i||, 2·d(y_j, ox_i), a subepoch]

  15. Algorithm Overview • Epochs and subepochs [Figure labels: o, x_i, y_j, 2·||ox_i||, 2·d(y_j, ox_i), a subepoch]

  16. Algorithm Overview • Epochs and subepochs [Figure labels: o, x_i, y_j, 2·||ox_i||, 2·d(y_j, ox_i), a subepoch]

  17. Algorithm Overview • Epochs and subepochs [Figure labels: o, x_i, y_j, 2·||ox_i||, 2·d(y_j, ox_i), a subepoch]

  18. Algorithm Overview • Epochs and subepochs [Figure labels: o, x_i, y_j, 2·||ox_i||, 2·d(y_j, ox_i), a subepoch]

  19. Algorithm Overview • Epochs and subepochs • Starting a new subepoch [Figure labels: o, x_i, y_{j+1}, 2·||ox_i||, 2·d(y_{j+1}, ox_i)]

  20. Algorithm Overview • Epochs and subepochs • Starting a new epoch [Figure labels: o, x_i, y_{j+1}, 2·||ox_i||, 2·d(y_{j+1}, ox_i)]

  21. Algorithm Overview • Epochs and subepochs • Starting a new epoch [Figure labels: o, x_{i+1}, 2·||ox_{i+1}||]
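One plausible reading of the epoch and subepoch figures above is a pair of doubling rules: a new epoch starts when an arriving point is more than twice as far from o as the current x_i, and a new subepoch starts when an arriving point is more than twice as far from the line ox_i as the current y_j. The sketch below only labels stream points under that reading. Taking o as the first stream point, and the function names `dist_to_line` and `label_stream`, are assumptions; the slides do not spell out these details.

```python
import math

def dist_to_line(p, o, x):
    """Distance from p to the line through o and x (o != x)."""
    dx, dy = x[0] - o[0], x[1] - o[1]
    cross = dx * (p[1] - o[1]) - dy * (p[0] - o[0])
    return abs(cross) / math.hypot(dx, dy)

def label_stream(points):
    """Label each point 'origin', 'new_epoch', 'new_subepoch', or 'same'."""
    o = x = y = None
    events = []
    for p in points:
        if o is None:
            o = p
            events.append("origin")
            continue
        r = math.hypot(p[0] - o[0], p[1] - o[1])
        if r == 0:
            events.append("same")  # duplicate of o carries no information
        elif x is None or r > 2 * math.hypot(x[0] - o[0], x[1] - o[1]):
            x, y = p, None  # distance from o more than doubled
            events.append("new_epoch")
        else:
            h = dist_to_line(p, o, x)
            if h > 0 and (y is None or h > 2 * dist_to_line(y, o, x)):
                y = p  # distance from line ox more than doubled
                events.append("new_subepoch")
            else:
                events.append("same")
    return events

stream = [(0, 0), (1, 0), (1.5, 0), (1, 0.1), (1, 0.15), (1, 0.3), (3, 0), (3, 1)]
print(label_stream(stream))
```

Under this rule the anchor distances at least double from one epoch (or subepoch) to the next, which is why only logarithmically many recent ones need attention on the later slides.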

  22. Algorithm Overview • Chan’s algorithm [Figure labels: o, x]

  23. Algorithm Overview • Chan’s algorithm • Keep last log(1/ε) epochs alive [Figure labels: o, x, "too close to point o"]

  24. Algorithm Overview • Chan’s algorithm • Keep last log(1/ε) epochs alive [Figure labels: o, x, "too close to point o"]

  25. Algorithm Overview • Chan’s algorithm • Keep last log(1/ε) epochs alive • Keep last log(1/ε) subepochs alive [Figure labels: o, x, "too close to point o", "too close to line ox"]

  26. Algorithm Overview • Chan’s algorithm • Keep last log(1/ε) epochs alive • Keep last log(1/ε) subepochs alive • Total space: (1/ε^(1/2)) log^2(1/ε) [Figure labels: o, x, "too close to point o", "too close to line ox"]

  27. New Ingredient • Prune points of previous epochs at the beginning of each new epoch • last log(1/ε) epochs [Figure labels: o]

  28. New Ingredient • Prune points of previous epochs at the beginning of each new epoch • last log(1/ε) epochs [Figure labels: o, x]

  29. New Ingredient • Prune points of previous epochs at the beginning of each new epoch • last log(1/ε) epochs [Figure labels: o, x]

  30. New Ingredient • Prune points of previous epochs at the beginning of each new epoch • last log(1/ε) epochs • j-th last: O(j^2/(2^j ε)^(1/2)) • Total space remains O(1/ε^(1/2))! [Figure labels: o, x, O(1), O(1/ε^(1/2))]

  31. New Ingredient • Prune points of previous subepochs at the beginning of each new subepoch • Algorithm for subepoch does not touch points of previous epochs • last log(1/ε) subepochs • j-th last: O(j/(2^j ε)^(1/2)) [Figure labels: o, x, O(1/ε^(1/2)), O(1)]
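The claim that the total space stays O(1/ε^(1/2)) follows because the per-epoch bounds j^2/(2^j ε)^(1/2) form a series dominated by a geometric one: factoring out 1/ε^(1/2) leaves the sum of j^2/2^(j/2), which converges. The short numeric check below illustrates this with all hidden constants set to 1; the function name `pruned_space` is illustrative.

```python
import math

def pruned_space(eps, power):
    """Sum over j = 1..ceil(log2(1/eps)) of j**power / sqrt(2**j * eps)."""
    count = max(1, math.ceil(math.log2(1 / eps)))
    return sum(j ** power / math.sqrt(2 ** j * eps) for j in range(1, count + 1))

for eps in (1e-2, 1e-4, 1e-6):
    base = 1 / math.sqrt(eps)
    # Ratios to 1/sqrt(eps): epochs use j^2, subepochs use j.
    print(eps, pruned_space(eps, 2) / base, pruned_space(eps, 1) / base)
```

Both ratios stay below a fixed constant as ε shrinks, whereas keeping O(1/ε^(1/2)) points for each of the last log(1/ε) epochs and subepochs gives the extra log^2(1/ε) factor shown for Chan's algorithm.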

  32. Intuition for Subepoch • Point set is always stretched in vertical direction [Figure labels: o, x, y_i, y_{i-j}, 2·||ox||, 2·d(y_i, ox), 2·d(y_{i-j}, ox); angles π/2, π/4, …, π/2^j, 0]

  33. Starting a Subepoch • Splitting and pruning [Figure labels: two angular partitions, π/2, π/4, …, π/2^j, 0 and π/2, π/4, …, π/2^j, π/2^(j+1), 0]

  34. Intuition for Epoch • Point set is stretched in several directions [Figure labels: angles π/2, π/4, …, π/2^j, 0]

  35. Intuition for Epoch • Point set is stretched in several directions [Figure labels: angles π/2, π/4, …, π/2^j, 0]

  36. Intuition for Epoch • Point set is stretched in several directions [Figure labels: two angular partitions, π/2, π/4, …, π/2^j, 0 and π/2, π/4, …, π/2^(j+1), 0]

  37. Intuition for Epoch • Point set is stretched in several directions • Overlay two angular partitions and maintain proper invariants [Figure labels: two angular partitions, π/2, π/4, …, π/2^j, 0 and π/2, π/4, …, π/2^(j+1), 0]
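The angular partitions in the figures above have boundaries at π/2, π/4, …, π/2^j, 0, so the sectors halve in angle as they approach direction 0. The sketch below builds such a partition, locates the sector containing a given angle, and shows that going from j to j + 1 splits only the last sector, which is one way to read the "splitting" step. The function names are illustrative, and what the algorithm stores per sector is not specified on the slides and is not modeled here.

```python
import math

def angular_partition(j):
    """Boundaries pi/2, pi/4, ..., pi/2**j, 0 in decreasing order."""
    return [math.pi / 2 ** k for k in range(1, j + 1)] + [0.0]

def sector_index(theta, bounds):
    """Index s such that bounds[s + 1] <= theta <= bounds[s]."""
    if not 0.0 <= theta <= bounds[0]:
        raise ValueError("theta must lie in [0, pi/2]")
    for s in range(len(bounds) - 1):
        if theta >= bounds[s + 1]:
            return s
    raise AssertionError("unreachable: the last boundary is 0")

before = angular_partition(3)  # pi/2, pi/4, pi/8, 0
after = angular_partition(4)   # the sector [0, pi/8] is split at pi/16
print(sector_index(0.1, before), sector_index(0.1, after))
```

A partition of this shape has only j + 1 boundaries, so with j up to log(1/ε) the number of sectors stays logarithmic.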

  38. Extensions • Space-optimal approximation algorithm for maintaining width (smallest enclosing box, etc.) in R^2 • Space-optimal algorithm for maintaining (k, ε)-kernels in R^2 • O(k/ε^(1/2)) space, O(k log(1/ε)) update time • Improved algorithm for maintaining ε-kernels in R^d • O(1/ε^(d-3/2)) space, O(log(1/ε)) update time • Similar results for (k, ε)-kernels • Improved algorithms for numerous data-stream problems related to extent measures

  39. Open Problems • Is the query time log(1/ε) optimal if O(1/ε^(1/2)) space is allowed? • Coresets in the sliding window model?
