
Dimension reduction for finite trees in L1

Presentation Transcript


  1. Dimension reduction for finite trees in L1. James R. Lee and Mohammad Moharrami (University of Washington), Arnaud de Mesmay (École Normale Supérieure).

  2. Dimension reduction in Lp. Given an n-point subset X ⊆ R^d, find a mapping f : X → R^k such that for all x, y ∈ X, ‖x − y‖_p ≤ ‖f(x) − f(y)‖_p ≤ D·‖x − y‖_p. Here n = size of X, k = target dimension, D = distortion. Dimension reduction as "geometric information theory".
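Added for illustration (not from the slides): a minimal Python sketch that computes the distortion D of a candidate map f on a finite point set, as the ratio of the largest to the smallest pairwise stretch (which is the distortion after optimally rescaling f); the point set and the map are arbitrary placeholders.

    import numpy as np

    def distortion(X, f, p=1):
        # Ratio of the largest to the smallest pairwise stretch of f;
        # this equals the best achievable D after rescaling f.
        ratios = []
        for i in range(len(X)):
            for j in range(i + 1, len(X)):
                orig = np.linalg.norm(X[i] - X[j], ord=p)
                img = np.linalg.norm(f(X[i]) - f(X[j]), ord=p)
                ratios.append(img / orig)
        return max(ratios) / min(ratios)

    # toy example: drop the last coordinate of a few random 3-d points
    X = np.random.rand(10, 3)
    print(distortion(X, lambda x: x[:2], p=1))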

  3. The case p=2. When p=2, the Johnson-Lindenstrauss transform gives, for every n-point subset X ⊆ R^d and ε > 0, an embedding into R^k with distortion 1+ε and dimension k = O(log n / ε²). Applications to…
     • Statistics over data streams
     • Nearest-neighbor search
     • Compressed sensing
     • Quantum information theory
     • Machine learning
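An illustrative sketch (not from the talk) of the Johnson-Lindenstrauss transform as a random Gaussian projection; the constant 8 in the target dimension is an arbitrary choice for the demo.

    import numpy as np

    rng = np.random.default_rng(0)
    n, d, eps = 200, 1000, 0.25
    X = rng.standard_normal((n, d))

    k = int(8 * np.log(n) / eps**2)               # target dimension O(log n / eps^2)
    G = rng.standard_normal((d, k)) / np.sqrt(k)  # random Gaussian projection
    Y = X @ G

    i, j = 3, 7
    # the two distances should agree up to a factor of roughly 1 +/- eps
    print(np.linalg.norm(X[i] - X[j]), np.linalg.norm(Y[i] - Y[j]))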

  4. Dimension reduction in L1. It is natural to next consider p=1 (n = size of X, k = target dimension, D = distortion). History:
     - Carathéodory's theorem yields D = 1 and k ≤ n(n−1)/2.
     - [Schechtman'87, Bourgain-Lindenstrauss-Milman'89, Talagrand'90] Linear mappings (sampling + reweighting) yield D ≤ 1+ε and k = O(n log n) for fixed ε.
     - [Batson-Spielman-Srivastava'09, Newman-Rabinovich'10] Sparsification techniques yield D ≤ 1+ε and k = O(n/ε²).

  5. The Brinkman-Charikar lower bound. [Brinkman-Charikar'03]: There are n-point subsets of L1 for which distortion D requires dimension n^{Ω(1/D²)}; the original argument is very technical, based on LP-duality. [Brinkman-Karagiozova-L'07]: The lower bound is tight for these spaces. [L-Naor'04]: A one-page argument based on uniform convexity.

  6. More lower bounds. [Andoni-Charikar-Neiman-Nguyen'11]: There are n-point subsets such that distortion 1+ε requires dimension nearly linear in n (the exponent of n tends to 1 as ε → 0). [Regev'11]: A simple, elegant, information-theoretic proof of both the Brinkman-Charikar and ACNN lower bounds. Low-dimensional embedding ⇒ encoding scheme.

  7. The simplest of L1 objects. A tree metric is a graph-theoretic tree T = (V, E) together with non-negative lengths on the edges. It is easy to embed isometrically into R^E equipped with the L1 norm (one coordinate per edge).
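A small sketch (added here) of that standard isometric embedding: fix a root and send each vertex to the vector in R^E whose coordinate at edge e is the length of e if e lies on the root-to-vertex path and 0 otherwise; the L1 distance between images then equals the tree distance. The toy tree is a made-up example.

    # Toy tree given by parent pointers; length[v] = length of the edge (v, parent[v]).
    parent = {"r": None, "a": "r", "b": "r", "c": "a", "d": "a"}
    length = {"a": 2.0, "b": 1.0, "c": 3.0, "d": 1.5}
    edges = list(length)                 # one L1 coordinate per edge

    def embed(v):
        # coordinate at edge e = length[e] if e is on the root-to-v path, else 0
        x = dict.fromkeys(edges, 0.0)
        while parent[v] is not None:
            x[v] = length[v]
            v = parent[v]
        return x

    def l1(x, y):
        return sum(abs(x[e] - y[e]) for e in edges)

    # tree distance d(c, d) = 3.0 + 1.5 = 4.5, and the L1 distance agrees
    print(l1(embed("c"), embed("d")))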

  8. Dimension reduction for trees in L1. Charikar and Sahai (2002) showed that for trees one can achieve small distortion in a number of dimensions polylogarithmic in n, and A. Gupta improved their bound. In 2003 in Princeton, with Gupta and Talwar, we asked: is dimension O(log n) possible? Even for complete binary trees?

  9. Dimension reduction for trees in L1. Theorem: For every n-point tree metric and every ε > 0, one can achieve distortion D = 1+ε and dimension k = C(ε)·log n. (One can do better for "symmetric" trees.) Outline:
     • The complete binary tree via the local lemma
     • Schulman's tree codes
     • The complete binary tree via re-randomization
     • Extension to general trees

  10. Dimension reduction for the complete binary tree. Every edge gets B bits ⇒ target dimension = B·log₂ n. Choose the edge labels uniformly at random. Nodes at tree distance L then get labels at Hamming distance Ω(B·L), except with probability 2^{-Ω(B·L)}.
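An illustrative simulation of my reading of this construction (the zero-padding convention is an assumption, not spelled out on the slide): each edge of a depth-h complete binary tree gets a uniformly random B-bit label, a vertex is encoded by concatenating the labels along its root path and padding to length B·h, and the Hamming distance of two encodings is compared with their tree distance.

    import random
    random.seed(1)

    h, B = 10, 8          # depth of the complete binary tree, bits per edge
    labels = {}           # edge ending at vertex v (a binary string) -> random B-bit label

    def edge_label(v):
        if v not in labels:
            labels[v] = [random.getrandbits(1) for _ in range(B)]
        return labels[v]

    def encode(v):
        # concatenate the edge labels on the root-to-v path, zero-pad to length B*h
        bits = []
        for i in range(1, len(v) + 1):
            bits += edge_label(v[:i])
        return bits + [0] * (B * (h - len(v)))

    def hamming(x, y):
        return sum(a != b for a, b in zip(x, y))

    def tree_dist(u, v):
        c = 0
        while c < min(len(u), len(v)) and u[c] == v[c]:
            c += 1
        return (len(u) - c) + (len(v) - c)

    u, v = "0110101010", "0110111100"   # two leaves of the depth-10 tree
    print(tree_dist(u, v), hamming(encode(u), encode(v)))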

  11. Dimension reduction for the complete binary tree. Every edge gets B bits ⇒ target dimension = B·log₂ n. Choose the edge labels uniformly at random. Siblings have probability 2^{-B} of receiving the same label, yet there are n/2 of them, so for constant B some sibling pairs collide.
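A quick numeric check of this obstruction (illustrative numbers only): with n/2 sibling pairs each colliding with probability 2^{-B}, a constant B leaves many expected collisions, while taking B ≈ log₂ n to avoid them would raise the dimension to Θ(log² n).

    import math

    n = 2**20                                  # nodes in the complete binary tree
    for B in (4, 8, 20):
        expected_collisions = (n / 2) * 2**-B  # expected identical sibling labels
        dimension = B * int(math.log2(n))      # B bits per edge times depth log2(n)
        print(B, expected_collisions, dimension)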

  12. Lovász Local Lemma. Pairs at distance L are "good" except with probability 2^{-Ω(B·L)}, while the number of dependent "distance-L" events is only 2^{O(L)}. LLL + a sum over the levels L ⇒ a good embedding.
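A toy check (mine, with assumed parameter shapes) of the symmetric local lemma condition e·p·(d+1) ≤ 1, taking failure probability p = 2^{-c·B·L} and dependency degree d = 2^{a·L}; once B is a large enough constant, the condition holds simultaneously for every L.

    import math

    def lll_ok(p, d):
        # symmetric Lovász Local Lemma condition: e * p * (d + 1) <= 1
        return math.e * p * (d + 1) <= 1

    a, c = 3.0, 0.1      # assumed exponents, for illustration only
    B = 50               # bits per edge
    print(all(lll_ok(2 ** (-c * B * L), 2 ** (a * L)) for L in range(1, 200)))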

  13. Schulman's tree codes. The LLL argument is difficult to extend to arbitrary trees. The construction above is the same as that of Schulman'96: tree codes for interactive communication.

  14. Re-randomization. Random isometry: for every level (of the copy pictured on the right of the slide), exchange 0's and 1's with probability 1/2, independently for each level.
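A sketch (added here) of why such a level-wise flip is an isometry: applying the same flip decision, one per level block of B coordinates, to every encoding is an XOR with a fixed mask, which preserves all pairwise Hamming distances while refreshing the labels. The encodings below are random placeholders.

    import random
    random.seed(2)

    h, B = 6, 4
    # placeholder encodings: one (h*B)-bit vector per vertex
    vectors = [[random.getrandbits(1) for _ in range(h * B)] for _ in range(5)]

    # one random flip decision per level, applied to that level's block in every vector
    flip = [random.getrandbits(1) for _ in range(h)]
    mask = [flip[i // B] for i in range(h * B)]

    def apply_mask(x):
        return [b ^ m for b, m in zip(x, mask)]

    def hamming(x, y):
        return sum(a != b for a, b in zip(x, y))

    # XOR with a fixed mask is an isometry of the Hamming cube: distances are unchanged
    print(hamming(vectors[0], vectors[1]),
          hamming(apply_mask(vectors[0]), apply_mask(vectors[1])))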

  15. Re-randomization. Pairs at distance L are "good" with high probability, while the number of pairs at distance L is at most n·2^{O(L)}; comparing the two allows a union bound.
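Added for concreteness: a brute-force count of the pairs at each tree distance L in a small complete binary tree, to make the counting side of the argument explicit.

    from itertools import product, combinations
    from collections import Counter

    h = 6  # depth; vertices are the binary strings of length 0..h
    nodes = [""] + ["".join(p) for k in range(1, h + 1) for p in product("01", repeat=k)]

    def dist(u, v):
        c = 0
        while c < min(len(u), len(v)) and u[c] == v[c]:
            c += 1
        return (len(u) - c) + (len(v) - c)

    counts = Counter(dist(u, v) for u, v in combinations(nodes, 2))
    for L in sorted(counts):
        print(L, counts[L])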

  16. Extension to general trees. Unfortunately, the general case is technical (the paper is 50 pages). Obstacles:
     • General trees do not have O(log n) depth: use the "topological depth" of Matoušek.
     • How many coordinates to change per edge, and by what magnitude? Handled by a multi-scale entropy functional.

  17. Open problems.
     • Close the gap: for distortion 10, say, what is the right target dimension?
     • Other Lp norms: nothing non-trivial is known for other values of p.
     • Coding and dimension reduction: extend / make explicit the connection between L1 dimension reduction and information theory.
