geometric embeddings and graph expansion
James R. Lee
Institute for Advanced Study (Princeton)
University of Washington (Seattle)

outline
in the talk:
• Philosophy of geometric embeddings
• Example: Finding balanced cuts in graphs
• Four important open problems
not in the talk:
• No proofs (one slide). The mathematics borrows from high-dimensional convex geometry, functional analysis, harmonic analysis, differential geometry... (see other talks on my web page)
So you should ask questions if something is confusing!

geometric embeddings in CS
[Diagram: combinatorial problem → (embedding) → geometric representation in a nicer geometric space → combinatorial solution]

connections in CS
• geometric search
• clustering
• dimension reduction
• machine learning
• computational biology
• approximation algorithms
• divide and conquer
• network design
• graph layout
• tree decompositions
• geometric optimization
• semi-definite programming
• PCPs, unique games
• Fourier analysis of boolean functions

graph expansion and the sparsest cut
Input: A graph G=(V,E).
For a cut (S, S̄), let E(S, S̄) denote the edges crossing the cut.
The sparsity of S is the value [formula not preserved].
The SPARSEST CUT problem is to find the cut that minimizes the sparsity.
This problem is NP-hard, so we try to find approximately optimal cuts (approximation algorithms).
[Figure: a cut (S, S̄) with the crossing edges E(S, S̄)]

graph expansion and the sparsest cut
Given a graph G=(V,E), we want to [objective not preserved]
• Clustering
• Divide & conquer algorithms

graph expansion and the sparsest cut
Given a graph G=(V,E), we want to [objective not preserved]
This is actually the EDGE EXPANSION problem. The full SPARSEST CUT problem is a weighted version.

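A minimal sketch of the EDGE EXPANSION objective, assuming the common normalization |E(S, S̄)| / min(|S|, |S̄|); the exact formula intended on this slide is an assumption, and the function names are hypothetical. The brute-force search is exponential and only meant to make the objective concrete on tiny graphs.

```python
from itertools import combinations


def cut_edges(edges, S):
    """Count edges with exactly one endpoint in S."""
    S = set(S)
    return sum(1 for u, v in edges if (u in S) != (v in S))


def edge_expansion_of_cut(vertices, edges, S):
    """Assumed objective: |E(S, S-bar)| / min(|S|, |S-bar|)."""
    S = set(S)
    smaller_side = min(len(S), len(vertices) - len(S))
    return cut_edges(edges, S) / smaller_side


def brute_force_best_cut(vertices, edges):
    """Try every nonempty S with |S| <= |V|/2 (exponential time)."""
    vertices = list(vertices)
    best = None
    for k in range(1, len(vertices) // 2 + 1):
        for S in combinations(vertices, k):
            value = edge_expansion_of_cut(vertices, edges, S)
            if best is None or value < best[0]:
                best = (value, set(S))
    return best


# Two triangles joined by a single edge (2, 3).
TRIANGLES = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
```

On `TRIANGLES`, the best cut separates the two triangles: one crossing edge over a side of size 3.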
where is the geometry?
Leighton-Rao (1988) approach via LP duality
d is a metric on V if d(x,y) = d(y,x) and d(x,y) ≤ d(x,z) + d(z,y) ∀ x,y,z ∈ V
“cut metric”:
d(x,y) = 1 if x,y are on different sides of S
d(x,y) = 0 otherwise
[Figure: the two sides S and S̄ of a cut]

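As a small check of the definition above, the following sketch builds the cut metric of a set S and verifies symmetry and the triangle inequality by brute force. The helper names are hypothetical. Note that distinct points on the same side of the cut are at distance 0, which the definition above allows.

```python
from itertools import product


def cut_metric(S):
    """d(x, y) = 1 if x and y are on different sides of S, else 0."""
    S = set(S)
    return lambda x, y: 1 if (x in S) != (y in S) else 0


def satisfies_metric_axioms(points, d):
    """Check symmetry and the triangle inequality over all triples."""
    points = list(points)
    for x, y, z in product(points, repeat=3):
        if d(x, y) != d(y, x):
            return False
        if d(x, y) > d(x, z) + d(z, y):
            return False
    return True
```

For example, `satisfies_metric_axioms(range(5), cut_metric({0, 1}))` holds, while a function with d(0,1) = 5 and all other distances 1 fails the triangle inequality.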
where is the geometry?
Leighton-Rao (1988) approach via LP duality
d is a metric on V if d(x,y) = d(y,x) and d(x,y) ≤ d(x,z) + d(z,y) ∀ x,y,z ∈ V
can minimize [objective not preserved] with a linear program
dual of the multi-commodity flow LP:
- every edge has capacity 1
- send 1 unit of flow from x → y for every x,y ∈ V

finding cuts using embeddings
Now we find a cut using LP relaxation + embeddings [Linial London Rabinovich 1992]
1. Want to find a good cut in G.
2. Solve a linear program to get a metric d.
3. Embed the metric into a Euclidean space.
4. Use a geometric algorithm to find S. (random hyperplane cut)
[Diagram: unknown cut (S, S̄) in G → LP relaxation → metric d → points in R^n → cut (S, S̄)]

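Step 4 can be sketched as follows, assuming steps 2 and 3 have already produced a point in R^n for every vertex. This is a simplified stand-in, not the analysis from the cited work: it projects the points onto one random Gaussian direction and sweeps all thresholds along that direction, scoring each cut by |E(S, S̄)| / min(|S|, |S̄|). The function name, the scoring rule, and the input layout are assumptions.

```python
import random


def sweep_cut(points, edges, rng=None):
    """points: dict vertex -> tuple of coordinates; edges: list of (u, v).

    Projects onto a random direction, then tries every threshold cut
    along that direction and returns (best value, best side S).
    """
    if rng is None:
        rng = random.Random(0)
    dim = len(next(iter(points.values())))
    direction = [rng.gauss(0.0, 1.0) for _ in range(dim)]

    def projection(v):
        return sum(a * b for a, b in zip(points[v], direction))

    order = sorted(points, key=projection)
    best_value, best_S = float("inf"), None
    for k in range(1, len(order)):
        S = set(order[:k])
        crossing = sum(1 for u, v in edges if (u in S) != (v in S))
        value = crossing / min(k, len(order) - k)
        if value < best_value:
            best_value, best_S = value, S
    return best_value, best_S


# Two triangles joined by one edge, embedded as two clusters on a line.
EDGES = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
POINTS = {0: (0.0,), 1: (0.1,), 2: (0.2,), 3: (5.0,), 4: (5.1,), 5: (5.2,)}
```

When the embedding already separates the two clusters, as in `POINTS`, the sweep recovers the cut between the triangles.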
embeddings and distortion
Given a metric space (X,d), a Euclidean embedding of X is a mapping f : X → R^n.
The distortion of f is the smallest number D such that for all x,y ∈ X: [inequality not preserved]
Distortion measures how well f preserves the structure of X.

embeddings and distortion
Given a metric space (X,d), a Euclidean embedding of X is a mapping f : X → R^n.
The distortion of f is the smallest number D such that for all x,y ∈ X: [inequality not preserved]
Depending on the application, sometimes we consider the L1 norm or the L2 norm.
• Embeddings into L2 are stronger than L1 embeddings
• L1 embeddings are good enough for finding sparse cuts
• We have many fewer techniques for analyzing L1 embeddings

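A small sketch of how distortion can be computed for a concrete map, assuming the usual scale-invariant form: the largest expansion of any pair times the largest contraction of any pair. The function name and the choice of example are assumptions. It also assumes distinct points have positive distance and distinct images.

```python
import math
from itertools import combinations


def distortion(points, d, f, norm=2):
    """points: iterable of labels; d: metric function; f: dict label -> coords.

    Returns (max expansion) * (max contraction) under the L1 or L2 norm.
    """
    expansion = contraction = 0.0
    for x, y in combinations(list(points), 2):
        original = d(x, y)
        diffs = [abs(a - b) for a, b in zip(f[x], f[y])]
        if norm == 1:
            embedded = sum(diffs)
        else:
            embedded = math.sqrt(sum(t * t for t in diffs))
        expansion = max(expansion, embedded / original)
        contraction = max(contraction, original / embedded)
    return expansion * contraction


def cycle4_distance(x, y):
    """Shortest-path metric on the 4-cycle 0-1-2-3-0."""
    gap = abs(x - y)
    return min(gap, 4 - gap)


# The 4-cycle mapped to the corners of the unit square.
SQUARE = {0: (0, 0), 1: (1, 0), 2: (1, 1), 3: (0, 1)}
```

With the 4-cycle mapped to the unit square, the L2 distortion is √2 (opposite corners shrink from 2 to √2), while under the L1 norm the same map is exact.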
first results
[Bourgain 1985] Every n-point metric space has a Euclidean embedding (L2 norm) with distortion O(log n).
[Linial-London-Rabinovich, Aumann-Rabani STOC’92]
- Can use this to get an O(log n)-approximation for the SPARSEST CUT problem.
- Bourgain’s result is tight (using expander graphs)

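The construction usually associated with Bourgain's theorem uses coordinates of the form d(x, A) for random subsets A sampled at geometrically decreasing densities. The sketch below shows only the shape of that construction; the number of repetitions per scale, the handling of empty samples, and the function name are assumptions, and the O(log n) distortion guarantee depends on an analysis and normalization not reproduced here.

```python
import math
import random


def random_subset_embedding(points, d, repetitions=None, seed=0):
    """Map each point x to the vector (d(x, A_1), ..., d(x, A_k)).

    For each scale j = 1..ceil(log2 n), subsets are sampled by keeping
    every point independently with probability 2**-j.
    """
    rng = random.Random(seed)
    points = list(points)
    scales = max(1, math.ceil(math.log2(len(points))))
    if repetitions is None:
        repetitions = scales
    subsets = []
    for j in range(1, scales + 1):
        for _ in range(repetitions):
            A = [p for p in points if rng.random() < 2.0 ** -j]
            if A:  # skip empty samples (an assumption of this sketch)
                subsets.append(A)
    if not subsets:
        subsets.append(points[:1])
    return {x: [min(d(x, a) for a in A) for A in subsets] for x in points}
```

Each coordinate x ↦ d(x, A) is 1-Lipschitz by the triangle inequality, so no single coordinate ever stretches a distance; the work in the theorem is showing that enough coordinates also preserve a good fraction of it.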
new results
semi-definite programming
special family of metric spaces: “negative type”
A metric space (X,d) is said to be of negative type if we can write d(u,v) = ||x_u − x_v||² for all u,v ∈ X, where x_u ∈ R^n for every u ∈ X.

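A short sketch of what the negative-type condition asks for, assuming the usual definition in which d(u,v) is the squared Euclidean distance between vectors x_u and x_v. Squared distances do not satisfy the triangle inequality automatically, so the check below tests whether a given set of vectors actually induces a metric this way. The function names are hypothetical.

```python
from itertools import permutations


def squared_distance(p, q):
    """Squared Euclidean distance between two coordinate tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def induces_negative_type_metric(vectors, tol=1e-12):
    """vectors: dict label -> coords.

    True if d(u, v) = ||x_u - x_v||^2 obeys the triangle inequality.
    """
    for u, v, w in permutations(list(vectors), 3):
        direct = squared_distance(vectors[u], vectors[v])
        detour = (squared_distance(vectors[u], vectors[w])
                  + squared_distance(vectors[w], vectors[v]))
        if direct > detour + tol:
            return False
    return True


HYPERCUBE = {"00": (0, 0), "01": (0, 1), "10": (1, 0), "11": (1, 1)}
COLLINEAR = {"a": (0,), "b": (1,), "c": (2,)}
```

The corners of a hypercube pass (squared distance equals Hamming distance), while three collinear points fail, since 4 > 1 + 1.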
embedding overview
metric spaces have various scales

embedding overview
exploit non-trivial interaction between scales
• Measured descent: New multi-scale embedding technique [Krauthgamer-L-Mendel-Naor FOCS’04]

embedding overview
single-scale analysis via geometric chaining argument
• Measured descent: New multi-scale embedding technique [Krauthgamer-L-Mendel-Naor FOCS’04]
• O(√log n)-approximation algorithm for EDGE EXPANSION [Arora-Rao-Vazirani STOC’04]: new techniques in high-dimensional convex geometry

embedding overview
• Measured descent: New multi-scale embedding technique [Krauthgamer-L-Mendel-Naor FOCS’04]
• O(√log n)-approximation algorithm for EDGE EXPANSION [Arora-Rao-Vazirani STOC’04]: new techniques in high-dimensional convex geometry
• Gluing embeddings with “partitions of unity” [L SODA’05]

embedding overview
upper bound [CGR 05]
• Measured descent: New multi-scale embedding technique [Krauthgamer-L-Mendel-Naor FOCS’04]
• O(√log n)-approximation algorithm for EDGE EXPANSION [Arora-Rao-Vazirani STOC’04]: new techniques in high-dimensional convex geometry
• Gluing embeddings with “partitions of unity” [L SODA’05]
• Improvements to the ARV geometric structure theorems [Chawla-Gupta-Racke SODA’05, L 05]

embedding overview
• Measured descent: New multi-scale embedding technique [Krauthgamer-L-Mendel-Naor FOCS’04]
• O(√log n)-approximation algorithm for EDGE EXPANSION [Arora-Rao-Vazirani STOC’04]: new techniques in high-dimensional convex geometry
• Gluing embeddings with “partitions of unity” [L SODA’05]
• Improvements to the ARV geometric structure theorems [Chawla-Gupta-Racke SODA’05, L 05]
• O(√log n · log log n)-approximation for SPARSEST CUT [Arora-L-Naor STOC’05, L 06], based on new Euclidean embedding theorems for “negative type” spaces

important problems: negative-type metrics
analyze this semi-definite program: [SDP not preserved]
- Analysis is equivalent to finding the best distortion of n-point “negative type” metrics into Euclidean space with the L1 norm
Upper bound: [Arora-L-Naor STOC’05, L 06]
Lower bound: [Khot-Vishnoi FOCS’05]
• Related to Fourier analysis of boolean functions, probabilistically checkable proofs (PCPs), unique games conjecture, geometric analysis...

important problems: edit distance
For two strings s,t ∈ {A,C,G,T}^d,
d_EDIT(s,t) = minimum number of insert/delete character operations to change s → t
- What is the distortion needed to embed d_EDIT into a Euclidean space (with the L1 norm)?
(Applications to nearest-neighbor search, sketching, fast distance computations...)
Upper bound: [Ostrovsky-Rabani STOC’05]
Lower bound: [Krauthgamer-Rabani SODA’06]
[Figure: example strings over {A,C,G,T}]

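For concreteness, the insert/delete-only distance defined above can be computed through the longest common subsequence (LCS): every character outside a common subsequence must be deleted from s or inserted from t, so the distance is |s| + |t| − 2·LCS(s,t). The sketch below uses the standard quadratic-time dynamic program; the function name is hypothetical.

```python
def indel_distance(s, t):
    """Minimum number of single-character insertions and deletions
    needed to turn s into t, via the LCS dynamic program."""
    prev = [0] * (len(t) + 1)  # LCS lengths for the previous prefix of s
    for a in s:
        cur = [0]
        for j, b in enumerate(t, 1):
            if a == b:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return len(s) + len(t) - 2 * prev[-1]
```

For example, `indel_distance("AAGTCA", "ATCA")` is 2: delete one A and the G. Note that without substitutions, changing a single character costs 2.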
important problems: vertex separators
Earlier, we talked about edge cuts. We can also consider vertex cuts.
• Most important application: Finding low-treewidth decompositions (useful as a basic step in many algorithms)
• Best approximation algorithms are from [Feige-Hajiaghayi-L STOC’05]
• Requires a stronger kind of embedding.
• We can only extend some of the known techniques.

important problems: planar multi-flows
Max-flow / Min-cut theorem: In any graph G, for any two nodes s and t, the value of the minimum s-t cut = the value of the maximum s-t flow.
What about multi-commodity flows?
- In general graphs, there is no max-flow/min-cut theorem for multi-flows. The gap can be log(k), k = # of flows
• What about planar graphs?
Conjecture: The max-flow/min-cut gap is only O(1) for multi-flows on planar graphs.
[Figure: graph G with terminal pairs (s1,t1), (s2,t2), (s3,t3)]

important problems: planar multi-flows
Max-flow / Min-cut theorem: In any graph G, for any two nodes s and t, the value of the minimum s-t cut = the value of the maximum s-t flow.
Conjecture: The max-flow/min-cut gap is only O(1) for multi-flows on planar graphs.
This conjecture is equivalent to the question: If d(u,v) is the shortest-path metric on a planar graph G, does the metric space (G,d) embed into a Euclidean space (with the L1 norm) with O(1) distortion?

conclusion
- Embeddings are a fundamental tool in Computer Science
- Many applications to other parts of science
- Very rich, exciting mathematics
- Lots of important open problems at various levels of difficulty
http://www.cs.berkeley.edu/~jrl