Local and Global Embeddings of Metric Spaces Yury Makarychev, Microsoft Research Joint work with Moses Charikar (Princeton University) and Konstantin Makarychev (IBM T.J. Watson Research Center)
Embeddings into normed spaces • Finite metric space on n points • Embed it into a normed space with small distortion • Applications to • Approximation Algorithms • Online Algorithms • Dealing with Large Datasets
Embeddings into ℓ1 • Bourgain’s theorem (1985): • Every n-point metric space embeds into ℓ1 with distortion O(log n) • Linial, London, and Rabinovich: • This result is tight • Special cases?
Local versus global distortion • Metric on n points • Property: embeddability into ℓ1 • Local distortion D: distortion for embedding every subset of size k • Global distortion: distortion for embedding the entire metric ALNRRV = Arora, Lovász, Newman, Rabani, Rabinovich, and Vempala
Local versus Global • Local properties: properties of subsets • Global properties: properties of entire set • What do local properties tell us about global properties? • Property of interest: embeddability in normed spaces
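To fix notation for the rest of the deck (the symbols below are mine, not the slides’), the two quantities being compared are:

```latex
% c_1(Y, d) denotes the least distortion of any embedding of the finite metric space (Y, d) into \ell_1.
\[
D_{\mathrm{local}}(k) \;=\; \max_{\substack{S \subseteq X \\ |S| = k}} c_1(S, d),
\qquad\qquad
D_{\mathrm{global}} \;=\; c_1(X, d).
\]
```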
Motivations • Natural mathematical question • Questions of similar flavor • Embedding into ℓ2^d • Helly’s theorem • Ramsey theory • Lift-and-project methods in optimization • Can guarantee local properties • Need a guarantee on the global property
Definition of ℓ1 • For each x define a random variable ξ_x • Distribution over subsets (cuts) A ⊂ X • d_{ℓ1}(x, y) = ‖ξ_x − ξ_y‖_{ℓ1} = E|ξ_x − ξ_y| equals the probability that x and y are separated by the cut (times a scaling factor)
D-distortion embedding into ℓ1 • For each x define a random variable ξ_x • d(x, y) equals, up to a factor of D, the probability that x and y are separated by the cut: d(x, y)/D ≤ E|ξ_x − ξ_y| = ‖ξ_x − ξ_y‖_{ℓ1} ≤ d(x, y)
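A minimal sketch (my illustration, with a made-up point set and cut distribution) of the definition above: a distribution over cuts A ⊆ X plus a scale factor determines an ℓ1-embeddable metric, with d(x, y) proportional to the probability that a random cut separates x and y.

```python
# A toy distribution over cuts of a 4-point set X (made up for illustration):
# each cut is a frozenset A, drawn with the given probability.
import itertools

X = ["a", "b", "c", "d"]
cuts = {
    frozenset({"a"}): 0.25,
    frozenset({"a", "b"}): 0.50,
    frozenset({"a", "b", "c"}): 0.25,
}
scale = 3.0  # the "scaling factor" mentioned on the slide

def separation_probability(x, y):
    """Probability that a random cut A puts x and y on different sides."""
    return sum(p for A, p in cuts.items() if (x in A) != (y in A))

def d(x, y):
    """The induced metric: scale times the separation probability."""
    return scale * separation_probability(x, y)

# Sanity check: d satisfies the triangle inequality, since each cut metric does
# and d is a nonnegative combination of cut metrics.
for x, y, z in itertools.permutations(X, 3):
    assert d(x, z) <= d(x, y) + d(y, z) + 1e-9
print({(x, y): d(x, y) for x, y in itertools.combinations(X, 2)})
```

Conversely, every ℓ1 metric arises this way, which is why the talk can reason about ℓ1 embeddings entirely in terms of distributions over cuts.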
Results Upper bound: O(D · log(n/k)) Lower bound: Ω(D · log(n/k)) when D is bounded away from 1 D – local distortion ALNRRV = Arora, Lovász, Newman, Rabani, Rabinovich, and Vempala
Results • Upper and lower bounds: Θ(D · log(n/k)) when the local distortion D is bounded away from 1 • Lower bounds: Ω(log(n/k) / log(1/δ)) for D = 1 + δ; Ω(log n / (log k + log log n)) for D = 1 • D – local distortion
Upper bound • Every size-k subset of (X, d) embeddable into ℓ1 with distortion D ⇒ (X, d) embeddable into ℓ1 with distortion O(D · log(n/k)) • Direct sum of two embeddings • Handle large and small distances separately • Each embedding doesn’t increase distances
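A quick sketch (my illustration) of the “direct sum of two embeddings” step: the two embeddings are concatenated coordinate-wise, so their ℓ1 distances add, and if neither embedding expands distances then neither does their direct sum.

```python
# Direct sum of two l1 embeddings: concatenate coordinates; l1 distances add.
def direct_sum(embedding_a, embedding_b):
    """embedding_a, embedding_b: dicts mapping each point to a tuple of coordinates."""
    return {x: embedding_a[x] + embedding_b[x] for x in embedding_a}

def l1_dist(p, q):
    return sum(abs(a - b) for a, b in zip(p, q))

# Toy check: the combined distance is the sum of the two individual distances.
f = {"x": (0.0, 1.0), "y": (2.0, 1.0)}
g = {"x": (5.0,), "y": (4.0,)}
h = direct_sum(f, g)
assert l1_dist(h["x"], h["y"]) == l1_dist(f["x"], f["y"]) + l1_dist(g["x"], g["y"])
print(h)
```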
Upper bound: Overview X – n points
Upper bound: Overview X – n points S – subset of k points that intersects every ball of size m = n/k
Upper bound: Overview X – n points R_{x,m} – radius of the ball around x containing m points S – subset of k points that intersects every ball of size m = n/k
Upper bound: Overview X – n points S – subset of k points that intersects every ball of size m = n/k or its neighborhood
Upper bound: Overview X – n points S – k points Naïve Approach
Upper bound: Overview X – n points S – k points Naïve Approach: Fails
Upper bound: Overview X – n points S – k points Partition X into sets of “diameter” R_{x,m}
Upper bound: Overview X – n points S – k points Partition X into sets of “diameter” R_{x,m} Each cluster has a center
Upper bound: Overview X – n points S – k points Partition X into sets of “diameter” R_{x,m} Get a random mapping from X to S
Upper bound: Overview X – n points S – k points Embed S into ℓ1 Almost preserves distances if x and y are far away
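A much-simplified sketch of the “map X to S, then embed S” idea from the overview. The real construction uses a random partition into clusters of diameter about R_{x,m}; here (purely for illustration, and not the talk’s actual scheme) every point is simply sent to its nearest point of S, and a given ℓ1 embedding of S is reused.

```python
# Purely illustrative (not the talk's randomized clustering): send each point of X
# to its nearest point of the k-point net S, then reuse a given l1 embedding of S.
def extend_embedding(points, net, d, embed_net):
    """points: all n points; net: the k-point subset S; d: the metric;
    embed_net: dict mapping each s in S to its l1 coordinate vector."""
    extended = {}
    for x in points:
        center = min(net, key=lambda s: d(x, s))  # nearest net point to x
        extended[x] = embed_net[center]           # x inherits the center's image
    return extended

# Toy usage: points on a line, net = every other point, identity embedding of S.
points = list(range(8))
net = [0, 2, 4, 6]
d = lambda x, y: abs(x - y)
embed_net = {s: (float(s),) for s in net}
print(extend_embedding(points, net, d, embed_net))
```

The point of using a set S that hits every ball of m points is that each point moves by at most roughly R_{x,m}, which is negligible when d(x, y) is much larger; close pairs are handled by the small-scale embedding on the next slide.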
Bourgain’s embedding for small scales • Pick a random r from {1, …, log m} • Pick a random subset W ⊂ X: add each point from X to W w.p. 2^{−r} • Let ξ_x = d(x, W) • |ξ_x − ξ_y| = |d(x, W) − d(y, W)| ≤ d(x, y) • If d(x, y) < R_{x,m} + R_{y,m}, then E|ξ_x − ξ_y| = ‖ξ_x − ξ_y‖_{ℓ1} ≥ c · d(x, y) / log m
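A runnable sketch (my code, on a toy cycle metric) of the sampling procedure on this slide: pick a random scale r, keep each point in W with probability 2^{−r}, and use the distance-to-W coordinate ξ_x = d(x, W). Averaging |ξ_x − ξ_y| over many samples estimates E|ξ_x − ξ_y|.

```python
# My toy code: one Bourgain-style coordinate at the small scales, as on the slide.
import math
import random

def bourgain_coordinate(points, d, m, rng=random):
    """Sample one coordinate: xi_x = d(x, W) for a random witness set W."""
    log_m = max(1, int(math.log2(m)))
    r = rng.randint(1, log_m)                               # random r in {1, ..., log m}
    W = [p for p in points if rng.random() < 2.0 ** (-r)]   # keep each point w.p. 2^-r
    if not W:                                               # degenerate draw: W empty
        W = [rng.choice(points)]
    return {x: min(d(x, w) for w in W) for x in points}     # xi_x = d(x, W)

# Toy usage on the 16-cycle; averaging |xi_x - xi_y| over samples estimates
# E|xi_x - xi_y|, which never exceeds d(x, y) since each coordinate is 1-Lipschitz.
n = 16
points = list(range(n))
d = lambda x, y: min((x - y) % n, (y - x) % n)
samples = [bourgain_coordinate(points, d, m=n) for _ in range(2000)]
estimate = sum(abs(s[0] - s[5]) for s in samples) / len(samples)
print(f"E|xi_0 - xi_5| ~ {estimate:.2f}   (d(0, 5) = {d(0, 5)})")
```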
Lower bound: Roadmap • Constant degree expander • High global distortion • Subgraphs of expander are sparse • Sparse graphs embed well
New metric! • Expander with a new metric μ(u,v) = 1 − (1 − ε)^{d(u,v)} • Every embedding of (G, μ) into ℓ1 requires distortion Ω(log(n/k) / log(1/δ)) • Every subset of X of size k embeds into ℓ1 with distortion 1 + δ
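One small step the slide leaves implicit, worth recording (my remark): (G, μ) really is a metric space, because μ is obtained from the shortest-path metric d by applying a concave increasing function that vanishes at 0, and such functions are subadditive.

```latex
\[
\mu = f \circ d, \qquad f(t) = 1 - (1-\varepsilon)^{t}, \qquad f(0) = 0,\ f \text{ increasing and concave.}
\]
\[
\text{Concavity with } f(0)=0 \;\Rightarrow\; f(s+t) \le f(s) + f(t)
\;\Rightarrow\;
\mu(u,w) \le f\big(d(u,v) + d(v,w)\big) \le \mu(u,v) + \mu(v,w).
\]
```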
Global distortion • Consider a 3-regular expander G • [LLR] Min distortion for embedding it into ℓ1 is Ω(avg distance / length of edge) • Proof: in every embedding of G into ℓ1, avg ℓ1 distance ≈ avg ℓ1 length of edge
Global distortion • Proof: in every embedding of G into ℓ1, avg ℓ1 distance ≈ avg ℓ1 length of edge • For a cut (A, X∖A): # cut edges ≈ |A| ⇒ contributes |A|/n to avg length • # cut pairs = |A| · |V∖A| ≈ |A| · n ⇒ contributes |A|/n to avg distance • |E| ≈ |V| = n
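Writing out the counting on this slide (my notation: h is the edge expansion of G, a constant for a constant-degree expander, and δ_A is the cut metric of the cut A):

```latex
\[
\text{For any cut } A \ (|A| \le n/2): \quad
\sum_{\{u,v\}} \delta_A(u,v) = |A|\,|X\setminus A| \;\le\; |A|\,n
\;\le\; \frac{n}{h}\, e(A, X\setminus A)
\;=\; \frac{n}{h} \sum_{(u,v)\in E} \delta_A(u,v).
\]
\[
\ell_1 \text{ metrics are nonnegative combinations of cut metrics, so the same holds for any } \ell_1 \text{ embedding:}
\quad
\frac{\text{avg } \ell_1 \text{ distance}}{\text{avg } \ell_1 \text{ edge length}}
\;\le\; \frac{n}{h} \cdot \frac{|E|}{\binom{n}{2}} \;=\; O(1).
\]
\[
\text{In the graph metric, avg distance} = \Omega(\log n) \text{ and every edge has length } 1,
\text{ so the distortion is } \Omega(\log n).
\]
```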
Global distortion • Consider a 3-regular expander G with girth Θ(log n), equipped with μ(u,v) = 1 − (1 − ε)^{d(u,v)} • [LLR] Min distortion for embedding (G, μ) into ℓ1 is Ω(avg distance / length of edge) ≈ log(n/k) / log(1/δ)
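Filling in where this ratio comes from for the metric μ (my arithmetic, using the girth and expansion facts above): every edge has μ-length exactly ε, while a typical pair is at graph distance Θ(log n), so its μ-distance is 1 − (1 − ε)^{Θ(log n)} = Ω(1) as long as ε ≳ 1/log n.

```latex
\[
\mu(\text{edge}) = 1-(1-\varepsilon)^{1} = \varepsilon,
\qquad
\text{avg } \mu\text{-distance} = \Omega(1)
\;\Longrightarrow\;
\text{distortion} \;=\; \Omega\!\left(\frac{\text{avg distance}}{\text{edge length}}\right) \;=\; \Omega\!\left(\frac{1}{\varepsilon}\right).
\]
```

If ε is taken to be about log(1/δ) / log(n/k) (my reading of how the stated bound arises; the deck itself only shows the choice ε ≈ 1/L later on), then Ω(1/ε) matches the Ω(log(n/k)/log(1/δ)) on this slide.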
Multicuts • Construct a distribution of multicuts! Goal: Pr(u,v separated) ~ μ(u,v) = 1 − (1 − ε)^{d(u,v)} • High-level idea: remove every edge with probability ε • The shortest path between u and v survives with probability (1 − ε)^{d(u,v)} • If the shortest path were the only path between u and v, we would separate u and v w.p. μ(u,v) = 1 − (1 − ε)^{d(u,v)}
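A small simulation (my code, on a path graph, where the shortest path between two vertices really is the only path) of the “remove every edge with probability ε” idea: the empirical separation probability matches 1 − (1 − ε)^{d(u,v)}.

```python
# My simulation on the path graph 0-1-...-(n-1), where the shortest u-v path is the
# ONLY u-v path: delete each edge independently with probability eps and compare
# the empirical separation probability with 1 - (1 - eps)^d(u, v).
import random

def separated(u, v, eps, rng=random):
    """u and v end up in different components iff some edge between them is deleted."""
    return any(rng.random() < eps for _ in range(abs(u - v)))

u, v, eps, trials = 3, 11, 0.1, 100_000
hits = sum(separated(u, v, eps) for _ in range(trials))
print(f"empirical: {hits / trials:.3f}")
print(f"predicted: {1 - (1 - eps) ** abs(u - v):.3f}")
```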
ℓ-path decomposable expanders • H is ℓ-path decomposable if • every 2-connected subgraph contains a path (each vertex has degree 2) of length ℓ • [ABLT] For a 3-regular expander G with girth Θ(log n), every subgraph H of size at most k is ℓ-path decomposable with ℓ = Ω(log(n/k)) ABLT = Arora, Bollobás, Lovász, and Tourlakis
Multicuts • H is ℓ-path decomposable, L = ℓ/9, ε ≈ 1/L • Distribution on multicuts: • d(u,v) ≤ L: Pr(u,v separated) = 1 − (1 − ε)^{d(u,v)} • d(u,v) > L: Pr(u,v separated) ≥ 1 − (1 − ε)^L • Assume H contains shortest paths of length < L • Distortion: O(1)
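The arithmetic behind “Distortion: O(1)” (my derivation, directly from the two cases on the slide): pairs at distance at most L are matched exactly, and for the far pairs both μ(u, v) and the separation probability lie between 1 − (1 − ε)^L and 1.

```latex
\[
d(u,v) > L:\quad
1-(1-\varepsilon)^{L} \;\le\; \Pr[u,v \text{ separated}],\ \mu(u,v) \;\le\; 1
\;\Longrightarrow\;
\text{distortion} \;\le\; \frac{1}{1-(1-\varepsilon)^{L}}
\;\approx\; \frac{1}{1-e^{-\varepsilon L}} \;=\; O(1)
\quad\text{for } \varepsilon \approx 1/L .
\]
```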
Distribution on multicuts (figure: blocks S1, S2, S3 meeting at the cut vertex c, with u and v in different blocks) • H has a cut vertex c • Sample multicuts independently in each Si • Pr[u,v not separated] = Pr[u,c not separated] · Pr[v,c not separated] = (1 − ε)^{d(u,c)} · (1 − ε)^{d(v,c)} = (1 − ε)^{d(u,v)}
Long paths (figure: a path split into segments Q1, Q2, Q3, each of length L) • d(u,v) ≤ L: Pr(u,v separated) = 1 − (1 − ε)^{d(u,v)} • d(u,v) > L: Pr(u,v separated) ≥ 1 − (1 − ε)^L • The end points are always separated! • Can be done for a path of length 3L • Cut edges “independently” with probability ε • Decisions for Q1 and Q3 are not independent
Distribution on multicuts (figure: H with the path split into parts P1, P2, P3) • H has a path of length ℓ = 9L • Divide the path into 3 parts P1, P2, P3 • Sample multicuts independently in H, P1, P2, P3 • Computation is the same as before
Isometric local embeddings • Every subset of size k embeds isometrically into ℓ1 • Entire metric requires distortion Ω(log n / (log k + log log n)) • Main idea: • make distortion very close to 1; • add a uniform (the discrete) metric: ρ′ = ρ + e
Applications • Sherali–Adams hierarchy: integrality gap of (2 − ε) for Vertex Cover and MAX CUT after n^γ rounds. • Integrality gap of ≈ log n / log k for Sparsest Cut after k rounds.
Conclusions & Open Questions • We establish tight bounds when the local distortion is bounded away from 1. • Open Problem: What is the worst global distortion if every k points embed isometrically? O(log(n/k)) versus Ω(log n / (log k + log log n))