Metric Embedding with Relaxed Guarantees
Ofer Neiman, Ittai Abraham, Yair Bartal
Embedding metric spaces
• Representation of the metric in a simple and structured space.
• Common target spaces: l_p, trees (ultra-metrics).
• The price of simplicity: distortion, the multiplicative factor by which distances can change.
• Goal: find low-distortion embeddings.
• A tool for approximation algorithms.
• Useful for many practical applications.
Metric embedding
• Let X, Y be metric spaces with metrics d_X, d_Y respectively.
• f : X→Y is an embedding of X into Y.
• The distortion of f is the minimal α such that for some c > 0 and all u, v ∈ X:
  c·d_X(u,v) ≤ d_Y(f(u), f(v)) ≤ α·c·d_X(u,v)
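As a rough illustration of the definition, the sketch below computes the distortion of a map between two finite metric spaces. Under the usual reading, the best scale c is the smallest ratio d_Y(f(u), f(v)) / d_X(u,v), so α is the largest ratio divided by the smallest. The function name `distortion` and the 4-cycle example are assumptions made for illustration, and the code assumes distinct points and an injective f.

```python
from itertools import combinations

def distortion(points, d_x, d_y, f):
    """Smallest alpha with c*d_x <= d_y(f(u), f(v)) <= alpha*c*d_x for some c > 0."""
    # Ratio by which f stretches each pair (assumes distinct points, injective f).
    ratios = [d_y(f(u), f(v)) / d_x(u, v) for u, v in combinations(points, 2)]
    # The best scale c is the smallest ratio, so alpha = largest / smallest.
    return max(ratios) / min(ratios)

# Hypothetical example: the 4-cycle (shortest-path metric) mapped onto a line.
cycle = [0, 1, 2, 3]
d_cycle = lambda u, v: min(abs(u - v), 4 - abs(u - v))
d_line = lambda a, b: abs(a - b)
alpha = distortion(cycle, d_cycle, d_line, lambda u: float(u))
```

Here the pair (0, 3) is adjacent on the cycle but ends up 3 apart on the line, which is what drives the distortion.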
Basic results
• Every metric space on n points can be embedded into Euclidean space with distortion O(log n) and dimension O(log² n). [Bourgain/LLR]
• Every metric space on n points can be embedded into a tree metric with distortion [Bartal/BLMN/RR].
Problem
• The lower bounds on the distortion and the dimension are high, and grow with n.
• In some cases, weaker guarantees are acceptable.
Some Alternative Schemes
• Probabilistic embedding: considering the expected distortion. [Bartal, FRT]
• Ramsey theorems: embedding a large subspace of the original metric. [BFM, BLMN]
• Partial embedding: embedding all but a fraction of the distances. [KSW, ABCDGKNS]
Motivation
• Estimating latencies (round-trip times) in the Internet.
  - The distance matrix is almost a metric.
  - Embedding heuristics yield surprisingly good results. [Ng+Zhang’02, ST’03, DCKM’04]
• Practical network embedding requires:
  - A small number of dimensions.
  - No centralized coordination.
  - A linear number of distance measurements.
• Finding the nearest copy of a file, the nearest server offering a service, etc.
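(1-ε) partial embedding
• X, Y are metric spaces.
• f : X→Y has (1-ε) partial distortion at most α if there exists a set of pairs G_ε such that:
  - |G_ε| ≥ (1-ε)·n(n-1)/2, where n = |X|.
  - For some c > 0 and all pairs (u,v) ∈ G_ε: c·d_X(u,v) ≤ d_Y(f(u), f(v)) ≤ α·c·d_X(u,v).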
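A minimal sketch of how the best (1-ε) partial distortion of a given map could be computed, assuming G_ε must contain at least a (1-ε) fraction of all pairs. Since the distortion on a set of pairs is the largest stretch ratio divided by the smallest, an optimal G_ε can be taken as a contiguous window of the sorted ratios. The name `partial_distortion` and the 4-cycle example are hypothetical.

```python
import math
from itertools import combinations

def partial_distortion(points, d_x, d_y, f, eps):
    """Best distortion achievable after ignoring at most an eps fraction of pairs."""
    # Stretch ratio of every pair, sorted (assumes distinct points, injective f).
    ratios = sorted(d_y(f(u), f(v)) / d_x(u, v) for u, v in combinations(points, 2))
    # Number of pairs that must be kept in G_eps (assumed size requirement).
    keep = max(1, math.ceil((1 - eps) * len(ratios)))
    # Slide a window of `keep` consecutive ratios and take the best max/min.
    return min(ratios[i + keep - 1] / ratios[i] for i in range(len(ratios) - keep + 1))

# Hypothetical example: the 4-cycle mapped onto a line.
cycle = [0, 1, 2, 3]
d_cycle = lambda u, v: min(abs(u - v), 4 - abs(u - v))
d_line = lambda a, b: abs(a - b)
full = partial_distortion(cycle, d_cycle, d_line, float, 0.0)
relaxed = partial_distortion(cycle, d_cycle, d_line, float, 0.2)
```

In this toy case, ignoring a single pair (the one that wraps around the cycle) is enough to bring the distortion down from 3 to 1.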
Scaling Embedding
• A stronger requirement is a map that is good for all ε simultaneously.
• Definition: an embedding f has scaling distortion D(ε) if for any ε > 0, it is a (1-ε) partial embedding with distortion D(ε).
Scaling & Average Distortion
• Thm: every metric space has a scaling probabilistic embedding into trees with distortion O(log(1/ε)).
• Thm: every metric space has a scaling embedding into Euclidean space with distortion O(log(1/ε)) and dimension O(log n).
• Implies constant average distortion!
• Applications: weighted average problems such as sparsest cut, quadratic assignment, linear arrangement, etc.
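A rough numerical check of why logarithmic scaling distortion gives constant average distortion. It assumes the bound has the concrete form D(ε) = C·ln(2/ε), which is only one way to read O(log(1/ε)). Under that assumption, the k-th most distorted of N pairs has distortion at most about D(k/N), so the average is at most (1/N)·Σ C·ln(2N/k). The function name `average_bound` is hypothetical.

```python
import math

def average_bound(num_pairs, const=1.0):
    """(1/N) * sum_{k=1..N} const * ln(2N/k), using ln(N!) = lgamma(N + 1)."""
    n = num_pairs
    # sum_k ln(2N/k) = N*ln(2N) - ln(N!)
    log_sum = n * math.log(2 * n) - math.lgamma(n + 1)
    return const * log_sum / n

# The bound stays below 1 + ln 2 no matter how many pairs there are.
bounds = [average_bound(n) for n in (10, 10**3, 10**6)]
```

The limit 1 + ln 2 comes from the integral of ln(2/ε) over (0, 1], which is finite even though the integrand blows up near 0.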
Previous work
• Definition: a metric space X is called λ-doubling if for any r > 0, any ball of radius r can be covered by λ balls of radius r/2.
• Any λ-doubling metric space X can be embedded into l_2 with (1-ε) partial distortion [KSW].
Our Results
• Partial embedding into l_2 with distortion and dimension O(log(1/ε)).
• General theorem converting classical l_p embeddings into the partial model.
• Distortion and dimension don’t depend on the size of X!
• Partial embedding into trees with distortion .
• Tight lower bounds.
Appeared in FOCS05 together with CDGKS.
Embedding into l_p
• Thm: Let 𝒳 be a subset-closed family of metric spaces such that any X ∈ 𝒳 on n points has an embedding φ: X→l_p with
  - distortion α(n),
  - dimension β(n).
• Then φ can be converted into a (1-ε) partial embedding of X with
  - distortion
  - dimension
In practice..
Main Results
• (1-ε) partial embedding of any metric space into l_p with distortion and dimension [Bourgain, Matousek, Bartal]
• (1-ε) partial embedding of any negative type metric (l_1 metrics) into l_2 with distortion and dimension [ARV, ALN]
• (1-ε) partial embedding of any doubling metric into l_p with distortion and dimension [KLMN]
• (1-ε) partial embedding of any tree metric into l_2 with distortion and dimension [Matousek]
Definitions
• Let r_ε(u) be the minimal radius such that |B(u, r_ε(u))| ≥ εn.
• A pair (u,v), w.l.o.g. with r_ε(u) ≥ r_ε(v):
  - has short distance if d(u,v) < r_ε(u);
  - has medium distance if r_ε(u) ≤ d(u,v) < 4·r_ε(u);
  - has long distance if 4·r_ε(u) ≤ d(u,v).
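The three cases can be sketched in a few lines of Python. This is only an illustration: it assumes closed balls that include the center itself, and the names `r_eps` and `classify` as well as the six-point example are hypothetical.

```python
import math

def r_eps(u, points, d, eps):
    """Smallest radius r with |B(u, r)| >= eps*n, counting u itself in its ball."""
    need = math.ceil(eps * len(points))
    # The need-th smallest distance from u (the distance 0 to u itself included).
    return sorted(d(u, w) for w in points)[need - 1]

def classify(u, v, points, d, eps):
    """Label the pair (u, v) as short, medium or long."""
    # Taking the max implements the 'w.l.o.g. r_eps(u) >= r_eps(v)' convention.
    r = max(r_eps(u, points, d, eps), r_eps(v, points, d, eps))
    if d(u, v) < r:
        return "short"
    if d(u, v) < 4 * r:
        return "medium"
    return "long"

# Hypothetical example: two well-separated clusters on a line, eps = 1/2.
points = [0, 1, 2, 100, 101, 102]
d = lambda a, b: abs(a - b)
labels = {pair: classify(*pair, points, d, 0.5) for pair in [(0, 1), (0, 2), (0, 100)]}
```

With ε = 1/2 every point needs three points in its ball, so the end points of each cluster have radius 2 and only the cross-cluster pair is long.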
Close Distances
• (u,v) is a short pair.
• Short pairs are ignored: there are at most εn² of them.
[Figure: points u, v, w with balls of radii r_ε(u), r_ε(v), r_ε(w)]
Beacon Based Embedding
• Randomly choose beacons = B.
• Each point is attached to its nearest beacon.
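A minimal sketch of the beacon step. The slide does not say here how many beacons are chosen or from which distribution, so `num_beacons` is a hypothetical parameter and uniform sampling without replacement is an assumption. The function name `attach_to_beacons` is also made up for illustration.

```python
import random

def attach_to_beacons(points, d, num_beacons, seed=0):
    """Pick a random beacon set B and attach every point to its nearest beacon."""
    rng = random.Random(seed)
    # Assumption: beacons are sampled uniformly without replacement.
    beacons = rng.sample(points, num_beacons)
    # Each point is attached to a beacon at minimum distance from it.
    nearest = {u: min(beacons, key=lambda b: d(u, b)) for u in points}
    return beacons, nearest

# Hypothetical example: 20 points on a line, 4 beacons.
points = list(range(20))
d = lambda a, b: abs(a - b)
beacons, nearest = attach_to_beacons(points, d, num_beacons=4)
```

Only distances from each point to the beacons are needed, which matches the requirement of a linear number of distance measurements when the beacon set is small.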
Some More Bad Points
• If d(u,B) > r_ε(u) then u is bad.
• For each u ∈ X:
• With probability ½ there are at most 2εn² bad pairs.
[Figure: points u, v, w with balls of radii r_ε(u), r_ε(v), r_ε(w)]
Partial Embedding
• Use the embedding φ: B→l_p.
• φ has distortion guarantee of .
• The partial embedding is f(u) = φ(b) ⊕ h(u): the coordinates of φ(b) followed by the additional coordinates h(u), where u is attached to beacon b.
Upper Bound
We assume for the pair (u,v):
  - Each point has a beacon in its ball.
  - Both u, v are outside each other’s ball.
  - The mapping φ is a contraction.
[Figure: u and v with balls of radii r_ε(u), r_ε(v) and their beacons b_u, b_v]
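Lower Bound - Long Distances
• d(u,v) ≥ 4·max{r_ε(u), r_ε(v)}
• d(b_u, b_v) ≥ d(u,v)/2, since each beacon lies in its point's ball and so, by the triangle inequality, d(b_u, b_v) ≥ d(u,v) - r_ε(u) - r_ε(v).
[Figure: u and v with balls of radii r_ε(u), r_ε(v) and their beacons b_u, b_v]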
Medium Distances??
• There is a problem in this case: u and v can be attached to the same beacon!
• The additional coordinates h will guarantee enough contribution.
[Figure: u and v with overlapping balls of radii r_ε(u), r_ε(v)]
Medium Distances
• Pairs satisfying r_ε(u) ≤ d(u,v) < 4·r_ε(u) [w.l.o.g. r_ε(u) ≥ r_ε(v)].
• Each coordinate of h(u) is either r_ε(u) or 0; each coordinate of h(v) is either r_ε(v) or 0.
• In a coordinate of h(u) - h(v), with probability ¼ we get r_ε(u).
• In expectation ¼ of the coordinates will be r_ε(u).
• With probability < ε the contribution of the pair (u,v) will be smaller than half its expectation.
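The ¼ figure can be checked by enumerating one coordinate exactly. The sketch assumes each coordinate of h(u) is r_ε(u) or 0 with probability ½ each, independently of h(v); that reading of the slide is an assumption, and the name `coordinate_outcomes` and the radii 4 and 1 are hypothetical.

```python
from fractions import Fraction
from itertools import product

def coordinate_outcomes(r_u, r_v):
    """Exact distribution of |h_i(u) - h_i(v)| for a single coordinate i."""
    dist = {}
    # Assumption: h_i(u) in {r_u, 0} and h_i(v) in {r_v, 0}, independent fair coins.
    for hu, hv in product((r_u, 0), (r_v, 0)):
        gap = abs(hu - hv)
        dist[gap] = dist.get(gap, 0) + Fraction(1, 4)
    return dist

# Hypothetical radii with r_u >= r_v.
dist = coordinate_outcomes(r_u=4, r_v=1)
expected_gap = sum(gap * p for gap, p in dist.items())
```

The outcome r_u arises exactly when u's coin gives r_u and v's coin gives 0, which is one of the four equally likely cases.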
Medium Distances
• With probability ½, at most 2εn² medium pairs fail, but for the others:
• End of proof!
Coarse Partial Embedding
• Another version: ignoring only the short distances (i.e., from each point to its nearest εn neighbors).
• The dimension increases to O(log(n)·β(1/ε)).
Partial Embedding into Trees
• Thm: every metric space has a (1-ε) partial embedding with distortion into a tree (ultra-metric).
Ultra-metrics
• Metric on the leaves of a rooted labeled tree.
• 0 ≤ Δ(D) ≤ Δ(B) ≤ Δ(A).
• d(x,y) = Δ(lca(x,y)).
• Example: d(x,y) = Δ(D), d(x,w) = Δ(B), d(w,z) = Δ(A).
[Figure: rooted tree with internal nodes A, B, C, D and leaves x, y, w, z]
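A small sketch of the rule d(x,y) = Δ(lca(x,y)). The tree shape and the numeric labels below are hypothetical: they are chosen only to be consistent with the three example distances on the slide, and they leave out any node whose position cannot be read off the text.

```python
# Hypothetical tree: A is the root, B and z are its children,
# D and w are children of B, and x, y are children of D.
parent = {"x": "D", "y": "D", "D": "B", "w": "B", "B": "A", "z": "A", "A": None}
label = {"A": 8, "B": 4, "D": 2}  # assumed values with 0 <= label[D] <= label[B] <= label[A]

def ancestors(node):
    """Internal nodes on the path from `node` up to the root, nearest first."""
    path = []
    while parent[node] is not None:
        node = parent[node]
        path.append(node)
    return path

def ultra_dist(a, b):
    """Distance between two leaves: the label of their lowest common ancestor."""
    if a == b:
        return 0
    above_b = set(ancestors(b))
    lca = next(n for n in ancestors(a) if n in above_b)
    return label[lca]

leaves = ["x", "y", "w", "z"]
```

Because labels do not decrease toward the root, the resulting distances satisfy the strong triangle inequality d(a,c) ≤ max(d(a,b), d(b,c)).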
Embedding into Ultra-metric
• Partition X into 2 sets X_1, X_2.
• Create a root labeled Δ = diam(X).
• The children of the root are created recursively on X_1, X_2.
• Using induction, the number of distances we ignore is
• B: bad distances for the current level, with |B| ≤ ε|X_1||X_2|.
[Figure: X split into X_1 and X_2 under a root labeled Δ]
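The recursion can be sketched as follows. The partition rule is passed in as a function, and the `peel_first` rule used in the example is only a placeholder, not the cut the talk goes on to choose. All names here are hypothetical.

```python
from itertools import combinations

def diam(points, d):
    """Largest distance between two of the given points."""
    return max((d(a, b) for a, b in combinations(points, 2)), default=0)

def build(points, d, split):
    """Root labeled diam(points); children built recursively on the two parts."""
    if len(points) == 1:
        return {"leaf": points[0]}
    x1, x2 = split(points)
    return {"label": diam(points, d),
            "parts": [(set(x1), build(x1, d, split)), (set(x2), build(x2, d, split))]}

def tree_dist(node, a, b):
    """Label of the lowest node whose partition separates a from b."""
    if a == b:
        return 0
    for members, child in node["parts"]:
        if a in members and b in members:
            return tree_dist(child, a, b)
    return node["label"]

# Placeholder partition rule (an assumption): split off the first point.
peel_first = lambda pts: (pts[:1], pts[1:])
points = [0, 1, 3, 7]
d = lambda a, b: abs(a - b)
tree = build(points, d, peel_first)
```

Since every separated pair receives the diameter of the set it was split from, tree distances never fall below the original ones; the quality of the cut only affects how much they are stretched.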
Where to Cut?
• Take a point u such that |B(u,Δ/2)| ≤ n/2.
• Let i = 1,…,1/ε
• Let S_i = A_{i+1} \ A_i
• We need a “slim” shell… only distances inside the shell are distorted by more than
[Figure: sets A_1, A_i, A_{i+1} around u]
Where to Cut?
• Case 1: |A_1| < εn.
• X_1 = {u}, X_2 = X \ {u}
[Figure: root labeled Δ with children u and X \ {u}]
Where to Cut?
• Case 2:
• Choose an i such that:
• Let X_1 = A_{i+½}, X_2 = X \ X_1
[Figure: sets A_1, A_i, A_{i+1} around u; root labeled Δ with children X_1 and X_2]
Finding Shell S_i
• Assume, for contradiction, that |S_i|² > εn|A_i| for all i.
• Then by induction |A_i| ≥ εn·i², which implies |A_t| ≥ n.
• End of proof!
Lower Bounds
• General method to obtain partial lower bounds from known classical ones.
• Thm: suppose there is a lower bound α for embedding a family 𝒳 into a family 𝒴, i.e., for any n there is X ∈ 𝒳 on n points such that any embedding of X requires distortion at least α(n). Then there is X’ ∈ 𝒳 for which any (1-ε) partial embedding requires distortion
• The family 𝒳 must be nearly closed under composition!
Main corollaries
• distortion for partial embedding into l_p. [LLR, Mat]
• distortion for partial embedding into trees. [Bartal/BLMN/RR]
• distortion for probabilistic partial embedding into trees. [Bartal]
• distortion for partial embedding of doubling or l_1 metrics into l_2. [NR]
General idea
• Choose X ∈ 𝒳 such that
• For each x ∈ X create a metric C_x such that
  - C_x ∈ 𝒳.
  -
• X’ contains many “copies” of X.
• Let f be a (1-ε) partial embedding that ignores the set of edges I. By definition .
[Figure: X and X’, with distances labeled δ and d]
Finding a copy of X
• T: vertices intersecting fewer than edges in I.
• For each x ∈ X, choose some v_x ∈ C_x ∩ T.
• For each pair (v_x, v_y) find t ∈ C_y such that:
[Figure: copies C_x and C_y, with v_x and v_y in T and a vertex t in C_y]
Distortion of the Copy
• f has distortion guarantees for both of these distances.
• d(t, v_y) is negligible.
• Its distortion must be at least
[Figure: vertices v_x, v_y and t]