Algorithmic Aspects of Finite Metric Spaces Moses Charikar Princeton University
Metric Space • A set of points X • Distance function d(x,y), d : X × X → [0, ∞) • d(x,y) = 0 iff x = y • d(x,y) = d(y,x) (symmetry) • d(x,z) ≤ d(x,y) + d(y,z) (triangle inequality) • Metric space M(X,d)
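To make the definition concrete, here is a minimal sketch (mine, not from the talk; the function name is illustrative) that checks the three axioms for a finite metric given as a distance matrix:

```python
# Illustrative check of the metric axioms for a finite metric space,
# represented as an n x n distance matrix.
import numpy as np

def is_metric(d: np.ndarray, tol: float = 1e-9) -> bool:
    n = d.shape[0]
    if not np.allclose(np.diag(d), 0.0):             # d(x,x) = 0
        return False
    if np.any((d <= tol) & ~np.eye(n, dtype=bool)):  # d(x,y) > 0 for x != y
        return False
    if not np.allclose(d, d.T):                      # symmetry
        return False
    for y in range(n):                               # triangle inequality:
        if np.any(d > d[:, [y]] + d[[y], :] + tol):  # d(x,z) <= d(x,y)+d(y,z)
            return False
    return True

# Shortest-path distances on the path 0-1-2 form a metric.
print(is_metric(np.array([[0, 1, 2], [1, 0, 1], [2, 1, 0]], dtype=float)))
```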
Example Metrics: Normed spaces • x = (x1, x2, …, xd), y = (y1, y2, …, yd) • ℓp norm: ‖x − y‖p = (Σi |xi − yi|^p)^(1/p) • Special cases: ℓ1, ℓ2 (Euclidean), ℓ∞ • ℓp^d : ℓp norm in R^d • Hamming cube {0,1}^d
Example Metrics: domain specific • Shortest path distances on graph • Symmetric difference on sets • Edit distance on strings • Hausdorff distance, Earth Mover Distance on sets of n points
Metric Embeddings • General idea: map complex metrics to simple metrics • Why? Richer algorithmic toolkit for simple metrics • Simple metrics: • normed spaces ℓp • low dimensional normed spaces ℓp^d • tree metrics • Mapping should not change distances much (low distortion)
Low Distortion Embeddings • Metric spaces (X1,d1) & (X2,d2): an embedding f : X1 → X2 has distortion D if the ratio of distances changes by at most D, i.e. (after uniform rescaling) for all x,y ∈ X1: d1(x,y) ≤ d2(f(x),f(y)) ≤ D·d1(x,y)
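As an illustration (mine, not the speaker's), distortion can be computed directly from this definition for two finite metrics given as distance matrices:

```python
# Illustrative helper: f maps point i of (X1,d1) to point f[i] of (X2,d2).
import itertools
import numpy as np

def distortion(d1: np.ndarray, d2: np.ndarray, f: list) -> float:
    # Ratio d2(f(x),f(y)) / d1(x,y) over all pairs; after the best uniform
    # rescaling, the distortion is (max ratio) / (min ratio).
    ratios = [d2[f[x], f[y]] / d1[x, y]
              for x, y in itertools.combinations(range(len(f)), 2)]
    return max(ratios) / min(ratios)

d = np.array([[0, 1, 2], [1, 0, 1], [2, 1, 0]], dtype=float)
print(distortion(d, d, [0, 1, 2]))   # identity map: distortion 1.0
```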
Applications • High dimensional → Low dimensional (Dimension reduction) • Algorithmic efficiency (running time) • Compact representation (storage space) • Streaming algorithms: solve problems on very large data sets in one pass using a very small amount of storage • Specific metrics → normed spaces • Nearest neighbor search • Optimization problems • General metrics → tree metrics • Optimization problems, online algorithms
A (very) Brief History: fundamental results • Metric spaces studied in functional analysis • n point metric embeds into ℓ∞^n with no distortion [Fréchet] • n point metric embeds into ℓp with distortion O(log n) [Bourgain ’85] • Dimension reduction to O((log n)/ε²) dimensions for n point Euclidean metrics with distortion 1+ε [Johnson, Lindenstrauss ’84]
A (very) Brief History: applications in Computer Science • Optimization problems • Application to graph partitioning [Linial, London, Rabinovich ’95] [Arora, Rao, Vazirani ’04] • n point metrics into tree metrics [Bartal ’96, ’98] [FRT ’03] • Efficient algorithms • Dimension reduction • Nearest neighbor search, streaming algorithms
Outline • Metric as data: dimension reduction, streaming data model, compact representation • Metric as model: finite metrics in optimization (graph partitioning and clustering) • Embedding theorems for finite metrics
Disclaimer • This is not an attempt at a survey • Biased by my own interests • Much more relevant and related work than I can do justice to in limited time • Goal: give a glimpse of different applications of finite metric spaces • Core ideas, no messy details
Disclaimer: Community Bias • Theoretical viewpoint • Focus on algorithmic techniques with performance guarantees • Worst case guarantees
Outline • Metric as data: dimension reduction, streaming data model, compact representation • Metric as model: finite metrics in optimization (graph partitioning and clustering) • Embedding theorems for finite metrics
Metric as data • What is the data? • Mathematical representation of objects (e.g. documents, images, customer profiles, queries) • Sets, vectors, points in Euclidean space, points in a metric space, vertices of a graph • Metric is part of the data
Johnson Lindenstrauss [JL84] • n points in Euclidean space (ℓ2 norm) can be mapped down to O((log n)/ε²) dimensions with distortion at most 1+ε • Quite simple proofs [JL84, FM88, IM98, AV99, DG99, Ach01] • Project onto random unit vectors • projection of (u−v) onto one random vector behaves like a Gaussian scaled by ‖u−v‖2 • Need O(log n) dimensions for tight concentration bounds • Even a random {−1,+1} vector works…
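A minimal sketch of such a projection (the constant 4 in the target dimension is illustrative; the paper's constants differ):

```python
# JL-style random projection: linear and oblivious.
import numpy as np

def jl_project(points: np.ndarray, eps: float, seed: int = 0) -> np.ndarray:
    n, d = points.shape
    k = int(np.ceil(4 * np.log(n) / eps ** 2))    # k = O(log(n) / eps^2)
    rng = np.random.default_rng(seed)
    R = rng.standard_normal((d, k)) / np.sqrt(k)  # random Gaussian matrix
    return points @ R   # w.h.p. all pairwise l2 distances preserved to 1+eps
```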
Dimension reduction for ℓ2 • Two interesting properties: • Linear mapping • Oblivious: choice of linear mapping does not depend on the point set • Many applications… • Making high dimensional problems tractable • Streaming algorithms • Learning mixtures of Gaussians [Dasgupta ’99] • Learning robust concepts [Arriaga,Vempala ’99] [Klivans,Servedio ’04]
Dimension reduction for ℓ1 • [C,Sahai ‘02] Linear embeddings are not good for dimension reduction in ℓ1 • There exist n points in ℓ1 in n dimensions such that any linear mapping with distortion δ needs n/δ² dimensions
Dimension reduction for ℓ1 • [C, Brinkman ‘03] Strong lower bounds for dimension reduction in ℓ1 • There exist n points in ℓ1 such that any embedding with constant distortion needs n^(1/2) dimensions • An alternative, simpler proof: [Lee, Naor ’03]
Outline • Metric as data: dimension reduction, streaming data model, compact representation • Metric as model: finite metrics in optimization (graph partitioning and clustering) • Embedding theorems for finite metrics • (Streaming: solve problems on very large data sets in one pass using a very small amount of storage)
Frequency Moments [Alon,Matias,Szegedy ‘99] • Data stream is a sequence of elements in [n] • ni : frequency of element i • Fk = Σi ni^k : kth frequency moment • F0 = number of distinct elements • F2 = skewness measure of the data stream • Goal: given a data stream, estimate Fk in one pass and sub-linear space
Estimating F2 • Consider a single counter c and a randomly chosen xi ∈ {+1, −1} for each i in [n] • On seeing element i, update c += xi • c = Σi ni·xi • Claim: E[c²] = Σi ni² = F2, Var[c²] ≤ 2(F2)² (4-wise independence suffices) • Average O(1/ε²) copies of this estimator to get a (1+ε) approximation
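A toy version of this estimator (a sketch under assumptions: the paper uses 4-wise independent signs, while this demo derives each sign from a hash of the element, which also keeps signs consistent across streams):

```python
# Toy AMS sketch for F2. All names are illustrative.
import hashlib

def _sign(seed: int, copy: int, item: str) -> int:
    digest = hashlib.blake2b(f"{seed}|{copy}|{item}".encode(),
                             digest_size=1).digest()[0]
    return 1 if digest & 1 else -1

class F2Sketch:
    def __init__(self, copies: int = 400, seed: int = 0):
        self.copies, self.seed = copies, seed
        self.counters = [0] * copies

    def update(self, item: str) -> None:         # on element i: c += x_i
        for j in range(self.copies):
            self.counters[j] += _sign(self.seed, j, item)

    def estimate(self) -> float:                 # average of c^2 estimates F2
        return sum(c * c for c in self.counters) / self.copies

s = F2Sketch()
for item in ["a", "a", "b"]:                     # n_a = 2, n_b = 1
    s.update(item)
print(s.estimate())                              # concentrates around F2 = 5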
Differences between data streams • ni : frequency of element i in stream 1 • mi : frequency of element i in stream 2 • Goal: measure Σi (ni − mi)² • F2 sketches are additive: Σi ni·xi − Σi mi·xi = Σi (ni − mi)·xi • Basically, dimension reduction in the ℓ2 norm • Very useful primitive, e.g. frequent items [C, Chen, Farach-Colton ’02]
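Reusing the toy F2Sketch above (same hypothetical names), additivity is immediate: with the same seed both sketches assign each element the same sign, so the counter difference is a sketch of the difference stream.

```python
# Requires the F2Sketch class from the previous snippet.
s1, s2 = F2Sketch(seed=7), F2Sketch(seed=7)
for item in ["a", "a", "b"]:                 # stream 1: n_a = 2, n_b = 1
    s1.update(item)
for item in ["a", "c"]:                      # stream 2: m_a = 1, m_c = 1
    s2.update(item)
diff = [x - y for x, y in zip(s1.counters, s2.counters)]
print(sum(c * c for c in diff) / len(diff))  # estimates 1 + 1 + 1 = 3
```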
Estimate ℓ1 norms? [Indyk ’00] • p-stable distribution: distribution of X over R such that Σi ni·xi is distributed as (Σi |ni|^p)^(1/p)·X when the xi are i.i.d. copies of X • Cauchy distribution, density c(x) ∝ 1/(1+x²), is 1-stable • Gaussian distribution is 2-stable • As before, c = Σi ni·xi • Cauchy does not have finite expectation! • Estimate the scale factor by taking the median
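A minimal sketch of the 1-stable estimator (for clarity the frequency vector is materialized; in a real stream each counter c_j is updated incrementally as elements arrive):

```python
# Cauchy (1-stable) sketch for the l1 norm. Names are illustrative.
import numpy as np

def l1_sketch(freq: np.ndarray, copies: int = 400, seed: int = 0) -> np.ndarray:
    rng = np.random.default_rng(seed)
    X = rng.standard_cauchy((copies, len(freq)))  # 1-stable projections
    return X @ freq                               # c_j = sum_i n_i * x_ij

def l1_estimate(sketch: np.ndarray) -> float:
    # The Cauchy has no finite mean, so averaging fails; instead take the
    # median of |c_j| (the median of |Cauchy| is exactly 1).
    return float(np.median(np.abs(sketch)))

n = np.array([3.0, -1.0, 2.0])
print(l1_estimate(l1_sketch(n)))                  # close to ||n||_1 = 6
```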
Outline • Metric as data: dimension reduction, streaming data model, compact representation • Metric as model: finite metrics in optimization (graph partitioning and clustering) • Embedding theorems for finite metrics
Similarity Preserving Hash Functions • Similarity function sim(x,y) • Family of hash functions F with a probability distribution such that Pr_{h∈F}[h(x) = h(y)] = sim(x,y)
Applications • Compact representation scheme for estimating similarity • Approximate nearest neighbor search [Indyk,Motwani ’98] [Kushilevitz,Ostrovsky,Rabani ‘98]
Estimating Set Similarity [Broder,Manasse,Glassman,Zweig,’97] [Broder,C,Frieze,Mitzenmacher,’98] • Collection of subsets of a universe U; sim(A,B) = |A∩B| / |A∪B| (Jaccard coefficient) • Min-wise hashing: pick a random permutation π of U and let h(A) = min π(A); then Pr[h(A) = h(B)] = sim(A,B)
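A minimal min-hashing sketch of this scheme (illustrative names; explicit permutations, so only practical for a small universe):

```python
# Min-hash for Jaccard: Pr[min pi(A) = min pi(B)] = |A & B| / |A | B|.
import random

def minhash_signature(s: set, perms: list) -> list:
    return [min(p[x] for x in s) for p in perms]  # one min-hash per permutation

def estimate_jaccard(sig_a: list, sig_b: list) -> float:
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)

universe, rng, perms = list(range(100)), random.Random(0), []
for _ in range(200):                              # 200 random permutations
    order = universe[:]
    rng.shuffle(order)
    perms.append({x: r for r, x in enumerate(order)})

A, B = set(range(0, 60)), set(range(30, 90))      # Jaccard = 30/90 = 1/3
print(estimate_jaccard(minhash_signature(A, perms),
                       minhash_signature(B, perms)))
```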
Existence of SPH schemes [C ’02] • sim(x,y) admits an SPH scheme if there exists a family of hash functions F such that Pr_{h∈F}[h(x) = h(y)] = sim(x,y) • Theorem: if sim(x,y) admits an SPH scheme, then 1−sim(x,y) satisfies the triangle inequality and embeds into ℓ1 • Rounding procedures for LPs and SDPs yield similarity and distance preserving hashing schemes.
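One concrete scheme from [C ’02] is random-hyperplane hashing (the Goemans-Williamson rounding turned into a hash): h_r(x) = sign(⟨r, x⟩) with Gaussian r satisfies Pr[h_r(x) = h_r(y)] = 1 − θ(x,y)/π. A minimal sketch:

```python
# Random hyperplane hashing for the angular similarity 1 - theta/pi.
import numpy as np

def simhash(x: np.ndarray, R: np.ndarray) -> np.ndarray:
    return R @ x >= 0                  # one bit per random hyperplane

rng = np.random.default_rng(0)
R = rng.standard_normal((512, 3))      # 512 random hyperplanes in R^3
x, y = np.array([1.0, 0.0, 0.0]), np.array([1.0, 1.0, 0.0])
agreement = np.mean(simhash(x, R) == simhash(y, R))
print(agreement)                       # theta = pi/4, so approx 0.75
```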
Earth Mover Distance (EMD) • LP rounding algorithms for an optimization problem (metric labelling) yield an O(log n) approximate estimator for EMD on n points • Implies that EMD embeds into ℓ1 with distortion O(log n) • [figure: optimal flow between point sets P and Q realizing EMD(P,Q)]
Outline • Metric as data: dimension reduction, streaming data model, compact representation • Metric as model: finite metrics in optimization (graph partitioning and clustering) • Embedding theorems for finite metrics
Graph partitioning problems • Given a graph, partition the vertices into U, V • Maximum cut: maximize |E(U,V)| • Sparsest cut: minimize |E(U,V)| / (|U|·|V|)
Correlation clustering [Cohen,Richman,’02] [Bansal,Blum,Chawla,’02] • Pairs labelled Similar (+) or Dissimilar (−) • [figure: coreference example linking the mentions “Mr. Rumsfeld”, “his”, “The secretary”, “he”, “Saddam Hussein”; example courtesy Shuchi Chawla]
Graph partitioning as metric problem • Partitioning is equivalent to finding an appropriate {0,1} metric (the cut metric: d(u,v) = 1 if u and v are on different sides, 0 otherwise) • possibly with additional constraints • Objective function is linear in the metric • Find the best {0,1} (cut) metric → relaxation
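A tiny illustration (mine) of "objective linear in the metric": for the cut metric d(u,v) = |side(u) − side(v)|, the cut size is just a sum of metric values over the edges.

```python
# Cut size as a linear function of the cut metric.
def cut_size(edges, side):                          # side[v] in {0, 1}
    return sum(abs(side[u] - side[v]) for u, v in edges)

edges = [(0, 1), (1, 2), (2, 0), (2, 3)]
print(cut_size(edges, {0: 0, 1: 0, 2: 1, 3: 1}))    # 2 edges cross the cut
```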
Metric relaxation approaches • Max Cut [Goemans,Williamson ’94] • map vertices to points on the unit sphere (SDP) • exploit geometry to get a good solution (random hyperplane cut) • Sparsest Cut [Linial,London,Rabinovich ’95] • LP gives the best metric; need an ℓ1 metric • [Bourgain ’85] embeds any metric into ℓ1 with distortion O(log n) • the existential theorem can be made algorithmic • O(log n) approximation • recent SDP based O(√log n) approximation [Arora,Rao,Vazirani ’04]
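A minimal sketch of the random-hyperplane rounding step of [Goemans,Williamson ’94] (the SDP solve is omitted; the unit vectors v_i are assumed precomputed):

```python
# Rounding the Max Cut SDP: cut with a random hyperplane through the origin.
import numpy as np

def random_hyperplane_cut(vectors: np.ndarray, seed: int = 0) -> np.ndarray:
    rng = np.random.default_rng(seed)
    r = rng.standard_normal(vectors.shape[1])   # random normal direction
    return vectors @ r >= 0                     # side of hyperplane = side of cut

# Each edge (i,j) is cut with probability theta(v_i, v_j) / pi, which is at
# least 0.878 times its SDP contribution (1 - <v_i, v_j>) / 2.
```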
Metric relaxation approaches • Correlation clustering [C,Guruswami,Wirth,’03] [Emanuel,Fiat,’03] [Immorlica,Karger,’03] • Find the best [0,1] metric from similarity/dissimilarity data via LP • Use the metric to guide clustering • close points in the same cluster • distant points in different clusters • “Learning” the best metric? • Note: in many cases, the LP/SDP can be eliminated to yield efficient algorithms
Outline • Metric as data: dimension reduction, streaming data model, compact representation • Metric as model: finite metrics in optimization (graph partitioning and clustering) • Embedding theorems for finite metrics
Some connections to learning • Dimension reduction in ℓ2: • Learning mixtures of Gaussians [Dasgupta ’99]: random projections make skewed Gaussians more spherical, making learning easier • Learning with large margin [Arriaga,Vempala ’99] [Klivans,Servedio ’04]: random projections preserve the margin; large margin → few dimensions • Kernel methods for SVMs: mappings to ℓ2
Ongoing developments • Notion of intrinsic dimensionality of a metric space [Gupta,Krauthgamer,Lee,’03] [Krauthgamer,Lee,Mendel,Naor,’04] • Doubling dimension: how many balls of radius R are needed to cover a ball of radius 2R? • Complexity measure of a metric space • natural parameter for embeddings • Open: can every metric of constant doubling dimension in ℓ2 be embedded into ℓ2 with O(1) dimensions and O(1) distortion? • Not true for ℓ1 • related to learning low dimensional manifolds, PCA, MDS, LLE, Isomap
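For intuition, the doubling constant of a small finite metric can be bounded with a greedy cover (illustrative code, not from the talk; greedy gives an upper bound, not the exact minimum cover):

```python
# Rough upper bound on the doubling constant of a finite metric:
# greedily cover each ball of radius 2R by balls of radius R.
import numpy as np

def doubling_constant_upper_bound(d: np.ndarray) -> int:
    n, worst = d.shape[0], 1
    for x in range(n):
        for R in np.unique(d[d > 0]):
            uncovered = set(np.flatnonzero(d[x] <= 2 * R).tolist())  # B(x, 2R)
            used = 0
            while uncovered:                     # cover greedily by R-balls
                c = uncovered.pop()
                uncovered -= set(np.flatnonzero(d[c] <= R).tolist())
                used += 1
            worst = max(worst, used)
    return worst

d = np.array([[0, 1, 2], [1, 0, 1], [2, 1, 0]], dtype=float)
print(doubling_constant_upper_bound(d))          # small constant for a path
```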
Some things I didn’t mention • Approximating general metrics via tree metrics • modified notion of distortion • useful for approximation, online algorithms • Many mathematically appealing questions • Embeddings between normed spaces • Spectral methods for approximating matrices (SVD, LSI) • PCA, MDS, LLE, Isomap
Conclusions • Whirlwind tour of finite metrics • Rich algorithmic toolkit for finite metric spaces • Synergy between Computer Science and Mathematics • Exciting area of active research, ranging from practical applications to deep theoretical questions • Many more applications to be discovered