Analyzing Kleinberg’s Small-world Model. Chip Martel and Van Nguyen Computer Science Department; University of California at Davis. Contents. Small-world phenomenon & Models Kleinberg’s Model Greedy routing The diameter of Kleinberg’s grid. Small-world phenomenon.
Analyzing Kleinberg’s Small-world Model Chip Martel and Van Nguyen Computer Science Department; University of California at Davis
Contents • Small-world phenomenon & Models • Kleinberg’s Model • Greedy routing • The diameter of Kleinberg’s grid
Small-world phenomenon • Two strangers meet and discover they are connected by a short chain of acquaintances • Milgram’s pioneering work (1967): “six degrees of separation between any two Americans” • Source person in Nebraska, target person in Boston • People forward to someone they know on a first-name basis • Paths were typically quite short
Milgram’s result shows that not only do short chains exist, but they can be found using only local knowledge of links (people sent letters knowing only their own friends). People also knew the general geography of the “network”, e.g. New York is close to Boston. How can we model such networks?
Small World Properties • Small diameter: a short path between all (or almost all) pairs • Efficient greedy routing: short paths can be found with local knowledge • Clustering: if there are links (u,v) and (u,w), then a link (v,w) is more likely
Modeling Small-Worlds • Many networks are Small-Worlds (e.g. WWW, Social Networks, Physical systems) • Motivated models of small-worlds: (Watts-Strogatz, Kleinberg) • New Analysis and Algorithms • Applications • peer-to-peer systems • gossip protocols • secure distributed protocols
Kleinberg’s Model • Based on an n by n 2-D grid, where each node has 4 local undirected links • Add q directed random links per node (the figure shows q=2) • Define d(u,v): the lattice distance between u and v (in the figure, d(u,v)=2+5=7) • Now, u has a random link to v with probability proportional to $d^{-r}(u,v)$. The parameter r determines the crucial behavior of the model.
Increasing r favors near nodes • r=0: $d^{-r}(u,v) = 1$, uniform distribution • A link to each other node is equally likely • r=1: inverse of distance • If a node is twice as far away, it is 1/2 as likely • r=2: inverse squared • If a node is twice as far away, it is 1/4 as likely
Normalization Constant • For a fixed r and u, sum $d^{-r}(u,v)$ over every other node v to get the normalization constant C. Thus $\Pr[u \to v] = \frac{1}{C} \, d^{-r}(u,v)$ • r = 0: $C = n^2 - 1$, so $\Pr[u \to v] = 1/(n^2 - 1)$ for all v • r = 2: $C = \Theta(\log n)$, so $\Pr[u \to v] = \Theta(1/\log n) \cdot d^{-2}(u,v)$
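The normalization constant can be checked numerically on a small grid. The following is a minimal sketch, assuming a grid without wraparound and nodes addressed as (x, y) pairs; the function names are illustrative and do not come from the slides.

```python
def lattice_distance(u, v):
    """Lattice (Manhattan) distance between grid nodes u=(x1, y1) and v=(x2, y2)."""
    return abs(u[0] - v[0]) + abs(u[1] - v[1])


def normalization_constant(n, u, r):
    """Sum of d(u,v)^(-r) over every node v != u of an n-by-n grid (no wraparound)."""
    total = 0.0
    for x in range(n):
        for y in range(n):
            v = (x, y)
            if v != u:
                total += lattice_distance(u, v) ** (-r)
    return total


def link_probability(n, u, v, r):
    """Pr[u -> v] = d(u,v)^(-r) / C for a single random link leaving u."""
    return lattice_distance(u, v) ** (-r) / normalization_constant(n, u, r)
```

For r=0 the constant equals the number of other nodes, $n^2 - 1$. For r=2 it grows only slowly with n, which is what the logarithmic bound on the slide predicts.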
Kleinberg’s SW network is Greedy Routable iff r=2 • A greedy routing algorithm uses local information only to find a short path from s to t: when u is the current node, choose as the next node v the one closest to t (in lattice distance) such that (u,v) is a local or random edge • This greedy routing achieves an expected “delivery time” of $O(\log^2 n)$, i.e. the s-t paths have expected length $O(\log^2 n)$ • This does not work unless r=2: for $r \ne 2$ there is an $\alpha > 0$ such that the expected delivery time of any decentralized algorithm is $\Omega(n^{\alpha})$
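The greedy rule can be simulated directly on a small grid. The sketch below assumes one random link per node (q=1), a grid without wraparound, and brute-force sampling of each random link; all function names are hypothetical and the code is meant for small n only.

```python
import random


def lattice_distance(u, v):
    """Lattice (Manhattan) distance between two grid nodes."""
    return abs(u[0] - v[0]) + abs(u[1] - v[1])


def build_long_range_links(n, r, rng):
    """Give every node one directed random link with Pr[u -> v] proportional to d(u,v)^(-r)."""
    nodes = [(x, y) for x in range(n) for y in range(n)]
    links = {}
    for u in nodes:
        others = [v for v in nodes if v != u]
        weights = [lattice_distance(u, v) ** (-r) for v in others]
        links[u] = rng.choices(others, weights=weights, k=1)[0]
    return links


def local_neighbors(n, u):
    """Up to four grid neighbors of u (fewer on the boundary)."""
    x, y = u
    candidates = [(x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)]
    return [(a, b) for a, b in candidates if 0 <= a < n and 0 <= b < n]


def greedy_route(n, links, s, t):
    """Repeatedly move to the known neighbor (local or random link) closest to t."""
    path = [s]
    u = s
    while u != t:
        options = local_neighbors(n, u) + [links[u]]
        u = min(options, key=lambda v: lattice_distance(v, t))
        path.append(u)
    return path
```

Because some local neighbor always lies one step closer to t, every hop strictly reduces the distance, so the loop terminates after at most d(s,t) hops. The random links are what can shorten the path below that trivial bound.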
Greedy Routing Analysis • We say that the algorithm is in phase i if the current node u has $2^i \le d(u,t) < 2^{i+1}$ • The initial phase has $i \le \log n$, since $d(s,t) < 2n$ for any pair • If we are in phase i, how likely is a jump to phase i-1 or lower? • It is fairly easy to show that $\Pr[u \to v \text{ with } d(v,t) < 2^i] = \Omega(1/\log n)$
Greedy Routing • The initial phase has $i \le \log n$ • We jump to a lower phase with probability about $1/\log n$ • So, expected $O(\log n)$ hops per phase • After at most $\log n$ phases we are done • Total expected hops = $\log n \cdot O(\log n) = O(\log^2 n)$
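The counting argument above can be written out as a short derivation. This is a sketch, with p standing for the per-hop probability of leaving the current phase (a symbol introduced here for illustration). It relies on the fact that greedy routing never revisits a node, so each hop examines a fresh random link.

```latex
\begin{align*}
\text{phase } i &: \quad 2^i \le d(u,t) < 2^{i+1}, \qquad 0 \le i \le \log n, \\
\mathbb{E}[\text{hops spent in phase } i] &\le \frac{1}{p} = O(\log n),
  \qquad \text{where } p = \Omega(1/\log n), \\
\mathbb{E}[\text{total hops}] &\le \sum_{i=0}^{\log n} O(\log n) = O(\log^2 n).
\end{align*}
```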
Our Results • An analysis of the expected diameter of Kleinberg's setting, for a 2-D grid with one random link per node (q=1) • If $0 \le r \le 2$: diameter = $\Theta(\log n)$ (PODC’04) • If $2 < r < 4$: diameter < $\log^c n$ for some c>1 (SODA’05) • If $4 < r$: diameter > $n^c$ for some 0<c<1 • Can be generalized to a k-D grid: e.g. if $k < r < 2k$, diameter < $\log^c n$ for some c>1
Our new results: Routing • For a k-dimensional lattice model • The expected length of Kleinberg’s greedy paths is $\Theta(\log^2 n)$. Also, they are this long with constant probability. • With more local knowledge we can improve the path length to $O(\log^{1+1/k} n)$
Prior work on similar (diameter) problems • Diameter of a cycle plus a random matching: Bollobas & Chung, 88 • Can be seen as a special case of Kleinberg’s grid setting where: 1-D lattice, undirected graph, r=0 (random links are uniform) • Diameter of long-range percolation graphs • Benjamini & Berger, 2001 • Coppersmith et al., 2002 • Biskup, 2004: similar to our approach
O(log n) Expected Diameter Proof for simple setting: • 2D grid with wraparound • 4 random links per node, with r=2 • Extend to: • K-D grids, 1 random link, • No wraparound
The diameter bound: Intuition • We construct neighbor trees from s and to t • S-Tree: $S_0$ is the nodes within $\log n$ of s in the grid; $S_i$ is the nodes at distance i (random links) from $S_0$ • T-Tree: $T_0$ is the nodes within $\log n$ of t in the grid; $T_i$ is the nodes at distance i (random links) to $T_0$
Small-worlds: Finding a short path from s to t • $S_i$ = nodes at distance i from $S_0$, an initial neighbor set of sufficient size • $T_i$ = nodes at distance i (random links) to the initial set $T_0$ • We want the $\{S_i\}$ and $\{T_i\}$ to grow exponentially until they are big enough, so that the two subset chains intersect with high probability.
Subset chains • After $O(\log n)$ growth steps, $S_i$ and $T_i$ are almost surely of size $n \log n$ • Thus the trees almost surely connect • Similar to the Bollobas-Chung approach for a ring + random matching • But there are new complications because of the non-uniform distribution and directed edges
Proving Exponential Growth • Growth rate depends on set size and shape • We analyze using an artificial experiment
Links into or out of a ball • Motivation • Links to outside: for a set C and a node $u \in C$, how likely is a random link from u to leave C? • Links into: given a subset C and a node $u \in C$, how likely is a link to u from outside C? • Worst shape for C: a ball (of the same size)
Exponential Growth • Neighbor sets should have exponential growth • If a node u is surrounded by a moderate-size set of vertices C, a random link from u is likely to “escape” from C.
Links into or out of a ball: the facts • $B_L(u)$ = {nodes within distance L from u} • For any $0 < \delta < 1$ and any integer $1 \le L \le n^{\delta}$, for n large, $\Pr[u \to v \text{ with } v \text{ outside } B_L(u)] > 1 - \delta - o(1)$ • Similar for a random link to u from outside of $B_L(u)$ • Note that $B_L(u)$ has about $L^2$ nodes • For a ball with radius $n^{0.51}$, a random link from the center leaves the ball with probability > 0.48 • With 4 links, expect 4 × 0.48 > 1.9 new nodes
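The escape probability can be computed exactly for a concrete grid. The sketch below assumes r=2 by default, a grid without wraparound, and a ball centered on the middle node; the function name is hypothetical. Writing $L = n^{\delta}$, the value can be compared against $1 - \delta$, bearing in mind that the lower-order term is still noticeable at small n.

```python
def escape_probability(n, L, r=2):
    """Pr[a random link from the center node lands outside the ball B_L(center)]
    on an n-by-n grid without wraparound, with Pr[u -> v] proportional to d(u,v)^(-r)."""
    c = n // 2
    inside = 0.0
    total = 0.0
    for x in range(n):
        for y in range(n):
            d = abs(x - c) + abs(y - c)
            if d == 0:
                continue  # the center itself is not a link target
            w = d ** (-r)
            total += w
            if d <= L:
                inside += w
    return 1.0 - inside / total
```

Calling `escape_probability(n, L)` for growing n with L fixed at roughly $n^{\delta}$ gives a feel for how quickly the probability approaches the asymptotic bound.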
S-Tree growth • By making the initial set larger than $c \log n$, a growth step is exponential with probability: • By choosing c large enough, we can make m large enough so our sets almost surely grow exponentially to size $n \log n$
The t-Tree • The ball experiment for the t-tree needs some extra care (links are conditioned) • We can still show exponential growth • It is easy to show that two $\Theta(n \log n)$-size sets of “fresh” nodes intersect, or that a link from the s-set hits the t-set • More care with the constants leads to a diameter bound of $3 \log n + o(\log n)$
Reducing the Random Links • To change from 1/node to 4/node: • Collect nodes into super-nodes • Each 2x2 square contracted to a super-node • New graph has 4 random links /node and diameter differs from old by a constant factor
Diameter Results • Thus, for a k-D grid with added link(s) from u to v with probability proportional to $d^{-r}(u,v)$ • The expected diameter is $\Theta(\log n)$ for $0 \le r \le k$ • Now look at r > k.
The diameter of Kleinberg’s SW setting • For simplicity, use a 1-D setting • Define C(r,n) as an n-node cycle (nodes 0, 1, ..., n-1) • Each node has 2 local links and one directed random link: i is connected to $j \ne i$ with $\Pr[i \to j] \propto |i-j|^{-r}$ • For $0 \le r \le 1$, we showed the diameter is $\Theta(\log n)$ • Now consider r>1.
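For small n, the diameter of C(r,n) can be measured empirically. The sketch below makes two assumptions that the slide does not spell out: distance between i and j is measured around the ring, and the diameter is taken over ordered pairs, since the random links are directed. Function names are illustrative.

```python
import random
from collections import deque


def ring_distance(i, j, n):
    """Distance between i and j around an n-node cycle (an assumption of this sketch)."""
    d = abs(i - j)
    return min(d, n - d)


def build_cycle(n, r, rng):
    """Adjacency lists of C(r,n): two local links per node plus one directed random link."""
    adj = {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}
    for i in range(n):
        others = [j for j in range(n) if j != i]
        weights = [ring_distance(i, j, n) ** (-r) for j in others]
        adj[i].append(rng.choices(others, weights=weights, k=1)[0])
    return adj


def diameter(adj):
    """Largest BFS distance over all ordered pairs of nodes."""
    worst = 0
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        worst = max(worst, max(dist.values()))
    return worst
```

Averaging `diameter(build_cycle(n, r, rng))` over several seeds for different r gives a rough picture of how the diameter changes with r, although small n cannot confirm asymptotic rates.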
Upper bound for the diameter of C(r,n) when 1<r<2 • We use a probabilistic recurrence approach • Our approach is similar to Karp's (STOC’91) • We establish a (probabilistic) relation between the diameter of a segment and that of a smaller one.
Upper bound for the diameter of C(r,n) when 1<r<2 • We use a (probabilistic) relation between the diameter of a segment and that of a sub-segment. • We relate D(x), the diameter of a segment of length x, to D(y), where $y = x^a$ for some $a \in (0,1)$. • Intuitively, w.h.p., D(x) is bounded by a constant multiple of D(y).
Upper bound for the diameter of C(r,n) when 1<r<2 • Iterating the relation along the chain $D(n) \to D(n^a) \to D(n^{a^2}) \to \dots \to D(x_0)$, starting with x=n, standard recurrence techniques bound D(n), the graph's expected diameter, in terms of $D(x_0)$ for some small enough $x_0$ (a poly-log function of n).
Partitioning Hierarchy • Partitioning: a segment of length x is divided into multiple sub-segments of length $y = x^a$ for some $a \in (0,1)$. • A partition is complete when every pair of sub-segments A, B has two random directed edges connecting one to the other (one in each direction).
Partitioning Hierarchy • We iterate this partitioning from x=n down to some small $x_0$ (for fixed a): $D(n) \to D(n^a) \to D(n^{a^2}) \to \dots \to D(x_0)$. We need to choose y (or a) so that it is • Small enough, so that the number of iterations is of order $\log\log n$ • Not too small, so that almost surely each phase’s partition is complete
Supporting Facts • Fact 1: For a fixed a such that $r/2 < a < 1$ and for x large enough, almost surely all partitions of length-x segments are complete • Note: $1 < r < 2$ and $y = x^a$ • This implies that all sub-segments are large enough that one can get from any of them to another by one link.
Supporting Facts • Fact 2: If a partition of a segment of length x is complete, then almost surely D(x) is at most twice the maximum diameter of a sub-segment, plus 1. • Basically, any shortest path from s to t can be upper bounded by two shortest paths within sub-segments plus 1: $\mathrm{length}(s \to t) \le \mathrm{length}(s \to v) + \mathrm{length}(w \to t) + 1 \le 2 \max D(y) + 1$, where (v,w) is a random link from the sub-segment A containing s to the sub-segment B containing t. • Fact 2 still holds if we redefine D(x) as the maximum of the diameters of all segments of length x.
Poly-log diameter for 1<r<2 • Consider the sequence of maximum diameter values in our partitioning hierarchy, $D(n), D(n^a), \dots, D(x_0)$, where almost surely $D(x) \le 2D(x^a) + 1$ • The number of terms is $\Theta(\log\log n)$ • $D(x_0) \le x_0$, which is bounded by a poly-log function of n • So, $D(n) = O(\log^c n)$ for some c>0 depending on r only
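Unrolling the recurrence makes the poly-log bound explicit. The derivation below is a sketch that ignores rounding of segment lengths and the union bound over phases; m, the number of iterations, is a symbol introduced here for illustration.

```latex
\begin{align*}
n^{a^m} \le x_0
  &\iff m \ge \log_{1/a}\!\frac{\log n}{\log x_0},
  \qquad\text{so } m = O(\log\log n) \text{ iterations suffice}, \\
D(n) &\le 2^m D(x_0) + (2^m - 1) \le 2^m (x_0 + 1), \\
2^m &\le 2\left(\frac{\log n}{\log x_0}\right)^{\log_{1/a} 2}
  \le 2\,(\log n)^{1/\log_2(1/a)}, \\
D(n) &= O\!\left(x_0 \cdot (\log n)^{1/\log_2(1/a)}\right)
  = O(\log^c n) \quad\text{when } x_0 \text{ is poly-log in } n.
\end{align*}
```

Since a can be chosen as a function of r alone (any fixed value in $(r/2, 1)$), the exponent c depends only on r and on the poly-log size chosen for $x_0$.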
The diameter of C(r,n) • For r>2, C(r,n) is a ‘large’ world: expected diameter $\Omega(n^c)$, c=r-1/r • Random links tend to go to close nodes, so there are few long links
Higher dimensions • We generalize to k-dimensional grids • If $0 \le r \le k$: diameter = $\Theta(\log n)$ • If $k < r < 2k$: diameter < $\log^c n$ for some c>1 • If $2k < r$: diameter > $n^c$ for some 0<c<1 • The case r=2k is still open.
Analyzing Greedy Routing • For r=k (so r=2 for a 2-D grid), Kleinberg shows greedy routing takes $O(\log^2 n)$ steps. • We show this bound is tight, and: with probability greater than 1/2, Kleinberg’s algorithm uses at least $c \log^2 n$ steps for some constant c>0. • Fraigniaud et al. also show a tight bound, which was suggested by the 1-D result of Barriere et al.
Proof of the tight bound (ideas) • How fast does a step reduce the remaining distance to the destination? • We measure the ratio between the distance to t before and after each random trial: We reach t when the product of the ratios =d(s,t)
Rate of Progress • To avoid a product of ratios, we transform to $Z_v$, the log of the ratio $d(v,t)/d(v',t)$, where v’ is the next vertex. • Done when the sum of the $Z_v$ totals $\log d(s,t)$ • Show $E[Z_v] = O(1/\log n)$, so we need $\Omega(\log^2 n)$ steps to total $\log d(s,t) \approx \log n$.
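One way to turn the bound on $E[Z_v]$ into a constant-probability lower bound is a Markov-inequality argument. This is a sketch and not necessarily the authors' proof. It assumes the bound $E[Z_{v_j}] \le c'/\log n$ holds at every step given the history, that $\log d(s,t) \ge \tfrac12 \log n$, and that routing is considered finished on reaching distance 1. The symbols $v_j$, $c'$ and m are introduced here for illustration.

```latex
\begin{align*}
\sum_{j=0}^{m-1} Z_{v_j}
  &= \sum_{j=0}^{m-1} \log\frac{d(v_j,t)}{d(v_{j+1},t)}
   = \log d(s,t) - \log d(v_m,t)
   \qquad (v_0 = s), \\
\Pr[\text{done within } m \text{ steps}]
  &\le \Pr\Big[\sum_{j=0}^{m-1} Z_{v_j} \ge \log d(s,t)\Big]
   \le \frac{m \, c'/\log n}{\log d(s,t)}
   \le \frac{2\, m\, c'}{\log^2 n}, \\
m = \frac{\log^2 n}{4c'}
  &\;\Longrightarrow\; \Pr[\text{done within } m \text{ steps}] \le \tfrac12 .
\end{align*}
```

The second line uses $Z_{v_j} \ge 0$, which holds because greedy routing never increases the distance to t.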