Review of Literature: The Small World Phenomenon: An Algorithmic Perspective, by Jon Kleinberg. Reviewed by: Siddharth Srinivasan
Oh, it’s such a small world!! • Milgram (1967, 69): performed an empirical validation of the small world concept in sociology. • Packet sent by a randomly chosen source to a random target. • Mean chain length = 5.2 • Variables of geographic proximity, profession and sex • Funneling of chains by certain individuals • Previous work: • Pool and Kochen: model in which 2 people chosen at random are connected through k intermediaries. Assumes a synthetic, homogeneous structure. • Rapoport and Horvath: empirical study of school friendships. Asymmetric nets; the universe is small.
Small world! Small world! • White (1970): tries fitting a simple model to Milgram’s work. • Gives hints for future work • Killworth & Bernard (1979): reverse small-world experiment • Aim: to understand social network structure, the factors that influence the choice of acquaintance, and the out-degree of people. • Results: • Generation of contacts is not purely random. • Large number of contacts for local targets; few contacts for non-local targets. • The size of the geographical area that a single contact is responsible for decreases as a function of the distance of the target from the starter. • Most choices are based on cues of occupation and geographic location.
Small Worlds Everywhere • Watts and Strogatz (1998) • A very small number of long-range contacts is enough to decrease path lengths without much reduction in cliquishness. • Long-range contacts picked uniformly at random (u.a.r.) • Small world networks in 3 different areas, esp. the spread of infectious disease. • Probabilistic reach. No specific destinations. • Doesn’t require knowledge of paths, and there is no active path selection. • Albert, Jeong and Barabási (1999): diameter of the WWW • Power-law distribution; logarithmic diameter. • Need for search engines to pick links intelligently
Two Important Properties of Small World Networks • Low average hop count • High clustering coefficient • Additionally, may be searchable on the basis of local information
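The two properties can be measured directly on a toy graph. The following is a minimal Python sketch, not taken from the slides: it assumes a ring lattice with a few extra shortcut links (a simplified variant of the Watts and Strogatz construction, which rewires links rather than adding them), and all function names are illustrative.

```python
import random
from collections import deque


def ring_lattice(n, k):
    """Ring of n nodes, each linked to its k nearest neighbours on each side."""
    adj = {u: set() for u in range(n)}
    for u in range(n):
        for step in range(1, k + 1):
            v = (u + step) % n
            adj[u].add(v)
            adj[v].add(u)
    return adj


def add_shortcuts(adj, count, rng):
    """Return a copy of adj with `count` extra links between non-adjacent pairs."""
    new_adj = {u: set(vs) for u, vs in adj.items()}
    nodes = list(new_adj)
    added = 0
    while added < count:
        u, v = rng.sample(nodes, 2)
        if v not in new_adj[u]:
            new_adj[u].add(v)
            new_adj[v].add(u)
            added += 1
    return new_adj


def average_hop_count(adj):
    """Mean shortest-path length over all ordered pairs, via BFS from each node."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


def clustering_coefficient(adj):
    """Average, over nodes, of the fraction of neighbour pairs that are linked."""
    total = 0.0
    for u, nbrs in adj.items():
        nbrs = list(nbrs)
        d = len(nbrs)
        if d < 2:
            continue
        links = sum(1 for i in range(d) for j in range(i + 1, d)
                    if nbrs[j] in adj[nbrs[i]])
        total += links / (d * (d - 1) / 2)
    return total / len(adj)


rng = random.Random(0)
ring = ring_lattice(200, 2)
small_world = add_shortcuts(ring, 20, rng)
print(average_hop_count(ring), clustering_coefficient(ring))
print(average_hop_count(small_world), clustering_coefficient(small_world))
```

In this sketch the shortcuts can only shorten paths, while each node keeps its local triangles, so the clustering coefficient stays close to that of the plain ring.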
Enter Kleinberg… • Two issues of concern in small-world networks: • Do short paths exist in a small world network? • How do you find the short chains using local information only? • Gives an infinite family of small world network models on a grid network with power-law distributed random long-range links. • K(n,k,p,q,r) • n: side length of the grid • p: radius of the neighbourhood to which a node has short, local links • q: no. of random long-range links per node • k: dimension of the mesh (k = 2 in this paper) • r: clustering exponent of the inverse power-law distribution. • Pr[(x,y)] ∝ dist(x,y)^{-r}. • Decentralized greedy routing algorithm • Decisions based on local information only.
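The model and the greedy rule can be sketched in a few lines of Python. This is an illustrative sketch, not Kleinberg's own code: it assumes k = 2, p = q = 1 and lattice (Manhattan) distance, samples each long-range contact lazily when a node is first visited, and uses hypothetical function names.

```python
import random


def lattice_distance(a, b):
    """Manhattan (lattice) distance between grid points a and b."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def sample_long_range_contact(u, n, r, rng):
    """Pick v != u with probability proportional to dist(u, v) ** -r."""
    others = [(x, y) for x in range(n) for y in range(n) if (x, y) != u]
    weights = [lattice_distance(u, v) ** -r for v in others]
    return rng.choices(others, weights=weights, k=1)[0]


def greedy_route(source, target, n, r, rng):
    """Forward to whichever known contact is closest to the target (p = q = 1)."""
    long_range = {}  # contacts are sampled lazily, once per visited node
    current, steps = source, 0
    while current != target:
        x, y = current
        # The (up to four) local contacts that lie inside the n x n grid.
        contacts = [(x + dx, y + dy)
                    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
                    if 0 <= x + dx < n and 0 <= y + dy < n]
        if current not in long_range:
            long_range[current] = sample_long_range_contact(current, n, r, rng)
        contacts.append(long_range[current])
        # Purely local decision: only the current node's contacts are inspected.
        current = min(contacts, key=lambda v: lattice_distance(v, target))
        steps += 1
    return steps


rng = random.Random(1)
n = 20
steps = greedy_route((0, 0), (n - 1, n - 1), n, r=2, rng=rng)
print(steps)
```

Because some local contact always lies one step closer to the target, the route never takes more steps than the lattice distance; the long-range contacts are what can make it much shorter.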
Bounds on Kleinberg’s Model • Expected delivery time (k = 2): • O((log n)^2), for r = 2. • Ω(n^{(2-r)/3}), for 0 ≤ r < 2. • Ω(n^{(r-2)/(r-1)}), for r > 2. • Shows that the Watts & Strogatz style model with uniformly random long-range links (r = 0) is not efficiently navigable by a decentralized algorithm, even though short paths exist. • Only in the special case r = k is it possible to always find short chains, of length O((log n)^2); the diameter is O(log n) (diameter bound not proved by Kleinberg in this paper). • The cues used in small world networks are proposed to come from a correlation between the structure and the distribution of long-range connections.
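To see how sharply the bounds single out r = 2, the exponents above can be tabulated. A small Python sketch, assuming k = 2 as on the slide; the function name is illustrative.

```python
def lower_bound_exponent(r):
    """Exponent e such that expected delivery time is Omega(n ** e), for k = 2."""
    if r < 0:
        raise ValueError("r must be non-negative")
    if r < 2:
        return (2 - r) / 3
    if r > 2:
        return (r - 2) / (r - 1)
    # r == 2: no polynomial lower bound; delivery time is O((log n) ** 2).
    return 0.0


for r in (0, 1, 2, 3, 4):
    print(r, lower_bound_exponent(r))
```

The exponent is 2/3 at r = 0, falls to 0 at r = 2, and climbs back towards 1 as r grows.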
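Proof of the upper bound • For r = 2, p = 1, q = 1. • Event E_u(v): u chooses v as its random long-range contact. • Pr[E_u(v)] = d(u,v)^{-2} / Σ_{w≠u} d(u,w)^{-2}, and Σ_{w≠u} d(u,w)^{-2} ≤ Σ_{j=1}^{2n-2} (4j)(j^{-2}) ≤ 4 + 4 ln(2n-2) ≤ 4 ln(6n). • Hence Pr[E_u(v)] ≥ [4 ln(6n) d(u,v)^2]^{-1}. • In phase j, 2^j < d(u,t) ≤ 2^{j+1}. For log(log n) < j < log n, let B_j be the set of nodes within lattice distance 2^j of t. • No. of nodes in B_j ≥ 1 + Σ_{i=1}^{2^j} i > 2^{2j-1}, each within lattice distance 2^j + 2^{j+1} < 2^{j+2} of u. • Pr[message enters B_j] ≥ 2^{2j-1} / (4 ln(6n) 2^{2j+4}) = 1 / (128 ln(6n)). • Steps in phase j = X_j; E[X_j] ≤ 128 ln(6n). • Summing over at most 1 + log n phases, E[X] ≤ (1 + log n)(128 ln(6n)) = O((log n)^2).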
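The 4 ln(6n) bound on the normalizing sum that this proof relies on can be checked numerically on small grids. A brute-force Python sketch, assuming an n × n grid with lattice distance; function names are illustrative.

```python
import math


def normalising_sum(u, n):
    """Sum of d(u, v) ** -2 over all v != u on an n x n grid."""
    ux, uy = u
    total = 0.0
    for x in range(n):
        for y in range(n):
            d = abs(x - ux) + abs(y - uy)
            if d > 0:
                total += d ** -2
    return total


def worst_case_sum(n):
    """Largest normalising sum over all choices of u on the grid."""
    return max(normalising_sum((x, y), n) for x in range(n) for y in range(n))


for n in (5, 10, 20):
    print(n, worst_case_sum(n), 4 * math.log(6 * n))
```

The first printed value in each row should stay below the second, matching the bound used in the proof.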
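Proof of lower bound 1 • As in the previous proof, bound the normalizing sum, now from below: Σ_{v≠u} d(u,v)^{-r} ≥ Σ_{j=1}^{n/2} j · j^{-r} ≥ ∫_1^{n/2} x^{1-r} dx ≥ n^{2-r} / ((2-r) 2^{3-r}), where it is assumed that n^{2-r} ≥ 2^{3-r}. • Let δ = (2-r)/3 and let U be the set of nodes within lattice distance p n^δ of t. Then |U| ≤ 1 + Σ_{j=1}^{p n^δ} 4j ≤ 4 p^2 n^{2δ}, where it is assumed that p n^δ ≥ 2. • Let E' be the event that the msg reaches a node with a long-range contact in U within λ n^δ steps, for a suitable constant λ < 1. Let E'_i be the event that this happens in the ith step. Then Pr[E'_i] ≤ q |U| (2-r) 2^{3-r} / n^{2-r} ≤ (2-r) 2^{5-r} q p^2 n^{2δ} / n^{2-r}, and a union bound over the λ n^δ steps gives Pr[E'] ≤ ¼ once λ is chosen small enough (note 3δ = 2-r).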
Proof of lower bound 1 contd. • Let F be the event that s and t are separated by lattice distance ≥ n/4. Pr[F] ≥ ½; Pr[¬F ∨ E'] ≤ ¾; and so Pr[F ∧ ¬E'] ≥ ¼. • Let E be the event that the msg reaches t from s within λ n^δ steps. E cannot occur if (F ∧ ¬E') occurs. • Pr[E | (F ∧ ¬E')] = 0 and E[X | (F ∧ ¬E')] ≥ λ n^δ steps. • E[X] ≥ E[X | (F ∧ ¬E')] · Pr[F ∧ ¬E'] ≥ ¼ λ n^δ steps, where X is the random variable denoting the no. of steps. • Thus, the lower bound on the expected no. of steps is Ω(n^{(2-r)/3}), for 0 ≤ r < 2.
Proof of lower bound 2 • Similar to the previous proof, with ε = r - 2. • Let β = ε/(1+ε), γ = 1/(1+ε), and λ' = min(1,ε)/(8q). Assumed that n^γ ≥ p. • Let E'_i be the event that in the ith step, the msg reaches a node u with a long-range contact v such that d(u,v) > n^γ. Let E' be the event that this happens within λ' n^β steps. • Similar to the previous proof, a union bound over the steps shows that F ∧ ¬E' occurs with constant probability. • The max. distance covered in λ' n^β steps without E' occurring is λ' n^β · n^γ = λ' n < n/4 (since β + γ = 1), and hence E[X] = Ω(n^β). • Thus, the lower bound on the expected no. of steps is Ω(n^{(r-2)/(r-1)}), for r > 2.
Major Ideas Contributed • Gives a model of a small world network in which local routing is possible along short paths. • Shows the more general result for k dimensions in a subsequent publication. • Correlation between local structure and long-range links provides fundamental cues for finding paths. • When r < k, the structure provides few cues. • When r > k, long-range links do not provide sufficiently long jumps and paths become long.
Questions Raised • Can the expected delivery time be reduced to the bounds of the diameter? • Is the model extendable to more general networks? • Can less regular base graphs also produce navigable small worlds?
Work Done After the Paper • Further analysis and generalization of Kleinberg’s models and other small world models • Conversion of general networks to small world networks • Applications of the small world idea to real networks
Further Analysis and Generalizations 1 • Barriere et al. (2001): • Proves a Θ((log n)^2) bound on routing complexity. Simplified analysis using a ring instead of a grid. • Oblivious greedy routing. • Basic concept used in the analysis: (G, p) is an (f, c)-long range contact graph if, for any pair (u,t) at distance at most d, Pr[u → B_{d/c}(t)] ≥ 1/f(d), where B_{d/c}(t) is the ball of radius d/c around t. • If the graph (G, p) is an (f, c)-long range contact graph, then greedy routing takes O(Σ_{i=1}^{log_c D} f(D/c^i)) expected steps. • If p is a non-increasing fn. of distance, then Pr[u → B_{d/c}(t)] ≥ p((c+1)d/c) · |B_{d/c}(t)|, since every node of the ball is within distance (c+1)d/c of u. • Extends the results to any ring by epimorphisms (embedding one graph into another).
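As a worked instance of the sum in the bullet above, suppose (this is an assumption made for illustration, not a statement from the slides) that f(d) ≤ a log n for every d, with a a constant, and that D ≤ n. The sum then collapses to the familiar upper bound:

```latex
\[
\sum_{i=1}^{\log_c D} f\!\left(\frac{D}{c^{i}}\right)
\;\le\; \sum_{i=1}^{\log_c D} a \log n
\;=\; a \,\log n \,\log_c D
\;=\; O\!\left(\log^{2} n\right).
\]
```

Each of the log_c D terms corresponds to one phase in which the distance to the target shrinks by a factor of c, and each phase costs at most a log n expected steps.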
Further Analysis and Generalizations 2 • Martel, C. and Nguyen, V. (2004): • Shows that Kleinberg’s algorithm has a tight Θ(log^2 n) expected delivery time and that the diameter is tight at Θ(log n). • For the k-dimensional grid as well. • With additional information, O(log^{3/2} n) for k = 2 and O(log^{1+1/k} n) for k ≥ 1. • The proof uses some interesting conceptual ideas (used by others previously as well): • p(u, v) = d^{-2}(u, v)/c_u, c_u = Σ_{v≠u} d^{-2}(u, v) = Σ_j b_j(u) j^{-2}, where b_j(u) is the no. of nodes at distance j from u; • b_j(u) = Θ(j), so c_u is approximated by a harmonic sum. • Inherently uses the concept of gradient, δ(v) = d(v,t) - d(N(v),t), to show the lower bound. • Uses the concept of harmonics to get for any integer 1 < m < d(v, t): • Expected delivery time is Ω(log^2 n) for any s and t w/ probability ≥ 0.5 when d(s,t) is O(n).
Extended algo: window (no. of neighbouring nodes whose long-range contacts are known) = log n. • In k dimensions, O(log^{1+1/k} n). Proved only for k = 2. • Diameter = Θ(log n). Extended to all possible K|K*(k,n,p,q) where k, p, q ≥ 1 and even for 0 < r < 2. • Grow trees from s and t using only long-range links, starting from an initial set of size Θ(log n) and going up to a set of size Θ(n log n) in O(log n) steps. With very high probability, these sets will overlap or be separated by a single link. • Extensions based on the concept of supernodes (composites of neighbouring nodes, to get all their random links) for analysis. • Subsequent work shows • poly-log expected diameter when k < r < 2k • polynomial expected diameter when r > 2k.
Further Analysis and Generalizations 3 • Fraigniaud et al. (2004): “Eclecticism shrinks even small worlds” • Dimensions need not mean only geographical dimensions but can refer to the various parameters used for routing in social networks: geography, occupation, education, socio-economic status etc. • Higher dimensions should intuitively give better performance, • but dimension plays no role in the routing performance of Kleinberg’s greedy algorithm, which is O(log^2 n) in all dimensions. • Giving O(log^2 n) bits of topological awareness per node decreases the expected number of steps of greedy routing to O(log^{1+1/k} n) in k-dimensional augmented meshes.
Called indirect greedy routing. Completely oblivious routing. • Analysis proves that between two consecutive nodes in the sequence of long-range nodes, dist(z_i, z_{i+1}) ≤ log^{1/k} n, and that there are O(log n) such nodes in total. • Augmenting the topological awareness above this optimum of O(log^2 n) bits would drastically decrease the performance of greedy routing. • Perhaps a first step towards the formalization of arguments in favor of the sociological evidence stating that eclecticism shrinks the world.
Further Analysis and Generalizations 4 • Raghavan et al. (2005). “Theoretical Analysis of Geographic Routing in Social Networks.” • Rank-based friendship: the probability that a person v is a friend of a person u is inversely proportional to the number of people w who live closer to u than v does. • rank_u(v) = no. of people w such that d(u,w) < d(u,v). • prob(u,v) ∝ rank_u(v)^{-1}. • More accurately models the behaviour of social networks; verified against LiveJournal data. • In a grid setting, prob(u,v) ∝ rank^{-1} ≈ d^{-k}. • Halves distance in expected polylogarithmic steps: • Starting from s, the expected number of steps before the message reaches a point in B_{d(s,t)/2}(t) is O(log n log m) = O(log^2 n) • Finds short paths in all 2-D meshes: • For any 2-dimensional mesh population network with n people and m locations, expected path length is O(log n log^2 m) = O(log^3 n). • Interesting proof methodology, using only balls. Plus, the rank-and-balls argument is general over all dimensions.
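The rank-based friendship rule is easy to state in code. A minimal Python sketch, assuming every person has a distinct location (so that u is the only one at distance 0 from itself and every rank is at least 1); the function name and the toy data are hypothetical.

```python
def rank_based_probabilities(u, positions, distance):
    """Map each v != u to Pr[u picks v], proportional to 1 / rank_u(v)."""
    dist_from_u = {v: distance(positions[u], positions[v]) for v in positions}
    weights = {}
    for v in positions:
        if v == u:
            continue
        # rank_u(v): number of people w strictly closer to u than v is.
        # u itself is counted (distance 0), so the rank is at least 1.
        rank = sum(1 for w in positions if dist_from_u[w] < dist_from_u[v])
        weights[v] = 1.0 / rank
    total = sum(weights.values())
    return {v: w / total for v, w in weights.items()}


people = {"u": 0, "a": 1, "b": 3, "c": 7}  # hypothetical positions on a line
probs = rank_based_probabilities("u", people, lambda p, q: abs(p - q))
print(probs)
```

Here the ranks of a, b and c are 1, 2 and 3, so the link probabilities are proportional to 1, 1/2 and 1/3 regardless of the actual distances, which is what lets the rule adapt to non-uniform population density.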
Further Analysis and Generalizations 5 • Watts et al. (2002) and Motter et al. (2003). • Hierarchies of social groups, with groups having some correlation between them. • Social ties generated by picking links from social groups according to a probability distribution governed by social affinity. • Manku et al. (2004). Know thy neighbour’s neighbour. • Shows that if every node is aware of the long-range links of its neighbours, then greedy routing takes O(log^2 n/(c log c)) steps with c long-range contacts per node.
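The lookahead idea can be sketched on a ring. The Python below is an illustrative reading of neighbour-of-neighbour greedy routing, not the authors' exact algorithm: it assumes a ring of n nodes, c long-range links per node drawn with probability proportional to 1/distance, and hypothetical function names.

```python
import random


def ring_distance(a, b, n):
    """Shortest distance between positions a and b on a ring of n nodes."""
    return min((a - b) % n, (b - a) % n)


def build_ring(n, c, rng):
    """Ring links plus c long-range links per node, Pr proportional to 1/distance."""
    adj = {u: {(u - 1) % n, (u + 1) % n} for u in range(n)}
    for u in range(n):
        others = [v for v in range(n) if v != u]
        weights = [1 / ring_distance(u, v, n) for v in others]
        adj[u].update(rng.choices(others, weights=weights, k=c))
    return adj


def non_greedy_route(source, target, adj, n):
    """Greedy routing with one level of lookahead over neighbours' contacts."""
    current, hops = source, 0
    while current != target:
        # Candidate paths: stop at a neighbour v, or go on to a contact w of v.
        candidates = [(v,) for v in adj[current]]
        candidates += [(v, w) for v in adj[current] for w in adj[v]]
        # Prefer the endpoint closest to the target, then the shorter path.
        best = min(candidates,
                   key=lambda path: (ring_distance(path[-1], target, n), len(path)))
        current = best[-1]
        hops += len(best)
    return hops


rng = random.Random(7)
n = 128
adj = build_ring(n, 2, rng)
hops = non_greedy_route(0, 64, adj, n)
print(hops)
```

Each round uses at most two hops and, thanks to the ring links, always gets at least two positions closer (or reaches the target), so the hop count never exceeds the ring distance.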
Conversion to small world networks • Duchon et al. (2006), at INRIA • On bounded growth graphs, and extended to polylogarithmic expansion rates. • Using O(n) rounds and O(polylog n) space. No need for a node to have complete knowledge of the graph. • Any synchronized n-node network of bounded growth, of diameter D, and maximum degree Δ, can be turned into a small world via the addition of one link per node, • in O(n) rounds, with an expected number of messages O(nD log n), and requiring O(Δ log n log D) memory size with high probability, or, • in O(D) rounds, with an expected number of messages O(n log D log n), and requiring O(n) bits of memory in each node with high probability • In the augmented network, the greedy routing algorithm computes paths of expected length O(log D log δ + log n) between any pair of nodes at mutual distance δ in the original network. • Sampling of leader nodes. • Only leader nodes explore a ball B_v(3l), when asked by a node u at a distance ≤ l (l = 2^i) to select a random long-range link for it, where i is selected u.a.r.
Some Application Areas • P2P overlay networks • Distributed hashing protocols • Security systems in mobile ad hoc networks • Hybrid sensor networks • Referral systems
Applications: Distributed Hashing • Manku et al. (2003): Symphony • Arrange all participants in a ring I = [0,1). • A node manages the sub-range of I which corresponds to the segment between itself and its two neighbours • Equip nodes with long-range contacts • drawn randomly from a family of harmonic distributions • link distance x ∈ [1/n, 1] drawn from the pdf p(x) = 1/(x ln n) • Advantages: low degree; can handle heterogeneity through a variable number of long-range links and only two mandatory short links; low latency, O((log^2 n)/k) with k long-range links. • For fault tolerance, add f backups, but only on the short-link neighbours.
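Drawing from the harmonic density p(x) = 1/(x ln n) on [1/n, 1] is a one-liner by inverse-transform sampling: the CDF is F(x) = 1 + ln x / ln n, so x = n^(u-1) for u uniform in [0, 1). A minimal Python sketch; the function names are illustrative.

```python
import math
import random


def harmonic_cdf(x, n):
    """CDF of the density p(x) = 1 / (x ln n) on [1/n, 1]."""
    return 1.0 + math.log(x) / math.log(n)


def draw_link_distance(n, rng):
    """Inverse-transform sample: x = n ** (u - 1) for u uniform in [0, 1)."""
    return n ** (rng.random() - 1.0)


rng = random.Random(42)
n = 1024
samples = [draw_link_distance(n, rng) for _ in range(5)]
print(samples)
```

A node at position a on the ring would then try to link to whichever node manages the point (a + x) mod 1, so short distances are chosen far more often than long ones.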
Applications: P2P Overlay Networks • Bonsma and Hoile (2002): SWAN (Small World Adaptive Network) • Each node has 3 types of links: bootstrap, local (short-range) and long-range (random). • Hui et al.: SWOP (Small World Overlay Protocol) • Cluster links and long links • Head nodes and inner nodes • Pdf: Pr[X' = x] = p(x) = 1/(x ln m), where x ∈ [1,m] and m is the no. of clusters • To handle flash crowds, demand-driven replication over long links.
Applications: Hybrid Sensor Networks • Sharma & Mazumdar (2005): • Addition of a few shortcut wires between wireless sensors. • Reduced energy dissipation per node, as well as reduced non-uniformity in energy expenditure. • Deterministic as well as probabilistic placement of wires. • Few wires, unlike the 1 long-range contact per node in Kleinberg’s model. One sensor in a cell / group of cells of sensors is wired. • Very good performance in the static sink node case • with the addition of Θ(n l(n)/log n) wires, average hop count is reduced to Θ(1/√l(n)) and EDS to Θ(1/l(n)). • In the dynamic case, with greedy routing, hop count cannot be reduced below Ω(1/l(n)).
Applications: Security Systems in Ad Hoc Networks • Hubaux et al. (2002): analysis of the PGP certificate graph. • Gray et al. (2003): trust propagation
Bibliography • Albert, Jeong, Barabási (1999). Diameter of the World Wide Web. Nature. • Barriere, Fraigniaud, Kranakis, Krizanc (2001). Efficient routing in networks with long range contacts. • Bonsma, Hoile (2002). A distributed implementation of the SWAN peer-to-peer look-up system using mobile agents. • Duchon, Hanusse, Lebhar, Schabanel (2006). Fully distributed scheme to turn a network into a small world. Research report No. 2006-03, INRIA Lyon. • Fraigniaud, Gavoille, Paul (2004). Eclecticism shrinks even small worlds. • Gray, Seigneur, Chen, Jensen (2003). Trust propagation in small world networks. • Helmy (2003). Small Worlds in Wireless Networks. IEEE Commun. Lett., vol. 7, no. 10, pp. 490-492. • Hawick, James (2004). Small-World Effects in Wireless Agent Sensor Networks. • Hubaux, Capkun, Buttyan (2002). Small Worlds in Security Systems: an Analysis of the PGP Certificate Graph. New Security Paradigms Workshop, Norfolk, VA. • Hui, Lui, Yau (2006). Small world overlay P2P networks: construction and handling dynamic flash crowds. Accepted in J. of Comp. Networks. • Killworth, Bernard (1979). Reverse Small World Experiment. Social Networks. • Kleinberg (2000). Navigation in a small world. Nature.
Manku, Bawa, Raghavan (2003). Symphony: Distributed hashing in a small world. USENIX Symposium on Internet Technologies and Systems. • Manku, Naor, Wieder (2004). Know Thy Neighbor’s Neighbor: The Power of Lookahead in Randomized P2P Networks. 36th ACM Symp. on Theory of Computing (STOC). • Martel, Nguyen (2004). Analyzing Kleinberg’s (and other) small world networks. ACM PODC ’04. • Travers, Milgram (1969). An experimental study of the small world problem. Sociometry. • Motter, Nishikawa, Lai (2003). Large scale structural organization of social networks. Physical Review. • Raghavan, Kumar, Liben-Nowell, Novak, Tomkins (2005). Geographic Routing in Social Networks. • Raghavan, Kumar, Liben-Nowell, Novak, Tomkins (2005). Theoretical Analysis of Geographic Routing in Social Networks. • Sharma, Mazumdar (2005). Hybrid Sensor Networks: a small world. • Watts, Strogatz (1998). Collective dynamics of ‘small-world’ networks. Nature. • Watts, Dodds, Newman (2002). Identity and Search in Social Networks. Science, 296, 1302-1305. • White (1970). Search parameters for the small world problem. Social Forces. • Yu, Singh (2003). Searching social networks.