560 likes | 758 Views
ICMCM’09 December, 2009. Dynamic Models of On-line Social Networks. Anthony Bonato Ryerson University. Toronto in December…. Complex Networks. web graph, social networks, biological networks, internet networks , …. nodes : web pages edges : links
E N D
ICMCM’09 December, 2009 Dynamic Models of On-line Social Networks Anthony Bonato Ryerson University On-line Social Networks - Anthony Bonato
Toronto in December… On-line Social Networks - Anthony Bonato
Complex Networks • web graph, social networks, biological networks, internet networks, … On-line Social Networks - Anthony Bonato
nodes: web pages edges: links over 1 trillion nodes, with billions of nodes added each day The web graph On-line Social Networks - Anthony Bonato
Social Networks nodes: people edges: social interaction (eg friendship) On-line Social Networks - Anthony Bonato
On-line Social Networks (OSNs)Facebook, Twitter, Orkut, LinkedIn, GupShup… On-line Social Networks - Anthony Bonato
A new paradigm • half of all users of internet on some OSN • 250 million users on Facebook, 45 million on Twitter • unprecedented, massive record of social interaction • unprecedented access to information/news/gossip On-line Social Networks - Anthony Bonato
Properties of Complex Networks • observed properties: • massive, power law, small world, decentralized (Broder et al, 01) On-line Social Networks - Anthony Bonato
Small World Property • small world networks introduced by social scientists Watts & Strogatz in 1998 • low diameter/average distance (“6 degrees of separation”) • globally sparse, locally dense (high clustering coefficient) On-line Social Networks - Anthony Bonato
Paths in Twitter Dalai Lama Arnold Schwarzenegger Queen Rania of Jordan Christianne Amanpour Ashton Kutcher On-line Social Networks - Anthony Bonato
Why model complex networks? • uncover the generative mechanisms underlying complex networks • models are a predictive tool • nice mathematical challenges • models can uncover the hidden reality of networks • in OSNs: • community detection • advertising • security On-line Social Networks - Anthony Bonato
Many different models On-line Social Networks - Anthony Bonato
Social network analysis On-line • Milgram (67): average distance between two Americans is 6 • Watts and Strogatz (98): introduced small world property • Adamic et al. (03): early study of on-line social networks • Liben-Nowell et al. (05): small world property in LiveJournal • Kumar et al. (06): Flickr, Yahoo!360;average distances decrease with time • Golder et al. (06): studied 4 million users of Facebook • Ahn et al. (07): studiedCyworld in South Korea, along with MySpace and Orkut • Mislove et al. (07): studiedFlickr, YouTube, LiveJournal, Orkut • Java et al. (07): studied Twitter: power laws, small world On-line Social Networks - Anthony Bonato
Key parameters • power law degree distributions: • average distance: • clustering coefficient: Wiener index, W(G) On-line Social Networks - Anthony Bonato
Power laws in OSNs On-line Social Networks - Anthony Bonato
Flickr and Yahoo!360 • (Kumar et al,06):shrinking diameters On-line Social Networks - Anthony Bonato
Sample data: Flickr, YouTube, LiveJournal, Orkut • (Mislove et al,07): short average distances and high clustering coefficients On-line Social Networks - Anthony Bonato
(Leskovec, Kleinberg, Faloutsos,05): • many complex networks (including on-line social networks) obey two additional laws: • Densification Power Law • networks are becoming more dense over time; • i.e. average degree is increasing • e(t) ≈ n(t)a • where 1 < a ≤ 2:densification exponent • a=1: linear growth – constant average degree, such as in web graph models • a=2: quadratic growth – cliques On-line Social Networks - Anthony Bonato
Densification – Physics Citations e(t) 1.69 n(t) On-line Social Networks - Anthony Bonato
Densification – Autonomous Systems e(t) 1.18 n(t) On-line Social Networks - Anthony Bonato
Decreasing distances • distances (diameter and/or average distances) decrease with time • noted by Kumar et al. in Flickr and Yahoo!360 • Preferential attachment model (Barabási, Albert, 99), (Bollobás et al, 01) • diameter O(log t) • Random power law graph model (Chung, Lu, 02) • average distance O(log log t) On-line Social Networks - Anthony Bonato
Diameter – ArXiv citation graph diameter time [years] On-line Social Networks - Anthony Bonato
Diameter – Autonomous Systems diameter number of nodes On-line Social Networks - Anthony Bonato
Models for the laws • (Leskovec, Kleinberg, Faloutsos, 05, 07): • Forest Fire model • stochastic • densification power law, decreasing diameter, power law degree distribution • (Leskovec, Chakrabarti, Kleinberg,Faloutsos, 05, 07): • Kronecker Multiplication • deterministic • densification power law, decreasing diameter, power law degree distribution On-line Social Networks - Anthony Bonato
Models of OSNs • many models exist for general complex networks • few models for on-line social networks • goal: find a model which simulates many of the observed properties of OSNs • must be simple and evolve in a natural way • must be different than previous complex network models: densification and constant diameter! On-line Social Networks - Anthony Bonato
“All models are wrong, but some are more useful.” – G.P.E. Box On-line Social Networks - Anthony Bonato
Iterated Local Transitivity (ILT) model(Bonato, Hadi, Horn, Prałat, Wang, 08) • key paradigm is transitivity: friends of friends are more likely friends; eg (Girvan and Newman, 03) • iterative cloning of closed neighbour sets • deterministic: amenable to analysis • local: nodes often only have local influence • evolves over time, but retains memory of initial graph On-line Social Networks - Anthony Bonato
ILT model • parameter: finite simple undirected graph G = G0 • to form the graph Gt+1 for each vertex x from time t, add a vertex x’, the clone ofx, so that xx’ is an edge, and x’ is joined to each neighbour of x • order of Gt is 2tn0 On-line Social Networks - Anthony Bonato
G0 = C4 On-line Social Networks - Anthony Bonato
Properties of ILT model • average degree increasing to ∞ with time • average distance bounded by constant and converging, and in many cases decreasing with time; diameter does not change • clustering higher than in a random generated graph with same average degree • bad expansion: small gaps between 1st and 2nd eigenvalues in adjacency and normalized Laplacian matrices of Gt On-line Social Networks - Anthony Bonato
Densification • nt = order of Gt, et = size of Gt Lemma: For t > 0, nt = 2tn0, et = 3t(e0+n0) - 2tn0. → densification power law: et ≈ nta, where a = log(3)/log(2). On-line Social Networks - Anthony Bonato
Average distance Theorem 2: If t > 0, then • average distance bounded by a constant, and converges; for many initial graphs (large cycles) it decreases • diameter does not change from time 0 On-line Social Networks - Anthony Bonato
Clustering Coefficient Theorem 3: If t > 0, then c(Gt) = ntlog(7/8)+o(1). • higher clustering than in a random graph G(nt,p) with same order and average degree as Gt, which satisfies c(G(nt,p)) = ntlog(3/4)+o(1) On-line Social Networks - Anthony Bonato
Sketch of proof of lower bound • each node x at time t has a binary sequence corresponding to descendants from time 0, with a clone indicated by 1 • let e(x,t) be the number of edges in N(x) at time t • we may show that e(x,t+1) = 3e(x,t) + 2degt(x) e(x’,t+1) = e(x,t) + degt(x) • if there are k many 0’s in the binary sequence of x, then e(x,t) ≥ 3k-2e(x,2) = Ω(3k) On-line Social Networks - Anthony Bonato
Sketch of proof, continued • there are many nodes with k many 0’s in their binary sequence • hence, On-line Social Networks - Anthony Bonato
Example of community structure • Wayne Zachary’s Ph.D. thesis (1970-72): observed social ties and rivalries in a university karate club (34 nodes,78 edges) • during his observation, conflicts intensified and group split On-line Social Networks - Anthony Bonato
Adjacency matrix, A eigenvalue spectrum: (-2)41531 On-line Social Networks - Anthony Bonato
Spectral results • the spectral gapλ of G is defined by min{λ1, 2 - λn-1}, where 0 = λ0 ≤ λ1 ≤ … ≤ λn-1 ≤ 2 are the eigenvalues of the normalized Laplacian of G: I-D-1/2AD1/2(Chung, 97) • for random graphs, λtends to 1 as order grows • in the ILT model, λ < ½ • bad expansion/small spectral gaps in the ILT model found in social networks but not in the web graph (Estrada, 06) • in social networks, there are a higher number of intra- rather than inter-community links On-line Social Networks - Anthony Bonato
Random ILT model • randomize the ILT model • add random edges independently to new nodes, with probability a function of t • makes densification tunable • densification exponent becomes log(3 + ε) / log(2), where ε is any fixed real number in (0,1) • gives any exponent in (log(3)/log(2), 2) • similar (or better) distance, clustering and spectral results as in deterministic case On-line Social Networks - Anthony Bonato
Degree distribution • generate power law graphs from ILT? • deterministic ILT model gives a binomial-type distribution On-line Social Networks - Anthony Bonato
Geometric model for social networks • OSNs live in social space: proximity of nodes depends on common attributes (such as geography, gender, age, etc.) • IDEA: embed OSN in m-dimensional Euclidean space On-line Social Networks - Anthony Bonato
Dimension of an OSN • dimension of OSN: minimum number of attributes needed to classify nodes • like game of “20 Questions”: each question narrows range of possibilities • what is a credible mathematical formula for the dimension of an OSN? On-line Social Networks - Anthony Bonato
Random geometric graphs • nodes are randomly distributed in Euclidean space according to a given distribution • nodes are joined by an edge if and only if their distance is less than a threshold value (Penrose, 03) On-line Social Networks - Anthony Bonato
Spatial model for OSNs • we consider a spatial model of OSNs, where • nodes are embedded in m-dimensional Euclidean space • number of nodes is static • threshold value variable: a function of ranking of nodes On-line Social Networks - Anthony Bonato
Prestige-Based Spatial (PBS) Model(Bonato, Janssen, Prałat, 09) • parameters: α, β in (0,1), α+β < 1; positive integer m • nodes live in hypercube of dimension m, measure 1 • each node is ranked 1,2, …, n by some function r • 1 is best, n is worst • we use random initial ranking • at each time-step, one new node v is born, one node chosen u.a.r. dies (and ranking is updated) • each existing node u has a region of influence with volume • add edge uv if v is in the region of influence of u On-line Social Networks - Anthony Bonato
Notes on PBS model • models uses both geometry and ranking • dynamical system: gives rise to ergodic (therefore, convergent) Markov chain • users join and leave OSNs • number of nodes is static: fixed at n • order of OSNs has ceiling • top ranked nodes have larger regions of influence On-line Social Networks - Anthony Bonato
Properties of the PBS model (Bonato, Janssen, Prałat, 09) • with high probability, the PBS model generates graphs with the following properties: • power law degree distribution with exponent b = 1+1/α • average degree d =(1+o(1))n(1-α-β)/21-α • dense graph • tends to infinity with n • diameter D = (1+o(1))nβ/(1-α)m • depends on dimension m • m = clog n, then diameter is a constant On-line Social Networks - Anthony Bonato
Dimension of an OSN, continued • given the order of the network n, power law exponentb, average degree d, and diameterD, we can calculate m • gives formula for dimension of OSN: On-line Social Networks - Anthony Bonato
Uncovering the hidden reality • reverse engineering approach • given network data (n, b, d, D), dimension of an OSN gives smallest number of attributes needed to identify users • that is, given the graph structure, we can (theoretically) recover the social space On-line Social Networks - Anthony Bonato
Examples On-line Social Networks - Anthony Bonato