This chapter delves into the Small World Model, discussing its unique features such as transitivity and clustering coefficients. It compares different network models and their ability to generate realistic social networks, emphasizing the balance between clustering and small-world effect. The chapter explores the implications of various parameters in the Small World Model and its applications in analyzing real-world networks. Topics covered include the impact of rewiring edges, shortcut addition, and degree distribution in modeling social networks.
MIS 644 Social Network Analysis 2015/2016 Spring Chapter 6 III The Small World Model
Outline • Introduction • Small world model
Introduction • Transitivity is one of the least well-understood features of real networks • Transitivity: the propensity of two neighbors of a vertex to be neighbors of one another • Neither the random graph (RG), the configuration model (CM), nor network growth models generate a significant level of transitivity • Measured by the clustering coefficient (CC) • E.g. in these models the CC vanishes as n becomes large • Orders of magnitude smaller than observed for real networks
A simple triangular lattice has high transitivity • # of triangles = 2n • C(6,2) = 15 connected triples centered on each vertex • CC = 3 × 2n / (15n) = 0.4, compatible with many real social networks • Does not depend on the size of the network
Fig. 15.1 of N-N • A triangular lattice. Any vertex in a triangular lattice, such as the one highlighted here, has six neighbors and hence C(6,2) = 15 pairs of neighbors, of which six are connected by edges, giving a clustering coefficient of 6/15 = 0.4 for the whole network, regardless of size
Another model with high CC • Fig. 15.2a of N-N • Vertices arranged on a one-dimensional line • Each connected to its c nearest vertices, c being even
A simple one-dimensional network model. • (a) Vertices are arranged on a line and each is connected to its c nearest neighbors, where c = 6 in this example. • (b) The same network with periodic boundary conditions applied, making the line into a circle
Traversing a “triangle” in our circle model means taking two steps forward around the circle and one step back. • The step back spans at most c/2 spacings, so the two forward steps must end at most c/2 spacings ahead • # of ways of choosing the two forward steps from c/2 possibilities: C(c/2,2) = (1/2)(c/2)(c/2-1) = (1/4)c(c/2-1) • Total # of triangles: (1/4)nc(c/2-1)
# of connected triples centered on each vertex: C(c,2) = c(c-1)/2 • Total # of connected triples: nc(c-1)/2 • C = 3 × (1/4)nc(c/2-1) / [(1/2)nc(c-1)] = 3(c-2)/[4(c-1)] • As c varies from 2 to infinity, CC varies from 0 to 3/4 • Does not depend on n • Unrealistic: a regular network in which every vertex has degree c
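The closed form 3(c-2)/[4(c-1)] for the circle model can be checked by brute force on a small ring. The following is a minimal Python sketch, not part of the original slides; the function names `ring_lattice`, `clustering_coefficient` and `circle_formula` are illustrative assumptions, and n is assumed to be comfortably larger than c so that wrap-around does not create extra triangles.

```python
from itertools import combinations


def ring_lattice(n, c):
    """Adjacency sets: n vertices on a circle, each joined to its c nearest neighbors (c even)."""
    adj = {v: set() for v in range(n)}
    for v in range(n):
        for step in range(1, c // 2 + 1):
            w = (v + step) % n
            adj[v].add(w)
            adj[w].add(v)
    return adj


def clustering_coefficient(adj):
    """C = 3 * (# triangles) / (# connected triples), counted triple by triple."""
    closed = 0   # triples whose two end vertices are adjacent (each triangle counted 3 times)
    triples = 0  # all connected triples, one per pair of neighbors of a center vertex
    for center, nbrs in adj.items():
        for a, b in combinations(nbrs, 2):
            triples += 1
            if b in adj[a]:
                closed += 1
    return closed / triples


def circle_formula(c):
    """Closed form for the circle model: 3(c-2) / [4(c-1)]."""
    return 3 * (c - 2) / (4 * (c - 1))


if __name__ == "__main__":
    for c in (2, 4, 6, 10):
        print(c, clustering_coefficient(ring_lattice(60, c)), circle_formula(c))
```

The brute-force count and the formula should agree for any n well above c, which illustrates that the value does not depend on n.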
A more serious problem • These are large worlds: they do not display the small-world effect observed in most real networks • Small-world effect: the shortest-path distance between most pairs of vertices is small (a few steps), even for networks of billions of nodes • E.g. the acquaintance network of the entire world population
The shortest distance between two vertices in the circle model: • The fastest move around the ring covers c/2 spacings per step • Two vertices m spacings apart are connected by a shortest path of about 2m/c steps • Averaging over all m from 0 to n/2 gives l = n/2c • E.g. for the acquaintance network of the world population: • n ~ O(10⁹), c ~ O(10³), giving l of the order of millions • Measured l: 6 to 10
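The estimate l = n/2c can be compared with an exact breadth-first search on a small ring. This is a hedged Python sketch rather than anything from the slides: the helper names are assumptions, only one source vertex is searched because every vertex of the ring is equivalent by symmetry, and the world-population figures in the last lines (n = 7 × 10⁹, c = 1000) are illustrative values chosen to match the orders of magnitude quoted above.

```python
import math
from collections import deque


def ring_lattice(n, c):
    """Adjacency sets: n vertices on a circle, each joined to its c nearest neighbors (c even)."""
    adj = {v: set() for v in range(n)}
    for v in range(n):
        for step in range(1, c // 2 + 1):
            w = (v + step) % n
            adj[v].add(w)
            adj[w].add(v)
    return adj


def mean_distance_from(adj, source):
    """Mean shortest-path distance from `source` to every other vertex (breadth-first search)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return sum(dist.values()) / (len(dist) - 1)


def random_graph_estimate(n, c):
    """Random-graph estimate of the mean distance: ln n / ln c."""
    return math.log(n) / math.log(c)


if __name__ == "__main__":
    n, c = 600, 6
    print("BFS on the ring:", mean_distance_from(ring_lattice(n, c), 0))
    print("n / 2c         :", n / (2 * c))
    # Illustrative (assumed) figures for the world acquaintance network.
    print("circle model   :", 7e9 / (2 * 1000))
    print("random graph   :", random_graph_estimate(7e9, 1000))
```

The exact ring value sits slightly above n/2c because each distance is rounded up to a whole number of steps.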
By contrast, the RG captures the SW effect rather well • As do many other network models, such as the CM • Average shortest path: ln n / ln c • For the acquaintance network: log 10⁹ / log 10³ = 9/3 = 3 • But the RG has an unrealistically low CC • The two models, RG and simple circle (SC), each capture one property of real networks • RG: SW effect; SC: high CC • The small-world model (SWM) of Watts and Strogatz
The SWM interpolates between the SC and the RG • By moving or rewiring edges from the circle to random positions • Start from a circle model (abbreviated CM from here on) of n vertices, each having c edges • Go through each of the edges in turn • With some probability p: • remove that edge and • replace it with one joining two vertices chosen uniformly at random: a shortcut
The parameter p controls the interpolation between the circle model (CM) and the random graph model (RGM) • p=0: no edges are rewired, giving the original CM • p=1: all edges are rewired, giving the RGM • Intermediate values of p: networks in between • For p=0, the SWM shows clustering (for c>2) but no SW effect • For p=1 the reverse: SW effect but no clustering (RGM) • For modest values of p: both high CC and the SW effect
For analytical tractability • A variant of the original model: • Shortcut edges are added randomly • No edges are removed from the circle • p has the same meaning as in the original model • For every edge of the circle, with independent probability p, • add a shortcut between two vertices chosen uniformly at random
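Downside: • For p=1 the variant is not an RG • It is an RG plus the original SC • Most interest is in small p • There the two models hardly differ • The only difference: a small number of edges around the circle remain in the variant but are absent in the original model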
Two versions of the small-world model. • (a) In the original version of the small-world model, edges are with independent probability p removed from the circle and placed between two vertices chosen uniformly at random, creating shortcuts across the circle as shown. In this example n = 24, c = 6, and p = 0.07, so that 5 out of 72 edges are “rewired” in this fashion. • (b) In the second version of the model only the shortcuts are added and no edges are removed from the circle.
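The two constructions can be sketched side by side in Python. This is a minimal illustration under stated assumptions, not the authors' code: the function names are invented for the example, edges are kept as plain (u, v) pairs, and self-loops or repeated edges produced by the random choice of endpoints are deliberately not filtered out.

```python
import random


def ring_edges(n, c):
    """The nc/2 edges of the circle model: each vertex joined to its c nearest neighbors."""
    return [(v, (v + step) % n) for v in range(n) for step in range(1, c // 2 + 1)]


def small_world_rewired(n, c, p, rng):
    """Original version: each circle edge is, with probability p, removed and
    replaced by a shortcut between two vertices chosen uniformly at random."""
    edges = []
    for edge in ring_edges(n, c):
        if rng.random() < p:
            edges.append((rng.randrange(n), rng.randrange(n)))  # the rewired edge
        else:
            edges.append(edge)                                  # the edge stays on the circle
    return edges


def small_world_added(n, c, p, rng):
    """Variant: the circle is left intact; for each circle edge, with probability p,
    one extra shortcut is added between two vertices chosen uniformly at random."""
    edges = ring_edges(n, c)
    shortcuts = [(rng.randrange(n), rng.randrange(n)) for _ in edges if rng.random() < p]
    return edges + shortcuts


if __name__ == "__main__":
    rng = random.Random(1)  # fixed seed only so that the run is repeatable
    n, c, p = 24, 6, 0.07
    print(len(ring_edges(n, c)), "circle edges")
    print(len(small_world_rewired(n, c, p, rng)), "edges after rewiring")
    print(len(small_world_added(n, c, p, rng)), "edges after adding shortcuts")
```

Rewiring keeps the number of edges fixed at nc/2, while the variant has nc/2 circle edges plus about ncp/2 shortcuts.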
Degree Distribution • Degree of a vertex: c + # of shortcut edges attached to it • # of non-shortcut (circle) edges: nc/2 • For each of them, a shortcut is added with probability p • There are ncp/2 shortcuts on average • ncp ends of shortcuts • ncp/n = cp shortcut ends at any vertex on average • The # of shortcuts s attached to a vertex is Poisson distributed with mean cp: p_s = e^(-cp) (cp)^s / s!
Total degree of a vertex: k = c + s • Putting s = k - c: p_k = e^(-cp) (cp)^(k-c) / (k-c)! for k ≥ c, and p_k = 0 for k < c • Fig 15.4 of N-N: c=6, p=1/2 • Does not mimic the DD of real networks • But the model is not intended to mimic it
Fig. 15.4 of N-N • The degree distribution of the small-world model. The frequency distribution of vertex degrees in a small-world model with parameters c = 6 and p = 1/2
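The degree distribution follows directly from k = c + s with s Poisson distributed with mean cp. The short Python sketch below evaluates it for the parameters of the figure; the function name `degree_probability` is an assumption made for the example.

```python
import math


def degree_probability(k, c, p):
    """p_k for the shortcut-added small-world model: a Poisson law with mean cp, shifted by c."""
    s = k - c                      # number of shortcut ends at the vertex
    if s < 0:
        return 0.0                 # every vertex keeps its c circle edges
    mean = c * p
    return math.exp(-mean) * mean ** s / math.factorial(s)


if __name__ == "__main__":
    c, p = 6, 0.5
    for k in range(4, 16):
        print(k, round(degree_probability(k, c, p), 4))
```

No vertex has degree below c, and the mean degree is c + cp.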
Clustering Coefficient • The CC is given by C = 3 × (# of triangles) / (# of connected triples) in the network • # of triangles: • The triangles of the circle are unchanged, since no circle edges are removed • # of triangles in the circle: nc(c/2-1)/4 • Shortcuts create some new triangles: • Vertex pairs c/2+1 to c spacings apart are connected by one or more paths of length two; if such a pair is also joined by a shortcut, a triangle is formed
# of such paths of length two is proportional to n • Average # of shortcuts in the SWM: ncp/2 • C(n,2) = n(n-1)/2 places they can fall • Any particular pair of vertices is joined by a shortcut with probability (ncp/2) / [n(n-1)/2] = cp/(n-1) • just cp/n when n is large • # of paths of length two completed by shortcuts to form triangles is proportional to n × cp/n = cp, a constant • For large networks these can be neglected compared with the O(n) triangles from the CM
Triangles can also be formed from two or three shortcuts • Those turn out to be negligible in number • To leading order in n: • # of triangles in the SWM = # of triangles in the CM = nc(c/2-1)/4
# of connected triples: • All connected triples of the CM are present in the SWM: nc(c-1)/2 • Triples formed by a shortcut combined with an edge of the circle: • # of shortcuts: ncp/2 • Each end of a shortcut can pair with any of the c circle edges at that vertex to form a triple • Total: (ncp/2) × c × 2 = nc²p connected triples
Triples formed by pairs of shortcuts: • If a vertex has m shortcuts attached • # of triples formed by pairs of them: C(m,2) = m(m-1)/2 • Averaging over the Poisson distribution of m, with mean cp • Expected # of such triples centered at a vertex: c²p²/2 • Total of nc²p²/2 over all vertices • Expected # of triples of all kinds: • nc(c-1)/2 + nc²p + nc²p²/2
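Substituting into the clustering equation: C = 3 × [nc(c/2-1)/4] / [nc(c-1)/2 + nc²p + nc²p²/2] = 3(c-2) / [4(c-1) + 8cp + 4cp²] • For p=0, the same as for the CM • As p increases, C becomes smaller • At p=1 it reaches its minimum value: • (3/4)(c-2)/(4c-1) • E.g. when c=6, CC = 3/23 = 0.130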
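The substitution can be verified numerically by assembling C from the leading-order counts (nc(c/2-1)/4 triangles; nc(c-1)/2 + nc²p + nc²p²/2 connected triples) and comparing with the simplified form 3(c-2)/[4(c-1) + 8cp + 4cp²]. A small Python sketch follows; the function names are illustrative assumptions.

```python
def triangles(n, c):
    """Leading-order number of triangles: those of the underlying circle."""
    return n * c * (c / 2 - 1) / 4


def connected_triples(n, c, p):
    """Circle triples + shortcut-with-circle-edge triples + shortcut-pair triples."""
    return n * c * (c - 1) / 2 + n * c ** 2 * p + n * c ** 2 * p ** 2 / 2


def clustering_from_counts(n, c, p):
    """C = 3 * (# triangles) / (# connected triples)."""
    return 3 * triangles(n, c) / connected_triples(n, c, p)


def clustering_closed_form(c, p):
    """Simplified form: 3(c-2) / [4(c-1) + 8cp + 4cp^2]; n cancels out."""
    return 3 * (c - 2) / (4 * (c - 1) + 8 * c * p + 4 * c * p ** 2)


if __name__ == "__main__":
    for p in (0.0, 0.01, 0.1, 0.5, 1.0):
        print(p, clustering_from_counts(600, 6, p), clustering_closed_form(6, p))
```

Both expressions agree for every p, and n drops out, which is why the clustering coefficient of this variant does not vanish for large networks.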
Contrast with the original WS-SWM • where edges are removed from the circle • CC → 0 as n → ∞ when p=1 • since the network is then an RG • Fig 15.5 of N-N shows • CC as a function of p for the SWM with c=6
Clustering coefficient and average path length in the small-world model. • The solid line shows the clustering coefficient, Eq. (15.7), for a small-world model with c = 6 and n = 600, as a fraction of its maximum value CCmax = 0.6, plotted as a function of the parameter p. • The dashed line shows the average geodesic distance between vertices for the same model as a fraction of its maximum value ℓmax = n/2c = 50, calculated from the mean-field solution, Eq. (15.14). Note that the horizontal axis is logarithmic.
Average Path Lengths • Calculating the average path length, or mean geodesic distance between pairs of vertices, in the SWM • is harder than the DD or CC • No exact expression is known • Some approximate expressions are known • Approximations and simulations give reasonably accurate results
What is known about path lengths • How they scale with the model parameters • Simple SWM with c=2: • each vertex is connected only to its immediate neighbors • Argument: • Let the distance covered by one edge of the circle be one unit of length, e.g. one meter • Which other quantities in the model have dimensions of length? • The distance around the whole circle: n
The mean distance between the ends of the shortcuts • Suppose there are s shortcuts, hence 2s ends • s = ncp/2 • Average distance between ends around the circle: ξ = n/2s • Once n and ξ are specified, the entire model is specified • from n and ξ to s, and from s to p, when c=2
Ratio of l to n • l: average shortest path • As a function of n and ξ, which specify the model • A ratio of two distances is dimensionless • so it can depend only on dimensionless combinations; the only one is n/ξ • l/n = F(n/ξ) = F(2s) • F(x) does not depend on any of the parameters: a universal scaling function • Hence the mean geodesic path in the SWM with c=2: l = n F(2s)
For larger c • Increasing c decreases the lengths of shortest paths (SP) between vertices • Keep everything else the same: n, s • Going from c=2 to c=4 halves the SPs along the circle • If an SP includes a shortcut, the length of that step does not change • But if the density of shortcuts is low, most of each path lies along the circle • For general c, path lengths are reduced by a factor of c/2: l = (2n/c) F(2s) • Valid for a low density of shortcuts
Alternatively • s = npc/2 • l = 2(n/c) F(npc) • Absorbing the leading factor of 2 into the functional form • by defining f(x) = 2F(x): • l = (n/c) f(npc) • This shows how the average path length depends on the parameters • n, p, c for low values of the shortcut density
We do not know the form of f(x) • Numerical simulation: • generate SW networks • measure the mean distance l between vertices • Many runs with different parameters • The combination cl/n should be the same function of npc • Fig 15.6 of N-N • Many networks, all following roughly the same function
Scaling function for the small-world model. • The points show numerical results for cℓ/n as a function of ncp for the small-world model with a range of parameter values n = 128 to 32768 and p = 1 × 10⁻⁶ to 3 × 10⁻², and two different values of c as marked. • Each point is averaged over 1000 networks with the same parameter values. The points collapse, to a reasonable approximation, onto a single scaling function ƒ(ncp) in agreement with Eq. (15.13). • The dashed curve is the mean-field approximation to the scaling function given in Eq. (15.14)
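A scaled-down version of this numerical experiment can be sketched in Python. It is only an illustration under stated assumptions: the function names are invented, the shortcut-added variant of the model is used, self-loops are skipped because they cannot shorten any path, and the network sizes and number of runs are far smaller than the 1000 networks per point quoted in the caption.

```python
import random
from collections import deque


def small_world_added(n, c, p, rng):
    """Adjacency sets for the shortcut-added small-world model."""
    adj = [set() for _ in range(n)]
    ring = [(v, (v + step) % n) for v in range(n) for step in range(1, c // 2 + 1)]
    for u, v in ring:
        adj[u].add(v)
        adj[v].add(u)
    for _ in ring:                       # one shortcut trial per circle edge
        if rng.random() < p:
            u, v = rng.randrange(n), rng.randrange(n)
            if u != v:                   # a self-loop would not change any distance
                adj[u].add(v)
                adj[v].add(u)
    return adj


def mean_geodesic(adj):
    """Mean shortest-path distance over all ordered pairs of distinct vertices (BFS from each vertex)."""
    n = len(adj)
    total = 0
    for source in range(n):
        dist = [-1] * n
        dist[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total += sum(dist)               # the circle keeps the network connected
    return total / (n * (n - 1))


def scaled_length(n, c, p, runs, seed):
    """Average of c*l/n over `runs` independently generated networks."""
    rng = random.Random(seed)
    values = [c * mean_geodesic(small_world_added(n, c, p, rng)) / n for _ in range(runs)]
    return sum(values) / runs


if __name__ == "__main__":
    x = 8.0                              # target value of the scaling variable ncp
    for n, c in ((128, 2), (256, 2), (256, 4)):
        print(n, c, scaled_length(n, c, x / (n * c), runs=20, seed=0))
```

If the scaling form l = (n/c) f(ncp) holds, the printed values for equal ncp should be roughly similar, with deviations expected at such small sizes and run counts.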
Another approach • Calculate f(x) approximately • Series expansions, distributional or mean-field methods • A mean-field approximation: • becomes exact when the # of shortcuts is very small or very large • In between, around x=1, it is only approximate • Shown as the dashed line in Fig 15.6 of N-N • Agrees well with the numerical results at the ends • less well in the middle • Enough to show that the SWM explains the SW effect
For large values of x: x >> 1, i.e. npc >> 1 • npc: 2 × # of shortcuts • There f(x) ≈ (ln x)/x, so l ≈ ln(npc)/(c²p) • The average l grows logarithmically with n, i.e. very slowly • for fixed p and c • Even for very large n • the average l remains small • SW effect
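The two limits of the mean-field scaling function can be explored numerically. The sketch below assumes that Eq. (15.14) of N-N has the form f(x) = 2/√(x² + 4x) · tanh⁻¹ √(x/(x + 4)); the slides cite the equation without reproducing it, so this expression is quoted from memory and should be checked against the source. The function names are illustrative.

```python
import math


def f_mean_field(x):
    """Assumed mean-field scaling function (see lead-in); requires x > 0."""
    return 2.0 / math.sqrt(x * x + 4.0 * x) * math.atanh(math.sqrt(x / (x + 4.0)))


def mean_path_length(n, c, p):
    """l = (n/c) f(ncp), valid for p > 0 and a low density of shortcuts."""
    return (n / c) * f_mean_field(n * c * p)


if __name__ == "__main__":
    # Small x: f tends to 1/2, so l tends to the circle value n/2c.
    print(f_mean_field(1e-8))
    # Large x: f(x) approaches (ln x)/x, so l grows only like ln(ncp)/(c^2 p).
    for x in (1e2, 1e4, 1e6):
        print(x, f_mean_field(x), math.log(x) / x)
    # Parameters of Fig 15.5: n = 600, c = 6.
    for p in (1e-6, 1e-3, 1e-2, 1e-1):
        print(p, mean_path_length(600, 6, p))
```

With this form, f approaches 1/2 when there are almost no shortcuts, recovering l = n/2c, and drops rapidly once ncp exceeds about 1.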
Requires the total # of shortcuts (not the # of shortcuts per vertex) to be large • Adding a small density of random shortcuts to a large network produces SW behavior • Explains why most real networks show the SW effect • They have long-range connections with some randomness • Very few are regular or have only short-range connections
The SWM shows not only • the SW effect but also clustering • # of shortcuts = npc/2 → ∞ • as n → ∞ holding p and c fixed • CC is independent of n • and remains non-zero as n → ∞ • In this limit, simultaneously: • non-zero clustering and the SW effect • The two are not at odds with one another
Fig 15.5 of N-N • Plot of approximate values of l as a function of p • for a SWM with n=600, c=6 • Along with the CC • For a substantial range of values of p • the value of l is low • while the value of CC is high