DYNAMICS OF COMPLEX SYSTEMS Self-similar phenomena and Networks

DYNAMICS OF COMPLEX SYSTEMS Self-similar phenomena and Networks Guido Caldarelli CNR-INFM Istituto dei Sistemi Complessi Guido.Caldarelli@roma1.infn.it 4/6

STRUCTURE OF THE COURSE • SELF-SIMILARITY (ORIGIN AND NATURE OF POWER-LAWS) • GRAPH THEORY AND DATA • SOCIAL AND FINANCIAL NETWORKS • MODELS • INFORMATION TECHNOLOGY • BIOLOGY

STRUCTURE OF THE FOURTH LECTURE 4.1) DEFINITION OF THE MODELS 4.2) RANDOM GRAPHS 4.3) SMALL WORLD 4.4) MULTIPLICATIVE PROCESSES 4.5) BARABASI-ALBERT 4.6) REWIRING 4.7) FITNESS

4.1 MODELS DEFINITIONS Standard Theory of Random Graph(Erdös and Rényi 1960) Small World(D. Watts and S.H. Strogatz 1998) P(k) k Degrees are Poisson distributed Random Graphs are composed by starting with n vertices. With probability p two vertices are connected by an edge Small World Graph are composed by adding shortcuts to regular lattices Degrees are peaked around mean value

4.1 MODELS DEFINITIONS Model of Growing Networks(A.-L. Barabási – R. Albert 1999) 1) Growth Every time step new nodes enter the system 2) Preferential Attachment The probability to be connected depends on the degree P(k)  k Degrees are Power law distributed “Intrinsic” Fitness/Static/Hidden variable Models (K.-I. Goh, B. Khang, D. Kim 2001 G.Caldarelli A. Capocci, P.De Los Rios, M.A. Muñoz 2002) 1) Growth or not Nodes can be fixed at the beginning or be added 2) Attachment is related to intrinsic properties The probability to be connected depends on the sites Degrees are Power law distributed

4.2 RANDOM GRAPHS • The number m of edges in a Random Graph is a random variable whose expectation value is • The probability to form a particular Graph G(N,m) is given by • The degree has expectation value It is easy to check that the degree probability distribution is given by

4.2 RANDOM GRAPHS • We can give an estimate of the Clustering Coefficientfor a complete graph it must be 1.If the graph is enough sparse then two points link each other with probability p • Same estimate can be given for the average distance l between two vertices.If a graph has <k> average degree then the first neighbours will be <k>the second neighbours ~ <k>2……………..the n-th neighbours ~ <k>n • For the Diameter D →<k>D of order N

4.3 SMALL WORLD Take a regular lattice and rewire with probability f some of the links(for analytical treatment, a slight modification is recommended:Instead of rewiring add the new links proportional to the existing links) The total number of shortcuts is Average degree is now Therefore for small f the degree distribution is peaked around 2c

We have that in the regular lattice (start with c=1 and generalize) → We have that in the Random Graph • 4.3 SMALL WORLD Clustering Coefficient of the regular lattice (f → 0 and <k> < 2/3N otherwise C=1) For the average distance there is no resultbut we can define a distance in the problem, given by the mean distance between two shortcuts endpoints.

4.3 SMALL WORLD Several conjectures, made but neither the actual distribution of path lengths nor the <l> has been found Now in Small World graphs, the behaviour must be intermediate between the regular lattice and Random Graph.If we define a characteristic length in the system as for example x = average distance between two endpoints of shortcuts (not the same!) xdiverges when f→ 0 • is characteristic distance we can define in the model so that we make the ansatz

4.4 MULTIPLICATIVE PROCESS In a multiplicative process you have S(t)=e(t)S(t-1)=e(t) e(t-1)…e(1)S(0) For the central limit theorem the log of S(t) is normal distributed. If variance is large it can look as a power law with apparent slope m/s 2 -1

4.4 MULTIPLICATIVE PROCESS In this case the apparent slope (m/s2-1) of blue line is 0.6 (m=4) If there is a threshold on the S YOU OBTAIN REAL POWER-LAWS INSTEAD M. Mitzenmacher, Internet Mathematics1 226 (2004)

4.5 BARABASI-ALBERT • This is by far the most successful and used model in the field • along with the related models • FITNESS MODEL • REWIRING MODEL • AGING EFFECTS • TWO STEPS • GROWTH: Every time step you add a vertex • PREFERENTIAL ATTACHMENT: • This idea has been reformulated in different fields and has different names • YULE PROCESS (G. Yule Phyl. Trans. Roy. Soc. 213 21 (1925) • SIMON PROCESS (H.A. Simon Biometrika 42 425 (1955) • DE SOLLA PRICE MODEL (D.J. De Solla Price Science149 510 (1965) • ST. MATTHEW EFFECT (K.R. Merton Science159 56 (1968)

4.5 BARABASI-ALBERT The basic approach is through continuum theory, degree is now a continuum variable: Start with m0 vertices and add for every tm new links As for the degree distribution we can compute the P(ki<k) R. Albert, A.-L. Barabási Review of Modern Physics7447 (2001)

4.5 BARABASI-ALBERT The distribution of incoming vertices is uniform in time From which we obtain

4.5 BARABASI-ALBERT The value of the exponent g depends on details of preferential attachment • If P(k)~ka NO POWER LAW • If P(k)~k +a g=3+a/m Clustering is larger than Erdos Renyi (m>1…) No clear Assortative/Disassortative behaviour

4.5 BARABASI-ALBERT: Fitness • TWO STEPS • GROWTH: Every time step you add a vertex • PREFERENTIAL ATTACHMENT: • But now vertices differ, some are good, some are bad, • you measure that by assigning a ``fitness’’ hi For some choices of the distribution of fitnesses r(h) you still have Power law degree distribution and also assortativeness. Great success in reproducing Internet (A.Vazquez, R. Pastor-Satorras, A. Vespignani Phys. Rev. E 65 066130 (2002) ) G. Bianconi A.-L. Barabási Europhys. Lett. 54 436 (2001)

4.5 BARABASI-ALBERT: Rewiring P.L. Krapivsky, G.J. Rodgers, S. Redner Phys. Rev. Lett. 86 5401 (2001) M. Catanzaro, G. Caldarelli, L. Pietronero Phys. Rev. E 70 037101 (2004). • TWO STEPS • PROBABILITY p • GROWTH: Every time step you add a vertex • PREFERENTIAL ATTACHMENT • PROBABILITY 1-p • REWIRING of existing nodes

4.5 BARABASI-ALBERT: Aging Only m vertices enter in the dynamics. Those are the ACTIVE SITES • DIFFERENT STEPS • GROWTH: Every time step you add a vertex • This new vertex draw a link with all the m active vertices. • AGING: A vertex is deactivated with a probability proportional • to (ki+a)-1 K. Klemm and V. M. Eguíluz Phys. Rev. E 65 036123 (2002)

4.6 REWIRING Consider the WWW. What is the “microscopic” process of growth? • You see a WWW page that you like (i.e. that of a friend of yours) • You copy it, and change a little bit • TWO STEPS • GROWTH: Every time step you copy a vertex and its m edges • MUTATION (for everyone of m edges) • With Probability (1-a)you keep it • With Probability ayou change destination vertex R. Kumar et al. Computer Networks 31 1481 (1999) A. Vazquez, et al. Nature Biotechnology 21 697 (2003)

4.6 REWIRING The rate of change is given by It becomes clear we have an effective preferential attachment. It can be demonstrated (NOT HERE!)

4.6 REWIRING In the completely different context of protein interaction networks the same mechanism is in agreement with the current view of genome evolution. When organisms reproduce, the duplication of their DNA is accompanied by mutations. Those mutations can sometimes entail a complete duplication of a gene. Since in this case the corresponding protein can be produced by two different copies of the same gene, point-like mutations on one of them can accumulate at a rate faster than normal since a weaker selection pressure is applied. Consequently, proteins with new, properties can arise by this process. The new proteins arising by this mechanism share many physico-chemical properties with their ancestors. Many interactions remain unchanged, some are lost and some are acquired. • CLUSTERING MUCH SIMILAR TO THAT OF WWW

4.6 FITNESS MODEL Without introducing growth or preferential attachment we can have power-laws We consider “disorder” in the Random Graph model (i.e. vertices differ one from the other). This mechanism is responsible of self-similarity in Laplacian Fractals • Dielectric Breakdown • In reality • In a perfect dielectric K.-I. Goh, B. Khang, D. Kim Phys. Rev. Lett 87 278701, 2001 G.Caldarelli et al. Phys Rev. Lett. 89258702 2002

4.6 FITNESS MODEL • Assign to every vertex one real positive number x that we call fitness. • fitnesses are drawn from probablity distribution r(x) • Link two vertices with fitnesses x and y according to a probability • function f(x,y)=f(y,x) (choice function). STATIC if N is kept fixed The model can be considered DYNAMIC if N is growing This is a GOOD GETS RICHER model No preferential attachment is present. V.D.P. Servedio, P. Buttà, G. Caldarelli Phys. Rev. E 70 027102 (2004).

4.6 FITNESS MODEL Different realizations of the model a) b) c) have r(x) power law with exponent 2.5 ,3 ,4 respectively. d) has r(x)=exp(-x) and a threshold rule.

4.6 FITNESS MODEL Degree distribution for cases a) b) c) with r(x) power law with exponent 2.5 ,3 ,4 respectively. Degree distribution for the case d) with r(x)=exp(-x) and a threshold rule.

4.6 FITNESS MODEL The Degree probability distribution P(k) is a functional of r(x) and f(x,y). DIRECT PROBLEM Given a fitness r(x) → which choice function f(x,y) produces scale free graphs? i.e. P(k) = cka INVERSE PROBLEM Given a choice function f(x,y)→ whichfitness r(x) produces scale free graphs? i.e. P(k) = cka

4.6 FITNESS MODEL • Fitness probability distribution Non decreasing • Vertex degree • Vertex degree Probability Distribution

4.6 FITNESS MODEL • Degree Correlation • Vertex Clustering Coefficient

4.6 FITNESS MODEL We impose P(k)=c(k(x))a → Multipling both sides of the equation for k’(x) and integrating from 0 to x

4.6 FITNESS MODEL We now have a constraint on the fitness distribution r(x) and choice function f(x,y) Some exact results

4.6 FITNESS MODEL Special case f(x,y)=g(x)g(y)

4.6 FITNESS MODEL Special case f(x,y)=f(x+y)

4.6 FITNESS MODEL Special case f(x,y)=f(x-y)

4.6 FITNESS MODEL • Using the intrinsic fitness model it is possible to create scale-free networks with any desired power-law exponent • This is possible for any fitness probability distribution r(x), it does not matter if they are (e.g.) exponential, power-law or Gaussian. • We found analytic expressions for the choice function f(x,y) in three • cases: • f(x,y)=f(x)f(y) for everyr(x), • f(x,y)=f(x  y) r(x)=e-x • If f(x,y)=f(x)f(y) both vertex degree correlation and clustering coefficient are constant

CONCLUSIONS • There are plenty of models around, to check what is more likely to reproduce the data we have to check a series of quantities • Degree distribution • Assortativity • Clustering, Motifs etc. Not all the ingredients are equally likely: • RANDOM GRAPH: You choose your partner at random. • BARABÁSI-ALBERT: To choose your partner: • You must know how many partners she/he already had • The larger this number, the better • COPYING: You choose the partners of your close friends • INTRINSIC FITNESS You choose your partner if you like her/him

DYNAMICS OF COMPLEX SYSTEMS Self-similar phenomena and Networks