Introduction of Network Science Prof. Cheng-Shang Chang (張正尚教授) Institute of Communications Engineering, National Tsing Hua University, Hsinchu, Taiwan
Outline • What is network science? • A brief history of network science • Review of the mathematics of networks • Diffusion, distributed averaging, random gossip, synchronization • Network formation • Structure of networks (Community detection) • Conclusion
What is network science? • 2005 National Research Council of the National Academies • “Organized knowledge of networks based on their study using the scientific method” • Social networks, biological networks, communication networks, power grids, …
A visualization of the network structure of the Internet at the level of “autonomous systems” (Newman, 2003)
A food web of predator-prey interactions between species in a freshwater lake (Newman, 2003)
Power grid map http://www.treehugger.com/files/2009/04/nprs-interactive-power-grid-map-shows-whos-got-the-power.php
Citation networks http://www.public.asu.edu/~majansse/pubs/SupplementIHDP.htm
Two key ingredients • The study of a collection of nodes and links (graphs) that represent something real • The study of the dynamic behavior of the aggregation of nodes and links
Definition of Network • G(t)={V(t), E(t), f(t): J(t)} • t: time • V: node (vertex, actor) • E: link (edge) • f: NxN topology (adjacency matrix) • J: algorithm for the evolution of the network (microrule)
Definition of Network Science (by Ted G. Lewis) • The study of the theoretical foundations of network structure and dynamic behavior, and the application of networks to many subfields • Social network analysis (SNA) • Collaboration networks (citations, online social networks) • Emergent systems (power grids, the Internet) • Physical science systems (phase transition, percolation theory) • Life science systems (epidemics, metabolic processes)
A brief history: The pre-network period (1736-1966) • 1736 Leonhard Euler: the Seven Bridges of Königsberg problem • 1925 Yule: preferential attachment • An explanation for the evolution of the Internet and WWW • 1927 Kermack and McKendrick: epidemic model (diffusion of innovation, the spread of information) • 1959-1960 Erdős and Rényi: random graph model
The meso-network period (1967-1998) • 1967 Stanley Milgram • “Six degrees of separation” • Communication project • If you do not know the target person, forward the request to a personal acquaintance • Small-world effect: the diameter of a network increases as ln(n)
The meso-network period (1967-1998) • 1972 Bonacich: influence network • Distributed consensus • Kirchhoff’s network: the value of a node is equal to the difference between the sum of values from input and output links • States and differential equations • Fixed point (steady state) • 1984 Kuramoto: synchronization in coupled oscillators
The modern period (1998-present) • 1998 Holland: emergence as the final state (of the fixed point problem) • 1998 Watts and Strogatz: a generative procedure of rewiring the links in a regular graph • The small-world model • Crossover point and phase transition
The modern period (1998-present) • 1999 M. Faloutsos, P. Faloutsos and C. Faloutsos: observed a power law in their graph of the Internet • 1999 Barabasi: math model for scale-free networks • 2000 Dorogovtsev: power law in many biological systems • 1999 Kleinberg: power law in webgraph • 2002 Girvan and Newman: community structure
The modern period (1998-present) • Atay network (a generalization of the Kirchhoff network) • Emergence and synchronization: • Heart beating • The chirping of crickets • Distributed consensus • Propagation of influence
Networks and their representations • A network is a graph • Vertices (nodes, sites, actors) • Edges (links, bonds, ties) • n: number of nodes • m: number of edges • Multiedges • Self-edges (self-loops) • Simple network (simple graph): a network that has neither self-edges nor multiedges • Multigraph: a network with multiedges [Figure: a network on vertices 1 to 4 with a self-edge and a multiedge labeled]
Adjacency matrix • A: an n×n matrix • $A_{ij}=1$ if there is an edge between vertices i and j. • $A_{ij}=0$ otherwise. • For a network with no self-edges, the diagonal elements are all zero. • For an undirected network, A is symmetric. [Figure: an example undirected network on vertices 1 to 4 and its adjacency matrix A]
Directed networks • Adjacency matrix: $A_{ij}=1$ if there is an edge from j to i. • With self-edges: $A_{ii}=1$ for a single edge from vertex i to itself in a directed network. [Figure: an example directed network on vertices 1 to 4 and its adjacency matrix A]
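The two adjacency-matrix conventions above can be sketched in a few lines of Python. This is a minimal illustration, not the network pictured on the slides: the function name `adjacency_matrix`, the 0-indexed vertex labels, and the edge list are all assumptions made for the example.

```python
def adjacency_matrix(n, edges, directed=False):
    """Return the n x n adjacency matrix (list of lists) for 0-indexed vertices.

    Each edge is a pair (j, i). For a directed network it runs from j to i and
    sets A[i][j], following the convention A_ij = 1 for an edge from j to i.
    """
    A = [[0] * n for _ in range(n)]
    for j, i in edges:
        A[i][j] = 1
        if not directed:
            A[j][i] = 1  # undirected: the matrix is symmetric
    return A


# Hypothetical 4-vertex edge list (an assumption for illustration).
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
A_undirected = adjacency_matrix(4, edges)
A_directed = adjacency_matrix(4, edges, directed=True)
```

With no self-edges in the edge list, the diagonal of both matrices stays zero, and only the undirected matrix is symmetric.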
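Degree • The degree of a vertex is the number of edges connected to it. • $k_i$: the degree of vertex i, $k_i = \sum_{j=1}^{n} A_{ij}$ • m: number of edges • There are 2m ends of edges (every edge has two ends), so $2m = \sum_{i=1}^{n} k_i$, i.e., $m = \frac{1}{2}\sum_{i=1}^{n} k_i = \frac{1}{2}\sum_{i,j} A_{ij}$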
Mean degree • c: the mean degree of a vertex in an undirected graph, $c = \frac{1}{n}\sum_{i=1}^{n} k_i = \frac{2m}{n}$ • The maximum possible number of edges in a simple graph is $\binom{n}{2} = n(n-1)/2$.
Density • Density (connectance): the fraction of the maximum number of edges that are actually present, $\rho = \frac{m}{\binom{n}{2}} = \frac{2m}{n(n-1)} = \frac{c}{n-1}$ • For a large network (n is very large), $\rho \approx c/n$
Density • A (large) network is said to be dense if the density ρ tends to a nonzero constant as $n \to \infty$ • On the other hand, it is said to be sparse if ρ tends to 0 as $n \to \infty$
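As a small worked example of degree, mean degree, and density, the sketch below computes them from an adjacency matrix. The 4-vertex matrix is hypothetical, and the variable names are chosen here for illustration.

```python
# Hypothetical 4-vertex undirected simple network (adjacency matrix).
A = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
n = len(A)

degrees = [sum(row) for row in A]   # k_i = sum_j A_ij
m = sum(degrees) // 2               # every edge has two ends, so sum_i k_i = 2m
mean_degree = 2 * m / n             # c = 2m / n
density = 2 * m / (n * (n - 1))     # rho = m / C(n, 2) = c / (n - 1)

print(degrees, m, mean_degree, density)
```

Note that the density equals c/(n-1) exactly, while c/n is only the large-n approximation.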
Regular graphs • A regular graph is a graph in which all the vertices have the same degree. • k-regular graph: every vertex has degree k • 2-regular: ring • 4-regular: square lattice
Path • Path: a sequence of vertices in which each consecutive pair is connected by an edge • Self-avoiding path: a path that does not intersect itself • Length of a path: the number of edges in the path • If there is a path of length 2 from j to i via k, then $A_{ik}A_{kj}=1$.
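Paths and adjacency matrix • $N_{ij}^{(2)} = \sum_{k=1}^{n} A_{ik}A_{kj} = [A^2]_{ij}$: the number of paths of length 2 from j to i • $N_{ij}^{(3)} = \sum_{k,l=1}^{n} A_{ik}A_{kl}A_{lj} = [A^3]_{ij}$: the number of paths of length 3 from j to i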
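The path-counting rule can be checked directly with matrix powers. The sketch below uses a hypothetical 4-vertex network and a hand-written `matmul` helper (both assumptions for illustration). Paths counted this way are allowed to revisit vertices, so they are not necessarily self-avoiding.

```python
def matmul(X, Y):
    """Multiply two square matrices given as lists of lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


# Hypothetical undirected network: a triangle 0-1-2 with vertex 3 attached to 2.
A = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]

A2 = matmul(A, A)    # A2[i][j] = number of paths of length 2 from j to i
A3 = matmul(A2, A)   # A3[i][j] = number of paths of length 3 from j to i

print(A2[0][3])  # paths of length 2 from vertex 3 to vertex 0 (via vertex 2)
print(A3[0][0])  # closed paths of length 3 through vertex 0 (the triangle, both directions)
```

The diagonal of A2 reproduces the degrees, since each edge at a vertex gives one out-and-back path of length 2.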
Geodesic paths • A geodesic path (shortest path) is a path between two vertices such that no shorter path exists • Geodesic distance (shortest distance): the length of a geodesic path • The smallest value r such that $[A^r]_{ij} > 0$ • Geodesic paths are self-avoiding (Why?) • Geodesic paths are not necessarily unique
Diameter • The diameter d of a graph is the length of the longest geodesic path between any pair of vertices in the network. • Suppose that $d_{ij}$ is the geodesic distance between vertices i and j; then $d = \max_{i,j} d_{ij}$
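Geodesic distances and the diameter are usually computed with breadth-first search rather than matrix powers. The following is a minimal sketch under the assumption of an unweighted, connected network stored as an adjacency list; the function names and the example network are hypothetical.

```python
from collections import deque


def geodesic_distances(adj, source):
    """Breadth-first search: geodesic distance from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:          # first visit is along a shortest path
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def diameter(adj):
    """Longest geodesic distance over all pairs (assumes a connected network)."""
    return max(max(geodesic_distances(adj, s).values()) for s in adj)


# Hypothetical network: a triangle 0-1-2 with vertex 3 attached to vertex 2.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
print(geodesic_distances(adj, 0))
print(diameter(adj))
```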
Components • A network is connected if there is a path from every vertex to any other vertex. • Disconnected networks can be separated into several components. • Components: • There is a path from every vertex in the subnetwork to any other vertex in the same subnetwork. • No other vertex can be added while preserving this property.
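The definition of a component translates into a simple graph search: start from an unvisited vertex, collect everything reachable from it, and repeat. The sketch below assumes an undirected network given as an edge list on 0-indexed vertices; the function name and example are illustrative only.

```python
def components(n, edges):
    """Return the components of an undirected network as sorted vertex lists."""
    adj = {v: [] for v in range(n)}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    seen = set()
    result = []
    for start in range(n):
        if start in seen:
            continue
        # Depth-first search collects every vertex reachable from `start`.
        stack = [start]
        seen.add(start)
        comp = []
        while stack:
            u = stack.pop()
            comp.append(u)
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        result.append(sorted(comp))
    return result


# Hypothetical 6-vertex network: a path 0-1-2, an edge 3-4, and isolated vertex 5.
parts = components(6, [(0, 1), (1, 2), (3, 4)])
print(parts)
```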
Diffusion • Diffusion is the process by which gas moves from regions of high density to regions of low density, driven by the relative pressure of the different regions. • Diffusion in a network (Influence network): • The spread of an idea • The spread of a disease
Diffusion in a network • Suppose that we have some commodity on the vertices. • Let $\psi_i(t)$ be the amount of the commodity at vertex i at time t • Suppose that the commodity moves from vertex j to an adjacent vertex i at rate $C(\psi_j - \psi_i)$ • C is called the diffusion constant.
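Governing equation for diffusion in a network • $\frac{d\psi_i}{dt} = C\sum_j A_{ij}(\psi_j - \psi_i) = C\sum_j A_{ij}\psi_j - C k_i \psi_i = C\sum_j (A_{ij} - \delta_{ij}k_i)\psi_j$ • $k_i$ is the degree of vertex i • $\delta_{ij}$ is the Kronecker delta, which is 1 if i=j and 0 otherwise.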
Governing equation for diffusion in a network • Let D be the diagonal matrix with vertex degrees along its diagonal. • Graph Laplacian: L=D-A • In matrix form, $\frac{d\psi}{dt} = C(A - D)\psi = -CL\psi$ • A system of linear differential equations
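Solving the system of linear differential equations • Suppose $v_i$ and $\lambda_i$ are the ith eigenvector and eigenvalue of the Laplacian L. • Guess the solution has the form $\psi(t) = \sum_i a_i(t) v_i$ • Substituting into $d\psi/dt = -CL\psi$ gives $\frac{da_i}{dt} = -C\lambda_i a_i$, so $a_i(t) = a_i(0)e^{-C\lambda_i t}$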
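The diffusion dynamics can also be checked numerically. The sketch below integrates the per-vertex diffusion equation with a plain forward-Euler step instead of the eigenvector expansion; the symbol psi for the amount of commodity, the step size `dt`, the number of steps, and the example network are all assumptions made for illustration.

```python
def diffuse(A, psi0, C=1.0, dt=0.01, steps=5000):
    """Forward-Euler integration of d(psi_i)/dt = C * sum_j A_ij * (psi_j - psi_i)."""
    n = len(A)
    psi = list(psi0)
    for _ in range(steps):
        flow = [C * sum(A[i][j] * (psi[j] - psi[i]) for j in range(n))
                for i in range(n)]
        psi = [psi[i] + dt * flow[i] for i in range(n)]
    return psi


# Hypothetical connected network: a triangle 0-1-2 with vertex 3 attached to 2.
A = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]

# All of the commodity starts on vertex 0.
psi_final = diffuse(A, [4.0, 0.0, 0.0, 0.0])
print(psi_final)
```

On a connected network the total amount is conserved and every vertex approaches the mean, which matches the eigenvector solution: only the mode with eigenvalue 0 survives as t grows.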
Eigenvalues of the graph Laplacian • The Laplacian is symmetric. • It has real eigenvalues. • The Laplacian is positive-semidefinite. • All its eigenvalues are nonnegative. • The vector (1,1,…,1) is an eigenvector with eigenvalue 0.
Algebraic connectivity • The number of zero eigenvalues of the Laplacian is the number of components. • For a network with several components, the Laplacian can be written in block-diagonal form, with one block per component. • The network is connected if and only if the second smallest eigenvalue of the Laplacian is nonzero. • Algebraic connectivity: the second smallest eigenvalue of the Laplacian
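Two of the Laplacian properties used here can be verified by hand on a small example. The sketch below (hypothetical network and helper names, standard library only) shows that each component contributes a zero eigenvector, namely its indicator vector, and that the quadratic form x^T L x equals a sum of squared differences over the edges, which is why no eigenvalue can be negative.

```python
def laplacian(n, edges):
    """Graph Laplacian L = D - A for an undirected simple network."""
    L = [[0] * n for _ in range(n)]
    for u, v in edges:
        L[u][v] -= 1
        L[v][u] -= 1
        L[u][u] += 1
        L[v][v] += 1
    return L


def matvec(M, x):
    """Matrix-vector product for a list-of-lists matrix."""
    return [sum(m * xj for m, xj in zip(row, x)) for row in M]


# Hypothetical network with two components: {0, 1, 2} and {3, 4}.
edges = [(0, 1), (1, 2), (3, 4)]
L = laplacian(5, edges)

# One zero eigenvector per component: the indicator vector of that component.
zero_a = matvec(L, [1, 1, 1, 0, 0])
zero_b = matvec(L, [0, 0, 0, 1, 1])

# Positive semidefiniteness: x^T L x equals the sum of (x_u - x_v)^2 over edges.
x = [3, -1, 2, 0, 5]
quad = sum(xi * yi for xi, yi in zip(x, matvec(L, x)))
edge_sum = sum((x[u] - x[v]) ** 2 for u, v in edges)

print(zero_a, zero_b, quad, edge_sum)
```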
Distributed averaging consensus • Lin Xiao and Stephen Boyd, “Fast linear iterations for distributed averaging,” Systems & Control Letters, 53 (2004) 65-78. • Consider a network (connected graph) G=(V,E) • Each vertex i holds an initial scalar value $x_i(0)$ in R, and $x(0)=(x_1(0),\ldots,x_n(0))$ • Two vertices can communicate with each other if and only if they are neighbors. • The problem is to compute the average of the initial values, $\frac{1}{n}\sum_{i=1}^{n} x_i(0)$, via a distributed algorithm
Motivation • Sensor networks (measuring temperature) • A flock of flying birds
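Distributed linear iterations • Constant edge weights: $x_i(t+1) = x_i(t) + \alpha \sum_{j \in N_i} (x_j(t) - x_i(t))$, where $N_i$ is the set of neighbors of vertex i • In matrix form, $x(t+1) = (I - \alpha L)\,x(t) = W x(t)$ • L=D-A is the Laplacian of the graph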
Distributed linear iterations • $W = I - \alpha L$ • The vector (1,1,…,1) is an eigenvector of the Laplacian L with eigenvalue 0. • L is symmetric for an undirected graph • W is a doubly stochastic matrix, i.e., all its row sums and column sums are equal to 1. • If W is a nonnegative matrix, then W can be viewed as the probability transition matrix of a Markov chain and (provided the chain is irreducible and aperiodic) $\lim_{t\to\infty} W^t = \frac{1}{n}\mathbf{1}\mathbf{1}^T$, • where $\mathbf{1}\mathbf{1}^T$ is a matrix with all its elements being 1.
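Condition for convergence • As $t \to \infty$, $x(t) = W^t x(0) \to \frac{1}{n}\mathbf{1}\mathbf{1}^T x(0)$, i.e., every $x_i(t)$ converges to the average of the initial values • The condition for W to be a nonnegative matrix: $1 - \alpha k_i \ge 0$ for all i • $k_i$ is the degree of vertex i • Distributed linear iteration is guaranteed to converge if $0 < \alpha < 1/\max_i k_i$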
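A minimal sketch of the distributed linear iteration with a constant edge weight follows. The name `alpha` for the edge weight, the choice of 0.8 divided by the maximum degree, the iteration count, and the example network are assumptions for illustration; each vertex only uses the values of its own neighbors.

```python
def average_consensus(adj, x0, alpha, iterations=200):
    """Iterate x_i <- x_i + alpha * sum over neighbors j of (x_j - x_i)."""
    x = list(x0)
    for _ in range(iterations):
        # All vertices update synchronously from the previous values.
        x = [x[i] + alpha * sum(x[j] - x[i] for j in adj[i])
             for i in range(len(x))]
    return x


# Hypothetical connected network: a triangle 0-1-2 with vertex 3 attached to 2.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}

k_max = max(len(neighbors) for neighbors in adj.values())
alpha = 0.8 / k_max   # any value strictly between 0 and 1/k_max; 0.8 is arbitrary

x_final = average_consensus(adj, [1.0, 3.0, 5.0, 7.0], alpha)
print(x_final)   # every entry should be close to the initial average, 4.0
```

Choosing alpha below 1/k_max keeps every diagonal entry of W = I - alpha L positive, which is the nonnegativity condition stated above.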
Randomized gossip algorithms • Stephen Boyd, Arpita Ghosh, Balaji Prabhakar, and Devavrat Shah, “Randomized gossip algorithms,” IEEE Transactions on Information Theory, vol. 52, no. 6, pp. 2508-2530, June 2006. • Gossip algorithm: an algorithm in which each node can communicate with no more than one neighbor in each time slot. • Consider a network (connected graph) G=(V,E) • Each vertex i holds an initial scalar value $x_i(0)$ in R, and $x(0)=(x_1(0),\ldots,x_n(0))$ • The problem is to compute the average of the initial values, $\frac{1}{n}\sum_{i=1}^{n} x_i(0)$, via a gossip algorithm
Asynchronous time model • Each vertex has a clock that ticks at the times of a rate-1 Poisson process. • The superposition of independent Poisson processes is also a Poisson process, with rate equal to the sum of the rates of the original processes. • Uniformization: equivalently, consider a single Poisson process with rate n for the clock ticks (as there are n vertices). • Each clock tick belongs to vertex i with probability 1/n.
Asynchronous time model • In the kth time slot, let node i’s clock tick and let it contact some neighboring node j with probability $P_{ij}$. • Both vertices set their values equal to the average of their current values. • With probability $P_{ij}/n$, the random matrix W(k) is $W(k) = \frac{1}{2}(I + Q)$, so that $x(k+1) = W(k)\,x(k)$, • where Q is the permutation matrix that interchanges the ith and jth coordinates.
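The asynchronous gossip step can be simulated directly. The sketch below assumes that the contact probabilities are uniform over each vertex's neighbors, which is one possible choice and not something the slides specify; the function name, seed, slot count, and example network are likewise illustrative.

```python
import random


def randomized_gossip(adj, x0, slots=2000, seed=0):
    """Simulate pairwise-averaging gossip for a fixed number of time slots."""
    rng = random.Random(seed)
    x = list(x0)
    n = len(x)
    for _ in range(slots):
        i = rng.randrange(n)       # the clock tick belongs to vertex i w.p. 1/n
        j = rng.choice(adj[i])     # assumption: P_ij is uniform over i's neighbors
        x[i] = x[j] = (x[i] + x[j]) / 2   # both take the average of their values
    return x


# Hypothetical connected network: a triangle 0-1-2 with vertex 3 attached to 2.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}

x_gossip = randomized_gossip(adj, [1.0, 3.0, 5.0, 7.0])
print(x_gossip)
```

Each pairwise average leaves the sum of the values unchanged, so if the values converge they can only converge to the average of the initial values.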
Spread of information • Other objective functions, e.g., max, min. • How fast is information distributed over a network via a randomized gossip algorithm? • Start from the initial state x(0)=(1,0,…,0), i.e., only the first vertex has the information. • If xi(t)>0, then vertex i must have been “visited” (at least once) by time t via the randomized gossip algorithm. • can be used to bound the probability that all the vertices received the information.
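The information-spreading view can be explored with the same gossip step: start from x(0)=(1,0,…,0) and count time slots until every entry is positive. This is a rough simulation sketch, assuming uniform neighbor selection and a hypothetical ring of 8 vertices; it measures one random run and does not compute the probability bound mentioned on the slide.

```python
import random


def slots_until_all_informed(adj, seed=0, max_slots=100000):
    """Count gossip slots until every vertex holds a positive value."""
    rng = random.Random(seed)
    n = len(adj)
    x = [1.0] + [0.0] * (n - 1)    # only vertex 0 has the information
    for slot in range(1, max_slots + 1):
        i = rng.randrange(n)       # vertex whose clock ticks
        j = rng.choice(adj[i])     # assumption: uniform choice among neighbors
        x[i] = x[j] = (x[i] + x[j]) / 2
        if all(v > 0 for v in x):
            return slot
    return None                    # not reached within max_slots


# Hypothetical ring of 8 vertices (a 2-regular graph).
n = 8
ring = {v: [(v - 1) % n, (v + 1) % n] for v in range(n)}

slots = slots_until_all_informed(ring)
print(slots)
```

Since one slot can inform at most one new vertex, the count is never smaller than n-1.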
Influence network • $x_i(t)$: the degree of influence (power) of vertex i at time t • $W_{ij}$: the influence from j to i • We still have $x(t+1) = W x(t)$ • But the weight matrix W is much more complicated. It may not be nonnegative or doubly stochastic. • Convergence might be a problem.