Programming for Geographical Information Analysis: Advanced Skills. Online mini-lecture: Introduction to Complex Networks Dr Andy Evans. This Lecture. Types of Network Random Spatial Scale-free Small-world Network Statistics. Network types.
Programming for Geographical Information Analysis: Advanced Skills Online mini-lecture: Introduction to Complex Networks Dr Andy Evans
This Lecture Types of Network: Random, Spatial, Scale-free, Small-world. Network Statistics.
Network types Various types of abstract graph have been suggested. We mentioned two in lecture four: the tree and the lattice. Some appear to be more useful for understanding real world social and environmental networks. The simplest of these is the Random Graph. Nodes are connected randomly in some manner.
Erdős–Rényi Construction Produces the simplest Random Graph. Edges are progressively added, with each node having the same probability of being involved.
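The construction above can be sketched in a few lines. This is a minimal illustration, assuming the fixed-edge-count variant of the model in which every pair of nodes is equally likely to receive an edge; the function name `erdos_renyi` and the adjacency-dictionary representation are choices made here, not part of the lecture.

```python
import random
from itertools import combinations


def erdos_renyi(n, m, seed=None):
    """Return a random graph with n nodes and m edges as {node: set(neighbours)}.

    Assumption: every unordered pair of nodes is equally likely to be picked,
    and no pair is picked twice.
    """
    rng = random.Random(seed)
    possible = list(combinations(range(n), 2))
    if m > len(possible):
        raise ValueError("more edges requested than node pairs available")
    graph = {node: set() for node in range(n)}
    # Draw m distinct pairs uniformly at random and join each pair.
    for a, b in rng.sample(possible, m):
        graph[a].add(b)
        graph[b].add(a)
    return graph


g = erdos_renyi(n=10, m=15, seed=1)
degrees = sorted(len(neighbours) for neighbours in g.values())
```

Sorting the degrees gives a quick view of the degree distribution, which for this kind of graph clusters around the average.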
Spatial Graphs Where the ability to connect between nodes is constrained by space. Generally this means a higher probability of connection to nearby nodes. Various types: including random-spatial.
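One possible random-spatial graph can be sketched as follows. The nodes are scattered over a unit square and the connection probability decays exponentially with distance; both the decay form and the `scale` parameter are assumptions made for illustration, as the lecture does not specify a particular rule.

```python
import math
import random


def random_spatial_graph(n, scale=0.2, seed=None):
    """Scatter n nodes in the unit square and connect nearby ones preferentially.

    Assumption: P(connect) = exp(-distance / scale), a hypothetical decay rule.
    Returns (graph, positions) with graph as {node: set(neighbours)}.
    """
    rng = random.Random(seed)
    positions = {i: (rng.random(), rng.random()) for i in range(n)}
    graph = {i: set() for i in range(n)}
    for a in range(n):
        for b in range(a + 1, n):
            distance = math.dist(positions[a], positions[b])
            if rng.random() < math.exp(-distance / scale):
                graph[a].add(b)
                graph[b].add(a)
    return graph, positions


def mean_edge_length(graph, positions):
    """Average Euclidean length of the edges actually present."""
    lengths = [math.dist(positions[a], positions[b])
               for a in graph for b in graph[a] if a < b]
    return sum(lengths) / len(lengths)


spatial, where = random_spatial_graph(60, scale=0.2, seed=3)
```

Comparing `mean_edge_length` with the mean distance between all node pairs shows the spatial constraint: the edges that exist are much shorter than a typical pair separation.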
Caveman Graphs Individual highly-connected groups. No connection between groups.
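A caveman graph is simple enough to build directly. The sketch below assumes equal-sized groups and numbers the nodes consecutively; `caveman_graph` is a name chosen here for illustration.

```python
def caveman_graph(groups, size):
    """Build `groups` fully connected cliques of `size` nodes each.

    No edges run between cliques, so the graph is disconnected
    whenever there is more than one group.
    """
    graph = {}
    for g in range(groups):
        members = range(g * size, (g + 1) * size)
        for node in members:
            # Every node is linked to all other members of its own group.
            graph[node] = set(members) - {node}
    return graph


caves = caveman_graph(groups=3, size=4)
```

Because no path joins two different groups, the distance between them is undefined, which is why the average path length of such a graph is usually treated as infinite.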
Network statistics
Distribution/average of node degree.
Distances:
Eccentricity: distance from a node to the node furthest from it.
Average path length (APL): mean shortest-path distance over all pairs of nodes.
Radius: minimum eccentricity in the graph.
Diameter: maximum eccentricity in the graph.
Global clustering: the number of closed triplets (three nodes all connected to each other, i.e. triadic closures) as a proportion of all connected triplets in the graph.
Network statistics by network type
Trees: low average degree; narrow degree distribution; low clustering; high APL.
Lattices: low average degree; narrow degree distribution; low clustering; high APL.
Random: low average degree; normal degree distribution; low clustering; low APL.
Caveman: high average degree; narrow degree range; high clustering; infinite APL.
Spatial: medium average degree; narrow degree range; medium clustering; long APL.
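These statistics can be computed with a breadth-first search from every node. The sketch below assumes an unweighted, undirected, connected graph stored as an adjacency dictionary, and computes average path length as the mean shortest-path distance over all node pairs; the function names are illustrative only.

```python
from collections import deque
from itertools import combinations


def distances_from(graph, source):
    """Breadth-first search: shortest hop count from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def path_statistics(graph):
    """Eccentricity, radius, diameter and average path length (connected graphs only)."""
    eccentricity = {}
    total = 0
    for node in graph:
        dist = distances_from(graph, node)
        eccentricity[node] = max(dist.values())
        total += sum(dist.values())
    n = len(graph)
    return {
        "eccentricity": eccentricity,
        "radius": min(eccentricity.values()),
        "diameter": max(eccentricity.values()),
        "average_path_length": total / (n * (n - 1)),
    }


def global_clustering(graph):
    """Closed triplets as a proportion of all connected triplets."""
    closed = triplets = 0
    for node, neighbours in graph.items():
        # Each pair of neighbours forms a connected triplet centred on node.
        for a, b in combinations(neighbours, 2):
            triplets += 1
            if b in graph[a]:
                closed += 1
    return closed / triplets if triplets else 0.0


# A triangle (0, 1, 2) with one extra node hanging off node 2.
example = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
stats = path_statistics(example)
```

For this small example the radius is 1, the diameter is 2, and three of the five connected triplets are closed, giving a global clustering of 0.6.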
Scale-free Networks Barabási and Albert looked at real networks, including the internet. They saw that the distribution of links matched an inverse power law: the number of nodes of degree k is proportional to $k^{-x}$, for some constant exponent x. The form of this relationship is the same whatever the value of k, i.e. the distribution is scale-free.
Barabási–Albert construction Attach more edges to those nodes that already have more edges. Probability of attachment proportional to node degree. Produces a scale-free network.
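Preferential attachment can be sketched with a list that holds one entry per edge end, so that picking uniformly from the list picks a node with probability proportional to its degree. The starting clique of m + 1 nodes and the number of edges m added per new node are assumptions made here; the lecture does not fix either.

```python
import random


def barabasi_albert(n, m, seed=None):
    """Grow a graph of n nodes, each new node attaching to m existing nodes.

    Attachment probability is proportional to current degree.
    Assumption: growth starts from a fully connected clique of m + 1 nodes.
    """
    if not 1 <= m < n:
        raise ValueError("need 1 <= m < n")
    rng = random.Random(seed)
    graph = {i: set() for i in range(n)}
    edge_ends = []  # each node appears once per edge it belongs to
    for a in range(m + 1):
        for b in range(a + 1, m + 1):
            graph[a].add(b)
            graph[b].add(a)
            edge_ends += [a, b]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            # Uniform choice over edge ends = choice proportional to degree.
            targets.add(rng.choice(edge_ends))
        for target in targets:
            graph[new].add(target)
            graph[target].add(new)
            edge_ends += [new, target]
    return graph


ba = barabasi_albert(n=200, m=2, seed=7)
ba_degrees = sorted((len(nbs) for nbs in ba.values()), reverse=True)
```

Looking at the first few entries of `ba_degrees` against the rest gives a feel for the hubs: a handful of early nodes collect far more edges than the typical node.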
Scale-free Networks Still a fairly high number of nodes of 5+ degree. These are known as Hubs. Basis (kinda) for the Google PageRank algorithm. Networks have a high resistance to random node loss, as most nodes are not hubs. High clustering, but degree of clustering relates to network size. Large networks = smaller clustering.
Scale-free Networks Scale-free networks seem like the kinds of networks that might be good for modelling people. But, does social clustering really change with size of network? There is some evidence that human group sizes are limited.
Dunbar Number Robin Dunbar argues that human brain size implies a limit of ~150 people on stable social group size, which seems to match pre-industrial communities. But others have found a wide range of figures. There is some evidence that once groups grow above this limit the core group doesn’t scale, but a new hierarchy of group management develops. Either way, the core group size is unlikely to scale with the network.
♫♪ It’s a small world after all ♫♪ How is it we often meet complete strangers with whom we have a mutual acquaintance? It’s said that you’re only six mutual associates away from anyone in the world (“Six Degrees of Separation”). Stanley Milgram (1967) sent packages to people in Nebraska and Kansas, with instructions to pass them to people they thought might be closer to targets in Massachusetts. The packages took an average of 5 steps to arrive. How can this be possible given the following..? Every person knows only around a thousand people. There are six billion people on the globe.
The Kevin Bacon Game Can you link any actor to Bacon via co-stars in films? Anyone who’s co-starred in a film with Kevin Bacon has a Bacon Number of one. Anyone who’s been in a film with a co-star of Bacon has a Bacon Number of two, etc.
Six Degrees of Kevin Bacon
• Barbara Windsor has a Bacon number of three.
• Barbara Windsor was in Comrades (1987) with Robert Stephens
• Robert Stephens was in Chaplin (1992) with Diane Lane
• Diane Lane was in My Dog Skip (2000) with Kevin Bacon
• Steve McFadden has a Bacon number of two.
• Steve McFadden was in Buster (1988) with Phil Collins
• Phil Collins was in Balto (1995) with Kevin Bacon
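A Bacon Number is just a shortest-path distance in the co-star graph, so it can be found with a breadth-first search. The sketch below uses only the five films named above, with casts cut down to the actors mentioned (so the numbers are only valid for this toy data); `build_costar_graph` and `bacon_numbers` are illustrative names.

```python
from collections import deque
from itertools import combinations

# Partial casts: only the actors named in the examples above.
FILMS = {
    "Comrades (1987)": ["Barbara Windsor", "Robert Stephens"],
    "Chaplin (1992)": ["Robert Stephens", "Diane Lane"],
    "My Dog Skip (2000)": ["Diane Lane", "Kevin Bacon"],
    "Buster (1988)": ["Steve McFadden", "Phil Collins"],
    "Balto (1995)": ["Phil Collins", "Kevin Bacon"],
}


def build_costar_graph(films):
    """Link every pair of actors who appear in the same film."""
    graph = {}
    for cast in films.values():
        for actor in cast:
            graph.setdefault(actor, set())
        for a, b in combinations(cast, 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def bacon_numbers(graph, centre="Kevin Bacon"):
    """Breadth-first search outward from the centre actor."""
    numbers = {centre: 0}
    queue = deque([centre])
    while queue:
        actor = queue.popleft()
        for costar in graph[actor]:
            if costar not in numbers:
                numbers[costar] = numbers[actor] + 1
                queue.append(costar)
    return numbers


numbers = bacon_numbers(build_costar_graph(FILMS))
```

On this data `numbers["Barbara Windsor"]` is 3 and `numbers["Steve McFadden"]` is 2, matching the chains listed above.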
Is Kevin Bacon the centre of the Universe? The Internet Movie Database has ~850,000 connected films. Each film has, on average, 61 actors. Yet the maximum Bacon Number found so far is only 12. The average number of films between any actor and Bacon is only 2.980. So why is this so? Because social groups are a form of network known as Small World graphs.
Small World graphs A mix of strongly clustered groups with a few hub individuals who know many groups (and so cause the social groups to overlap). Like scale-free networks, they fall between the extremes in local clustering and average path length. But they have more realistic clustering, which doesn’t scale.
Watts and Strogatz construction Start with a ring network, with each point connected to its k neighbours (i.e. start with strong clustering). Rewire each edge to a randomly picked node, if some probability β is met.
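The ring-then-rewire procedure can be sketched as below. It assumes k is even (k / 2 neighbours on each side), keeps one end of each edge fixed when rewiring, and avoids self-loops and duplicate edges; these details are choices made for this sketch rather than something the lecture specifies.

```python
import random


def watts_strogatz(n, k, beta, seed=None):
    """Ring of n nodes, each joined to its k nearest neighbours, then rewired.

    Each original edge is moved to a randomly picked new endpoint with
    probability beta. Assumption: k is even and smaller than n.
    """
    if k % 2 or not 0 < k < n:
        raise ValueError("k must be even and satisfy 0 < k < n")
    rng = random.Random(seed)
    graph = {i: set() for i in range(n)}
    for node in range(n):
        for step in range(1, k // 2 + 1):
            neighbour = (node + step) % n
            graph[node].add(neighbour)
            graph[neighbour].add(node)
    for step in range(1, k // 2 + 1):
        for node in range(n):
            old = (node + step) % n
            if old in graph[node] and rng.random() < beta:
                # Candidate endpoints: anything but self or an existing neighbour.
                candidates = [c for c in range(n)
                              if c != node and c not in graph[node]]
                if candidates:
                    new = rng.choice(candidates)
                    graph[node].discard(old)
                    graph[old].discard(node)
                    graph[node].add(new)
                    graph[new].add(node)
    return graph


ring = watts_strogatz(n=30, k=4, beta=0.0, seed=1)   # pure ring lattice
small_world = watts_strogatz(n=30, k=4, beta=0.1, seed=1)
```

With β = 0 the result is the untouched ring lattice; with β = 1 every edge is rewired and the graph is close to random. Rewiring moves edges rather than adding them, so the edge count is the same in every case.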
More characteristics Average Path Length is proportional to ln(vertices). Average Path Length is inversely proportional to ln(associates). The Average Path Length decreases extremely rapidly as lynchpins / shortcuts increase slightly from nothing. Shortcuts cross vast areas of variable space to link with unexpected groups. Very robust to random losses – at worst flows will route to another hub.
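The first two proportionalities can be combined into a rough estimate, ln(vertices) / ln(associates). The sketch below plugs in the figures quoted earlier in the lecture (six billion people, around a thousand acquaintances each) and assumes the constant of proportionality is 1, which is an assumption made purely for illustration.

```python
import math


def estimated_path_length(vertices, associates):
    """ln(vertices) / ln(associates), taking the proportionality constant as 1."""
    return math.log(vertices) / math.log(associates)


# Six billion people, each knowing around a thousand others.
steps = estimated_path_length(6_000_000_000, 1_000)
```

The estimate comes out at a little over three steps, the same order of magnitude as the "six degrees" idea, though this figure ignores the clustering that makes real acquaintance networks less efficient.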
Spatial graphs Shortcuts are rare (it’s easier to link to nearby nodes than stretch to the other side of a net) so spatial graphs rarely show Small World characteristics. In such networks the Average Path Length scales more linearly with the number of vertices.
Example of a real network Disease spread. 2001 UK Foot and Mouth epizootic. Farm-to-farm spread by air: spatial network. Farm-to-farm spread by cattle movements: small-world network.
Foot and Mouth daily cases (source: BBC / MAFF, 4 May 2001). Cutting movements improved on the 1967 outbreak. Cases decreased when the probability of infection was lowered. [Figure: daily cases (axis 0 to 50) from 24 Feb to 29 Apr; labels: 1967, 24hr cull policy, Healthy cull policy, Initial May 5th predictions 400d-1.]
Uses of Small World theory The spread of disease (Watts, 1999). Spreading is controlled by… The length of time that someone is infectious. The length of time someone is removed (sick but not infectious, or if infinite = immune or dead). The infection probability / rate between 0 and 1. People are either Susceptible, Infectious or Removed. Watts mapped the proportions of these groups in Small World societies and physically limited networks for different disease parameters.
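The three controls listed above map directly onto a simple simulation. This is a minimal discrete-time sketch, not Watts's own model: it assumes each infectious node tries to infect each susceptible neighbour once per step, and the parameter names are invented here for illustration.

```python
import random


def simulate_sir(graph, p_infect, infectious_steps, removed_steps=None,
                 seeds=(0,), max_steps=1000, seed=None):
    """Run a Susceptible / Infectious / Removed epidemic on a graph.

    removed_steps=None means removal is permanent (immune or dead).
    Returns a list of {"S": .., "I": .., "R": ..} counts, one per step.
    """
    rng = random.Random(seed)
    state = {node: "S" for node in graph}
    timer = {node: 0 for node in graph}
    for node in seeds:
        state[node], timer[node] = "I", infectious_steps
    history = []
    for _ in range(max_steps):
        counts = {"S": 0, "I": 0, "R": 0}
        for s in state.values():
            counts[s] += 1
        history.append(counts)
        if counts["I"] == 0:
            break
        # Infection attempts along every edge from an infectious node.
        newly_infected = set()
        for node, s in state.items():
            if s == "I":
                for neighbour in graph[node]:
                    if state[neighbour] == "S" and rng.random() < p_infect:
                        newly_infected.add(neighbour)
        # Advance the clocks of infectious and removed nodes.
        for node in graph:
            if state[node] == "I":
                timer[node] -= 1
                if timer[node] == 0:
                    state[node] = "R"
                    timer[node] = removed_steps or 0
            elif state[node] == "R" and removed_steps is not None:
                timer[node] -= 1
                if timer[node] <= 0:
                    state[node] = "S"
        for node in newly_infected:
            state[node], timer[node] = "I", infectious_steps
    return history


# A 20-node ring as a stand-in for a physically limited network.
ring20 = {i: {(i - 1) % 20, (i + 1) % 20} for i in range(20)}
certain = simulate_sir(ring20, p_infect=1.0, infectious_steps=1)
harmless = simulate_sir(ring20, p_infect=0.0, infectious_steps=1)
```

With an infection probability of 1 the disease sweeps the whole ring; with a probability of 0 only the seed node is ever removed. Running the same function on graphs with different numbers of shortcuts is one way to explore the dependence the lecture describes.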
Violent deadly diseases: Small World
• Such diseases reach equilibrium when people are removed faster than the disease spreads.
• There’s a massive difference in deaths dependent on shortcuts.
• Hence cutting off the diseased population is vital.
[Figure: equilibrium fraction of Susceptible people (0 to 1) against probability of infection (0 to 1), for fraction of shortcuts = 0 and fraction of shortcuts = 0.9; labels: Tipping point, Disease takes off, Everyone dies.]
Other characteristics of disease spread If the disease infects the whole population, the time to do so is also strongly dependent on the fraction of shortcuts. In physically limited graphs, however, the spread is about the same whatever the range over which vertices can connect. Diseases are worse in Small World situations, but more easily controlled.
Other uses of Small World theory Spread of information / fashion / “memes”. The resilience of networks to attack. The efficiency of distribution systems.
Software Masses of software. E.g. Inflow: Network Centrality; Small-World Networks; Cluster Analysis; Network Density; Prestige / Influence; Structural Equivalence; Network Neighborhood; External / Internal Ratio; Weighted Average Path Length; Shortest Paths & Path Distribution.
Other key statistics
Centrality: various measures, including degree. Two others are:
Betweenness centrality: number of shortest paths passing through a node.
Closeness centrality: average length of the shortest paths from a node to all other nodes.
Node degree (or other attribute) correlation: how similar are nodes to their neighbours?
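Both centrality measures can be computed from breadth-first searches. The sketch below assumes an unweighted, undirected, connected graph. Closeness follows the definition above (mean distance to all other nodes, so lower means more central). Betweenness uses Brandes' accumulation scheme, a technique swapped in here, which splits the credit when several equally short paths join a pair of nodes.

```python
from collections import deque


def closeness(graph):
    """Mean shortest-path distance from each node to every other node."""
    result = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        result[source] = sum(dist.values()) / (len(graph) - 1)
    return result


def betweenness(graph):
    """Shortest paths through each node (credit shared between tied paths)."""
    score = {v: 0.0 for v in graph}
    for s in graph:
        order, dist = [], {s: 0}
        paths = {v: 0 for v in graph}      # number of shortest paths from s
        paths[s] = 1
        parents = {v: [] for v in graph}   # predecessors on shortest paths
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    parents[w].append(v)
        # Walk back from the furthest nodes, passing credit to parents.
        credit = {v: 0.0 for v in graph}
        for w in reversed(order):
            for v in parents[w]:
                credit[v] += paths[v] / paths[w] * (1 + credit[w])
            if w != s:
                score[w] += credit[w]
    # Undirected graph: every pair was counted from both ends.
    return {v: c / 2 for v, c in score.items()}


# A triangle (0, 1, 2) with one extra node hanging off node 2.
example_graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
close = closeness(example_graph)
between = betweenness(example_graph)
```

In this example node 2 is the only node that lies on anyone else's shortest path (those from node 3 to nodes 0 and 1), so it has a betweenness of 2 and the smallest mean distance.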