Web Intelligence: Complex Networks I. This is a lecture for week 6 of 'Web Intelligence'. Example networks in this lecture come from the fabulous site of Mark Newman, U of Michigan: http://www-personal.umich.edu/~mejn/
Introductory Points
• Graphs and networks are of central importance to us, because:
  • The web is a large and complex network
  • Major phenomena that underpin our existence, such as how information spreads, how diseases develop, and how economies evolve, are best viewed mathematically as networks.
• Networks have structural properties and behaviour. When we analyse the structure of a network, we can reveal important clues about its behaviour. E.g.
  • Predict how fast a virus or rumour will spread on the web
  • Assess which are the most authoritative web sites
  • Predict how long it will take to search sections of the web
  • Predict how robust to damage an area of the www is, or a cellular process is, etc.
This Week’s Material
• Basic intro to graphs and networks, terminology, and so on.
• The interesting properties of real-world networks.
• Metrics and other structural properties that are currently used to analyse both the www and other networks.
To support the understanding of metrics and properties, this week we cover the basics of graphs and networks.
The very basics
A graph is a set of two things: G = {V, E}
• V = a set of vertices (also called nodes), e.g. V = {A, B, C, D}
• E = a set of edges (also called arcs, or links), e.g. E = { {A,C}, {A,D}, {B,C}, {B,D} }, in which each edge is a set of two vertices from V
This graph is: [Figure: nodes A, B, C, D with edges A-C, A-D, B-C, B-D]
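A minimal sketch of the definition above, written in Python (the lecture itself does not prescribe a language). `V` and `E` mirror the slide; `is_valid_graph` is a hypothetical helper introduced only for illustration.

```python
# Vertices and edges of the example graph from the slide.
V = {"A", "B", "C", "D"}
# Each undirected edge is a frozenset of two vertices, so {A, C} == {C, A}.
E = {frozenset(pair) for pair in [("A", "C"), ("A", "D"), ("B", "C"), ("B", "D")]}


def is_valid_graph(vertices, edges):
    """Check that every edge joins exactly two distinct vertices taken from `vertices`."""
    return all(len(e) == 2 and e <= vertices for e in edges)


print(is_valid_graph(V, E))
print(frozenset({"C", "A"}) in E)
```

Storing edges as sets rather than pairs means the order in which the two endpoints are written never matters, which matches the slide's notation {A,C}.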
The very basics II
• An undirected edge between A and B: {A, B} (or {B, A})
• A directed edge from A to B: (A, B)
• A loop at A: {A, A} or (A, A)
[Figures: an undirected edge A-B, a directed edge from A to B, and a loop at A]
In an undirected graph, all edges are undirected. In a directed graph, all edges are directed.
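One possible way to mirror these three kinds of edge in Python, as a sketch: a `frozenset` for an undirected edge and a tuple for a directed one. The variable names are illustrative assumptions, not part of the lecture.

```python
# Undirected edge: order does not matter, so a frozenset is a natural fit.
undirected_ab = frozenset({"A", "B"})
undirected_ba = frozenset({"B", "A"})

# Directed edge: order matters, so use a tuple (source, target).
directed_ab = ("A", "B")
directed_ba = ("B", "A")

# A loop at A. Written as a set, {A, A} collapses to the single element A.
undirected_loop = frozenset({"A", "A"})
directed_loop = ("A", "A")

print(undirected_ab == undirected_ba)  # the same undirected edge
print(directed_ab == directed_ba)      # two different directed edges
print(len(undirected_loop))            # number of distinct endpoints of the loop
```

Because a set cannot hold the same element twice, code that stores undirected edges as sets has to treat one-element edges as loops.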
The very basics III
[Figure: an undirected graph on nodes A, B, C, D, E, F, G]
The degree of a node, in an undirected graph, is the number of edges attached to it. In this one, the degrees are:
A: 2, B: 3, C: 3, D: 3, E: 0, F: 1, G: 2
What is the mean degree of this graph?
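The question above can be checked with a few lines of Python. This sketch uses only the degrees listed on the slide (the figure's edges are not reproduced here) and assumes the graph has no loops.

```python
# Degrees as listed on the slide for the seven-node undirected graph.
degrees = {"A": 2, "B": 3, "C": 3, "D": 3, "E": 0, "F": 1, "G": 2}

total_degree = sum(degrees.values())
mean_degree = total_degree / len(degrees)

# Each undirected edge adds 1 to the degree of both of its endpoints,
# so the degrees sum to twice the number of edges (assuming no loops).
num_edges = total_degree // 2

print(mean_degree, num_edges)
```

Equivalently, the mean degree of an undirected graph is 2 × (number of edges) / (number of nodes).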
The very basics IV
[Figure: a directed graph on nodes A, B, C, D, E, F, G]
Nodes in directed graphs have in-degrees and out-degrees. Here (node: in, out):
A: 1, 2   B: 1, 2   C: 2, 1   D: 2, 2   E: 1, 1   F: 1, 2   G: 0, 2
A directed graph without cycles is called a DAG (directed acyclic graph). Is this a DAG?
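One standard way to test whether a directed graph is a DAG is Kahn's algorithm: repeatedly remove nodes whose in-degree is 0. The sketch below is an illustration only; `is_dag` and the two small example graphs are hypothetical and are not the graph drawn on the slide.

```python
from collections import deque


def is_dag(nodes, edges):
    """Return True if the directed graph has no directed cycle (Kahn's algorithm)."""
    in_degree = {n: 0 for n in nodes}
    successors = {n: [] for n in nodes}
    for source, target in edges:
        successors[source].append(target)
        in_degree[target] += 1

    # Start from nodes with no incoming edges and peel them off one by one.
    ready = deque(n for n in nodes if in_degree[n] == 0)
    removed = 0
    while ready:
        node = ready.popleft()
        removed += 1
        for nxt in successors[node]:
            in_degree[nxt] -= 1
            if in_degree[nxt] == 0:
                ready.append(nxt)

    # Nodes on a cycle never reach in-degree 0, so they are never removed.
    return removed == len(nodes)


# Hypothetical three-node examples.
nodes = ["A", "B", "C"]
chain = [("A", "B"), ("B", "C")]
cycle = [("A", "B"), ("B", "C"), ("C", "A")]
print(is_dag(nodes, chain), is_dag(nodes, cycle))
```

The same bookkeeping also yields each node's in-degree (the `in_degree` dictionary) and out-degree (the length of its `successors` list).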
The very basics V
This is a labelled graph: [Figure: nodes labelled graphs, homepage, teaching, research]
This is an unlabelled graph: [Figure]
Since labels and links have meaning, this one is different: [Figure]
It is exactly the same as (isomorphic to) this one: [Figure: nodes labelled homepage, graphs, research, teaching]
Diversity of graphs: considering only loop-free graphs
• How many different 2-node, labelled undirected graphs are there?
• How many different 2-node, labelled directed graphs are there?
• How many different 3-node, labelled undirected graphs are there?
Suppose there are G(k) possible undirected labelled graphs on k nodes. Whenever we add one extra node to an undirected labelled graph on k nodes, any subset of the k existing nodes could link to it, and there are 2^k such subsets. So the number of possible undirected labelled graphs on k+1 nodes is 2^k times what it is on k nodes: G(k+1) = 2^k × G(k).
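The counting argument above can be checked numerically. The sketch below applies the recurrence G(k+1) = 2^k × G(k), starting from G(1) = 1 (a single node with no edges), and compares it against a brute-force enumeration. The function names are illustrative, and the directed count assumes at most one edge per ordered pair of distinct nodes.

```python
from itertools import combinations


def count_undirected(k):
    """G(k) via the recurrence G(k + 1) = 2**k * G(k), with G(1) = 1."""
    count = 1
    for existing in range(1, k):
        count *= 2 ** existing
    return count


def count_undirected_brute_force(k):
    """Enumerate every subset of the possible edges on k labelled nodes."""
    possible_edges = list(combinations(range(k), 2))
    total = 0
    for size in range(len(possible_edges) + 1):
        total += sum(1 for _ in combinations(possible_edges, size))
    return total


def count_directed(k):
    """Each of the k * (k - 1) ordered pairs is either an edge or not."""
    return 2 ** (k * (k - 1))


for k in (2, 3, 4):
    print(k, count_undirected(k), count_undirected_brute_force(k), count_directed(k))
```

Unrolling the recurrence gives G(k) = 2^(k(k-1)/2): one independent yes/no choice for each of the k(k-1)/2 possible edges.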
More basics
[Figure: a graph on nodes A, B, C, D, E, F, G]
If there is a path in the graph from each node to every other, the graph is connected; otherwise it is unconnected (disconnected). This one?
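Connectivity of an undirected graph can be tested with a breadth-first search from any one node. This is a sketch under stated assumptions: `is_connected` is a hypothetical helper, and the example reuses the four-node graph from "The very basics" rather than the graph in this slide's figure.

```python
from collections import deque


def is_connected(nodes, edges):
    """Return True if every node can be reached from every other node."""
    nodes = list(nodes)
    if not nodes:
        return True
    neighbours = {n: set() for n in nodes}
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)

    # Breadth-first search from an arbitrary start node.
    seen = {nodes[0]}
    queue = deque([nodes[0]])
    while queue:
        node = queue.popleft()
        for nxt in neighbours[node] - seen:
            seen.add(nxt)
            queue.append(nxt)
    return len(seen) == len(nodes)


# The four-node graph from "The very basics".
edges = [("A", "C"), ("A", "D"), ("B", "C"), ("B", "D")]
print(is_connected(["A", "B", "C", "D"], edges))
# Adding an isolated node E (degree 0) breaks connectivity.
print(is_connected(["A", "B", "C", "D", "E"], edges))
```

Any node of degree 0 in a graph with more than one node is enough to make the graph unconnected.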
More basics II
The complete (undirected) graph on n nodes is the graph that contains all n(n-1)/2 possible edges. Is this one complete? [Figure: a graph on nodes A, B, C, D]
Most graphs of interest and importance are far from complete; they tend to be called sparse. Think about the following graphs:
1. Nodes = students in this university; edge {A,B} exists if A and B have the same birthday.
2. Nodes = web pages; edge (A,B) exists if A links to B.
3. Nodes = types of molecules in our bloodstream; edge (A,B) exists if A interacts with B.
4. Nodes = all living humans; edge {A,B} exists if A and B have ever shaken hands.
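A short sketch of what "complete" and "sparse" mean in numbers. The helpers `max_edges`, `is_complete` and `density` are hypothetical names, and the example assumes the four-node graph from "The very basics", which may differ from the figure on this slide.

```python
from itertools import combinations


def max_edges(n):
    """Number of edges in the complete undirected graph on n nodes."""
    return n * (n - 1) // 2


def is_complete(nodes, edges):
    """True if every pair of distinct nodes is joined by an edge."""
    present = {frozenset(e) for e in edges}
    return all(frozenset(pair) in present for pair in combinations(nodes, 2))


def density(nodes, edges):
    """Fraction of the possible edges that are present (needs at least 2 nodes)."""
    return len({frozenset(e) for e in edges}) / max_edges(len(nodes))


nodes = ["A", "B", "C", "D"]
edges = [("A", "C"), ("A", "D"), ("B", "C"), ("B", "D")]
print(max_edges(4), is_complete(nodes, edges), density(nodes, edges))
```

A sparse graph is one whose density is close to 0. Since the number of possible edges grows like n squared, large real-world networks usually have very low density.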
More Structural Properties
• Diameter: the length of the longest shortest path between any two nodes, i.e. the largest distance between a pair of nodes.
• Number of components: in an undirected graph, the number of separate connected pieces.
• Degree distribution: an interesting and important fingerprint of a graph that we will see more of.
• Modularity: a graph is highly modular if it has several clusters of nodes with many links within the clusters, but few links between the clusters.
• Hierarchical modularity: a graph is hierarchically modular if it is modular, as above, and the modules are themselves modular.
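The first three properties can be computed with breadth-first search. The sketch below is illustrative only: all function names and the small example graph are assumptions, and the diameter is taken over pairs of nodes that are connected at all, which is one common convention for unconnected graphs.

```python
from collections import Counter, deque


def build_neighbours(nodes, edges):
    """Adjacency sets for an undirected graph."""
    neighbours = {n: set() for n in nodes}
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    return neighbours


def distances_from(start, neighbours):
    """Breadth-first search: shortest path length from start to each reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbours[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def components(neighbours):
    """Group the nodes into connected components."""
    seen, result = set(), []
    for node in neighbours:
        if node not in seen:
            component = set(distances_from(node, neighbours))
            seen |= component
            result.append(component)
    return result


def diameter(neighbours):
    """Longest shortest path, over pairs of nodes that are connected at all."""
    return max(max(distances_from(n, neighbours).values()) for n in neighbours)


def degree_distribution(neighbours):
    """Map each degree to the number of nodes having that degree."""
    return Counter(len(adjacent) for adjacent in neighbours.values())


# Hypothetical example: a path A-B-C-D plus a separate pair E-F.
nodes = ["A", "B", "C", "D", "E", "F"]
edges = [("A", "B"), ("B", "C"), ("C", "D"), ("E", "F")]
nb = build_neighbours(nodes, edges)
print(len(components(nb)), diameter(nb), dict(degree_distribution(nb)))
```

Running one breadth-first search per node is fine for small graphs; for web-scale networks the diameter is usually estimated by sampling instead.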
Some Networks One of these is a network of protein interactions in yeast. The other is a visualisation of an outbreak of TB. What do the nodes and edges represent? And … which is which?
Is this: spread of HIV infection (node = person / link = HIV transfer) or is it: books about politics (node = book / link = one mentions the other)
Assignment 1
• Read: "Exploring Complex Networks", by Steven Strogatz, Nature 410, 268-276 (2001).
• Write: a 500-word 'executive summary' of most of this article. Leave out Box 1 and the section "Regular networks of coupled dynamical systems"; restart at "Complex network architectures".
AND
• Write: a 100-word account of what you assess to be the three main points conveyed by this article.
• Write: a 200-word essay about the relevance of those points to the topic of your BSc or MSc (e.g. relevance to AI, relevance to IT (Business), etc.).
Word limits in this assignment are important; going over the limits means losing marks.
Marking
• 30% of the marks: completeness and readability
• 30% of the marks: evidence of understanding the article, and generally making sense
• 30% of the marks: clarity of your arguments
• 10% of the marks: for making me say “Wow”
Next week
Much more advanced, about:
• Degree distributions
• Clustering coefficients
• Modularity and hierarchy
• Random networks vs real networks
• Some basic graph algorithms
• Another article, much smaller, to read.