Lectures 6 & 7 Centrality Measures February 2, 2009

Lectures 6 & 7 Centrality Measures February 2, 2009 Monojit Choudhury monojitc@microsoft.com

A brief Intro to • Myself • Yourself • The course • The classes • Please ask questions • Don’t disturb otherwise • Please go back and read

I shall assume that you know • Basic graph theory • Adjacency matrix representation • Degree, in-degree, out-degree • Connected component, shortest paths • Basic linear algebra • Symmetric matrix, transpose • Vectors, multiplication of vectors with vectors and matrices, orthogonality • Eigenvectors and Eigenvalues

Lecture 5 Centrality Measures February 2, 2009 Monojit Choudhury monojitc@microsoft.com

Question 1: Information percolation [Figure: network of 8 nodes, labelled 1 to 8] In this friendship network of 8 persons, suppose that someone comes to know some interesting news. Who is most likely to receive this news quickly?

Question 2: Searching the Web [Figure: network of 8 nodes, labelled 1 to 8] In this hyperlinked network of webpages, which pages are most likely to contain authoritative information?

Question 3: Spreading of STDs [Figure: network of 8 nodes, labelled 1 to 8] In this hypothetical sexual interaction network, who is most likely to be affected by STDs such as AIDS?

A common answer to all the questions • Nodes which are most “CENTRAL” to the network • Centrality of a node measures its • Power, Prestige, Prominence & imPortance • The 4 “P”s

Degree Centrality • How many friends do you have? • Measure of centralization of the network • Star network: most centralized • Line (path) graph: least centralized • Thus, the variance of degree centrality is a measure of the (de)centralization of a network

How centralized is this network? [Figure: network of 8 nodes, labelled 1 to 8]
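A minimal sketch of the degree-variance idea. The figure itself is not available here, so the edge list below is an assumption: it is inferred from the popularity values on the later slides, which it reproduces. The star, path and ring graphs are hypothetical 8-node examples added for comparison.

```python
from statistics import pvariance

# ASSUMPTION: the figure is lost; this edge list is inferred from the
# popularity values on the later slides, not copied from the drawing.
EDGES = [(1, 2), (1, 4), (2, 3), (2, 4), (2, 6),
         (5, 6), (5, 7), (5, 8), (6, 7), (6, 8), (7, 8)]


def degrees(edges):
    """Degree of every node in an undirected edge list."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return deg


def degree_variance(edges):
    """Population variance of the degrees: higher means more centralized."""
    return pvariance(list(degrees(edges).values()))


# Hypothetical reference shapes on 8 nodes.
star = [(0, i) for i in range(1, 8)]    # one hub, seven leaves
path = [(i, i + 1) for i in range(7)]   # eight nodes in a line
ring = path + [(7, 0)]                  # every node has degree 2

for name, edges in [("star", star), ("slide network", EDGES),
                    ("path", path), ("ring", ring)]:
    print(name, degree_variance(edges))
```

Under this measure the ring has zero variance, so a path is only close to the least centralized shape, not the extreme.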

When is centralization good/bad? • Fault tolerance • Centralized: bad • Decentralized: good • However, for random attacks • Centralized: good • What happens in a scale-free network?

Closeness Centrality • Reciprocal of the sum of the shortest-path distances to all other nodes • Compute closeness centrality for nodes 3 and 6 [Figure: network of 8 nodes, labelled 1 to 8]
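The definition above can be sketched with a breadth-first search. The edge list is an assumption (inferred from the popularity values on the later slides, since the figure is lost), and the code assumes a connected, unweighted, undirected graph.

```python
from collections import deque
from fractions import Fraction

# ASSUMPTION: edge list inferred from later slides, not from the figure.
EDGES = [(1, 2), (1, 4), (2, 3), (2, 4), (2, 6),
         (5, 6), (5, 7), (5, 8), (6, 7), (6, 8), (7, 8)]


def adjacency(edges):
    """Neighbour sets of an undirected graph."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def bfs_distances(adj, source):
    """Hop distance from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def closeness(adj, node):
    """Reciprocal of the summed distances to all other nodes."""
    dist = bfs_distances(adj, node)
    return Fraction(1, sum(d for other, d in dist.items() if other != node))


adj = adjacency(EDGES)
print(closeness(adj, 3), closeness(adj, 6))
```

On the assumed graph, node 6 is closer to everyone (distance sum 10) than the leaf node 3 (distance sum 16).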

Closeness Centrality • What does variance of closeness centrality indicate? • What would this variance be for • A Clique • A Tree • A Ring

Spreading of STDs [Figure: network of 8 nodes, labelled 1 to 8] Who should be removed from this network to make this community less susceptible to the spreading of STDs?

Betweenness Centrality [Figure: network with two highlighted nodes, Joydeep and Subrata] • Rich (in what?) • Joydeep has the opportunity to play an information broker, but Subrata doesn’t

Mathematical Definition [Figure: node v lying on a shortest path between s and t] • Can be extended to edges
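The formula on this slide did not survive extraction. The standard definition of betweenness centrality, which is presumably what was shown, reads as follows, where $\sigma_{st}$ is the number of shortest paths from $s$ to $t$ and $\sigma_{st}(v)$ is the number of those that pass through $v$ (both symbols are introduced here, not taken from the slide):

```latex
C_B(v) \;=\; \sum_{\substack{s \neq v \neq t \\ s \neq t}} \frac{\sigma_{st}(v)}{\sigma_{st}}
```

For the edge version, $\sigma_{st}(v)$ is replaced by the number of shortest $s$-$t$ paths that use the edge in question.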

Which networks have • Nodes with very small betweenness centrality • Node(s) with very high betweenness centrality • What is the betweenness centrality of the nodes in a complete bipartite network?
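A brute-force sketch for experimenting with these questions on small graphs. It counts shortest paths with breadth-first search and sums, over unordered pairs (s, t), the fraction of shortest paths through each node. The edge list for the 8-node network is an assumption inferred from the popularity values on the later slides, the K(2,3) graph is a hypothetical example, and the code assumes a connected undirected graph.

```python
from collections import deque
from fractions import Fraction
from itertools import combinations

# ASSUMPTION: edge list inferred from later slides, not from the figure.
EDGES = [(1, 2), (1, 4), (2, 3), (2, 4), (2, 6),
         (5, 6), (5, 7), (5, 8), (6, 7), (6, 8), (7, 8)]


def adjacency(edges):
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def shortest_path_counts(adj, source):
    """Distances and number of shortest paths from source to each node."""
    dist, sigma = {source: 0}, {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:            # first time v is reached
                dist[v] = dist[u] + 1
                sigma[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:   # u precedes v on a shortest path
                sigma[v] += sigma[u]
    return dist, sigma


def betweenness(adj):
    """Sum over unordered pairs s, t of sigma_st(v) / sigma_st."""
    info = {s: shortest_path_counts(adj, s) for s in adj}
    score = {v: Fraction(0) for v in adj}
    for s, t in combinations(sorted(adj), 2):
        dist_s, sigma_s = info[s]
        for v in adj:
            if v in (s, t):
                continue
            dist_v, sigma_v = info[v]
            if dist_s[v] + dist_v[t] == dist_s[t]:   # v is on a shortest path
                score[v] += Fraction(sigma_s[v] * sigma_v[t], sigma_s[t])
    return score


print(betweenness(adjacency(EDGES)))

# Hypothetical complete bipartite graph K(2,3): sides {0, 1} and {2, 3, 4}.
k23 = adjacency([(a, b) for a in (0, 1) for b in (2, 3, 4)])
print(betweenness(k23))
```

Counting unordered pairs is a choice made here; counting ordered pairs doubles every score in an undirected graph.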

Question 2: Searching the Web [Figure: network of 8 nodes, labelled 1 to 8] In this hyperlinked network of webpages, which pages are most popular?

The basic idea • I am popular if my friends are popular [Figure: network of 8 nodes, labelled 1 to 8] • $p_6 = p_2 + p_5 + p_7 + p_8$

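Computing Popularity • Every node starts with popularity 1; first update (old → new): • Node 1: 1 → 2 • Node 2: 1 → 4 • Node 3: 1 → 1 • Node 4: 1 → 2 • Node 5: 1 → 3 • Node 6: 1 → 4 • Node 7: 1 → 3 • Node 8: 1 → 3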

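Computing Popularity • Oops! Popularity grows unboundedly!! • Second update (old → new): • Node 1: 2 → 6 • Node 2: 4 → 9 • Node 3: 1 → 4 • Node 4: 2 → 6 • Node 5: 3 → 10 • Node 6: 4 → 13 • Node 7: 3 → 10 • Node 8: 3 → 10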

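A better approach • Normalize after every update so that the values always sum to 1 • Every node starts at 1/8; first update (old → new): • Node 1: 1/8 → 2/22 • Node 2: 1/8 → 4/22 • Node 3: 1/8 → 1/22 • Node 4: 1/8 → 2/22 • Node 5: 1/8 → 3/22 • Node 6: 1/8 → 4/22 • Node 7: 1/8 → 3/22 • Node 8: 1/8 → 3/22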

Computing popularity • Second update (old → new): • Node 1: 2/22 → 6/68 • Node 2: 4/22 → 9/68 • Node 3: 1/22 → 4/68 • Node 4: 2/22 → 6/68 • Node 5: 3/22 → 10/68 • Node 6: 4/22 → 13/68 • Node 7: 3/22 → 10/68 • Node 8: 3/22 → 10/68

Computing popularity • Third update (old → new): • Node 1: 6/68 → 15/206 • Node 2: 9/68 → 29/206 • Node 3: 4/68 → 9/206 • Node 4: 6/68 → 15/206 • Node 5: 10/68 → 33/206 • Node 6: 13/68 → 39/206 • Node 7: 10/68 → 33/206 • Node 8: 10/68 → 33/206

Is it converging? • Current popularity values: • Node 1: 15/206 • Node 2: 29/206 • Node 3: 9/206 • Node 4: 15/206 • Node 5: 33/206 • Node 6: 39/206 • Node 7: 33/206 • Node 8: 33/206
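The update used on these slides (sum the neighbours' popularity, then rescale so the values sum to 1) can be sketched with exact fractions. The edge list is an assumption: the figure is lost, and this list was inferred from the values 15/206, 29/206, ... shown above, which it reproduces after three updates from the uniform start.

```python
from fractions import Fraction

# ASSUMPTION: edge list inferred from the slide values, not from the figure.
EDGES = [(1, 2), (1, 4), (2, 3), (2, 4), (2, 6),
         (5, 6), (5, 7), (5, 8), (6, 7), (6, 8), (7, 8)]


def adjacency(edges):
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def popularity_step(adj, p):
    """One update: sum the neighbours' values, then normalize to sum 1."""
    raw = {i: sum(p[j] for j in adj[i]) for i in adj}
    total = sum(raw.values())
    return {i: raw[i] / total for i in raw}


def iterate(adj, steps):
    """Run the update `steps` times from the uniform start 1/n."""
    p = {i: Fraction(1, len(adj)) for i in adj}
    for _ in range(steps):
        p = popularity_step(adj, p)
    return p


adj = adjacency(EDGES)
for node, value in sorted(iterate(adj, 3).items()):
    print(node, value)
```

Running more steps and printing `float(value)` is one way to watch the values settle. This is plain power iteration on the adjacency matrix, so the limit is its leading eigenvector scaled to sum to 1.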

Observations • The popularity values eventually converge • Nodes which are isomorphic have the same popularity • What happens when we start from a different initialization? • Does it converge for every graph? • What happens for a disconnected graph?

An alternative view of popularity • Random surfer model: • The surfer starts on a random page • With probability w it stays on the same page, and with probability (1-w) it follows a randomly chosen link from that page [Figure: network of 8 nodes, labelled 1 to 8]

What’s the probability that the surfer is at node i? [Figure: network of 8 nodes, labelled 1 to 8] $p_6 = w\,p_6 + (1-w)\left[\,p_2/4 + p_5 + p_7/3 + p_8\,\right]$

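What’s the probability that the surfer is at node i? [Figure: network of 8 nodes, labelled 1 to 8] $p_i = w\,p_i + (1-w)\sum_j a_{ji}\,p_j / d_j$, where $a_{ji} = 1$ if page j links to page i (0 otherwise) and $d_j$ is the number of links leaving page j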
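A small sketch of iterating the surfer equation until it stops changing. The link directions of the original figure are not recoverable, so this assumes an undirected version of the network (every link usable both ways, with an edge list inferred from the popularity slides); the coefficients in the p6 example therefore differ from what this graph gives. The tolerance and step cap are arbitrary choices.

```python
# ASSUMPTION: undirected edge list inferred from the popularity slides;
# the directed links of the original figure are not known.
EDGES = [(1, 2), (1, 4), (2, 3), (2, 4), (2, 6),
         (5, 6), (5, 7), (5, 8), (6, 7), (6, 8), (7, 8)]


def adjacency(edges):
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def surfer_distribution(adj, w=0.5, tol=1e-12, max_steps=100_000):
    """Iterate p_i = w*p_i + (1-w) * sum_j a_ji * p_j / d_j to a fixed point."""
    p = {i: 1.0 / len(adj) for i in adj}          # surfer starts anywhere
    for _ in range(max_steps):
        new = {i: w * p[i]
                  + (1 - w) * sum(p[j] / len(adj[j]) for j in adj[i])
               for i in adj}
        change = max(abs(new[i] - p[i]) for i in adj)
        p = new
        if change < tol:
            break
    return p


adj = adjacency(EDGES)
p = surfer_distribution(adj)
for node in sorted(p):
    print(node, round(p[node], 4), len(adj[node]) / (2 * len(EDGES)))
```

On an undirected connected graph the fixed point is proportional to degree, whatever the value of w in (0, 1); the second printed column shows that reference value. On a directed web graph this shortcut does not hold.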

Therefore, popularity is • Eigenvector Centrality • Introduced by Bonacich (1972) • A slightly different variant is used as “PageRank”: $p_i = (1-w) + w\sum_j a_{ji}\,p_j / d_j$
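The PageRank-style variant above can be sketched as a fixed-point iteration. This is an illustration under assumptions, not the production algorithm: the graph is an undirected reconstruction of the 8-node network (edge list inferred from the popularity slides), and w = 0.85 and 200 steps are choices made here.

```python
# ASSUMPTION: undirected edge list inferred from the popularity slides.
EDGES = [(1, 2), (1, 4), (2, 3), (2, 4), (2, 6),
         (5, 6), (5, 7), (5, 8), (6, 7), (6, 8), (7, 8)]


def adjacency(edges):
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def pagerank_variant(adj, w=0.85, steps=200):
    """Iterate p_i = (1-w) + w * sum_j a_ji * p_j / d_j from p = 1."""
    p = {i: 1.0 for i in adj}
    for _ in range(steps):
        p = {i: (1 - w) + w * sum(p[j] / len(adj[j]) for j in adj[i])
             for i in adj}
    return p


scores = pagerank_variant(adjacency(EDGES))
for node in sorted(scores, key=scores.get, reverse=True):
    print(node, round(scores[node], 3))
```

In this form the scores sum to the number of nodes rather than to 1, and each update shrinks the error by a factor of w, so the iteration converges for any w below 1.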

Do all networks have $\lambda = 1$? • Yes! • Actually, all stochastic matrices (a.k.a. Markov matrices) have largest eigenvalue $\lambda_1 = 1$ • Perron-Frobenius Theorem • If A is a positive matrix, then so is its largest eigenvalue, and $\lambda_1 > |\lambda_i|$ for all other $i$. Every component of the corresponding eigenvector is also positive.
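A short worked argument for why 1 is always an eigenvalue and none is larger. The matrix name $M$ is introduced here for illustration: $M_{ij} = a_{ji}/d_j$ is the matrix of the random-surfer update with $w = 0$, assuming every page has at least one outgoing link.

```latex
% Each column of M sums to 1, so the all-ones row vector is a left eigenvector:
\sum_i M_{ij} \;=\; \sum_i \frac{a_{ji}}{d_j} \;=\; \frac{d_j}{d_j} \;=\; 1
\quad\Longrightarrow\quad \mathbf{1}^{\top} M = \mathbf{1}^{\top}.
% M and M^T have the same eigenvalues, hence \lambda = 1 is an eigenvalue of M.

% No eigenvalue exceeds 1 in modulus: let M^{\top} x = \lambda x and pick k with
% |x_k| = \max_j |x_j| > 0. Then
|\lambda|\,|x_k| \;=\; \Bigl|\sum_j M_{jk}\, x_j\Bigr|
\;\le\; \sum_j M_{jk}\,|x_j| \;\le\; |x_k|
\quad\Longrightarrow\quad |\lambda| \le 1.
```

The strict inequality $\lambda_1 > |\lambda_i|$ needs the extra positivity assumption of the Perron-Frobenius theorem; the argument above gives only $|\lambda_i| \le 1$.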
