The Very Small World of the Well-connected (19 June 2008)

Xiaolin Shi, Department of EECS, University of Michigan, Ann Arbor, MI 48109-2121, shixl@umich.edu
Matthew Bonner, Department of EECS, University of Michigan, Ann Arbor, MI 48109-2121, mabonner@umich.edu
Lada Adamic, School of Information, University of Michigan, Ann Arbor, MI 48109-1107, ladamic@umich.edu
Anna C. Gilbert, Department of Mathematics, University of Michigan, Ann Arbor, MI 48109-1043, annacg@umich.edu

• Introduction
• PRELIMINARIES
  • Importance measures
  • Network datasets properties description
• IMPORTANT VERTICES
  • Network properties and important vertices
  • Original vs. subgraph properties
• Summary

PRELIMINARIES
Importance measures
Let the graph G(V, E) have |V| = n vertices.
1 Degree D(v_i):
• The number of edges incident to v_i.
• Degree reflects a local property of the vertices in the graph.
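The degree measure can be sketched in a few lines. This is a minimal illustration, assuming an undirected graph given as a plain edge list; the function name `degrees` and the toy graph are hypothetical and not taken from the slides.

```python
from collections import defaultdict

def degrees(edges):
    """Return {vertex: degree} for an undirected edge list."""
    deg = defaultdict(int)
    for u, v in edges:
        # Each edge is incident to both of its endpoints.
        deg[u] += 1
        deg[v] += 1
    return dict(deg)

# Hypothetical toy graph: vertex "b" joined to "a", "c" and "d".
toy_edges = [("a", "b"), ("b", "c"), ("b", "d")]
toy_degrees = degrees(toy_edges)
```

Only the edges touching a vertex are inspected, which is why degree is called a local property.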
2 Betweenness B(v_i):
• A measure of how many pairs of vertices go through v_i in order to connect through shortest paths in G.
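The slide's formula is not preserved in this transcript. The sketch below assumes the standard shortest-path betweenness: for every pair of vertices s and t, take the fraction of shortest s-t paths that pass through v_i, and sum over all pairs. It uses Brandes' algorithm on an unweighted, undirected graph; the function name, the adjacency-dict input and the toy graph are illustrative assumptions.

```python
from collections import deque

def betweenness(adj):
    """Brandes' algorithm for an unweighted, undirected graph.

    adj maps each vertex to an iterable of its neighbours.
    Returns unnormalised scores, counting each unordered pair once.
    """
    score = {v: 0.0 for v in adj}
    for s in adj:
        # BFS from s, recording shortest-path counts and predecessors.
        dist = {s: 0}
        sigma = {v: 0 for v in adj}
        sigma[s] = 1
        preds = {v: [] for v in adj}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate pair dependencies in reverse BFS order.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Every unordered pair (s, t) was visited from both ends.
    return {v: b / 2.0 for v, b in score.items()}

# Hypothetical toy graph: a star with centre "b".
toy_adj = {"a": ["b"], "b": ["a", "c", "d"], "c": ["b"], "d": ["b"]}
toy_betweenness = betweenness(toy_adj)
```

In the star, all three leaf pairs connect through "b", so only "b" receives a non-zero score.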
3 Closeness C(v_i):
• A measure of the distances from all other vertices in G to vertex v_i.
• Closeness means that vertices that are in the “middle” of the network are important.
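A minimal sketch of closeness follows. The slides do not give the normalisation, so this assumes a common convention: the number of other reachable vertices divided by the sum of shortest-path distances to them. The function names and the toy path graph are hypothetical.

```python
from collections import deque

def bfs_distances(adj, source):
    """Shortest-path distances from source in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def closeness(adj, v):
    """Reachable vertices (excluding v) divided by their total distance."""
    dist = bfs_distances(adj, v)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total > 0 else 0.0

# Hypothetical toy graph: the path a - b - c.
toy_path = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
middle = closeness(toy_path, "b")
end = closeness(toy_path, "a")
```

The middle vertex "b" scores higher than the end vertex "a", matching the intuition that vertices in the “middle” of the network are important.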
4 PageRank:
• A variant of the eigenvector centrality measure; it assigns greater importance to vertices that are themselves neighbors of important vertices.
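PageRank can be sketched as a power iteration. The damping factor of 0.85, the fixed iteration count and the uniform redistribution of rank from vertices without out-links are assumptions of this sketch, since the slides do not state them; the function name and toy graph are hypothetical.

```python
def pagerank(out_links, damping=0.85, iterations=100):
    """Power-iteration PageRank.

    out_links maps each vertex to a list of vertices it links to;
    every linked vertex must also appear as a key.
    """
    nodes = list(out_links)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Rank held by vertices with no out-links is spread uniformly.
        dangling = sum(rank[v] for v in nodes if not out_links[v])
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {v: base for v in nodes}
        for v in nodes:
            for w in out_links[v]:
                # v passes an equal share of its rank to each neighbour.
                new_rank[w] += damping * rank[v] / len(out_links[v])
        rank = new_rank
    return rank

# Hypothetical toy graph: "a" and "b" link to "c", and "c" links to "a".
toy_links = {"a": ["c"], "b": ["c"], "c": ["a"]}
toy_rank = pagerank(toy_links)
```

Here "a" outranks "b" even though each has one out-link, because "a" is linked from the highly ranked "c". For an undirected graph, each edge can be listed in both directions.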

Network datasets properties description
• The data sets include a representative of the web.
• The data sets include online social network data.
• In these data sets, we are interested in examining the properties of important vertices and their graph synopsis.
1 Erdős-Rényi random graph (the prototypical random graph):
• Each pair of vertices has an equal probability p of being joined by an edge.
• |V| = 10000; p = 0.001; d = p × |V| = 10.
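The random graph model above can be sketched directly from its definition. To keep the example fast, it uses a smaller hypothetical graph (2000 vertices, p = 0.005) with the same expected degree of about 10; the slide's parameters would be passed the same way. The function name and the fixed seed are assumptions of this sketch.

```python
import random
from itertools import combinations

def erdos_renyi(n, p, seed=None):
    """Join each pair of the n vertices independently with probability p."""
    rng = random.Random(seed)
    return [(u, v) for u, v in combinations(range(n), 2) if rng.random() < p]

# Hypothetical, smaller parameters chosen so that p * n is still 10.
n_demo, p_demo = 2000, 0.005
er_edges = erdos_renyi(n_demo, p_demo, seed=0)

# Each edge adds one to the degree of two vertices.
er_mean_degree = 2 * len(er_edges) / n_demo
```

The measured mean degree should land close to p × |V|, which is the quantity d on the slide.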
2 BuddyZoo dataset:
• The first real-world network considered.
• An undirected graph produced from AOL Instant Messenger (AIM) data.
• Users → nodes
• Contact lists → edges
3 TREC (Text REtrieval Conference) dataset:
• The second real-world graph considered.
• A network of blog connections.
• A crawl of 100,649 RSS and Atom feeds.
• The TREC dataset contains hyperlinks, comments, trackbacks, etc.
• Removed:
  • feeds without a homepage or permalinks;
  • over 300 Technorati tags, which are in fact automatically generated and are not true indicators of social linking.
4 Web graph dataset:
• 259,794 websites
• 50 million pages
• Collected in 1998
• The similarity between the decade-old website-level data set and the recent blog data sets suggests that the results are applicable to larger, more current web crawls.

IMPORTANT VERTICES
Network properties and important vertices
1 Degree distributions:
• The degree distributions of the online networks reflect the limitations of the data sampling and the type of network (social networks).
2 Correlation of importance values of different measures:
• Relationships of importance measures in different networks.
• Analysis of correlation:
  • Higher: degree, betweenness and PageRank
  • Lower: closeness
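Comparing two importance measures comes down to correlating their per-vertex scores. The slides do not say which correlation coefficient was used, so this sketch assumes the Pearson coefficient; the function name and all score values below are made up for illustration.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length score lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        # Correlation is undefined when one list is constant.
        return float("nan")
    return cov / sqrt(vx * vy)

# Hypothetical scores for the same four vertices under two measures.
degree_scores = [1, 2, 3, 4]
closeness_scores = [0.2, 0.5, 0.4, 0.9]
r_degree_closeness = pearson(degree_scores, closeness_scores)
```

A value near 1 means the two measures rank vertices similarly; a rank-based coefficient such as Spearman's could be substituted by correlating ranks instead of raw scores.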
3 Assortativity:
• The concept of assortativity, or assortative mixing, is defined as the preference of the vertices in a network to have edges with others that are similar.
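Assortativity can be made concrete once "similar" is pinned down. This sketch assumes similarity is measured by degree, and computes the Pearson correlation between the degrees at the two ends of every edge; the function name and the toy graph are hypothetical.

```python
from collections import defaultdict
from math import sqrt

def degree_assortativity(edges):
    """Correlation between the degrees at the two ends of each edge."""
    if not edges:
        return float("nan")
    deg = defaultdict(int)
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    # Each undirected edge contributes both (u, v) and (v, u).
    xs, ys = [], []
    for u, v in edges:
        xs += [deg[u], deg[v]]
        ys += [deg[v], deg[u]]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        # Undefined when every vertex has the same degree.
        return float("nan")
    return cov / sqrt(vx * vy)

# Hypothetical toy graph: a star, where the hub only meets leaves.
star_edges = [("b", "a"), ("b", "c"), ("b", "d")]
star_r = degree_assortativity(star_edges)
```

A positive value indicates that high-degree vertices tend to link to each other; the star is the opposite extreme, with a coefficient of -1.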

Important vertices in their subgraphs
• Connectivity
• Density
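The subgraph of important vertices can be sketched as the subgraph induced by the k highest-scoring vertices. The example below assumes degree as the importance measure, density as the fraction of possible edges present, and connectivity as the share of vertices in the largest connected component; these definitions, the function names and the toy graph are assumptions, not taken from the slides.

```python
from collections import defaultdict, deque

def induced_subgraph(edges, keep):
    """Edges of the subgraph induced by the vertex set keep."""
    keep = set(keep)
    return [(u, v) for u, v in edges if u in keep and v in keep]

def density(num_vertices, num_edges):
    """Fraction of possible undirected edges that are present."""
    if num_vertices < 2:
        return 0.0
    return 2.0 * num_edges / (num_vertices * (num_vertices - 1))

def largest_component_fraction(vertices, edges):
    """Share of vertices lying in the largest connected component."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, best = set(), 0
    for start in vertices:
        if start in seen:
            continue
        seen.add(start)
        queue, size = deque([start]), 0
        while queue:
            u = queue.popleft()
            size += 1
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
        best = max(best, size)
    return best / len(vertices) if vertices else 0.0

# Hypothetical toy graph: a triangle 1-2-3 with a tail 3-4-5.
toy_edges = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 5)]
degree = defaultdict(int)
for u, v in toy_edges:
    degree[u] += 1
    degree[v] += 1

# Keep the k highest-degree vertices (ties broken by vertex id).
k = 3
top = sorted(degree, key=lambda v: (-degree[v], v))[:k]
sub_edges = induced_subgraph(toy_edges, top)

original_density = density(len(degree), len(toy_edges))
subgraph_density = density(len(top), len(sub_edges))
subgraph_connectivity = largest_component_fraction(top, sub_edges)
```

In this toy case the three selected vertices form a triangle, so the subgraph is both denser than the original graph and fully connected.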

Network properties and important vertices
• 1 Degree distributions
• 2 Correlation of importance values of different measures
  • Assortativity
• 3 Important vertices in their subgraphs
  • Connectivity
  • Density

Original vs. subgraph properties
• 1 Density
• 2 Distance
• 3 Relative importance
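The distance comparison can be sketched by measuring how far apart the important vertices are in the original graph, where shortest paths may pass through any vertex, and again inside their induced subgraph. The mean over reachable pairs, the function names and the toy graph are assumptions of this sketch.

```python
from collections import defaultdict, deque

def build_adjacency(edges):
    """Undirected adjacency sets from an edge list."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj

def mean_pairwise_distance(adj, targets):
    """Mean shortest-path length over reachable pairs of targets."""
    targets = list(targets)
    total, pairs = 0, 0
    for i, s in enumerate(targets):
        # BFS from s over the whole graph described by adj.
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for w in adj.get(u, ()):
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        for t in targets[i + 1:]:
            if t in dist:
                total += dist[t]
                pairs += 1
    return total / pairs if pairs else float("nan")

# Hypothetical toy graph: the path 1-2-3-4 plus a shortcut 1-5-4.
toy_edges = [(1, 2), (2, 3), (3, 4), (1, 5), (5, 4)]
important = [1, 2, 3, 4]  # assumed set of important vertices

keep = set(important)
sub_edges = [(u, v) for u, v in toy_edges if u in keep and v in keep]

original_distance = mean_pairwise_distance(build_adjacency(toy_edges), important)
subgraph_distance = mean_pairwise_distance(build_adjacency(sub_edges), important)
```

Removing vertex 5 lengthens the route between 1 and 4, so the subgraph distance is slightly larger here. The same two-graph pattern could be used for relative importance, by recomputing a measure inside the subgraph and comparing the resulting ranking with the original one.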

Summary and conclusion
• Two overall observations about the four networks:
  • Different importance measures yield subgraphs of varying density and topology.
  • However, in spite of these differences, “important vertices” in the online networks have some properties that agree with each other.
• Thus, in the real online networks, in contrast to the random graph model:
  • the subgraphs of important vertices tend to preserve information about the relationships among important vertices;
  • we can use the subgraphs to study the properties of important vertices in the original graphs.