Milena Mihail Georgia Tech. with Stephen Young, Giorgos Amanatidis, Bradley Green

Flexible Models for Complex Networks Milena Mihail Georgia Tech. with Stephen Young, Giorgos Amanatidis, Bradley Green

, but no sharp concentration: frequency Erdos-Renyi 100 2 4 10 degree Sparse graphs with large degree-variance. “Power-law” degree distributions. Small-world, i.e. small diameter, high clustering coefficients. The Internet is constantly growing and evolving giving rise to new models and algorithmic questions.

However, in practice, there are discrepancies … , but no sharp concentration: frequency Erdos-Renyi 100 2 4 10 degree Sparse graphs with large degree-variance. “Power-law” degree distributions. Small-world, i.e. small diameter, high clustering coefficients. A rich theory of power-law random graphs has been developed [ Evolutionary & Configurational Models, e.g. see Rick Durrett’s ’07 book ].

“Flexible” models for complex networks: exhibit a “large” increase in the properties of generated graphs by introducing a “small” extension in the parameters of the generating model.

Models with power law and arbitrary degree sequences with additional constraints, such as specified joint degree distributions (from random graphs, to graphs with very low entropy). Models with semantics on nodes, and links among nodes with semantic proximity generated by verygeneral probability distributions. • RANDOM DOT PRODUCT GRAPHS • KRONECKER GRAPHS Talk Outline 1. Structural/Syntactic Flexible Models Generalizations of Erdos-Gallai / Havel-Hakimi 2. Semantic Flexible Models Generalizations of Erdos-Renyi random graphs

Assortativity: Talk Outline Models with power law and arbitrary degree sequences with additional constraints, such as specified joint degree distributions (from random graphs, to graphs with very low entropy). small large The networking community proposed that [Sigcomm 04, CCR 06 and Sigcomm 06], beyond the degree sequence , models for networks of routers should capture how many nodes of degree are connected to nodes of degree .

Networking Proposition [CCR 06, Sigcomm 06]: A random graph with same average degree as G. A real highly optimized network G. A random graph with same degree sequence as G. A graph with same number of links between nodes of degree and as G. A graph with same as G. A random graph with same degree sequence as G. A graph with same as G. A random graph with same degree sequence as G. A random graph with same degree sequence as G. A random graph with same degree sequence as G. A random graph with same degree sequence as G. A random graph with same degree sequence as G.

The Joint-Degree Matrix Realization Problem is: connected, mincost, random connected, mincost, random Definitions The (well studied) Degree Sequence Realization Problem is:

Open: Mincost, Random realizations of The Joint-Degree Matrix Realization Problem is: Theorem [Amanatidis, Green, M ‘08]: The natural necessary conditions for an instance to have a realization are also sufficient (and have a short description). The natural necessary conditions for an instance to have a connected realization are also sufficient (no known short description). There are polynomial time algorithms to construct a realization and a connected realization of , or produce a certificate that such a realization does not exist.

Degree Sequence Realization Problem: Given arbitrary Is this degree sequence realizable ? If so, construct a realization. Advantages:Flexibilityin: enforcing or precluding certain edges, adding costs on edges and finding mincost realizations, close to matching  close to sampling/random generation. Reduction to perfect matching: 2

Theorem[Erdos-Gallai]: A degree sequence is realizable iff the natural necessarycondition holds: moreover, there is a connectedrealization iff the natural necessary condition holds:

1 1 4 1 3 2 2 [ Havel-Hakimi ] Construction Algorithm: Greedy: any unsatisfied vertex is connected with the vertices of highest remaining degree requirements. 0 0 delete add 0 1 2 3 0 add Connectivity, if possible, attained with 2-switches. 0 1 2 0 delete 0

Theorem [Cooper, Frieze & Greenhill 04]:The Markov chain corresponding to a general 2-link switch is rapidly mixing for degree sequences with . Random generation of graphwith a given degree sequence:

Theorem[Feder,Guetz,M,Saberi 06]:The Markov chain corresponding to a local 2-link switch is rapidly mixing if the degree sequence enforces diameter at least 3, and for some . Random generation of connected graph with a given degree sequence:

Theorem, Joint Degree Matrix Realization[Amanatidis, Green, M ‘08]: Proof [sketch]:

Note:This may NOT be a simple “augmenting” path. Balanced Degree Invariant: Example Case Maintaining Balanced Degree Invariant: add add delete delete add

Theorem, Joint Degree Matrix Connected Realization [Amanatidis, Green, M ‘08]: Proof [sketch]:

The algorithm explores vertices of the same degree in different components, transforming the graph to bring it to a form amenable to rewiring by 2-switch, if possible . Main Difficulty: Two connected components are amenable to rewiring by 2-switches, only using twovertices of the same degree. connected component connected component

4 2 3 9 available edges & 11 vertices. There are not enough edges to connect all the vertices! Certificates of non-existence of connected realizations result from contractions of subsets of performed by the algorithm (as it was searching for transformations amenable to 2-switch rewirings across connected components.) 4 1 1 1 2 1 2 2

Open ProblemsforJoint Degree Matrix Realization • Construct mincost realization. • Construct random realization. • Satisfy constraints between arbitrary subsets of vertices. • Is there a reduction to matching or flow or some other well understood combinatorial problem? • Is there evidence of hardness?

Talk Outline 1. Structural/Syntactic Flexible Models Models with semantics on nodes, and links among nodes with semantic proximity generated by verygeneral probability distributions. Generalizations of Erdos-Gallai / Havel-Hakimi 2. Semantic Flexible Models Generalizations of Erdos-Renyi random graphs • RANDOM DOT PRODUCT GRAPHS • KRONECKER GRAPHS

RANDOM DOT PRODUCT GRAPHS Kratzl,Mickel,Sheinerman 05 Young,Sheinerman 07 Young,M 08

SUMMARY OF RESULTS • A semi-closed formula for degree distribution and graphs with a wide variety of densities and degree distributions, including power-laws. • Diameter characterization (determined by Erdos-Renyi for similar average density) • Positive clustering coefficient, depending on the “distance” of the generating distribution from the uniform distribution. Remark: Power-laws and the small world phenomenon are the hallmark of complex networks.

Theorem ( removing error terms) [Young, M ’08] A Semi-closed Formula for Degree Distribution Theorem [Young, M ’08]

Example: (a wide range of degrees, except for very large degrees) indicating a power-law with exponent between 2 and 3. This is in agreement with real data.

Theorem [Young, M ’08] Remark: If the graph can become disconnected. It is important to obtain characterizations of connectivity as approaches . This would enhance model flexibility Re Diameter Characterization

Clustering Characterization Theorem [Young, M ’08] Remarks on the proof

Open Problems forRandom Dot Product Graphs • Fit real data, and isolate “benchmark” distributions . • Characterize connectivity (diameter and conductance) as approaches . • Similarity functions beyond inner product (e.g. Kernel functions). • Algorithms: navigability, information/virus propagation, etc. • Do further properties of X characterize further properties of ?

0 1 0 0 0 1 1 1 0 0 1 1 0 1 0 1 1 1 1 1 KRONECKER GRAPHS [Faloutsos, Kleinberg,Leskovec 06] 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 1 0 0 0 0 0 1 0 1 0 0 0 0 1 1 1 1 0 0 0 1 0 0 0 1 0 0 1 1 0 0 1 1 0 1 0 1 0 1 0 1 1 1 1 1 1 1 1 1 Another “semantic” “ flexible” model, introducing parametrization. Several properties characterized.

a b aaa baa aa caa baa bab bab aab ab cab aba bba ba cba bba bbb bb bbb abb cbb b c aab bab bab ab cab bac ac bac aac cac bbb abb bb cbb bbb bbc bbc abc cbc bc cba bba bba aba ba abb bbb bb bbb cbb bca ca cca bca bcb acb bcb ccb cb abb bbb cbb bb bbb bbc abc bc cbc bbc acd cd bcd bcd ccd acc bcc bcc ccc cc STOCHASTIC KRONECKER GRAPHS [ Faloutsos, Kleinberg, Leskovec 06] aca Several properties characterized (e.g. multinomial degree distributions). Large scale data set have been fit.

Summary 1. Structural/Syntactic Flexible Models Generalizations of Erdos-Gallai / Havel-Hakimi 2. Semantic Flexible Models Generalizations of Erdos-Renyi random graphs

Where it all started: Kleinberg’s navigability model ? Moral: Parametrization is essential in the study of complex networks Theorem [Kleinberg]: The only value for which the network is navigable isr =2.

Milena Mihail Georgia Tech. with Stephen Young, Giorgos Amanatidis, Bradley Green

Milena Mihail Georgia Tech. with Stephen Young, Giorgos Amanatidis, Bradley Green

Presentation Transcript

GREEN TECH Breakout

Georgia Tech Counseling Center

Georgia Tech Green Purchasing

Georgia Tech

Milena Mihail mihail@cc.gatech.edu

The Stephen Green Foundation

Case Study: Georgia Tech

Georgia Tech Discussion

Online @ Georgia Tech

Georgia Tech RescueBot

Milena Mihail Georgia Tech. with Stephen Young, Giorgos Amanatidis, Bradley Green

Peng Zhao, Georgia Tech Dr. Zhigang Peng, Georgia Tech

LOCKSS at Georgia Tech

Stephen Gaw, MS Georgia Tech MSPO April 23, 2008

Georgia Tech

Georgia Tech Campus Catering

MINERVA GROUP @ Georgia Tech

International @ Georgia Tech

Giorgos Georgiadis