Diversity of Graphs with Highly Variable Connectivity*. David Alderson Operations Research Department Naval Postgraduate School *Joint work with Lun Li (Caltech) Acknowledgments: John Doyle, Walter Willinger, Daniel Whitney IPAM Workshop: Random and Dynamic Graphs and Networks May 8, 2007.
Random and Dynamic Graphs and Networks objective: characterize the structure and behavior of a large, complex network approach: focus on graph theoretic properties • measure the connectivity statistics of real networks • develop generative models to explain what is observed • consider dynamics • dynamics of the network: changes to the network itself • dynamics on the network: separate dynamical processes constrained by a given network structure implicit assumption: graph theoretic properties adequately capture key system features in order to serve as a basis for comparison and contrast
What can go wrong? Potential pitfall #1: attempting to use a simple graph to represent a complex system involving • heterogeneous components • layered architectures • feedback dynamics Possible result: modeling artifacts lead to misinterpretation and misrepresentation of what “matters” for system function References: • The “robust yet fragile” nature of Internet topology [Doyle et al., PNAS 102, 14497 (2005)] • Cellular metabolism [Tanaka, Phys Rev Lett 94, 168101 (2005)] this will not be the focus of this talk
What can go wrong? Potential pitfall #2: ignoring the fact that many different processes for network formation can give rise to the same structural properties Equivalently: assuming that the ability to reproduce an observed structural property of a graph is evidence that a particular mechanism “explains” the presence of that property Example: preferential attachment → power laws Reference: Li, Alderson, Doyle, Willinger. Toward a Theory of Scale-Free Networks: Definition, Properties, and Implications. Internet Mathematics 2(4), 2006. this will not be the focus of this talk
What can go wrong? Potential pitfall #3: assuming that a particular statistical description is sufficient to characterize graph structure Equivalently: failing to recognize that there can be great diversity even among graphs having the same statistics Example: degree distributions, particularly when heavy-tailed Ref: D. Alderson and L. Li. Diversity of graphs with highly variable connectivity. Phys Rev E 75, 046102 (2007) this will be the focus of this talk
basic notation
graphs with degree sequence D
graphs with degree sequence D • restriction to a particular D: popular for graph generation • Configuration Model (CM) as a null hypothesis • it yields graphs that are maximally random (in the sense of maximum entropy) • selected references • Bender and Canfield (1978) • Molloy and Reed (1995) • Aiello et al. (2000) • Newman (2002) • Chung and Lu (2003) • Here, we always restrict attention to a particular D
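The Configuration Model mentioned above can be sketched as random stub matching. This is a minimal illustration, not the procedure used in the talk: the function names `configuration_model` and `degree_sequence` are hypothetical, and the sketch makes no attempt to avoid self-loops, parallel links, or disconnected results.

```python
import random

def configuration_model(degrees, seed=None):
    """Pair up degree 'stubs' uniformly at random.

    Returns a list of links (u, v). The result may contain self-loops and
    parallel links, and it need not be connected.
    """
    if sum(degrees) % 2 != 0:
        raise ValueError("the degree sum must be even")
    rng = random.Random(seed)
    # One stub per unit of degree: vertex v appears degrees[v] times.
    stubs = [v for v, d in enumerate(degrees) for _ in range(d)]
    rng.shuffle(stubs)
    # Consecutive stubs in the shuffled list form the links.
    return [(stubs[i], stubs[i + 1]) for i in range(0, len(stubs), 2)]

def degree_sequence(edges, n):
    """Recover vertex degrees from a link list (a self-loop counts twice)."""
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg

D = [3, 2, 2, 1, 1, 1]
edges = configuration_model(D, seed=1)
print(degree_sequence(edges, len(D)) == D)  # True: every vertex keeps its degree
```

Because stubs are matched without any constraint, the output lives in a larger space than the simple, connected graphs with degree sequence D.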
degree sequences and correlation general recognition: degree sequence of a graph provides only a simplistic characterization of its properties recently: consider more sophisticated descriptions of network connectivity, with emphasis on correlation • simple notions of network clustering (i.e., connectivity correlations between vertex triplets) • more general degree-degree correlations (also called the joint degree distribution or JDD) • spectral methods a growing literature on the importance of correlation structure in networks
correlation (Pearson) coefficient • graph assortativity: how likely will a vertex connect to another having similar degree? • the correlation coefficient summarizes the joint distribution P(k,k') that a randomly selected link in the network will connect vertices having degree values k and k'
correlation (Pearson) coefficient • a summary statistic for the graph’s correlation profile • consistently positive for some kinds of networks • consistently negative for others • often cited as a key feature distinguishing various classes of complex networks • several explanations have been offered • Maslov and Sneppen (2002) • Newman and Park (2003) • evidence suggesting that the correlation coefficient is constrained by the degree sequence of the graph
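As a rough illustration of the coefficient described above, the sketch below computes the Pearson correlation between the degrees at the two ends of a link, taken over all links of an undirected graph. It is written in terms of degree sums, which follows from expanding the standard definition; the function name `assortativity` and the example graphs are assumptions made for illustration.

```python
from math import isclose

def assortativity(edges):
    """Pearson degree-degree correlation over the links of an undirected graph."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    l = len(edges)
    s = sum(deg[u] * deg[v] for u, v in edges)       # sum of d_u * d_v over links
    half_sq = sum(d * d for d in deg.values()) / 2   # (1/2) * sum of d_i^2
    half_cu = sum(d ** 3 for d in deg.values()) / 2  # (1/2) * sum of d_i^3
    center = half_sq ** 2 / l
    denom = half_cu - center
    if denom == 0:
        raise ValueError("r is undefined when every link end has the same degree")
    return (s - center) / denom

star = [(0, i) for i in range(1, 6)]    # one hub with five leaves
chain = [(i, i + 1) for i in range(5)]  # path on six vertices
print(assortativity(star))              # -1.0
print(isclose(assortativity(chain), -0.25, abs_tol=1e-9))  # True
```

Only the numerator term `s` depends on how the graph is wired; the other terms are fixed by the degree sequence, which is one way to see why the degree sequence constrains the coefficient.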
Basic Questions • How does the degree sequence of a graph dictate connectivity features, including correlation structure? • What kind of diversity exists among graphs having the same degree sequence? • What are the implications for the use of degree-based graph generation techniques as models of real systems? • Can the graph theoretic properties of networks from different application contexts be directly compared? • How should one interpret the graph theoretic properties of a network when studied in isolation?
a structural metric Implicitly, s(g) measures the extent to which the graph g has a “hub-like” core and is maximized when high-degree vertices are connected to other high-degree vertices.
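The defining formula for s(g) did not survive in this transcript. The sketch below assumes the definition given in the Li, Alderson, Doyle, Willinger paper cited in this talk, namely the sum of d_i * d_j over all links (i, j) of g. The function name `s_metric` and the two example trees are illustrative only.

```python
def s_metric(edges):
    """Sum of (degree of u) * (degree of v) over all links (u, v)."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return sum(deg[u] * deg[v] for u, v in edges)

# Two trees on seven vertices with the same degree sequence (3, 3, 2, 1, 1, 1, 1).
# In tree_a the two degree-3 vertices (0 and 1) are adjacent;
# in tree_b they are separated by the degree-2 vertex (2).
tree_a = [(0, 1), (0, 3), (0, 4), (1, 5), (1, 2), (2, 6)]
tree_b = [(0, 2), (2, 1), (0, 3), (0, 4), (1, 5), (1, 6)]
print(s_metric(tree_a), s_metric(tree_b))  # 26 24
```

The tree whose high-degree vertices are joined directly scores higher, matching the "hub-like core" reading of the metric.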
s-metric: extreme points
the restricted space G(D)
properties of s(g) and smax • s(g) easily computed for any graph g • depends only on the structural features of g, not how it was generated In G(D): • high degree nodes in the smax graph have high centrality (a monotonic relationship in trees) • smax graphs are self-similar under appropriately defined operations of trimming and coarse graining • the smax graph has the highest likelihood of being generated by the Generalized Random Graph (GRG) model [Chung and Lu 2003]
measuring graph diversity • We will use s-values to measure diversity among graphs having the same degree sequence D. • The difference smax – smin provides a simple bound on the possible diversity equivalent in practice
How different are the smin and smax values? Answer: it depends on the variability in D itself.
reference graphs: chains and stars [Figure: a chain; a star]
reference graphs: exponential and scaling
variability in degree sequence [Figure labels: high, low]
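Later slides measure variability with CV(D), the coefficient of variation of the degree sequence. A minimal sketch follows, assuming CV(D) is the standard deviation of D divided by its mean and using the population standard deviation (the transcript does not say which variant the talk uses). The name `cv` is hypothetical.

```python
from statistics import mean, pstdev

def cv(degrees):
    """Coefficient of variation: population standard deviation over the mean."""
    return pstdev(degrees) / mean(degrees)

n = 100
chain = [1, 1] + [2] * (n - 2)   # degree sequence of a chain (path) on n vertices
star = [n - 1] + [1] * (n - 1)   # degree sequence of a star on n vertices
print(round(cv(chain), 3), round(cv(star), 3))
```

Both sequences belong to trees on 100 vertices, so they share the same mean degree (1.98); only the spread differs, which puts the chain near the low end and the star near the high end of the CV(D) axis.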
a numerical experiment Graph generation via preferential attachment: Given a choice for n and p, a single experiment yields: • A connected tree with unspecified degree sequence D • Given D: solve analytically for smin and smax within G(D) • Given D: compute smax in G(D) via deterministic algorithm • Given D: compute smin in G(D) heuristically Repeating this experiment for different values of p yields a systematic means for generating different degree sequences
a numerical experiment Graph generation via preferential attachment: Note: one can obtain the reference graphs from different p • p → -∞ yields Dchain • p = 0 yields Dexp • p = 1 yields Dscaling • p → ∞ yields Dstar We are more interested in the degree sequences D than the values of p that generated them.
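The attachment rule itself is missing from this transcript. The sketch below assumes a common form of the model in which each new vertex links to one existing vertex chosen with probability proportional to its current degree raised to the power p, so that p = 0 gives uniform attachment and p = 1 gives linear preferential attachment. The function name `attachment_tree` and the starting configuration (a single link) are assumptions.

```python
import random

def attachment_tree(n, p, seed=None):
    """Grow a tree on n vertices, one vertex and one link at a time.

    Each new vertex attaches to an existing vertex picked with probability
    proportional to degree ** p. Intended for moderate values of p only.
    """
    if n < 2:
        raise ValueError("need at least two vertices")
    rng = random.Random(seed)
    degree = [1, 1]      # start from the single link 0-1
    edges = [(0, 1)]
    for new in range(2, n):
        weights = [d ** p for d in degree]
        target = rng.choices(range(new), weights=weights, k=1)[0]
        edges.append((target, new))
        degree[target] += 1
        degree.append(1)
    return edges, degree

for p in (0, 1, 2):
    edges, degree = attachment_tree(100, p, seed=0)
    print(p, len(edges), max(degree))
```

Every run yields a connected tree with n - 1 links by construction; the degree sequence it happens to produce is the object of interest, as the slide notes.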
numerical results: trees of size n=100 [Figure: smax and smin in G(D) (log scale, 10^3 to 10^5) versus CV(D), for CV(D) from 0 to 5]
measuring diversity with s(g) • Raw values of s(g) may not be informative • Consider normalized versions of s(g)
numerical results: trees of size n=100 [Figure: smax and smin in G(D) versus CV(D); “normalized” values s/smax in G(D) (0.4 to 1) versus CV(D)]
assortativity revisited s(g) ???
a perfect zero assortativity “graph”
Pearson coefficient revisited
numerical results: trees of size n=100 [Figure: smax and smin in G(D) versus CV(D); s/smax versus CV(D)]
numerical results: trees of size n=100 [Figure: rmax and rmin (axis from -1 to 0.6) versus CV(D), for CV(D) from 0 to 5]
numerical results: trees of size n=100 [Figure: side-by-side panels of s/smax versus CV(D) and of rmax, rmin versus CV(D), with smax and smin in G(D) versus CV(D)]
Pearson coefficient and background sets The implicit use of G(D) as background set for r(g) means: • For degree sequences D with high CV(D), r(g) is always negative and tends to hide differences in s(g) • For degree sequences D with low CV(D), r(g) is very sensitive to small structural changes and tends to exaggerate differences in s(g)
four graphs with the same D (smax = 77350, rmax = -0.4243) [Figure: node degree versus node rank (log-log); drawings of the four graphs] • s = 29876, s/smax = 0.3862, S = 0.022, r = -0.4815 • s = 74010, s/smax = 0.956, S = 0.931, r = -0.4283 • s = 33959, s/smax = 0.4390, S = 0.106, r = -0.4766 • s = 60271, s/smax = 0.7792, S = 0.648, r = -0.4449
Source: Doyle et al., PNAS (2005) [Figure: drawings of four graphs labeled “HOTnet”, “poor design”, “HSFnet”, “random”] • s = 29876, s/smax = 0.3862, S = 0.022, r = -0.4815 • s = 74010, s/smax = 0.956, S = 0.931, r = -0.4283 • s = 33959, s/smax = 0.4390, S = 0.106, r = -0.4766 • s = 60271, s/smax = 0.7792, S = 0.648, r = -0.4449
Recap • considerable diversity among graphs having same D • sequence D constrains the possible values of s(g) • variability in D itself • background sets: implications for interpretation • r(g) as a normalization of s(g) in G(D) • structural differences can be hidden or exaggerated Questions • How does a “random” graph compare against smin and smax values? • Implications for use of random graphs as a basis for comparison?
numerical experiment revisited • For a given attachment exponent p, generate a tree having n = 100 nodes (with corresponding D) • For the resulting degree sequence D: • Solve analytically for smin and smax within G(D) • Compute smax in G(D) via deterministic algorithm • Compute smin in G(D) heuristically • Generate an ensemble of “random” graphs having D • degree-preserving rewiring in G(D) (simple, connected graphs) • degree-preserving rewiring in the “unconstrained” space • configuration model (CM), implicitly in the “unconstrained” space
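Degree-preserving rewiring, as listed above, is commonly implemented with random double-link swaps: pick two links (a, b) and (c, d) and replace them with (a, d) and (c, b). The sketch below is one such implementation under stated assumptions, not the talk's own code. The names `rewire`, `is_connected`, and the flag `keep_simple_connected` are hypothetical; with the flag set, swaps that would create a self-loop, a parallel link, or a disconnected graph are rejected.

```python
import random

def is_connected(n, edges):
    """Depth-first search from vertex 0 over vertices 0..n-1."""
    adj = {v: [] for v in range(n)}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    seen, stack = {0}, [0]
    while stack:
        for w in adj[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == n

def rewire(n, edges, swaps, seed=None, keep_simple_connected=True):
    """Attempt `swaps` random double-link swaps; every vertex keeps its degree."""
    rng = random.Random(seed)
    edges = [tuple(e) for e in edges]
    for _ in range(swaps):
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        if rng.random() < 0.5:
            c, d = d, c  # consider both orientations of the second link
        new_i, new_j = (a, d), (c, b)
        trial = edges[:]
        trial[i], trial[j] = new_i, new_j
        if keep_simple_connected:
            if a == d or c == b:
                continue  # would create a self-loop
            others = {frozenset(e) for k, e in enumerate(edges) if k not in (i, j)}
            pair = {frozenset(new_i), frozenset(new_j)}
            if len(pair) < 2 or pair & others:
                continue  # would create a parallel link
            if not is_connected(n, trial):
                continue  # would disconnect the graph
        edges = trial
    return edges

tree = [(0, 1), (0, 2), (0, 3), (3, 4), (4, 5), (5, 6), (5, 7)]
print(rewire(8, tree, swaps=200, seed=1))
```

Setting `keep_simple_connected=False` accepts every swap, so the walk then explores a larger space that includes graphs with self-loops, parallel links, and multiple components.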
uniform attachment (p = 0), CV(D) = 0.6380 [Figure: (a) degree sequence D (vertex degree versus vertex rank); drawings of the smax graph in G(D), the original graph, and the smin graph in G(D); cumulative distribution of s-values and r-values for graphs having degree sequence D, with smin, sorig, and smax marked] • smax graph: smax = 843, s/smax = 1, S = 1, rmax = 0.34 • original graph: sorig = 765, s/smax = 0.91, S = 0.71, rorig = 0.01 • smin graph: s = 572, s/smax = 0.68, S = 0.04, r = -0.82
linear preferential attachment (p = 1), CV(D) = 1.4121 [Figure: (b) degree sequence D (vertex degree versus vertex rank); drawings of the smax graph in G(D), the original graph, and the smin graph in G(D); cumulative distribution of s-values and r-values for graphs having degree sequence D, with smin, sorig, and smax marked] • smax graph: smax = 2659, s/smax = 1, S = 1, rmax = -0.16 • original graph: sorig = 1894, s/smax = 0.71, S = 0.50, rorig = -0.31 • smin graph: s = 1182, s/smax = 0.44, S = 0.03, r = -0.45
superlinear attachment (p > 1), CV(D) = 2.5104 [Figure: (c) degree sequence D (vertex degree versus vertex rank); drawings of the smax graph in G(D), the original graph, and the smin graph in G(D); cumulative distribution of s-values and r-values for graphs having degree sequence D, with smin, sorig, and smax marked] • smax graph: smax = 5131, s/smax = 1, S = 1, rmax = -0.43 • original graph: sorig = 4623, s/smax = 0.90, S = 0.78, rorig = -0.44 • smin graph: s = 2844, s/smax = 0.55, S = 0.03, r = -0.49 very unlikely that a “random” graph will be in G(D)
observations • For each D, there is considerable diversity • smin is very “chain like” • smax is very “star like” • The range of s-values in the “unconstrained” space is greater than in G(D), and this increases with variability in D • Assortativity r(g) hides some of these differences, while s(g) highlights them • Generating an ensemble of graphs using random rewiring is unlikely to obtain the smin and smax values • Good correspondence between random rewiring in the “unconstrained” space and CM, with values largely centered on r(g)=0 • The distribution of graphs in the “unconstrained” space is consistently shifted toward larger s-values than those in G(D)
numerical experiment: non-trees • For a given attachment exponent p, generate a tree having n = 100 nodes (with corresponding D) • initial graph: n nodes, n-1 links • add an additional k(n-1) links using same (k) • For the resulting degree sequence D: • Solve analytically for smin and smax within G(D) • Compute smax in G(D) via deterministic algorithm • Compute smin in G(D) heuristically • Compute rmin and rmax in G(D) accordingly • Repeat many times
Takeaway Message #1 Considerable diversity exists among graphs having the same degree sequence. Open question: To what extent does a similar story hold for higher order descriptions, including correlation structure?
Takeaway Message #2 Graphs that arise from different contexts may not be directly comparable using structural metrics unless defined in terms of an appropriate and consistent background set. The differences between the “unconstrained” space G(D) and the space of simple, connected graphs G(D) may be more important in determining graph properties than other features as measured by aggregate statistics.
Takeaway Message #3 While it is clear that the evaluation of a graph based on its structural properties may be appropriate only in relation to the corresponding background set, understanding the implication of those structural features (e.g., in terms of function) remains an open question. For example, it remains unclear what, if anything, the relative placement of a graph within the range [smin, smax] actually says about the graph itself.
selected references • D. Alderson and L. Li. Diversity of Graphs with Highly Variable Connectivity. Phys Rev E 75, 046102 (2007). • D. Alderson, H. Chang, M. Roughan, S. Uhlig, and W. Willinger. The Many Facets of Internet Topology and Traffic. AIMS Journal on Networks and Heterogeneous Media, 1(4), Dec. 2006. • L. Li, D. Alderson, J.C. Doyle, and W. Willinger. Toward a Theory of Scale-Free Networks: Definition, Properties, and Implications. Internet Mathematics 2(4), 2006. • D. Alderson, L. Li, W. Willinger, and J.C. Doyle. Understanding Internet Topology: Principles, Models, and Validation. IEEE Trans. on Networking 13(6), Dec. 2005. • J.C. Doyle, D. Alderson, L. Li, S. Low, M. Roughan, S. Shalunov, R. Tanaka, and W. Willinger. The "robust yet fragile" nature of the Internet. PNAS, October 4, 2005. • D. Alderson and W. Willinger. A contrasting look at self-organization in the Internet and next-generation communication networks. IEEE Comm. Magazine, July 2005. • L. Li, D. Alderson, W. Willinger, and J. Doyle. A first-principles approach to understanding the Internet’s router-level topology. Proc. ACM SIGCOMM 2004. • D. Alderson, J. Doyle, R. Govindan, and W. Willinger. Toward an Optimization-Driven Framework for Designing and Generating Realistic Internet Topologies. ACM SIGCOMM Computer Communications Review, January 2003.