1 / 45

Graphs Definitions, Measures, Graphology, Pathology, Degree Distributions

Graphs Definitions, Measures, Graphology, Pathology, Degree Distributions. Network Science: Graph Theory 2012. COMPONENTS OF A COMPLEX SYSTEM. components : nodes , vertices N. interactions : links, edges L. system : network, graph (N,L).

Download Presentation

Graphs Definitions, Measures, Graphology, Pathology, Degree Distributions

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Graphs Definitions, Measures, Graphology, Pathology, Degree Distributions Network Science: Graph Theory 2012

  2. COMPONENTS OF A COMPLEX SYSTEM • components: nodes, vertices N • interactions: links, edges L • system: network, graph (N,L) Network Science: Graph Theory 2012

  3. NETWORKS OR GRAPHS? • network often refers to real systems • www, • social network • metabolic network. • Language: (Network, node, link) • graph: mathematical representation of a network • web graph, • social graph (a Facebook term) • Language: (Graph, vertex, edge) • We will try to make this distinction whenever it is appropriate, but in most cases we will use the two terms interchangeably. Network Science: Graph Theory 2012

  4. A COMMON LANGUAGE friend Movie 1 co-worker Mary Actor 2 Peter Actor 1 Albert Actor 4 Movie 3 friend brothers Movie 2 Albert Actor 3 Protein 2 Protein 1 Protein 5 N=4 L=4 Protein 9 Network Science: Graph Theory 2012

  5. CHOOSING A PROPER REPRESENTATION The choice of the proper network representation determines our ability to use network theory successfully. In some cases there is a unique, unambiguous representation. In other cases, the representation is by no means unique. For example,, the way we assign the links between a group of individuals will determine the nature of the question we can study. Network Science: Graph Theory 2012

  6. UNDIRECTED VS. DIRECTED NETWORKS Undirected Directed Links: undirected (symmetrical) Graph: Links: directed (arcs). Digraph = directed graph: L A D M B An undirected link is the superposition of two opposite directed links. F C I D G B E G A H C F Directed links : URLs on the www phone calls metabolic reactions Undirected links : coauthorship links Actor network protein interactions Network Science: Graph Theory 2012

  7. NODE DEGREES Node degree: the number of links connected to the node. A Undirected B D B C • In directed networks we can define an in-degree and out-degree. The (total) degree is the sum of in- and out-degree. • Source: a node with kin= 0; Sink: a node withkout= 0. Directed E G A F

  8. Graph density • No. of the connections that may exist between n nodes • directed graph emax= n*(n-1) • undirected graph emax= n*(n-1)/2 • What fraction are present? • density = e/ emax • For example, out of 12 possible connections, this graph has 7, giving it a density of 7/12 = 0.583

  9. Graph density • Would this measure be useful for comparing networks of different sizes (different numbers of nodes)? • As n → ∞, a graph whose density reaches • 0 is a sparse graph • a constant is a dense graph

  10. AVERAGE DEGREE Recall: <x> = (x1 + x1 + ... + xN)/N = Σi xi /N Undirected j • N – the number of nodes in the graph • L – the number of links in the graph i D B C Directed E A F Network Science: Graph Theory 2012

  11. COMPLETE GRAPH The maximum number of links a network of N nodes can have is: • A graph with degree L=Lmaxis called a complete graph, and its average degree is <k>=N-1 Network Science: Graph Theory 2012

  12. REAL NETWORKS ARE SPARSE Most networks observed in real systems are sparse: L << Lmax or <k> <<N-1. WWW (ND Sample): N=325,729; L=1.4 106Lmax=1012 <k>=4.51 Protein (S. Cerevisiae): N= 1,870; L=4,470 Lmax=107 <k>=2.39 Coauthorship (Math): N= 70,975; L=2 105 Lmax=3 1010 <k>=3.9 Movie Actors: N=212,250; L=6 106 Lmax=1.8 1013 <k>=28.78 (Source: Albert, Barabasi, RMP2002) Network Science: Graph Theory 2012

  13. ADJACENCY MATRIX Aij=1 if there is a link between node i and j Aij=0 if nodes i and j are not connected to each other. 4 4 3 3 2 2 1 1 Undirected is not symmetric. Directed is symmetric. Network Science: Graph Theory 2012

  14. ADJACENCY MATRIX a bcdefgh a 0 1 0 0 1 0 1 0 b 1 0 1 0 0 0 0 1 c 0 1 0 1 0 1 1 0 d 0 0 1 0 1 0 0 0 e 1 0 0 1 0 0 0 0 f0 0 1 0 0 0 1 0 g1 0 1 0 0 0 0 0 h0 1 0 0 0 0 0 0 a e h b d f g c Network Science: Graph Theory 2012

  15. ADJACENCY MATRIX AND NODE DEGREES 4 Undirected 3 2 1 Directed 4 3 2 1 Network Science: Graph Theory 2012

  16. GRAPHOLOGY • 1 Undirected Directed 4 4 1 1 2 2 3 3 Actor network, protein-protein interactions WWW, citation networks Network Science: Graph Theory 2012

  17. GRAPHOLOGY • 2 Unweighted (undirected) Weighted (undirected) 4 4 1 1 2 2 3 3 protein-protein interactions, www Call Graph, metabolic networks Network Science: Graph Theory 2012

  18. GRAPHOLOGY • 3 Self-interactions • Multigraph • (undirected) 4 4 1 1 2 2 3 3 Protein interaction network, www Social networks, collaboration networks Network Science: Graph Theory 2012

  19. GRAPHOLOGY • 4 Complete Graph (undirected) 4 1 2 3 Actor network, protein-protein interactions Network Science: Graph Theory 2012

  20. GRAPHOLOGY: Real networks can have multiple characteristics WWW > directed multigraph with self-interactions Protein Interactions > undirected unweighted with self-interactions Collaboration network > undirected multigraph or weighted. Mobile phone calls > directed, weighted. Facebook Friendship links > undirected, unweighted. Network Science: Graph Theory 2012

  21. BIPARTITE GRAPHS bipartite graph(or bigraph) is a graph whose nodes can be divided into two disjoint setsU and V such that every link connects a node in U to one in V; that is, U and V are independent sets. Examples: Hollywood actor network Collaboration networks Disease network (diseasome) Network Science: Graph Theory 2012

  22. PATHS A path is a sequence of nodes in which each node is adjacent to the next one Pi0,in of length nbetween nodes i0 and in is an ordered collection of n+1 nodes and n links B • A path can intersect itself and pass through the same link repeatedly. Each time a link is crossed, it is counted separately • A legitimate path on the graph on the right: ABCBCADEEBA • In a directed network, the path can follow only the direction of an arrow. A E C D Network Science: Graph Theory 2012

  23. DISTANCE IN A GRAPH Shortest Path, Geodesic Path The distance (shortest path, geodesic path) between two nodes is defined as the number of edges along the shortest path connecting them. *If the two nodes are disconnected, the distance is infinity. In directed graphs each path needs to follow the direction of the arrows. Thus in a digraph the distance from node A to B (on an AB path) is generally different from the distance from node B to A (on a BCA path). B A C D B A C D Network Science: Graph Theory 2012

  24. NUMBER OF PATHS BETWEEN TWO NODES Adjacency Matrix • Nij, number of paths between any two nodes i and j: • Length n=1:If there is a link between i and j, then Aij=1 and Aij=0 otherwise. • Length n=2:If there is a path of length two between i and j, then AikAkj=1, and AikAkj=0 otherwise. • The number of paths of length 2: • Length n: In general, if there is a path of length n between i and j, then Aik…Alj=1 and Aik…Alj=0 otherwise. • The number of paths of length n between i and j is Network Science: Graph Theory 2012

  25. FINDING DISTANCES: BREADTH FIRST SEARCH • Distance between node1and node 4: • Start at1. 1 1 1 3 4 3 1 2 4 3 2 1 1 1 3 1 4 2 3 3 4 4 1 3 4 4 4 2 2 3 Network Science: Graph Theory 2012

  26. FINDING DISTANCES: BREADTH FIRST SEATCH • Distance between node1and node 4: • Start at1. • Find the nodes adjacent to1. Mark them as at distance 1. Put them in a queue. 3 4 3 2 4 3 2 1 1 1 1 1 1 3 4 2 3 3 4 4 1 1 3 4 4 4 2 2 3 Network Science: Graph Theory 2012

  27. FINDING DISTANCES: BREADTH FIRST SEATCH • Distance between node1and node 4: • Start at1. • Find the nodes adjacent to1. Mark them as at distance 1. Put them in a queue. • Take the first node out of the queue. Find the unmarked nodes adjacent to it in the graph. Mark them with the label of 2. Put them in the queue. 3 4 3 2 2 4 3 1 1 2 2 1 1 1 1 1 1 3 4 2 2 3 3 4 4 1 1 1 3 4 4 4 2 2 2 2 3 Network Science: Graph Theory 2012

  28. FINDING DISTANCES: BREADTH FIRST SEATCH • Distance between node1and node 4: • Repeat until you find node 4 or there are no more nodes in the queue. • The distance between1and4is the label of4or, if4does not have a label, infinity. 3 4 3 2 4 3 2 1 1 1 3 4 2 3 3 4 4 1 3 4 4 4 2 2 3 Network Science: Graph Theory 2012

  29. NETWORK DIAMETER AND AVERAGE DISTANCE • Diameter:dmaxthe maximum distance between any pair of nodes in the graph; along the shortest path. • Average path length/distance, <d>, for a connected graph: • where dijis the distance from node i to node j Network Science: Graph Theory 2012

  30. THE BRIDGES OF KONIGSBERG 1736: Konigsberg (Prussia) 7 bridges over river Pregelconnected 4 land masses. Can one walk across the seven bridges and never cross the same bridge twice? Euler PATH or CIRCUIT: return to the starting point by traveling each link of the graph once and only once. Network Science: Graph Theory 2012 http://www.numericana.com/answer/graphs.htm

  31. EULERIAN GRAPH: it has an Eulerian path Every vertex of this graph has an even degree, therefore this is an Eulerian graph. Following the edges in alphabetical order gives an Eulerian circuit/cycle. http://en.wikipedia.org/wiki/Euler_circuit Network Science: Graph Theory 2012

  32. EULER CIRCUITS IN DIRECTED GRAPHS B D A C E If a digraph is strongly connected and the in-degree of each node is equal to its out-degree, then there is an Euler circuit G F Otherwise there is no Euler circuit. in a circuit we need to enter each node as many times as we leave it. Network Science: Graph Theory 2012

  33. 1 1 • PATHOLOGY: summary 2 2 5 5 Path Shortest Path 3 3 4 4 A sequence of nodes such that each node is connected to the next node along the path by a link. The path with the shortest length between two nodes (distance). Network Science: Graph Theory 2012

  34. 1 1 • PATHOLOGY: summary 2 2 5 5 Diameter Average Path Length 3 3 4 4 The longest shortest path in a graph The average of the shortest paths for all pairs of nodes. Network Science: Graph Theory 2012

  35. 1 1 • PATHOLOGY: summary 2 2 5 5 Cycle Self-avoiding Path 3 3 4 4 A path with the same start and end node. A path that does not intersect itself. Network Science: Graph Theory 2012

  36. 1 1 • PATHOLOGY: summary 2 2 5 5 Eulerian Path Hamiltonian Path 3 3 4 4 A path that traverses each link exactly once. A path that visits each node exactly once. Network Science: Graph Theory 2012

  37. CONNECTIVITY OF UNDIRECTED GRAPHS Connected (undirected) graph: any two vertices can be joined by a path. A disconnected graph is made up by two or more connected components. B B A Largest Component: Giant Component A C F D C F D F The rest: Isolates F G G Bridge: if we erase it, the graph becomes disconnected. Network Science: Graph Theory 2012

  38. CONNECTIVITY OF UNDIRECTED GRAPHS Adjacency Matrix The adjacency matrix of a network with several components can be written in a block-diagonal form, so that nonzero elements are confined to squares, with all other elements being zero: Figure after Newman, 2010 Network Science: Graph Theory 2012

  39. THREE CENTRAL QUANTITIES IN NETWORK SCIENCE Degree distribution pk Average path length <d> Clustering coefficient C Network Science: Graph Theory 2012

  40. STATISTICS REMINDER We have a sample of values x1, ..., xN Distribution of x (a.k.a. PDF): probability that a randomly chosen value is xP(x) = (# values x) / NΣiP(xi) = 1 always! Histograms >>> Network Science: Graph Theory 2012

  41. pdf -- the continuous case Will work on the board Network Science: Graph Theory 2012

  42. P(k) k • DEGREE DISTRIBUTION Degree distributionP(k): probability that a randomly chosen vertex has degree k Nk = # nodes with degree k P(k) = Nk / N ➔ plot 0.6 0.5 0.4 0.3 0.2 0.1 1 2 3 4 Network Science: Graph Theory 2012

  43. DEGREE DISTRIBUTION discrete representation: pkis the probability that a node has degree k. continuum description: p(k) is the pdf of the degrees, where represents the probability that a node’s degree is between k1 and k2. Normalization condition: where Kmin is the minimal degree in the network. Network Science: Graph Theory 2012

  44. CLUSTERING COEFFICIENT • Clustering coefficient: What portion of your neighbors are connected? Captures the degree to which the neighbors of a given node link to each other. • Node i with degree Ki • ei is the number of links between the Kineighbors of node i • Ciin [0,1] Network Science: Graph Theory 2012

  45. THREE CENTRAL QUANTITIES IN NETWORK SCIENCE A. Degree distribution: pk B. Path length: <d> C. Clustering coefficient: Network Science: Graph Theory 2012

More Related