
Introduction to ERGM/p* model

Introduction to ERGM/p* model. Kayo Fujimoto, Ph.D. Based on presentation slides by Nosh Contractor and Mengxiao Zhu. Four parts of ERGM. Observed network data: network statistics (or counts) of each configuration. ERG modeling: conditional probability and change statistics.


Presentation Transcript


  1. Introduction to ERGM/p* model Kayo Fujimoto, Ph.D. Based on presentation slides by Nosh Contractor and Mengxiao Zhu

  2. Four parts of ERGM • Observed network data: network statistics (or counts) of each configuration • ERG modeling: conditional probability and change statistics • Estimation and simulation: estimate parameters by simulation (method: MCMC ML estimation); goodness-of-fit test (convergence t-test) comparing observed and simulated graphs • Recent developments in ERGM: new model specifications

  3. Exponential Random Graph Model (ERGM) • ERGMs take the form of a probability distribution over graphs: P(Y = y) = exp{θ’g(y)} / k(θ) • Y is the set of tie indicator variables Yij • y is a realization of Y, the observed network • g(y) is a vector of network statistics • θ is a parameter vector corresponding to g(y) • k(θ) is a normalizing factor calculated by summing exp{θ’g(y)} over all possible network configurations

  4. Observed network: graph statistics (or counts) of each configuration

  5. Network Statistics Examples for Undirected Networks • Example (figure: a five-node graph with nodes a, b, c, d, e) • Edges: 6 • 2-stars: 1+3+1+6+0 = 11 • 3-stars: 0+1+0+4+0 = 5 • 4-stars: 1 • Triangles: 2
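The counts on this slide can be reproduced with a short script. The sketch below is only illustrative: the figure is not available in the transcript, so the edge list is an assumption, chosen as one labeling of nodes a to e whose degrees (2, 3, 2, 4, 1) match the per-node star counts listed on the slide. The helper names `k_stars` and `triangles` are hypothetical.

```python
from itertools import combinations
from math import comb

# Assumed edge list: one labeling consistent with the counts on the slide.
edges = [("a", "b"), ("b", "c"), ("a", "d"), ("b", "d"), ("c", "d"), ("d", "e")]
nodes = sorted({v for e in edges for v in e})

# Build an adjacency map for the undirected graph.
neighbors = {v: set() for v in nodes}
for u, v in edges:
    neighbors[u].add(v)
    neighbors[v].add(u)

def k_stars(k):
    """Number of k-stars: a node of degree d is the centre of C(d, k) of them."""
    return sum(comb(len(neighbors[v]), k) for v in nodes)

def triangles():
    """Number of node triples in which all three pairs are tied."""
    return sum(
        1
        for a, b, c in combinations(nodes, 3)
        if b in neighbors[a] and c in neighbors[a] and c in neighbors[b]
    )

counts = {
    "edge": len(edges),
    "2-star": k_stars(2),
    "3-star": k_stars(3),
    "4-star": k_stars(4),
    "triangle": triangles(),
}
print(counts)
```

Counting stars through binomial coefficients of the degrees avoids enumerating the stars themselves, which is why the slide can write the 2-star count as a sum of one term per node.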

  6. A Simple Example of ERGM • Homogeneity assumption • Number of configurations for n nodes • Directed network: 2^(n(n-1)) • Undirected network: 2^(n(n-1)/2)

  7. A Simple ERG model • Predict the network using the edge count L(y) • θ can take different values: • θ = 0, θ = -0.69, θ = 0.69 • L(y) can take the following values: • L(y) = 0, L(y) = 1, L(y) = 2, L(y) = 3

  8. Example 1: θ = 0, L = 0 • Probability of getting a network with 0 edges • Model: ERGM formula

  9. Example 1: θ = 0, L = 1 • Probability of getting a network with 1 edge • Model: ERGM formula

  10. Example 1: θ = 0, L = 2 • Probability of getting a network with 2 edges • Model: ERGM formula

  11. Example 1: θ = 0, L = 3 • Probability of getting a network with 3 edges • Model: ERGM formula

  12. Example 1: θ = 0 • Model: ERGM formula

  13. Example 2: θ = -0.69 • Model: ERGM formula

  14. Example 3: θ = 0.69 • Model: ERGM formula
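The three examples can be worked through numerically. The sketch below assumes an undirected network on three nodes, which is inferred from L(y) ranging from 0 to 3 and is not stated in the transcript. It enumerates all graphs, computes the normalizing factor by brute force, and returns the probability of each edge count under the edge-only model P(Y = y) = exp{θ L(y)} / k(θ). The function name `edge_count_distribution` is hypothetical.

```python
from itertools import combinations, product
from math import exp

N_NODES = 3  # assumption: three nodes, hence three possible undirected ties
DYADS = list(combinations(range(N_NODES), 2))

def edge_count_distribution(theta):
    """Return {L: P(L(y) = L)} for the edge-only ERGM with parameter theta."""
    graphs = list(product([0, 1], repeat=len(DYADS)))  # all 2^3 = 8 graphs
    k = sum(exp(theta * sum(g)) for g in graphs)       # normalizing factor k(theta)
    dist = {}
    for g in graphs:
        n_edges = sum(g)
        dist[n_edges] = dist.get(n_edges, 0.0) + exp(theta * n_edges) / k
    return dist

for theta in (0.0, -0.69, 0.69):
    dist = edge_count_distribution(theta)
    print(theta, {L: round(p, 3) for L, p in sorted(dist.items())})
```

With θ = 0 every graph is equally likely, so the edge-count probabilities are just the number of graphs with that many edges divided by 8. A negative θ shifts mass toward sparser graphs and a positive θ toward denser ones.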

  15. Why Change Statistics? • Number of configurations: huge sample space
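To get a feel for how quickly the sample space grows, the following sketch counts the possible graphs on n nodes, using the fact that each dyad is either tied or not. The function name `n_graphs` is hypothetical.

```python
def n_graphs(n, directed=False):
    """Number of distinct graphs on n labeled nodes (no self-ties)."""
    dyads = n * (n - 1) if directed else n * (n - 1) // 2
    return 2 ** dyads

for n in (3, 5, 10, 20):
    print(n, n_graphs(n), n_graphs(n, directed=True))
```

Already at n = 10 the undirected count is 2^45, so summing over every graph to obtain the normalizing factor is not practical for networks of realistic size.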

  16. ERG modeling: conditional probability and change statistics

  17. Conditional Probability vs. Total Probability • Total probability of the whole network • Becomes computationally infeasible to calculate as the network gets large • Introduce the conditional probability of edges • Reduces the sample space

  18. Avoid the Calculation over the Sample Space • Conditional probability that an edge exists • Conditional probability that an edge is absent • Logit p* model: model the log odds that the tie Yij exists
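The equations for this slide did not survive in the transcript. The standard derivation can be sketched as follows, with notation that is an assumption here: y_ij^c denotes all ties other than the one between i and j, and y_ij^+ and y_ij^- denote the network with that tie forced present and absent. The normalizing factor k(θ) cancels in both conditional probabilities, which is the point of the slide.

```latex
\begin{align*}
\Pr(Y_{ij}=1 \mid y_{ij}^{c})
  &= \frac{\exp\{\theta' g(y_{ij}^{+})\}}
          {\exp\{\theta' g(y_{ij}^{+})\} + \exp\{\theta' g(y_{ij}^{-})\}} \\[4pt]
\Pr(Y_{ij}=0 \mid y_{ij}^{c})
  &= \frac{\exp\{\theta' g(y_{ij}^{-})\}}
          {\exp\{\theta' g(y_{ij}^{+})\} + \exp\{\theta' g(y_{ij}^{-})\}} \\[4pt]
\log \frac{\Pr(Y_{ij}=1 \mid y_{ij}^{c})}{\Pr(Y_{ij}=0 \mid y_{ij}^{c})}
  &= \theta' \left[ g(y_{ij}^{+}) - g(y_{ij}^{-}) \right]
\end{align*}
```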

  19. Change Statistics (logit p* model) • From the end of the last slide, we have: • Define the change statistics as: • Model the log odds of a tie being present versus absent:
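As a rough illustration of change statistics, the sketch below computes how the edge count and the triangle count change when a single tie is switched from absent to present, and turns the resulting log odds into a conditional tie probability. The graph, the parameter values, and the function names are all assumptions made for the example.

```python
from math import exp

# Assumed undirected graph, stored as an adjacency map.
adj = {
    "a": {"b", "d"},
    "b": {"a", "c", "d"},
    "c": {"b", "d"},
    "d": {"a", "b", "c", "e"},
    "e": {"d"},
}

def change_statistics(adj, i, j):
    """Change in (edges, triangles) when the tie i-j goes from absent to present."""
    delta_edges = 1
    # Every common neighbour of i and j closes exactly one new triangle.
    delta_triangles = len((adj[i] - {j}) & (adj[j] - {i}))
    return delta_edges, delta_triangles

def conditional_tie_probability(theta, delta):
    """Logistic transform of the log odds theta' * delta."""
    log_odds = sum(t * d for t, d in zip(theta, delta))
    return 1.0 / (1.0 + exp(-log_odds))

theta = (-1.0, 0.5)  # hypothetical (edge, triangle) parameters
delta = change_statistics(adj, "a", "c")
print(delta, conditional_tie_probability(theta, delta))
```

Only the local neighbourhood of the dyad enters the calculation, so no sum over the full sample space is needed.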

  20. Estimation and Simulation (Markov Chain Monte Carlo Maximum Likelihood Method)

  21. Review: Maximum Likelihood Estimation (MLE) • Likelihood functions • Estimate the parameter θ given the observed network • Maximum Likelihood Estimation • Find θ values such that the observed statistics are equal to the expected statistics • Approximate the MLE by simulation

  22. Procedures for simulating an ERG distribution • Markov Chain Monte Carlo Maximum Likelihood Estimation (MCMCMLE) • 1. Simulate a distribution of random graphs from a starting set of parameter values • 2. Refine the parameter values by comparing the distribution of graphs against the observed graph • 3. Repeat this process until the parameter estimates stabilize
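The three steps can be sketched for the edge-only model. This is a deliberately simplified illustration, not the algorithm implemented by any particular package: the simulation step is a plain Metropolis sampler that toggles one dyad at a time, and the refinement step is a basic stochastic-approximation update that nudges θ until the simulated mean edge count matches the observed count. The function names, gain, chain lengths, and example network size are all assumptions.

```python
import math
import random
from itertools import combinations

def simulate_edge_counts(theta, n_nodes, n_steps, burn_in, rng):
    """Step 1: Metropolis sampler for the edge-only ERGM; returns edge counts."""
    dyads = list(combinations(range(n_nodes), 2))
    present = {d: False for d in dyads}  # start from the empty graph
    edges = 0
    samples = []
    for step in range(burn_in + n_steps):
        d = rng.choice(dyads)
        delta = -1 if present[d] else 1  # change statistic for toggling d
        # Accept the toggle with probability min(1, exp(theta * delta)).
        if rng.random() < math.exp(min(0.0, theta * delta)):
            present[d] = not present[d]
            edges += delta
        if step >= burn_in:
            samples.append(edges)
    return samples

def estimate_theta(observed_edges, n_nodes, n_iter=30, gain=0.1, seed=0):
    """Steps 2 and 3: refine theta until simulated and observed counts agree."""
    rng = random.Random(seed)
    theta = 0.0  # starting parameter value
    for _ in range(n_iter):
        samples = simulate_edge_counts(theta, n_nodes, 3000, 500, rng)
        mean_simulated = sum(samples) / len(samples)
        # Too many simulated edges -> lower theta; too few -> raise it.
        theta -= gain * (mean_simulated - observed_edges)
    return theta

# Hypothetical observed network: 6 nodes (15 dyads) with 5 edges.
theta_hat = estimate_theta(observed_edges=5, n_nodes=6)
print(round(theta_hat, 2), round(math.log(5 / 10), 2))
```

For the edge-only model the exact maximum likelihood estimate is the log odds of the observed density, here log(5/10), which gives a reference value for the simulated estimate. Models with dependence terms have no such closed form, which is why simulation is used.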

  23. Convergence t-statistics • Test the adequacy of the estimated parameter values • One t-statistic for each configuration • |t| < 0.1: good fit • Note: if the parameter estimates do not converge, the model may be degenerate
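The slide does not give the formula for the t-statistic. A common form, assumed here, compares the observed count of a configuration with the mean of the counts simulated from the estimated model, scaled by their standard deviation; sign conventions vary between programs. The function name and the sample values are hypothetical.

```python
from statistics import mean, stdev

def convergence_t(observed, simulated):
    """(observed - mean of simulated) / standard deviation of simulated."""
    return (observed - mean(simulated)) / stdev(simulated)

# Hypothetical edge counts simulated from a fitted model.
simulated_edges = [4, 5, 6, 5, 5, 4, 6, 5]

for observed in (5, 6):
    t = convergence_t(observed, simulated_edges)
    print(observed, round(t, 3), "ok" if abs(t) < 0.1 else "not converged")
```

One such ratio is computed per configuration in the model, and the 0.1 threshold from the slide is applied to its absolute value.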

  24. A Simple Example of MCMCMLE • Model: • Observed network y: • Goal: find the θ value such that the observed number of edges equals the expected number of edges

  25. Given the observed network y: • If θ can be chosen from the three cases θ = 0, θ = -0.69, and θ = 0.69, then θ = -0.69 is preferred because it gives the highest probability to the observed network
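The comparison on this slide can be checked directly for the edge-only model. The observed network is not reproduced in the transcript, so the sketch below assumes three nodes and one observed edge, a choice that is consistent with θ = -0.69 being preferred. For this model the normalizing factor has the closed form (1 + e^θ) raised to the number of dyads. The names are hypothetical.

```python
from math import exp

N_DYADS = 3          # three nodes, undirected
OBSERVED_EDGES = 1   # assumption: the observed network has a single edge

def prob_of_graph(theta, n_edges, n_dyads=N_DYADS):
    """P(Y = y) for one specific graph with n_edges edges (edge-only ERGM)."""
    k = (1.0 + exp(theta)) ** n_dyads  # sum of exp(theta * L) over all graphs
    return exp(theta * n_edges) / k

def expected_edges(theta, n_dyads=N_DYADS):
    """Expected edge count: each dyad is tied with probability e^theta / (1 + e^theta)."""
    return n_dyads * exp(theta) / (1.0 + exp(theta))

candidates = (0.0, -0.69, 0.69)
likelihoods = {t: prob_of_graph(t, OBSERVED_EDGES) for t in candidates}
best = max(likelihoods, key=likelihoods.get)

for t in candidates:
    print(t, round(likelihoods[t], 4), round(expected_edges(t), 3))
print("preferred theta:", best)
```

Under this assumption the preferred θ is also the one whose expected edge count is closest to the observed count, which is the matching condition stated in the MLE review.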

  26. Markov dependence (Frank and Strauss, 1986) • Potential ties are dependent only if they share a common actor • Two possible network ties are conditionally independent unless they share a common actor • Once the homogeneity assumption is imposed, we obtain the following configurations…

  27. Markov random graph models (non-directed networks) • Density or edge (θ) • Two-star (σ2) • Three-star (σ3) • Triangle (τ)
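Putting these four configurations into the general ERGM form gives a model of the following shape. This is a sketch under assumed notation: L(y) is the edge count, S_2(y) and S_3(y) are the two-star and three-star counts, T(y) is the triangle count, and the Greek letters are the parameter names conventionally attached to them.

```latex
\[
\Pr(Y = y) \;=\; \frac{1}{k}\,
\exp\bigl\{\, \theta\, L(y) + \sigma_2\, S_2(y) + \sigma_3\, S_3(y) + \tau\, T(y) \,\bigr\}
\]
```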

  28. Problems of degeneracy for Markov random graph models • Certain parameter values place almost all of the probability mass on either the empty or the full graph • Simulation studies showed that Markov random graph models are degenerate for many empirical networks with high levels of clustering • A few very high degree nodes • Some regions of high triangulation

  29. Two possible reasons for the degeneracy problem (Snijders et al., 2006) • The Markov dependence assumption may be too restrictive • The representation of transitivity by the total number of triangles might be too simplistic • → New specification of higher-order network dependency

  30. New developments in ERGM: partial conditional dependence assumption and new model specifications

  31. Partial conditional dependence (social circuit dependence) • Two possible network ties are conditionally dependent if their observation would lead to a 4-cycle • Figure: nodes i, j, k, l, with a legend distinguishing possible edges from observed edges

  32. Partial conditional dependence (example) • Figure: nodes Father A, Daughter A, Father B, Daughter B

  33. Difference between the two types of dependence assumptions • Markov dependence assumptions • Partial conditional dependence assumptions • Figure: one panel per assumption on nodes i, j, k, l; the legend marks the potential tie, the ties that affect the formation of the potential tie, and the ties with no effect on the potential tie

  34. New Specifications of ERGM • Represent structural parameters similar to the Markov parameters • Effects are incorporated within a single configuration parameter • Three new statistics for non-directed networks • Alternating k-stars • Alternating k-triangles • Alternating independent two-paths

  35. Examples of new specifications • Alternating k-star configuration (degree distribution): • Alternating k-triangle (tendency to form triads): • Alternating k-two-path (tendency to form cycles):
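The formulas for these statistics are missing from the transcript. As an illustration of the first one, the alternating k-star statistic in Snijders et al. (2006) combines all k-star counts S_k into one number with alternating signs and geometrically decreasing weights: S_2 - S_3/λ + S_4/λ^2 - ... up to k = n - 1. The sketch below computes it from a degree sequence; the degrees, the value λ = 2, and the function name are assumptions for the example.

```python
from math import comb

def alternating_k_star(degrees, lam=2.0):
    """Sum over k = 2..n-1 of (-1)^k * S_k / lam^(k-2), with S_k = sum_i C(d_i, k)."""
    n = len(degrees)
    total = 0.0
    for k in range(2, n):
        s_k = sum(comb(d, k) for d in degrees)  # number of k-stars
        total += (-1) ** k * s_k / lam ** (k - 2)
    return total

# Hypothetical degree sequence for a five-node graph.
degrees = [2, 3, 2, 4, 1]
print(alternating_k_star(degrees, lam=2.0))
```

Because higher-order stars enter with shrinking weights and opposite signs, a single parameter can describe the whole degree distribution without the separate two-star and three-star terms of the Markov model.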

  36. Interpretation of the parameters • Positive alternating k-star parameter • Networks with some higher-degree nodes are highly probable → core-periphery structure • Positive alternating k-triangle parameter • Triangulation in the network, as well as a tendency for triangles to group together in larger, higher-order “clumps” • Positive alternating k-two-path parameter • Tendency for 4-cycles in the network

  37. Summary for model construction • Random variables • Each network tie (Yij) among nodes of a network • A random tie variable Yij = 1 if a tie from i to j exists, Yij = 0 otherwise • yij is the observed value of the variable Yij • Dependence assumptions • Define contingencies among network variables • Determine the type of parameters in the model • Ties may also depend on node-level attributes (homophily) • Homogeneity assumption • Simplify parameters by imposing homogeneity constraints • Estimation procedures • Find the best parameter values based on the observed network • Use simulation (MCMCMLE)

  38. Software for ERGM • SIENA (Snijders and colleagues) • PNet (Robins and colleagues) • Statnet (Butts and colleagues)

  39. References • Harrigan, Nicholas. “Exponential Random Graph (ERG) models and their application to the study of corporate elites.” • Robins, Garry (manuscript). Exponential Random Graph (p*) models for social networks, published on the Melnet website. • Robins, G., Pattison, P., Kalish, Y., Lusher, D. (2007). “An introduction to exponential random graph (p*) models for social networks.” Social Networks, 29, 173-191. • Snijders, T.A.B., Pattison, P., Robins, G., Handcock, M. (2006). “New specifications for exponential random graph models.” Sociological Methodology, 36, 99-153.

  40. Thank you for your attention. Any questions?
