MIS 644 Social Network Analysis, 2014/2015 Fall. Chapter 7 Network Models: Exponential Random Graph Model
Outline • Introduction • ERGM Theory • Logic behind p* Models • A General Framework for Model Construction • General form of the ERGM • Dependence assumptions and models • Estimation • Example • References
Introduction • statistical models of social networks • inference • tie-based models • account for the presence or absence of ties • how and why ties arise • model SNs through small, local tie-based structures • such as reciprocated ties, triangles • SNs are built up from patterns of ties • network configurations • local social processes
Empirical • given an observed network structure • summary measures on G, z(G) • Statistical • Theory-driven: complex, intersecting, potentially conflicting theoretical reasons • why empirically observed ties in a SN have arisen
Introduce ERGMs • basic theoretical assumptions underlying the models • specifications • statistical details • examples • how various substantive research questions are investigated and tested empirically
Introduction • Empirical • given an observed network structure • summary measures on G, z(G) • #edges, centrality • ERGM assigns a probability based on these measures • $P(G) = \frac{1}{\kappa}\exp\big(\theta_1 z_1(G) + \theta_2 z_2(G) + \dots + \theta_p z_p(G)\big)$ • $\kappa$: normalization constant • the probability of G depends on a sum of network statistics weighted by parameters
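As a rough illustration of the formula above, the sketch below enumerates every undirected graph on a handful of nodes and normalizes the exponential weights by brute force. The choice of statistics (edges and triangles), the parameter values, and all function names are assumptions made for this example only; brute-force enumeration is feasible only for very small n.

```python
import itertools
import math

def all_undirected_graphs(n):
    """Yield every undirected graph on n nodes as a frozenset of edges (i, j), i < j."""
    pairs = list(itertools.combinations(range(n), 2))
    for mask in itertools.product([0, 1], repeat=len(pairs)):
        yield frozenset(p for p, bit in zip(pairs, mask) if bit)

def triangle_count(g, n):
    """Number of node triples whose three edges are all present in g."""
    return sum(1 for a, b, c in itertools.combinations(range(n), 3)
               if {(a, b), (a, c), (b, c)} <= g)

def ergm_distribution(n, thetas):
    """Return {graph: P(graph)} for z(G) = (#edges, #triangles) (illustrative choice)."""
    weights = {}
    for g in all_undirected_graphs(n):
        z = (len(g), triangle_count(g, n))
        weights[g] = math.exp(sum(t * s for t, s in zip(thetas, z)))
    kappa = sum(weights.values())  # normalization constant
    return {g: w / kappa for g, w in weights.items()}

# Hypothetical parameter values: negative edge effect, positive triangle effect.
dist = ergm_distribution(4, thetas=(-1.0, 0.5))
```

With both parameters set to zero every graph is equally likely, which is a quick sanity check on the normalization.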
network statistics: • counts of network configurations in a given network G • or some functions of these counts • configurations • small, local subgraphs • probability of a network • depends on how many of which configurations are present • parameters • importance of these configurations
specifying an ERGM • choosing a set of configurations • of theoretical interest • apply a particular model • to an observed social network • parameters are estimated • inference about configurations • and the social processes generating the network
Specification of ERGM • just like choosing the variables and functional form of a regression • configurations relevant to the network structure • some standard ways • but also theories about how ties come into being and appear in patterns • ERGM as a metatheory about networks • conceptualization of a SN • and how it is created
ERGM Theory: assumptions about SNs • SNs are locally emergent • network ties • self-organized: dependencies between ties • actor attributes • exogenous factors • patterns are evidence of ongoing structural processes • multiple processes operate simultaneously • SNs are structured yet stochastic
Monge and Contractor 2003 • SN research is multitheoretical • examine multiple theoretical perspectives at the same time • ties depend on one another: self-organization • the presence of one tie may affect the presence of others • specify one or more theories in SN terms: configurations, in isolation or in combination
Why model social networks • stochastic models • regularities in the processes giving rise to ties • variability • uncertainty about observed outcomes • inference: are certain substructures more common than expected by chance? • different social processes can produce similar structure • e.g., clustering: homophily or structural balance • localized structure shapes global patterns
logic behind p* models • observed network • one realization from a set of possible networks with similar important characteristics: same # of actors • outcome of some (unknown) stochastic process • one particular pattern of ties out of a large set of possible patterns • propose a plausible, theoretically principled hypothesis for the process
logic behind p* models • research question: does the observed network show significantly more (or fewer) of some structural characteristic than expected by chance? • such characteristics are the outcome of local social processes • e.g., does the observed network show a strong tendency for reciprocity over and above chance? • structural characteristic (reciprocated ties): outcome of individuals choosing to reciprocate the choices of others
stochastic model with two parameters • propensity for ties • additional propensity for reciprocation • the structural characteristics in question determine the form of the model • assumption: reciprocity • index of the level of reciprocity: a parameter • the model assigns a probability to all possible networks • a good model for such a network: more reciprocated ties than expected by chance
for an observed friendship network • the probability distribution depends on the parameters • graphs with high reciprocation are more probable • parameter values determine the probabilities • best values found by estimation • e.g., reciprocity parameter • observed network as a guide • zero: observed reciprocated ties occur by chance • positive: more than by chance
Example: friendship • defining a probability distribution lets us sample graphs • compare the observed graph with the samples • on any other characteristics of interest • is the model a good one? • the observed network resembles graphs sampled from simulations • in many different respects
observed network: friendship relations • all possible network structures for a classroom • some likely, some unlikely • probability distribution of graphs • we do not compare it to other classroom networks • the network is generated by a stochastic process • relational ties come into being • shaped by • presence or absence of other ties • actor characteristics
The network is conceptualized as self-organizing relational ties • local social processes • generate dyadic relations • may depend on actor attributes: actors with similar characteristics are likely to form friendship relations
A General Framework for Model Construction • five steps • 1. Each network tie is regarded as a random variable • 2. A dependence hypothesis is proposed • 3. The dependence hypothesis implies a particular form of the model • 4. Simplification of parameters • 5. Estimation and interpretation of parameters
Step 1 • Each network tie is regarded as a random variable • stochastic framework • n: number of nodes, fixed • for each i and j • $Y_{ij}$ is a random variable: $Y_{ij} = 1$ if there is a tie between i and j, 0 otherwise • $y_{ij}$ is the observed value of $Y_{ij}$ • directed or undirected
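One way to picture Step 1 in code, as a minimal sketch: list the tie variables for a fixed node set and record an observed value for each. The function name `tie_variables` and the toy arc set are hypothetical.

```python
import itertools

def tie_variables(n, directed=True):
    """List the index pairs (i, j) of all tie variables Y_ij on n nodes."""
    if directed:
        return [(i, j) for i in range(n) for j in range(n) if i != j]
    return list(itertools.combinations(range(n), 2))

n = 4
observed_arcs = {(0, 1), (1, 0), (2, 3)}  # toy observed network
# y[(i, j)] is the observed value y_ij of the random variable Y_ij.
y = {tie: int(tie in observed_arcs) for tie in tie_variables(n)}
```

A directed network on n nodes has n(n-1) tie variables; an undirected one has n(n-1)/2.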
Step 2 • a dependence hypothesis is proposed, defining contingencies among the network variables • embodies the local social processes • assumed to generate the network ties • e.g., ties are independent of each other • e.g., reciprocity processes: • in a school classroom, if student A likes student B, then student B will quite probably like student A, implying some form of dyadic dependence • ties may depend on node-level attributes • homophily effects in the classroom
processes are represented by • small-scale graph configurations: • e.g., a reciprocated tie, • or a tie between two girls.
Step 3 • the dependence hypothesis implies a particular form of the model • each parameter corresponds to a configuration • a small subset of possible network ties and/or actor attributes • structural characteristics • e.g., reciprocated ties
The model: a distribution of random graphs • “built up” from the localized patterns represented by the configurations • e.g., • a single tie as a configuration, • a reciprocated tie (in a directed graph), • a transitive triad • a two-star • parameters relate to the presence of configurations • in the observed graph
Step 4 • simplification of parameters through homogeneity or other constraints • reduce the number of parameters. • homogeneity constraints. • e.g., one parameter • reciprocity effect across the entire network
Step 5 • estimate and interpret model parameters • complicated when the dependence structure is complex • parameter estimates, estimates of the uncertainty • the range of network outcomes predicted by the model • make inferences about model parameters • if a parameter is significantly different from zero, the corresponding configuration occurs in the network more (or less) often than expected by chance, • given the other parameter values.
General form of the ERGM • $P(Y=y) = \frac{1}{\kappa}\exp\Big(\sum_{A}\eta_A g_A(y)\Big)$ • where • the summation is over all configurations A • $\eta_A$ is the parameter corresponding to configuration A • $g_A(y) = \prod_{y_{ij}\in A} y_{ij}$ is the network statistic corresponding to configuration A • $g_A(y) = 1$ if the configuration is observed in the network y, and 0 otherwise • $\kappa$: normalizing quantity
general probability distribution of graphs on n nodes • the probability of observing any particular graph y depends on • the statistics $g_A(y)$ • the various non-zero parameters $\eta_A$ • for all configurations in the model: • reciprocated ties • transitive triads
if a set of possible edges represents a configuration, then • any subset of possible edges is also a configuration. • Thus, single edges - configurations
A configuration A • a subset of tie variables • a small network structure • e.g., directed network with a dyadic dependence assumption • reciprocity parameters • some configurations: $(Y_{12}, Y_{21})$, $(Y_{13}, Y_{31})$ • every dyad is a configuration • the configuration statistic $g_A(y)$ indicates whether the configuration is observed in the network or not
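The indicator form of the configuration statistic, $g_A(y) = \prod_{y_{ij}\in A} y_{ij}$, can be sketched in a few lines. The dictionary representation of y and the names `g`, `mutual_12`, and `mutual_13` are assumptions for illustration.

```python
from math import prod

def g(config, y):
    """Indicator g_A(y): 1 if every tie variable in configuration A equals 1 in y."""
    return prod(y.get(tie, 0) for tie in config)

# Toy directed network: tie variables missing from the dict are treated as 0.
y = {(1, 2): 1, (2, 1): 1, (1, 3): 1, (3, 1): 0}

mutual_12 = ((1, 2), (2, 1))  # configuration (Y12, Y21)
mutual_13 = ((1, 3), (3, 1))  # configuration (Y13, Y31)
```

Here `g(mutual_12, y)` is 1 because both arcs are present, while `g(mutual_13, y)` is 0 because the arc from 3 to 1 is absent.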
Constraints on parameters • different configurations for different nodes • $y_{12}y_{21}$ and $y_{23}y_{32}$ are different • different parameters • n(n-1)/2 reciprocity parameters: too many • homogeneity assumption • equate parameters for configurations of the same type • e.g., Mary and Paul may have different tendencies for reciprocity • assume they are equal • a single tendency for reciprocity across the network
A less radical assumption: • parameters depend on node characteristics • isomorphic configurations • different reciprocity parameters for • girl-girl • girl-boy • boy-boy
Homogeneity assumption • isomorphic configurations refer to generic effects • e.g., an overall reciprocity effect • statistics: counts of the corresponding configurations
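Under this attribute-based homogeneity, the statistic for each reciprocity parameter is a count of mutual dyads of the matching type. The sketch below counts them; the function name, the toy arcs, and the attribute coding are hypothetical.

```python
from collections import Counter

def mutual_counts_by_type(arcs, gender):
    """Count reciprocated dyads per attribute pairing (girl-girl, girl-boy, boy-boy)."""
    arcs = set(arcs)
    counts = Counter()
    for i, j in arcs:
        # Visit each dyad once (i < j) and require arcs in both directions.
        if i < j and (j, i) in arcs:
            label = "-".join(sorted((gender[i], gender[j]), reverse=True))
            counts[label] += 1
    return counts

gender = {0: "girl", 1: "girl", 2: "boy", 3: "boy"}
arcs = [(0, 1), (1, 0), (0, 2), (2, 0), (2, 3), (1, 3)]
counts = mutual_counts_by_type(arcs, gender)
```

Each of the three counts would then be multiplied by its own reciprocity parameter in the exponent of the model.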
logit models • dependent variable • binary: 1, 0 • $P(y=1) = \frac{1}{1+e^{-z}} = \frac{e^{z}}{1+e^{z}}$ • $P(y=0) = \frac{1}{1+e^{z}} = \frac{e^{0}}{e^{0}+e^{z}}$ • normalization factor: $e^{0}+e^{z}$ • $z = a + bx$ or $k(x - x_0)$ • $\log\frac{p}{1-p} = z = a + bx$ • log-odds: a linear function of the explanatory variables
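A small numerical check of the logit identities above, as a sketch; the function names and the coefficient values a, b, x are arbitrary choices for illustration.

```python
import math

def p_one(z):
    """P(y = 1) = 1 / (1 + e^{-z}) = e^{z} / (e^{0} + e^{z})."""
    return 1.0 / (1.0 + math.exp(-z))

def p_zero(z):
    """P(y = 0) = 1 / (1 + e^{z}) = e^{0} / (e^{0} + e^{z})."""
    return 1.0 / (1.0 + math.exp(z))

a, b, x = -1.0, 0.5, 3.4   # hypothetical intercept, slope, covariate value
z = a + b * x
p = p_one(z)
log_odds = math.log(p / (1.0 - p))  # recovers z up to rounding error
```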
Multinomial logit • dependent variable: more than two states • $P(y=i) = \frac{e^{z_i}}{e^{z_1}+e^{z_2}+e^{z_3}+\dots+e^{z_p}}$ • normalized to 1 • In ERGM • every possible network has a probability • n nodes: n(n-1)/2 possible ties (undirected), hence $2^{n(n-1)/2}$ possible networks
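The multinomial logit normalization is the same operation an ERGM applies over all possible networks, only with far more "states". The sketch below shows the normalization and how quickly the number of undirected networks grows; the function names are illustrative.

```python
import math

def softmax(zs):
    """P(y = i) = e^{z_i} / sum_k e^{z_k}, shifted by max(zs) for numerical stability."""
    m = max(zs)
    exps = [math.exp(z - m) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]

def number_of_undirected_graphs(n):
    """Each of the n(n-1)/2 possible ties is present or absent."""
    return 2 ** (n * (n - 1) // 2)

probs = softmax([0.2, 1.0, -0.5])
sizes = {n: number_of_undirected_graphs(n) for n in (3, 4, 10)}
```

Already at n = 10 there are 2^45 undirected networks, which is why the normalizing quantity cannot be computed by enumeration in practice.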
Dependence assumptions and models • Bernoulli graphs • Dyadic models • Markov random graphs • Node-level variables
Bernoulli graphs • assumption: edges are independent • a fixed probability p • configuration: • single edges $\{Y_{ij}\}$ • general model: $P(Y=y) = \frac{1}{\kappa}\exp\Big(\sum_{i,j}\eta_{ij}y_{ij}\Big)$ • each single possible edge $Y_{ij}$ is a configuration • statistics $g_A(y) = g_{ij}(y) = y_{ij}$: whether the configuration is observed or not
Bernoulli graphs • assumption: edges are independent • a fixed probability p • configuration: • single edges $\{Y_{ij}\}$ • with homogeneity ($\eta_{ij} = \theta$): $P(Y=y) = \frac{1}{\kappa}\exp(\theta L(y))$ • where $L(y) = \sum_{i,j} y_{ij}$: number of arcs in the graph y • $\theta$: edge or density parameter • $p = \frac{e^{\theta}}{1+e^{\theta}}$
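The relation $p = e^{\theta}/(1+e^{\theta})$ can be checked numerically on a tiny directed graph by summing the model over all possible networks. This is a sketch under the assumption of a homogeneous Bernoulli model; `arc_marginal` and the value of theta are illustrative.

```python
import itertools
import math

def arc_marginal(n, theta, arc=(0, 1)):
    """Brute-force P(Y_arc = 1) under P(y) = exp(theta * L(y)) / kappa."""
    arcs = [(i, j) for i in range(n) for j in range(n) if i != j]
    idx = arcs.index(arc)
    kappa = 0.0    # normalizing quantity
    present = 0.0  # total weight of graphs in which the arc is present
    for y in itertools.product([0, 1], repeat=len(arcs)):
        w = math.exp(theta * sum(y))  # sum(y) is L(y), the number of arcs
        kappa += w
        present += w * y[idx]
    return present / kappa

theta = -0.8  # hypothetical density parameter
p_closed_form = math.exp(theta) / (1 + math.exp(theta))
p_brute_force = arc_marginal(3, theta)
```

The two values agree up to floating-point error, reflecting that each arc is an independent Bernoulli trial with the same probability.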
actors in two a priori blocks • block homogeneity • $\eta_{ij} = \theta_{11}$ if both i and j are in block 1 • $\eta_{ij} = \theta_{12}$ if i is in block 1 and j in block 2 • and so on • $P(Y=y) = \frac{1}{\kappa}\exp\big(\theta_{11}L_{11}(y) + \theta_{21}L_{21}(y) + \theta_{12}L_{12}(y) + \theta_{22}L_{22}(y)\big)$ • where • $L_{11}(y)$: # arcs within the first block • $L_{12}(y)$: # arcs from block 1 to block 2
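The block statistics are plain arc counts grouped by the blocks of sender and receiver. A minimal sketch, with hypothetical actor labels, block assignment, and parameter values:

```python
from collections import Counter

def block_arc_counts(arcs, block):
    """Return Counter {(r, s): number of arcs from block r to block s}."""
    return Counter((block[i], block[j]) for i, j in arcs)

def log_weight(theta, counts):
    """Exponent of the block model: sum over (r, s) of theta_rs * L_rs(y)."""
    return sum(theta[rs] * counts[rs] for rs in theta)

block = {"a": 1, "b": 1, "c": 2, "d": 2}
arcs = [("a", "b"), ("b", "a"), ("a", "c"), ("c", "d"), ("d", "a")]
L = block_arc_counts(arcs, block)

# Hypothetical parameters: denser within blocks than between them.
theta = {(1, 1): 0.5, (1, 2): -1.0, (2, 1): -1.0, (2, 2): 0.5}
exponent = log_weight(theta, L)
```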
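Dyadic models • directed networks • dyads are the configurations • single edges • reciprocated edges • $P(Y=y) = \frac{1}{\kappa}\exp\Big(\theta\sum_{i,j}y_{ij} + \rho\sum_{i<j}y_{ij}y_{ji}\Big) = \frac{1}{\kappa}\exp\big(\theta L(y) + \rho M(y)\big)$ • where • $L(y)$: # ties in y • $M(y) = \sum_{i<j}y_{ij}y_{ji}$: # mutual ties in y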
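A sketch of the two statistics and of the per-dyad probabilities the model implies. Because dyads are independent here, each dyad is null, asymmetric (in either direction), or mutual with weights 1, $e^{\theta}$ each, and $e^{2\theta+\rho}$. The symbol $\rho$ for the reciprocity parameter and all names below are assumptions of this example.

```python
import math

def count_arcs_and_mutuals(arcs):
    """Return (L, M): number of arcs and number of reciprocated dyads."""
    arcs = set(arcs)
    L = len(arcs)
    M = sum(1 for i, j in arcs if i < j and (j, i) in arcs)
    return L, M

def dyad_state_probs(theta, rho):
    """Probabilities of a dyad being null, asymmetric (either direction), or mutual."""
    weights = {
        "null": 1.0,
        "asymmetric": 2.0 * math.exp(theta),  # two possible directions
        "mutual": math.exp(2.0 * theta + rho),
    }
    kappa = sum(weights.values())
    return {state: w / kappa for state, w in weights.items()}

L, M = count_arcs_and_mutuals([(0, 1), (1, 0), (1, 2), (2, 0)])
probs = dyad_state_probs(theta=-1.0, rho=2.0)
```

Setting the reciprocity parameter to zero recovers the Bernoulli model: the mutual probability then equals the square of the single-arc probability.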
p2 model • Lazega and van Duijn 1997 • van Duijn et al. 2004 • dyadic independence conditional on node-level attribute effects • more realistic • when attribute effects are strong
Markov random graphs • Bernoulli and dyadic dependence structures are unrealistic • empirically and theoretically • a possible tie from i to j is assumed to be contingent on any other possible tie involving i or j • the two ties are conditionally dependent, given the values of all other ties • Markov dependence • two possible network ties are conditionally dependent when they have a common actor
Example • e.g., the relationship between Peter and Mary • may be dependent on • the presence or absence of a relationship • between Mary and John • conditional dependence between the possible ties $Y_{pm}$ and $Y_{mj}$ • these two possible ties are conditionally dependent because they share the node m (Mary)
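The Markov dependence rule reduces to a simple test on pairs of possible ties: do they have an actor in common? A minimal sketch; the function name is hypothetical, and "Ann" is an extra actor invented for the example.

```python
def conditionally_dependent(tie_a, tie_b):
    """Markov dependence: two distinct possible ties that share at least one actor."""
    return tie_a != tie_b and bool(set(tie_a) & set(tie_b))

y_pm = ("Peter", "Mary")
y_mj = ("Mary", "John")
y_ja = ("John", "Ann")  # "Ann" is a hypothetical fourth actor
```

Here `y_pm` and `y_mj` are conditionally dependent because both involve Mary, while `y_pm` and `y_ja` share no actor and are conditionally independent under the Markov assumption.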
Assumption: homogeneity • configurations with parameters • directed and non-directed networks • Fig 1 of RPKL • correspond to well-known structural regularities in the network literature
Directed networks: • edge (τ15) and reciprocity (τ11) parameters from the Bernoulli and dyadic independence models • various two-star effects: • the two-out-star parameter (τ12) relates to expansiveness • the two-mixed-star parameter (τ13) relates to two-paths • the two-in-star parameter (τ14) relates to popularity • important transitivity and cyclic configurations (τ9 and τ10)
undirected network • $P(Y=y) = \frac{1}{\kappa}\exp\big(\theta L(y) + \sigma_2 S_2(y) + \sigma_3 S_3(y) + \tau T(y)\big)$ • where • $S_2(y)$ and $S_3(y)$ are the numbers of two-stars and three-stars in y • $T(y)$ is the number of triangles in y • stars higher than three are ignored • the statistics are related to each other • some are higher-order to others • e.g., a three-star centered on node i in a nondirected network contains three two-stars (and three edges) also centered on i
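The four statistics of this undirected Markov model can be computed directly from degrees and node triples. The sketch below uses the fact that a node of degree d is the center of C(d, 2) two-stars and C(d, 3) three-stars; the function name and the toy graphs are illustrative.

```python
import itertools
from math import comb

def markov_statistics(n, edges):
    """Return (L, S2, S3, T): edges, two-stars, three-stars, triangles."""
    edges = {tuple(sorted(e)) for e in edges}
    degree = [0] * n
    for i, j in edges:
        degree[i] += 1
        degree[j] += 1
    L = len(edges)
    S2 = sum(comb(d, 2) for d in degree)  # two-stars centered on each node
    S3 = sum(comb(d, 3) for d in degree)  # three-stars centered on each node
    T = sum(1 for a, b, c in itertools.combinations(range(n), 3)
            if {(a, b), (a, c), (b, c)} <= edges)
    return L, S2, S3, T

star = markov_statistics(4, [(0, 1), (0, 2), (0, 3)])
complete = markov_statistics(4, itertools.combinations(range(4), 2))
```

The single three-star centered on node 0 yields one three-star, three two-stars, and three edges, which illustrates the nesting of lower-order configurations inside higher-order ones.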