Multilevel Models in Public Policy Research. Brandon Bartels, GWU Department of Political Science, bartels@gwu.edu
Introduction • Exciting methodological toolkit • Multilevel modeling is not monolithic • There are lots of different types of model specifications that fall under the umbrella. • Various specifications carry different substantive interpretations.
Outline I. Multilevel and hierarchical data II. Motivation and Core Issues III. Modeling approaches IV. Statistical specifications – what you can do with these models V. Applications
Multilevel Data • Contain multiple levels of analysis, with each level consisting of distinct units of analysis. • Most common form of multilevel data: hierarchical data. • Two-level structure: Units from the lowest level of analysis (level-1 units) are nested within units from a higher level of analysis (level-2 units) • Data are “clustered” • Level-2 units are referred to as “clusters” • Three-level structure: Third level is present
Multilevel Data • Examples • Education: students (level-1 units) nested within schools (level-2 units) • Three levels: students nested within schools nested within states • Individuals nested within cities • Voters nested within congressional districts • Voters nested within time (or temporal contexts) • Panel data and time-series cross-sectional (TSCS) data
Multilevel Data • X1 and X2 are level-1 variables • X3 and X4 are level-2 variables. • Balanced data: cluster sizes are equal
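The data layout shown on this slide did not survive in the transcript. As a rough sketch, a two-level dataset in long format might look like the following; the variable names and values are hypothetical, and the helper functions are illustrative only.

```python
from collections import Counter

# Hypothetical long-format data: one row per level-1 unit (student).
# x1 varies within schools (level 1); x3 is constant within a school (level 2).
rows = [
    {"school": "A", "student": 1, "x1": 3.2, "x3": 0},
    {"school": "A", "student": 2, "x1": 4.1, "x3": 0},
    {"school": "B", "student": 1, "x1": 2.7, "x3": 1},
    {"school": "B", "student": 2, "x1": 3.9, "x3": 1},
]

def cluster_sizes(data, cluster_key):
    """Number of level-1 units in each cluster."""
    return Counter(row[cluster_key] for row in data)

def is_balanced(data, cluster_key):
    """Balanced data: every cluster has the same size."""
    return len(set(cluster_sizes(data, cluster_key).values())) == 1

def is_level2(data, cluster_key, var):
    """A level-2 variable takes a single value within every cluster."""
    seen = {}
    for row in data:
        seen.setdefault(row[cluster_key], set()).add(row[var])
    return all(len(vals) == 1 for vals in seen.values())
```

Here `is_level2(rows, "school", "x3")` holds because `x3` never varies inside a school, while `x1` does.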
Multilevel Data • Multilevel data that are non-hierarchical: cross-classified data. • Lower-level units are cross-classified: they belong to two or more higher-level units that are themselves non-nested.
Motivation • Types of phenomena we’re interested in are multilayered and complex. • Incorporating these layers enhances our substantive explanations of phenomena. • People don’t make choices or behave in a vacuum; there’s a context in which they act. • This contextual, or situational, variation may have consequences for how people behave. • Most simple cross-sectional analyses ignore this structure (“naïve pooling”).
Motivation • Parsing explained variance in the DV between individual versus aggregate levels of analysis. • Student versus school effects on performance.
Core Issues • Unobserved heterogeneity (UH) in the response • Between-cluster variation in the response (i.e., the DV) that is not accounted for by observed heterogeneity (i.e., measured IVs). • Unobserved factors specific to each cluster that influence the outcome; factors are shared by observations within each cluster. • UH represents conditional differences between clusters (conditional on observed heterogeneity). • Multilevel models separate the error term into a within-cluster (level 1) and between-cluster (level 2) error. • UH in a cross-sectional context: $y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \varepsilon_i$
Core Issues • Pooling • Degree to which parameters (e.g., intercept, effects of IVs) are “pulled” toward the pooled (global) effect or reflect within-cluster variation. • Spectrum: No Pooling ---- Partial Pooling ---- Complete Pooling • Distinguish within-cluster, between-cluster, and total variation.
Three General Modeling Approaches 1. Complete Pooling: • Ignores clustering/hierarchical structure • Between-cluster UH unaccounted for • Doesn’t distinguish within- versus between-cluster variation • Generalization: global, pooled effect across all observations • Estimation technique: Plain-vanilla pooled regression (e.g., OLS) 2. No Pooling: • Effects are unpooled. • Between-cluster UH accounted for completely • Within-cluster variation is all that’s left. • Estimation technique: fixed-effects (within) estimator • Or…separate models for each cluster.
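The contrast between complete pooling and no pooling can be sketched with simulated data. Everything below is hypothetical: the data are generated so that the level-1 variable is correlated with the unobserved cluster effect, which is the situation in which the pooled and within estimates diverge. Only the standard library is used, and a single predictor keeps the algebra to a simple regression slope.

```python
import random

def slope(xs, ys):
    """OLS slope from a simple regression of ys on xs (with intercept)."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    return sxy / sxx

def demean_by_cluster(values, clusters):
    """Subtract each cluster's mean (the 'within' transformation)."""
    sums, counts = {}, {}
    for v, c in zip(values, clusters):
        sums[c] = sums.get(c, 0.0) + v
        counts[c] = counts.get(c, 0) + 1
    return [v - sums[c] / counts[c] for v, c in zip(values, clusters)]

# Simulated data; all parameter values are assumptions for illustration.
rng = random.Random(0)
TRUE_SLOPE = 1.0
x, y, g = [], [], []
for j in range(50):                      # 50 clusters
    zeta = rng.gauss(0, 2)               # unobserved cluster effect
    for _ in range(20):                  # 20 units per cluster
        xi = zeta + rng.gauss(0, 1)      # x is correlated with the cluster effect
        x.append(xi)
        y.append(TRUE_SLOPE * xi + zeta + rng.gauss(0, 1))
        g.append(j)

# Complete pooling: ignore clusters entirely.
b_pooled = slope(x, y)
# No pooling: the fixed-effects (within) estimator uses only within-cluster variation.
b_within = slope(demean_by_cluster(x, g), demean_by_cluster(y, g))
```

In this setup the within estimate should sit near the true slope, while the pooled estimate absorbs the between-cluster association and overshoots it.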
Three General Modeling Approaches 3. Partial Pooling: • Weighted average between the no-pooling and complete-pooling extremes. • Borrows information from the completely pooled effects to generate refined estimates of within-cluster effects (most helpful when cluster sizes are small and within-cluster estimates are noisy) • Estimation technique: random intercept (aka random effects) model; random coefficient model • What most people think of when they talk about a “multilevel model.”
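The "weighted average" idea is easiest to see for cluster means in a model with no predictors. The sketch below shrinks a cluster's sample mean toward the grand mean, treating the two variance components as known; the function name and all numbers are hypothetical.

```python
def partial_pool_mean(cluster_mean, n, grand_mean, psi, theta):
    """Partially pooled estimate of a cluster mean.

    psi:   between-cluster variance
    theta: within-cluster variance
    n:     number of level-1 units in the cluster
    """
    # Weight placed on the completely pooled (grand) mean.
    weight_pooled = theta / (theta + n * psi)
    return weight_pooled * grand_mean + (1 - weight_pooled) * cluster_mean

# Hypothetical values: a cluster averaging 60 when the grand mean is 50.
small_cluster = partial_pool_mean(60.0, n=2, grand_mean=50.0, psi=10.0, theta=40.0)
large_cluster = partial_pool_mean(60.0, n=100, grand_mean=50.0, psi=10.0, theta=40.0)
```

With only two observations the estimate is pulled most of the way toward the grand mean; with a hundred it stays close to the cluster's own mean.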
Model Specification Level-1 units indexed i = 1, 2, …, N. Level-2 units indexed j = 1, 2, …, J. [Level-1 equation] [Level-2 equation] Reduced form version: • $\zeta_j$ = unobserved heterogeneity (between-cluster) • Key specification decision: How we treat $\zeta_j$ is directly connected to the three approaches just discussed. • Complete pooling: $\zeta_j$ disappears from model; UH unaccounted for • No pooling: $\zeta_j$ treated as “fixed”; each cluster gets its own intercept • Fixed effects, “within” approach; UH completely accounted for. • Partial pooling: $\zeta_j$ treated as “random” • Random effects, or random intercept model
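The equation images for this slide are missing from the transcript. A conventional way to write a two-level model with one level-1 predictor is sketched below; the symbols $\gamma_{00}$, $\beta_1$, and the single predictor $x_{ij}$ are assumptions, not a recovery of the original slide.

```latex
\begin{aligned}
\text{Level 1:}\quad & y_{ij} = \beta_{0j} + \beta_1 x_{ij} + \varepsilon_{ij} \\
\text{Level 2:}\quad & \beta_{0j} = \gamma_{00} + \zeta_j \\
\text{Reduced form:}\quad & y_{ij} = \gamma_{00} + \beta_1 x_{ij} + \zeta_j + \varepsilon_{ij}
\end{aligned}
```

Under this notation, complete pooling drops $\zeta_j$, no pooling estimates a separate intercept for each cluster, and partial pooling treats $\zeta_j$ as a draw from a distribution with its own variance.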
Partial Pooling: Random Intercept Model Level-1 units indexed i = 1, 2, …, N. Level-2 units indexed j = 1, 2, …, J. N level-1 units nested within J level-2 units. [Level-1 equation] [Level-2 equation] Reduced form • Assumptions: • Errors normally distributed • No correlation between observed IVs and error terms • CONTROVERSIAL ASSUMPTION: $\mathrm{cov}(x_{ij}, \zeta_j) = 0$ • $\mathrm{Var}(\zeta_j) = \psi$: between-cluster error variance (UH). • $\mathrm{Var}(\varepsilon_{ij}) = \theta$: within-cluster error variance. • Intraclass correlation: $\rho = \psi / (\psi + \theta)$
Estimation of Linear Random Intercept Model • Can be estimated via GLS and ML; both yield similar results. • Foundation: Estimates of $\beta$ are a weighted average of the pooled and within estimates of $\beta$. Partial pooling of coefficients. What regulates this weighting? • Pooling factor $\omega$: • Recall: $\theta$ = within-cluster error variance; $\psi$ = between-cluster error variance • If $\omega = 0$, $\beta_{RI}$ reduces to $\beta_{Within}$ (FE). • If $\omega = 1$, $\beta_{RI}$ reduces to $\beta_{OLS}$. • Degree of pooling depends on how informative the within-cluster variation in the data is; the less informative, the more it borrows from the between-cluster variation in the data. • As cluster size (n) increases, there’s less pooling. • As $\theta$ decreases, there’s less pooling. • As $\psi$ increases (cluster differentiation), there’s less pooling.
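The pooling-factor formula itself is not in the transcript. One expression that reproduces every limiting case listed on this slide is the weight from the textbook GLS (quasi-demeaning) form of the random-effects estimator, sqrt(within variance / (within variance + n × between variance)); treating that as the slide's pooling factor is an assumption. The sketch below uses simulated, balanced data and takes the variance components as known rather than estimating them.

```python
import math
import random

def slope(xs, ys):
    """OLS slope from a simple regression of ys on xs (with intercept)."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    return sxy / sxx

def cluster_means(values, clusters):
    sums, counts = {}, {}
    for v, c in zip(values, clusters):
        sums[c] = sums.get(c, 0.0) + v
        counts[c] = counts.get(c, 0) + 1
    return {c: sums[c] / counts[c] for c in sums}

def pooling_weight(theta, psi, n):
    """Assumed pooling factor: 1 means pooled OLS, 0 means the within estimator."""
    return math.sqrt(theta / (theta + n * psi))

def ri_slope(x, y, g, theta, psi, n):
    """Random-intercept (GLS) slope for balanced data via quasi-demeaning."""
    lam = 1.0 - pooling_weight(theta, psi, n)   # share of the cluster mean removed
    mx, my = cluster_means(x, g), cluster_means(y, g)
    xs = [v - lam * mx[c] for v, c in zip(x, g)]
    ys = [v - lam * my[c] for v, c in zip(y, g)]
    return slope(xs, ys)

# Simulated balanced data; all parameter values are hypothetical.
rng = random.Random(1)
J, n = 40, 10
x, y, g = [], [], []
for j in range(J):
    zeta = rng.gauss(0, 1.5)
    for _ in range(n):
        xi = 0.5 * zeta + rng.gauss(0, 1)
        x.append(xi)
        y.append(2.0 * xi + zeta + rng.gauss(0, 1))
        g.append(j)

b_ols = ri_slope(x, y, g, theta=1.0, psi=0.0, n=n)     # weight = 1: pooled OLS
b_within = ri_slope(x, y, g, theta=0.0, psi=1.0, n=n)  # weight = 0: within (FE)
b_ri = ri_slope(x, y, g, theta=1.0, psi=2.25, n=n)     # in between
```

Raising `n` or `psi`, or lowering `theta`, shrinks the weight toward zero, which matches the "less pooling" statements above.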
Considerations for the Random Intercept Model How do we interpret effects from each approach? What do the pooled and RI approaches assume about the within- and between-cluster effects of a level-1 variable? They’re equal. Justifiable? Controversial assumption: no correlation between the random effect and X at level 1.
Applications: Things You Can Do • High School and Beyond Data (1982) • Nationally representative survey of U.S. public and Catholic high schools. • Subsample of the 1982 HSB data • Hierarchical structure: 7,185 students nested within 160 schools • DV: math achievement • Software: Stata, HLM, R, WinBUGS • Stata: • Continuous DVs: xtreg, xtmixed • Binary response: xtlogit, xtprobit, xtmelogit
Describing Data [Summary output, annotated: the DV, three level-1 variables, three level-2 variables]
Unconditional Random Intercept Model [Estimation output, annotated: between-school (level-2) error s.d.; within-school (level-1) error s.d.; between-school (level-2) error variance ($\psi$); within-school (level-1) error variance ($\theta$)]
Unconditional Random Intercept Model Intraclass Correlation Coefficient
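The estimation output for this slide is not in the transcript, so no actual estimates are reproduced here. As a small sketch, the intraclass correlation can be computed from the two variance components, or from the standard deviations that software often reports instead; the function names and numbers are hypothetical.

```python
def icc(psi, theta):
    """Intraclass correlation: share of total error variance that lies between clusters."""
    return psi / (psi + theta)

def icc_from_sd(sd_between, sd_within):
    """Same quantity, starting from error standard deviations."""
    return icc(sd_between ** 2, sd_within ** 2)

# Hypothetical variance components, for illustration only.
rho = icc(psi=2.0, theta=8.0)
rho_sd = icc_from_sd(sd_between=1.0, sd_within=3.0)
```

A value of 0.2 would mean that one fifth of the unexplained variation in the outcome lies between clusters rather than within them.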
Random Intercept Model with IVs [Estimation output, annotated: degree of UH; test of RI v. OLS]
Random Intercept Model with IVs [Estimation output, annotated: level-2 error variance ($\psi$); level-1 error variance ($\theta$)]
Cluster Confounding • Cluster confounding: When the within-cluster and between-cluster effects of a level-1 independent variable differ. • If they do differ, but you don’t account for that difference, then the effect you get confounds within-cluster and between-cluster variation into an “averaged” effect. • This issue is only applicable to level-1 variables • Level-2 variables only vary between clusters, not within. • This is the underpinning of the “controversial assumption.” • One can estimate separate within-cluster and between-cluster effects of a level-1 variable. • Doing so satisfies the controversial assumption.
Accounting for Cluster Confounding Think of it this way (Method 1): • What is the correlation now between the within-cluster $x_{ij}$ and $\zeta_j$? • Identical model, different interpretation (Method 2): • Can perform Hausman-like test for equality of between and within estimates. $\delta$ represents the difference between the within- and between-cluster effects. • This method is akin to the Hausman test for the equality of RE and FE estimates. • With this procedure, we’re testing the equality of the within and between effects directly; it’s more direct than the Hausman test.
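The equations for the two methods are missing from the transcript. A common way to separate the two effects, assumed here, is to split a level-1 variable into its cluster mean and the deviation from that mean, then give each part its own coefficient. The sketch uses simulated data in which the within and between effects are deliberately different; all values are hypothetical.

```python
import random

def slope(xs, ys):
    """OLS slope from a simple regression of ys on xs (with intercept)."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    return sxy / sxx

rng = random.Random(2)
B_WITHIN, B_BETWEEN = 1.0, 3.0           # assumed true effects
x_dev, x_bar, y, zeta_rows = [], [], [], []
for j in range(60):
    m_j = rng.gauss(0, 2)                # cluster-level location of x
    zeta = rng.gauss(0, 1)               # unobserved cluster effect
    xs_j = [m_j + rng.gauss(0, 1) for _ in range(15)]
    xbar_j = sum(xs_j) / len(xs_j)
    for xi in xs_j:
        x_dev.append(xi - xbar_j)        # within-cluster deviation
        x_bar.append(xbar_j)             # cluster mean, repeated on every row
        y.append(B_WITHIN * (xi - xbar_j) + B_BETWEEN * xbar_j
                 + zeta + rng.gauss(0, 1))
        zeta_rows.append(zeta)

# Deviations are orthogonal to anything constant within a cluster, so a joint
# regression on x_dev and x_bar gives the same slopes as these two simple fits.
b_within = slope(x_dev, y)
b_between = slope(x_bar, y)

# Equivalent reparameterization: regress y on the raw x and the cluster mean;
# the coefficient on the cluster mean is then this difference.
delta = b_between - b_within

# The deviation is uncorrelated with the cluster effect by construction.
dev_zeta_crossproduct = sum(d * z for d, z in zip(x_dev, zeta_rows))
```

Because the deviation sums to zero inside every cluster, its cross-product with any cluster-constant term is zero up to floating-point error, which is why the within part cannot violate the zero-correlation assumption.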
Accounting for Cluster Confounding [Estimation output, annotated: within effect; between effect]
Accounting for Cluster Confounding [Estimation output, annotated: within effect; difference between within and between effects ($\delta$)]
Causal Heterogeneity • Causal heterogeneity • When the relationship between X and Y varies across clusters • How higher-level variables shape lower-level relationships. • Method: Random coefficient model [Figure: Y plotted against X]
Random Coefficient Model with Cross-Level Interactions Causal heterogeneity, in addition to heterogeneity in the response. [Level-1 equation] [Level-2 equations]
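The equations on this slide are also missing from the transcript. A conventional random coefficient model with one level-1 predictor $x_{ij}$ and one level-2 predictor $w_j$ is sketched below; the $\gamma$ notation and the choice of a single predictor at each level are assumptions.

```latex
\begin{aligned}
\text{Level 1:}\quad & y_{ij} = \beta_{0j} + \beta_{1j} x_{ij} + \varepsilon_{ij} \\
\text{Level 2:}\quad & \beta_{0j} = \gamma_{00} + \gamma_{01} w_j + \zeta_{0j} \\
                     & \beta_{1j} = \gamma_{10} + \gamma_{11} w_j + \zeta_{1j} \\
\text{Reduced form:}\quad & y_{ij} = \gamma_{00} + \gamma_{01} w_j + \gamma_{10} x_{ij}
    + \gamma_{11} w_j x_{ij} + \zeta_{0j} + \zeta_{1j} x_{ij} + \varepsilon_{ij}
\end{aligned}
```

In this form $\gamma_{11}$ is the cross-level interaction: it describes how the level-2 variable shifts the slope of $x_{ij}$, while $\zeta_{1j}$ captures the remaining unexplained variation in that slope across clusters.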