Analyzing Regression-Discontinuity Designs with Multiple Assignment Variables: A Comparative Study

Analyzing Regression-Discontinuity Designs with Multiple Assignment Variables: A Comparative Study of Four Estimation Methods
Vivian C. Wong
Northwestern University

The Regression-Discontinuity Design (RDD): A Visual Depiction
[Figure: RDD plot showing the comparison and treatment groups, the counterfactual regression line, and the discontinuity (treatment effect) at the cutoff.]

Two Rationales for the Validity of the RDD
• Selection process is completely known and can be modeled through a regression line of the assignment and outcome variables
  • Untreated portion of the AV serves as a counterfactual
  • Use parametric approach to estimate treatment effects
• It is like an experiment around the cutoff
  • Observations just to the left and right of the cutoff are exchangeable
  • Use non-parametric approach to estimate treatment effects
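The second rationale can be illustrated with a small simulation. This is a minimal sketch, not the author's code: the helper name `local_linear_jump`, the simulated data, the bandwidth of 1.0, and the rule that units at or above the cutoff are treated are all assumptions made for illustration. Fitting the same regression to the full sample instead of a window around the cutoff would correspond to the parametric rationale.

```python
import numpy as np

def local_linear_jump(x, y, cutoff, bandwidth):
    """Fit separate lines on each side of the cutoff, using only units within
    `bandwidth` of it, and return the jump in the fitted outcome at the cutoff
    (right-hand limit minus left-hand limit)."""
    xc = x - cutoff
    keep = np.abs(xc) <= bandwidth
    xc, yk = xc[keep], y[keep]
    right = (xc >= 0).astype(float)
    # Columns: intercept, right-of-cutoff indicator, centered score, interaction
    design = np.column_stack([np.ones_like(xc), right, xc, right * xc])
    beta, *_ = np.linalg.lstsq(design, yk, rcond=None)
    return beta[1]

# Simulated sharp RD (hypothetical data): true effect of 2.0 at the cutoff
rng = np.random.default_rng(0)
n = 10_000
x = rng.uniform(-2.0, 2.0, n)
treated = x >= 0.0  # assumption: units at or above the cutoff are treated
y = 1.0 + 0.5 * x + 2.0 * treated + rng.normal(0.0, 1.0, n)

effect = local_linear_jump(x, y, cutoff=0.0, bandwidth=1.0)
print(f"estimated discontinuity: {effect:.2f}")
```

The estimate should land close to the simulated effect of 2.0; narrowing the bandwidth trades bias from misspecified functional form for variance.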

Required Assumptions for a Valid RD Design
Assumptions
• Probability of treatment receipt must be discontinuous at the cutoff
• No discontinuity in potential outcomes at the cutoff (often referred to as the “continuity restriction”)
Threats to Design Assumptions
• Overrides to the cutoff (“fuzzy” discontinuity)
  • Can be addressed by using the assignment variable and cutoff as an instrumental variable for treatment receipt
• Manipulation of the assignment scores
  • No solution, but the data can be probed to assess whether manipulation occurred

RD in Recent Education Evaluation Studies
• Class size (Angrist & Lavy, 1999)
• State pre-kindergartens (Gormley & Phillips, 2005; Wong, Cook, Barnett, & Jung, 2008)
• Head Start (Ludwig & Miller, 2006)

Regression-Discontinuity Designs with Multiple Assignment Variables

Distribution of Units in an RD with Two Assignment Variables: A Visual Depiction

Multivariate RDD with Two Assignment Variables: A Visual Depiction

Multivariate RDD with Two Assignment Variables: Treatment Effects Estimated (τR, τMRD, τM)

The Average Treatment Effect along the Cutoff Frontier (τMRD)
τMRD is the weighted average of the conditional expectations given the single frontiers FR and FM:
τMRD = wR·E[Gi | Ri ∈ FR] + wM·E[Gi | Mi ∈ FM] = wR·τR + wM·τM
where Gi is the size of the discontinuity at the R and M cutoff frontiers, and f(r, m) is the joint density function for assignment variables R and M.

Frontier-specific Effect (τR)
• g(r, m) is the treatment function for the R frontier along the M assignment variable, and fr(ri = rc, m) is the conditional density function for FR.
• To get the conditional expectation for FR, integrate the treatment function against the conditional density function along FR.
• No weights are needed because there is no pooling of treatment effects across FR and FM.
• The average treatment effect for the M frontier is calculated in a similar way, with the corresponding treatment and density functions.
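One way to write the integral described on this slide is sketched below. The limits of integration assume that FR is the set of points with r = rc and m at or above mc (and symmetrically for FM), which the slide does not state explicitly; the symbol f_m for the conditional density along FM is introduced here only for illustration.

```latex
\tau_R = E[G_i \mid R_i \in F_R]
       = \int_{m \ge m_c} g(r_c, m)\, f_r(r_i = r_c, m)\, dm
\qquad
\tau_M = E[G_i \mid M_i \in F_M]
       = \int_{r \ge r_c} g(r, m_c)\, f_m(r, m_i = m_c)\, dr
```

Because each conditional density integrates to one along its own frontier, each expression is already an average over that frontier, which is why no pooling weights appear.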

Treatment Weights for τMRD
• Weights wr and wm reflect the probabilities of observing a subject at the R- or M-frontiers.
• However, note that the weights are sensitive to the scaling and distribution of the assignment variables.
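A formalization consistent with this description, offered as an assumption rather than as the slide's own formula, takes each weight to be the share of the joint density mass that lies along that frontier, again assuming each frontier runs from the cutoff upward in the other variable:

```latex
w_R = \frac{\int_{m \ge m_c} f(r_c, m)\, dm}
           {\int_{m \ge m_c} f(r_c, m)\, dm + \int_{r \ge r_c} f(r, m_c)\, dr},
\qquad
w_M = 1 - w_R
```

Written this way, the scale sensitivity is visible directly: stretching R lowers the height of f at r = rc, which shrinks the numerator of w_R relative to the M-frontier term.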

Requirements for a Valid Multivariate RDD
Similar to the RD case with a single assignment mechanism:
• A discontinuity in the probability of treatment receipt across the frontier
• Continuity in the expectations of potential outcomes at FR and FM

Recent Education Examples of RDDs with Multiple Assignment Variables
• College financial aid offer (van der Klaauw, 2002; Kane, 2003)
• Remedial education (Jacob & Lefgren, 2004a)
• Teacher professional development (Jacob & Lefgren, 2004b)
• High school exit exams (Martorell, 2005; Papay et al., 2010)
• No Child Left Behind (Gill et al., 2007)

Estimating Treatment Effects: Four Proposed Approaches
• Frontier approach
• Centering approach
• Univariate approach
• IV approach

Frontier Approach
Estimates the discontinuity along both frontiers simultaneously and applies appropriate weights to obtain the overall effect.
First, estimate the treatment function, which is the average size of the discontinuity along the cutoff frontiers, using parametric, semi-parametric, or non-parametric approaches.
Second, estimate the joint density function, either with a bivariate kernel density estimator or by estimating conditional density functions for R and M separately for observations that lie within a narrow bandwidth around the frontier.
Third, numerically integrate the product of the treatment and joint density functions at the cutoff frontiers to obtain conditional expectations across both frontiers.
Fourth, apply appropriate treatment weights to each discontinuity frontier.
Estimates τMRD, τM, τR
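The steps above can be sketched in simplified form. This is an illustrative approximation, not the procedure used in the study: the treatment function is a linear parametric model, the kernel density estimate is replaced by the empirical distribution of units in a narrow band around each frontier, and treatment is assumed to be assigned when either score falls below its cutoff. The function name `frontier_effects`, the band width, and the simulated data are hypothetical.

```python
import numpy as np

def frontier_effects(r, m, y, r_cut=0.0, m_cut=0.0, band=0.1):
    """Simplified frontier approach; returns (tau_r, tau_m, tau_mrd)."""
    rc, mc = r - r_cut, m - m_cut
    t = ((rc < 0) | (mc < 0)).astype(float)  # assumed assignment rule

    # Step 1: parametric response surface with treatment interactions.
    design = np.column_stack([np.ones_like(rc), rc, mc, t, t * rc, t * mc])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)

    def g(rv, mv):
        # Estimated treatment function at centered scores (rv, mv)
        return beta[3] + beta[4] * rv + beta[5] * mv

    # Step 2: units in a narrow band around each frontier stand in for the
    # density along that frontier (a crude substitute for a kernel estimate).
    on_fr = (np.abs(rc) <= band) & (mc >= 0)
    on_fm = (np.abs(mc) <= band) & (rc >= 0)

    # Step 3: average the treatment function over each frontier.
    tau_r = g(0.0, mc[on_fr]).mean()
    tau_m = g(rc[on_fm], 0.0).mean()

    # Step 4: weights proportional to the mass observed near each frontier.
    w_r = on_fr.sum() / (on_fr.sum() + on_fm.sum())
    tau_mrd = w_r * tau_r + (1.0 - w_r) * tau_m
    return tau_r, tau_m, tau_mrd

# Hypothetical data with a treatment effect that varies along the frontiers
rng = np.random.default_rng(4)
n = 20_000
r = rng.normal(size=n)
m = rng.normal(size=n)
treated = (r < 0) | (m < 0)
unit_effect = 2.0 + 0.5 * r + 0.5 * m
y = 0.5 * r + 0.5 * m + unit_effect * treated + rng.normal(size=n)

tau_r, tau_m, tau_mrd = frontier_effects(r, m, y)
print(f"tau_R={tau_r:.2f}  tau_M={tau_m:.2f}  tau_MRD={tau_mrd:.2f}")
```

In this simulated setup both frontier effects equal about 2.4 in expectation, so all three estimates should land near that value. The sketch only recovers them because the fitted surface matches the data-generating model, which mirrors the requirement that the treatment function be correctly modeled.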

Centering Approach
This procedure addresses the “curse of dimensionality” by collapsing the multiple assignment scores for unit i into a single assignment variable.
First, for each unit i, center assignment variables r and m on their respective cutoffs, that is, ri – rc and mi – mc.
Second, choose the minimum centered value, zi = min(ri – rc, mi – mc), as the unit’s sole assignment score.
Third, pool units and analyze as a standard RD design with z as the single assignment variable.
Estimates τMRD
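The three steps can be sketched as follows. The simulated data, the bandwidth, and the helper names `centered_score` and `local_linear_jump` are assumptions for illustration; the sketch also assumes that a unit is treated when either score falls below its cutoff, which is what makes the minimum centered score the relevant one.

```python
import numpy as np

def local_linear_jump(x, y, cutoff, bandwidth):
    """Jump in the fitted outcome at the cutoff (right limit minus left limit),
    from separate linear fits on each side within the bandwidth."""
    xc = x - cutoff
    keep = np.abs(xc) <= bandwidth
    xc, yk = xc[keep], y[keep]
    right = (xc >= 0).astype(float)
    design = np.column_stack([np.ones_like(xc), right, xc, right * xc])
    beta, *_ = np.linalg.lstsq(design, yk, rcond=None)
    return beta[1]

def centered_score(r, m, r_cut, m_cut):
    """Steps 1 and 2: center each score on its cutoff and keep the minimum."""
    return np.minimum(r - r_cut, m - m_cut)

# Hypothetical data: constant treatment effect of 3.0
rng = np.random.default_rng(1)
n = 20_000
r = rng.normal(0.0, 1.0, n)
m = rng.normal(0.0, 1.0, n)
r_cut, m_cut = 0.0, 0.0
treated = (r < r_cut) | (m < m_cut)  # assumed assignment rule
y = 0.5 * r + 0.5 * m + 3.0 * treated + rng.normal(0.0, 1.0, n)

# Step 3: standard RD on the pooled sample with z as the assignment variable.
z = centered_score(r, m, r_cut, m_cut)
# Treated units sit to the left of z = 0, so flip the right-minus-left jump.
tau_mrd = -local_linear_jump(z, y, cutoff=0.0, bandwidth=0.5)
print(f"tau_MRD estimate: {tau_mrd:.2f}")
```

Under the assumed assignment rule, a unit is treated exactly when z is negative, so the pooled analysis reduces to a sharp RD at z = 0.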

Univariate Approach
Addresses the dimensionality problem by estimating treatment effects for each frontier separately.
First, choose a single assignment variable (say, m) and cutoff (mc), and exclude all observations with r values less than the respective cutoff (rc).
Second, estimate treatment effects by measuring the size of the discontinuity in the conditional outcomes at the cutoff for the designated assignment variable, using parametric, semi-parametric, or non-parametric approaches.
Estimates τR or τM
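A minimal sketch of the two steps, under the assumption that treatment is assigned when either score falls below its cutoff. The simulated data, bandwidth, and helper name `local_linear_jump` are hypothetical and chosen only for illustration.

```python
import numpy as np

def local_linear_jump(x, y, cutoff, bandwidth):
    """Jump in the fitted outcome at the cutoff (right limit minus left limit),
    from separate linear fits on each side within the bandwidth."""
    xc = x - cutoff
    keep = np.abs(xc) <= bandwidth
    xc, yk = xc[keep], y[keep]
    right = (xc >= 0).astype(float)
    design = np.column_stack([np.ones_like(xc), right, xc, right * xc])
    beta, *_ = np.linalg.lstsq(design, yk, rcond=None)
    return beta[1]

# Hypothetical data: constant treatment effect of 3.0
rng = np.random.default_rng(2)
n = 20_000
r = rng.normal(0.0, 1.0, n)
m = rng.normal(0.0, 1.0, n)
r_cut, m_cut = 0.0, 0.0
treated = (r < r_cut) | (m < m_cut)  # assumed assignment rule
y = 0.5 * r + 0.5 * m + 3.0 * treated + rng.normal(0.0, 1.0, n)

# Step 1: drop units assigned to treatment through R, so that treatment among
# the remaining units is determined by M alone.
keep = r >= r_cut

# Step 2: standard RD in m at m_cut on the remaining units. Treated units lie
# to the left of the cutoff, so flip the right-minus-left jump.
tau_m = -local_linear_jump(m[keep], y[keep], cutoff=m_cut, bandwidth=0.5)
print(f"tau_M estimate: {tau_m:.2f}")
```

Swapping the roles of r and m gives the corresponding estimate of τR.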

IV Approach (1)
Rather than exclude observations assigned to treatment by alternative mechanisms, designate these cases as “fuzzy” units.
First, designate a treatment assignment mechanism that serves as the instrument for treatment receipt.
Second, estimate the local average treatment effect, which is the difference in conditional mean outcomes between the treatment and comparison groups divided by the difference in treatment receipt rates between the two groups, within a neighborhood around the cutoff.
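The ratio described in the second step can be sketched as a Wald-type estimator. In this illustration, assignment through M is the instrument, and both the outcome jump and the treatment-receipt jump are estimated with local linear fits rather than raw differences in means; that substitution, the simulated data, the bandwidth, and the helper name `local_linear_jump` are assumptions, as is the rule that a unit receives treatment when either score falls below its cutoff.

```python
import numpy as np

def local_linear_jump(x, y, cutoff, bandwidth):
    """Jump in the fitted outcome at the cutoff (right limit minus left limit),
    from separate linear fits on each side within the bandwidth."""
    xc = x - cutoff
    keep = np.abs(xc) <= bandwidth
    xc, yk = xc[keep], y[keep]
    right = (xc >= 0).astype(float)
    design = np.column_stack([np.ones_like(xc), right, xc, right * xc])
    beta, *_ = np.linalg.lstsq(design, yk, rcond=None)
    return beta[1]

# Hypothetical data: constant treatment effect of 3.0
rng = np.random.default_rng(3)
n = 20_000
r = rng.normal(0.0, 1.0, n)
m = rng.normal(0.0, 1.0, n)
r_cut, m_cut = 0.0, 0.0
received = ((r < r_cut) | (m < m_cut)).astype(float)  # assumed assignment rule
y = 0.5 * r + 0.5 * m + 3.0 * received + rng.normal(0.0, 1.0, n)

# All units are kept. Units above m_cut that are treated through R make the
# design "fuzzy" with respect to the M cutoff.
jump_y = local_linear_jump(m, y, cutoff=m_cut, bandwidth=0.5)         # outcome jump
jump_t = local_linear_jump(m, received, cutoff=m_cut, bandwidth=0.5)  # receipt jump
tau_m_iv = jump_y / jump_t
print(f"receipt jump: {jump_t:.2f}, tau_M-IV estimate: {tau_m_iv:.2f}")
```

Because only about half of the units just above the M cutoff are untreated in this setup, the receipt jump is roughly 0.5 in absolute value, and dividing by it inflates the sampling variability of the estimate relative to a sharp design.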

IV Approach (2)
[Figure: two panels, “Continuous potential outcomes” and “Discontinuous potential outcomes.”]
Estimates the local average treatment effect along the R cutoff. Requires continuous potential outcomes.
Estimates τR-IV or τM-IV

Monte Carlo Study
Wong, Steiner, and Cook (2010) examine the performance of the four approaches when the following factors are varied:
• Complexity of the true response surface
• Distribution and scale of the assignment variables
• Methodological approach (frontier, centering, univariate, and IV) for analyzing MRDDs
Simulations are based on 500 replications, with a sample size of 5,000 for each replication.
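The structure of such a simulation can be sketched as a loop over replications that records the bias and spread of an estimator. Only the replication count and sample size come from the slide; the data-generating process, the choice of the univariate estimator, the bandwidth, and all function names are hypothetical and much simpler than the conditions examined in the study.

```python
import numpy as np

def local_linear_jump(x, y, cutoff, bandwidth):
    """Jump in the fitted outcome at the cutoff (right limit minus left limit),
    from separate linear fits on each side within the bandwidth."""
    xc = x - cutoff
    keep = np.abs(xc) <= bandwidth
    xc, yk = xc[keep], y[keep]
    right = (xc >= 0).astype(float)
    design = np.column_stack([np.ones_like(xc), right, xc, right * xc])
    beta, *_ = np.linalg.lstsq(design, yk, rcond=None)
    return beta[1]

def simulate_once(rng, n, true_effect):
    """Draw one sample and return the univariate estimate of tau_M."""
    r = rng.normal(size=n)
    m = rng.normal(size=n)
    treated = (r < 0) | (m < 0)  # assumed assignment rule, cutoffs at zero
    y = 0.5 * r + 0.5 * m + true_effect * treated + rng.normal(size=n)
    keep = r >= 0
    return -local_linear_jump(m[keep], y[keep], cutoff=0.0, bandwidth=0.5)

def monte_carlo(n_reps=500, n=5000, true_effect=3.0, seed=0):
    """Return (bias, standard deviation) of the estimator across replications."""
    rng = np.random.default_rng(seed)
    estimates = np.array([simulate_once(rng, n, true_effect) for _ in range(n_reps)])
    return estimates.mean() - true_effect, estimates.std(ddof=1)

bias, sd = monte_carlo()
print(f"bias: {bias:.3f}, sd across replications: {sd:.3f}")
```

Comparing approaches would mean swapping the estimator inside `simulate_once` while holding the data-generating process fixed, then repeating across response surfaces and assignment-variable distributions.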

Results from Simulation Study (1)
• In general, all four approaches replicated the theoretical true effects when their analytic assumptions were met
• The frontier approach produced unbiased effects for τMRD, τR, and τM when the treatment function was correctly modeled
• The univariate approach produced unbiased effects for τR and τM when the functional form of the response function was correctly specified

Results from Simulation Study (2)
• The centering approach was prone to producing small but significant biases for τMRD
  • Pooling units from different frontiers increases heterogeneity in the outcome, which requires larger bandwidths for the nonparametric estimates and increases the complexity of the response function
• The IV approach produced biased effects when potential outcomes were discontinuous along either cutoff frontier
  • This is most likely to happen when the different cutoffs result in heterogeneous treatments
• The frontier approach produced the most efficient effect estimates and the IV approach the least efficient (but this need not always be the case)

Implications for Practice
• Which approach to use?
  • Use the univariate approach first to estimate τR and τM and to assess whether treatment effects are constant.
  • If treatment effects are constant, use the frontier or centering approach to estimate τMRD.
  • Use the frontier approach only if the functional form of the response surface is known.
  • For the centering approach, heterogeneity in the outcome may be reduced by using difference scores for the outcome.
  • The IV approach is not recommended because we do not know when the potential outcomes are discontinuous and because of its reduced efficiency.

Vivian C. Wong
Northwestern University
vivianw@northwestern.edu
Working draft of paper available at the Institute for Policy Research website (working paper WP-10-02): http://www.northwestern.edu/ipr/publications/workingpapers/wpabstracts10/wp1002.html

Extra Slides

MRDD with Two Assignment Variables R and M

Implications for Practice (2)
• Standardize or not standardize assignment variables?
• Scale-dependency of the joint density and weights at the cutoff frontier: rescaling R changes the ratio of weights, which is represented by the ratio of the two areas along the frontier.
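The scale dependency can be seen in a small numerical example. The sketch assumes independent normal assignment variables and takes each weight to be proportional to the joint density mass along its frontier, with each frontier running from the cutoff upward in the other variable; this weight definition and the function names are assumptions for illustration, not formulas taken from the slides.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of a normal variable at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

def normal_sf(x, mean, sd):
    """P(X >= x) for a normal variable."""
    return 0.5 * math.erfc((x - mean) / (sd * math.sqrt(2.0)))

def frontier_weights(r_cut, m_cut, r_sd, m_sd, r_mean=0.0, m_mean=0.0):
    """Weights (w_r, w_m) for independent normal R and M (assumed setup).

    Mass along F_R = density of R at r_cut times P(M >= m_cut); likewise F_M.
    """
    mass_r = normal_pdf(r_cut, r_mean, r_sd) * normal_sf(m_cut, m_mean, m_sd)
    mass_m = normal_pdf(m_cut, m_mean, m_sd) * normal_sf(r_cut, r_mean, r_sd)
    total = mass_r + mass_m
    return mass_r / total, mass_m / total

# Same scale for both variables: the frontiers receive equal weight.
w_same = frontier_weights(0.0, 0.0, r_sd=1.0, m_sd=1.0)

# Doubling the scale of R (for example, a change of units) halves the height
# of its density at the cutoff, so the R frontier loses weight.
w_rescaled = frontier_weights(0.0, 0.0, r_sd=2.0, m_sd=1.0)

print(w_same, w_rescaled)
```

Nothing about the units or their treatment status changes between the two calls; only the measurement scale of R does, yet the weights move from one half each to one third and two thirds.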

Implications for Practice (3)
• In MRD designs, the treatment contrast is limited to comparisons along the cutoff frontier
• But this may not be the treatment contrast of interest …
