Statistical Analysis of the Regression-Discontinuity Design

Analysis Requirements
C  O  X  O
C  O     O
• Pre-post
• Two-group
• Treatment-control (dummy-code)

Assumptions in the Analysis
• Cutoff criterion perfectly followed.
• Pre-post distribution is a polynomial or can be transformed to one.
• Comparison group has sufficient variance on pretest.
• Pretest distribution continuous.
• Program uniformly implemented.

The Curvilinearity Problem
If the true pre-post relationship is not linear and we fit parallel straight lines as the model, the result will be biased. And even if the lines aren’t parallel (interaction effect), the result will still be biased.
[Figure: posteff (axis 20 to 80) plotted against pre (axis 0 to 100)]
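The curvilinearity problem can be illustrated with a small simulation. The sketch below is not from the slides: it assumes a hypothetical cubic pre-post curve with no program effect at all, a cutoff of 50, noise-free data, and treatment assigned below the cutoff. All names and numbers are illustrative.

```python
import numpy as np

# Hypothetical setup: pretest on 0..100, cutoff at 50, and NO true program effect.
cutoff = 50.0
pre = np.linspace(0.0, 100.0, 201)
x = pre - cutoff                      # pretest centered on the cutoff
z = (pre < cutoff).astype(float)      # assumption: units below the cutoff are treated

# Assumed true pre-post relationship: one smooth cubic curve, no jump, no noise.
post = 50.0 + 0.5 * x + 0.0001 * x**3


def estimated_jump(extra_columns):
    """Fit by least squares and return the coefficient on z (the estimated jump)."""
    design = np.column_stack([np.ones_like(x), x, z] + extra_columns)
    coef, *_ = np.linalg.lstsq(design, post, rcond=None)
    return coef[2]


parallel_lines = estimated_jump([])                # straight, parallel lines
separate_lines = estimated_jump([x * z])           # straight lines with interaction
full_cubic = estimated_jump([x * z, x**2, x**2 * z, x**3, x**3 * z])

print(f"parallel lines: {parallel_lines:.3f}")
print(f"separate lines: {separate_lines:.3f}")
print(f"cubic model:    {full_cubic:.3f}")
```

Under these assumptions, both straight-line models report a jump of several points even though none exists, while the model that includes the cubic terms recovers a jump of essentially zero.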

Model Specification
• If you specify the model exactly, there is no bias.
• If you overspecify the model (add more terms than needed), the result is unbiased but inefficient.
• If you underspecify the model (omit one or more necessary terms), the result is biased.

For instance, if the true function is:
$$y_i = \beta_0 + \beta_1 X_i + \beta_2 Z_i$$
and we fit:
$$y_i = \beta_0 + \beta_1 X_i + \beta_2 Z_i + e_i$$
our model is exactly specified and we obtain an unbiased and efficient estimate.

On the other hand, if the true function is:
$$y_i = \beta_0 + \beta_1 X_i + \beta_2 Z_i$$
and we fit:
$$y_i = \beta_0 + \beta_1 X_i + \beta_2 Z_i + \beta_3 X_i Z_i + e_i$$
our model is overspecified; we included some unnecessary terms, and we obtain an inefficient estimate.

And finally, if the true function is:
$$y_i = \beta_0 + \beta_1 X_i + \beta_2 Z_i + \beta_3 X_i Z_i + \beta_4 Z_i^2$$
and we fit:
$$y_i = \beta_0 + \beta_1 X_i + \beta_2 Z_i + e_i$$
our model is underspecified; we excluded some necessary terms, and we obtain a biased estimate.
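The claim that overspecification costs efficiency but not bias can be checked by repeated sampling. The following is a rough sketch, not the author's procedure: the true function is the simple linear one, the coefficient values, sample size, noise level, and cutoff are all hypothetical, and the overspecified fit adds a linear interaction plus quadratic terms.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical design: 200 units, cutoff at 50, treatment below the cutoff.
n, cutoff, sigma = 200, 50.0, 5.0
true_b0, true_b1, true_b2 = 50.0, 0.8, 10.0   # assumed true coefficients

pre = np.linspace(0.0, 100.0, n)
x = pre - cutoff
z = (x < 0).astype(float)

exact = np.column_stack([np.ones(n), x, z])
over = np.column_stack([np.ones(n), x, z, x * z, x**2, x**2 * z])


def fit_z(design, y):
    """Return the least-squares coefficient on z (third column)."""
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[2]


exact_est, over_est = [], []
for _ in range(500):
    # True function is linear in x with a constant shift for the treated group.
    y = true_b0 + true_b1 * x + true_b2 * z + rng.normal(0.0, sigma, n)
    exact_est.append(fit_z(exact, y))
    over_est.append(fit_z(over, y))

exact_mean, exact_sd = np.mean(exact_est), np.std(exact_est)
over_mean, over_sd = np.mean(over_est), np.std(over_est)
print(f"exact model:         mean {exact_mean:.2f}, sd {exact_sd:.2f}")
print(f"overspecified model: mean {over_mean:.2f}, sd {over_sd:.2f}")
```

In this setup both averages should sit near the assumed effect of 10, while the spread of the overspecified estimates is wider, which is the inefficiency the slides describe.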

Overall Strategy
• Best option is to exactly specify the true function.
• We would prefer to err by overspecifying our model because that only leads to inefficiency.
• Therefore, start with a likely overspecified model and reduce it.

Steps in the Analysis
1. Transform pretest by subtracting the cutoff.
2. Examine the relationship visually.
3. Specify higher-order terms and interactions.
4. Estimate initial model.
5. Refine the model by eliminating unneeded higher-order terms.
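The steps can be sketched end to end in code (step 2, the visual check, is skipped here). This is only one possible reading of the procedure: the data are simulated, the variable names are hypothetical, and the rule "drop the highest-order remaining term while its |t| is below 2" is an assumption, since the slides do not state a numeric criterion.

```python
import numpy as np


def ols(design, y):
    """Return least-squares coefficients and their t-ratios."""
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    dof = design.shape[0] - design.shape[1]
    sigma2 = resid @ resid / dof
    cov = sigma2 * np.linalg.inv(design.T @ design)
    return coef, coef / np.sqrt(np.diag(cov))


# Hypothetical data: linear pre-post relationship with a 10-point program effect.
rng = np.random.default_rng(1)
n, cutoff = 1000, 50.0
pre = rng.uniform(0.0, 100.0, n)
x = pre - cutoff                          # step 1: subtract the cutoff
z = (pre < cutoff).astype(float)          # assumption: treated below the cutoff
y = 50.0 + 0.8 * x + 10.0 * z + rng.normal(0.0, 6.0, n)

# Step 3: start from a likely overspecified model (to the quadratic).
terms = {"const": np.ones(n), "x": x, "z": z,
         "x*z": x * z, "x^2": x**2, "x^2*z": x**2 * z}

# Steps 4-5: estimate, then remove unneeded higher-order terms, highest order first.
for name in ["x^2*z", "x^2", "x*z"]:
    names = list(terms)
    coef, t = ols(np.column_stack([terms[k] for k in names]), y)
    if abs(t[names.index(name)]) >= 2.0:  # assumed cutoff for "needed"
        break                             # keep this term and all lower-order ones
    del terms[name]

names = list(terms)
coef, t = ols(np.column_stack([terms[k] for k in names]), y)
effect = coef[names.index("z")]
print("retained terms:", names)
print(f"estimated effect at the cutoff: {effect:.2f}")
```

The intercept, the pretest, and the treatment dummy are never candidates for removal, so the estimate of the jump is always available from the final fit.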

Transform the Pretest
$$\tilde{X}_i = X_i - X_c$$
• Do this because we want to estimate the jump at the cutoff.
• When we subtract the cutoff from x, then x = 0 at the cutoff (the cutoff becomes the intercept).

Examine Relationship Visually
Count the number of flexion points (bends) across both groups. Here, there are no bends, so we can assume a linear relationship.
[Figure: posteff (axis 20 to 80) plotted against pre (axis 0 to 100)]

Specify the Initial Model
• The rule of thumb is to include polynomials up to order (number of flexion points) + 2.
• Here, there were no flexion points, so...
• Specify to 0 + 2 = 2 polynomials (i.e., to the quadratic).
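The rule of thumb is simple enough to write down as a helper. The function names and the term labels (`x` for the centered pretest, `z` for the treatment dummy) are hypothetical; only the "flexion points + 2" rule comes from the slide.

```python
def initial_polynomial_order(flexion_points: int) -> int:
    """Rule of thumb: highest polynomial order = number of flexion points + 2."""
    if flexion_points < 0:
        raise ValueError("flexion_points must be non-negative")
    return flexion_points + 2


def initial_terms(flexion_points: int) -> list[str]:
    """List the terms of the initial model, each power paired with its interaction."""
    order = initial_polynomial_order(flexion_points)
    terms = ["1", "x", "z", "x*z"]
    for power in range(2, order + 1):
        terms += [f"x^{power}", f"x^{power}*z"]
    return terms


print(initial_terms(0))
```

With no flexion points this lists the intercept, the linear pretest, the treatment dummy, the linear interaction, and the quadratic term with its interaction.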

The RD Analysis Model
$$y_i = \beta_0 + \beta_1 \tilde{X}_i + \beta_2 Z_i + \beta_3 \tilde{X}_i Z_i + \beta_4 \tilde{X}_i^2 + \beta_5 \tilde{X}_i^2 Z_i + e_i$$
where:
$y_i$ = outcome score for the ith unit
$\beta_0$ = coefficient for the intercept
$\beta_1$ = linear pretest coefficient
$\beta_2$ = mean difference for treatment
$\beta_3$ = linear interaction
$\beta_4$ = quadratic pretest coefficient
$\beta_5$ = quadratic interaction
$\tilde{X}_i$ = transformed pretest
$Z_i$ = dummy variable for treatment (0 = control, 1 = treatment)
$e_i$ = residual for the ith unit
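Evaluating the model at the cutoff shows why the coefficient on the treatment dummy is read as the program effect. A short derivation, assuming the residual has mean zero and using the transformed pretest, which equals 0 at the cutoff:

```latex
\begin{aligned}
E[y_i \mid \tilde{X}_i = 0,\ Z_i = 0] &= \beta_0 \\
E[y_i \mid \tilde{X}_i = 0,\ Z_i = 1] &= \beta_0 + \beta_2 \\
\text{jump at the cutoff} &= (\beta_0 + \beta_2) - \beta_0 = \beta_2
\end{aligned}
```

Every term containing the transformed pretest vanishes at the cutoff, so the interaction and quadratic coefficients affect the shape of the two lines but not the size of the jump.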

Data to Analyze
[Figure: posteff (axis 20 to 80) plotted against pre (axis 0 to 100)]

Initial (Full) Model
The regression equation is
posteff = 49.1 + 0.972*precut + 10.2*group - 0.236*linint - 0.00539*quad + 0.00276*quadint

Predictor   Coef        Stdev      t-ratio   p
Constant    49.1411     0.8964     54.82     0.000
precut      0.9716      0.1492     6.51      0.000
group       10.231      1.248      8.20      0.000
linint      -0.2363     0.2162     -1.09     0.275
quad        -0.005391   0.004994   -1.08     0.281
quadint     0.002757    0.007475   0.37      0.712

s = 6.643   R-sq = 47.7%   R-sq(adj) = 47.1%

Without Quadratic
The regression equation is
posteff = 49.8 + 0.824*precut + 9.89*group - 0.0196*linint

Predictor   Coef        Stdev      t-ratio   p
Constant    49.7508     0.6957     71.52     0.000
precut      0.82371     0.05889    13.99     0.000
group       9.8939      0.9528     10.38     0.000
linint      -0.01963    0.08284    -0.24     0.813

s = 6.639   R-sq = 47.5%   R-sq(adj) = 47.2%

Final Model
The regression equation is
posteff = 49.8 + 0.814*precut + 9.89*group

Predictor   Coef        Stdev      t-ratio   p
Constant    49.8421     0.5786     86.14     0.000
precut      0.81379     0.04138    19.67     0.000
group       9.8875      0.9515     10.39     0.000

s = 6.633   R-sq = 47.5%   R-sq(adj) = 47.3%
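The reported numbers for the program effect can be checked with a little arithmetic. The sketch below takes the `group` coefficient and its standard error from the final-model output; the 1.96 critical value is an assumption (a large-sample normal approximation), since the slides do not report a confidence interval.

```python
# Values copied from the final-model output for the `group` predictor.
coef_group = 9.8875
se_group = 0.9515

# The t-ratio is the coefficient divided by its standard error.
t_ratio = coef_group / se_group

# Assumption: large-sample normal critical value for an approximate 95% interval.
z_crit = 1.96
ci_low = coef_group - z_crit * se_group
ci_high = coef_group + z_crit * se_group

print(f"t-ratio = {t_ratio:.2f}")
print(f"approximate 95% CI = ({ci_low:.2f}, {ci_high:.2f})")
```

The computed t-ratio reproduces the 10.39 in the table, and the approximate interval runs from about 8.02 to 11.75 points.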

Final Fitted Model
[Figure: posteff (axis 20 to 80) plotted against pre (axis 0 to 100), with the final fitted model]
