Statistical Analysis of the Regression-Discontinuity Design
Analysis Requirements
C  O  X  O
C  O     O
• Pre-post
• Two-group
• Treatment-control (dummy-code)
Assumptions in the Analysis
• Cutoff criterion perfectly followed.
• Pre-post distribution is a polynomial or can be transformed to one.
• Comparison group has sufficient variance on pretest.
• Pretest distribution continuous.
• Program uniformly implemented.
The Curvilinearity Problem
[Figure: scatterplot of posteff (vertical axis, 20 to 80) against pre (horizontal axis, 0 to 100)]
If the true pre-post relationship is not linear and we fit parallel straight lines as the model, the result will be biased.
And even if the lines aren't parallel (interaction effect), the result will still be biased.
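The curvilinearity problem can be reproduced with a small noise-free sketch. Everything in it is hypothetical: a pretest running from 0 to 99, a cutoff of 30, units at or above the cutoff treated, and a curved pre-post relationship with no treatment effect at all. The `ols` helper is an illustrative stand-in for a statistics package. Because the true jump is zero, any nonzero group coefficient is pure specification bias.

```python
def ols(columns, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(columns)
    # Augmented matrix [X'X | X'y], solved by Gauss-Jordan elimination.
    a = [[sum(u * v for u, v in zip(columns[r], columns[c])) for c in range(k)]
         + [sum(u * v for u, v in zip(columns[r], y))]
         for r in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [v - factor * p for v, p in zip(a[r], a[col])]
    return [a[r][k] / a[r][r] for r in range(k)]


# Hypothetical design (assumption, not from the slides).
cutoff = 30
x = [pre - cutoff for pre in range(100)]        # pretest centered at the cutoff
z = [1.0 if xi >= 0 else 0.0 for xi in x]       # treatment dummy
xz = [xi * zi for xi, zi in zip(x, z)]
x2 = [xi ** 2 for xi in x]
x2z = [xi * zi for xi, zi in zip(x2, z)]
ones = [1.0] * len(x)

# True pre-post relationship: curved, with NO jump at the cutoff.
y = [40 + 0.8 * xi - 0.004 * xi ** 2 for xi in x]

# The coefficient on z (index 2) is the estimated treatment effect.
parallel = ols([ones, x, z], y)[2]                  # parallel straight lines
nonparallel = ols([ones, x, z, xz], y)[2]           # straight lines with interaction
quadratic = ols([ones, x, z, xz, x2, x2z], y)[2]    # curvature modeled

print(f"parallel lines:     {parallel:.2f}")
print(f"non-parallel lines: {nonparallel:.2f}")
print(f"quadratic model:    {quadratic:.2f}")
```

Under these assumptions both straight-line fits report a spurious effect, while the fit that includes the quadratic terms returns an estimate of essentially zero.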
Model Specification
• If you specify the model exactly, there is no bias.
• If you overspecify the model (add more terms than needed), the result is unbiased but inefficient.
• If you underspecify the model (omit one or more necessary terms), the result is biased.
Model Specification
For instance, if the true function is:
$y_i = \beta_0 + \beta_1 X_i + \beta_2 Z_i$
and we fit:
$y_i = \beta_0 + \beta_1 X_i + \beta_2 Z_i + e_i$
our model is exactly specified and we obtain an unbiased and efficient estimate.
Model Specification
On the other hand, if the true function is:
$y_i = \beta_0 + \beta_1 X_i + \beta_2 Z_i$
and we fit:
$y_i = \beta_0 + \beta_1 X_i + \beta_2 Z_i + \beta_3 X_i Z_i + e_i$
our model is overspecified; we included some unnecessary terms, and we obtain an inefficient estimate.
Model Specification
And finally, if the true function is:
$y_i = \beta_0 + \beta_1 X_i + \beta_2 Z_i + \beta_3 X_i Z_i + \beta_4 Z_i^2$
and we fit:
$y_i = \beta_0 + \beta_1 X_i + \beta_2 Z_i + e_i$
our model is underspecified; we excluded some necessary terms, and we obtain a biased estimate.
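The three cases (exact, overspecified, underspecified) can be compared with a small Monte Carlo sketch. All settings are assumptions made for illustration: the design (pretest 0 to 99, cutoff 30, units at or above the cutoff treated), a true effect of 10, the noise level, and the `ols` and `simulate` helpers. One adaptation is flagged loudly: since Z is a 0/1 dummy here, its square equals Z itself, so the underspecified case omits an interaction and a squared pretest term rather than a squared Z term.

```python
import random
import statistics


def ols(columns, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(columns)
    a = [[sum(u * v for u, v in zip(columns[r], columns[c])) for c in range(k)]
         + [sum(u * v for u, v in zip(columns[r], y))]
         for r in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [v - factor * p for v, p in zip(a[r], a[col])]
    return [a[r][k] / a[r][r] for r in range(k)]


# Hypothetical design and effect size (assumptions, not from the slides).
cutoff = 30
x = [pre - cutoff for pre in range(100)]
z = [1.0 if xi >= 0 else 0.0 for xi in x]
xz = [xi * zi for xi, zi in zip(x, z)]
ones = [1.0] * len(x)
TRUE_EFFECT = 10.0


def simulate(true_y, fit_columns, reps=300, noise_sd=5.0, seed=0):
    """Estimates of the Z coefficient from repeated fits to noisy copies of true_y."""
    rng = random.Random(seed)   # same seed, so every scenario sees the same noise
    estimates = []
    for _ in range(reps):
        y = [mu + rng.gauss(0, noise_sd) for mu in true_y]
        estimates.append(ols(fit_columns, y)[2])
    return estimates


linear_truth = [50 + 0.8 * xi + TRUE_EFFECT * zi for xi, zi in zip(x, z)]
curved_truth = [50 + 0.8 * xi + TRUE_EFFECT * zi + 0.3 * xi * zi + 0.002 * xi ** 2
                for xi, zi in zip(x, z)]

exact = simulate(linear_truth, [ones, x, z])        # fit matches the truth
over = simulate(linear_truth, [ones, x, z, xz])     # one unnecessary term
under = simulate(curved_truth, [ones, x, z])        # necessary terms omitted

for name, est in [("exact", exact), ("overspecified", over), ("underspecified", under)]:
    print(f"{name:15s} mean = {statistics.mean(est):6.2f}   sd = {statistics.stdev(est):5.2f}")
```

In this setup the exact and overspecified fits both average close to the true effect, the overspecified estimates are more spread out, and the underspecified fit is centered away from the true effect. The cutoff is placed off-center on purpose, since in a perfectly symmetric design the extra linear interaction costs very little precision.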
Overall Strategy
• Best option is to exactly specify the true function.
• We would prefer to err by overspecifying our model because that only leads to inefficiency.
• Therefore, start with a likely overspecified model and reduce it.
Steps in the Analysis
1. Transform pretest by subtracting the cutoff.
2. Examine the relationship visually.
3. Specify higher-order terms and interactions.
4. Estimate initial model.
5. Refine the model by eliminating unneeded higher-order terms.
Transform the Pretest
$\tilde{X}_i = X_i - X_c$
• Do this because we want to estimate the jump at the cutoff.
• When we subtract the cutoff $X_c$ from the pretest, $\tilde{X}_i = 0$ at the cutoff, so the cutoff becomes the point where the intercept is evaluated.
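A minimal sketch of why the subtraction matters, using two made-up regression lines and a made-up cutoff of 50 (none of these numbers come from the slides). When the lines are not parallel, the gap between them depends on where it is measured: the raw pretest puts the intercept at a score of 0, far from the cutoff, while the transformed pretest puts it at the cutoff itself.

```python
CUTOFF = 50  # hypothetical cutoff score


def control_line(pre):
    """Hypothetical control-group regression line on the raw pretest."""
    return 10 + 0.8 * pre


def treated_line(pre):
    """Hypothetical treatment-group regression line on the raw pretest."""
    return 25 + 0.6 * pre


def center(pre, cutoff=CUTOFF):
    """Transform a raw pretest score so that the cutoff maps to 0."""
    return pre - cutoff


# Gap between the groups where the raw pretest equals 0 (the untransformed intercept).
gap_at_raw_zero = treated_line(0) - control_line(0)

# Gap where the transformed pretest equals 0, i.e. at the cutoff.
gap_at_cutoff = treated_line(CUTOFF) - control_line(CUTOFF)

print("transformed score at the cutoff:", center(CUTOFF))
print("gap at raw pretest = 0:", gap_at_raw_zero)
print("gap at the cutoff:", gap_at_cutoff)
```

With these made-up lines the two gaps differ substantially, which is why the treatment coefficient is only interpretable as the jump at the cutoff after the pretest has been transformed.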
Examine Relationship Visually
[Figure: scatterplot of posteff against pre]
Count the number of flexion points (bends) across both groups. Here, there are no bends, so we can assume a linear relationship.
Specify the Initial Model
• The rule of thumb is to include polynomials to (number of flexion points) + 2.
• Here, there were no flexion points, so...
• Specify to 0 + 2 = 2 polynomials (i.e., to the quadratic).
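The rule of thumb can be written as a small helper that lists the terms of the initial model. The function name `rd_terms` is hypothetical, and the default variable names `precut` and `group` are assumptions chosen for illustration; only the rule itself (polynomial order = flexion points + 2, each power with its treatment interaction) comes from the slide.

```python
def rd_terms(flexion_points, pretest="precut", group="group"):
    """List the model terms up to polynomial order (flexion points + 2)."""
    order = flexion_points + 2
    # Linear part: transformed pretest, treatment dummy, and their interaction.
    terms = [pretest, group, f"{pretest}*{group}"]
    # Each higher power enters together with its interaction with treatment.
    for power in range(2, order + 1):
        terms.append(f"{pretest}^{power}")
        terms.append(f"{pretest}^{power}*{group}")
    return terms


print(rd_terms(0))  # no bends: up to the quadratic
print(rd_terms(1))  # one bend: up to the cubic
```

With no flexion points this yields five predictors (linear, treatment, linear interaction, quadratic, quadratic interaction); each additional bend adds one more power and its interaction.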
The RD Analysis Model
$y_i = \beta_0 + \beta_1 \tilde{X}_i + \beta_2 Z_i + \beta_3 \tilde{X}_i Z_i + \beta_4 \tilde{X}_i^2 + \beta_5 \tilde{X}_i^2 Z_i + e_i$
where:
$y_i$ = outcome score for the ith unit
$\beta_0$ = coefficient for the intercept
$\beta_1$ = linear pretest coefficient
$\beta_2$ = mean difference for treatment
$\beta_3$ = linear interaction
$\beta_4$ = quadratic pretest coefficient
$\beta_5$ = quadratic interaction
$\tilde{X}_i$ = transformed pretest
$Z_i$ = dummy variable for treatment (0 = control, 1 = treatment)
$e_i$ = residual for the ith unit
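As a rough sketch of how the model's regressors are built for one unit, the helper below returns the six values that multiply the intercept, linear, treatment, and interaction coefficients. The function names, the coefficient values, and the cutoff of 50 are all hypothetical. The example also shows that at the cutoff every pretest term vanishes, so the treated-minus-control difference is exactly the treatment coefficient.

```python
def rd_row(pre, treated, cutoff):
    """Regressors for one unit: [1, X, Z, X*Z, X^2, X^2*Z] with X = pre - cutoff."""
    x = pre - cutoff
    z = 1 if treated else 0
    return [1, x, z, x * z, x ** 2, x ** 2 * z]


def predict(beta, row):
    """Predicted outcome: sum of coefficient times regressor."""
    return sum(b * r for b, r in zip(beta, row))


# Hypothetical coefficients (intercept through quadratic interaction) and cutoff.
beta = [50.0, 0.9, 10.0, -0.2, -0.005, 0.003]
cutoff = 50

control_at_cutoff = predict(beta, rd_row(cutoff, False, cutoff))
treated_at_cutoff = predict(beta, rd_row(cutoff, True, cutoff))

print(rd_row(60, True, cutoff))                  # [1, 10, 1, 10, 100, 100]
print(treated_at_cutoff - control_at_cutoff)     # 10.0, the treatment coefficient
```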
Data to Analyze
[Figure: scatterplot of posteff against pre]
Initial (Full) Model
The regression equation is
posteff = 49.1 + 0.972*precut + 10.2*group - 0.236*linint - 0.00539*quad + 0.00276*quadint

Predictor        Coef      Stdev    t-ratio       p
Constant      49.1411     0.8964      54.82   0.000
precut         0.9716     0.1492       6.51   0.000
group          10.231      1.248       8.20   0.000
linint        -0.2363     0.2162      -1.09   0.275
quad        -0.005391   0.004994      -1.08   0.281
quadint      0.002757   0.007475       0.37   0.712

s = 6.643   R-sq = 47.7%   R-sq(adj) = 47.1%
Without Quadratic
The regression equation is
posteff = 49.8 + 0.824*precut + 9.89*group - 0.0196*linint

Predictor        Coef      Stdev    t-ratio       p
Constant      49.7508     0.6957      71.52   0.000
precut        0.82371    0.05889      13.99   0.000
group          9.8939     0.9528      10.38   0.000
linint       -0.01963    0.08284      -0.24   0.813

s = 6.639   R-sq = 47.5%   R-sq(adj) = 47.2%
Final Model
The regression equation is
posteff = 49.8 + 0.814*precut + 9.89*group

Predictor        Coef      Stdev    t-ratio       p
Constant      49.8421     0.5786      86.14   0.000
precut        0.81379    0.04138      19.67   0.000
group          9.8875     0.9515      10.39   0.000

s = 6.633   R-sq = 47.5%   R-sq(adj) = 47.3%
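To see what the model reduction does to the treatment estimate, the sketch below takes the `group` coefficient and its standard deviation from the three outputs above, recomputes the t-ratio, and adds an approximate 95% interval. The interval uses the normal critical value 1.96, which is an assumption: the sample size is not reported here, so the exact t critical value is unknown. The `summarize` helper is likewise just for illustration.

```python
# Group (treatment) coefficient and its standard deviation, copied from the three outputs.
reported = {
    "full model": (10.231, 1.248),
    "without quadratic": (9.8939, 0.9528),
    "final model": (9.8875, 0.9515),
}


def summarize(coef, se, crit=1.96):
    """Return the t-ratio and an approximate 95% interval (normal critical value)."""
    return coef / se, (coef - crit * se, coef + crit * se)


for name, (coef, se) in reported.items():
    t, (low, high) = summarize(coef, se)
    print(f"{name:18s} t = {t:5.2f}   approx. 95% CI = ({low:.2f}, {high:.2f})")
```

The recomputed t-ratios match the printed ones, and the estimate stays close to 10 across all three models while its standard deviation shrinks once the unneeded quadratic terms are dropped, which is the inefficiency of overspecification in concrete numbers.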
Final Fitted Model
[Figure: posteff against pre, showing the final fitted model]