Learn about Regression Discontinuity (R.D.), a statistical tool to analyze correlations between variables and detect jumps or changes due to quirks in law or nature. Examples include PSAT/NMSQT, School Class Size, Union Elections, U.S. House Elections, and Air Pollution and Home Values.
Regression Discontinuity 10/13/09
What is R.D.? • Regression--the econometric/statistical tool social scientists use to analyze multivariate correlations: $Y = \alpha + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_k X_k + e$, where Y is some sort of dependent variable, alpha’s a constant, the X’s are a bunch of independent variables, the betas are coefficients, and e is the error term.
Discontinuity Some sort of arbitrary jump/change thanks to a quirk in law or nature. We’re interested in the ones that make very similar people get very dissimilar results.
Discontinuity Examples • PSAT/NMSQT • Basically the top 16,000 test-takers get a scholarship. • A small difference in test score can mean a discontinuous jump in scholarship amount.
Discontinuity Examples • School Class Size • Maimonides’ Rule--No more than 40 kids in a class in Israel. • 40 kids in school means 40 kids per class. 41 kids means two classes with 20 and 21. (Angrist & Lavy, QJE 1999)
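A quick sketch of the arithmetic behind Maimonides’ Rule, assuming a school always opens the fewest classes that keep every class at or under the cap. The function name and the `cap` parameter are made up for illustration; the 40-student cap is the one from the slide.

```python
import math

def maimonides_class_size(enrollment, cap=40):
    """Average class size when a cohort of `enrollment` (> 0) students is split
    into the fewest classes that hold at most `cap` students each."""
    n_classes = math.ceil(enrollment / cap)
    return enrollment / n_classes

for enrollment in (39, 40, 41, 80, 81):
    print(enrollment, maimonides_class_size(enrollment))
```

One extra kid (40 to 41, or 80 to 81) drops the average class size from 40.0 to 20.5, or from 40.0 to 27.0. That sudden drop is the discontinuity.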
Discontinuity Examples • Union Elections • If workers want to unionize, the NLRB holds an election. 50% means the employer doesn’t have to recognize the union, and 50% + 1 means the employer is required to “bargain in good faith” with the union. (DiNardo & Lee, QJE 2004)
Discontinuity Examples • U.S. House Elections • Incumbency advantage. If you’re first past the post in the previous election, even by just one vote, you get a huge advantage in the next election. (David Lee, Journal of Econometrics 2007)
Discontinuity Examples • Air Pollution and Home Values • The Clean Air Act’s National Ambient Air Quality Standards say that if the geometric mean concentration of 5 pollutant particulates is 75 micrograms per cubic meter or greater, the county is classified as “non-attainment” and is subject to much more stringent regulation. (Ken Chay, Michael Greenstone, JPE 2005)
Combine the “R” and the “D” Run a regression based on a situation where you’ve got a discontinuity. Treat the above-the-cutoff and below-the-cutoff groups like the treatment and control groups from a randomized experiment.
Why are we doing this? Why do we have to look for quirks like this? Can’t we just control for whatever we want using OLS or some other line-fitting tool? Just get a bunch of people’s salaries and PSAT scores. PSAT scores are X, income is Y, run a regression in Stata, and we have causal inference, right? Higher test scores cause people to earn more later in life.
No. The statistical methods we use are based on a lot of assumptions. Importantly, the error term (which is really full of things we can’t measure, the unobservables) is supposed to be uncorrelated with the X’s and normally distributed. In reality, those conditions probably haven’t been met in any of the previous situations. For example, class size is probably correlated with some type of neighborhood quality. Please turn to your neighbor and discuss what is probably wrong with each of the previous 5 examples (PSAT, class size, union elections, House elections, air pollution).
No. • Higher PSAT kids might have higher ability. • Crowded classrooms might be in poorer schools. (Or special needs students might be in small classes.) • Unionized workers might work for certain types of firms. • Incumbent politicians might be better. They won before, didn’t they? • Pollution might be correlated with economic growth, which could increase home values.
Controlling for everything? Focus on the Israeli schools for a second. We can try to control for neighborhood poverty level. Does that solve the problem? No. If neighborhood poverty level (an observable) is correlated with the X of interest (class size), why would you think it’s safe to assume that the unobservables aren’t correlated with it too? Have you magically controlled for every single thing that’s correlated with the X of interest? Probably not. So let’s find a bandwidth in which these things are uncorrelated.
A Bandwidth of Randomness Test scores aren’t random, and neither is class size, nor air pollution. But is a kid in the 94.9th percentile really that different from the 95th percentile kid? Is a school with 40 kids that different from a school with 41? Right around the cutoff, there’s a good chance things are random.
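Here is a minimal simulation of that idea, on completely made-up data. The assumptions are all mine: an unobserved “ability” drives both the test score and the later outcome, the cutoff is a score of 60, and the true effect of the program is 1.0. Comparing everyone above the cutoff with everyone below mixes the program effect with ability; comparing only people within one point of the cutoff gets much closer.

```python
import random
from statistics import mean

random.seed(0)
CUTOFF = 60.0        # hypothetical qualifying score
TRUE_EFFECT = 1.0    # effect built into the fake data

rows = []
for _ in range(20000):
    ability = random.gauss(0, 1)                    # unobservable
    score = 50 + 10 * ability + random.gauss(0, 5)  # noisy test score
    treated = score >= CUTOFF
    outcome = 2 * ability + TRUE_EFFECT * treated + random.gauss(0, 1)
    rows.append((score, treated, outcome))

def diff_in_means(sample):
    """Mean outcome of the treated minus mean outcome of the untreated."""
    above = [y for _, d, y in sample if d]
    below = [y for _, d, y in sample if not d]
    return mean(above) - mean(below)

naive = diff_in_means(rows)
local = diff_in_means([r for r in rows if abs(r[0] - CUTOFF) <= 1.0])
print(f"everyone: {naive:.2f}   within 1 point of the cutoff: {local:.2f}")
```

The narrow comparison is still slightly off, because even inside the window the kids above the cutoff score a little higher on average. That leftover slope is what the regression step later on is for.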
No Sorting - Observables Don’t take my word for it. Look at the averages of the observables in your below-cutoff group, and the averages of the observables in the above-cutoff group. Are they the same? Hopefully. Do people know about this cutoff? Are they doing some endogenous sorting? When deciding where to live, did good moms look for schools where their kids would be the 41st kid? Did certain types of polluters look for counties where they’d be below the cutoff? These things can be checked to some degree--look at the average observables above and below the cutoff.
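A rough sketch of that check on simulated data. The covariate name `family_income`, the cutoff of 60, and the one-point window are hypothetical choices for illustration. The function reports the above-minus-below difference in a covariate’s mean along with a standard error, first for everyone and then only for people near the cutoff.

```python
import random
from statistics import mean, variance

random.seed(1)
CUTOFF = 60.0  # hypothetical cutoff

people = []
for _ in range(20000):
    background = random.gauss(0, 1)
    people.append({
        "score": 50 + 10 * background + random.gauss(0, 5),
        "family_income": background + random.gauss(0, 1),  # made-up observable
    })

def balance(sample, covariate, cutoff):
    """Difference in a covariate's mean (above minus below) and its standard error."""
    above = [p[covariate] for p in sample if p["score"] >= cutoff]
    below = [p[covariate] for p in sample if p["score"] < cutoff]
    diff = mean(above) - mean(below)
    se = (variance(above) / len(above) + variance(below) / len(below)) ** 0.5
    return diff, se

full_diff, full_se = balance(people, "family_income", CUTOFF)
near = [p for p in people if abs(p["score"] - CUTOFF) <= 1.0]
near_diff, near_se = balance(near, "family_income", CUTOFF)
print(f"all:  diff={full_diff:.2f} (se {full_se:.2f})")
print(f"near: diff={near_diff:.2f} (se {near_se:.2f})")
```

In the full sample the two groups look very different on the observable; near the cutoff the gap should mostly disappear. If it doesn’t, be suspicious.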
No Sorting - Clumping In addition to checking the observables on either side of the cutoff, we should check the density of the distribution. Is it unusually low/high right around the cutoff? If there’s some abnormally large portion of people right around the cutoff, it’s quite possible that you don’t have random assignment.
No Sorting - Clumping You’re totally cheating. Please stop. Emily Conover & Adriana Camacho “Manipulation of Social Program Eligibility”
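A crude version of the clumping check, on fake data. This just counts observations in a narrow bin on each side of the cutoff; it is not a formal density test. The manipulation rule here is invented: people below the cutoff are the eligible ones, and half of those who land up to 3 points above it get re-scored to just below.

```python
import random

random.seed(2)
CUTOFF = 50.0  # hypothetical eligibility cutoff (eligible if below)

def counts_around(values, cutoff, width):
    """Number of observations in [cutoff - width, cutoff) and [cutoff, cutoff + width)."""
    below = sum(1 for v in values if cutoff - width <= v < cutoff)
    above = sum(1 for v in values if cutoff <= v < cutoff + width)
    return below, above

honest = [random.uniform(0, 100) for _ in range(20000)]

def manipulate(v):
    # Half of those up to 3 points above the cutoff get re-scored to just below it.
    if CUTOFF <= v < CUTOFF + 3 and random.random() < 0.5:
        return random.uniform(CUTOFF - 2, CUTOFF)
    return v

gamed = [manipulate(v) for v in honest]

honest_below, honest_above = counts_around(honest, CUTOFF, 2.0)
gamed_below, gamed_above = counts_around(gamed, CUTOFF, 2.0)
print("honest:", honest_below, honest_above)
print("gamed: ", gamed_below, gamed_above)
```

With honest scores the two bins hold about the same number of people. With gaming, the bin on the eligible side swells and the bin on the other side empties out.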
GSP--Multiple Analyses “Incentives to Learn,” Ted Miguel, Michael Kremer, Rebecca Thornton Girls Scholarship Program, Busia Kenya. Randomize holding a scholarship competition across schools in Busia and Teso districts. Treatment: If a girl finishes in the top 15% in her district on the end-of-year exam, she wins a two-year scholarship. Randomization Analysis: Does attending a school with the competition make you work harder/improve schooling outcomes? RD Analysis: Does winning the award improve schooling outcomes?
P-900 in Chile “The Central Role of Noise in Evaluating Interventions That Use Test Scores to Rank Schools” Kenneth Y. Chay, Patrick J. McEwan, Miguel Urquiola, AER 2005 Mean Reversion: Sophomore Slump, SI Cover Curse, Heisman Trophy Curse, Madden Curse, and the same thing in the opposite direction (unusually bad results tend to be followed by better ones).
THIS IS THE MOST AMAZING THING EVER! Look at the educational outcomes of treatment schools in 1990, compared to those same schools in 1988, before the program. AMAZING! FANTABULOUS!
Oh, wait. Hmm. That’s kind of disappointing.
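A small simulation of the mean-reversion point, with no program at all in the fake data. The setup is assumed for illustration: each school has a fixed true quality, each year’s test score is that quality plus fresh noise, and the “program” schools are simply the bottom 10% by the first year’s score.

```python
import random
from statistics import mean

random.seed(3)

schools = []
for _ in range(5000):
    quality = random.gauss(0, 1)                 # fixed true quality
    score_1988 = quality + random.gauss(0, 1)    # noisy measurement
    score_1990 = quality + random.gauss(0, 1)    # fresh noise, no treatment
    schools.append((score_1988, score_1990))

schools.sort()                                   # sort by 1988 score
n_bottom = len(schools) // 10
bottom, rest = schools[:n_bottom], schools[n_bottom:]

gain_bottom = mean(s90 - s88 for s88, s90 in bottom)
gain_rest = mean(s90 - s88 for s88, s90 in rest)
print(f"bottom 10% gain: {gain_bottom:.2f}   everyone else: {gain_rest:.2f}")
```

The lowest-scoring schools “improve” a lot even though nothing was done to them, because part of what put them at the bottom was bad luck that year. A before/after comparison of program schools picks up exactly this.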
So how do we actually do this? 1. Draw two pretty pictures: • Eligibility criterion (test score, income, or whatever) vs. Program Enrollment • Eligibility criterion vs. Outcome
So how do we actually do this? 2. Run a simple regression. (Yes, this is basically all we ever do, and Stata can run the calculation in almost any situation, but before we do it, it’s necessary to make sure the situation is appropriate and draw the graphs so that we can have confidence that our estimates are actually causal.) Outcome as a function of test score (or whatever), with a binary (1 if yes, 0 if no) variable for program enrollment.
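A bare-bones sketch of that regression in plain Python rather than Stata, on simulated data. Assumptions, all mine: enrollment is decided entirely by the cutoff (a “sharp” design), the outcome is linear in the score, and the true jump is 1.0. The `ols` helper is a hand-rolled least-squares solver written for this example, not a library function.

```python
import random

def ols(X, y):
    """Least-squares coefficients, solving (X'X) b = X'y by Gauss-Jordan elimination."""
    k = len(X[0])
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * p for a, p in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

random.seed(4)
CUTOFF, TRUE_EFFECT = 60.0, 1.0   # hypothetical values

X, y = [], []
for _ in range(5000):
    score = random.gauss(50, 10)
    enrolled = 1.0 if score >= CUTOFF else 0.0
    X.append([1.0, score - CUTOFF, enrolled])   # constant, centered score, dummy
    y.append(5 + 0.1 * (score - CUTOFF) + TRUE_EFFECT * enrolled
             + random.gauss(0, 1))

intercept, slope, jump = ols(X, y)
print(f"slope on score: {slope:.3f}   jump at the cutoff: {jump:.2f}")
```

The coefficient on the enrollment dummy is the estimated jump at the cutoff. Centering the score at the cutoff is a convenience so that the intercept is the predicted outcome just below it.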
Is it really that simple? Not quite. You could totally have a situation where the outcome is some sort of quadratic or cubic or nth-degree polynomial function of the test score. Try controlling for that. This is going to depend on the situation and is somewhat arbitrary.
Wait, “somewhat arbitrary?” Lame, I know. But two things aren’t universally clear: 1. How wide a bandwidth around the cutoff are we looking at? We’re really only confident in our estimate for people that are close to the cutoff. This is a LOCAL AVERAGE TREATMENT EFFECT. We can confidently say that a school right around the cutoff would improve average test scores by X if they received the treatment, but we’re not so confident that already-awesome schools would get the same benefit.
Wait, “somewhat arbitrary?” 2. Without the program, what shape would the function naturally have? What sort of function do we throw in to control for the fact that even if there were no National Merit Semifinalist scholarship, smarter kids would be likely to earn more later in life? The solution: SHOW YOUR WORK
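One way to show your work is to report the estimate for several bandwidths and several polynomial orders side by side. The sketch below does that on fake data where the outcome really is curved in the score (a quadratic, by construction) and the true jump is 1.0. The bandwidths, the scaling of the score, and the `ols` and `rd_estimate` helpers are all choices made for this example.

```python
import random

def ols(X, y):
    """Least-squares coefficients, solving (X'X) b = X'y by Gauss-Jordan elimination."""
    k = len(X[0])
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * p for a, p in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

random.seed(5)
CUTOFF, TRUE_EFFECT = 60.0, 1.0   # hypothetical values

data = []
for _ in range(20000):
    x = (random.gauss(50, 10) - CUTOFF) / 10    # score, centered and scaled
    d = 1.0 if x >= 0 else 0.0
    y = 5 + 0.5 * x + 0.3 * x * x + TRUE_EFFECT * d + random.gauss(0, 1)
    data.append((x, d, y))

def rd_estimate(sample, bandwidth, order):
    """Coefficient on the treatment dummy, controlling for a polynomial in x."""
    near = [(x, d, y) for x, d, y in sample if abs(x) <= bandwidth]
    X = [[1.0] + [x ** p for p in range(1, order + 1)] + [d] for x, d, _ in near]
    return ols(X, [y for _, _, y in near])[-1]

results = {(bw, order): rd_estimate(data, bw, order)
           for bw in (0.5, 1.0, 2.0) for order in (1, 2, 3)}
for (bw, order), est in sorted(results.items()):
    print(f"bandwidth {bw:>3}  polynomial order {order}:  jump = {est:.2f}")
```

If the estimates bounce around a lot from row to row, that is worth telling the reader. If they are stable, the table is the evidence.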
Fake Programs In addition to showing your work, another good robustness check is to test for the effects of non-existent programs.
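A sketch of the fake-program check on simulated data: estimate the jump at the real cutoff and at two cutoffs where nothing actually happens. Everything here is assumed for illustration (true cutoff 60, true effect 1.0, fake cutoffs at 50 and 70, a 5-point window chosen so the fake windows never straddle the real cutoff), and `ols` and `jump_at` are helpers written for this example.

```python
import random

def ols(X, y):
    """Least-squares coefficients, solving (X'X) b = X'y by Gauss-Jordan elimination."""
    k = len(X[0])
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * p for a, p in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

random.seed(6)
TRUE_CUTOFF, TRUE_EFFECT = 60.0, 1.0   # hypothetical values

data = []
for _ in range(40000):
    score = random.gauss(50, 10)
    treated = 1.0 if score >= TRUE_CUTOFF else 0.0
    data.append((score, 5 + 0.1 * score + TRUE_EFFECT * treated
                 + random.gauss(0, 1)))

def jump_at(sample, cutoff, bandwidth=5.0):
    """Estimated jump at `cutoff` from a linear regression within the bandwidth."""
    near = [(s, y) for s, y in sample if abs(s - cutoff) <= bandwidth]
    X = [[1.0, s - cutoff, 1.0 if s >= cutoff else 0.0] for s, _ in near]
    return ols(X, [y for _, y in near])[-1]

jumps = {c: jump_at(data, c) for c in (50.0, 60.0, 70.0)}
for c, est in jumps.items():
    print(f"cutoff {c}: estimated jump = {est:.2f}")
```

The real cutoff should show a jump near 1.0 and the fake ones should show something near zero. A big “effect” at a cutoff where no program exists means the method is picking up something other than the program.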
Conclusion • Find a threshold • Look at people just above and just below • Make sure there’s no sorting • It’s only a local effect