Regression Discontinuity


Basic Idea
• Sometimes whether something happens to you or not depends on your ‘score’ on a particular variable, e.g.:
  • you get a scholarship if you get above a certain mark in an exam,
  • you are given remedial education if you score below a certain level,
  • a policy is implemented if it gets more than 50% of the vote in a ballot,
  • your sentence for a criminal offence is higher if you are above a certain age (an ‘adult’).
• All these are potential applications of the ‘regression discontinuity’ design.

More formally
• Assignment to treatment depends in a discontinuous way on some observable variable W.
• The simplest form has assignment to treatment based on W being above some critical value w0, the discontinuity.
• The method of assignment is the very opposite of random assignment: it is a deterministic function of an observable variable.
• But assignment to treatment is ‘as good as random’ in the neighbourhood of the discontinuity. This is hard to grasp but I hope to explain it.

Basics of RDD Estimator
• Suppose the average outcome in the absence of treatment, conditional on W, is:
$$E(y_0 \mid W) = g_0(W)$$
• Suppose the average outcome with treatment, conditional on W, is:
$$E(y_1 \mid W) = g_1(W)$$
• This is the ‘full outcomes’ approach.
• The treatment effect conditional on W is g1(W)-g0(W).

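How can we estimate this?
• The basic idea is to compare outcomes just to the left and just to the right of the discontinuity, i.e. to compute:
$$E(y \mid w_0 \le W < w_0 + \delta) - E(y \mid w_0 - \delta < W < w_0)$$
• As δ→0 this comes to:
$$g_1(w_0) - g_0(w_0)$$
• i.e. the treatment effect at W=w0.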
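The comparison of outcomes just to the left and right of the discontinuity can be sketched in a few lines. This is a minimal illustration on simulated data, not a recommended estimator: the function name `rdd_diff_in_means`, the cutoff of 0.5, the true effect of 2.0 and the linear outcome function are all assumptions made up for the example.

```python
import numpy as np

def rdd_diff_in_means(w, y, w0, delta):
    """Mean outcome just right of the cutoff minus mean outcome just left of it."""
    w = np.asarray(w, dtype=float)
    y = np.asarray(y, dtype=float)
    right = (w >= w0) & (w < w0 + delta)   # treated units within delta of w0
    left = (w >= w0 - delta) & (w < w0)    # untreated units within delta of w0
    return y[right].mean() - y[left].mean()

# Simulated data: every number below is an illustrative assumption.
rng = np.random.default_rng(0)
n = 200_000
w0, beta = 0.5, 2.0                        # cutoff and true treatment effect
w = rng.uniform(0.0, 1.0, n)               # the 'score' W
treated = w >= w0                          # deterministic assignment rule
y = 1.0 + 3.0 * w + beta * treated + rng.normal(0.0, 1.0, n)

estimate = rdd_diff_in_means(w, y, w0, delta=0.02)
print(f"estimated effect at the cutoff: {estimate:.3f}")
```

With a narrow window the estimate should sit close to the assumed effect of 2.0, at the cost of using only a small share of the sample.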

Comments
• The RDD estimator compares the outcomes of people just on either side of the discontinuity. The difference in means between these two groups is an estimate of the treatment effect at the discontinuity.
• It says nothing about the treatment effect away from the discontinuity. This is a limitation of the RDD approach.
• An important assumption is that the underlying effect of W on outcomes is continuous, so that the only reason for a discontinuity in outcomes is the treatment effect.

Some pictures: underlying relationship between y and W is linear
[Figure: E(y│W) plotted against W, with the cutoff w0 marked on the W axis.]

Now introduce treatment
[Figure: E(y│W) plotted against W, with β marked at the cutoff w0.]

The procedure in practice
• Taken literally, the process described above says one should choose a very small value of δ.
• This will result in a small number of observations.
• The estimate may be consistent but its precision will be low.
• The desire to increase the sample size leads one to choose a larger value of δ.

Dangers
• If δ is not very small then one may not estimate just the treatment effect (look at the picture).
• As one increases δ, the measured treatment effect gets larger, since the comparison then also picks up the underlying relationship between y and W. This is spurious, so what should one do about it?
• The basic idea is that one should control for the underlying outcome functions.
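The danger of widening the window can be seen in a small simulation. This is a sketch under assumed values: the outcome rises linearly in W with slope 3.0, the true effect is 2.0, and the helper `window_gap` is a hypothetical name for the raw difference in means within δ of the cutoff.

```python
import numpy as np

# Simulated data: cutoff, slope and effect size are illustrative assumptions.
rng = np.random.default_rng(1)
n = 200_000
w0, beta, slope = 0.5, 2.0, 3.0
w = rng.uniform(0.0, 1.0, n)
y = 1.0 + slope * w + beta * (w >= w0) + rng.normal(0.0, 1.0, n)

def window_gap(delta):
    """Raw difference in mean outcomes within delta on each side of w0."""
    right = (w >= w0) & (w < w0 + delta)
    left = (w >= w0 - delta) & (w < w0)
    return y[right].mean() - y[left].mean()

# Wider windows add roughly slope * delta on top of the true effect.
gaps = {delta: window_gap(delta) for delta in (0.02, 0.1, 0.2, 0.4)}
for delta, gap in gaps.items():
    print(f"delta={delta:<5} gap={gap:.3f}  true effect={beta}")
```

In this setup the two windows have mean scores that differ by δ, so the raw gap is approximately the true effect plus slope × δ, which is the spurious part.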

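If underlying relationship linear
• If the linear relationship is the correct specification then one could estimate the ATE simply by estimating the regression:
$$y_i = \alpha + \gamma W_i + \beta D_i + \varepsilon_i$$
where D_i is a dummy equal to 1 if individual i is treated (W_i above w0) and β is the treatment effect.
• But there is no good reason to assume the relationship is linear, and a wrong functional form may cause problems.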
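A minimal sketch of the linear specification, using ordinary least squares via `numpy.linalg.lstsq` on simulated data. The treatment dummy `d`, the coefficient names and all numerical values are assumptions for illustration; the data are generated so that the linear form is correct.

```python
import numpy as np

# Simulated data in which the linear specification is true by construction.
rng = np.random.default_rng(2)
n = 20_000
w0, beta = 0.5, 2.0                       # assumed cutoff and true effect
w = rng.uniform(0.0, 1.0, n)
d = (w >= w0).astype(float)               # treatment dummy
y = 1.0 + 3.0 * w + beta * d + rng.normal(0.0, 1.0, n)

# Regress y on a constant, W and the treatment dummy, using the full sample.
X = np.column_stack([np.ones(n), w, d])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
alpha_hat, gamma_hat, beta_hat = coef
print(f"beta_hat = {beta_hat:.3f}, gamma_hat = {gamma_hat:.3f}")
```

Because the regression controls for W, the whole sample can be used without the window-width bias, but only as long as the assumed functional form is right.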

Suppose true relationship is:
[Figure: g0(W) and g1(W) plotted as E(y│W) against W, with the cutoff w0 marked.]

Observed relationship between E(y) and W
[Figure: g0(W) and g1(W) plotted as E(y│W) against W, with the cutoff w0 marked.]

• In this case one would want to control for a different relationship between y and W for the treatment and control groups.
• Another problem is that the outcome functions might not be linear in W: they could be quadratic or something else.
• The researcher then typically faces a trade-off:
  • choose a large value of δ to get more precision from a larger sample, but run the risk of misspecifying the underlying outcome function, or
  • choose a flexible underlying functional form at the cost of some precision (intuitively, a flexible functional form can get closer to approximating a discontinuity in the outcomes).
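Allowing a different relationship between y and W on each side of the cutoff amounts to adding an interaction between the treatment dummy and the (centred) score. The sketch below compares a common-slope regression with an interacted one on simulated data; the helper `ols`, the cutoff of 0.3 and the slopes are assumptions chosen so that the two specifications disagree.

```python
import numpy as np

# Simulated data: the slope in W differs between untreated and treated units.
rng = np.random.default_rng(3)
n = 20_000
w0, beta = 0.3, 2.0                        # assumed cutoff and effect at the cutoff
w = rng.uniform(0.0, 1.0, n)
c = w - w0                                 # score centred at the cutoff
d = (c >= 0).astype(float)                 # treatment dummy
y = 1.0 + 1.0 * c + d * (beta + 4.0 * c) + rng.normal(0.0, 1.0, n)

def ols(columns, y):
    """Least-squares coefficients for a list of regressor columns."""
    X = np.column_stack(columns)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

ones = np.ones(n)
# Common slope for both groups: misspecified for this data.
beta_pooled = ols([ones, c, d], y)[2]
# Separate slopes: the coefficient on d is the jump at c = 0, i.e. at W = w0.
beta_interacted = ols([ones, c, d, d * c], y)[2]
print(f"pooled slope:    {beta_pooled:.3f}")
print(f"separate slopes: {beta_interacted:.3f}")
```

Centring W at w0 is a design choice: it makes the coefficient on the dummy directly interpretable as the discontinuity at the cutoff, even when the slopes differ.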

In practice
• It is usual for the researcher to summarize all the data in a graph of the outcome against W, to get some idea of the appropriate functional forms and of how wide a window to choose.
• But it's always a good idea to investigate the sensitivity of estimates to alternative specifications.
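One way to investigate sensitivity is to re-estimate the discontinuity over a grid of window widths and polynomial orders. The following is a sketch only: `rdd_local_poly` is a hypothetical helper, the data are simulated with an assumed quadratic outcome function, and the grid values are arbitrary.

```python
import numpy as np

def rdd_local_poly(w, y, w0, delta, order):
    """Jump at w0 from separate polynomials of the given order on each side,
    fitted only on observations within delta of the cutoff."""
    w = np.asarray(w, dtype=float)
    y = np.asarray(y, dtype=float)
    keep = np.abs(w - w0) < delta
    c = w[keep] - w0                       # centred score inside the window
    d = (c >= 0).astype(float)             # treatment dummy
    cols = [np.ones_like(c), d]
    for p in range(1, order + 1):
        cols += [c**p, d * c**p]           # side-specific polynomial terms
    coef, *_ = np.linalg.lstsq(np.column_stack(cols), y[keep], rcond=None)
    return coef[1]                         # coefficient on the dummy

# Simulated data with a quadratic outcome function (illustrative assumption).
rng = np.random.default_rng(4)
n = 50_000
w0, beta = 0.5, 2.0
w = rng.uniform(0.0, 1.0, n)
c_all = w - w0
y = 1.0 + 2.0 * c_all + 6.0 * c_all**2 + beta * (w >= w0) + rng.normal(0.0, 1.0, n)

results = {}
for delta in (0.05, 0.1, 0.25):
    for order in (0, 1, 2):
        results[(delta, order)] = rdd_local_poly(w, y, w0, delta, order)
        print(f"delta={delta:<5} order={order}  estimate={results[(delta, order)]:.3f}")
```

Order 0 is the plain difference in means within the window. Estimates that move a lot across the grid suggest the result depends on the window or functional form chosen.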

An example
• Lemieux and Milligan, “Incentive Effects of Social Assistance: A Regression Discontinuity Approach”, Journal of Econometrics, 2008.
• In Quebec before 1989, childless benefit recipients received higher benefits once they reached their 30th birthday.

The Picture

The Estimates

Note
• The more flexible the underlying relationship between the employment rate and age is allowed to be, the less precise the estimate.
