Doing Experiments: An Introduction
Empirical social science, including economics, is largely nonexperimental, using data from naturally occurring situations. • Lab experiments are criticized on several grounds • Participant pools are often unrepresentative • Samples are too small • Unrealistic settings lack relevance for the real world • Field experiments, like the RAND Health Insurance Experiment of the 1970s, are generally considered superior • Falk and Heckman disagree, arguing that lab experiments are often superior
Why do Falk and Heckman argue for lab experiments? What are the major advantages of lab experiments over field experiments and surveys?
Why might lab experiments be better? • Labs provide controlled variation • Can specify complete contracts, something rarely possible in real-world environments • Even though the situation is unrealistic, the results reveal things about human nature • The gift-exchange game shows that higher payoffs elicit more worker effort, contradicting the theory of the purely self-interested worker • We have experimental evidence that the paradigm of rational selfishness does not hold • Similar evidence exists for loss aversion, present bias, and social approval bias
Lab experiments allow testing precise predictions of game-theoretic models • Measured behavior is reliable and real • Lab experiments are relatively inexpensive to implement • The realism of field experiments does not necessarily make them superior to lab experiments. The real issue is the best way to isolate causal factors
The problem: We have Y = f(X1, …, Xn) and want to know the causal effect of X1 on Y, which means we need to vary X1 while holding X2, …, Xn constant. • If f(·) is separable in X1, so Y = g(X1) + h(X2, …, Xn), then varying X1 alone provides the causal impact of X1 on Y. • Even if f is separable, the impact depends both on the level of X1 and on the magnitude of the change in X1 • If f(·) is not separable in X1, then the causal impact of X1 on Y also depends on the levels of X2, …, Xn • The lab experiment allows better control over the level of and change in X1, and over the levels of the other Xs.
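The separability point can be made concrete with a tiny sketch. The two outcome functions below are made up purely for illustration; they are not from the source:

```python
# Hypothetical outcome functions, chosen only to illustrate separability.
def f_separable(x1, x2):
    # Y = g(X1) + h(X2): the effect of changing X1 does not depend on X2.
    return 2.0 * x1 + x2 ** 2

def f_nonseparable(x1, x2):
    # Y = X1 * X2: the effect of changing X1 depends on the level of X2.
    return x1 * x2

def causal_change(f, x1_lo, x1_hi, x2):
    """Effect on Y of moving X1 from x1_lo to x1_hi, holding X2 fixed."""
    return f(x1_hi, x2) - f(x1_lo, x2)

# Separable case: the same change in X1 has the same effect at any level of X2.
print(causal_change(f_separable, 1, 2, x2=0))    # 2.0
print(causal_change(f_separable, 1, 2, x2=10))   # 2.0

# Non-separable case: the effect of the same change in X1 varies with X2.
print(causal_change(f_nonseparable, 1, 2, x2=1))   # 1
print(causal_change(f_nonseparable, 1, 2, x2=10))  # 10
```

This is why a lab, which can pin the other Xs at chosen levels, recovers a well-defined causal impact even when f is not separable.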
Field experiments suffer from population specificity just as lab experiments do • The "natural situation" of sports-card traders is as specific a population as volunteer college students • Field experiments usually have restrictive populations, like the RAND health insurance experiment, which used people on public assistance • Hence, field experiments give causal impacts that transfer accurately only if the relationship is separable in X1 and the causal relationship is linear • But in that case, lab experiments give equally valid causal inferences that transfer across populations • Field experiments do offer greater variability in the other Xs, which provides complementary information
Other objections • Experiments with students don't produce representative results about economic theories • But most theories are independent of assumptions about participant pools • Stakes are trivial • It is never clear what the "right" stakes should be • Stakes can be varied • Economic theory is built on marginal (epsilon) changes • Samples are too small • Statistical methods exist for analyzing small samples
Other objections (continued) • Experiments do not distinguish between experienced and inexperienced participants • Experience can be better controlled in the lab than in the field • Participants in lab experiments behave differently because they know they are being scrutinized • Not exclusive to labs • Repeated experiments can average this out • Self-selection into experiments • Self-selection itself provides information about preferences • It is a problem for field experiments too, which also suffer from adherence and attrition problems that lab experiments do not
Go for complementarity: combine what we learn from lab experiments with field experiments and large surveys • Lab experiments offer carefully controlled environments and provide important insights into preference heterogeneity • Field experiments offer broader participation and a wider variety of settings • Surveys can generate large, representative data sets that provide statistical power
The big issue in the analysis of a treatment is the missing counterfactual • The idea of a treatment is to see how it changes an individual's behavior, but we usually observe each person only once, either with or without the treatment • Hence the treatment must be applied with random assignment • Then average differences, once other covariates are controlled for, can be attributed to the treatment • The goal is to maximize the variance of the treatment while controlling for other heterogeneity
So what are we looking for in a treatment experiment? We are looking for the effect of the treatment. Suppose Y = XB + cT + a + e, where Y is the outcome, X are observed characteristics, a are unobserved characteristics, T is the treatment indicator, and e is a random term. We assume person-specific treatment effects are nonexistent, so we are looking for the magnitude and sign of c.
Ideally, we would observe the same decision-making unit (observation) when T=1 (it has the treatment) and when T=0 (it doesn't). • This is the counterfactual • Easy to achieve in the lab sciences • Rarely or never achieved in social science experiments • The average treatment effect (ATE) is the difference in Y for the same person with and without the treatment, and hence equals c.
But if the division into treatment or control is correlated with any of the unobserved variables, the estimate of c is biased • This is the idea behind selectivity • Randomization provides the appropriate counterfactual, as indicated earlier • So comparing the average outcome of the treatment group to that of the control group gives a statistically valid estimate of c.
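A minimal simulation of the model Y = XB + cT + a + e (here with B = 0 and invented parameter values) shows the selectivity problem and how randomization fixes it:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
c = 2.0                        # true treatment effect (assumed for this sketch)

a = rng.normal(size=n)         # unobserved characteristic
e = rng.normal(size=n)         # random term

# Selective assignment: units with high `a` are more likely to be treated,
# so T is correlated with the unobservable.
t_selected = (a + rng.normal(size=n) > 0).astype(float)
y_selected = c * t_selected + a + e
biased = y_selected[t_selected == 1].mean() - y_selected[t_selected == 0].mean()

# Random assignment: T is independent of `a`, so the simple difference in
# group means recovers c.
t_random = rng.integers(0, 2, size=n).astype(float)
y_random = c * t_random + a + e
unbiased = y_random[t_random == 1].mean() - y_random[t_random == 0].mean()

print(f"selected-in estimate: {biased:.2f}")    # noticeably above the true c
print(f"randomized estimate:  {unbiased:.2f}")  # close to the true c = 2
```

The selected-in comparison confounds c with the difference in a between the groups; randomization eliminates that difference on average.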
Two special issues in (mostly field) experiments are adherence (fidelity to the treatment) and attrition (dropping out of the experiment). These can bias the results. • Lack of fidelity often means while it appears someone had the treatment, they really didn’t. This biases results downward. • Attrition may be tied to the lack of success of the treatment. This biases the results upward.
Hence, the ATE estimate may be biased. • The solution is an intent-to-treat (ITT) analysis: compare outcomes based on the initial treatment assignment, not on the treatment eventually received. • ITT is pragmatic, focusing on the outcome that can be expected when the treatment is applied in practice, as opposed to the pure treatment effect alone. • The hypothesis an ITT analysis addresses is pragmatic: the effectiveness of a therapy when offered to autonomous individuals.
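A sketch of the ITT idea, extending the earlier model with a made-up non-adherence rule (all numbers here are illustrative assumptions, not from the source):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
c = 2.0                                # true effect of actually receiving treatment

a = rng.normal(size=n)                 # unobserved characteristic
assigned = rng.integers(0, 2, size=n)  # randomized assignment (the "intent")

# Non-adherence: units with low `a` tend to drop the treatment, so actual
# receipt is no longer random even though assignment was.
adheres = a + rng.normal(size=n) > -0.5
received = assigned * adheres
y = c * received + a + rng.normal(size=n)

# "As treated": compare by treatment actually received -- biased, because
# receipt is correlated with the unobservable `a`.
as_treated = y[received == 1].mean() - y[received == 0].mean()

# Intent to treat: compare by the original random assignment -- an unbiased
# estimate of the pragmatic effect of *offering* the treatment, diluted by
# the non-adherence rate.
itt = y[assigned == 1].mean() - y[assigned == 0].mean()

print(f"as-treated: {as_treated:.2f}")  # contaminated by selection on a
print(f"ITT:        {itt:.2f}")         # roughly c times the adherence rate
```

The ITT estimate is smaller than c, but it answers the pragmatic question: what happens when the treatment is offered to people who may or may not comply.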
The List et al. paper offers specifics on achieving these goals so the data are statistically reliable. It comes up with several rules of thumb: • With a continuous outcome, treatment and control groups should be the same size only if the sample variances of the outcome are expected to be equal. • If the sample variances are not equal, the ratio of sample sizes should equal the ratio of the standard deviations. • If the cost of sampling varies across treatment cells, the ratio of sample sizes should be inversely related to the square root of the relative costs.
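One reading of these rules of thumb, combined into a single hypothetical `allocate` helper (the function name and example numbers are mine, not from List et al.):

```python
import math

def allocate(total_n, sd_treat, sd_control, cost_treat=1.0, cost_control=1.0):
    """Split total_n between treatment and control so that
    n_t / n_c = (sd_t / sd_c) * sqrt(cost_c / cost_t),
    i.e. sample sizes proportional to standard deviations and inversely
    related to the square root of relative sampling costs."""
    ratio = (sd_treat / sd_control) * math.sqrt(cost_control / cost_treat)
    n_control = total_n / (1.0 + ratio)
    n_treat = total_n - n_control
    return round(n_treat), round(n_control)

# Equal variances, equal costs -> equal group sizes.
print(allocate(1000, sd_treat=1.0, sd_control=1.0))                  # (500, 500)

# Treatment outcomes twice as variable -> sample treatment twice as heavily.
print(allocate(900, sd_treat=2.0, sd_control=1.0))                   # (600, 300)

# Treatment observations four times as costly -> halve its relative share.
print(allocate(900, sd_treat=1.0, sd_control=1.0, cost_treat=4.0))   # (300, 600)
```

Under equal costs this reduces to the standard-deviation rule on the slide; under equal variances it reduces to the cost rule.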
List et al. (continued) • When the unit of randomization differs from the unit of analysis (for example, randomizing treatment by school but measuring outcomes by student), you need to worry about correlations within the cluster that determines the randomization. • When the treatment variable is not discrete (a limited number of treatments) but instead continuous, the number of cells should equal the order of the expected treatment response plus one. • If the expected impact is linear, the sample is divided into two cells: no treatment and full treatment. • If the expected impact is quadratic, the sample is divided into three cells: no treatment, an intermediate level near the middle of the treatment-intensity range, and full treatment
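The cells-equal-order-plus-one rule can be sketched as a small helper (an illustrative function of my own, with treatment intensity normalized to [0, 1]):

```python
def treatment_levels(order, max_dose=1.0):
    """Cells = polynomial order of the expected impact + 1, spread evenly
    from no treatment (0) to full treatment (max_dose)."""
    if order == 0:
        return [0.0]
    return [max_dose * k / order for k in range(order + 1)]

print(treatment_levels(1))  # linear impact: [0.0, 1.0]
print(treatment_levels(2))  # quadratic impact: [0.0, 0.5, 1.0]
```

This reproduces the two cases on the slide: two cells (none, full) for a linear response, three cells (none, intermediate, full) for a quadratic one.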