Remaining Weeks • Next week: Diff-in-Diff. • Nov. 17: Power calculations. • Nov. 24: Summary, in-class presentations. • Dec. 1: Guests, more presentations.
Motivation: Causality. AP Headline Today: Teen pregnancies tied to tastes for sexy TV shows
Real-World Complications • Attrition • Data Quality • Cars Stuck in the Mud, Employees Robbed
Practical Problems • Language • Culture • Being around the same four Westerners 24/7 without going crazy. Solutions: • Having had a real job? • Management skills
Actual Organizations • CEGA (Our Sponsor) http://cega.berkeley.edu • Poverty Action Lab (J-PAL) http://www.povertyactionlab.org • Innovations for Poverty Action http://www.poverty-action.org • Blum Center for Developing Economies http://blumcenter.berkeley.edu
CEGA-related Faculty • Alain de Janvry • Paul J. Gertler • David I. Levine • Edward Miguel • Nancy Padian • Elisabeth Sadoulet http://cega.berkeley.edu/template.php?page=people
Larger NGO-types • The World Bank • Center for Global Development • International Food Policy Research Institute • …and many, many more
Human Subjects • UC Berkeley Committee for the Protection of Human Subjects http://cphs.berkeley.edu • In-country organizations as well, for example: Kenya Medical Research Institute http://www.kemri.org
Attrition Randomized trials often require that we get data from the subjects twice--once before the experiment and once after. What if we can’t find them afterwards?
Worksheet How might you expect people we couldn’t find to differ from those we could easily find? What could cause people to go missing?
Attrition • Create lower/upper bounds for our estimates by assuming the worst about the people we couldn't find, as sketched below. (Ummm, I can't remember this reference. Sorry.) • In our case, we'll just say it's important to find as many people as possible to get good data.
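To make the bounding idea concrete, here is a minimal sketch assuming a binary outcome; the function name and all numbers are hypothetical illustrations, not from any study.

```python
# Worst-case bounds on a treatment effect under attrition, for a binary outcome.
# Lower bound: assume every missing treated subject had the worst outcome and
# every missing control subject had the best; upper bound: the reverse.

def worst_case_bounds(found_treat, found_ctrl, n_missing_treat, n_missing_ctrl):
    """found_* are 0/1 outcomes for the subjects we managed to locate."""
    def effect(miss_t_val, miss_c_val):
        t = found_treat + [miss_t_val] * n_missing_treat
        c = found_ctrl + [miss_c_val] * n_missing_ctrl
        return sum(t) / len(t) - sum(c) / len(c)

    return effect(0, 1), effect(1, 0)  # (lower, upper)

# Hypothetical data: 4 treated found (1 missing), 4 controls found (1 missing).
lo, hi = worst_case_bounds([1, 1, 0, 1], [0, 1, 0, 0], 1, 1)
print(f"treatment effect lies in [{lo:.2f}, {hi:.2f}]")  # [0.20, 0.60]
```

The more attrition you have, the wider these bounds get, which is exactly why finding as many people as possible matters.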
Attrition in KLPS • Kenyan Life Panel Survey, 2003-2005: follow-up to Deworming (1998-2000) • 7,500 of the original 30,000 were randomly selected to be surveyed.
Attrition in KLPS • First, go to their old school and ask around. • Second, try to find their house. • Third, travel far and wide.
Attrition in KLPS • Use two-part regular and intensive tracking, just like in "Moving to Opportunity." • After finding as large a portion as you can in the regular phase, select a random sub-sample of everyone still missing for intensive tracking. • The effective response rate is ERR = MRR + SRR × (1 - MRR), where MRR is the main (regular-phase) response rate and SRR is the response rate within the intensively tracked sub-sample.
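A worked example with made-up rates (illustrative, not the KLPS numbers): say regular tracking finds 70% of the sample, and intensive tracking then finds 50% of its random sub-sample of the remainder.

$$\mathrm{ERR} = \mathrm{MRR} + \mathrm{SRR}\,(1 - \mathrm{MRR}) = 0.70 + 0.50 \times 0.30 = 0.85$$

Because the intensive sub-sample is drawn at random from the 30% still missing, each person found in it stands in for the others who weren't pursued, so the effective response rate rises from 70% to 85%.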
Attrition in KLPS End Results: 84% successfully contacted 83% successfully surveyed
Attrition in KLPS 4 different types of being “found,” by treatment and gender
Where’d we find them? • 19% Outside Busia • 14% Outside Neighboring Areas • 25% Overall (Non-Snapshot)
So, We Got 84%, Are We Cool? • Is treatment correlated with attrition? Probably not: we found 83.9% to 85.0% in all treatment groups.
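A minimal sketch of how one might test that claim, using a chi-square test on found/not-found counts by treatment group; the counts and group labels below are hypothetical.

```python
# Is attrition correlated with treatment? Chi-square test on a contingency
# table of found vs. not-found counts per treatment group (hypothetical counts).
from scipy.stats import chi2_contingency

table = [
    [2100, 380],  # group 1: [found, not found]
    [2080, 395],  # group 2
    [2120, 375],  # group 3
]

chi2, p, dof, _ = chi2_contingency(table)
print(f"chi2 = {chi2:.2f}, df = {dof}, p = {p:.2f}")
# A large p-value means no evidence that tracking rates differ across groups.
```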
Was it worth it? • We spent a lot of money to find the emigrants.
Did we need to bother? • Migrants are 1.7 cm shorter than non-migrants, and an additional year of treatment increased migrant height by 0.4 cm versus only 0.1 cm for the full sample. • So yes: the effect is concentrated among the migrants, exactly the people who are hardest to find, and losing them would badly understate it.
The Nuts & Bolts of Building the Dataset • Written on a hard copy of the survey. • A sub-sample is checked for mistakes. • The data-entry firm double-enters everything. • We check that the two entries agree. • We re-enter a 5% sample and check it against their work; accept if the error rate is below a threshold. • That's the "raw" data.
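A sketch of the double-entry check, assuming two CSV exports covering the same surveys and fields; the file and column names are hypothetical.

```python
# Compare the two independent data entries and flag every disagreement
# for checking against the hard copy. File/column names are hypothetical.
import pandas as pd

entry1 = pd.read_csv("entry1.csv").set_index("survey_id").sort_index()
entry2 = pd.read_csv("entry2.csv").set_index("survey_id").sort_index()

# .compare() keeps only the rows/fields where the two entries differ
# (it requires both frames to have the same surveys and the same fields).
mismatches = entry1.compare(entry2)
print(f"{len(mismatches)} surveys have at least one discrepancy")
mismatches.to_csv("flags_for_hardcopy_check.csv")
```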
The Nuts & Bolts of Building the Dataset • Depressed grad students spend whole summers in a windowless Unix lab on the 6th floor of the 2nd-ugliest building on campus writing cleaning files, which check for blanks and skip-pattern violations. • Send the list of flagged entries to the location of the hard copies. • Hard copies are checked against the soft copy; the soft copy is corrected and the mistake flag lowered. • Feel free to use the data.
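What a cleaning file might look like, here checking one blank rule and one skip-pattern rule; every column name and rule below is a made-up illustration.

```python
# Cleaning file: flag blanks and skip-pattern violations for hard-copy review.
# Column names and the skip rule are hypothetical.
import pandas as pd

df = pd.read_csv("raw_survey.csv")

# Blank check: a question everyone should have answered.
df["flag_blank_tribe"] = df["tribe"].isna()

# Skip-pattern check: respondents who said they never left the local area
# should not have an answer in the migration-destination field.
df["flag_skip_violation"] = (df["ever_left_area"] == "no") & df["destination"].notna()

flagged = df[df["flag_blank_tribe"] | df["flag_skip_violation"]]
flagged[["survey_id", "flag_blank_tribe", "flag_skip_violation"]].to_csv(
    "flags.csv", index=False)
```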
Data Quality Fine, we correctly recorded what the respondent said, but should we really trust what they said? That is, if you were 16 and had a miscarriage a year ago, would you really want to tell an older man who is a stranger about it?
Do Kids Know What They're Talking About? • Set aside the respondent/enumerator relationship: do the kids really know what they're talking about? • Depends on the question.
What's Reliable? • We sample 5% to be resurveyed and successfully resurveyed about 4%, 3 months later on average. • Baseline: if we ask "What tribe are you?", the answer stays the same 95% of the time.
Fraction Matching • Sub-Tribe 95% • Age in 1998 76% • Grade in 2002 86% • Ever left local area 91% • Mom/Dad Education 51-53%
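A sketch of how those matching fractions could be computed from the survey and resurvey files; the file and question names are hypothetical, and missing answers count as non-matches here.

```python
# Test-retest reliability: fraction of respondents giving the same answer in
# the original survey and the resurvey. File/column names are hypothetical.
import pandas as pd

survey = pd.read_csv("survey.csv").set_index("pupil_id")
resurvey = pd.read_csv("resurvey.csv").set_index("pupil_id")

both = survey.join(resurvey, lsuffix="_t1", rsuffix="_t2", how="inner")
for q in ["sub_tribe", "age_1998", "grade_2002", "ever_left_area"]:
    match = (both[f"{q}_t1"] == both[f"{q}_t2"]).mean()  # NaN never matches
    print(f"{q}: {match:.0%} give the same answer both times")
```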
What Determines Remembering? • Tables 22 and 23 show which characteristics are correlated with giving the same answer about Mom's/Dad's education in both the survey and the re-survey.
Conclusion • Field work is great; go do some. • Try to find everyone. • Especially if your intervention makes subjects more/less likely to be found. • Do your Field Officers affect the answers given? • Does the respondent really know the right answer in the first place?