
Everything you wanted to know about the returns to education – but were afraid to ask

Ian Walker, Lancaster University Management School, and Paul Bingley, SFI Copenhagen

Presentation Transcript


  1. Everything you wanted to know about the returns to education – but were afraid to ask • Ian Walker (Lancaster University Management School) and Paul Bingley (SFI Copenhagen)

  2. Lancaster • Top 100/10/5 • Economics research strengths in Education / Labour, Macro / forecasting, IO / applied micro • If you are a good MSc student • We have PhD scholarships available • 3 ESRC CASE awards – alcohol, u/e, absenteeism • Open studentships (2 ESRC, 2 School) • Contact ian.walker@lancaster.ac.uk • If you are finishing your PhD • We have two lectureships available • One macro, one any area • See you at RES job market event? Or email me

  3. Background • Thousands of studies of wage determination • Strong focus on the effect of “S” (schooling) on “w” (wages) • Harmon et al. (Labour Economics 2003) meta-analysis • This talk: • Not a survey • But an attempt to illustrate many of the issues in one dataset • Isolates what we (think we) know • from what we could know, • and from what we know will be hard to know

  4. Important issues • One big issue has been endogeneity of S • Coefficient on S picks up not just the effect of S on w • but also the effect of omitted factors that are correlated with S (like “ability”, A) • OLS biased upwards • A smaller issue has been measurement error (ME) • If S is contaminated by ME then the OLS coefficient is “attenuated” (biased towards 0) • Estimating returns to S (and unobserved skills) over time has been a very big issue • Many specification issues

  5. Becker’s HC Earnings Function • Workhorse model of wage determination • $w_i = X_i\beta + \alpha S_i + u_i$, where $X$ includes a quadratic in experience (or age) • But $u_i = \gamma A_i + e_i$, and if $\mathrm{cov}(A_i, S_i) > 0$ • then $\mathrm{plim}\,\hat\alpha_{OLS} = \alpha + \gamma(\sigma_{AS}/\sigma_S^2) > \alpha$ if $\gamma > 0$ • Note that if $S = S^* + v$ (measurement error, with $S^*$ the true S) • then $\mathrm{plim}\,\hat\alpha_{OLS} = \alpha(1 - \sigma_v^2/\sigma_S^2) < \alpha$ if $\sigma_v^2 > 0$ • We (think we) have learned quite a lot about all of this from IV studies • But possibly not the ATE of S on w
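
The two probability limits on this slide can be checked with a small simulation. What follows is a minimal sketch, not the authors' code: the data-generating process, the parameter values (α = 0.05, γ = 0.10, the variances) and the helper names `ols_slope` and `simulate_ols` are all assumptions made for illustration, and X is left out for simplicity.

```python
import random


def ols_slope(x, y):
    """Slope of a simple regression of y on x (intercept included)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def simulate_ols(gamma, sigma_v, alpha=0.05, n=50_000, seed=0):
    """OLS slope of w on observed S under ability bias and/or measurement error."""
    rng = random.Random(seed)
    ability = [rng.gauss(0, 1) for _ in range(n)]
    # True schooling S*: cov(A, S*) = 1 and var(S*) = 1 + 4 = 5 (assumed values).
    s_true = [12 + a + rng.gauss(0, 2) for a in ability]
    wage = [alpha * s + gamma * a + rng.gauss(0, 0.3)
            for s, a in zip(s_true, ability)]
    # Observed schooling S = S* + v, with classical noise v.
    s_obs = [s + rng.gauss(0, sigma_v) for s in s_true]
    return ols_slope(s_obs, wage)


# Ability bias only: plim = 0.05 + 0.10 * (1 / 5) = 0.07
ability_biased = simulate_ols(gamma=0.10, sigma_v=0.0)
# Measurement error only: var(S) = 5 + 1 = 6, plim = 0.05 * (1 - 1/6), about 0.0417
attenuated = simulate_ols(gamma=0.0, sigma_v=1.0)
print(round(ability_biased, 3), round(attenuated, 3))
```

With both problems present the two biases pull in opposite directions, which is one reason OLS alone cannot say which dominates.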

  6. Minority issues • A unit of S is the same for everyone • may be quality differences (correlated with S) • α may also depend on S • Nonlinearity, qualifications, “sheepskin” • “Separability” assumption • Effect of S on w is independent of age • α is assumed independent of everything • but it may depend on other things, $\alpha_i = \alpha(V_i) + v_i$ • Observed and unobserved heterogeneity • V might include institutions and “grades” • Some of v may be “luck”, some may be “productivity” • We know very little about any of this

  7. Motivation: increasing inequality • α changing over time? • The rich are getting richer • US (73+), UK (77+), and even in Denmark (89+)

  8. Motivation: understanding why? • In our simple model • $w_i = X_i\beta + \alpha S_i + u_i$ • where $u_i = \gamma A_i + e_i$ and $\mathrm{cov}(A_i, S_i) > 0$ • then $\mathrm{plim}\,\hat\alpha_{OLS} = \alpha + \gamma(\sigma_{AS}/\sigma_S^2)$ • Rising var(w), given S, X, β, and unobserved A, could be due to: • α: higher returns to education • γ: higher returns to unobservable skills • $\sigma_e^2$: more measurement error in wages • $\sigma_{AS}$: greater selectivity in schooling

  9. Alternatives to OLS estimation • Eliminate “ability bias” by controlling for A • But A and S highly correlated • And our measures of A are often affected by S • Matching methods assume problem away • No selection on unobservables • Unbiased estimate of α iff $\sigma_{AS} = 0$ is true • But γ not identified if $\sigma_{AS} = 0$ is true • IV • But IV estimation does not estimate ATE • IVs may affect people observed from different years and cohorts differently • so interpretation of LATE varies across time
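
As a rough illustration of the IV alternative, the sketch below compares OLS with the simple IV ratio cov(z, w) / cov(z, S) on simulated data. Everything here is an assumption made for the example: the binary instrument `z`, its first-stage effect of 2 years, the parameter values, and the function names. It also assumes a constant α, so the LATE and the ATE coincide; with heterogeneous returns the IV ratio would instead describe those whose schooling responds to `z`.

```python
import random


def cov(x, y):
    """Sample covariance (dividing by n is fine for a ratio of covariances)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / n


def simulate_iv(alpha=0.05, gamma=0.10, n=50_000, seed=1):
    """Return (OLS slope, IV estimate) when ability is omitted from the regression."""
    rng = random.Random(seed)
    ability = [rng.gauss(0, 1) for _ in range(n)]
    # Hypothetical binary instrument, drawn independently of ability.
    z = [1.0 if rng.random() < 0.5 else 0.0 for _ in range(n)]
    s = [12 + a + 2.0 * zi + rng.gauss(0, 1) for a, zi in zip(ability, z)]
    w = [alpha * si + gamma * a + rng.gauss(0, 0.3) for si, a in zip(s, ability)]
    ols = cov(s, w) / cov(s, s)   # picks up gamma * cov(A, S) / var(S)
    iv = cov(z, w) / cov(z, s)    # uses only the variation in S induced by z
    return ols, iv


ols_est, iv_est = simulate_iv()
print(round(ols_est, 3), round(iv_est, 3))
```

In this design var(S) = 3 and cov(A, S) = 1, so the OLS slope should settle near 0.05 + 0.10/3 while the IV ratio should settle near 0.05.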

  10. Existing literature • Juhn et al. (JPE 1993): rising var(w) in US • When var(w) rose, where, and for whom • Beaudry (JoLE 05), Lemieux (AER 06) • Cawley et al. in Arrow et al. (eds) 2000 • A × S × t interaction: hard to identify • But all US CPS studies problematic • Because S imputed from data on “some college”, “college”, HS graduation • Induces changes in ME in S • And changes in w data • Workers have more complex remuneration in more recent data

  11. The twins solution • Estimate returns to both S and A • over time (and across cohorts) • Huge panel of identical (MZ) and fraternal (DZ) twins • MZs may allow us to identify the causal effect of S • Small, but rising fast since the early 90s • DZs then may allow us to infer returns to ability • Large, but falling slowly since the early 90s • Measurement error problem in S • Important problem for us • but qualifications are accurately measured • Address endogenous ΔS with credible (?) IV • Suggests that “ability bias” in MZ diffs is small

  12. Twins methodology • $\Delta w_i = \Delta X_i\beta + \alpha\Delta S_i + \gamma\Delta A_i + \Delta e_i$ • where Δ is the within-twin-pair difference • If the twins are MZs, then we eliminate (?) the unobservables, i.e. $\Delta A_i = 0$ • and usually most ΔX’s = 0 • so regress Δw on ΔS for unbiased estimate of α • But within-twin differencing exacerbates ME in S • ME in $\Delta S_i$ may be large • Use IV based on alternative measures of S
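
The differencing argument can be sketched on simulated MZ pairs. This is an illustration under strong assumptions, not the authors' procedure: ability is taken to be fully shared within a pair (so ΔA = 0 exactly), schooling is measured without error, X is omitted, and the parameter values and function names are invented for the example.

```python
import random


def ols_slope(x, y):
    """Slope of a simple regression of y on x (intercept included)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def simulate_mz_pairs(alpha=0.05, gamma=0.10, n_pairs=40_000, seed=2):
    """Return (pooled OLS slope, within-pair slope) for simulated MZ twins."""
    rng = random.Random(seed)
    s_all, w_all, d_s, d_w = [], [], [], []
    for _ in range(n_pairs):
        a = rng.gauss(0, 1)  # ability shared by both twins, so delta A = 0
        s1 = 12 + a + rng.gauss(0, 1)
        s2 = 12 + a + rng.gauss(0, 1)
        w1 = alpha * s1 + gamma * a + rng.gauss(0, 0.3)
        w2 = alpha * s2 + gamma * a + rng.gauss(0, 0.3)
        s_all += [s1, s2]
        w_all += [w1, w2]
        d_s.append(s1 - s2)
        d_w.append(w1 - w2)
    pooled = ols_slope(s_all, w_all)  # treats twins as unrelated individuals
    within = ols_slope(d_s, d_w)      # regress delta w on delta S
    return pooled, within


pooled_est, within_est = simulate_mz_pairs()
print(round(pooled_est, 3), round(within_est, 3))
```

Here var(S) = 2 and cov(A, S) = 1, so the pooled slope should be close to 0.05 + 0.10/2 = 0.10, while the within-pair slope should be close to the assumed α of 0.05.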

  13. Existing MZ twins literature

  14. Double trouble: Bound & Solon, Neumark (EconEducRev 1999) • Measurement error in S • $S = S^* + v$, where $S^*$ is true S • Differencing S data exaggerates ME bias • Our S is “normed” from qualifications data • so ME in ΔS is probably very large • $\mathrm{plim}\,\hat\alpha_{WT} = \alpha(1 - r/(1-\rho))$ • where $r = \sigma_v^2/\sigma_S^2$, but $\rho = \mathrm{corr}(S_1, S_2) \approx 1$ • Need to use IV to deal with ME • Princeton work uses cross-reported ΔS as IV • We have lots of x-reports
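
To see how quickly the within-twin attenuation grows as the twins' schooling becomes more alike, the formula on this slide can be evaluated directly. The sketch below is only arithmetic on that formula; the function names and the input values (α = 0.045, r = 0.10, and the three values of ρ) are hypothetical and are not estimates from the talk.

```python
def cross_section_plim(alpha, r):
    """plim of cross-sectional OLS with classical ME: alpha * (1 - r)."""
    return alpha * (1 - r)


def within_twin_plim(alpha, r, rho):
    """plim of within-twin OLS with classical ME: alpha * (1 - r / (1 - rho)).

    r   = var(v) / var(S), the noise share of observed schooling
    rho = correlation of observed schooling between the two twins
    """
    if not 0 <= rho < 1:
        raise ValueError("rho must lie in [0, 1)")
    return alpha * (1 - r / (1 - rho))


# Hypothetical inputs chosen only to illustrate the formula.
alpha, r = 0.045, 0.10
for rho in (0.0, 0.5, 0.75):
    print(rho,
          round(cross_section_plim(alpha, r), 4),
          round(within_twin_plim(alpha, r, rho), 4))
```

At ρ = 0 the two expressions coincide; as ρ rises towards 1 the term r / (1 - ρ) grows without bound, which is why even modest noise in S can swamp the within-pair signal.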

  15. More double trouble: • Why do identical twins differ in S? • ΔS may not be random • individual-specific component of A may remain • Need to instrument ΔS for this reason (even if ME=0) • Education reforms may not work as IVs • Twins have same values of the Z’s? • Family background probably won’t either • Twins have same • Bonjour et al (AER 2004) • But we (think we) do have an IV idea • and DZ estimate of α provides tighter upper bound on the true α than OLS does.

  16. Our contribution 2: Dealing with measurement error • S comes from admin registers • Low measurement error in qualifications • So get unattenuated estimates of college premium • But to get S we need to “norm” the quals data • High var(S) associated with any highest qual • So we probably have very large ME in S • Alternative measures of S • Princeton work uses x-reported S • We have twin’s spouse’s S as well as conventional x-reports from survey data
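
The idea of using a second measure of S as an instrument for the first can be sketched as follows. This is an illustrative simulation, not the authors' estimator: it assumes two reports of the within-pair schooling difference whose errors are classical and independent of each other, and the names `report_1`, `report_2` and `simulate_two_reports` and all parameter values are invented for the example.

```python
import random


def cov(x, y):
    """Sample covariance (dividing by n is fine for a ratio of covariances)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / n


def simulate_two_reports(alpha=0.05, n_pairs=60_000, seed=3):
    """Return (OLS on one noisy report, IV using the second report)."""
    rng = random.Random(seed)
    d_true = [rng.gauss(0, 1) for _ in range(n_pairs)]   # true delta S
    d_w = [alpha * d + rng.gauss(0, 0.4) for d in d_true]
    # Two noisy reports of delta S with independent errors (assumed).
    report_1 = [d + rng.gauss(0, 1) for d in d_true]
    report_2 = [d + rng.gauss(0, 1) for d in d_true]
    ols = cov(report_1, d_w) / cov(report_1, report_1)   # attenuated
    iv = cov(report_2, d_w) / cov(report_2, report_1)    # noise cancels
    return ols, iv


ols_est, iv_est = simulate_two_reports()
print(round(ols_est, 3), round(iv_est, 3))
```

With equal signal and noise variances the OLS slope should be roughly half the assumed α, while the IV ratio should be close to α. If the two reports shared an error component, the IV ratio would no longer be consistent.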

  17. Our contribution 3: Dealing with endogenous $\Delta S_i$ • S differences may not be random • Individual component of A not differenced out • Need an instrument to purge $\Delta S_i$ of remaining A diffs • something that affects twin 1’s S but not twin 2’s • School size affects whether twins can be separated • Important in DK: teacher gets fixed in grade 1 • Twins in 1-class schools have smaller ΔS than twins in 2+ class schools • Expect bigger effect from instruments for DZs • since more $\Delta A_i$ remains than for MZs

  18. Danish data • Merge several administrative databases via CPR • Use 1970 Census to link children to mums • dob identifies multiple births • 1970+ match via birth records • About 1000 Danish multiple pregnancies each year • More Danish triplets than Princeton has twins • Twins odds about 1 in 80 • Triplet odds about 1 in 8000

  19. Twins sample selection • Over ½ million twin-year working-age observations • Around 24k pairs over up to 25 years • Drop the triplets, quads… • Select MZs, same-sex DZs, age 25-55 • Select if earnings observed (at least twice) between 1980 and 2005 • Select those working full-time and full-year • to reduce the problem that we have only annual earnings, not an hourly wage rate • 4185 MZ pairs, 6343 DZ pairs

  20. Variables • Income data comes from tax returns • we don’t have good hours of work data • Schooling data • Not available for “special” schools • Few IVF cases yet, and no immigrants • Zygosity questionnaire • 4 “peas in a pod?” type questions • 96% match to DNA in small subsample • Christensen et al (Twins Research 2003) • Christensen et al (BMJ 2006) • similar test scores at 16 as singletons • Even though they average 900 grams lighter

  21. Summary statistics

  22. Distribution of ΔS

  23. Basic results: twins and singletons compared • Pool data across waves, adjust std errs • Singleton (we have these too) estimates • $\alpha_m^{S} = 0.031$ (0.0005), $\alpha_f^{S} = 0.037$ (0.0005) • Treating twins as singletons we get • $\alpha_m^{MZ} = 0.030$ (0.0005), $\alpha_f^{MZ} = 0.037$ (0.0006) • Almost same for DZs • Twins are just like singletons • IV (using twin spouse S as IV) • $\alpha_m^{MZ} = 0.065$ (0.0011), $\alpha_f^{MZ} = 0.054$ (0.0014) • Conclusion • Implied very low reliability: 0.5 for m, 0.7 for f
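
The reliability figures in the conclusion appear to follow from the ratio of the OLS to the IV estimate: under classical measurement error, and assuming the IV estimate is consistent, plim(OLS) / plim(IV) equals the reliability ratio 1 - σ²_v/σ²_S. A short check using the MZ point estimates quoted on this slide (the function name is hypothetical):

```python
def implied_reliability(ols, iv):
    """Reliability ratio implied by OLS / IV under classical measurement error."""
    return ols / iv


# (OLS, IV) point estimates for MZ twins as quoted on the slide.
estimates = {"men": (0.030, 0.065), "women": (0.037, 0.054)}
reliability = {group: round(implied_reliability(ols, iv), 2)
               for group, (ols, iv) in estimates.items()}
print(reliability)  # {'men': 0.46, 'women': 0.69}
```

These round to the 0.5 and 0.7 stated on the slide.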

  24. Basic FE results – twin differences • Expect huge attenuation bias in OLS on twin differences (i.e. FE estimation) • MZs: $\alpha_m = 0.005$ (0.001), $\alpha_f = 0.009$ (0.001) • DZs: $\alpha_m = 0.018$ (0.001), $\alpha_f = 0.025$ (0.001) • So FEIV estimates much higher, especially for DZs • MZs: $\alpha_m = 0.045$ (0.010), $\alpha_f = 0.044$ (0.008) • DZs: $\alpha_m = 0.095$ (0.006), $\alpha_f = 0.054$ (0.006) • Conclusion • Large returns (by DK standards) of 4½% on average over the 80s and 90s

  25. Returns over time • Rolling 10 year window over 1980-2002 • MZs yield $\alpha_t$ (return to observed skills) • DZs yield $\alpha_t + \gamma_t(\sigma_{AS}/\sigma_S^2)_c$ • where γ = return to unobserved skill • $(\sigma_{AS}/\sigma_S^2)_c$ is fixed for all members of cohort c • So difference between MZ and DZ estimates is proportional to return to unobserved skills • We have a long panel • So it’s also possible to distinguish cohort effects
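
The identification argument on this slide can be written out in a few lines. This is a sketch in notation introduced here for illustration (the MZ and DZ superscripts and the second year t' are not from the slides), writing the cohort term as the covariance of A and S over the variance of S, and taking the slide's expressions for what each twin type yields as given:

```latex
\begin{align*}
\operatorname{plim}\hat\alpha^{MZ}_{t} &= \alpha_t \\
\operatorname{plim}\hat\alpha^{DZ}_{t,c} &= \alpha_t
    + \gamma_t\left(\frac{\sigma_{AS}}{\sigma_S^2}\right)_c \\
\operatorname{plim}\bigl(\hat\alpha^{DZ}_{t,c} - \hat\alpha^{MZ}_{t}\bigr)
    &= \gamma_t\left(\frac{\sigma_{AS}}{\sigma_S^2}\right)_c \\
\frac{\operatorname{plim}\bigl(\hat\alpha^{DZ}_{t,c} - \hat\alpha^{MZ}_{t}\bigr)}
     {\operatorname{plim}\bigl(\hat\alpha^{DZ}_{t',c} - \hat\alpha^{MZ}_{t'}\bigr)}
    &= \frac{\gamma_t}{\gamma_{t'}}
\end{align*}
```

Because the cohort term cancels in the last line, the DZ-MZ gap for a given cohort in two different years pins down the relative change in γ, though not its level.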

  26. Returns over time (no cohort effects)

  27. Extension 1: Time, cohort and lifecycle effects • Time dimension of data identifies time variation in $\gamma_t$ • If the panel were balanced then we could treat $(\sigma_{AS}/\sigma_S^2)$ as a constant • Estimate $\gamma_t$ from the balanced part of the data • Different birth cohorts identify $\sigma_{AS,c}$ • Estimates suggest that recent cohorts have lower $\sigma_{AS,c}$ • $\gamma_t$ gets correspondingly higher in recent years • But still not significantly rising

  28. Extension 2: Endogenous ΔS • Problem that ΔS might be correlated with ΔA • A is not (entirely) a family effect • So α biased upwards because of ΔA bias • Need an IV for ΔS (even if no ME)? • Usual suspects won’t work • Need var that affects twin 1’s S but not twin 2’s • Different classes • We don’t know if twins were separated • But twins could be separated if 2+ classes • 46% of schools have single class entry

  29. Extension 2: Endogenous ΔS • Twins in 1-class schools have smaller ΔS • MZs: $\Delta_{1,2+}\Delta S = -0.30$ (male), $-0.22$ (female) • DZs: $\Delta_{1,2+}\Delta S = -0.19$ (male), $-0.14$ (female) • 1-class twins have same Δw | ΔS as 2-class • Number of classes affects w only through S • IV estimation eliminates remaining A bias • MZs: $\alpha_m = 0.040$ (0.009), $\alpha_f = 0.041$ (0.009) • DZs: $\alpha_m = 0.043$ (0.015), $\alpha_f = 0.043$ (0.021) • Conclusion: • very small ΔA-bias in MZ FE, larger in DZ FE

  30. Extension 3: Nonlinear effects • Nonlinear schooling effects • interaction between twin average S and ΔS • α(S) significantly decreasing in S • No change in convexity over time • Returns to college vs high school • No ME in college degree reporting • 1990’s returns to “Bachelor” about 30% • 1990’s returns to “Masters” about 15% • Rising college premium over time • With strong cohort effects • No rise in returns to unobserved skills

  31. Extension 4: Self and cross reported S • Available from the new twins omnibus survey • Match to register data via CPR • All DK twins included in survey • Response rate 80%+ of pairs • Little attenuation when using conventional x-reports as IVs for self-reported S • OLS MZs: $\alpha_m = 0.038$ (0.011), $\alpha_f = 0.039$ (0.013) • IV MZs: $\alpha_m = 0.041$ (0.017), $\alpha_f = 0.042$ (0.019) • Other useful information • Childhood illnesses, birth weight, best friend’s background and behaviour …

  32. Conclusion • There is a lot that we know (or can know) from the data we have • But there is a lot we still don’t know • Only better data will enable us to know more • Diminishing returns to econometric ingenuity have set in • With much better data there is not much that cannot be known • Slides available from Liam • PS Remember - apply to Lancaster
