Selection-on-observables methods (matching)

Nicolas STUDER (DREES)

Contents • Reminder • Gerfin, Lechner, Steiger (2005) • Sianesi (2004) • Conclusions

Reminder - Evaluation • Evaluation = missing data problem (counterfactual) • In practice, identify a (control) group of individuals who did not participate in the program and who would have shown the same results as the participants had they participated (same potential effect)

Reminder - Matching on observables • Rubin causal model (no externalities, no general equilibrium effects) • For every « treated » individual, look for a non-treated one with the same (or similar) characteristics • The causal effect is identified if the CIA (conditional independence assumption) holds: Y0i = Y0j given Ti = 1, Tj = 0, Xi = Xj, i.e. the untreated outcome of a treated individual can be read off a non-treated individual with the same characteristics
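The CIA and the resulting estimator can be written more formally. The block below is a sketch in standard potential-outcome notation, not taken from the slides: Y_0 is the outcome without treatment, T the treatment indicator, X the observed characteristics, N_1 the number of treated individuals and j(i) the non-treated individual matched to i. The CIA is given in its mean form, which is what the matching estimator of the average effect on the treated needs.

```latex
% CIA in mean form: conditional on X, the mean untreated outcome
% is the same for treated and non-treated individuals
E\left[ Y_0 \mid T = 1, X = x \right] = E\left[ Y_0 \mid T = 0, X = x \right]

% Matching estimator of the average effect on the treated:
% each treated i is compared with its non-treated match j(i)
\hat{\tau}_{ATT} = \frac{1}{N_1} \sum_{i \,:\, T_i = 1} \left( Y_i - Y_{j(i)} \right),
\qquad T_{j(i)} = 0, \quad X_{j(i)} \approx X_i
```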

Reminder - Propensity score matching • The CIA requires a huge number of conditioning variables to hold; exact matching on all of them then becomes very poor and the estimator does not converge • The score s(X) = Prob(T = 1 | X) reduces the dimensionality: si = sj is enough for the CIA to hold • « Balancing score »: treated and non-treated groups with the same score should be similar
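Once a score is available, matching reduces to a one-dimensional search. The following is a minimal sketch, assuming the scores have already been estimated (for example with a probit or logit); the function name `att_nearest_score` and all numbers are made up for illustration.

```python
def att_nearest_score(treated, controls):
    """Average effect on the treated by one-to-one matching on the score.

    treated, controls: lists of (score, outcome) pairs (hypothetical layout).
    Each treated unit is matched, with replacement, to the control unit
    whose score is closest.
    """
    if not treated or not controls:
        raise ValueError("need at least one treated and one control unit")
    gaps = []
    for s_t, y_t in treated:
        # nearest control in terms of absolute score distance
        s_c, y_c = min(controls, key=lambda c: abs(c[0] - s_t))
        gaps.append(y_t - y_c)
    return sum(gaps) / len(gaps)


# Illustrative data only: (score, outcome)
treated = [(0.80, 10.0), (0.30, 7.0)]
controls = [(0.75, 8.0), (0.35, 6.0), (0.10, 2.0)]
att = att_nearest_score(treated, controls)
print(att)
```

Here the first treated unit is paired with the control at score 0.75 and the second with the control at 0.35; the control at 0.10 is never used.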

Does subsidized temporary employment get the unemployed back to work? (Gerfin, Lechner, Steiger, 2005) • 3 types of programs in Switzerland: - executive education (courses) - subsidized temporary job (TEMP) - job in a non-profit organization (EP) • Programs take place simultaneously • Compare the programs’ effects on: - « good » reemployment (>3 continuous months, >90% of last earnings) at date t - earnings at date t (0 if not employed) - months of unemployment in the following year

Public policy context • A number of active labour market policy instruments in different countries • France: PPE, payroll tax reductions (allègements de charges), emplois jeunes, emplois tremplins • Rationale: increase human capital or fight against its depreciation (Lazarsfeld et al., 1932), show one’s motivation, testing • Stigmatization, creation of a parallel labour market?

Method and data • « Propensity score matching » • Multinomial probit (Imbens, 2000): EP, TEMP or no program • Mahalanobis distance, only one « match » per treated individual, but the same observation may be the « match » of several • Administrative data (social security): history over the last 10 years and follow-up over 24 months • Sample = unemployed for less than a year on 31 December 1997, aged 25-55, first program in 1998 • 3 proxies for unobservables: - motivation = benefit sanctions - abilities = last earnings - personal appearance = counsellor’s (placement officer’s) subjective evaluation
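Matching on a Mahalanobis distance with replacement can be sketched as follows. This is an illustration under simplifying assumptions, not the authors' code: each individual is described by a hypothetical two-dimensional vector, the covariance matrix is computed on the pooled sample, and all names and numbers are invented.

```python
from collections import Counter


def pooled_cov(points):
    """Sample covariance (sxx, sxy, syy) of a list of 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return sxx, sxy, syy


def mahalanobis_sq(a, b, cov):
    """Squared Mahalanobis distance between 2-D points a and b."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy ** 2
    if det <= 0:
        raise ValueError("covariance matrix is not invertible")
    dx, dy = a[0] - b[0], a[1] - b[1]
    # d' S^{-1} d with S^{-1} = (1/det) * [[syy, -sxy], [-sxy, sxx]]
    return (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det


def match_with_replacement(treated_x, control_x):
    """Index of the closest control for each treated unit (controls reusable)."""
    cov = pooled_cov(treated_x + control_x)
    return [
        min(range(len(control_x)),
            key=lambda j: mahalanobis_sq(t, control_x[j], cov))
        for t in treated_x
    ]


# Illustrative data only
treated_x = [(1.0, 2.0), (1.2, 2.1), (5.0, 6.0)]
control_x = [(1.1, 2.0), (4.8, 6.2), (9.0, 1.0)]
matches = match_with_replacement(treated_x, control_x)
reuse = Counter(matches)  # how many times each control serves as a match
print(matches, dict(reuse))
```

The `reuse` counter makes visible how often the same control observation is the match of several treated individuals, which matters for the precision of the estimate.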

Descriptive statistics

Results – Which program is the best?

Heterogeneity - Skills • EP may be bad for those with high skills • No long-term effect for EP and TEMP • For those with low skills, TEMP has a positive effect compared with both EP and no program

Heterogeneity – Unemployment duration • One expects a bigger effect if unemployment duration is already long • True for both programs • Stronger « lock-in » if < 180 days • No evidence of an EP stigma, positive signalling for TEMP

Discussion – Internal validity (1) • Conditional on CIA • CIA needs a lot of control variables to hold • Here two different selection processes: - EP based on counsellor’s decision - the unemployed need to find a TEMP job themselves • Counsellor’s evaluations may be collinear with observable characteristics • Matching on unobservables (instrumenting the treatment) would be more suitable

Discussion – Internal validity (2) • No standard errors reported; they must be estimated by bootstrapping • Groups differ in size = small groups are over-weighted • No robustness checks, especially for the specification of the propensity score • « n nearest neighbours » and « kernel » approaches are more robust • Heckman’s specification test (1989) of the propensity score = use history
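A naive bootstrap of the matching estimator could look like the sketch below. It is an illustration only: the estimator is a simple nearest-score match, the data are invented, the names `bootstrap_se` and `att_nearest_score` are hypothetical, and a complete version would re-estimate the score inside every replication. The validity of the bootstrap for one-to-one matching is itself not guaranteed, so the result should be read as indicative.

```python
import random
import statistics


def att_nearest_score(treated, controls):
    """One-to-one nearest-score matching with replacement (see assumptions above)."""
    gaps = []
    for s_t, y_t in treated:
        s_c, y_c = min(controls, key=lambda c: abs(c[0] - s_t))
        gaps.append(y_t - y_c)
    return sum(gaps) / len(gaps)


def bootstrap_se(treated, controls, n_boot=200, seed=0):
    """Standard deviation of the estimate across bootstrap resamples."""
    rng = random.Random(seed)  # local generator, so results are reproducible
    draws = []
    for _ in range(n_boot):
        # resample each group separately, with replacement, keeping group sizes
        t_star = [rng.choice(treated) for _ in treated]
        c_star = [rng.choice(controls) for _ in controls]
        draws.append(att_nearest_score(t_star, c_star))
    return statistics.stdev(draws)


# Illustrative data only: (score, outcome)
treated = [(0.80, 10.0), (0.60, 9.0), (0.30, 7.0), (0.50, 6.5)]
controls = [(0.75, 8.0), (0.55, 7.5), (0.35, 6.0), (0.10, 2.0), (0.45, 5.0)]
se = bootstrap_se(treated, controls)
print(se)
```

Resampling the two groups separately keeps their sizes fixed, which is one possible design choice among several.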

Discussion – External validity • Matching only possible on common support (small loss here: 3% only) • Bigger restriction on the population (20%) for homogenization purposes: results apply only to individuals aged 25-55, without other occupations, unemployed for the first time • Swiss context: low unemployment rate = lower competition on the labour market • General equilibrium effects = negative externalities on the non-treated because of competition and stigmatisation; one could look at how the program’s effect varies with the number of places available in the district
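The common support restriction can be illustrated with a short sketch. It assumes the simple min/max rule (keep treated individuals whose score lies within the range of scores observed in both groups), which is only one possible convention; the function names and scores are made up.

```python
def common_support(treated_scores, control_scores):
    """Interval of scores observed in both groups (min/max rule, an assumption)."""
    low = max(min(treated_scores), min(control_scores))
    high = min(max(treated_scores), max(control_scores))
    return low, high


def trim_treated(treated_scores, control_scores):
    """Keep treated units on the common support and report the share dropped."""
    low, high = common_support(treated_scores, control_scores)
    kept = [s for s in treated_scores if low <= s <= high]
    share_lost = 1 - len(kept) / len(treated_scores)
    return kept, share_lost


# Illustrative scores only
treated_scores = [0.20, 0.40, 0.60, 0.95]
control_scores = [0.05, 0.30, 0.50, 0.80]
kept, share_lost = trim_treated(treated_scores, control_scores)
print(kept, share_lost)
```

In this toy example the treated individual with score 0.95 has no comparable control and is dropped, so a quarter of the treated sample is lost; the estimate then only applies to the remaining individuals.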

Comparison with randomized controlled trials (RCT) • Two non-parametric (flexible) methods • Internal validity: - RCT = « gold standard » if the protocol is strictly enforced, in spite of John Henry and Hawthorne effects - Matching on observables = CIA, needs lots of data (different points in time), dependent on the score’s specification, bias - Attrition, externalities and general equilibrium effects are a problem for both methods • External validity: - Both methods provide a local estimator - Often larger sample with matching, in spite of the common support restriction - But the CIA will not hold with a very heterogeneous population

Doing better (?) : Sianesi (2004) • Unemployment duration at entry into the program is taken into account; this is important because participation renews entitlement to benefits • Compare participating at T with not participating for t <= T: modelling of sequential choices • Data allow individuals to be followed over 6 years, survey on factors influencing choices, data on the local labour market situation • Re-weighting within common support • Robustness checks concerning attrition and misclassification problems
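The comparison of "joining at T" with "not joining up to T" can be sketched as the construction of a risk set for each elapsed duration u. The code below is an illustration under assumptions, not the paper's implementation: the record layout (`spell_months`, `entry_month`) and the function name `risk_set` are hypothetical, and the data are invented.

```python
def risk_set(people, u):
    """Split those still unemployed after u months into joiners and waiters.

    people: list of dicts with hypothetical keys
      "id"           identifier
      "spell_months" months spent unemployed before leaving the register
      "entry_month"  month of first programme entry, or None if never
    Joiners enter their first programme in month u; waiters have not
    entered any programme up to and including month u.
    """
    joiners, waiters = [], []
    for p in people:
        if p["spell_months"] < u:
            continue  # left unemployment before month u
        entry = p["entry_month"]
        if entry == u:
            joiners.append(p["id"])
        elif entry is None or entry > u:
            waiters.append(p["id"])
        # entry < u: already treated earlier, excluded from both groups
    return joiners, waiters


# Illustrative records only
people = [
    {"id": "a", "spell_months": 10, "entry_month": 4},
    {"id": "b", "spell_months": 12, "entry_month": None},
    {"id": "c", "spell_months": 3, "entry_month": None},
    {"id": "d", "spell_months": 9, "entry_month": 7},
    {"id": "e", "spell_months": 8, "entry_month": 2},
]
joiners, waiters = risk_set(people, u=4)
print(joiners, waiters)
```

Note that a waiter at month 4 (individual "d") may still join a programme later, so the comparison is with waiting rather than with never participating.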

Context • Activation policy in Sweden during the 1990s • In addition to placement, the unemployed can take part in training and « motivation » activities, which are considered as jobs and thus renew entitlement to benefits • « Generous » unemployment benefits: up to 80% of last wage during 60 weeks (if employed more than 5 months during the last 12) + possibility of 30 additional weeks (KAS) • Programs are considered as a whole, « treatment » = date of entry into the first program during the first unemployment period • Sample of individuals who became unemployed in 1994 (recession peak)

Factors influencing choices • Subjective probability of finding a job (Harckman, 2000) • Depends on unemployment duration, part-time occupations, sociodemographic characteristics (age, gender, nationality), human capital • Data on all this • Counsellor’s evaluation for appearance and motivation • CIA: myopia conditional on observables, so one controls for last job characteristics and the month of entry into employment

Results (1)

Results (2)

« Managing » attrition • Results show that attrition is differential • If the misclassification rate (« lost » individuals who in fact found a job) is 50% (Bring and Carling, 2000), the effect would be halved • 2 alternatives: considering each individual with misclassification probability >= u as an employed one, counting an individual with probability as 1/ pi of an employed one • Assumes Prob(employed|lost) is equal among treated and non-treated; in practice one looks at best and worst cases
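The best-case and worst-case reasoning can be made concrete with a small calculation. This is a sketch with invented counts, not figures from the paper: each group is described by a hypothetical dictionary giving the number observed in employment, the number « lost » and the group size, and the effect is the difference in employment rates.

```python
def employment_rate(group, share_lost_employed):
    """Employment rate if a given share of the 'lost' are in fact employed."""
    employed = group["employed"] + share_lost_employed * group["lost"]
    return employed / group["total"]


def effect_bounds(treated, control):
    """Worst and best case for the treated-minus-control employment gap."""
    # worst case: no lost treated is employed, every lost control is
    worst = employment_rate(treated, 0.0) - employment_rate(control, 1.0)
    # best case: every lost treated is employed, no lost control is
    best = employment_rate(treated, 1.0) - employment_rate(control, 0.0)
    return worst, best


# Illustrative counts only
treated = {"employed": 40, "lost": 10, "total": 100}
control = {"employed": 30, "lost": 20, "total": 100}

worst, best = effect_bounds(treated, control)
# same Prob(employed | lost) in both groups, here set to 50%
equal_share = employment_rate(treated, 0.5) - employment_rate(control, 0.5)
print(worst, equal_share, best)
```

With these made-up counts the gap ranges from about -0.10 to 0.20, and equals about 0.05 when half of the lost are employed in both groups, which shows how much differential attrition can matter.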

A disincentive ? (1) • The fact that program participation renews entitlement to benefits creates opportunistic behaviour • The effect on employment is not significant for individuals who enter the program after 15 months of unemployment

A disincentive ? (2) • Heterogeneous effect between entitled and non-entitled • « Compensation cycle »: the fact that an entitled individual enters a programme after 15 months increases his or her probability of entering a programme 14 months later

Conclusion (1) • Activation policies: - Lock-in effect in the short term - Subsidized private sector jobs are more efficient, especially for those with low qualifications - Effect is stronger on the long-term unemployed - In Sweden, positive effect in the short term on participation in other programs, and on employment in the long run - No effect on individuals who are at the end of their entitlement period = evidence of opportunistic behaviour

Conclusion (2) • Selection-on-observables methods: - Propensity score matching is almost a « must » for the CIA to hold and the estimator to converge - CIA credibility depends on the selection process and on data richness; first-differencing controls for individual (fixed) effects and improves the results - Reweighting and common support are important sources of bias - Specification of the score is important (Smith and Todd, 2005), « kernel » most robust - Attrition => best and worst cases • Matching on unobservables: need to specify the joint distribution of treatment and potential outcome

Essays • Huber, Lechner, Wunsch and Walter, 2009, « Do German welfare-to-work programmes reduce welfare and increase work? », IZA Discussion Paper No. 4090 • Blundell, Dearden, Sianesi, 2003, « Evaluating the impact of education on earnings in the UK: Results from the NCDS », IFS, WP03/20 • Dearden, Emmerson, Frayne, Meghir, 2005, « Education Subsidies and School Drop-Out Rates », IFS, WP05/11
