A stochastic dominance approach to program evaluation

A stochastic dominance approach to program evaluation, and an application to child nutritional status in arid and semi-arid Kenya • Felix Naschold, Cornell University & University of Wyoming • Christopher B. Barrett, Cornell University • Cornell University Nutrition and Food Science & Technology seminar, March 7, 2011

Motivation • Program Evaluation Methods • By design they focus on the mean. Ex: “average treatment effect” (ATE) • In practice, often interested in broader distributional impact • Limited possibility for doing this by splitting sample • Stochastic dominance • By design, looks at entire distribution • Now commonly used in snapshot welfare comparisons • But not for program evaluation. Ex: “differences-in-differences” • This paper merges the two → Diff-in-Diff (DD) evaluation using stochastic dominance (SD) to compare changes in distributions over time between intervention and control populations

Main Contributions • Proposes DD-based SD method for program evaluation • First application to evaluating welfare changes over time • Specific application to new dataset on changes in child nutrition in arid and semi-arid lands (ASAL) of Kenya • Unique, large dataset of 600,000+ observations collected by the Arid Lands Resource Management Project (ALRMP II) in Kenya • (One of the) first to use Z-scores of mid-upper arm circumference (MUAC)

Main Results • Methodology • (Relatively) straightforward extension of SD to dynamic context: static SD results carry over • Interpretation differs (as based on cdfs) • Only feasible up to second order SD • Empirical results • Child malnutrition in Kenyan ASALs remains dire • No average treatment effect of ALRMP expenditures • Differential impact with fewer negative changes in treatment sublocations • ALRMP a nutritional safety net?

Program evaluation (PE) methods • Fundamental problem of PE: want to but cannot observe a person’s outcomes in both treatment and control state • Solution 1: make treatment and control look the same (randomization) • Gives average treatment effect as [equation not recovered] • Solution 2: compare changes across treatment and control (Difference-in-Difference) • Gives average treatment effect as [equation not recovered]
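The two estimators named on this slide are shown as equations that do not appear in the text. A textbook way to write them is sketched below. The notation is an assumption, not taken from the slide: y is the outcome, D is a treatment indicator, and the subscripts 0 and 1 mark the periods before and after the intervention.

```latex
% Solution 1 (randomization): difference in mean outcomes between groups
\mathrm{ATE} = E[\,y \mid D = 1\,] - E[\,y \mid D = 0\,]

% Solution 2 (difference-in-difference): difference in mean changes between groups
\mathrm{ATE}_{DD} = \bigl( E[\,y_{1} \mid D = 1\,] - E[\,y_{0} \mid D = 1\,] \bigr)
                  - \bigl( E[\,y_{1} \mid D = 0\,] - E[\,y_{0} \mid D = 0\,] \bigr)
```

Both expressions summarise impact in a single mean, which is the limitation the SD approach is meant to address.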

New PE method based on SD • Objective: to look beyond the ‘average treatment effect’ • Approach: SD compares entire distributions not just their summary statistics • Two advantages • Circumvents (highly controversial) cut-off point. Examples: poverty line, MUAC Z-score cut-off • Unifies analysis for broad classes of welfare indicators

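Stochastic Dominance • [Figure: cdfs F_A(x) and F_B(x); vertical axis: cumulative % of population; horizontal axis: MUAC Z-score, with 0 and x_max marked] • First order: A FOD B up to x_max iff [equation not recovered] • Sth order: A sth order dominates B iff [equation not recovered]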
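The dominance conditions on this slide are displayed as equations that are not in the text. A standard statement of them is sketched below for an indicator where more is better, such as a MUAC Z-score. The dominance-curve symbol D^s and the lower bound x_min of the support are assumed notation. Many authors additionally require the inequality to be strict at some point.

```latex
% First order: the cdf of A lies nowhere above the cdf of B on the range
A \ \text{FOD} \ B \ \text{up to } x_{\max}
  \iff F_A(x) \le F_B(x) \quad \forall\, x \le x_{\max}

% Higher orders: integrate the cdf repeatedly, starting from D^1 = F
D^{1}(x) = F(x), \qquad D^{s}(x) = \int_{x_{\min}}^{x} D^{s-1}(u)\, du

% s-th order dominance up to x_max
A \ \text{dominates} \ B \ \text{at order } s
  \iff D^{s}_A(x) \le D^{s}_B(x) \quad \forall\, x \le x_{\max}
```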

SD and single differences • These SD criteria • Apply directly to single difference evaluation (across time OR across treatment and control groups) • Do not directly apply to DD • Literature to date: • Single paper: Verme (2010) on single differences • SD entirely absent from the program evaluation literature (e.g., Handbook of Development Economics)

Expanding SD to DD estimation - Method • Practical importance: evaluate beyond-mean effect in non-experimental data • Let [equation not recovered], and G denote the set of probability density functions of Δ, with [equation not recovered] • The respective cdfs of changes are G_A(Δ) and G_B(Δ) • Then A FOD B iff [equation not recovered] • A Sth order dominates B iff [equation not recovered]
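As a rough illustration of the idea on this slide, the sketch below compares the empirical distributions of changes Δ in two groups. It is not the authors' implementation: the function names, the tolerance, and the data are all hypothetical, and it only checks the point estimates of the dominance curves without any statistical test. Checking at the pooled sample points is exact for orders 1 and 2, where the difference between the curves is a step function or piecewise linear; for higher orders it is only a grid approximation.

```python
from math import factorial


def dominance_curve(changes, x, s=1):
    """Empirical dominance curve D^s(x) for a sample of changes.

    s=1 is the empirical cdf; higher s corresponds to integrating it s-1 times.
    """
    n = len(changes)
    if s == 1:
        return sum(1 for d in changes if d <= x) / n
    return sum((x - d) ** (s - 1) for d in changes if d <= x) / (n * factorial(s - 1))


def dominates(a, b, s=1, upper=None, tol=1e-12):
    """True if sample a dominates sample b at order s on the pooled sample points.

    Requires D^s_a <= D^s_b at every point (up to `upper`, if given)
    and a strict inequality at one point or more.
    """
    grid = sorted(set(a) | set(b))
    if upper is not None:
        grid = [x for x in grid if x <= upper]
    gaps = [dominance_curve(a, x, s) - dominance_curve(b, x, s) for x in grid]
    return all(g <= tol for g in gaps) and any(g < -tol for g in gaps)


# Hypothetical changes in median MUAC Z-score per sublocation (made-up numbers).
intervention = [-0.3, -0.1, 0.0, 0.1, 0.2]
control = [-0.6, -0.4, -0.1, 0.1, 0.2]

print(dominates(intervention, control, s=1))
print(dominates(control, intervention, s=1))
```

In this made-up sample the intervention group has fewer large negative changes, so its cdf of changes lies on or below the control cdf everywhere.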

Expanding SD to DD: interpretation differences 1. Cut-off point in terms of changes not levels. Cdf orders change from most negative to most positive → ‘initial poverty blind’ or ‘initial malnutrition blind’. (Partial) remedy: run on subset of ever-poor/always-poor 2. Interpretation of dominance orders. FOD: differences in distributions of changes between intervention and control sublocations. SOD: degree of concentration of these changes at lower end of distributions. TOD: additional weight to lower end of distribution. Is there any value to doing this for welfare changes irrespective of absolute welfare? Probably not.

Setting and data • Arid and semi-arid districts in Kenya • Characterized by pastoralism • Highest poverty incidences in Kenya, high infant mortality and malnutrition levels above emergency thresholds • Data • From Arid Lands Resource Management Project (ALRMP) Phase II • 28 districts, 128 sublocations, June 2005 to Aug 2009, 602,000 child obs. • Welfare Indicator: MUAC Z-scores • Severe malnutrition in 2005/06: • Median child MUAC Z-score -1.22/-1.12 (Intervention/Control) • 10 percent of children had Z-scores below -2.31/-2.14 (I/C) • 25 percent of children had Z-scores below -1.80/-1.67 (I/C)

The pseudo panel • Sublocation-specific pseudo panel 2005/06-2008/09 • Why pseudo-panel? • Inconsistent child identifiers • MUAC data not available for all children in all months • Graduation out of and birth into the sample • How? • 14 summary statistics for annual mean monthly sublocation-specific stats: mean & percentiles and ‘poverty measures’ • Focus on malnourished children • Thus, present analysis uses median MUAC Z-score of children with z ≤ 0 • Control and intervention defined according to project investment
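The collapse from child-level records to a sublocation pseudo panel could look roughly like the sketch below. It covers only one of the 14 statistics, the annual mean of the monthly median Z-score among children with z ≤ 0. The record layout, the function name, and the numbers are hypothetical; the actual ALRMP data structure is not described on the slide.

```python
from collections import defaultdict
from statistics import median


def build_pseudo_panel(records):
    """Collapse child records to one value per (sublocation, period).

    records: iterable of (sublocation, period, month, muac_z) tuples.
    Returns {(sublocation, period): mean over months of the monthly
    median Z-score among children with z <= 0}.
    """
    monthly = defaultdict(list)
    for subloc, period, month, z in records:
        if z <= 0:  # keep only the malnourished side of the distribution
            monthly[(subloc, period, month)].append(z)

    by_cell = defaultdict(list)
    for (subloc, period, _month), zs in monthly.items():
        by_cell[(subloc, period)].append(median(zs))

    return {cell: sum(vals) / len(vals) for cell, vals in by_cell.items()}


# Hypothetical child-level records (made-up numbers).
records = [
    ("A", "2005/06", 1, -1.0),
    ("A", "2005/06", 1, -2.0),
    ("A", "2005/06", 1, 0.5),   # dropped: z > 0
    ("A", "2005/06", 2, -1.5),
    ("A", "2008/09", 1, -1.0),
]

panel = build_pseudo_panel(records)
# Change over time for sublocation A, the quantity that enters the DD comparison.
change_a = panel[("A", "2008/09")] - panel[("A", "2005/06")]
print(change_a)
```

Because children are not tracked individually, the unit of observation becomes the sublocation, and the change is computed between sublocation-level statistics.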

Results: DD Regression • Pseudo panel regression model: [equation not recovered] • No statistically significant average program impact
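The regression model itself is not reproduced in the text. For orientation only, a generic two-period DD specification is sketched below. It is not necessarily the model estimated on this slide, and every symbol is assumed: j indexes sublocations, t indexes periods, T_j marks intervention sublocations, Post_t marks the later period, and X_jt collects controls such as district dummies.

```latex
% Generic difference-in-difference regression (illustrative, not the authors' exact model)
y_{jt} = \alpha + \beta\, T_{j} + \gamma\, \mathrm{Post}_{t}
       + \delta\, ( T_{j} \times \mathrm{Post}_{t} ) + X_{jt}'\theta + \varepsilon_{jt}

% \delta is the DD estimate of the average program impact
```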

DD regression panel results [table not recovered]. Notes: robust p-values in parentheses; *** p<0.01, ** p<0.05, * p<0.1; district dummy variables included.

SD Results Three steps: • Steps 1 & 2: Simple differences • SD within control and treatment over time: No difference in trends. Both improved slightly. • SD control vs. treatment at beginning and at end: Control sublocations dominate in most cases, intervention never dominates. • Step 3: SD on Diff-in-Diff (results focus for today)

Expanding SD to DD – controlling for covariates • In regression Diff-in-Diff: simply add (linear) controls • In SD-DD need a two-step method • Step 1: regress outcome variable on covariates • Step 2: use residuals (the unexplained variation) in SD-DD • In application below, use first-stage controls for drought (as reflected in remotely sensed vegetation measure, NDVI)
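A minimal sketch of the two-step idea, assuming a single covariate and a plain OLS first stage. The slide does not say whether the first stage is run on levels or on changes, nor which functional form is used. This example regresses hypothetical changes on a hypothetical NDVI value, and all names and numbers are made up.

```python
def ols_residuals(y, x):
    """Residuals from a one-covariate OLS regression of y on x with an intercept."""
    n = len(y)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


# Hypothetical sublocation-level data (made-up numbers).
ndvi = [0.10, 0.20, 0.30, 0.40, 0.15, 0.35]
change = [-0.50, -0.20, -0.10, 0.30, -0.60, 0.10]
treated = [True, True, True, False, False, False]

# Step 1: remove the part of the change explained by the drought proxy.
adjusted = ols_residuals(change, ndvi)

# Step 2: split the residuals by group; these replace the raw changes
# in the dominance comparison between intervention and control.
adj_intervention = [r for r, t in zip(adjusted, treated) if t]
adj_control = [r for r, t in zip(adjusted, treated) if not t]
print(adj_intervention, adj_control)
```

The first stage is pooled across both groups here so that the drought adjustment is the same for intervention and control sublocations; that pooling is a design choice of this sketch.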

For (drought-adjusted) median MUAC z-scores: Below z=0.2, intervention sites FOD control sites, although not at 5% statistical significance level. ALRMP interventions appear moderately effective in preventing worsening nutritional status among children.

Similar results at other quantile breaks

Conclusions • Existing program evaluation approaches focus on estimating the average treatment effect. In some cases, that is not really the impact statistic of interest. • This paper introduces a new SD-based method to evaluate impact across entire distribution for non-experimental data • Results show the practical importance of looking beyond averages • Standard Diff-in-Diff regressions: no impact at the mean • SD DD: intervention locations had fewer negative observations and of smaller magnitude, especially at the median and below • ALRMP II may have functioned as nutritional safety net (though this is only a correlation; there is no way to establish causality)

Thank you for your time, interest and comments

SD, poverty & social welfare orderings (1) 1. SD and Poverty orderings • Let SDs denote stochastic dominance of order s and Pα stand for poverty ordering (‘has less poverty’) • Let α=s-1 • Then A Pα B iff A SDs B • SD and Poverty orderings are nested • A SD1 B ⇒ A SD2 B ⇒ A SD3 B • A P1 B ⇒ A P2 B ⇒ A P3 B
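The equivalence between dominance and poverty orderings can be seen by writing both as the same integral. The sketch below uses an unnormalised poverty-gap measure and the symbol D^s for the dominance curve; both are assumed notation, with z the poverty line (or malnutrition cut-off) and F the cdf of the welfare indicator.

```latex
% Poverty-gap measure of order \alpha at cut-off z (unnormalised form)
P_{\alpha}(z) = \int_{x \le z} (z - x)^{\alpha}\, dF(x)

% Dominance curve of order s written as a single integral
D^{s}(z) = \frac{1}{(s-1)!} \int_{x \le z} (z - x)^{s-1}\, dF(x)

% With \alpha = s - 1 the two differ only by a positive constant
D^{s}(z) = \frac{1}{\alpha!}\, P_{\alpha}(z)
```

Since the constant is positive and the same for A and B, ranking the two distributions by D^s at every cut-off z gives the same answer as ranking them by P_α at every cut-off.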

SD, poverty & social welfare orderings (2) 2. Poverty and Welfare orderings (Foster and Shorrocks 1988) • Let U(F) be the class of symmetric utilitarian welfare functions • Then A Pα B iff A Uα B • Examples: • U1 represents the monotonic utilitarian welfare functions such that u’>0. Less malnutrition is better, regardless of for whom. • U2 represents equality preference welfare functions such that u’’<0. A mean-preserving progressive transfer increases U2. • U3 represents transfer-sensitive social welfare functions such that u’’’>0. A transfer is valued more the lower in the distribution it occurs. • Bottom line: For welfare levels, tests up to third order make sense

The data (2) – extent of malnutrition

DD Regression 2 • Individual MUAC Z-score regression • To test program impact with much larger data set • Still no statistically significant average program impact

Results – DD regression, individual data [table not recovered]. Notes: robust p-values in parentheses; *** p<0.01, ** p<0.05, * p<0.1; district dummy variables included.

Full SD results
