Political economy of aid evaluation: How to build a sustainable and effective movement Lant Pritchett (with Salimah Samji) Harvard Kennedy School Feb. 5, 2009
Outline of the presentation • How technocratic approaches can get it completely wrong—or “how I got my lunch eaten” (and that by people with no academic credentials) • MeE: An alternative to an exclusive focus on “Big E” evaluation that might actually work • (if there is time) What the poor do actually say…
My Indonesia Story • Crises of 1997/98: currency crisis, economic meltdown, Soeharto resignation, poverty rising rapidly • July 1998 budget of the new government allows (insists on) fiscal space for “Safety Net” programs • August 1998: I arrive in Indonesia working for the World Bank and end up in charge of a cumulative 1.2 billion dollar loan to support the design and financing of crisis-mitigation safety net programs (aside: total loans of the Grameen Bank in 2002 ≈ 200 million dollars)
Benefit Incidence of the Indonesian Crisis Safety Net Programs [Figure: benefits relative to the poorest quintile, by income quintile (I–V), comparing a uniform transfer (a dollar to each person, a ratio of 1 in every quintile), the actual performance of the safety net programs, and “perfect” targeting (benefits only to the poor).]
Budget for 2000/2001 • The economy was stagnant • Real wages were recovering, and hence poverty was falling • How much of the “safety net” programs should remain? • Nearly all of them got axed (scholarships, pro-poor health cards, employment creation)—in part because they were attacked for not being sufficiently targeted to the “poor,” as our own studies documented “leakage” to the “non-poor”
The subsidy to Kerosene • At the same time, there was an effort to eliminate the massive subsidy to fuel, including a subsidy to kerosene • Huge hue and cry, protests in the street, political opposition • The subsidy to kerosene was claimed to be “pro-poor” and hence untouchable • The kerosene subsidy (bigger than many safety net programs) was spared
The facts about benefit incidence [Figure: benefit incidence of the kerosene subsidy (the 4th quintile received 2.5 times what the poorest quintile did) compared with all of the safety net programs.]
The Truth—Which Sets You Free • The kerosene subsidy created a price differential with nearby Singapore and Malaysia • A small group of generals skimmed off a substantial amount of kerosene production and shipped it to Singapore, making millions of dollars a year • The protests were orchestrated (the going rate in the market for protestors was 2000 rp/day (plus lunch if all day, a snack if only afternoon)) • The newspaper editorials were similarly purchased (they were almost as cheap)
Kerosene Subsidy, round II • Design a “compensatory” targeted safety net scheme to cushion the “shock” to the poor of the kerosene price rise • Launch the price rise and the safety net program simultaneously in the next round of budget discussions—takes “the poor” off the table • The program itself is self-liquidating (as the magnitude of the transfer is based on the change in the price of fuel) • But the purpose of the program was not the poor and the impact of the kerosene price rise (which was trivially small); it was a two-step political gambit—get the fiscal savings from cutting the subsidy in the first round, taking the poor off the table, then get the program eliminated in the next round—which only a dozen or so people in the country understood.
What is “evaluation” about—finding out the most “cost-effective” policy/program designs? [Figure: program objectives (technocratically conceived) and the budget available to the program, plotted against a design parameter, with the “optimal design” marked. Technocratic naiveté: the budget is assumed to be the same for all designs.]
Rangel vs. Gingrich: a simple story about targeting (with a rigorous model) • A technocrat assuming a fixed budget tries to maximize the benefit to the poor • At any given budget, increasing targeting increases the well-being of the poor • But the true budget is downward sloping (more targeting means less budget) • The naïve pro-poor technocrat produces the worst outcome for the poor: full targeting of a zero budget
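A minimal numerical sketch of this targeting logic (the linear functional forms, the baseline budget of 100, and the variable names are my own illustrative assumptions, not the model from the talk):

```python
# Illustrative sketch only: functional forms and numbers are assumed.
# t = degree of targeting (share of benefits reaching the poor), in [0, 1].

def poor_benefit_fixed_budget(t, b0=100.0):
    """Naive technocrat's view: the budget b0 is the same for every design,
    so more targeting always raises the well-being of the poor."""
    return t * b0

def poor_benefit_political_budget(t, b0=100.0):
    """Political-economy view: the politically sustainable budget shrinks as
    targeting rises (assumed linear here) because the non-poor lose their stake."""
    return t * (b0 * (1.0 - t))

for t in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"t={t:.2f}  fixed-budget view={poor_benefit_fixed_budget(t):6.1f}"
          f"  political-budget view={poor_benefit_political_budget(t):6.1f}")
# The fixed-budget view says t=1.0 is best; with the downward-sloping budget,
# t=1.0 is full targeting of a zero budget and the poor receive nothing.
```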
What is “evaluation” about—finding out the most “cost-effective” policy/program designs—or informing actual policy? [Figure: program objectives per dollar and the budget available to the program, plotted against a design parameter, with designs A and B both above a cost-effectiveness “threshold.”] • Rigorous evaluation demonstrates design B is much more cost-effective than design A, but both are above the “threshold”—what is the “policy” implication? Move to B? • And if design B would lead to program elimination (zero budget)? Same “policy” implication?
“Evaluation” as a Development Initiative • So far, as deeply unscientific as any other advocacy group—claims with no evidence of any kind • Maybe right, maybe wrong, but certainly methodologically incoherent • Success cases? (Colombia? Great paper, dead program) • Claims about the impact of evaluation require a model of how additional knowledge affects realized policies [Figure: well-being of the poor plotted against resources devoted to evaluation, with the relationship marked “?”.]
What do the poor say? “Is this information you are gathering from us just to help you write your report or can you really be helpful to us?” Woman in South Sudan to me, two weeks ago
It’s all about MeE—adding little ‘e’ evaluation to the standard mix Monitoring: the gathering of evidence to show what progress has been made in the implementation of programs. Focuses on inputs and (sometimes) outputs. Big E—Evaluation: measuring changes in outcomes and evaluating the impact of specific interventions on those outcomes. Focuses on “with and without” interventions (needs a “control” group) and identifies causal impacts. Little e—evaluation: uses within-project design variations to identify differentials in the efficacy of the project on inputs and outputs, for real-time feedback into project/program implementation.
Complementary roles for M&E Monitoring: routine collection of information; tracking implementation progress (actual vs. target); measuring efficiency: “Is the project doing things right?” Evaluation: ex-post assessment of effectiveness and impact; confirming project expectations (and surfacing unintended results); measuring impacts: “Is the project doing the right things?”
Introducing “e” to the mix • “e” lies in between M and E • Analyzing existing information (baseline data, monitoring data) • Drawing intermediate lessons • Serves as a feedback loop into project design • You don’t always have to do Impact Evaluation
The problem in pictures Let’s begin with the project timeline: Lots of “M”: passing raw data unto God for whatever use … Findings of “E” come too late to be of much assistance to implementers Lost opportunity: no timely “e” to help the project!
Ideally … • T: implement designs A and B, collect “M” + analyze the M (“e”) • T+1: point of decision making: scale up A or B? If A is scaled up, implement designs A-1 and A-2, collect “M” + analyze the M (“e”) (similarly with B-1 and B-2 if B is scaled up) • T+2: point of decision making: scale up A-1 or A-2? If your M&E system doesn’t provide you with timely decision-making information, what is the point for the implementers?
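A sketch of what the little-e analysis at such a decision point could look like, using only routine monitoring data tagged by design variant (the record fields, numbers, and the simple compare-the-means decision rule are invented for illustration):

```python
# Hypothetical "little e" check at a decision point; all data are invented.
from statistics import mean

# Routine monitoring records, tagged with the within-project design variant.
monitoring_records = [
    {"site": "s01", "design": "A", "outputs_per_dollar": 1.8},
    {"site": "s02", "design": "A", "outputs_per_dollar": 2.1},
    {"site": "s03", "design": "B", "outputs_per_dollar": 1.2},
    {"site": "s04", "design": "B", "outputs_per_dollar": 1.4},
]

def summarize_by_design(records):
    """Average the monitored output measure within each design variant."""
    by_design = {}
    for record in records:
        by_design.setdefault(record["design"], []).append(record["outputs_per_dollar"])
    return {design: mean(values) for design, values in by_design.items()}

summary = summarize_by_design(monitoring_records)
best = max(summary, key=summary.get)
print(summary)  # e.g. {'A': 1.95, 'B': 1.3}
print(f"Decision at T+1: scale up design {best}, then field variants {best}-1 and {best}-2")
```

The point is not the statistics (a real little-e comparison would worry about site selection and noise) but the timing: the same M data the project already collects feed a decision the implementers actually face.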
So … where do you begin? • Clear objectives for the project (what is the problem?) • Clear idea of how you will achieve the objectives (causal chain or storyline) • Outcome-focused: answer the question, “What visible changes in behavior can be expected among end users as a result of the project, thus validating the causal chain/theory of change?”
Design the “M, e and E” Plan • What? Type of information and data to be consolidated • How? Procedures and approaches, including methods for data collection and analysis • Why? How the collected data will support monitoring and project management • When? Frequency of data collection and reporting • Who? Focal points, resource persons and responsibilities
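Purely as an illustration of the What/How/Why/When/Who structure, one row of such a plan could be written down as a simple record (the indicator and every detail below are invented, not from the presentation):

```python
# Hypothetical single entry of an "M, e and E" plan; all field values are
# invented examples used only to show the What/How/Why/When/Who structure.
mee_plan_entry = {
    "what": "share of targeted households actually receiving the transfer",
    "how":  "spot-check survey of a random sample from beneficiary lists",
    "why":  "feeds the little-e comparison between design variants A and B",
    "when": "collected monthly, reported within two weeks",
    "who":  "project management unit M&E focal point",
}
print(mee_plan_entry["what"])
```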
Advantages of little e over Big E evaluation • No collection of data on a “no program” group is required—the comparisons are across “within program/project” variants • Project implementers feel part of the process and see benefits to themselves and their objectives • Big E evaluation often cannot distinguish causes of failure—many projects simply fail to be implemented • Big E evaluation can explore only a tiny part of the design space (even with 5 design parameters and 2 options each there are already 2^5 = 32 combinations, and with complementarities the dimensionality blows up; see the sketch below)
Advantages of little e over Big E evaluation (cont’d) • Clear relationship to other sub-movements within development (see next set of slides) • Is within the capability of nearly all implementing organizations • M is always feasible • Big E requires substantial expertise to produce reliable, replicable (publishable?) results • Little e can be done by project management units • Using M data for little e makes the M unit high-stakes and keeps the data real and relevant (otherwise MIS systems drift out of date and people lose interest) • Requires conscious consideration of design parameters and keeps alternatives open
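A small sketch of the dimensionality point from the previous slide: with 5 design parameters and 2 options each, a fully crossed Big E comparison already needs 32 arms before any complementarities are considered (the parameter names below are placeholders, not from the talk):

```python
from itertools import product

# Placeholder design parameters with 2 options each; the names are invented.
design_space = {
    "targeting_rule":  ["proxy means test", "community selection"],
    "transfer_size":   ["small", "large"],
    "payment_channel": ["cash", "voucher"],
    "conditionality":  ["none", "school attendance"],
    "delivery_agent":  ["local government", "NGO"],
}

arms = list(product(*design_space.values()))
print(len(arms))  # 2**5 = 32 distinct design combinations to compare
```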
“Evaluation” as an innovation/movement/advocacy position to improve “development” Successful movements: clearly articulated vision; politically feasible coalition; “career” trajectories; patina of “normal science” …but they can be ineffective: insularity, not open to questioning fundamental premises; lock-in of movement-specific “human capital”; politically defensive; takes too long to shift if it proves ineffective
How does evaluation fit into “development”? • “Development” is a coalition of narrower sub-movements, both objective-specific (e.g. education, health, gender, environment) and instrument-specific (e.g. micro-credit, irrigation) • Help to make “successful” movements also effective • Eventually weed out the successful but ineffective sub-movements (but this is hard and unlikely to be the result of Big E evaluation)