Facilitated by: Michael Bamberger and Jim Rugh

RealWorld EvaluationDesigning Evaluations under Budget, Time, Data and Political ConstraintsAmerican Evaluation AssociationProfessional pre-session workshopDenverNovember 5, 2008 Facilitated by: Michael Bamberger and Jim Rugh

Workshop Objectives 1. The seven steps of the RealWorld Evaluation approach for addressing common issues and constraints faced by evaluators such as: when the evaluator is not called in until the project is nearly completed and there was no baseline nor comparison group; or where the evaluation must be conducted with inadequate budget and insufficient time; and where there are political pressures and expectations for how the evaluation should be conducted and what the conclusions should say

Workshop Objectives 2. Identifying and assessing various design options that could be used in a particular evaluation setting 3. Ways to reconstruct baseline data when the evaluation does not begin until the project is well advanced or completed. 4. How to identify and to address threats to the validity or adequacy of quantitative, qualitative and mixed methods designs with reference to the specific context of RealWorld evaluations

Workshop Objectives Note: Given time constraints the workshop will focus on project-level impact evaluations. However, if the results of a pre-workshop survey of participants call for it, a brief introduction to the application of RWE techniques in other forms of evaluation, including the assessment of country programs and policy interventions, could be included.

Workshop agenda 8.00 – 8.20: Session 1: Introduction: Workshop objectives Feedback from participant survey Handout: RealWorld Evaluation Overview (summary chapter of book) 8.20 – 8.50: Session 2: RealWorld Evaluation overview and addressing the counterfactual Handout: “Why evaluators can’t sleep at night” 8.50 - 9.20: Session 3: Small Group discussions. Participants will introduce themselves and then share experiences on the types of constraints they have faced when designing and conducting evaluations, and what they did to try to address those constraints. 9:20-10.00: Session 4: RWE Steps 1, 2 and 3: Scoping the evaluation and strategies for addressing budget and time constraints Presentation and discussion 10.00-10.15: BREAK

Workshop agenda, cont. 10:15 – 10:45: Session 5: RWE Step 4: Addressing data constraints Presentation and discussion 10.45 – 11.15: Session 6: Mixed methods Presentation and discussion 11.15 – 12.00: Session 7: Small groups read their case studies and begin to discuss the learning exercise. We will use a low-cost housing case study. All four groups will discuss the same project but from different perspectives. 12:00 – 1:00: LUNCH 2.10 – 1.45: Session 8: Identifying and addressing threats to the validity of the evaluation design and conclusions 1.45 – 2.30: Session 9: Small groups complete exercise. Negotiate with your paired group how you propose to modify the ToR of your case study. 2.30 – 2.45: Session 10: Feedback from exercise. Discussion of lessons learned from the case study or the RealWorld Evaluation approach in general. 2.45 – 3.00: Session 11: Wrap up and workshop evaluation

RealWorld EvaluationDesigning Evaluations under Budget, Time, Data and Political Constraints Session 2. a OVERVIEW OF THE RWE APPROACH

RealWorld Evaluation Scenarios Scenario 1: Evaluator(s) not brought in until near end of project For political, technical or budget reasons: • There was no baseline survey • Project implementers did not collect adequate data on project participants at the beginning or during the life of the project • It is difficult to collect data on comparable control groups

RealWorld Evaluation Scenarios Scenario 2: The evaluation team is called in early in the life of the project But for budget, political or methodological reasons: • The ‘baseline’ was a needs assessment, not comparable to eventual evaluation • It was not possible to collect baseline data on a comparison group

Reality Check – Real-World Challenges to Evaluation • All too often, project designers do not think evaluatively – evaluation not designed until the end • There was no baseline – at least not one with data comparable to evaluation • There was/can be no control/comparison group. • Limited time and resources for evaluation • Clients have prior expectations for what the evaluation findings will say • Many stakeholders do not understand evaluation; distrust the process; or even see it as a threat (dislike of being judged)

RealWorld Evaluation Quality Control Goals • Achieve maximum possible evaluation rigor within the limitations of a given context • Identify and control for methodological weaknesses in the evaluation design • Negotiate with clients trade-offs between desired rigor and available resources • Presentation of findings must recognize methodological weaknesses and how they affect generalization to broader populations

The Need for the RealWorld Evaluation Approach • As a result of these kinds of constraints, many of the basic principles of impact evaluation design (comparable pre-test-post test design, comparison group, instrument development and testing, random sample selection, control for researcher bias, thorough documentation of the evaluation methodology etc.) are often sacrificed.

The RealWorld Evaluation Approach An integrated approach to ensure acceptable standards of methodological rigor while operating under real-world budget, time, data and political constraints. See handout summary chapter extracted from RealWorld Evaluation book for more details

The RealWorld Evaluation approach • Developed to help evaluation practitioners and clients • managers, funding agencies and external consultants • A work in progress • Originally designed for developing countries, but equally applicable in industrialized nations

Special Evaluation Challenges in Developing Countries • Unavailability of needed data • Scarce local evaluation resources • Limited budgets for evaluations • Institutional and political constraints • Lack of an evaluation culture • Many evaluations are designed by, and for, external funding agencies and seldom reflect local and national stakeholder priorities

Special Evaluation Challenges in Developing Countries Despite these challenges, there is a growing demand for methodologically sound evaluations which assess the impacts, sustainability and replicability of development projects and programs …………………….

Most RealWorld Tools are not New—Only the Integrated Approach is New • Most of the RealWorld Evaluation data collection and analysis tools will be familiar to most evaluators • What is new is the integrated approach which combines a wide range of tools to produce the best quality evaluation under real-world constraints

Who Uses RealWorld Evaluation and When? • Two main users: • Evaluation practitioners • Managers, funding agencies and external consultants • The evaluation may start at: • The beginning of the project • After the project is fully operational • During or near the end of project implementation • After the project is finished

What is Special About the RealWorld Evaluation Approach? • There is a series of steps, each with checklists for identifying constraints and determining how to address them • These steps are summarized on the following slide and then the more detailed flow-chart … (See page6of handout)

The Steps of the RealWorld Evaluation Approach Step 1: Planning and scoping the evaluation Step 2: Addressing budget constraints Step 3: Addressing time constraints Step 4: Addressing data constraints Step 5: Addressing political constraints Step 6:Assessing and Addressing the strengths and weaknesses of the evaluation design Step 7: Helping clients use the evaluation

TheReal-World Evaluation Approach • Step 1: Planning and scoping the evaluation • . Defining client information needs and understanding the political context • . Defining the program theory model • . Identifying time, budget, data and political constraints to be addressed by the RWE • . Selecting the design that best addresses client needs within the RWE constraints Step 2 Addressing budget constraints A. Modify evaluation design B. Rationalize data needs C. Look for reliable secondary data D. Revise sample design E. Economical data collection methods Step 3 Addressing time constraints All Step 2 tools plus: F. Commissioning preparatory studies G. Hire more resource persons H. Revising format of project records to include critical data for impact analysis. I. Modern data collection and analysis technology • Step 4 • Addressing data constraints • . Reconstructing baseline data • . Recreating comparison groups • . Working with non-equivalent comparison groups • . Collecting data on sensitive topics or from difficult to reach groups • . Multiple methods Step 5 Addressing political influences A. Accommodating pressures from funding agencies or clients on evaluation design. B. Addressing stakeholder methodological preferences. C. Recognizing influence of professional research paradigms. Step 6 Assessing and addressing the strengths and weaknesses of the evaluation design An integrated checklist for multi-method designs A. Objectivity/confirmabilityB. Replicability/dependabilityC. Internal validity/credibility/authenticityD. External validity/transferability/fittingness Step 7 Helping clients use the evaluation A. Utilization B. Application C. OrientationD. Action 24

RealWorld EvaluationDesigning Evaluations under Budget, Time, Data and Political Constraints Session 2.b The challenge of the counterfactual

Attribution and counterfactuals How do we know if the observed changes in the project participants or communities • income, health, attitudes, school attendance etc are due to the implementation of the project • credit, water supply, transport vouchers, school construction etc or to other unrelated factors? • changes in the economy, demographic movements, other development programs etc

The Counterfactual • What would have been the condition of the project population at the time of the evaluation if the project had not taken place?

Where is the counterfactual? After families had been living in a new housing project for 3 years, a study found average household income had increased by an 50% Does this show that housing is an effective way to raise income?

Comparing the project with two possible comparison groups I n c o m e Project group. 50% increase 750 Scenario 1. 50% increase in comparison group income. No evidence of project impact 500 Scenario 2. No increase in comparison group income. Potential evidence of project impact 250 2000 2002

5 main evaluation strategiesfor addressing the counterfactual Randomized designs I. True experimental designs II. Randomized field designs Quasi-experimental designs III. Strong quasi-experimental designs IV. Weaker quasi-experimental designs Non-experimental designs. V. No logically defensible counterfactual

The best statistical design option in most field settings: Randomized or strong quasi-experimental evaluation designs Subjects randomly assigned to the project and control groups or control group selected using statistical or judgmental matching Conditions of both groups are not controlled during the project Gain score [impact] = P2 – P1 C2– C1

Control group and comparison group • Control group = randomized allocation of subjects to project and non-treatment group • Comparison group = separate procedure for sampling project and non-treatment groups

Reference sources for randomized field trial designs 1. MIT Poverty Action Lab www.povertyactionlab.org 2. Center for Global Development “When will we ever learn?” http://www.cgdev.org/content/publications/detail/7973 3. International Initiative for Impact Evaluation = 3ie http://www.3ieimpact.org/

The limited use of strong evaluation designs • It is estimated that • Less only 5-10% of impact evaluations use a strong quasi-experimental design • Significantly less than 5% use randomized control trials

TIME FOR DISCUSSION 35

Introductory small-group discussions Introduce yourselves, including something about your experience in coordinating or conducting evaluations. In particular share experiences on the types of constraints you have faced when designing and conducting evaluations, and what you did to try to address those constraints.

RealWorld EvaluationDesigning Evaluations under Budget, Time, Data and Political Constraints Session 4, Step #1 PLANNING AND SCOPING THE EVALUATION

Step 1: Planning and Scoping the Evaluation • Understanding client information needs • Defining the program theory model • Preliminary identification of constraints to be addressed by the RealWorld Evaluation

A. Understanding client information needs Typical questions clients want answered: • Is the project achieving its objectives? • Are all sectors of the target population benefiting? • Are the results sustainable? • Which contextual factors determine the degree of success or failure?

A. Understanding client information needs A full understanding of client information needs can often reduce the types of information collected and the level of detail and rigor necessary. However, this understanding could also increase the amount of information required!

B. Defining the program theory model All programs are based on a set of assumptions (hypothesis) about how the project’s interventions should lead to desired outcomes. • Sometimes this is clearly spelled out in project documents. • Sometimes it is only implicit and the evaluator needs to help stakeholders articulate the hypothesis through a logic model.

B. Defining the program theory model • Defining and testing critical assumptions are a essential (but often ignored) elements of program theory models. • The following is an example of a model to assess the impacts of microcredit on women’s social and economic empowerment

Critical Hypothesis for a Gender-Inclusive Micro-Credit Program • Outputs • If credit is available women will be willing and able to obtain loans and technical assistance. • Short-term outcomes • If women obtain loans they will start income-generating activities. • Women will be able to control the use of loans and reimburse them. • Medium/long-term impacts • Economic and social welfare of women and their families will improve. • Increased women’s economic and social empowerment. • Sustainability • Structural changes will lead to long-term impacts.

C. Determining appropriate (and feasible) evaluation design • Based on an understanding of client information needs, required level of rigor, and what is possible given the constraints, the evaluator and client need to determine what evaluation design is required and possible under the circumstances.

Let’s focus for a while on evaluation design (a quick review) 1: Review different evaluation (experimental/research) designs 2: Develop criteria for determining appropriate Terms of Reference (ToR) for evaluating a project, given its own (planned or un-planned) evaluation design. 3: Defining levels of rigor 4: A life-of-project evaluation design perspective. 45

scale of major impact indicator An introduction to various evaluation designs Illustrating the need for quasi-experimental longitudinal time series evaluation design Project participants Comparison group baseline end of project evaluation post project evaluation 46

OK, let’s stop the action to identify each of the major types of evaluation (research) design … … one at a time, beginning with the most rigorous design. 47

First of all: the key to the traditional symbols: • X = Intervention (treatment), I.e. what the project does in a community • O = Observation event (e.g. baseline, mid-term evaluation, end-of-project evaluation) • P (top row): Project participants • C (bottom row): Comparison (control) group Note: the RWE evaluation designs are laid out in Table 3 on page 46 of your handout 48

Design #1: Longitudinal Quasi-experimental P1 X P2 X P3 P4 C1 C2 C3 C4 Project participants Comparison group baseline midterm end of project evaluation post project evaluation 49

Design #1+: Longitudinal Randomized Control Trial P1 X P2 X P3 P4 C1 C2 C3 C4 Project participants Research subjects randomly assigned either to project or control group. Control group baseline midterm end of project evaluation post project evaluation 50

Design #2: Randomized Control Trial P1 X P2 C1 C2 Project participants Research subjects randomly assigned either to project or control group. Control group baseline end of project evaluation 51

Design #3: Quasi-experimental (pre+post, with comparison) P1 X P2 C1 C2 Project participants Comparison group baseline end of project evaluation 52

Design #7: Truncated Longitudinal X P1 X P2 C1 C2 Project participants Comparison group midterm end of project evaluation 53

Facilitated by: Michael Bamberger and Jim Rugh