IDEV 624 – Monitoring and Evaluation

Introduction to Process Monitoring
Payson Center for International Development and Technology Transfer
Tulane University

Process vs. Outcome/Impact Monitoring
[Figure labels: Outcome, Impact, Monitoring, Evaluation, Process Monitoring, LFM, USAID Results Framework]

Program vs. Outcome Monitoring
• Program process monitoring: The systematic and continual documentation of key aspects of program performance that assess whether the program is operating as intended or according to some appropriate standard
• Outcome monitoring: The continual measurement of intended outcomes of the program, usually of the social conditions it is intended to improve
[Figure labels: Process Monitoring; A Form of Outcome Evaluation]

A Public Health Questions Approach to HIV/AIDS M&E
Are we doing them on a large enough scale?
• Determining Collective Effectiveness
  • Are collective efforts being implemented on a large enough scale to impact the epidemic (coverage; impact)? Surveys & Surveillance [OUTCOMES & IMPACTS]
Are we doing them right?
• Monitoring & Evaluating National Programs
  • Are interventions working/making a difference? Outcome Evaluation Studies [OUTCOMES]
  • Are we implementing the program as planned? Outputs Monitoring [OUTPUTS]
  • What are we doing? Are we doing it right? Process Monitoring & Evaluation, Quality Assessments [ACTIVITIES]
Are we doing the right things?
• Understanding Potential Responses
  • What interventions and resources are needed? Needs, Resource, Response Analysis & Input Monitoring [INPUTS]
  • What interventions can work (efficacy & effectiveness)? Efficacy & Effectiveness Studies, Formative & Summative Evaluation, Research Synthesis
• Problem Identification
  • What are the contributing factors? Determinants Research
  • What is the problem? Situation Analysis & Surveillance
(UNAIDS 2008)

(World Bank 2009)

Strategic Planning for M&E: Setting Realistic Expectations
Levels of Monitoring & Evaluation Effort, by number of projects:
• All: Input/Output Monitoring
• Most: Process Evaluation
• Some: Outcome Monitoring / Evaluation
• Few*: Impact Monitoring / Evaluation
*Disease impact monitoring is synonymous with disease surveillance and should be part of all national-level efforts, but cannot be easily linked to specific projects

Project Monitoring Plan

What is Process Monitoring?

Process Monitoring
• Process Monitoring: The systematic attempt by evaluation researchers to examine program coverage and delivery
• Program monitoring provides an estimate of
  • the extent to which a program is reaching its intended target population
  • the degree of congruence between the plan for providing services and treatments (program elements) and the ways they are actually provided
• Does NOT attempt to assess the effects of the program on the program participants
(Rossi/Freeman 1989)

Process Monitoring (cont.)
• Not a single distinct evaluation procedure but a family of approaches, concepts and methods
• Focus on the enacted program itself: operations, activities, functions, performance, component parts, resources, etc.
• Often collects information about resource expenditures in the conduct of the program (cost-benefit analysis)
(Rossi/Lipsey/Freeman: 2004)

Process Monitoring Strategies
• Process implementation evaluation:
  • Conducted by evaluation specialists as a separate project, either stand-alone or as a complement to an impact evaluation
• Continuous program monitoring:
  • Continuous monitoring of key indicators, with routine data collection by a management information system (MIS) or similar mechanism
(Rossi/Lipsey/Freeman: 2004)

Monitoring Service Utilization: Coverage
• Coverage: The extent to which participation by the target population achieves the levels specified in the program design
• Both over-coverage and under-coverage are problems; however, the most common problem is the failure to achieve high participation by the target population
(Rossi/Lipsey/Freeman: 2004)

Under- vs. Over-Coverage
• Under-coverage: Measured by the proportion of the targets in need of a program who actually participate in it (the lower this proportion, the greater the under-coverage)
• Over-coverage: Measured by the number of program participants who are not in need, compared with the total number of participants in the program
→ Common problem: the inability to specify the number in need, i.e. the magnitude of the target population
(Rossi/Lipsey/Freeman: 2004)
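
The two coverage measures on this slide reduce to simple proportions. A minimal Python sketch, assuming the number in need and the number served are known (which, as the slide notes, is often the hard part); the function names and example figures are hypothetical and not taken from Rossi/Lipsey/Freeman:

```python
def coverage_rate(in_need_served: int, total_in_need: int) -> float:
    """Share of the target population in need that actually participates."""
    if total_in_need <= 0:
        raise ValueError("total_in_need must be positive")
    return in_need_served / total_in_need


def over_coverage_rate(not_in_need_served: int, total_served: int) -> float:
    """Share of all program participants who are not in need."""
    if total_served <= 0:
        raise ValueError("total_served must be positive")
    return not_in_need_served / total_served


# Hypothetical program: 1,000 people in need; 400 participants, 300 of them in need.
coverage = coverage_rate(in_need_served=300, total_in_need=1000)
over_coverage = over_coverage_rate(not_in_need_served=100, total_served=400)
print(f"coverage: {coverage:.0%}, over-coverage: {over_coverage:.0%}")
```

In this made-up case the program reaches 30% of those in need, and a quarter of its participants are not in need.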

Monitoring Service Utilization: Bias
• Bias: The degree to which some subgroups participate in greater proportions than others
• Often caused by self-selection and/or program actions (for example, by focusing on the most “success-prone” targets)
(Rossi/Lipsey/Freeman: 2004)

Assessing Bias
• Assessing Bias: Assessing the differences between program participants and those who
  • Drop out of the program (drop-out rate, attrition)
  • Are eligible but do not participate at all
• Accessibility: The extent to which structural and organizational arrangements facilitate participation in the program
• Data sources: Program records, specifically designed surveys (including community surveys), census data and similar secondary data sources
(Rossi/Lipsey/Freeman: 2004)
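
One simple way to look for bias in program records is to compare participation rates across subgroups and to track the drop-out rate. The sketch below is illustrative only: the subgroup names, counts, and the "disparity ratio" summary are assumptions made for the example, not measures prescribed by the source.

```python
def participation_rates(eligible: dict[str, int],
                        participants: dict[str, int]) -> dict[str, float]:
    """Participation rate per subgroup: participants / eligible."""
    return {group: participants.get(group, 0) / n
            for group, n in eligible.items() if n > 0}


def disparity_ratio(rates: dict[str, float]) -> float:
    """Highest rate divided by lowest rate; 1.0 means equal participation."""
    lowest = min(rates.values())
    if lowest == 0:
        return float("inf")  # at least one subgroup does not participate at all
    return max(rates.values()) / lowest


def dropout_rate(enrolled: int, dropped_out: int) -> float:
    """Share of enrolled participants who left the program (attrition)."""
    if enrolled <= 0:
        raise ValueError("enrolled must be positive")
    return dropped_out / enrolled


# Hypothetical counts taken from program records and census data.
eligible = {"women": 500, "men": 500}
participants = {"women": 300, "men": 150}
rates = participation_rates(eligible, participants)
print(rates, disparity_ratio(rates), dropout_rate(enrolled=450, dropped_out=90))
```

A ratio well above 1 only flags a difference; whether it reflects self-selection or program actions still has to be investigated.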

Monitoring Organizational Functions
• Three kinds of implementation failure:
  • “Nonprograms” and incomplete interventions (nothing or not enough delivered)
  • Wrong intervention (mode of delivery may negate intervention, for example, may be too sophisticated)
  • Unstandardized intervention (implementation may vary excessively across the target population)
• Focus on service activities AND monitoring of vital program support functions (fund-raising, training, advocacy, etc.)
(Rossi/Lipsey/Freeman: 2004)

Monitoring Data Collection

Planning for Data Collection
• Choosing data collection method
• Selecting indicators and developing questionnaires
• Determining the sampling strategy
• Assessing validity, reliability and sensitivity
• Developing data analysis plan
(King/Morris/Fitz-Gibbon: 1987)

STEP 1: Choosing Data Collection Method
• Examine the records kept over the course of a program
• Collect data to fill information gaps
(King/Morris/Fitz-Gibbon: 1987)

Data Collection Methods
• Analysis of program records
• Key informant interviews
• Focus group interviews
• Observations
• Physical measurements
• Standardized tests
• Surveys (program participants, community, etc.)

(Measure Evaluation Online Course)

Selecting Methods
• Standardized approaches with ready-made measurement instruments sometimes available
• However, often controversial, and ready-made measurement instruments may not be appropriate under all (most?) circumstances
  • Example: Measuring well-being/happiness across cultures?

Selecting Methods (cont.)
• Evaluators often have to develop their own tools and instruments
• Researchers rarely have sufficient time and resources to do this properly
• Requires a significant amount of pilot testing, analysis, revision, and validation

STEP 2: Selecting Indicators and Developing Questionnaires
• Use existing questionnaires and instruments, if feasible
• Use standardized indicators, if available
• Follow indicator standards if developing or modifying questionnaires and instruments

Project Indicators
An indicator is a measure of a concept or behavior
Principal types of indicators:
• Process indicators
  • Provide evidence of whether the project is moving in the right direction to achieve an objective
  • Provide information about implementation of activities (quantitative and/or qualitative)
  • Should be collected throughout the life of the project
• Outcome/Impact indicators
  • Provide information about whether an expected change occurred, either at the program level or population level
  • Measure changes that program activities are seeking to produce in the target population
  • Often stated as a percentage, ratio or proportion to show what was achieved in relation to the total population
  • Should be a direct reflection of the objectives

Indicator Criteria
• Measurable (able to be recorded and analyzed in quantitative or qualitative terms)
• Precise (defined the same way by all people)
• Consistent (not changing over time so that it always measures the same thing)
• Sensitive (changing proportionally in response to actual changes in the condition or item being measured)

Indicator Selection

Matching Indicators and Methods of Data Collection
• Select more than one method to measure an indicator (if possible)
• Criteria for selecting methods:
  • Reliability, validity, and sensitivity
  • Cost-effectiveness
  • Feasibility
  • Appropriateness
→ Only measures that are valid, reliable, and sensitive will produce estimates that can be regarded as credible

STEP 3: Determining the Sampling Strategy
• Census vs. Sampling
  • Census measures all units in a population
  • Sampling identifies and measures a subset of individuals within the population
• Probability vs. Non-Probability Sampling
  • Probability sampling results in a sample that is representative of the population
  • A non-probability sample is not representative of the population

Probability Sampling
Sample representative of the population, large sample size
• Simple random/systematic sampling
• Stratified random/systematic sampling
• Cluster sampling
• Advantages
  • Research findings representative of the population
  • Advanced statistical analysis
• Disadvantages
  • Costly and time-consuming (depending on target population)
  • Significant training needs
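
The first two families of probability sampling listed on this slide can be sketched with Python's standard library. This is a minimal illustration, assuming a complete sampling frame is available as a list; the function names are hypothetical, and the stratified version uses proportional allocation with rounding, so stratum sizes may not always add up exactly to the requested total.

```python
import random


def simple_random_sample(frame: list, n: int, rng: random.Random) -> list:
    """Every unit in the frame has the same chance of selection."""
    return rng.sample(frame, n)


def systematic_sample(frame: list, n: int, rng: random.Random) -> list:
    """Random start, then every k-th unit, where k = len(frame) // n."""
    k = len(frame) // n
    if k < 1:
        raise ValueError("sample size exceeds frame size")
    start = rng.randrange(k)
    return frame[start::k][:n]


def stratified_sample(strata: dict[str, list], n: int,
                      rng: random.Random) -> dict[str, list]:
    """Simple random sample within each stratum, proportional to stratum size."""
    total = sum(len(units) for units in strata.values())
    sample = {}
    for name, units in strata.items():
        n_h = round(n * len(units) / total)  # proportional allocation (rounded)
        sample[name] = rng.sample(units, n_h)
    return sample


rng = random.Random(2024)            # fixed seed so the draw can be reproduced
frame = list(range(1, 101))          # hypothetical frame of 100 household IDs
srs = simple_random_sample(frame, 10, rng)
sys_sample = systematic_sample(frame, 10, rng)
strat = stratified_sample({"urban": frame[:60], "rural": frame[60:]}, 10, rng)
```

Cluster sampling is omitted here because it depends on how the clusters (villages, clinics, schools) are defined for a given program.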

Non-Probability Sampling
Sample not representative of the population, often small sample size
• Convenience/purposeful sampling
• Quota sampling
• Advantages
  • Relatively inexpensive
  • Can be implemented quickly
  • Limited training needs
• Disadvantages
  • Results not representative of population
  • Limited options for statistical analysis of the data
  • Results biased
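
For contrast with probability sampling, quota sampling can be sketched as "take respondents in the order they are encountered until each group's quota is filled". The code below is a hypothetical illustration; the respondent records, group labels, and quotas are invented for the example.

```python
from typing import Callable


def quota_sample(respondents: list, quotas: dict[str, int],
                 group_of: Callable) -> list:
    """Select respondents in arrival order until every quota is filled."""
    remaining = dict(quotas)
    selected = []
    for person in respondents:
        group = group_of(person)
        if remaining.get(group, 0) > 0:
            selected.append(person)
            remaining[group] -= 1
        if not any(remaining.values()):
            break  # all quotas filled
    return selected


# Hypothetical clinic visitors in order of arrival: (id, sex).
arrivals = [("A", "f"), ("B", "f"), ("C", "f"), ("D", "m"), ("E", "m")]
chosen = quota_sample(arrivals, quotas={"f": 2, "m": 1},
                      group_of=lambda person: person[1])
```

No random selection is involved, which is why the results cannot be treated as representative: who is chosen depends entirely on who happens to arrive first.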

STEP 4: Assessing Validity, Reliability and Sensitivity
• Validity
  • Is the instrument appropriate for what needs to be measured?
• Reliability
  • Does the instrument yield consistent results?
• Sensitivity
  • Do the indicators change proportionally in response to actual changes in the condition or item being measured?

Reliability
• Reliability: The extent to which the measure produces the same results when used repeatedly to measure the same thing
• Variation in results = measurement error
• Unreliability in measures obscures real differences
(Rossi/Lipsey/Freeman 2004)

Reliability (cont.)
• How to verify?
  • Test-retest reliability: Most straightforward but often problematic, especially if the measurement cannot be repeated before the outcome might have changed
  • Internal consistency reliability: Examining consistency between similar items on a multi-item measure
  • Ready-made measures: Reliability information available from previous research
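
The first two checks can be illustrated numerically. The slide does not name specific statistics, so the choices below are assumptions: the Pearson correlation between two measurement rounds is used for test-retest reliability, and Cronbach's alpha for internal consistency. All scores are invented.

```python
from math import sqrt
from statistics import mean, variance


def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson correlation between two equally long lists of scores."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def cronbach_alpha(items: list[list[float]]) -> float:
    """items[i][j] is respondent j's score on item i of a multi-item measure."""
    k = len(items)
    if k < 2:
        raise ValueError("need at least two items")
    totals = [sum(scores) for scores in zip(*items)]  # total score per respondent
    item_variance_sum = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical data: the same five respondents measured twice (test-retest) ...
first_round = [12, 15, 9, 20, 17]
second_round = [13, 14, 10, 19, 18]
retest_r = pearson_r(first_round, second_round)

# ... and their answers to three similar questionnaire items.
items = [[3, 4, 2, 5, 4],
         [3, 5, 2, 4, 4],
         [2, 4, 3, 5, 5]]
alpha = cronbach_alpha(items)
```

Values close to 1 suggest consistent measurement; what counts as "high enough" depends on the instrument and its use.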

Validity
• Validity: The extent to which a measure measures what it is intended to measure
• Usually difficult to test whether a particular measure is valid
• However, it is important that an outcome measure is accepted as valid by stakeholders
(Rossi/Lipsey/Freeman 2004)

Validity (cont.)
• How to verify?
  • Empirical demonstrations:
    • Comparison, often with another measure, that shows that the measure produces the results expected
    • Demonstration that results of the measure “predict” other characteristics expected to be related to the outcome
  • Other approaches: Using data from more than one source, careful theoretical justification based on program impact theory, etc.

Sensitivity
• Sensitivity: The extent to which the values of the measure change when there is a change or difference in the thing being measured
• Outcome measures in program evaluation are sometimes insensitive because:
  • They include elements that the program could not reasonably be expected to change
  • They have been developed for a different (often diagnostic) purpose
(Rossi/Lipsey/Freeman 2004)

Sensitivity
(Margoluis & Salafsky, p. 94)

Sensitivity (cont.)
• How to verify?
  • Previous research: Identify research in which the measure was used successfully (the programs need to be very similar, and the sample size needs to be sufficiently large)
  • Known differences: Apply the outcome measure to groups of known difference or situations of known change, and determine how responsive it is
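
The "known differences" check can be made concrete by applying the measure to two groups that are known to differ and expressing the gap in standard-deviation units. The standardized mean difference used below (often called Cohen's d) is an assumption chosen for illustration; the source does not prescribe a statistic, and the scores are invented.

```python
from math import sqrt
from statistics import mean, variance


def standardized_difference(group_a: list[float], group_b: list[float]) -> float:
    """Difference in group means divided by the pooled standard deviation."""
    na, nb = len(group_a), len(group_b)
    pooled_var = ((na - 1) * variance(group_a)
                  + (nb - 1) * variance(group_b)) / (na + nb - 2)
    if pooled_var == 0:
        raise ValueError("no variation within groups")
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Hypothetical scores on the outcome measure for two groups known to differ.
known_high = [14, 16, 15, 18, 17]
known_low = [11, 12, 10, 13, 12]
d = standardized_difference(known_high, known_low)
```

A measure that shows almost no difference between groups that clearly differ is unlikely to detect program effects either.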

Measurement Errors
• Systematic error (we do not measure what we think we measure)
• Random error (inconsistencies from one measurement to the next)
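
The difference between the two error types shows up clearly in a small simulation. The sketch assumes a simple additive model, observed = true value + systematic error + random error, which is a common textbook simplification rather than something stated in the source; all numbers are invented.

```python
import random
from statistics import mean, stdev


def simulate_measurements(true_value: float, bias: float, noise_sd: float,
                          n: int, seed: int = 0) -> list[float]:
    """Each reading = true value + constant bias + normally distributed noise."""
    rng = random.Random(seed)
    return [true_value + bias + rng.gauss(0, noise_sd) for _ in range(n)]


true_value = 50.0
readings = simulate_measurements(true_value, bias=3.0, noise_sd=2.0, n=10_000)

# Systematic error does not average out: the mean stays off by about the bias.
estimated_bias = mean(readings) - true_value
# Random error shows up as spread around the (shifted) mean.
estimated_noise = stdev(readings)
```

Collecting more data shrinks the influence of random error on the average but leaves systematic error untouched, which is why the two call for different remedies.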

STEP 5: Developing Data Analysis Plan
• Type of data
  • Qualitative and/or quantitative? Representative of larger population? Standardized vs. open-ended responses?
• Type of comparison
  • Comparison between groups and/or over time?
• Type of variable
  • Categorical and/or continuous?
• Type of data analysis
  • Descriptive analysis and/or hypothesis testing?
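
How the answers to these questions shape the analysis can be shown with a small descriptive example: proportions for a categorical variable, and means with standard deviations for a continuous variable compared between groups. The record fields ("site", "sex", "visits") and values are hypothetical.

```python
from collections import Counter
from statistics import mean, stdev


def describe_categorical(values: list) -> dict:
    """Proportion of observations in each category."""
    counts = Counter(values)
    return {category: count / len(values) for category, count in counts.items()}


def describe_continuous_by_group(records: list[dict], group_key: str,
                                 value_key: str) -> dict:
    """n, mean and standard deviation of a continuous variable per group."""
    groups: dict = {}
    for record in records:
        groups.setdefault(record[group_key], []).append(record[value_key])
    return {
        group: {"n": len(values),
                "mean": mean(values),
                "sd": stdev(values) if len(values) > 1 else None}
        for group, values in groups.items()
    }


# Hypothetical monitoring records.
records = [
    {"site": "A", "sex": "f", "visits": 4},
    {"site": "A", "sex": "m", "visits": 2},
    {"site": "B", "sex": "f", "visits": 6},
    {"site": "B", "sex": "f", "visits": 8},
]
sex_distribution = describe_categorical([r["sex"] for r in records])
visits_by_site = describe_continuous_by_group(records, "site", "visits")
```

Hypothesis testing would build on the same grouping, with the choice of test depending on the variable type and on whether the sample is a probability sample.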

Project Monitoring Plan
