Impact Evaluation of Behavioural HIV Prevention Programs
How is Impact Evaluation different to Traditional M&E?
“Traditional” M&E:
• Is the program being implemented as designed?
• Could the operations be more efficient?
• Are the benefits getting to those intended?
• Monitoring trends: are indicators moving in the right direction?
• No inherent causality
Impact Evaluation:
• What was the effect of the program on outcomes?
• Because of the program, are people better off?
• What would happen if we changed the program?
• Causality
High-quality impact evaluations measure the net change in outcomes that can be attributed to a specific programme, and help inform policy about what works, what does not, and why.
Types of Impact Evaluations
• Randomisation
  • At individual level
  • At community level
When randomisation is not possible:
• Stepped wedge
• Randomised encouragement
• Difference in differences
• Matching
Randomisation Example Implement if there is variation that can be measured and controlled
Cluster Evaluation Designs
• Unit of analysis is a group (e.g., communities, districts)
• Usually prospective
[Figure: clusters randomly allocated to intervention and comparison arms]
Source: McCoy, 2010
Cluster Evaluation Designs
Case Study: Progresa/Oportunidades Program
• National anti-poverty program in Mexico
• Eligibility based on a poverty index
• Cash transfers, conditional on school and health care attendance
• 506 communities
  • 320 randomly allocated to receive the program
  • 186 randomly allocated to serve as controls
• Program evaluated for effects on health and welfare
Source: McCoy, 2010
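To make the design concrete, here is a minimal sketch of community-level (cluster) random assignment in the spirit of the Progresa allocation. The community identifiers and the random seed are illustrative placeholders, not from the actual evaluation.

```python
# Sketch: community-level (cluster) random assignment, loosely modelled on the
# Progresa/Oportunidades design (320 of 506 eligible communities to treatment).
# Community IDs and the seed are illustrative placeholders.
import random

communities = [f"community_{i:03d}" for i in range(1, 507)]  # 506 eligible communities

rng = random.Random(42)          # fixed seed so the allocation is reproducible
rng.shuffle(communities)

treatment = set(communities[:320])   # receive the programme
control = set(communities[320:])     # serve as comparison

print(len(treatment), "treatment clusters;", len(control), "control clusters")
```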
Randomised encouragement Implement when implementation is uniform, and some form of encouragement is possible
Randomised Promotion
• Common scenarios:
  • National program with universal eligibility
  • Voluntary enrolment in the program
  • Comparing enrolled to not enrolled introduces selection bias
• One solution: provide additional promotion, encouragement or incentives to a sub-sample:
  • Information
  • Encouragement (small gift or prize)
  • Transport
Source: McCoy, 2010
Randomised Promotion (universal eligibility)
[Figure: everyone is eligible, but promotion is randomly assigned; outcomes are compared between the "promotion" and "no promotion" groups]
Source: McCoy, 2010
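Because only the promotion, not enrolment itself, is randomised, the programme effect is usually recovered with an instrumental-variables style calculation. A minimal sketch of the Wald estimator follows; all numbers are hypothetical.

```python
# Sketch of the Wald (instrumental-variable) estimator used with a randomised
# promotion design: promotion is randomly assigned, enrolment is voluntary.
# All numbers below are illustrative placeholders, not study data.

def wald_estimate(y_promoted, y_not_promoted, takeup_promoted, takeup_not_promoted):
    """Effect of the programme on compliers:
    (difference in mean outcomes) / (difference in enrolment rates)."""
    return (y_promoted - y_not_promoted) / (takeup_promoted - takeup_not_promoted)

# Hypothetical example: promotion raises enrolment from 30% to 80% and the
# mean outcome from 50.0 to 54.0.
effect = wald_estimate(54.0, 50.0, 0.80, 0.30)
print(f"Estimated effect on compliers: {effect:.2f}")   # (54-50)/(0.8-0.3) = 8.00
```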
Stepped wedge example Implement if uniform implementation is the eventual goal, but implementation can be controlled
Stepped Wedge or Phased-In
Brown CA, Lilford RJ. BMC Medical Research Methodology, 2006.
Source: McCoy, 2010
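A minimal sketch of how a stepped-wedge (phased-in) roll-out schedule could be generated, assuming six hypothetical clusters and four periods; the cluster names and seed are placeholders.

```python
# Sketch of a stepped-wedge roll-out: every cluster eventually receives the
# intervention, but the period in which it starts is randomised.
import random

clusters = ["district_A", "district_B", "district_C",
            "district_D", "district_E", "district_F"]
n_periods = 4                      # period 0 = baseline, nobody treated yet

rng = random.Random(7)
rng.shuffle(clusters)
# Split clusters into waves; wave k crosses over to the intervention at period k.
waves = [clusters[i::n_periods - 1] for i in range(n_periods - 1)]

for period in range(n_periods):
    treated = [c for k, wave in enumerate(waves, start=1) if k <= period for c in wave]
    print(f"period {period}: treated = {sorted(treated) or 'none'}")
```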
Difference in Differences Example Implement if baseline and end-of-implementation data are available, and the project is being implemented in an environment where broader change is also taking place
Difference in differences: worked example
                      Before    After    Change
Participants           62.90    66.37     3.47
Non-participants       46.37    57.50    11.13
Difference             16.53     8.87
Effect = 3.47 – 11.13 = 8.87 – 16.53 = –7.66
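The same difference-in-differences arithmetic in code, using the means from the table above; differencing by group or by time gives the same estimate.

```python
# Difference-in-differences with the means shown above (participants vs
# non-participants, before vs after the programme).
before = {"participants": 62.90, "non_participants": 46.37}
after = {"participants": 66.37, "non_participants": 57.50}

change_participants = after["participants"] - before["participants"]              # 3.47
change_non_participants = after["non_participants"] - before["non_participants"]  # 11.13
did_by_group = change_participants - change_non_participants                      # -7.66

gap_after = after["participants"] - after["non_participants"]                     # 8.87
gap_before = before["participants"] - before["non_participants"]                  # 16.53
did_by_time = gap_after - gap_before                                              # -7.66

print(round(did_by_group, 2), round(did_by_time, 2))   # -7.66 -7.66
```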
[Figure: density of scores for the ‘low exposure’ and ‘high exposure’ groups, plotted from low to high probability of exposure given X]
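For the matching design behind this figure, a minimal propensity-score sketch is shown below. It uses simulated data and scikit-learn's LogisticRegression, both assumptions for illustration only.

```python
# Minimal propensity-score sketch for a matching design: estimate the
# probability of high exposure given covariates X, then compare the score
# distributions of the low- and high-exposure groups (the overlap is the
# region of common support). Data below are simulated placeholders.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 500
X = rng.normal(size=(n, 3))                                  # covariates (age, wealth, ...)
exposure = (X[:, 0] + rng.normal(size=n) > 0).astype(int)    # 1 = high exposure

model = LogisticRegression().fit(X, exposure)
score = model.predict_proba(X)[:, 1]                         # Pr(high exposure | X)

low, high = score[exposure == 0], score[exposure == 1]
support = (max(low.min(), high.min()), min(low.max(), high.max()))
print("common support:", [round(float(s), 2) for s in support])
```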
But there are challenges with impact evaluations, specifically with measurement…
Challenges in measuring ‘treatments’ with behavioural interventions: What do we measure?
‘Treatment’ measurement options for behavioural interventions
• Coverage: % of population….
• Intensity of exposure: frequency distribution of the population by the number of times the message was seen
• Type of exposure: only ‘interventions’? What about the organic, endogenous changes we WANT to see? Diffusion is good in this context (from a programmatic point of view)
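A minimal sketch of how coverage and intensity of exposure might be summarised from survey data, assuming a hypothetical `times_seen` column; the data are simulated placeholders.

```python
# Sketch: summarising 'treatment' exposure from survey data. `times_seen`
# (number of times the respondent reports seeing the campaign message) is a
# hypothetical variable; the values are simulated.
import pandas as pd

survey = pd.DataFrame({"times_seen": [0, 0, 1, 3, 0, 2, 5, 1, 0, 4]})

coverage = (survey["times_seen"] > 0).mean()                   # % of population reached
intensity = survey["times_seen"].value_counts().sort_index()   # frequency distribution

print(f"coverage: {coverage:.0%}")
print(intensity)
```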
Challenges in Behavioural Measurement
• Use and compliance are unobserved and self-reported
• Self-reports are subject to bias:
  • Social desirability
  • Value of saying what is in the counseling message
  • Confounded by benefits of study participation
  • Changing social norms
• Interpretation
  • Are understandings of, for example, ‘casual sex’ or ‘non-regular partners’ uniform?
  • E.g. ‘faithfulness’ – FHI (2009) – a man is faithful if his wife does not find out about the second partner
  • Are questions interpreted in the same way?
• Recall (can’t remember)
  • Think of those of us who are challenged by time sheets
• Interviewer
  • Difficult to standardize counseling across all staff for the duration of the trial, which affects adherence and its measurement
Source: Adapted from Padian, 2009
Challenges in Behavioural Measurement
• How do we measure changes in social norms?
  • At what level? Couple, community, social network, ‘social’?
  • How do we track changes over time?
  • What about normative behaviors?
• Almost exclusive reliance on ego-centric survey data
Source: Adapted from Padian, 2009
Namibia – perceptions and social norms about concurrency Source: Drawn by author, with data from NDHS 2007
Challenges in behavioural measurements: How can we address them?
Challenges in behavioural measurements: How can we address them? Option 1: Use a different data collection method than face-to-face (FTF) interviewing
We need to figure out better survey methods
• ICVI voting box
• PBS polling box
• Clipboard with enclosed PDA
• Self-reported coital diaries
Source: Boily, 2009
Reported rates of concurrent partners depend on the method used to collect the data: KwaZulu-Natal (KZN)
[Chart: rates reported overall, in face-to-face interviews, and in self-completed questionnaires]
But… 55% missing data
Source: McGrath et al., 2009
Challenges in behavioural measurements: How can we address them? Option 2: Use biological markers for sexual behaviour
Zimbabwe study
• Randomized, cross-sectional study of Zimbabwean women who had recently completed MIRA*
• Objective 1: Measure the validity of self-reports of recent sex and condom use using an objective biomarker of semen exposure for the previous two days (i.e., prostate-specific antigen, PSA)
• Objective 2: Evaluate whether ACASI improved the validity of self-reported data compared to standard face-to-face interviewing (FTFI)
*Manuscript by Minnis et al. currently in preparation
Source: Padian et al., 2009
Zimbabwe Study Study interviewer training a participant to use the computer, Seke South Clinic, Chitungwiza, Zimbabwe. Source: Padian et al , 2009
Zimbabwe Study Results: N = 196 women tested positive for PSA; 53% (104/196) did NOT report unprotected sex…. †Based on a one-sided Fisher’s exact test. Source: Padian et al., 2009
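A short sketch of the kind of calculation behind these results: the reported misreporting proportion, plus a one-sided Fisher's exact comparison of two interview modes. The 2x2 split by mode is not given on the slide; the counts below are purely illustrative placeholders.

```python
# Sketch: 104/196 PSA-positive women did not report unprotected sex (from the
# slide); modes were compared with a one-sided Fisher's exact test.
from scipy.stats import fisher_exact

misreport_rate = 104 / 196
print(f"did not report unprotected sex: {misreport_rate:.0%}")   # 53%

# Hypothetical counts: [misreported, reported accurately] for ACASI vs FTFI.
# These are placeholders, NOT the study's data.
table = [[45, 55],    # ACASI  (placeholder)
         [59, 37]]    # FTFI   (placeholder)
odds_ratio, p_value = fisher_exact(table, alternative="less")
print(f"one-sided Fisher's exact p-value: {p_value:.3f}")
```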
So given that they lied….
• Did they lie consistently over time?
• Did their pattern of lying change?
• What influenced (i) the number that lied, (ii) who lied, and (iii) what they lied about?
Lying about sexual behaviour might change over time Source: Slaymaker and Gorgens, forthcoming
Challenges in behavioural measurements: How can we address them? Option 3: Don’t ask them about their behaviors, ask them about their sexual partners and sexual history
But the results are not always encouraging…
• 32% and 36% of partner pairs agree on the exact date of first and last sex, respectively (USA data, Brewer et al., 2005)
• 64% and 81% agree within 30 days on the date of first and last sex, respectively (USA data, Brewer et al., 2005)
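A minimal sketch of how exact and within-30-day agreement on reported dates might be computed for a partner pair; the dates and the tolerance parameter are illustrative.

```python
# Sketch: do two partners' reported dates of (first or last) sex agree exactly,
# or within a 30-day window, as in the Brewer et al. comparisons?
from datetime import date

def agreement(date_a, date_b, tolerance_days=30):
    gap = abs((date_a - date_b).days)
    return {"exact": gap == 0, "within_tolerance": gap <= tolerance_days}

print(agreement(date(2009, 3, 14), date(2009, 3, 14)))   # exact agreement
print(agreement(date(2009, 3, 14), date(2009, 4, 2)))    # agrees within 30 days only
```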
Malawi data (Likoma Island), 2009
[Chart legend: relationships reported by respondent’s partner(s) only; reported jointly but with disagreement about the date; reported by respondent only; reported jointly with agreement about the date]
Source: Helleringer et al., 2009
Challenges in behavioural measurements: How can we address them? Option 4: Collect socio-centric data
Challenges in behavioural measurements: How can we address them? Other options….
Possibility of cell phone use: South Africa Source: Chimbindi, 2010
Options for Cell Phone Use
• Real-time program monitoring data
• Sexual behaviour data from sub-populations, e.g. mobile populations at taxi ranks in Southern Africa
• Incentive-based
• Quick feedback
• Low cost (the IVR option can be rolled out at around USD 15 per response, excluding incentive costs)
Take-home messages
• Measurements differ by:
  • Time frame
  • How the question was phrased
  • Interview mode
  • Biological marker versus self-reports
  • Type of survey
Take-home messages
• Validity and reliability are at the centre of impact evaluation, and therefore of study design
• We need to understand them, design with them in mind, test for them, and interpret study results with them in mind