Evaluation Revisited: Improving the quality of evaluative practice by embracing complexity

Implications of complication and complexity for evaluation
Patricia J. Rogers
CIRCLE (Collaboration for Interdisciplinary Research, Consulting and Learning in Evaluation)
Royal Melbourne Institute of Technology, Australia
Patricia.Rogers@rmit.edu.au
Evaluation Revisited: Improving the quality of evaluative practice by embracing complexity. Utrecht, the Netherlands, May 20-21 2010

The naïve experimentalism view of evaluation and evidence-based policy and practice
Researchers (single study or several studies) find that thing ‘A’ works → Policymakers decide to do thing ‘A’ → Practitioners do thing ‘A’ → Intended beneficiaries benefit as expected

But things are often more complicated or complex than this …

What can (and does) go wrong with naïve experimentalism
Chain: Researchers find that thing ‘A’ works → Policymakers decide to do thing ‘A’ → Practitioners do thing ‘A’
• Random error
• Differential effects – thing ‘A’ only works in some contexts
• Negative effects ignored
• Narrow studies that ignore important evidence
• Misrepresentation of results
• Not feasible in other locations
• Not scaleable

An alternative view of knowledge-building

An approach to evaluation and evidence-based policy and practice that recognizes the complicated and complex aspects of situations and interventions
Participants: Researchers and evaluators; Policymakers; Practitioners and managers; Community and civil society
Questions: What is needed? What is possible? What works? What works for whom in what situations? What is working?

Advocacy for RCTs (Randomised Controlled Trials) in development evaluation
2003, 2006
• “J-PAL is best understood as a network of affiliated researchers … united by their use of the randomized trial methodology”
• Advocated more use of RCTs. Argued that experimental and quasi-experimental designs had a comparative advantage because they provide an unbiased numeric estimate of impact
2009, 2010
• TED talk: used leeches to illustrate the alternative to using RCTs as evidence

Distinguishing between RCTs and naïve experimentalism
RCT (Randomised Controlled Trial)
• one of many research designs that can be suitable
• involves randomly assigning (truly randomly, not ad hoc) potential participants either to receive the treatment (or one of several versions of the treatment) or to be in the control group (who might receive nothing or the current standard treatment)
• in ‘double blind’ RCTs neither the participants nor the researchers know who is in the treatment group (eg the control group get pills that look the same, and the details of group membership are kept secret until after the results are recorded)
Naïve experimentalism
• believes that RCTs always provide the best evidence (the ‘gold standard’ approach)
• ignores (or is ignorant of) the potential risks in using RCTs and the other approaches that can be appropriate
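The difference between truly random and ad hoc assignment can be made concrete with a minimal sketch. The function name `randomly_assign`, the arm labels and the participant IDs are illustrative assumptions, not part of the original slides; a real trial would typically use a documented, concealed allocation procedure.

```python
import random

def randomly_assign(participants, arms=("treatment", "control"), seed=None):
    """Shuffle the participants, then deal them round-robin into the arms.

    Hypothetical helper: shuffling first means nobody chooses who ends up
    in which arm, which is what separates random from ad hoc assignment.
    """
    rng = random.Random(seed)          # seeded so the allocation can be audited
    shuffled = list(participants)
    rng.shuffle(shuffled)
    groups = {arm: [] for arm in arms}
    for position, person in enumerate(shuffled):
        groups[arms[position % len(arms)]].append(person)
    return groups

# Twenty made-up participant IDs split into two equal arms.
groups = randomly_assign(range(1, 21), seed=2010)
for arm, members in groups.items():
    print(arm, sorted(members))
```

Passing more arm labels (for example several versions of the treatment) deals participants across all of them in the same way.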

Exploring complication and complexity in evaluation
[Image slide: publications dated 2008, 1997, 2006, 2010, 2008, 2009]

Some unhelpful ways ‘complex’ is used
• Difficult – eg little available data, hard to get additional data
• Beyond scrutiny – eg too technical for others to understand or challenge
• Ad hoc – eg too overwhelmed with implementation to think about planning or follow through

Two framings of simple, complicated and complex

Using the framework
• Can be used to refer to a situation or to an intervention
• Not useful as a way of classifying the whole situation or intervention: most useful to consider aspects of interventions
• Not normative: complex is not better than simple, and simple interventions can still be difficult to do well, or to get good data about

Simple can sometimes be appropriate
“Everything should be made as simple as possible, but no simpler.”
“It can scarcely be denied that the supreme goal of all theory is to make the irreducible basic elements as simple and as few as possible without having to surrender the adequate representation of a single datum of experience.”
Albert Einstein, Oxford University, 1933

Implications of complicated and complex situations and interventions for evaluation
• Focus
• Governance
• Consistency
• Necessariness
• Sufficiency
• Change trajectory
• Unintended outcomes
(Funnell and Rogers 2010, Purposeful Program Theory. Jossey-Bass)

1. Focus - implications for evaluation? (Funnell and Rogers 2010, Purposeful Program Theory. Jossey-Bass)

Focus - Objectives at multiple levels of a system (Funnell and Rogers 2010, Purposeful Program Theory. Jossey-Bass)

2. Governance - implications for evaluation? (Funnell and Rogers 2010, Purposeful Program Theory. Jossey-Bass)

3. Consistency - implications for evaluation? (Funnell and Rogers 2010, Purposeful Program Theory. Jossey-Bass)

What interventions look like – teaching reading
Griffin, P. (2009). ‘Ambitious new project to raise literacy and numeracy levels in Victorian Schools’. http://newsroom.melbourne.edu/studio/ep-29
Griffin, P., Murray, L., Care, E., Thomas, A., & Perri, P. (2009). Developmental Assessment: Lifting literacy through Professional Learning Teams. Assessment in Education. In press.

What interventions look like – supporting small businesses

4. Necessariness - implications for evaluation? (Funnell and Rogers 2010, Purposeful Program Theory. Jossey-Bass)

Necessariness – with/without comparisons
St Pierre et al. (1996). Report on the National Evaluation of the Comprehensive Child Development Program. Summary and links to reports available at http://www.researchforum.org/project_abstract_166.html

5. Sufficiency - implications for evaluation? (Funnell and Rogers 2010, Purposeful Program Theory. Jossey-Bass)

False negatives – the potted plant thought experiment
• If 200 potted plants are randomly assigned either to a treatment group that receives daily water or to a control group that receives none, and both groups are placed in a dark cupboard, the treatment group does not have better outcomes than the control group.
Possible conclusion: Watering plants is ineffective in making them grow.
Better conclusion: Water is not sufficient.
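The thought experiment can be played out as a small simulation. This is a sketch under a deliberately crude assumption, namely that a plant only gets a growth boost when it has both water and light; the growth model, the boost of 5 units and the noise level are invented for illustration and are not from the slides.

```python
import random
from statistics import mean

def simulate_growth(watered, light, rng):
    """Hypothetical growth model: the boost needs BOTH water and light."""
    boost = 5.0 if (watered and light) else 0.0
    return boost + rng.gauss(0, 1)     # random plant-to-plant variation

def run_trial(light, n_plants=200, seed=0):
    """Randomly assign half the plants to daily water; return the estimated effect."""
    rng = random.Random(seed)
    plants = list(range(n_plants))
    rng.shuffle(plants)
    treated = set(plants[: n_plants // 2])
    treatment = [simulate_growth(True, light, rng) for p in plants if p in treated]
    control = [simulate_growth(False, light, rng) for p in plants if p not in treated]
    return mean(treatment) - mean(control)

effect_in_cupboard = run_trial(light=False)   # both groups kept in the dark
effect_in_daylight = run_trial(light=True)    # same design, with light available
print(f"Estimated effect of watering in the dark:  {effect_in_cupboard:.2f}")
print(f"Estimated effect of watering in the light: {effect_in_daylight:.2f}")
```

Under this assumed model the well-run trial in the cupboard reports an effect close to zero, while the same design in daylight reports a large one: the design is sound, but the context withholds a necessary ingredient.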

False positives – Early Head Start
• Early Head Start program - on average effective. Listed as an ‘evidence-based program’
• But unfavourable outcomes for children in families with high levels of demographic risk factors (Mathematica Policy Research Inc, 2002; Westhorp, 2008)
• Westhorp, G. (2008). Development of Realist Evaluation Methods for Small Scale Community Based Settings. Unpublished PhD Thesis, Nottingham Trent University.
• Mathematica Policy Research Inc (2002). Making a Difference in the Lives of Infants and Toddlers and Their Families: The Impacts of Early Head Start, Vol 1. US Department of Health and Human Services.
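How a favourable average can hide an unfavourable subgroup result is easy to show with arithmetic. The subgroup sizes and effect sizes below are invented purely for illustration; they are not Early Head Start data, and the names `subgroups`, `average_effect` and `harmed_subgroups` are hypothetical.

```python
# Made-up numbers: NOT results from the Early Head Start evaluation.
subgroups = {
    "lower demographic risk": {"n": 800, "effect": 0.30},
    "higher demographic risk": {"n": 200, "effect": -0.20},
}

def average_effect(groups):
    """Size-weighted mean effect across all subgroups."""
    total = sum(g["n"] for g in groups.values())
    return sum(g["n"] * g["effect"] for g in groups.values()) / total

def harmed_subgroups(groups):
    """Names of the subgroups whose own effect is negative."""
    return [name for name, g in groups.items() if g["effect"] < 0]

print(f"Average effect: {average_effect(subgroups):+.2f}")
print("Subgroups with unfavourable outcomes:", harmed_subgroups(subgroups))
```

With these assumed figures the weighted average is (800 × 0.30 + 200 × (−0.20)) / 1000 = +0.20, so the program looks effective overall even though one subgroup does worse with it than without it.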

6. Change trajectory - implications for evaluation? (Funnell and Rogers 2010, Purposeful Program Theory. Jossey-Bass)

Complicated dose-response relationship – does stress improve performance?
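One way to see why a complicated dose-response relationship matters for evaluation is to compare only two dose levels of a curve that is not linear. The sketch below assumes an inverted-U curve purely for illustration; the slide does not specify the shape, and the function `performance` and its numbers are hypothetical.

```python
def performance(stress):
    """Hypothetical inverted-U curve: performance peaks at moderate stress."""
    return 25 - (stress - 5) ** 2

# Evaluate the assumed curve at stress levels 0 (none) to 10 (extreme).
curve = {level: performance(level) for level in range(0, 11)}

# A with/without style comparison that only looks at the two extremes.
two_point_difference = curve[10] - curve[0]

# The level at which the assumed curve actually peaks.
best_level = max(curve, key=curve.get)

print("High stress minus no stress:", two_point_difference)
print("Stress level with best performance:", best_level)
```

Under this assumption a comparison of the two extremes finds no difference at all, although the intermediate levels perform far better, so the answer to "does stress improve performance?" depends on which doses are compared.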

7. Unintended outcomes - implications for evaluation? (Funnell and Rogers 2010, Purposeful Program Theory. Jossey-Bass)

Some thoughts on how evaluation might help us to understand the complicated and the complex
Issues that may need to be addressed:
• Focus
• Governance
• Consistency
• Necessariness
• Sufficiency
• Change trajectory
• Unintended outcomes
Possible evaluation methods, approaches and methodologies:
• Emergent evaluation design that can accommodate emergent program objectives and emergent evaluation issues
• Collaborative evaluation across different stakeholders and organisations
• Non-experimental approaches to causal attribution/contribution that don’t rely on a standardized ‘treatment’
• Realist evaluation that pays attention to the contexts in which causal mechanisms operate
• Realist synthesis that can integrate diverse evidence (including credible single case studies) in different contexts
• ‘Butterfly nets’ to catch unanticipated results

Looking forward to hearing about your approaches to addressing these issues in evaluation
