1. Session 2: Specifying the Conceptual and Operational Models and the Research Questions that Follow Mark W. Lipsey
Vanderbilt University
2. Workshop on randomized controlled trials Purpose: Increasing capacity to develop and conduct rigorous evaluations of the effectiveness of education interventions
Caveat: “Rigorous evaluations” are not appropriate for every intervention or every research project involving an intervention
They require special resources (funding, amenable circumstances, expertise, time)
They can produce misleading or uninformative results if not done well
The preconditions for making them meaningful may not be met.
3. Critical preconditions for rigorous evaluation A well-specified, fully developed intervention with useful scope
basis in theory and prior research
identified target population
specification of intended outcomes/effects
“theory of change” explication of what it does and why it should have the intended effects for the intended population
operators’ manual: complete instructions for implementing
ready-to-go materials, training procedures, software, etc.
4. Critical preconditions for rigorous evaluation (continued) A plausible rationale that the intervention is needed; reason to believe it has advantages over what’s currently proven and available
Clarity about the relevant counterfactual: what the intervention is supposed to be better than
Demonstrated “implementability”: it can be implemented well enough in practice to plausibly have effects
Some evidence that it can produce the intended effects, albeit evidence short of the standards for rigorous evaluation
5. Critical preconditions for rigorous evaluation (continued) Amenable research sites and circumstances:
cooperative schools, teachers, parents, and administrators willing to participate
student sample appropriate in terms of representativeness and size for showing educationally meaningful effects (see the power sketch after this list)
access to students (e.g., for testing), records, classrooms (e.g., for observations)
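These sample-size considerations can be made concrete with a power calculation. A minimal sketch using statsmodels' power utilities, assuming a two-group comparison; the 0.25 SD target effect size, 80% power, and alpha of .05 are illustrative conventions, not values from the workshop.

```python
# Sketch: per-group sample size needed to detect a given standardized
# effect size with a two-group t-test. The 0.25 SD target, 80% power,
# and alpha = .05 are illustrative conventions, not workshop figures.
from statsmodels.stats.power import TTestIndPower

n_per_group = TTestIndPower().solve_power(effect_size=0.25,
                                          power=0.80, alpha=0.05)
print(round(n_per_group))  # ~253 students per group under these assumptions
```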
6. IES funding categories Goal 2 (intervention development) for advancing intervention concepts to the point where rigorous evaluation of their effects may be justified.
Goal 3 (efficacy studies) for determining whether an intervention can produce worthwhile effects; RCT evaluations preferred.
Goal 4 (effectiveness studies) for investigating the effects of an intervention implemented under realistic conditions at scale; RCT evaluations preferred.
7. Specifying the theory of change embodied in the intervention Nature of the need addressed
what and for whom (e.g., 2nd grade students who don’t read well)
why (e.g., poor decoding skills, limited vocabulary)
where the issues addressed fit in the developmental progression (e.g., prerequisites to fluency and comprehension, assumes concepts of print)
rationale/evidence supporting these specific intervention targets at this particular time
8. Specifying the theory of change How the intervention addresses the need and why it should work
content: what the student should know or be able to do; why this meets the need
pedagogy: instructional techniques and methods to be used; why appropriate
delivery system: how the intervention will arrange to deliver the instruction
Most important: What aspects of the above are different from the counterfactual condition
What are the key factors or core ingredients most essential and distinctive to the intervention
9. Logic models as theory schematics
11. Mapping variables onto the intervention theory: Sample characteristics
12. Mapping variables onto the intervention theory: Intervention characteristics
13. Mapping variables onto the intervention theory: Intervention outcomes
14. Main relationships of (possible) interest Causal relationship between the independent variable (IV) and the dependent variables (DVs), i.e., effects of causes; tested as treatment-control (T-C) differences
Duration of effects post-intervention; growth trajectories
Moderator relationships; ATIs (aptitude-treatment interactions): differential treatment effects for different subgroups; tested as T x M interactions or T-C differences between subgroups
Mediator relationships: stepwise causal relationships in which the effect on one DV causes an effect on another; tested via Baron & Kenny (1986) steps or SEM-type techniques (see the regression sketch below).
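A minimal sketch of how the T-C difference and a T x M (ATI) interaction are commonly tested with OLS regression in Python (statsmodels); the data are simulated and the variable names (treat, pretest, posttest) are illustrative, not the workshop's models.

```python
# Sketch: testing a treatment effect (T-C difference) and a
# treatment-by-moderator (T x M / ATI) interaction with OLS regression.
# Simulated data; variable names are illustrative.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(42)
n = 400
df = pd.DataFrame({
    "treat": rng.integers(0, 2, n),    # 1 = intervention, 0 = control
    "pretest": rng.normal(50, 10, n),  # baseline score, used as moderator M
})
# Simulated outcome: a 3-point treatment effect that weakens as pretest rises
df["posttest"] = (50 + 0.6 * df["pretest"] + 3.0 * df["treat"]
                  - 0.05 * df["treat"] * (df["pretest"] - 50)
                  + rng.normal(0, 8, n))

# T-C difference: the coefficient on `treat` estimates the average effect
main = smf.ols("posttest ~ treat + pretest", data=df).fit()
# T x M interaction: tests whether the effect differs by pretest level (ATI)
ati = smf.ols("posttest ~ treat * pretest", data=df).fit()
print(main.params["treat"], ati.pvalues["treat:pretest"])
```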
15. Formulation of the research questions Organized around key variables and relationships
Specific with regard to the nature of the variables and relationships
Supported with a rationale for why the question is important to answer
Connected to real-world education issues
What works, for whom, under what circumstances, how, and why?
16. Session 3: Describing and Quantifying Outcomes Mark W. Lipsey
Vanderbilt University
17. Outcome constructs to measure Identifying the relevant outcome constructs follows from the theory development and other considerations covered earlier in Session 2
What: proximal/mediating and distal outcomes
When: temporal status (baseline, immediate outcome, longer-term outcomes)
What else:
possible positive or negative side effects
control constructs: outcomes not targeted for change
18. Aligning the outcome constructs and measures with the intervention and policy objectives
19. Alignment of instructional tasks with the assessment tasks
20. Basic psychometric issues Validity (typically correlation with established measures or subgroup differences)
Reliability (typically internal consistency or test-retest correlation; see the alpha sketch after this list)
standardized measures of established validity and reliability
researcher developed measures with validity and reliability demonstrated in prior research
new measures with validity and/or reliability to be investigated in the present study
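A minimal sketch of one common internal-consistency index, Cronbach's alpha, computed from a respondents-by-items score matrix; the data are simulated and cronbach_alpha is an illustrative helper, not a library function.

```python
# Sketch: Cronbach's alpha (internal consistency) for a multi-item scale.
# Simulated data; `cronbach_alpha` is an illustrative helper function.
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """items: 2-D array with rows = respondents, columns = scale items."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()  # sum of item variances
    total_var = items.sum(axis=1).var(ddof=1)    # variance of total scores
    return (k / (k - 1)) * (1 - item_vars / total_var)

rng = np.random.default_rng(0)
true_score = rng.normal(0, 1, (200, 1))
items = true_score + rng.normal(0, 0.8, (200, 6))  # 6 noisy parallel items
print(f"alpha = {cronbach_alpha(items):.2f}")
```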
21. Special issue for intervention studies: sensitivity to change
22. Achievement effect sizes from 97 randomized education studies
23. Data from which measurement sensitivity can be inferred Observed effects from other intervention studies using the measure (see the effect-size sketch after this list)
Mean effect sizes and their standard deviations from meta-analysis
Longitudinal research and descriptive research showing change over time or differences between relevant criterion groups
Archival data allowing ad hoc analysis of, e.g., change over time, differences between groups
Pilot data on change over time or group differences with the measure
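The observed effects and mean effect sizes referred to above are typically standardized mean differences; a minimal sketch of Cohen's d with the Hedges' g small-sample correction, using made-up group statistics.

```python
# Sketch: standardized mean difference (Cohen's d) with the Hedges' g
# small-sample correction. The group statistics below are made up.
import math

def hedges_g(m_t, m_c, sd_t, sd_c, n_t, n_c):
    # Pooled standard deviation across treatment and control groups
    sp = math.sqrt(((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2)
                   / (n_t + n_c - 2))
    d = (m_t - m_c) / sp
    j = 1 - 3 / (4 * (n_t + n_c) - 9)  # small-sample correction factor
    return j * d

# e.g., treatment mean 54 vs. control mean 50, SDs near 10, 60 per group
print(f"g = {hedges_g(54, 50, 10, 11, 60, 60):.2f}")
```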
24. Variance control and measurement sensitivity
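The body of this slide is not preserved in the source; a standard example of variance control is adjusting for a baseline covariate (ANCOVA-style). The sketch below uses simulated data to show how absorbing pretest variance shrinks the standard error of the treatment effect, increasing sensitivity.

```python
# Sketch: covariate adjustment (ANCOVA-style) as variance control.
# Compares the standard error of the treatment effect with and without
# a pretest covariate. Simulated data; names are illustrative.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 300
df = pd.DataFrame({"treat": rng.integers(0, 2, n),
                   "pretest": rng.normal(0, 1, n)})
df["posttest"] = 2.0 * df["treat"] + 0.7 * df["pretest"] + rng.normal(0, 1, n)

unadjusted = smf.ols("posttest ~ treat", data=df).fit()
adjusted = smf.ols("posttest ~ treat + pretest", data=df).fit()
# The pretest absorbs outcome variance, shrinking the error term and
# making the same treatment effect easier to detect.
print(unadjusted.bse["treat"], adjusted.bse["treat"])
```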
25. Issues related to multiple outcome measures
26. Correlated measures: overlap and efficiency
27. Correlated change may be even more relevant
28. Handling multiple correlated outcome measures Pruning: try to avoid measures that have high conceptual overlap and are likely to have relatively large intercorrelations
Procedural: organize assessment and data collection to combine where possible for efficiency
Analytic
create composite variables to use in the analysis (see the sketch after this list)
use multivariate techniques like MANOVA to examine omnibus effects as context for univariate effects
use latent variable analysis, e.g., in SEM
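A minimal sketch of the first two analytic options, using simulated data and illustrative outcome names: a z-score composite tested with a t-test, and a MANOVA omnibus test via statsmodels.

```python
# Sketch: (a) z-score composite of correlated outcomes, (b) MANOVA
# omnibus test. Simulated data; outcome names are illustrative.
import numpy as np
import pandas as pd
from scipy import stats
from statsmodels.multivariate.manova import MANOVA

rng = np.random.default_rng(7)
n = 200
treat = rng.integers(0, 2, n)
common = rng.normal(0, 1, n)  # shared variance makes the outcomes correlate
df = pd.DataFrame({
    "treat": treat,
    "vocab": common + 0.5 * treat + rng.normal(0, 0.7, n),
    "comprehension": common + 0.4 * treat + rng.normal(0, 0.7, n),
})

# (a) Composite: average of z-scored outcomes, then a simple t-test
z = df[["vocab", "comprehension"]].apply(stats.zscore)
df["composite"] = z.mean(axis=1)
t, p = stats.ttest_ind(df.loc[df.treat == 1, "composite"],
                       df.loc[df.treat == 0, "composite"])

# (b) Omnibus MANOVA as context for the univariate effects
mv = MANOVA.from_formula("vocab + comprehension ~ treat", data=df)
print(p, mv.mv_test())
```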
29. Practicality and appropriateness to the circumstances Feasibility: time and resources required
Respondent burden: minimize demands, provide incentives/compensation
Developmental appropriateness: consider not only age but performance level and possible ceiling and floor effects
For follow-up beyond one school year, may need measures designed for a broad age span to maintain comparability
May need to tailor measures or assessment procedures for special populations (disabilities, English language learners)