Modeling “The Cause”: Assessing Implementation Fidelity and Achieved Relative Strength in RCTs David S. Cordray Vanderbilt University IES/NCER Summer Research Training Institute, 2010
Overview • Define implementation fidelity and achieved relative strength • A 4-step approach to assessment and analysis of implementation fidelity (IF) and achieved relative strength (ARS): • Model(s)-based • Quality Measures of Core Causal Components • Creating Indices • Integrating implementation assessments with models of effects (next week) • Strategy: • Describe each step • Illustrate with an example • Planned Q&A segments
Caveat and Precondition • Caveat • The black box (ITT model) is still the #1 priority • Implementation fidelity and achieved relative strength are supplemental to ITT-based results • Precondition • We consider implementation fidelity in RCTs that are conducted on mature (enough) interventions • That is, the intervention is stable enough to describe an underlying: • Model/theory of change, and • Operational (logic and context) models
Dimensions of Intervention Fidelity • Operative definitions: • True fidelity = adherence or compliance: program components are delivered/used/received as prescribed, with stated criteria for success or full adherence; the specification of these criteria is relatively rare • Intervention exposure: amount of program content, processes, and activities delivered to/received by all participants (aka receipt, responsiveness); this notion is most prevalent • Intervention differentiation: the unique features of the intervention are distinguishable from other programs, including the control condition; a unique application within RCTs
Linking Intervention Fidelity Assessment to Contemporary Models of Causality Rubin’s Causal Model: True causal effect of X is (YiTx – YiC) In RCTs, the difference between outcomes, on average, is the causal effect Fidelity assessment within RCTs also entails examining the difference between causal components in the intervention and control conditions. Differencing causal conditions can be characterized as achieved relative strength of the contrast. Achieved Relative Strength (ARS) = tTx – tC ARS is a default index of fidelity
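The ARS arithmetic above can be sketched directly. This is an illustrative computation with hypothetical classroom fidelity scores, not code or data from any actual study:

```python
# Illustrative sketch: Achieved Relative Strength (ARS) as the difference
# between mean achieved fidelity in the treatment condition (tTx) and in
# the control condition (tC). All scores below are hypothetical.

def achieved_relative_strength(fidelity_tx, fidelity_c):
    """ARS = mean fidelity in treatment (tTx) minus mean fidelity in control (tC)."""
    t_tx = sum(fidelity_tx) / len(fidelity_tx)
    t_c = sum(fidelity_c) / len(fidelity_c)
    return t_tx - t_c

# Hypothetical fidelity scores (0-100 scale), one per classroom
tx_scores = [88, 84, 83, 85]   # treatment classrooms
c_scores = [72, 69, 70, 69]    # control classrooms

print(achieved_relative_strength(tx_scores, c_scores))  # 15.0
```

A positive ARS means the treatment condition actually received more of the causal components than the control did; an ARS near zero signals contamination or infidelity even if delivery in the treatment group looked adequate on its own.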
[Figure: treatment strength (50 to 100) plotted against expected outcome (0.00 to 0.45). Infidelity pulls the intended treatment strength TTx down to the achieved strength tTx = 85; "infidelity" in the control condition (augmentation) pulls TC up to tC = 70. Achieved Relative Strength = (85) − (70) = 15 on the treatment-strength scale; Expected Relative Strength = (0.40 − 0.15) = 0.25 on the outcome scale.]
Why is this Important? • Statistical conclusion validity • Construct validity: which is the cause, (TTx − TC) or (tTx − tC)? • Poor implementation: essential elements of the treatment are incompletely implemented • Contamination: essential elements of the treatment are found in the control condition (to varying degrees) • Pre-existing similarities between T and C on intervention components • External validity: generalization is about (tTx − tC); this difference needs to be known for proper generalization and future specification of the intervention components
So what is the cause? …The achieved relative difference in conditions across components [Figure: infidelity in the treatment condition vs. augmentation of the control, by component. Legend: PD = Professional Development; Asmt = Formative Assessment; Diff Inst = Differentiated Instruction]
In Practice… • Step 1: Identify core components in the intervention group (e.g., via a model of change); establish benchmarks (if possible) for TTx and TC • Step 2: Measure core components to derive tTx and tC (e.g., via a "logic model" based on the model of change) • Step 3: Derive indicators • Step 4: Incorporate indicators of IF and the ARS index (ARSI) into the analysis of effects
Focused assessment is needed. What are the options? (1) Essential or core components (activities, processes); (2) necessary, but not unique, activities, processes, and structures (supporting the essential components of T); and (3) ordinary features of the setting (shared with the control group). Focus on (1) and (2).
Step 1: Specifying Intervention Models • Simple version of the question: what was intended? • Interventions are generally multi-component sequences of actions • Mature-enough interventions are specifiable as: • Conceptual model of change • Intervention-specific model • Context-specific model • Start with a specific example: the MAP RCT
Example: The Measures of Academic Progress (MAP) RCT • The Northwest Evaluation Association (NWEA) developed the Measures of Academic Progress (MAP) program to enhance student achievement • Used in 2,000+ school districts and 17,500 schools • No evidence of efficacy or effectiveness
Measures of Academic Progress (MAP): Model of Change • MAP intervention: 4 days of training; on-demand consultation; formative testing; student reports; on-line resources • Mediating process: formative assessment → differentiated instruction → achievement • Implementation issues: delivery (NWEA trainers); receipt (teachers and school leaders); enactment (teachers); outcomes (students)
Logic Model for MAP (the focus of implementation fidelity and achieved relative strength) • Resources: testing system; multiple assessment reports; NWEA trainers; NWEA consultants; on-line teaching resources • Activities: 4 training sessions; follow-up consultation; access to resources • Outputs: use of formative assessment; differentiated instruction • Impacts: improved student achievement • Outcomes: state tests; MAP tests • Program-specific implementation fidelity assessments: MAP only • Comparative implementation assessments: MAP and non-MAP classes
Context-Specific Model: MAP Two points: 1. This tells us when assessments should be undertaken; and 2. It provides a basis for determining the length of the intervention study and the ultimate RCT design.
Step 2: Quality Measures of Core Components • Measures of resources, activities, outputs • Range from simple counts to sophisticated scaling of constructs • Generally involves multiple methods • Multiple indicators for each major component/activity • Reliable scales (3-4 items per sub-scale)
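One common way to check the "reliable scales (3-4 items per sub-scale)" criterion is Cronbach's alpha. The sketch below uses hypothetical 5-point survey items; the actual MAP instruments are not shown in these slides:

```python
# A minimal sketch of sub-scale reliability via Cronbach's alpha:
# alpha = k/(k-1) * (1 - sum of item variances / variance of total scores).
# All item scores below are hypothetical.
from statistics import pvariance

def cronbach_alpha(items):
    """items: a list of per-item score lists (one inner list per item,
    aligned by respondent). Returns the alpha reliability estimate."""
    k = len(items)
    item_vars = sum(pvariance(scores) for scores in items)
    totals = [sum(vals) for vals in zip(*items)]   # total score per respondent
    total_var = pvariance(totals)
    return (k / (k - 1)) * (1 - item_vars / total_var)

# Three hypothetical 5-point survey items answered by six teachers
item_scores = [
    [4, 5, 3, 4, 2, 5],
    [4, 4, 3, 5, 2, 4],
    [5, 4, 2, 4, 3, 5],
]
print(round(cronbach_alpha(item_scores), 2))  # 0.87
```

Values around 0.70 or higher are conventionally read as adequate for a short sub-scale; with only 3-4 items per component, alpha is sensitive to any single weak item.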
Measuring Program-Specific Components • MAP resources (testing system; multiple assessment reports; NWEA trainers; NWEA consultants; on-line teaching resources): criterion: present or absent; source or method: MAP records • MAP activities: 4 training sessions and follow-up consultation: criterion: attendance; source or method: MAP records • Access to resources: criterion: use; source or method: web records
Measuring Outputs: Both MAP and Non-MAP Conditions • MAP outputs: use of formative assessment data; differentiated instruction • Methods: end-of-year teacher survey; observations (3); teacher logs (10) • Indices: difference in differentiated instruction (high- vs. low-readiness students); proportion of observation segments with any differentiated instruction • Criterion: achieved relative strength
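The two indices named above might be computed as follows; the ratings and observation codes here are hypothetical stand-ins for the MAP survey and observation data:

```python
# Hypothetical sketch of the two differentiated-instruction indices:
# (1) the gap in differentiated instruction reported for high- vs.
#     low-readiness students, and
# (2) the proportion of observation segments with any differentiated
#     instruction. All data below are invented for illustration.
from statistics import mean

def differentiation_gap(high_readiness, low_readiness):
    """Difference in differentiated-instruction ratings: high- minus low-readiness."""
    return mean(high_readiness) - mean(low_readiness)

def prop_segments_with_di(segments):
    """Proportion of observation segments coded 1 (any differentiated instruction)."""
    return sum(segments) / len(segments)

# Hypothetical teacher ratings (1-5 scale) and ten observation segments (0/1)
print(differentiation_gap([4, 5, 4], [2, 3, 2]))               # 2.0
print(prop_segments_with_di([1, 0, 1, 1, 0, 0, 1, 0, 0, 1]))   # 0.5
```

Computed in both the MAP and non-MAP classrooms, each index then feeds the achieved-relative-strength contrast in Step 3.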
Step 3: Indexing Fidelity and Achieved Relative Strength • True fidelity: relative to a benchmark • Intervention exposure: amount of sessions, time, frequency • Achieved Relative Strength (ARS) index: standardized difference in the fidelity index across Tx and C • Based on Hedges' g (Hedges, 2007) • Corrected for clustering in the classroom
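A cluster-adjusted ARS index along these lines can be sketched with the Hedges (2007) total-variance formula, assuming equal cluster (classroom) sizes; all summary statistics below are hypothetical, and the exact adjustment used in any particular study may differ:

```python
# Sketch of an ARS index as a standardized mean difference in fidelity,
# multiplied by the Hedges (2007) correction for clustering with a common
# cluster size n: g = d * sqrt(1 - 2(n-1)*icc / (N-2)).
# All inputs below are hypothetical summary statistics.
import math

def ars_index(mean_tx, mean_c, sd_pooled, n_per_cluster, n_total, icc):
    """Cluster-corrected standardized difference in fidelity across Tx and C."""
    d = (mean_tx - mean_c) / sd_pooled                       # plain Cohen-style d
    correction = math.sqrt(1 - (2 * (n_per_cluster - 1) * icc) / (n_total - 2))
    return d * correction

# Hypothetical: fidelity means 85 (Tx) vs. 70 (C), pooled SD 12,
# 20 classrooms of 25 students each (N = 500), intraclass correlation 0.15
print(round(ars_index(85.0, 70.0, 12.0, 25, 500, 0.15), 3))  # 1.241
```

With a small intraclass correlation the correction barely moves the estimate, but ignoring clustering entirely would overstate the precision of the ARS contrast.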
Calculating the ARSI When There Are Multiple Components [Figure: infidelity in the treatment condition vs. augmentation of the control, by component. Legend: PD = Professional Development; Asmt = Formative Assessment; Diff Inst = Differentiated Instruction]
Preliminary Conclusions for the MAP Implementation Assessment • The developer (NWEA): complete implementation of resources, training, and consultation • Teachers, program-specific implementation outcomes: variable attendance at training and use of training sessions; moderate use of data and differentiation activities/services; training extended through May 2009 • Teachers, achieved relative strength: no between-group differences in enactment of differentiated instruction
Step 4: Indexing the Cause-Effect Linkage • Analysis Type 1: congruity of cause and effect in ITT analyses • Effect = average difference on outcomes (ES) • Cause = average difference in causal components (ARS, Achieved Relative Strength) • Descriptive reporting of each, separately • Analysis Type 2: variation in implementation fidelity linked to variation in outcomes • Hierarchy of approaches (ITT → LATE/CACE → regression → descriptive) • TO BE CONTINUED…
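The LATE/CACE step in that hierarchy is commonly estimated with the Wald (instrumental-variables) ratio: the ITT effect on outcomes divided by the ITT difference in treatment receipt. A minimal sketch with hypothetical numbers:

```python
# Sketch of a complier average causal effect (CACE / LATE) via the Wald
# estimator, using random assignment as the instrument. The effect sizes
# and compliance rates below are hypothetical.

def cace_wald(itt_effect, uptake_tx, uptake_c):
    """CACE = ITT outcome effect / difference in treatment-receipt rates.

    itt_effect: ITT effect on the outcome (e.g., in SD units)
    uptake_tx:  proportion of the assigned-treatment group that enacted the program
    uptake_c:   proportion of the control group that enacted it (contamination)
    """
    return itt_effect / (uptake_tx - uptake_c)

# Hypothetical: ITT effect of 0.12 SD; 80% enactment in Tx vs. 5% in control
print(round(cace_wald(0.12, 0.80, 0.05), 3))  # 0.16
```

The fidelity and ARS measures from Steps 2-3 are exactly what make the denominator estimable: without measuring enactment in both conditions, the CACE cannot be distinguished from the diluted ITT effect.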