Treatment Effect Heterogeneity & Dynamic Treatment Regime Development. S.A. Murphy. Dynamic treatment regimes (DTRs) are individually tailored treatments, with treatment type and dosage changing according to individual outcomes.
Dynamic treatment regimes utilize treatment effect heterogeneity to individualize treatment.
Example of a DTR
• Adaptive Drug Court Program for drug-abusing offenders.
• Goal is to minimize recidivism and drug use.
• Marlowe et al. (2008, 2009, 2011)
Treatment Effect Heterogeneity
• Focus on Theory: Used to deepen understanding of the underlying causal, mechanistic structure.
• Focus on Practice: Used to improve decision making in practice. For whom, when, and in which context might a specific treatment be most useful? This is our focus today.
Treatment Effect Heterogeneity & DTR Development
• Take advantage of treatment effect heterogeneity in the design of the intervention trial:
  • Embedded tailoring variables
  • Part of the “treatment action”
• Take advantage of treatment effect heterogeneity in the design of the DTR:
  • Data analyses
Pelham ADHD Study
[Trial design diagram]
• Initial random assignment: begin low-intensity BMOD, or begin low dose Med.
• At 8 weeks, assess: adequate response?
• Yes: continue, reassess monthly; randomize if deteriorate.
• No: random assignment between "augment with other treatment" and "intensify current treatment".
Treatment Effect Heterogeneity: Embedded Tailoring Variable
• Embedded tailoring variables: (a) teacher-reported Impairment Scale, (b) teacher-reported individualized list of target behaviors.
• Non-response is assessed at 8 weeks and every 4 weeks thereafter.
Treatment Effect Heterogeneity: Embedded DTRs
4 Embedded DTRs:
1. Start with BMOD; only if nonresponse criterion reached, augment with MED
2. Start with BMOD; only if nonresponse criterion reached, intensify BMOD
3. Start with MED; only if nonresponse criterion reached, augment with BMOD
4. Start with MED; only if nonresponse criterion reached, intensify MED
Oslin Alcoholism Trial
[Trial design diagram]
• Initial random assignment: early trigger for nonresponse, or late trigger for nonresponse. All subjects begin on NTX.
• Response (8 wks): random assignment between NTX and TDM + NTX.
• Nonresponse: random assignment between CBI + MM and CBI + NTX + MM.
Treatment Effect Heterogeneity: Embedded Tailoring Variable & Embedded DTR
• Embedded tailoring variable: heavy drinking days (HDD).
• The first randomization is between treatment actions: move to stage 2 if 2 HDDs versus move to stage 2 if 5 HDDs.
• 8 embedded DTRs.
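The count of 8 embedded DTRs can be checked by enumeration. The sketch below assumes the trial diagram decomposes into three two-option decisions (trigger, responder option, nonresponder option), and that "early trigger" corresponds to 2 HDDs and "late trigger" to 5 HDDs. The constant and function names are hypothetical, chosen only for illustration.

```python
from itertools import product

# Labels are copied from the trial diagram; grouping them into three
# two-option decisions is this sketch's reading of that diagram.
TRIGGERS = ["early trigger (2 HDDs)", "late trigger (5 HDDs)"]  # assumed mapping
RESPONDER_OPTIONS = ["NTX", "TDM + NTX"]
NONRESPONDER_OPTIONS = ["CBI + MM", "CBI + NTX + MM"]


def enumerate_embedded_dtrs():
    """Return one dict per (trigger, responder, nonresponder) combination."""
    return [
        {"trigger": t, "if_response": r, "if_nonresponse": n}
        for t, r, n in product(TRIGGERS, RESPONDER_OPTIONS, NONRESPONDER_OPTIONS)
    ]


embedded_dtrs = enumerate_embedded_dtrs()
for i, dtr in enumerate(embedded_dtrs, start=1):
    print(i, dtr)
```

Each subject's observed treatment path is consistent with more than one of these regimes, which is why a single trial can inform all of them.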
A Data Analysis Method for Utilizing Treatment Effect Heterogeneity to Construct a “More Deeply Tailored” DTR: Q-Learning
• Subject data come from sequential, multiple assignment, randomized trials.
• At each stage, subjects are randomized among alternative options.
• $A_j$ is a randomized action with known randomization probability.
• Binary actions with $P[A_j = 1] = P[A_j = -1] = 0.5$.
Dynamic Treatment Regime (DTR)
• The DTR is given by a sequence of decision rules, one per stage of treatment (here 2 stages).
• DTR = $(d_1, d_2)$
• Goal: Construct decision rules $d_1, d_2$ for which the expected outcome is maximal.
Q-Learning
• Q-Learning (Watkins, 1989; Ernst et al., 2005; Murphy, 2005) is a popular method from computer science; it generalizes regression to multiple stages.
• Q-Learning uses dynamic programming arguments combined with linear regression estimation of conditional means.
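A Simple Version of Q-Learning
There is a regression for each stage. Write the data on each subject as $(X_1, A_1, R_1, X_2, A_2, Y)$.
• Stage 2 regression: Regress $Y$ on $(X_1, A_1, R_1, X_2, A_2)$ to obtain $\hat{Q}_2(X_1, A_1, R_1, X_2, A_2)$.
• Stage 1 regression: Regress $\tilde{Y}$ on $(X_1, A_1)$ to obtain $\hat{Q}_1(X_1, A_1)$.

For subjects entering stage 2:
• $\tilde{Y} = \max_{a_2} \hat{Q}_2(X_1, A_1, R_1, X_2, a_2)$ is the predicted end of stage 2 response when the stage 2 treatment is equal to the “best” treatment.
• $\tilde{Y}$ is the dependent variable in the stage 1 regression for patients moving to stage 2.

The decision rules follow from the two regressions:
• Stage 2 regression (using $Y$ as dependent variable) yields $\hat{Q}_2$. Arg-max over $a_2$ yields the estimated stage 2 rule $\hat{d}_2 = \arg\max_{a_2} \hat{Q}_2(X_1, A_1, R_1, X_2, a_2)$.
• Stage 1 regression (using $\tilde{Y}$ as dependent variable) yields $\hat{Q}_1$. Arg-max over $a_1$ yields the estimated stage 1 rule $\hat{d}_1 = \arg\max_{a_1} \hat{Q}_1(X_1, a_1)$.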
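The two regressions can be sketched on simulated data. This is a minimal illustration, not the analysis from the talk: the linear working models, the simulated data-generating process, and all function names are assumptions of the sketch. It also assumes that only non-responders enter stage 2 and that responders contribute their observed Y to the stage 1 regression. With actions coded in {-1, +1} and a model of the form m + a·c, the maximum over a is m + |c| and the arg-max is the sign of c.

```python
import numpy as np


def fit_ols(design, y):
    """Least-squares coefficients for y regressed on the design matrix."""
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


def q_learning_two_stage(x1, a1, r1, x2, a2, y):
    """Two-stage Q-learning with linear working models (assumed forms):
    Q2 = alpha2.(1, x1, a1, a1*x1, x2) + a2 * beta2.(1, x2)  [non-responders]
    Q1 = alpha1.(1, x1)                + a1 * beta1.(1, x1)
    """
    nr = r1 == 0  # subjects who enter stage 2 (assumption: non-responders)
    ones = np.ones(nr.sum())
    main2 = np.column_stack([ones, x1[nr], a1[nr], a1[nr] * x1[nr], x2[nr]])
    inter2 = np.column_stack([ones, x2[nr]])
    coef2 = fit_ols(np.column_stack([main2, a2[nr][:, None] * inter2]), y[nr])
    alpha2, beta2 = coef2[:5], coef2[5:]

    # Stage 1 dependent variable: observed Y for responders (assumption),
    # predicted response under the best stage 2 action for the others.
    y_tilde = y.astype(float)
    y_tilde[nr] = main2 @ alpha2 + np.abs(inter2 @ beta2)

    main1 = np.column_stack([np.ones(len(y)), x1])
    coef1 = fit_ols(np.column_stack([main1, a1[:, None] * main1]), y_tilde)
    alpha1, beta1 = coef1[:2], coef1[2:]

    def d2(x2_new):
        """Estimated stage 2 rule: sign of the a2 coefficient term."""
        return np.where(beta2[0] + beta2[1] * x2_new > 0, 1, -1)

    def d1(x1_new):
        """Estimated stage 1 rule: sign of the a1 coefficient term."""
        return np.where(beta1[0] + beta1[1] * x1_new > 0, 1, -1)

    return d1, d2, beta1, beta2


# Simulated trial: both actions randomized with probability 0.5.
rng = np.random.default_rng(0)
n = 5000
x1 = rng.normal(size=n)
a1 = rng.choice([-1, 1], size=n)
r1 = rng.binomial(1, 0.4, size=n)  # 1 = early responder
x2 = rng.normal(size=n)
a2 = rng.choice([-1, 1], size=n)   # only matters for non-responders
y = x1 + 0.5 * a1 * x1 + (1 - r1) * a2 * x2 + rng.normal(scale=0.5, size=n)

d1, d2, beta1, beta2 = q_learning_two_stage(x1, a1, r1, x2, a2, y)
print("stage 1 interaction coefficients:", beta1)
print("stage 2 interaction coefficients:", beta2)
```

In this simulation the best stage 2 action is the sign of x2 and the best stage 1 action is the sign of x1, so the fitted rules can be compared against a known answer.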
ADHD Example
Data: $(X_1, A_1, R_1, X_2, A_2, Y)$
• $Y$ = end of year school performance
• $R_1$ = 1 if early responder; = 0 if early non-responder
• $X_2$ includes the month of non-response, $M_2$, and a measure of adherence in stage 1, $S_2$
• $S_2$ = 1 if adherent in stage 1; = 0 if non-adherent
• $X_1$ includes baseline school performance, $Y_0$; whether medicated in prior year, $S_1$; and ODD, $O_1$
• $S_1$ = 1 if medicated in prior year; = 0 otherwise
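ADHD Example
• Stage 2 regression for Y: [equation not preserved in the extracted slide text]
• Stage 1 outcome: [equation not preserved in the extracted slide text]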
Dynamic Treatment Regime Proposal
• IF medication was not used in the prior year, THEN begin with BMOD; ELSE select either BMOD or MED.
• IF the child is nonresponsive and was non-adherent, THEN augment present treatment; ELSE IF the child is nonresponsive and was adherent, THEN select intensification of current treatment.
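The proposed regime reads directly as a pair of decision rules. The sketch below encodes it as two functions. The function names and string labels are hypothetical, and the "CONTINUE" action for responders is taken from the trial design (continue, reassess monthly) rather than from the proposal itself, which only covers nonresponsive children.

```python
from typing import List


def stage1_options(medicated_prior_year: bool) -> List[str]:
    """Stage 1 rule: BMOD if not medicated in the prior year, else either."""
    if not medicated_prior_year:
        return ["BMOD"]
    return ["BMOD", "MED"]


def stage2_action(responder: bool, adherent: bool) -> str:
    """Stage 2 rule for a child assessed after stage 1."""
    if responder:
        # Assumption from the trial design, not stated in the proposal.
        return "CONTINUE"
    # Nonresponsive: intensify if adherent, augment if non-adherent.
    return "INTENSIFY" if adherent else "AUGMENT"


# Example: a child not medicated last year who is nonresponsive
# and was non-adherent in stage 1.
print(stage1_options(medicated_prior_year=False))
print(stage2_action(responder=False, adherent=False))
```

Returning a list at stage 1 reflects that the proposal leaves the choice open when the child was medicated in the prior year.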
Future Challenges
• High dimensional data; investigators want to collect real time data
• Feature construction & feature selection
• Many stages or infinite horizon

This seminar can be found at: http://www.stat.lsa.umich.edu/~samurphy/seminars/JSM_Txt_Heterogeneity2012.ppt