Methods for Estimating the Decision Rules in Dynamic Treatment Regimes
S.A. Murphy, Univ. of Michigan
IBC/ASC: July, 2004
Dynamic Treatment Regimes are individually tailored treatments, with treatment type and dosage changing with ongoing subject information. They mimic clinical practice.
• Brooner et al. (2002) Treatment of Opioid Addiction
• Breslin et al. (1999) Treatment of Alcohol Addiction
• Prochaska et al. (2001) Treatment of Tobacco Addiction
• Rush et al. (2003) Treatment of Depression
EXAMPLE: Treatment of alcohol dependency. Primary outcome is a summary of heavy drinking scores over time.
Examples of sequential multiple assignment randomized trials:
• CATIE (2001) Treatment of Psychosis in Alzheimer’s Patients
• CATIE (2001) Treatment of Psychosis in Schizophrenia
• STAR*D (2003) Treatment of Depression
• Thall et al. (2000) Treatment of Prostate Cancer
k Decisions
Observations made prior to the jth decision: $X_j$
Action at the jth decision: $A_j$
Primary Outcome: $Y = f(X_1, A_1, \dots, X_k, A_k, X_{k+1})$ for a known function $f$
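A dynamic treatment regime is a vector of decision rules, one per decision: $d = (d_1, \dots, d_k)$. If the regime is implemented, then the action at the jth decision is $A_j = d_j(X_1, A_1, \dots, A_{j-1}, X_j)$ for $j = 1, \dots, k$.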
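As a minimal sketch of what "one decision rule per decision" means operationally, the snippet below applies a list of rules in sequence, each rule seeing the history of observations and actions so far. The names `run_regime` and `observe`, the threshold 0.3, and the severity scores are illustrative assumptions, not part of the slides.

```python
from typing import Callable, List

# A decision rule maps the history observed so far to an action (here 0 or 1).
Rule = Callable[[List[float]], int]


def run_regime(rules: List[Rule],
               observe: Callable[[int, List[float]], float]) -> List[int]:
    """Apply the rules in order. `observe(j, history)` is a hypothetical
    stand-in for collecting the observation made before decision j."""
    history: List[float] = []
    actions: List[int] = []
    for j, rule in enumerate(rules):
        history.append(observe(j, history))  # observation made prior to decision j
        action = rule(history)               # rule sees past observations and actions
        history.append(action)
        actions.append(action)
    return actions


# Two-decision example with made-up severity scores.
scores = [0.5, 0.2]
regime = [lambda h: int(h[-1] > 0.3), lambda h: int(h[-1] > 0.3)]
actions = run_regime(regime, lambda j, h: scores[j])
```

Here the first score exceeds the threshold and the second does not, so the regime treats at the first decision only.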
Three Methods for Estimating Decision Rules
• Q-Learning (Watkins, 1989): regression
• A-Learning (Murphy, Robins, 2003): regression on a mean zero space
• Weighting (Murphy, van der Laan & Robins, 2002): weighted mean
One decision only! Data: $(X, A, Y)$, where the action $A$ is randomized with probability $p(a \mid X)$.
Goal: choose the decision rule $d$ to maximize the mean outcome when $A = d(X)$, written $E_d[Y]$.
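One way to estimate the mean outcome under a candidate rule from randomized data is a weighted mean, in the spirit of the weighting method. The sketch below assumes a single decision, known randomization probabilities, and normalized weights; the function name `weighted_value` and the toy data are hypothetical.

```python
import numpy as np


def weighted_value(y, a, prob_a, rule_actions):
    """Weighted mean of the outcome over subjects whose randomized action
    agrees with the rule. `prob_a` is the probability with which each
    subject's observed action was assigned (scalar or one value per subject).
    Assumes at least one subject agrees with the rule."""
    y = np.asarray(y, dtype=float)
    agree = (np.asarray(a) == np.asarray(rule_actions)).astype(float)
    weights = agree / np.asarray(prob_a, dtype=float)
    return float(np.sum(weights * y) / np.sum(weights))


# Toy data: four subjects randomized with probability 0.5 to action 0 or 1.
y = [1.0, 2.0, 3.0, 4.0]
a = [1, 0, 1, 0]
value_always_treat = weighted_value(y, a, prob_a=0.5, rule_actions=[1, 1, 1, 1])
value_never_treat = weighted_value(y, a, prob_a=0.5, rule_actions=[0, 0, 0, 0])
```

Dividing by the sum of the weights rather than by the sample size is a design choice made here for stability; it is not claimed to be the form used in the talk.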
Q-Learning Minimize
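As a minimal sketch of Q-Learning with one decision, the code below fits a regression of the outcome on the observation and action, then chooses the action with the larger fitted value. The binary action, the scalar observation, the least-squares criterion, the linear working model, and all function names are illustrative assumptions, not taken from the slides.

```python
import numpy as np


def q_features(x, a):
    """Working model for the mean outcome: intercept, x, a, and a*x."""
    x = np.asarray(x, dtype=float)
    a = np.asarray(a, dtype=float)
    return np.column_stack([np.ones_like(x), x, a, a * x])


def fit_q(x, a, y):
    """Least-squares fit of the outcome on the working-model features."""
    theta, *_ = np.linalg.lstsq(q_features(x, a),
                                np.asarray(y, dtype=float), rcond=None)
    return theta


def q_rule(theta, x):
    """Choose action 1 where the fitted value for a=1 exceeds that for a=0."""
    x = np.asarray(x, dtype=float)
    q1 = q_features(x, np.ones_like(x)) @ theta
    q0 = q_features(x, np.zeros_like(x)) @ theta
    return (q1 > q0).astype(int)


# Noiseless toy data in which treatment helps only when x > 0.
x = np.tile(np.linspace(-1.0, 1.0, 21), 2)
a = np.repeat([0, 1], 21)
y = 1.0 + 0.5 * x + 2.0 * a * x
theta = fit_q(x, a, y)
chosen = q_rule(theta, [-0.5, 0.5])
```

Because the toy data follow the working model exactly, the fit recovers the generating coefficients; with a misspecified model the estimated rule need not be the best one.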
A-Learning Minimize
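One common way to realize "regression on a mean zero space" with a single randomized binary action is to center the action at its randomization probability, so the treatment terms have mean zero given the observation. The sketch below takes that approach; it is an assumption about the general idea, not the exact estimator from the talk, and the names `fit_a_learning` and `a_rule` are hypothetical.

```python
import numpy as np


def fit_a_learning(x, a, y, prob_a1):
    """Least squares with the action centered at P(A=1 | x).
    Returns only the contrast parameters (gamma0, gamma1), where the
    modeled effect of action 1 versus action 0 is gamma0 + gamma1 * x."""
    x = np.asarray(x, dtype=float)
    a = np.asarray(a, dtype=float)
    y = np.asarray(y, dtype=float)
    centered = a - prob_a1  # mean zero given x under randomization
    design = np.column_stack([np.ones_like(x), x, centered, centered * x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[2:]


def a_rule(gamma, x):
    """Choose action 1 where the estimated contrast is positive."""
    x = np.asarray(x, dtype=float)
    return (gamma[0] + gamma[1] * x > 0).astype(int)


# Noiseless toy data: the effect of action 1 versus 0 is 2 * x.
x = np.tile(np.linspace(-1.0, 1.0, 21), 2)
a = np.repeat([0, 1], 21)
y = 1.0 + 0.5 * x + 2.0 * a * x
gamma = fit_a_learning(x, a, y, prob_a1=0.5)
chosen = a_rule(gamma, [-0.5, 0.5])
```

Only the contrast parameters are returned because the decision rule depends on the treatment effect alone, not on the baseline part of the model.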
Discussion
• Consistency of Parameterization: problems for Q-Learning
• Model Space: bias, variance
Points to keep in mind
• The sequential multiple assignment randomized trial is a trial for developing powerful dynamic treatment regimes; it is not a confirmatory trial.
• Focus on mean squared error (MSE), recognizing that, due to the high dimensionality of X, the model parameterization is likely incorrect.
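Goal: given a restricted set of functional forms for the decision rules, say $\mathcal{D}$, find $d^{*} = \arg\max_{d \in \mathcal{D}} E_d[Y]$, where $E_d[Y]$ is the mean of the primary outcome when the rules $d$ are followed.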
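To make the restricted-class goal concrete, the sketch below searches a small family of threshold rules ("treat when x exceeds c") and keeps the one with the largest estimated mean outcome. The threshold family, the weighted-mean value estimate, the function names, and the toy data are all illustrative assumptions for a one-decision randomized study.

```python
import numpy as np


def weighted_value(y, a, prob_a, rule_actions):
    """Normalized weighted mean of y over subjects whose randomized
    action agrees with the rule (assumes at least one subject agrees)."""
    y = np.asarray(y, dtype=float)
    agree = (np.asarray(a) == np.asarray(rule_actions)).astype(float)
    weights = agree / np.asarray(prob_a, dtype=float)
    return float(np.sum(weights * y) / np.sum(weights))


def best_threshold(x, a, y, prob_a, candidates):
    """Return the candidate threshold c, and its estimated value, for the
    rule 'choose action 1 when x > c' with the largest estimated value."""
    x = np.asarray(x, dtype=float)
    values = [weighted_value(y, a, prob_a, (x > c).astype(int))
              for c in candidates]
    best = int(np.argmax(values))
    return candidates[best], values[best]


# Toy data: action 1 helps when x > 0 and hurts when x < 0.
x = [-1.0, -1.0, 1.0, 1.0]
a = [0, 1, 0, 1]
y = [1.0, 0.0, 0.0, 1.0]
threshold, value = best_threshold(x, a, y, prob_a=0.5,
                                  candidates=[-2.0, 0.0, 2.0])
```

This search targets the value of the rule directly, in contrast to first fitting a regression model and then reading a rule off the fit.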
Discussion
• Mismatch in Goals: problems for Q-Learning & A-Learning
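Suppose our sample is infinite. Then in general neither the decision rule estimated by Q-Learning nor the one estimated by A-Learning is close to the best rule in the restricted set.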
Open Problems
• How might we “guide” Q-Learning or A-Learning so as to more closely achieve our goal?
• Dealing with high dimensional X: feature extraction, feature selection.
This seminar can be found at: http://www.stat.lsa.umich.edu/~samurphy/seminars/ibc_asc_0704.ppt
My email address: samurphy@umich.edu