From Population to Individual Drug Dosing in Chronic Illness
Intelligent Control for Management of Renal Anemia
Adam E Gaweda, University of Louisville, Department of Medicine
Challenges in Dynamic Treatment Regimes and Multistage Decision-Making
Overview
• Anemia management
• Dose-response modeling
• Model-based control in drug dosing
• Model-free control in drug dosing
Anemia Management: Biological vs. clinical
[Figure; the only surviving label is rHuEPO]
Anemia Management: Clinical guidelines
• Dosing guidelines (NKF-KDOQI)
  • Maintain hemoglobin (Hb) between 11 and 12 g/dL (hematocrit (Hct) between 33% and 36%).
  • Titration of EPO: “If the increase in Hb after EPO initiation or after a dose increase has been less than 1 g/dL over a 2- to 4-week period, the dose of EPO should be increased by 50%. If the absolute rate of increase of Hb after EPO initiation or after a dose increase exceeds 3 g/dL per month (eg, an increase from a Hgb 7 to 10 g/dL), or if the Hgb exceeds the target, reduce the weekly dose of EPO by 25%. When the weekly EPO dose is being increased or decreased, a change may be made in the amount administered in a given dose and/or the frequency of dosing.”
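The quoted titration rule can be written as a small decision function. This is only a sketch of one reading of the quote: the function name, the argument names, the 4-weeks-per-month conversion, and the choice to let the "reduce" branch take priority when both conditions hold are assumptions, not part of the guideline text. It also assumes the call is made after EPO initiation or after a dose increase, as the quote requires.

```python
def titrate_epo(weekly_dose, hb_change, weeks_elapsed, hb_now, target_high=12.0):
    """Return the adjusted weekly EPO dose under one reading of the quoted rule.

    hb_change     : Hb change (g/dL) observed over `weeks_elapsed` weeks
    weeks_elapsed : length of the observation window, 2 to 4 weeks
    hb_now        : current Hb (g/dL)
    target_high   : upper end of the target range (12 g/dL on the slide)
    """
    if not 2 <= weeks_elapsed <= 4:
        raise ValueError("the quoted rule covers a 2- to 4-week window")

    # Assumption: convert the observed change to a monthly rate using 4 weeks per month.
    monthly_rate = hb_change * 4.0 / weeks_elapsed

    # Assumption: the reduction branch is checked first if both branches could apply.
    if monthly_rate > 3.0 or hb_now > target_high:
        return weekly_dose * 0.75   # reduce the weekly dose by 25%
    if hb_change < 1.0:
        return weekly_dose * 1.5    # increase the dose by 50%
    return weekly_dose              # otherwise leave the dose unchanged
```

For example, `titrate_epo(10000, 0.5, 4, 10.0)` raises a hypothetical 10,000-unit weekly dose by 50%, while the same call with `hb_now=12.5` reduces it by 25%.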
Anemia Management: Current state of the art
• Anemia Management Protocols (AMP)
  • Frequency of Hb observation:
    • Every 4 weeks if Hb is within the target
    • Every 2 weeks if Hb is outside the target
  • EPO dose adjustment:
    • Minimum adjustment amount: 10% (of current dose)
    • Maximum decrease: 50% (if Hb > 15 g/dL)
    • Maximum increase: 70% (if Hb < 9 g/dL)
• Problem with AMP
  • Based on average response.
  • Only one third of the patient population achieves the target.
• Can we improve the outcome of anemia management by making it patient-specific using control theory and machine learning techniques?
Dose-response modeling: Overview
• In control system design and simulation, a good process model is priceless.
• Models of erythropoiesis:
  • Physiological model (Uehlinger et al. 1992)
  • PK/PD model (Brockmöller et al. 1992)
  • Bayesian network model (Bellazzi et al. 1993)
  • Artificial Neural Network (ANN) models (Martin Guerrero et al. 2003, Gaweda et al. 2003, Gabutti et al. 2006)
Dose-response modeling: Population vs. subpopulation modeling
[Diagram: a selection step splits the whole-population data set (batch) into data subsets (batch). Subpopulation 1, e.g. responders (EPO/Hb < a), is used to fit Model 1; Subpopulation 2, e.g. non-responders (EPO/Hb ≥ a), is used to fit Model 2. Each model maps dose to response.]
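The selection step in the diagram can be sketched as a threshold on the EPO/Hb ratio. The per-patient summary format, the function name, and the numeric values below are hypothetical; the slide gives neither the units nor the value of the threshold a.

```python
def split_by_dose_response_ratio(patients, a):
    """Split patients into responders (EPO/Hb < a) and non-responders (EPO/Hb >= a).

    patients : dict mapping a patient id to a (epo_dose, hb) pair,
               e.g. per-patient averages (hypothetical summary format)
    a        : threshold on the EPO/Hb ratio (value not given on the slide)
    """
    responders, non_responders = {}, {}
    for patient_id, (epo_dose, hb) in patients.items():
        ratio = epo_dose / hb
        if ratio < a:
            responders[patient_id] = (epo_dose, hb)
        else:
            non_responders[patient_id] = (epo_dose, hb)
    return responders, non_responders


# Made-up values for illustration only.
example = {"p1": (5.0, 11.0), "p2": (30.0, 10.0)}
resp, non_resp = split_by_dose_response_ratio(example, a=1.0)
```

Each subset would then be used to fit its own dose-response model, as the diagram indicates.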
Dose-response modeling: Example of response prediction
[Figure: example of response prediction]
Dose-response modeling: Open problems
• Prediction seems to “lag” behind the actual value.
• Do our data allow us to build a model that shows the true effect of EPO on Hb (Hct)?
• Let’s estimate a dynamic linear model Hb(k+1) = f(Hb(k), EPO(k)):
  Hb_m(k+1) = 0.82 Hb(k) + 0.011 EPO(k) + 1.91
• Let’s now estimate a model of ΔHb(k+1) = f(EPO(k)):
  ΔHb_m(k+1) = 0.015 EPO(k) - 0.23
Both models achieve comparable accuracy. The second model “explains” the dose effect better.
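To make the two fitted models concrete, the sketch below evaluates them and derives two arithmetic consequences of the coefficients: the fixed point of the first model under a constant dose, and the dose at which the second model predicts no change in Hb. The function names are illustrative, and the EPO units are whatever units the coefficients were fitted in, which the slide does not state.

```python
def predict_hb(hb, epo):
    """First model: Hb_m(k+1) = 0.82 Hb(k) + 0.011 EPO(k) + 1.91."""
    return 0.82 * hb + 0.011 * epo + 1.91


def predict_delta_hb(epo):
    """Second model: dHb_m(k+1) = 0.015 EPO(k) - 0.23."""
    return 0.015 * epo - 0.23


def steady_state_hb(epo):
    """Fixed point of the first model under a constant dose.

    Solving Hb* = 0.82 Hb* + 0.011 EPO + 1.91 gives
    Hb* = (0.011 EPO + 1.91) / (1 - 0.82).
    """
    return (0.011 * epo + 1.91) / (1.0 - 0.82)


def zero_change_dose():
    """Dose at which the second model predicts dHb = 0: EPO = 0.23 / 0.015."""
    return 0.23 / 0.015
```

In the first model most of the prediction comes from the 0.82 Hb(k) term, which is consistent with the observed "lag"; the second model isolates the dose term.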
Dose-response modeling: Open problems
• Our data come from clinical treatment (“closed-loop system”).
• How does that affect the model?
[Figure panels: output distribution; absolute prediction error vs. output]
Martin Guerrero et al. report the same phenomenon.
Model-based control: Model Predictive Control (MPC)
• Rationale for using Model Predictive Control
  • There is a delay between EPO administration and Hb response (about 17 days, from EPO manufacturer information).
  • The relationship between EPO dose and Hb increase is nonlinear (monotonically increasing with saturation; Uehlinger et al. 1992).
  • The effect of EPO continues throughout the lifetime of red blood cells (up to 120 days).
  • We plan to include constraints on EPO dose in the future (such as minimization of the total dose or minimization of dose changes).
Model-based control: MPC schematic diagram
[Diagram blocks: MODEL (population), CONTROLLER, PATIENT. Signals: EPO, EPO*, Hb, Hbm.]
Model: Hb(k+1) = Hb(k) + FNN(EPO(k), EPO(k-1), EPO(k-2))
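The model structure above lends itself to a simple predictive dose search. The sketch below is not the controller used in the trial: `stand_in_response` is a hypothetical saturating function standing in for the trained network FNN, the candidate dose grid and the 3-step horizon are assumptions, and the search simply holds each candidate dose constant over the horizon and picks the one with the smallest summed squared deviation from the 11.5 g/dL goal.

```python
import math

TARGET_HB = 11.5  # treatment goal from the slides (g/dL)


def stand_in_response(epo_k, epo_k1, epo_k2):
    """Hypothetical stand-in for FNN: monotonically increasing with saturation."""
    weighted = 0.2 * epo_k + 0.5 * epo_k1 + 0.3 * epo_k2
    return 1.5 * (1.0 - math.exp(-weighted / 10.0)) - 0.5


def simulate(hb, past_doses, dose, horizon):
    """Predict Hb over `horizon` steps with `dose` held constant.

    Uses Hb(k+1) = Hb(k) + F(EPO(k), EPO(k-1), EPO(k-2)).
    past_doses is the pair (EPO(k-1), EPO(k-2)).
    """
    epo_k1, epo_k2 = past_doses
    trajectory = []
    for _ in range(horizon):
        hb = hb + stand_in_response(dose, epo_k1, epo_k2)
        trajectory.append(hb)
        epo_k1, epo_k2 = dose, epo_k1  # shift the dose history by one step
    return trajectory


def choose_dose(hb, past_doses, candidates, horizon=3):
    """Pick the candidate dose with the smallest summed squared deviation."""
    def cost(dose):
        return sum((h - TARGET_HB) ** 2
                   for h in simulate(hb, past_doses, dose, horizon))
    return min(candidates, key=cost)
```

In a receding-horizon scheme only the first dose would be administered, and the search would be repeated once the next Hb measurement arrives.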
Model-based control: MPC clinical trial setup
• Trial population:
  • 60 patients:
    • 30 controls (dosed by physicians) / 30 treatment (dosed by MPC)
    • 45 African-American / 15 Caucasian
    • 35 males / 25 females
    • Average age 58, min 21, max 84
• Trial length:
  • 8 months
  • 2 months “wash-out” period / 6 months for outcome analysis
• Treatment goal:
  • Maintain Hb at 11.5 g/dL
  • Performance measure: mean absolute deviation from 11.5
Model-based control: MPC clinical trial results (thus far)
[Figure: mean |11.5 - Hb| by month]
Model-based control: Open problems
• Simulating MPC
  • How do we accurately represent the mismatch between the model and the patient?
  • How do we effectively simulate adverse events?
• Measuring success
  • We try to individualize the treatment, yet we use a mean performance measure. What are the alternatives?
  • Individual performance measures (e.g. within-subject standard deviation of Hb)?
  • How do we eliminate the influence of Hb changes due to adverse events on the performance measure?
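The two kinds of performance measure mentioned above can be compared on a toy example. The sketch assumes Hb readings are stored per patient in a dictionary; the function names and the numbers are made up for illustration.

```python
from statistics import mean, stdev

TARGET_HB = 11.5  # g/dL


def mean_absolute_deviation(hb_by_patient):
    """Pooled mean |11.5 - Hb| over all patients and all readings."""
    deviations = [abs(TARGET_HB - hb)
                  for series in hb_by_patient.values()
                  for hb in series]
    return mean(deviations)


def within_subject_stdev(hb_by_patient):
    """Sample standard deviation of Hb, computed separately for each patient."""
    return {pid: stdev(series) for pid, series in hb_by_patient.items()}


# Made-up readings: patient A is stable on target, patient B oscillates around it.
readings = {"A": [11.5, 11.5, 11.5], "B": [10.5, 11.5, 12.5]}
pooled = mean_absolute_deviation(readings)
per_patient = within_subject_stdev(readings)
```

The pooled measure gives one number for the group, while the per-patient standard deviation separates the stable patient from the oscillating one. Note that a patient held constantly off target would also have a standard deviation of zero, so neither measure is sufficient alone.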
Model-free control: Reinforcement Learning
• Drug administration in chronic conditions is a trial-and-error control process that resembles reinforcement learning:

  disease symptoms – initial state (s0)
  (standard) initial dose – action (a0)
  k = 1
  Repeat (infinitely)
    evaluate patient (remission/progression/side effects) – new state (sk), reward (rk)
    adjust dosing strategy – update state-action table/function (Qk), extract policy (πk)
    administer new dose – action (ak)
    k = k + 1
  End
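The loop on this slide maps onto tabular Q-learning. The sketch below is a minimal illustration, not the agent used in the study: the discretized Hb levels, the deterministic `toy_patient`, the reward (negative distance from 11.5 g/dL), and the learning parameters are all hypothetical.

```python
import random

HB_LEVELS = [9.5, 10.5, 11.5, 12.5, 13.5]  # hypothetical discretized states (g/dL)
DOSE_SHIFT = [-1, 0, 1]                    # hypothetical effect of each dose level on the state index
TARGET_HB = 11.5


def toy_patient(state, action):
    """Deterministic stand-in for the patient: returns (new state, reward)."""
    next_state = min(max(state + DOSE_SHIFT[action], 0), len(HB_LEVELS) - 1)
    reward = -abs(HB_LEVELS[next_state] - TARGET_HB)
    return next_state, reward


def q_update(q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """Standard Q-learning update of a single table entry (in place)."""
    target = reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])


def greedy_policy(q):
    """Extract the policy: the highest-valued action for each state."""
    return [max(range(len(row)), key=row.__getitem__) for row in q]


def run(steps=2000, epsilon=0.2, seed=0):
    """Evaluate, update Q, extract policy, act; repeated for `steps` visits."""
    rng = random.Random(seed)
    q = [[0.0] * len(DOSE_SHIFT) for _ in HB_LEVELS]
    state = 0  # initial state s0: lowest Hb level
    for _ in range(steps):
        if rng.random() < epsilon:                   # occasional exploratory dose
            action = rng.randrange(len(DOSE_SHIFT))
        else:                                        # otherwise follow the current policy
            action = greedy_policy(q)[state]
        next_state, reward = toy_patient(state, action)
        q_update(q, state, action, reward, next_state)
        state = next_state
    return q
```

Calling `greedy_policy(run())` returns one dose level per Hb level. In a real setting the exploratory doses would have to be constrained for patient safety, which this toy loop ignores.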
Model-free control: Q-Learning simulation schematic diagram
[Diagram blocks: Q-LEARNING AGENT with POLICY (π), expressed as rules Ri: IF Hb = Hbi THEN EPO = EPOi; PATIENT SIMULATOR (subpopulation model), Hb(k+1) = F(Hb(k), EPO(k), IRON(k)). Signals: Hb (state s), EPO (action a), IRON (disturbance).]
Model-free control: Reward function
[Figure: reward function plots; the only surviving label is the target value 11.5]
Model-free control: Q-table update
• Dose-response relationship (EPO to ΔHb) is monotonically increasing with saturation (Uehlinger et al. 1992).
• Let’s update multiple entries in the Q-table at a time:
  • If Hb(k) < 11.5 and Hb(k+1) ≤ Hb(k), or Hb(k) = 11.5 and Hb(k+1) < Hb(k), then update Q(s, a) for all s ≤ Hb(k) and all a ≤ EPO(k).
  • If Hb(k) > 11.5 and Hb(k+1) ≥ Hb(k), or Hb(k) = 11.5 and Hb(k+1) > Hb(k), then update Q(s, a) for all s ≥ Hb(k) and all a ≥ EPO(k).
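The two multi-entry rules can be sketched as a function that returns which Q-table entries are selected. The comparison operators missing from the first rule are read here as ≤, by symmetry with the second rule; that reading, the function name, and the discretization grids are assumptions. The slide does not say what value is written to the selected entries, so the sketch only selects them.

```python
TARGET_HB = 11.5  # g/dL


def entries_to_update(hb_k, hb_next, epo_k, hb_levels, dose_levels):
    """Return the (state index, action index) pairs selected by the two rules.

    hb_levels, dose_levels : discretization grids of the Q-table (hypothetical)
    """
    # Rule 1: Hb was at or below target and the dose failed to raise it.
    dose_too_low = ((hb_k < TARGET_HB and hb_next <= hb_k)
                    or (hb_k == TARGET_HB and hb_next < hb_k))
    # Rule 2: Hb was at or above target and the dose failed to lower it.
    dose_too_high = ((hb_k > TARGET_HB and hb_next >= hb_k)
                     or (hb_k == TARGET_HB and hb_next > hb_k))

    if dose_too_low:
        # By monotonicity, no smaller dose would have helped at any lower Hb.
        return [(i, j)
                for i, s in enumerate(hb_levels) if s <= hb_k
                for j, a in enumerate(dose_levels) if a <= epo_k]
    if dose_too_high:
        # By monotonicity, no larger dose would have helped at any higher Hb.
        return [(i, j)
                for i, s in enumerate(hb_levels) if s >= hb_k
                for j, a in enumerate(dose_levels) if a >= epo_k]
    return []  # neither rule fires; only the visited entry would be updated as usual
```

Because one observation updates a whole block of the table, the agent needs fewer patient visits to rule out clearly unsuitable doses.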
Model-free control: Q-Learning simulated clinical trial
• Trial population:
  • 200 individuals with various degrees of response to EPO
  • 100 distinct responders / 100 distinct non-responders
  • In the first run, all individuals are dosed by AMP
  • In the second run, all individuals are dosed by a policy updated on-line by Q-learning
• Trial length:
  • 24 months
• Treatment goal:
  • Drive Hb to 11.5 g/dL and maintain it there
  • Performance measure: mean absolute deviation from 11.5
Model-free control: Q-Learning simulation results
[Figure: mean |11.5 - Hb| by month]
Conclusions and open problems
• We believe that we are on a good path to successfully individualize anemia management using the presented techniques. However, we need to address the following:
  • How do we produce reliable dose-response models that perform well on under-represented data instances?
  • What performance measure do we need to use in order to adequately evaluate the success of an individualized treatment?
Acknowledgments
• UofL Division of Nephrology: George R Aronoff, Michael E Brier, Alfred A Jacobs
• UofL Dept Electrical and Computer Engineering: Mehmet K Muezzinoglu, Jacek M Zurada
Michael E Brier has been sponsored by a Department of Veterans Affairs Merit Review Grant. Adam E Gaweda is sponsored by NIDDK (1K25DK072085-01A2).