Model Evaluation and Selection via Prediction
Real contributors • Lu Tian (Northwestern University) • Tianxi Cai (Harvard University) • Hajime Uno (Harvard University, DFCI)
Outline • Background and motivation • Developing and evaluating prediction rules based on a set of markers for non-censored outcomes and for censored event-time outcomes • Evaluating the incremental value of a biomarker over the entire population and over various sub-populations • Incorporating the patient-level precision of the prediction: prediction intervals/sets • Remarks
Regression modeling, tree classification, etc.? • Association • Prediction
Model checking? • Goodness-of-fit test (lack-of-fit test)? Is the p-value a good metric for measuring lack of fit? • Quantitative approach? R-squared? Likelihood-ratio type? Need a heuristically interpretable distance function? (cost-benefit) • Every model is an approximation to the truth?
Background and Motivation • Personalized medicine: using information about a person’s biological and genetic makeup to tailor strategies for the prevention, detection and treatment of disease • Important step: develop prediction rules that can accurately predict the disease outcome or treatment response [Figure: Diagnosis, Prognosis, Treatment]
Background and Motivation • Accurate prediction of disease outcome and treatment response, however, is a complex and difficult task • Developing prediction rules involves: identifying important predictors; evaluating the accuracy of the prediction; evaluating the incremental value of new markers [Diagram: Predictor Z (subject characteristics, biomarkers, genetic markers); Outcome Y (disease status, time to event, treatment response)]
Background and Motivation: AIDS Clinical Trial ACTG320 • Study objective: to compare a 3-drug regimen (n=579; Zidovudine + Lamivudine + Indinavir) with a 2-drug regimen (n=577; Zidovudine + Lamivudine) • Identify biomarkers for predicting treatment response • How well can we predict the treatment response? • Is RNA needed? [Diagram: Predictor Z (Age, CD4 at weeks 0 and 8, RNA at weeks 0 and 8); Outcome Y (CD4 at week 24)]
Background and Motivation: Is RNA needed? • Association: are the coefficients for RNA significant? [Table: regression analysis of CD4 at week 24 on the predictors]
Background and Motivation: AIDS Clinical Trial • Coefficient for RNA at week 8 is highly significant • Is RNA needed for a more precise prediction of responses? [Table: regression coefficients]
Background and Motivation: Is RNA needed? • Does adding RNA improve the prediction? • Prediction rule: based on regression models • The distance between Ŷ and Y? [Diagram: Z = predictors; prediction procedure; Y = CD4 at week 8]
Developing Prediction Rules Based on a Set of Markers • Regression approach to approximate Y | Z • Non-censored outcome: linear regression • Survival outcome: proportional hazards model (example: Framingham Risk Score) or time-specific prediction models • Regression modeling as a vehicle: the procedure has to be valid when the imposed statistical model is not the true model!
Developing and Evaluating Prediction Rules • Predict Y with Z by Ŷ(Z), based on the prediction model • Evaluate the performance of the prediction by the average “distance” between Ŷ(Z) and Y • The utility or cost of predicting Y as Ŷ(Z) is the “distance” between the two; its average over the population is denoted D • Examples: absolute prediction error, E|Y − Ŷ(Z)|; total “cost” of risk stratification [Table: costs by outcome, Y = 0 and Y = 1]
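The absolute prediction error named among the examples above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the working model is a least-squares linear fit, the data are simulated, and the function names (`fit_linear_rule`, `predict`, `apparent_abs_error`) are hypothetical and do not come from the slides.

```python
import numpy as np

def fit_linear_rule(Z, y):
    """Least-squares fit of the working model E(Y | Z) = b0 + b'Z."""
    X = np.column_stack([np.ones(len(y)), Z])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def predict(beta, Z):
    """Prediction rule Y_hat(Z) implied by the fitted working model."""
    X = np.column_stack([np.ones(len(Z)), Z])
    return X @ beta

def apparent_abs_error(Z, y):
    """Average |Y - Y_hat(Z)| over the same data used to fit the rule."""
    beta = fit_linear_rule(Z, y)
    return float(np.mean(np.abs(y - predict(beta, Z))))

# Simulated data (an assumption for illustration only).
rng = np.random.default_rng(0)
Z = rng.normal(size=(200, 3))
y = 1.0 + Z @ np.array([2.0, 0.0, -1.0]) + rng.normal(size=200)
d_hat = apparent_abs_error(Z, y)
print(f"apparent absolute prediction error: {d_hat:.3f}")
```

Because the rule is fitted and evaluated on the same observations, this estimate tends to be optimistic.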
Evaluating and Comparing Prediction Rules • The performance of the prediction model/rule Ŷ(Z) can be estimated by the apparent error, the sample average of the “distance” between Ŷ(Zi) and Yi • Prediction model/rule comparison: prediction with E(Y | Z) = g1(a’Z) vs E(Y | W) = g2(b’W) • Compare two models/rules by comparing their average distances D1 and D2
Variability in the Estimated Prediction Performance Measures • Variability in the prediction errors: estimate = 50 with SE = 1? Or SE = 50? • Inference about D and Δ = D1 − D2 • Confidence intervals based on large-sample approximations to the distribution of the estimators of D and Δ
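To make the question of variability concrete, here is a rough Python sketch that attaches a standard error and a 95% interval to the difference in absolute prediction error between two rules. It substitutes a plain nonparametric bootstrap for the large-sample approximation mentioned above, so it is not the authors' procedure. The data, the linear working models and the names `abs_error` and `bootstrap_delta` are all assumptions.

```python
import numpy as np

def abs_error(Z, y):
    """Apparent absolute prediction error of a least-squares linear rule."""
    X = np.column_stack([np.ones(len(y)), Z])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return float(np.mean(np.abs(y - X @ beta)))

def bootstrap_delta(Z1, Z2, y, n_boot=500, seed=0):
    """Point estimate, bootstrap SE and 95% percentile interval for D1 - D2."""
    rng = np.random.default_rng(seed)
    n = len(y)
    delta_hat = abs_error(Z1, y) - abs_error(Z2, y)
    draws = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample subjects with replacement
        draws[b] = abs_error(Z1[idx], y[idx]) - abs_error(Z2[idx], y[idx])
    se = float(draws.std(ddof=1))
    lo, hi = np.percentile(draws, [2.5, 97.5])
    return delta_hat, se, (float(lo), float(hi))

# Simulated example: rule 1 uses the first marker only, rule 2 uses both.
rng = np.random.default_rng(1)
n = 300
W = rng.normal(size=(n, 2))
y = W @ np.array([1.0, 2.0]) + rng.normal(size=n)
delta_hat, se, ci = bootstrap_delta(W[:, :1], W, y)
print(f"delta = {delta_hat:.2f}, SE = {se:.2f}, 95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
```

An interval for Δ that excludes zero suggests the extra marker improves prediction. Both rules are refitted inside every resample so that the variability from estimating the regression coefficients is included.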
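Bias Correction • Bias issue in the apparent error type estimators • Bias correction via cross-validation: partition the data into a training set Tk and a validation set Vk; for each partition, obtain the fitted prediction rule based on observations in Tk and its prediction error based on observations in Vk; average over the partitions to obtain the cross-validated estimator • The apparent and cross-validated estimators have the same limiting distribution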
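The cross-validation steps above can be sketched as follows. This is a minimal illustration that assumes K-fold random partitions, a least-squares linear working model and absolute error as the distance; the slides do not specify how the partitions Tk, Vk are formed, and the function names are hypothetical.

```python
import numpy as np

def apparent_abs_error(Z, y):
    """Absolute prediction error evaluated on the data used for fitting."""
    X = np.column_stack([np.ones(len(y)), Z])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return float(np.mean(np.abs(y - X @ beta)))

def cv_abs_error(Z, y, n_folds=5, seed=0):
    """Cross-validated absolute prediction error (equal-weight fold average)."""
    rng = np.random.default_rng(seed)
    n = len(y)
    folds = np.array_split(rng.permutation(n), n_folds)
    X = np.column_stack([np.ones(n), Z])
    errors = []
    for k in range(n_folds):
        val = folds[k]  # validation set V_k
        train = np.concatenate([folds[j] for j in range(n_folds) if j != k])  # T_k
        beta, *_ = np.linalg.lstsq(X[train], y[train], rcond=None)  # fit on T_k
        errors.append(np.mean(np.abs(y[val] - X[val] @ beta)))      # evaluate on V_k
    return float(np.mean(errors))

# Simulated data with many noise predictors, where the apparent error is optimistic.
rng = np.random.default_rng(2)
Z = rng.normal(size=(60, 10))
y = 2.0 * Z[:, 0] + rng.normal(size=60)
app = apparent_abs_error(Z, y)
cv5 = cv_abs_error(Z, y)
print(f"apparent: {app:.3f}, 5-fold cross-validated: {cv5:.3f}")
```

With few observations relative to the number of predictors, the cross-validated estimate is typically noticeably larger than the apparent one.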
Example: AIDS Clinical Trial • Objective: identify biomarkers to predict the treatment response • Outcome: Y = CD4 at week 24 • Predictors Z: Age, CD4 at week 0, CD4 at week 8, RNA at week 0, RNA at week 8 • Working model: E(Y|Z) = β’Z
Example: AIDS Clinical Trial, Incremental Value of RNA [Table: estimates, standard errors (marked *) and 95% confidence intervals]
Example: Breast Cancer Gene Expression Study • Objective: construct a new classifier that can accurately predict future disease outcome • van ’t Veer et al (2002) established a classifier based on a 70-gene profile • Good- or poor-prognosis signature based on correlation with the previously determined average profile in tumors from patients with good prognosis • Classify subjects as good prognosis if gene score > cut-off, poor prognosis if gene score < cut-off • van de Vijver et al (2002) evaluated the accuracy of this classifier using hazard ratios and signature-specific Kaplan-Meier curves
Example: Breast Cancer Gene Expression Study • Data consist of 295 subjects • Outcome T: time to death • Predictors: lymph-node status, estrogen receptor status, gene score • We are interested in constructing prediction rules to identify subjects who would survive t years, Y = I(T ≥ t) = 1, and in evaluating the incremental value of the gene score
Evaluating the Prediction Rule Based on Various Accuracy Measures • For a future patient with T0 and Z0, we predict Y0 • Classification accuracy measures: sensitivity, specificity • Prediction accuracy measures: positive and negative predictive values (PPV, NPV)
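As a small illustration of the accuracy measures listed above, the sketch below computes sensitivity, specificity and the two predictive values from binary outcomes and binary predictions. It assumes every outcome is fully observed; censored event times would need additional handling that is not shown. The function name and the toy data are hypothetical.

```python
import numpy as np

def accuracy_measures(y_true, y_pred):
    """Sensitivity, specificity, PPV and NPV for binary 0/1 arrays."""
    y_true = np.asarray(y_true).astype(bool)
    y_pred = np.asarray(y_pred).astype(bool)
    tp = np.sum(y_true & y_pred)    # true positives
    tn = np.sum(~y_true & ~y_pred)  # true negatives
    fp = np.sum(~y_true & y_pred)   # false positives
    fn = np.sum(y_true & ~y_pred)   # false negatives
    return {
        "sensitivity": tp / (tp + fn),  # P(pred = 1 | Y = 1)
        "specificity": tn / (tn + fp),  # P(pred = 0 | Y = 0)
        "ppv": tp / (tp + fp),          # P(Y = 1 | pred = 1)
        "npv": tn / (tn + fn),          # P(Y = 0 | pred = 0)
    }

# Toy data: 4 subjects with Y = 1 and 6 with Y = 0.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]
m = accuracy_measures(y_true, y_pred)
print(m)
```

Sensitivity and specificity condition on the true outcome, whereas PPV and NPV condition on the prediction, so the latter pair depends on how common the outcome is in the population.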
Example: Breast Cancer Data, Predicting 10-year Survival [Figure legend: Naïve; Clinical; Clinical + Gene; van de Vijver]
Example: Breast Cancer Data • To compare Model II: g(a + Node + ER) with Model III: g(a + Node + ER + Gene) • Choose cut-off values for each model to achieve sensitivity (SE) = 69%, an attainable value for Model II; then • Model II: specificity (SP) = 0.45, PPV = 0.35, NPV = 0.77 • Model III: SP = 0.75, PPV = 0.54, NPV = 0.85 • 95% CI for the difference in SP: [0.11, 0.45], PPV: [0.01, 0.24], NPV: [0.06, 0.19]
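Fixing the sensitivity and then reading off the other measures, as in the comparison above, can be sketched like this. The snippet picks the largest cut-off whose sensitivity reaches a target and reports specificity, PPV and NPV at that cut-off. The scores and outcomes are a made-up toy example, not the breast cancer data, and the function names are hypothetical.

```python
import numpy as np

def cutoff_for_sensitivity(score, y, target):
    """Largest cut-off c such that the rule 'score >= c' has sensitivity >= target."""
    pos = np.sort(score[y == 1])[::-1]  # scores of subjects with Y = 1, descending
    k = max(int(np.ceil(target * len(pos) - 1e-9)), 1)  # positives to capture
    return pos[k - 1]

def measures_at_cutoff(score, y, cutoff):
    """Sensitivity, specificity, PPV and NPV of the rule 'score >= cutoff'."""
    pred = score >= cutoff
    truth = y == 1
    tp = np.sum(pred & truth)
    fp = np.sum(pred & ~truth)
    fn = np.sum(~pred & truth)
    tn = np.sum(~pred & ~truth)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }

# Toy scores (higher means more likely Y = 1) and outcomes.
score = np.array([0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05])
y = np.array([1, 1, 0, 1, 0, 1, 0, 0, 0, 0])
c = cutoff_for_sensitivity(score, y, target=0.69)
res = measures_at_cutoff(score, y, c)
print(c, res)
```

Applying the same target sensitivity to the scores from two competing models puts their specificity, PPV and NPV on a comparable footing.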
Prediction Interval: Accounting for the Precision of the Prediction • Based on a prediction model, predict the response and summarize the corresponding population-average accuracy • What if the population-average accuracy of 70% is not satisfactory? How to achieve 90% accuracy? • What if Ŷ(Z0) predicts Y0 precisely for certain Z0 but fails to predict Y0 accurately for others? • Account for the precision of the prediction? Identify patients who would need further assessment?
Classic Rule • Risk of death < 0.50: survivor {Y=0} • Risk of death ≥ 0.50: non-survivor {Y=1} • Predicted risk = 0.04: predict {0} • Predicted risk = 0.51: predict {1}
Prediction Interval • To account for patient-level prediction error, one may instead predict a set of values that contains Y0 with a prespecified probability • The optimal interval for the population with Z0 is built from the estimated conditional density function of Y0 given Z0
Example: Breast Cancer Study • Data: 295 patients • Response: 10-year survival • Predictors: lymph-node status, estrogen receptor status, gene score • Model • Possible prediction sets: {}, {0}, {1}, {0,1} • Classic prediction: considers {0}, {1} only.
90% Prediction Sets • Predicted risk = 0.04: 90% prediction set {0} • Predicted risk = 0.51: 90% prediction set {0,1}
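The two 90% prediction sets above, for predicted risks 0.04 and 0.51, can be reproduced with a simple highest-probability construction for a binary outcome. This is a simplified sketch, not necessarily the authors' procedure: it treats the predicted risk as the true conditional probability of Y = 1, and the function name `prediction_set` is hypothetical. It also never returns the empty set, which the slides list as a possible prediction set.

```python
def prediction_set(risk, level=0.90):
    """Smallest set of outcomes in {0, 1} with estimated probability >= level."""
    probs = {0: 1.0 - risk, 1: risk}
    chosen, total = set(), 0.0
    # Add outcomes from most to least likely until the target level is reached.
    for outcome in sorted(probs, key=probs.get, reverse=True):
        if total >= level:
            break
        chosen.add(outcome)
        total += probs[outcome]
    return chosen

low_risk_set = prediction_set(0.04)
borderline_set = prediction_set(0.51)
print(low_risk_set, borderline_set)
```

A patient whose set is {0,1} is one for whom the model cannot commit to either outcome at the 90% level, which flags a candidate for further assessment.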
Example: Breast Cancer Study, Prediction Sets Based on Clinical + Gene Score [Figure values: (0%) 4%; (63%) 39%; (37%) 57%]
Remarks • Proper choice of the accuracy/cost measure • Classification accuracy vs predictive values • Utility function: what is the consequence of predicting a subject with outcome Y as Ŷ? • With an expensive or invasive marker: should it be applied to the entire population? Is it helpful for a certain sub-population? Should the cost of the marker be considered when evaluating its value?