Residuals
• Residuals are used to investigate the lack of fit of a model to a given subject.
• For Cox regression, there’s no easy analog to the usual “observed minus predicted” residual of linear regression.
• Martingale …
• Deviance ...
• Schoenfeld
Schoenfeld residuals
• Schoenfeld (1982) proposed the first set of residuals for use with Cox regression packages.
• Schoenfeld D. Partial residuals for the proportional hazards regression model. Biometrika 1982; 69(1): 239-241.
• Schoenfeld residuals are not defined for censored individuals.
• Instead of a single residual for each individual, there is a separate residual for each individual for each covariate.
Schoenfeld residuals
• The Schoenfeld residual is defined as the covariate value for the individual who failed minus its expected value.
• This yields a residual for each individual who failed, for each covariate.
• The expected value of the covariate at time t is a weighted average of the covariate over the risk set at t, with each individual weighted by his or her likelihood of failure at t.
The partial likelihood (the function describing the failure pattern) is the product of J terms, one for each observed failure time.
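The product of J terms and the residual definition can be written out as follows. This is the standard textbook form, assuming no tied failure times. The notation is an assumption introduced here, not taken from the slides: x_(j) is the covariate vector of the person who fails at the j-th failure time t_j, and R(t_j) is the risk set at t_j.

```latex
% Partial likelihood: one term per observed failure time t_1, ..., t_J
L(\beta) = \prod_{j=1}^{J}
  \frac{\exp\{\beta^{\top} x_{(j)}\}}
       {\sum_{i \in R(t_j)} \exp\{\beta^{\top} x_i\}}

% Weight of individual i in the risk set at t_j
p_i(t_j) = \frac{\exp\{\beta^{\top} x_i\}}
                {\sum_{l \in R(t_j)} \exp\{\beta^{\top} x_l\}}

% Schoenfeld residual for covariate k at failure time t_j:
% observed covariate of the failure minus its weighted average over the risk set
r_{jk} = x_{(j)k} - \sum_{i \in R(t_j)} p_i(t_j)\, x_{ik}
```

Each term of the product is the weight p_i(t_j) of the person who actually failed, so the same weights drive both the likelihood and the expected covariate value.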
Example
5 people are left in our risk set at event time = 7 months:
• Female 55-year-old smoker
• Male 45-year-old non-smoker
• Female 67-year-old smoker
• Male 58-year-old smoker
• Male 70-year-old non-smoker
The 55-year-old female smoker is the one who has the event…
Example
Based on our model, we can calculate a predicted probability of failing at time 7 for each person in the risk set (call it “p-hat”):
• Female 55-year-old smoker: p-hat = .10
• Male 45-year-old non-smoker: p-hat = .05
• Female 67-year-old smoker: p-hat = .30
• Male 58-year-old smoker: p-hat = .20
• Male 70-year-old non-smoker: p-hat = .30
Thus, the expected value for the AGE of the person who failed is:
55(.10) + 45(.05) + 67(.30) + 58(.20) + 70(.30) = 60.45, or about 60
And the Schoenfeld residual is: 55 - 60.45 = -5.45, or about -5
Note: in a fitted model these weights sum to 1 over the risk set; the illustrative values here sum to .95.
Example
Based on our model, we can calculate a predicted probability of failing at time 7 for each person in the risk set (call it “p-hat”):
• Female 55-year-old smoker: p-hat = .10
• Male 45-year-old non-smoker: p-hat = .05
• Female 67-year-old smoker: p-hat = .30
• Male 58-year-old smoker: p-hat = .20
• Male 70-year-old non-smoker: p-hat = .30
With GENDER coded 0 = female and 1 = male, the expected value for the GENDER of the person who failed is:
0(.10) + 1(.05) + 0(.30) + 1(.20) + 1(.30) = .55
And the Schoenfeld residual is: 0 - .55 = -.55
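The two calculations in this example can be reproduced in a few lines of base R. This is a minimal sketch: the data frame and variable names (risk_set, p_hat, failed) are hypothetical, and the 0/1 coding of gender (female = 0, male = 1) follows the sums used in the example.

```r
# Risk set at event time = 7 months, with values taken from the example.
risk_set <- data.frame(
  age   = c(55, 45, 67, 58, 70),
  male  = c(0, 1, 0, 1, 1),               # gender coded 0 = female, 1 = male
  p_hat = c(0.10, 0.05, 0.30, 0.20, 0.30)
)
failed <- 1  # row of the person who had the event (55-year-old female smoker)

# Expected covariate value: sum of covariate * weight over the risk set.
expected_age  <- sum(risk_set$age  * risk_set$p_hat)
expected_male <- sum(risk_set$male * risk_set$p_hat)

# Schoenfeld residual: observed covariate of the failure minus expected value.
resid_age  <- risk_set$age[failed]  - expected_age
resid_male <- risk_set$male[failed] - expected_male

print(c(expected_age = expected_age, resid_age = resid_age))
print(c(expected_male = expected_male, resid_male = resid_male))
```

The unrounded expected age is 60.45, which the example rounds to 60, so the unrounded age residual is -5.45.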
Schoenfeld residuals
• Since the Schoenfeld residuals are, in principle, independent of time, a plot that shows a non-random pattern against time is evidence of violation of the PH assumption.
• Plot Schoenfeld residuals against time to evaluate the PH assumption.
• Regress Schoenfeld residuals against time to test for independence between residuals and time.
TEST OF PH ASSUMPTION
• If the effect of an independent variable meets the proportional hazards assumption, the smoothed values of a quantity called the scaled Schoenfeld residuals are roughly horizontal when plotted against survival time.
• Grambsch and Therneau (1994) demonstrated that a test of non-zero slope in a weighted regression of the residuals on time can test for non-proportional hazards.
• Grambsch PM, Therneau TM. Proportional hazards tests and diagnostics based on weighted residuals. Biometrika 1994; 81: 515-526.
Summary of the many ways to evaluate PH assumption…
1. Examine log(-log(S(t))) plots
• PH assumption is supported by parallel lines and refuted by lines that cross or nearly cross
• Must use categorical predictors or categories of a continuous predictor
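A log(-log(S(t))) plot of this kind can be drawn with the survival package. The sketch below uses the package's built-in lung data and its sex variable purely as a stand-in; the dataset and the choice of predictor are assumptions, not part of the slides.

```r
library(survival)

# Kaplan-Meier curves by sex in the built-in lung data (assumed stand-in dataset).
km <- survfit(Surv(time, status) ~ sex, data = lung)

# fun = "cloglog" plots log(-log(S(t))) against time on a log scale; roughly
# parallel curves are consistent with proportional hazards for this predictor.
plot(km, fun = "cloglog", col = c("blue", "red"),
     xlab = "Time (log scale)", ylab = "log(-log(S(t)))")
legend("topleft", legend = c("sex = 1", "sex = 2"),
       col = c("blue", "red"), lty = 1)
```

For a continuous predictor such as age, one would first cut it into categories before fitting the curves.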
Summary of the many ways to evaluate PH assumption…
2. Include interaction with time in the model
• PH assumption is supported by a non-significant interaction coefficient and refuted by a significant interaction coefficient
• Retaining the interaction term in the model corrects for the violation of PH
• Don’t complicate your model in this way unless it’s absolutely necessary!
Testing Proportional Hazards
• PH model: λ(t) = λ0(t) exp{β1 age + β2 drug}
• Model with time interactions: λ(t) = λ0(t) exp{β1 age + β2 drug + β3 age*ln(t) + β4 drug*ln(t)}
• Look at the p-values associated with β3 and β4 (Wald tests)
• Do a partial likelihood ratio test comparing the two models
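The partial likelihood ratio test amounts to a short calculation once both models have been fitted. In the sketch below the two log partial likelihood values are hypothetical numbers chosen only for illustration; in practice they would come from the fitted models.

```r
# Hypothetical log partial likelihoods, for illustration only.
loglik_ph          <- -120.4  # model with age and drug
loglik_interaction <- -117.1  # model adding age*ln(t) and drug*ln(t)

# Likelihood ratio statistic: twice the gain in log partial likelihood.
lr_stat <- 2 * (loglik_interaction - loglik_ph)

# Two extra coefficients (beta3 and beta4) give 2 degrees of freedom.
df <- 2
p_value <- pchisq(lr_stat, df = df, lower.tail = FALSE)

print(c(lr_stat = lr_stat, df = df, p_value = p_value))
```

A small p-value here would suggest that at least one of the time interactions is needed, that is, that PH is violated for age, drug, or both.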
Summary of the many ways to evaluate PH assumption…
3. Plot Schoenfeld residuals
• PH assumption is supported by a random pattern with time and refuted by a non-random pattern
Summary of the many ways to evaluate PH assumption…
4. Regress Schoenfeld residuals against time to test for independence between residuals and time
• PH assumption is supported by a non-significant relationship between residuals and time, and refuted by a significant relationship
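The idea of regressing residuals on time can be illustrated with ordinary least squares in base R. The event times and residuals below are made-up numbers used only to show the mechanics; formal tests such as the Grambsch and Therneau test work with scaled residuals and a weighted regression rather than this plain fit.

```r
# Made-up event times and Schoenfeld residuals for one covariate (illustration only).
event_time <- c(2, 5, 7, 11, 14, 20)
schoenfeld <- c(-5.4, 3.1, -1.2, 4.0, -2.6, 0.8)

# Simple linear regression of the residuals on time.
fit <- lm(schoenfeld ~ event_time)

# The slope row holds the estimate and its p-value; a slope near zero
# (non-significant) is consistent with proportional hazards.
print(summary(fit)$coefficients)

# Correlation between residuals and time, as a companion summary.
print(cor(event_time, schoenfeld))
```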
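Checking the proportional hazards assumption of the Cox model using Schoenfeld residuals
R code:
library(survival)
Cox.resid <- cox.zph(Cox.fit)   # Cox.fit is a previously fitted coxph model
Cox.resid                       # print the test table
plot(Cox.resid)                 # smoothed scaled Schoenfeld residuals against time
R output:
          rho  chisq      p
mut     0.105  0.798  0.372
sex     0.130  1.139  0.286
GLOBAL     NA  2.025  0.363
None of the p-values is small, so there is no evidence against the PH assumption for mut, for sex, or globally.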
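The snippet above relies on a model object, Cox.fit, that is fitted elsewhere. A self-contained version might look like the following sketch, which fits a Cox model to the survival package's built-in lung data; the dataset and the covariates age and sex are assumptions chosen only so the code runs on its own, and they are not the data behind the output shown above.

```r
library(survival)

# Fit a Cox model on the built-in lung data (assumed stand-in for Cox.fit).
fit <- coxph(Surv(time, status) ~ age + sex, data = lung)

# Test the PH assumption for each covariate and globally.
zph <- cox.zph(fit)
print(zph)

# One panel per covariate: a roughly flat smooth supports PH.
plot(zph)
```

Depending on the installed version of the survival package, the printed table may show different columns from the rho, chisq and p columns in the output above.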