Survival analysis Brian Healy, PhD
Previous classes • Regression • Linear regression • Multiple regression • Logistic regression
What are we doing today? • Survival analysis • Kaplan-Meier curve • Dichotomous predictor • How to interpret results • Cox proportional hazards • Continuous predictor • How to interpret results
Big picture • In medical research, we often confront continuous, ordinal or dichotomous outcomes • One other common outcome is time to event (survival time) • Clinical trials often measure time to death or time to relapse • We would like to estimate the survival distribution
Definitions • Survival time: time to event • Survival function: probability that the survival time is greater than a specific value, S(t) = P(T > t) • Hazard function: risk of having the event at time t among those still at risk, λ(t) = # who had event / # at risk • These two functions are mathematically related
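One way to write the relationship between the two functions, assuming the discrete-time definition of the hazard used on this slide (events divided by number at risk at each event time t_j), is the product form below. The symbols d_j (number of events at t_j) and n_j (number at risk just before t_j) are notation introduced here for illustration.

```latex
\lambda(t_j) = \frac{d_j}{n_j},
\qquad
S(t) = \prod_{t_j \le t} \bigl(1 - \lambda(t_j)\bigr)
```

Read this as: to survive past time t, a patient must avoid the event at every event time up to t.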
Example • An important marker of disease activity in MS is the occurrence of a relapse • A relapse is the presence of new symptoms that last for at least 24 hours • Many clinical trials in MS have demonstrated that treatments increase the time until the next relapse • How does the time to next relapse look in the clinic? • What is the distribution of survival times?
Kaplan-Meier curve • Each drop in the curve represents an event
Survival data • To create this curve, patients placed on treatment were followed and the time of the first relapse on treatment was recorded • This is the survival time • If everyone had an event, some of the methods we have already learned could be applied • Often, not everyone has an event • Loss to follow-up • End of study
Censoring • The patients who did not have the event are considered censored • We know that they survived a specific amount of time, but do not know the exact time of the event • We believe that the event would have happened if we observed them long enough • These patients provide some information, but not complete information
Censoring • How could we account for censoring? • Ignore it and say event occurred at time of censoring • Incorrect because this is almost certainly not true • Remove patient from analysis • Potential bias and loss of power • Survival analysis • Our objective is to estimate the survival distribution of patients in the presence of censoring
Example • For simplicity, let’s focus on 10 patients whose time to relapse is provided here • We assume that no one is censored initially • We would like to estimate S(t) and λ(t)
What do we see from our curve? • Drops in the curve only occur at time of event • Between events, the estimated survival remains constant • What is the size of the drops?
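Calculating size of drop • The hazard at each time point is λ(t) = # events / # at risk • If no event, hazard = 0 • The estimated survival is the previous estimate multiplied by (1 − hazard): S(t_j) = S(t_{j-1}) × (1 − λ(t_j)), starting from S = 1
Hazard    Estimated survival
1/10      0.9
1/9       0.8
1/8       0.7
1/7       0.6
1/6       0.5
1/5       0.4
1/4       0.3
1/3       0.2
1/2       0.1
1/1       0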
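The calculation above can be sketched in a few lines of Python. This is a minimal illustration, not the software used for the slides: the function name `km_no_censoring` and the relapse times (1 to 10 months) are hypothetical, since the slide's own times are not reproduced here. It assumes every patient has an event and that no two patients share an event time.

```python
from fractions import Fraction

def km_no_censoring(event_times):
    """Return (time, hazard, survival) rows when every patient has an event.

    Assumes all event times are distinct, so exactly one event occurs
    at each time point.
    """
    at_risk = len(event_times)
    survival = Fraction(1)
    rows = []
    for t in sorted(event_times):
        hazard = Fraction(1, at_risk)   # one event among those still at risk
        survival *= 1 - hazard          # previous survival times (1 - hazard)
        rows.append((t, hazard, survival))
        at_risk -= 1                    # the patient with the event leaves the risk set
    return rows

# Hypothetical relapse times in months (illustration only).
example_times = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
for t, hazard, survival in km_no_censoring(example_times):
    print(f"t={t:>2}  hazard={str(hazard):>4}  S(t)={float(survival):.1f}")
```

Whatever the actual times are, ten distinct event times give the same sequence of hazards (1/10, 1/9, ..., 1/1) and survival estimates (0.9, 0.8, ..., 0) as in the table.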
Example-censoring • For simplicity, let’s focus on 10 patients whose time to relapse is provided here • Now some of the patients are censored • We would like to estimate S(t) and λ(t)
What do we see from our curve? • Drops in the curve only occur at time of event • Between events, the estimated survival remains constant • Survival curve does not drop at censored times
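Calculating size of drop • The hazard at each time point is λ(t) = # events / # at risk • If no event (for example, at a censored time), hazard = 0 • The estimated survival is the previous estimate multiplied by (1 − hazard): S(t_j) = S(t_{j-1}) × (1 − λ(t_j)), starting from S = 1
Hazard    Estimated survival
1/10      0.9
0         0.9
1/8       0.79
1/7       0.68
0         0.68
1/5       0.54
1/4       0.41
1/3       0.27
0         0.27
1/1       0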
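The same calculation with censoring can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `kaplan_meier` and the follow-up times (1 to 10 months) are hypothetical, and the event indicators are chosen so that the 2nd, 5th and 9th patients are censored, which matches the pattern of zero hazards in the table. A censored patient leaves the risk set without causing a drop.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate.

    times  : follow-up time for each patient
    events : 1 if the event was observed at that time, 0 if censored
    Returns rows of (time, n_events, n_at_risk, survival), one per distinct time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    rows = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = n_censored = 0
        # Collect everyone whose follow-up ends at this time (handles ties).
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_censored += 1 - data[i][1]
            i += 1
        survival *= 1 - n_events / n_at_risk   # no change when n_events is 0
        rows.append((t, n_events, n_at_risk, survival))
        n_at_risk -= n_events + n_censored     # both groups leave the risk set
    return rows

# Hypothetical follow-up times (months); 0 marks a censored patient.
example_times = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
example_events = [1, 0, 1, 1, 0, 1, 1, 1, 0, 1]
for t, d, n, s in kaplan_meier(example_times, example_events):
    print(f"t={t:>2}  hazard={d}/{n}  S(t)={s:.4f}")
```

The unrounded estimates are 0.9, 0.9, 0.7875, 0.675, 0.675, 0.54, 0.405, 0.27, 0.27 and 0; the table shows them to two decimals.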
Confidence interval for survival curve • A confidence interval can be placed around the estimated survival curve • Greenwood’s formula
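The slide names Greenwood's formula without showing it. In its standard form, the variance of the estimated survival at time t is S(t)² multiplied by the sum, over event times up to t, of d / (n × (n − d)), where d is the number of events and n the number at risk. The sketch below applies it with a plain "estimate ± 1.96 × standard error" interval clipped to [0, 1]; that interval form, the function name `km_with_greenwood` and the example data are assumptions for illustration, and statistical packages often use a transformed interval instead.

```python
import math

def km_with_greenwood(times, events, z=1.96):
    """Kaplan-Meier estimate with Greenwood standard errors.

    Returns rows of (time, survival, std_error, lower, upper) at each event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    greenwood_sum = 0.0
    rows = []
    i = 0
    while i < len(data):
        t = data[i][0]
        d = c = 0
        while i < len(data) and data[i][0] == t:
            d += data[i][1]
            c += 1 - data[i][1]
            i += 1
        if d > 0:
            survival *= 1 - d / n_at_risk
            if d < n_at_risk:
                # Greenwood term d / (n (n - d)); skipped when everyone at risk
                # has the event, because survival is then 0 and so is the error.
                greenwood_sum += d / (n_at_risk * (n_at_risk - d))
            std_error = survival * math.sqrt(greenwood_sum)
            lower = max(0.0, survival - z * std_error)
            upper = min(1.0, survival + z * std_error)
            rows.append((t, survival, std_error, lower, upper))
        n_at_risk -= d + c
    return rows

# Hypothetical follow-up times (months); 0 marks a censored patient.
example_times = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
example_events = [1, 0, 1, 1, 0, 1, 1, 1, 0, 1]
for t, s, se, lo, hi in km_with_greenwood(example_times, example_events):
    print(f"t={t:>2}  S(t)={s:.3f}  SE={se:.3f}  95% CI=({lo:.3f}, {hi:.3f})")
```

With only 10 patients the intervals are wide, which is a useful reminder of how little information a small sample carries about the tail of the curve.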
Summary • Kaplan-Meier curve represents the distribution of survival times • Drops only occur at event times • Censoring easily accommodated • If the last observed time is not an event, the curve does not go to zero
Comparison of survival curves • One important aspect of survival analysis is the comparison of survival curves • Null hypothesis: S1(t) = S2(t) • Method: log-rank test
Log-rank test-technical • To compare survival curves, a log-rank test creates 2x2 tables at each event time and combines across the tables • Similar to the Mantel-Haenszel (MH) test • Provides a χ² statistic with 1 degree of freedom (for a two-sample comparison) and a p-value • Same procedure for hypothesis testing
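The table-by-table procedure described above can be sketched as follows. This is a minimal illustration of the standard two-sample log-rank statistic, not the software used for the slides: the function name `logrank_test` and the two small data sets are hypothetical. At each event time it compares the observed events in group 1 with the number expected if both groups shared the same hazard, then combines the differences and their variances across times.

```python
import math

def logrank_test(times1, events1, times2, events2):
    """Two-sample log-rank test. Returns (chi-square statistic, p-value), 1 df.

    events are 1 for an observed event and 0 for a censored time.
    """
    pooled = list(zip(times1, events1)) + list(zip(times2, events2))
    event_times = sorted({t for t, e in pooled if e})
    observed1 = expected1 = variance = 0.0
    for t in event_times:
        # One 2x2 table per event time: group by (event / no event) among those at risk.
        n1 = sum(1 for x in times1 if x >= t)
        n2 = sum(1 for x in times2 if x >= t)
        d1 = sum(1 for x, e in zip(times1, events1) if x == t and e)
        d2 = sum(1 for x, e in zip(times2, events2) if x == t and e)
        n, d = n1 + n2, d1 + d2
        observed1 += d1
        expected1 += d * n1 / n                   # expected events under H0
        if n > 1:
            variance += n1 * n2 * d * (n - d) / (n * n * (n - 1))
    chi2 = (observed1 - expected1) ** 2 / variance
    p_value = math.erfc(math.sqrt(chi2 / 2))      # upper tail of chi-square, 1 df
    return chi2, p_value

# Hypothetical data (months); 0 marks a censored patient.
untreated_times, untreated_events = [2, 3, 5, 6, 8, 9], [1, 1, 1, 1, 1, 0]
treated_times, treated_events = [4, 7, 10, 12, 14, 15], [1, 0, 1, 1, 0, 1]
chi2, p_value = logrank_test(untreated_times, untreated_events,
                             treated_times, treated_events)
print(f"chi-square = {chi2:.2f}, p = {p_value:.3f}")
```

The sketch raises a division error if the variance is zero (for example, if one group is empty); a production implementation would guard against that.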
Hypothesis test • H0: S1(t) = S2(t) • Time to event outcome, dichotomous predictor • Log-rank test • Test statistic: χ² = 4.4 • p-value = 0.036 • Since the p-value is less than 0.05, we reject the null hypothesis • We conclude that there is a significant difference in the survival time in the treated compared to untreated
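The reported p-value can be checked directly from the test statistic. For a chi-square distribution with 1 degree of freedom, the upper-tail probability equals erfc(sqrt(x / 2)), so no statistics library is needed. The function name `chi2_1df_pvalue` is a hypothetical helper introduced for this check.

```python
import math

def chi2_1df_pvalue(statistic):
    """Upper-tail probability of a chi-square distribution with 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2.0))

p_value = chi2_1df_pvalue(4.4)       # test statistic reported on the slide
print(f"p = {p_value:.3f}")          # prints p = 0.036
print("reject H0" if p_value < 0.05 else "fail to reject H0")
```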
Notes • Inspection of Kaplan-Meier curve will allow you to determine which of the groups had the significantly longer survival time • Other tests are possible • Gehan’s generalized Wilcoxon test • Tarone-Ware test • Peto-Peto-Prentice test • Generally give similar results, but emphasize different parts of survival curve