Introduction to Survival Analysis Rosner Chapter 14 14.8 Introduction 14.9 Kaplan-Meier Estimation 14.10 Log-Rank Test 14.11 Proportional Hazards Model
What is Survival Analysis? • A class of statistical methods for studying the occurrence and timing of events. • A class of methods for analyzing survival times (i.e., times to events). • A class of methods for analyzing survival probabilities at different follow-up times. • Not restricted to data with a certain distribution (non-parametric in nature).
Survival Analysis • In many biomedical studies, the primary endpoint is time until an event occurs (e.g., death, recurrence, new symptoms). • Data are typically subject to censoring when a study ends before the event occurs. • We assume censoring is noninformative, i.e., patients who are censored have the same underlying survival curve after their censoring time as patients who are not censored.
Survival Analysis • Survival Function - A function describing the proportion of individuals surviving to or beyond a given time. • Notation: • T survival time of a selected individual • t a specific point in time. • S(t) = P(T > t) Survival Function • h(t) instantaneous failure rate among survivors at time t (aka hazard function)
Survival Analysis in RCT • For survival analysis, the best observation plan is a randomized clinical trial (RCT). • Random treatment assignments. • Well-defined starting points. • Substantial follow-up time. • Exact time records of the events of interest (few patients lost to follow-up).
Elements of Survival Experience • Event Definition (death, adverse events, …) • Starting time • Length of follow-up (equal length of follow-up, common stop time) • Failure time (observed time of event since start of trial) • Unobserved event time (no event recorded in the follow-up, early termination, etc)
Calendar Time Plot X: Event; O: Censored
Understanding Censoring • An event is said to be censored if the exact time at which the event occurs is not observed. • What is observed is a lower limit on the actual time to event. • If more than a few observations are informatively censored, special “competing risk” methods are required.
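To make the "lower limit" idea concrete, here is a minimal Python sketch. It assumes hypothetical subjects with made-up entry and event days and a common study stop day; none of these numbers, nor the helper name `observe`, come from the slides.

```python
from typing import Optional, Tuple

def observe(entry_day: int, event_day: Optional[int], stop_day: int) -> Tuple[int, bool]:
    """Return (follow-up time, event_observed) for one subject.

    entry_day: calendar day the subject entered the study
    event_day: calendar day of the event, or None if it never occurred
    stop_day:  common calendar day on which the study ends
    """
    if event_day is not None and event_day <= stop_day:
        return event_day - entry_day, True   # exact event time is observed
    return stop_day - entry_day, False       # censored: only a lower limit is known

# Hypothetical subjects: (entry day, event day or None)
subjects = [(0, 40), (10, 130), (30, None), (50, 90)]
STOP_DAY = 100

observed = [observe(entry, event, STOP_DAY) for entry, event in subjects]
for time, event in observed:
    # A trailing "+" marks a censored time, the usual shorthand in survival data
    print(f"{time}{'' if event else '+'}")
```

The second subject's event happens after the stop day, so only the 90 days of follow-up are recorded, as a censored time.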
Describing Survival Experience • Central idea: event times are realizations of a random variable, thus can be described by a probability distribution: • Cumulative distribution function • Survival function • Probability density function • Hazard function • Cumulative hazard function
Mathematical Definitions • Distribution function: $F(t) = P(T \le t)$ • Density function: $f(t) = \frac{d}{dt} F(t)$ • Survival function: $S(t) = P(T > t) = 1 - F(t)$
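Mathematical Definitions: Hazard Function and Cumulative Hazard Function • Hazard function: $h(t) = \lim_{\Delta t \to 0} \frac{P(t \le T < t + \Delta t \mid T \ge t)}{\Delta t} = \frac{f(t)}{S(t)}$ • Cumulative hazard: $H(t) = \int_0^t h(u)\,du$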
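Relationships Among These Different Representations • Given any one of them, we can recover the others. • Some useful relationships: $S(t) = 1 - F(t)$, $f(t) = -\frac{d}{dt} S(t)$, $h(t) = \frac{f(t)}{S(t)} = -\frac{d}{dt} \log S(t)$, $H(t) = -\log S(t)$, $S(t) = e^{-H(t)}$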
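As an illustration of recovering every representation from one of them, consider a constant hazard $\lambda > 0$ (the exponential model). This example is an assumption chosen for simplicity; it is not one of the original slides.

```latex
\begin{align*}
h(t) &= \lambda \\
H(t) &= \int_0^t \lambda \, du = \lambda t \\
S(t) &= e^{-H(t)} = e^{-\lambda t} \\
F(t) &= 1 - S(t) = 1 - e^{-\lambda t} \\
f(t) &= h(t)\,S(t) = \lambda e^{-\lambda t}
\end{align*}
```

Starting from the hazard alone, integration gives the cumulative hazard, exponentiation gives the survival function, and the remaining two functions follow directly.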
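Kaplan-Meier Estimate of S(t) • Rank the survival times as t(1)≤t(2)≤…≤t(n). • ni: number of individuals at risk just before t(i) • di: number of individuals with failure time t(i) • Estimated hazard function at t(i): $\hat{h}(t_{(i)}) = d_i / n_i$ • Formula: $\hat{S}(t) = \prod_{i:\, t_{(i)} \le t} \left(1 - \frac{d_i}{n_i}\right)$, with the product taken over the distinct failure times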
Kaplan-Meier Estimate of S(t) • The Kaplan-Meier curve is a step function -- i.e., it does not change on days when no events occur. Step sizes are not all equal; they depend on changes in denominator. • Even with heavy censoring, the Kaplan-Meier curve is an unbiased estimate of the true (population) survival curve. Censoring affects the precision but not the accuracy (bias). • Censoring must be independent of occurrence of endpoint for estimate to be unbiased.
Leukemia Mouse Example • Mice given P388 murine leukemia were assigned at random to one of two regimens of therapy • Regimen A: Navelbine + Taxol concurrently • Regimen B: Navelbine + Taxol 1 hour later • Under A, 9 of nA = 49 mice died, on days 6, 8, 22, 32, 32, 35, 41, 46, and 54; the remainder survived > 60 days • Under B, 9 of nB = 15 mice died, on days 8, 10, 27, 31, 34, 35, 39, 47, and 57; the remainder survived > 60 days • Source: Knick et al. (1995)
Leukemia Mouse Example • [Panels: Regimen A, Regimen B]
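The Kaplan-Meier calculation for the two mouse regimens can be sketched in plain Python as follows. This is a minimal illustration, assuming the surviving mice are treated as censored at day 60 (the slides only say "> 60 days"); the names `kaplan_meier`, `build`, and `CENSOR_DAY` are made up for this sketch.

```python
from collections import Counter

def kaplan_meier(times, events):
    """Kaplan-Meier estimate as a list of (t, S(t)) at each distinct failure time.

    times:  follow-up time for each subject
    events: True if the subject failed at that time, False if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    surv, curve = 1.0, []
    for t in sorted(deaths):
        n_at_risk = sum(1 for u in times if u >= t)  # n_i: under observation just before t
        surv *= 1.0 - deaths[t] / n_at_risk          # multiply by (1 - d_i / n_i)
        curve.append((t, surv))
    return curve

CENSOR_DAY = 60  # assumption: survivors are treated as censored at day 60

def build(death_days, n):
    """Pad the listed death days with censored survivors up to n mice."""
    n_censored = n - len(death_days)
    times = list(death_days) + [CENSOR_DAY] * n_censored
    events = [True] * len(death_days) + [False] * n_censored
    return times, events

deaths_a = [6, 8, 22, 32, 32, 35, 41, 46, 54]   # Regimen A, 49 mice
deaths_b = [8, 10, 27, 31, 34, 35, 39, 47, 57]  # Regimen B, 15 mice

km_a = kaplan_meier(*build(deaths_a, 49))
km_b = kaplan_meier(*build(deaths_b, 15))

for label, km in (("A", km_a), ("B", km_b)):
    print(f"Regimen {label}")
    for t, s in km:
        print(f"  day {t:2d}: S(t) = {s:.3f}")
```

Because no mouse is censored before day 60, each curve equals the fraction of mice still alive, ending at 40/49 for Regimen A and 6/15 for Regimen B.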
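Log-Rank Test • Goal: Test whether two groups differ in population survival functions. • Notation: • t(i): time of the ith failure (across groups) • d1i: number of failures for trt 1 at time t(i) • d2i: number of failures for trt 2 at time t(i) • n1i: number at risk for trt 1 just prior to time t(i) • n2i: number at risk for trt 2 just prior to time t(i) • Computations: with $d_i = d_{1i} + d_{2i}$ and $n_i = n_{1i} + n_{2i}$, $e_{1i} = \frac{n_{1i} d_i}{n_i}$, $v_{1i} = \frac{n_{1i} n_{2i} d_i (n_i - d_i)}{n_i^2 (n_i - 1)}$, $O_1 - E_1 = \sum_i (d_{1i} - e_{1i})$, $V_1 = \sum_i v_{1i}$, $T_{MH} = \frac{O_1 - E_1}{\sqrt{V_1}}$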
Log-Rank Test • H0: Two survival functions are identical • HA: Two survival functions differ • Some programs conduct this identically as a chi-square test, with test statistic $(T_{MH})^2$, which is distributed $\chi^2_1$ under H0
Leukemia Mouse Example
Survival Analysis for DAY
             Total   Number Events   Number Censored   Percent Censored
REGIMEN 1      49          9               40               81.63
REGIMEN 2      15          9                6               40.00
Overall        64         18               46               71.88
Test Statistics for Equality of Survival Distributions for REGIMEN
             Statistic   df   Significance
Log Rank       10.93      1      .0009
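The log-rank statistic for the mouse data can be checked by hand with a short Python sketch. At each failure time it accumulates observed minus expected failures in Regimen A (expected = n1i·di/ni) and the hypergeometric variance, then squares and scales the total. The sketch assumes survivors are censored at day 60; the helper names are illustrative only.

```python
import math

CENSOR_DAY = 60  # assumption: survivors are treated as censored at day 60

def make_group(death_days, n):
    """Return (time, event) pairs: listed deaths plus censored survivors."""
    return [(d, True) for d in death_days] + [(CENSOR_DAY, False)] * (n - len(death_days))

def log_rank(group1, group2):
    """Two-sample log-rank test; returns (O1 - E1, V1, chi-square)."""
    fail_times = sorted({t for t, e in group1 + group2 if e})
    o_minus_e, var = 0.0, 0.0
    for t in fail_times:
        n1 = sum(1 for u, _ in group1 if u >= t)        # at risk in group 1
        n2 = sum(1 for u, _ in group2 if u >= t)        # at risk in group 2
        d1 = sum(1 for u, e in group1 if e and u == t)  # failures in group 1
        d2 = sum(1 for u, e in group2 if e and u == t)  # failures in group 2
        n, d = n1 + n2, d1 + d2
        o_minus_e += d1 - n1 * d / n                    # observed minus expected
        if n > 1:
            var += n1 * n2 * d * (n - d) / (n * n * (n - 1))  # hypergeometric variance
    return o_minus_e, var, o_minus_e ** 2 / var

regimen_a = make_group([6, 8, 22, 32, 32, 35, 41, 46, 54], 49)
regimen_b = make_group([8, 10, 27, 31, 34, 35, 39, 47, 57], 15)

o_minus_e, var, chi2 = log_rank(regimen_a, regimen_b)
p_value = math.erfc(math.sqrt(chi2 / 2))  # upper tail of chi-square with 1 df
print(f"O1 - E1 = {o_minus_e:.3f}, V1 = {var:.3f}")
print(f"chi-square = {chi2:.2f}, p = {p_value:.4f}")
```

Run on the death days listed for the two regimens, this gives a chi-square of about 10.93 on 1 df with p of about .0009, in line with the Log Rank row of the program output.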
Leukemia Clinical Example ~ Remission time of acute leukemia ~ Patients randomly assigned ~ Purpose: evaluate ability to maintain remissions ~ Study terminated after 1 year ~ Different follow-up times due to sequential enrollment ~ 6-MP: 6, 6, 6, 7, 10, 22, 23, 6+, 9+, 10+, 11+, 17+, 19+, 20+, 25+, 32+, 32+, 34+, 35+ ~ Placebo: 1, 1, 2, 2, 3, 4, 4, 5, 5, 8, 8, 8, 8, 11, 11, 12, 12, 15, 17, 22, 23 ~ (+ denotes a censored time)
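Before any analysis, times written in this shorthand have to be turned into (time, event) pairs, where a trailing "+" marks a censored observation. A minimal Python sketch follows, using the values exactly as they appear on the slide; `parse_times` is a hypothetical helper, not part of any library.

```python
def parse_times(text):
    """Split '6,6,9+,...' into (time, event) pairs; a trailing '+' means censored."""
    pairs = []
    for token in text.split(","):
        token = token.strip()
        if not token:
            continue
        censored = token.endswith("+")
        pairs.append((int(token.rstrip("+")), not censored))
    return pairs

# Values copied as listed on the slide
six_mp = parse_times("6,6,6,7,10,22,23,6+,9+,10+,11+,17+,19+,20+,25+,32+,32+,34+,35+")
placebo = parse_times("1,1,2,2,3,4,4,5,5,8,8,8,8,11,11,12,12,15,17,22,23")

for name, arm in (("6-MP", six_mp), ("Placebo", placebo)):
    n_events = sum(event for _, event in arm)
    print(f"{name}: n = {len(arm)}, events = {n_events}, censored = {len(arm) - n_events}")
```

The resulting pairs are the input that Kaplan-Meier and log-rank calculations expect: every placebo time is an observed event, while most 6-MP times are censored.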
Cox Proportional Hazards Model • Goal: Compare two or more groups (treatments), adjusting for other risk factors on survival times (like multiple regression) • p explanatory variables (including dummy variables) • Models the hazard ratio of the event as a function of time and covariates: $RR(t; x_1, \dots, x_p) = \frac{h(t; x_1, \dots, x_p)}{h_0(t)}$, where $h_0(t)$ is the baseline hazard (all covariates equal to 0)
Cox Proportional Hazards Model • Common assumption: Relative risk is constant over time (proportional hazards) • Log-linear model: $\log\left[\frac{h(t; x_1, \dots, x_p)}{h_0(t)}\right] = b_1 x_1 + \dots + b_p x_p$, where $h_0(t)$ is the baseline hazard • Test for effect of variable xi, adjusting for all other predictors: • H0: bi = 0 (No association between risk of event and xi) • HA: bi ≠ 0 (Association between risk of event and xi)
Relative Risk for Individual Factors • Relative risk for increasing predictor xi by 1 unit, controlling for all other predictors: $RR_i = e^{b_i}$ • 95% CI for relative risk for predictor xi: • Compute a 95% CI for bi: $b_i \pm 1.96\,\mathrm{SE}(b_i)$ • Exponentiate the lower and upper bounds to obtain the CI for RRi
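The two-step recipe (interval for the coefficient, then exponentiate) can be sketched in Python as below. The coefficient and standard error are hypothetical values chosen for illustration, not results from any study, and `relative_risk_ci` is a made-up helper name.

```python
import math

def relative_risk_ci(b, se, z=1.96):
    """Return (RR, lower, upper) for a one-unit increase in a Cox predictor.

    b:  estimated regression coefficient
    se: its estimated standard error
    z:  normal critical value (1.96 for a 95% interval)
    """
    lower_b, upper_b = b - z * se, b + z * se  # CI on the coefficient scale
    return math.exp(b), math.exp(lower_b), math.exp(upper_b)

# Hypothetical values for illustration only
rr, lo, hi = relative_risk_ci(b=0.50, se=0.20)
print(f"RR = {rr:.2f}, 95% CI = ({lo:.2f}, {hi:.2f})")
```

Because the interval is built on the log scale, it is not symmetric around the relative risk itself. An interval that excludes 1 corresponds to rejecting H0: bi = 0 at the 5% level with the matching normal-approximation test.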
Ex: Comparing Cancer Regimens • Subjects: Patients with multiple myeloma • Treatments (HDM considered less intensive): • High-dose melphalan (HDM) • Thiotepa, Busulfan, Cyclophosphamide (TBC) • Covariates: • Durie-Salmon disease stage III at diagnosis (Yes/No) • Having received 3+ previous treatments (Yes/No) • Outcome: Progression-Free Survival Time • 186 Subjects (97 on TBC, 89 on HDM) Source: Anagnostopoulos, et al (2004)
Ex: Comparing Cancer Regimens • Variables and Statistical Model: • x1 = 1 if patient at Durie-Salmon Stage III, 0 otherwise • x2 = 1 if patient has had 3+ previous treatments, 0 otherwise • x3 = 1 if patient received HDM, 0 if TBC • Of primary importance is b3: • b3 = 0: adjusting for x1 and x2, no difference in risk for HDM and TBC • b3 > 0: adjusting for x1 and x2, risk of progression higher for HDM • b3 < 0: adjusting for x1 and x2, risk of progression lower for HDM
Ex: Comparing Cancer Regimens • Results: (RR=Relative Risk) • Conclusions (adjusting for all other factors): • Patients at Durie-Salmon Stage III are at higher risk • Patients who have had 3+ previous treatments are at higher risk • Patients receiving HDM are at the same risk as patients on TBC