
HSRP 734: Advanced Statistical Methods July 10, 2008



Presentation Transcript


  1. HSRP 734: Advanced Statistical Methods, July 10, 2008

  2. Objectives • Describe the Kaplan-Meier estimated survival curve • Describe the log-rank test • Use SAS to implement

  3. Kaplan-Meier Estimate of Survival Function S(t) • The Kaplan-Meier estimate of the survival function is a simple, useful and popular estimate for the survival function. • This estimate incorporates both censored and noncensored observations • Breaks the estimation problem down into small pieces

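  4. Kaplan-Meier Estimate of the Survival Function S(t) • For grouped survival data, $\hat{S}(t) = \prod_{j} \frac{n_j - y_j}{n_j}$, with the product taken over the intervals up to time t, where $n_j$ persons enter interval j and $y_j$ of them have the event • Let the interval lengths Lj become very small, all of length L = Δt, and let t1, t2, … be the times of events (survival times)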

  5. Kaplan-Meier Estimate of the Survival Function S(t) • 2 cases to consider in the previous equation • Case 1. No event in a bin (interval) • The estimate of S(t) does not change, which means that we can ignore bins with no events

  6. Kaplan-Meier Estimate of the Survival Function S(t) • Case 2. yj events occur in a bin (interval) • Also, nj persons enter the bin • Assume any censored times that occur in the bin occur at the end of the bin

  7. Kaplan-Meier Estimate of the Survival Function S(t) • So, as Δt → 0, we get the Kaplan-Meier estimate of the survival function S(t) • Also called the “product-limit estimate” of the survival function S(t) • Note: each conditional probability estimate, (nj - yj) / nj, is obtained from the observed number at risk for an event (nj) and the observed number of events (yj)

  8. Kaplan-Meier Estimate of Survival Function S(t) • We begin by • Rank ordering the survival times (including the censored survival times) • Define each interval as starting at an observed time and ending just before the next ordered time • Identify the number at risk within each interval • Identify the number of events within each interval • Calculate the probability of surviving within that interval • Calculate the survival function for that interval as the probability of surviving that interval times the probability of surviving to the start of that interval
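The steps listed on this slide can be sketched in a few lines of Python. This is a minimal illustration and not the course's SAS code: the function name `kaplan_meier` and the toy data are hypothetical, and the sketch assumes that a subject censored at time t is still counted as at risk at t.

```python
from collections import Counter

def kaplan_meier(times, events):
    """Return [(t_j, n_j, y_j, S(t_j))] at each distinct event time.

    times  -- observed follow-up times
    events -- 1 if the event was observed, 0 if the time is censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    surv = 1.0
    rows = []
    for t in sorted(deaths):               # rank-ordered event times
        # Number at risk: everyone whose observed time is >= t. Subjects
        # censored exactly at t are still counted as at risk at t.
        n_at_risk = sum(1 for u in times if u >= t)
        y = deaths[t]                      # number of events at t
        surv *= (n_at_risk - y) / n_at_risk
        rows.append((t, n_at_risk, y, surv))
    return rows

# Hypothetical toy data: 3, 5, 5+, 8, 10+ ("+" marks a censored time)
toy_times = [3, 5, 5, 8, 10]
toy_events = [1, 1, 0, 1, 0]
km_rows = kaplan_meier(toy_times, toy_events)
for t, n, y, s in km_rows:
    print(f"t={t:>3}  at risk={n}  events={y}  S(t)={s:.3f}")
```

Only event times produce a row. Censored times change nothing at the moment they occur; they only shrink the number at risk for later event times.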

  9. Example - AML + indicates a censored time to relapse; e.g., 13+ = more than 13 weeks to relapse

  10. Example – AML • Calculation of Kaplan-Meier estimates: In the “not maintained on chemotherapy” group:

  11. Example – AML (cont’d) In the “maintained on chemotherapy” group:
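The data table and the worked calculations for this example are not reproduced in the transcript. As a stand-in, the sketch below applies the product-limit calculation to the AML maintenance data as distributed in the `aml` dataset of R's survival package (weeks to relapse). That these values match the table on the slide is an assumption, and the function name `km_curve` is hypothetical.

```python
def km_curve(times, events):
    """Kaplan-Meier estimate at each distinct event time: [(t, S(t))]."""
    surv, curve = 1.0, []
    for t in sorted({u for u, e in zip(times, events) if e}):
        n = sum(1 for u in times if u >= t)                        # at risk at t
        y = sum(1 for u, e in zip(times, events) if u == t and e)  # relapses at t
        surv *= (n - y) / n
        curve.append((t, round(surv, 3)))
    return curve

# ASSUMED data (see the paragraph above): 1 = relapse, 0 = censored ("+")
maintained_t = [9, 13, 13, 18, 23, 28, 31, 34, 45, 48, 161]
maintained_e = [1, 1, 0, 1, 1, 0, 1, 1, 0, 1, 0]
nonmaintained_t = [5, 5, 8, 8, 12, 16, 23, 27, 30, 33, 43, 45]
nonmaintained_e = [1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1]

maintained_km = km_curve(maintained_t, maintained_e)
nonmaintained_km = km_curve(nonmaintained_t, nonmaintained_e)

print("maintained:   ", maintained_km)
print("nonmaintained:", nonmaintained_km)
```

With these assumed data, the first relapse in the maintained group occurs at 9 weeks with 11 patients at risk, so the estimate there is 10/11, or about 0.909.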

  12. Example – AML (cont’d) • The “Kaplan-Meier curve” plots the estimated survival function vs. time — separate curves for each group

  13. Example – AML (cont’d) • Notes — Can count the total number of events by counting the number of steps (times) — If feasible, picture the censoring times on the graph as shown above.

  14. Kaplan-Meier Estimate Using SAS

  15. Comments on the Kaplan-Meier Estimate • If an event time and a censoring time are tied, we assume that the censoring time is slightly larger than the death time, so the censored subject still counts as at risk at that time. • If the largest observation is an event, the Kaplan-Meier estimate drops to 0 at that time. • If the largest observation is censored, the Kaplan-Meier estimate never reaches 0 and remains constant at its last value beyond that time.
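These three comments can be checked numerically. The sketch below is illustrative only: `km_at` is a hypothetical helper that evaluates the product-limit estimate at a single query time, and all data are made up.

```python
def km_at(times, events, t_query):
    """Kaplan-Meier S(t_query): product over event times t_j <= t_query."""
    surv = 1.0
    for t in sorted({u for u, e in zip(times, events) if e}):
        if t > t_query:
            break
        # u >= t keeps a censored time tied with t in the risk set, i.e. the
        # censoring is treated as happening just after the event.
        n = sum(1 for u in times if u >= t)
        y = sum(1 for u, e in zip(times, events) if u == t and e)
        surv *= (n - y) / n
    return surv

# Hypothetical data, "+" = censored
last_is_event = km_at([2, 4, 6], [1, 0, 1], t_query=100)     # 2, 4+, 6
last_is_censored = km_at([2, 4, 6], [1, 1, 0], t_query=100)  # 2, 4, 6+
tie = km_at([3, 3, 5], [1, 0, 1], t_query=3)                 # 3, 3+, 5

print(last_is_event, last_is_censored, tie)
```

In the tied case the risk set at time 3 contains all three subjects, giving 2/3. Had the censored subject been removed before the event, the factor would have been 1/2 instead.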

  16. Comments on the Kaplan-Meier Estimate • If we plot the empirical survival estimates, we observe a step function. If there are no ties and no censoring, the step function drops by 1/n at each event time. • With every censored observation, the size of the subsequent steps increases. • When does the number of intervals equal the number of deaths in the sample? • When does the number of intervals equal n?
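A short sketch of the step sizes, using made-up data and a hypothetical helper `km_drops`: with five untied, uncensored times every step is 1/5, and censoring one observation makes the later steps larger.

```python
def km_drops(times, events):
    """Sizes of the downward steps of the Kaplan-Meier curve, in time order."""
    surv, drops = 1.0, []
    for t in sorted({u for u, e in zip(times, events) if e}):
        n = sum(1 for u in times if u >= t)
        y = sum(1 for u, e in zip(times, events) if u == t and e)
        new_surv = surv * (n - y) / n
        drops.append(surv - new_surv)
        surv = new_surv
    return drops

# Hypothetical data: five subjects, no ties
no_censoring = km_drops([1, 2, 3, 4, 5], [1, 1, 1, 1, 1])
with_censoring = km_drops([1, 2, 3, 4, 5], [1, 1, 0, 1, 1])  # time 3 is censored

print([round(d, 3) for d in no_censoring])
print([round(d, 3) for d in with_censoring])
```

In the second call the mass that would have dropped at time 3 is redistributed over the two remaining event times, so each of those steps is 0.3 rather than 0.2.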

  17. Comments on the Kaplan-Meier Estimate • The Kaplan-Meier is a consistent estimate of the true S(t). That means that as the sample size gets large, KM estimate converges to the true value. • The Kaplan-Meier estimate can be used to empirically estimate any cumulative distribution function

  18. Comments on the Kaplan-Meier Estimate • The step function in K-M curve really looks like this: • If you have a failure at t1 then you want to say survivorship at t1 should be less than 1. • For small data sets it matters, but for large data sets it does not matter.

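  19. Confidence Interval for S(t) – Greenwood’s Formula • Greenwood’s formula for the variance of $\hat{S}(t)$: $\widehat{\mathrm{Var}}[\hat{S}(t)] = \hat{S}(t)^2 \sum_{j:\, t_j \le t} \frac{y_j}{n_j (n_j - y_j)}$ • Using Greenwood’s formula, an approximate 95% CI for S(t) is $\hat{S}(t) \pm 1.96 \sqrt{\widehat{\mathrm{Var}}[\hat{S}(t)]}$ • There is a “problem”: the 95% CI is not constrained to lie within the interval (0,1)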
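The following sketch computes the Greenwood standard error, S(t) times the square root of the sum of yj / (nj (nj - yj)) over event times up to t, and the resulting plain 95% interval. It is an illustration under stated assumptions: `km_greenwood` is a hypothetical name, the data are made up, and z = 1.96 is the usual normal quantile.

```python
import math

def km_greenwood(times, events, z=1.96):
    """[(t, S, se, lower, upper)] from Greenwood's formula and S +/- z*se."""
    surv, gsum, rows = 1.0, 0.0, []
    for t in sorted({u for u, e in zip(times, events) if e}):
        n = sum(1 for u in times if u >= t)
        y = sum(1 for u, e in zip(times, events) if u == t and e)
        surv *= (n - y) / n
        if n == y:
            # S(t) has dropped to 0 and the Greenwood term y/(n(n-y)) is undefined.
            rows.append((t, surv, float("nan"), float("nan"), float("nan")))
            break
        gsum += y / (n * (n - y))
        se = surv * math.sqrt(gsum)
        rows.append((t, surv, se, surv - z * se, surv + z * se))
    return rows

# Hypothetical data: ten subjects, events at 1..9, the last time censored
gw_rows = km_greenwood(list(range(1, 11)), [1] * 9 + [0])
for t, s, se, lo, hi in gw_rows:
    print(f"t={t:>2}  S={s:.2f}  se={se:.3f}  CI=({lo:.3f}, {hi:.3f})")
```

With these data the upper limit at the first event time exceeds 1 and the lower limit at the last event time falls below 0, which is the “problem” noted on the slide.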

  20. Confidence Interval for S(t) – Alternative Formula • Based on log(-log(S(t))), which ranges from -∞ to ∞ • Find the standard error of this quantity, form its CI, then transform that CI back to one for S(t) • This CI will lie within the interval [0,1] • This is the default in SAS
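A minimal sketch of the transformed interval follows. It assumes the usual delta-method standard error for log(-log S(t)), namely the square root of the Greenwood sum divided by |log S(t)|, and back-transforms the limits as S(t) raised to exp(±z·se). The function name `km_loglog_ci` and the data are hypothetical, and the output is not claimed to reproduce SAS output digit for digit.

```python
import math

def km_loglog_ci(times, events, z=1.96):
    """[(t, S, lower, upper)] with limits built on the log(-log S) scale."""
    surv, gsum, rows = 1.0, 0.0, []
    for t in sorted({u for u, e in zip(times, events) if e}):
        n = sum(1 for u in times if u >= t)
        y = sum(1 for u, e in zip(times, events) if u == t and e)
        if n == y:
            break  # S(t) = 0 here, so log(-log S) is undefined
        surv *= (n - y) / n
        gsum += y / (n * (n - y))                 # Greenwood sum
        se_ll = math.sqrt(gsum) / abs(math.log(surv))
        lower = surv ** math.exp(z * se_ll)       # larger exponent, smaller value
        upper = surv ** math.exp(-z * se_ll)
        rows.append((t, surv, lower, upper))
    return rows

# Hypothetical data: ten subjects, events at 1..9, the last time censored
ll_rows = km_loglog_ci(list(range(1, 11)), [1] * 9 + [0])
for t, s, lo, hi in ll_rows:
    print(f"t={t:>2}  S={s:.2f}  CI=({lo:.3f}, {hi:.3f})")
```

Because the limits are powers of a number between 0 and 1, they cannot leave that range, unlike the untransformed interval.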

  21. Log-rank test for comparing survivor curves • Are two survivor curves the same? • Use the times of events: t1, t2, ... (do not include censoring times) • Treat each event and its “set of persons still at risk” (i.e., risk set) at each time tj as an independent table • Make a 2×2 table at each tj

  22. Log-rank test for comparing survivor curves • At each event time tj, under the assumption of equal survival (i.e., SA(t) = SB(t)), the expected number of events in Group A out of the total events (dj = aj + cj) is in proportion to the number at risk in Group A relative to the total at risk at time tj: Eaj = dj × njA / nj • Differences between aj and Eaj represent evidence against the null hypothesis of equal survival in the two groups
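The observed and expected counts at each event time can be tabulated as below. This is a sketch with hypothetical names and made-up data. Each row holds the event time tj, the observed Group A events aj, the total events dj, the numbers at risk njA and nj, and the expected count Eaj = dj × njA / nj.

```python
def risk_tables(times_a, events_a, times_b, events_b):
    """One row (t_j, a_j, d_j, n_jA, n_j, E_aj) per distinct event time."""
    all_times = list(times_a) + list(times_b)
    all_events = list(events_a) + list(events_b)
    rows = []
    for t in sorted({u for u, e in zip(all_times, all_events) if e}):
        n_a = sum(1 for u in times_a if u >= t)   # at risk in Group A
        n_b = sum(1 for u in times_b if u >= t)   # at risk in Group B
        a = sum(1 for u, e in zip(times_a, events_a) if u == t and e)
        c = sum(1 for u, e in zip(times_b, events_b) if u == t and e)
        d, n = a + c, n_a + n_b
        rows.append((t, a, d, n_a, n, d * n_a / n))
    return rows

# Hypothetical data. Group A: 1, 3.  Group B: 2, 4, 5+ ("+" = censored)
tables = risk_tables([1, 3], [1, 1], [2, 4, 5], [1, 1, 0])
for t, a, d, n_a, n, e_a in tables:
    print(f"t={t}  a={a}  d={d}  nA={n_a}  n={n}  E={e_a:.3f}")
```

The censored time 5+ creates no table of its own, but that subject is counted in the risk sets of all four event times.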

  23. Log-rank test for comparing survivor curves • Use the Cochran Mantel-Haenszel idea of pooling over events j to get the log-rank chi-squared statistic with one degree of freedom
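The formula for the pooled statistic is not preserved in the transcript. The sketch below assumes the standard Mantel-Haenszel form: sum aj - Eaj over event times, square the sum, and divide by the summed hypergeometric variances dj (njA/nj)(1 - njA/nj)(nj - dj)/(nj - 1). The function name and the data are hypothetical.

```python
import math

def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test. Returns (chi_squared, p_value) on 1 df."""
    all_times = list(times_a) + list(times_b)
    all_events = list(events_a) + list(events_b)
    o_minus_e = 0.0
    variance = 0.0
    for t in sorted({u for u, e in zip(all_times, all_events) if e}):
        n_a = sum(1 for u in times_a if u >= t)
        n = sum(1 for u in all_times if u >= t)
        a = sum(1 for u, e in zip(times_a, events_a) if u == t and e)
        d = sum(1 for u, e in zip(all_times, all_events) if u == t and e)
        o_minus_e += a - d * n_a / n              # observed minus expected in A
        if n > 1:                                 # a table with n = 1 has zero variance
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)
    chi2 = o_minus_e ** 2 / variance
    # Upper tail of a chi-squared variable with 1 df: P(X > x) = erfc(sqrt(x/2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical data. Group A: 1, 3.  Group B: 2, 4, 5+ ("+" = censored)
chi2, p = logrank_test([1, 3], [1, 1], [2, 4, 5], [1, 1, 0])
chi2_swapped, _ = logrank_test([2, 4, 5], [1, 1, 0], [1, 3], [1, 1])
print(f"chi-squared = {chi2:.3f}, p = {p:.3f}")
```

Swapping the group labels flips the sign of the summed differences but leaves the statistic unchanged, so it does not matter which group is called A.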

  24. Log-rank test for comparing survivor curves • Idea summary: • Create a 2x2 table at each uncensored failure time • The construct of each 2x2 table is based on the corresponding risk set • Combine information from all the tables • The null hypothesis is SA(t) = SB(t) for all time t.

  25. Comparisons across Groups • Extensions of the log-rank test to several groups require knowledge of matrix algebra. In general, these tests are well approximated by a chi-squared distribution with G-1 degrees of freedom. • Alternative tests: • Wilcoxon family of tests (including Peto test) • Likelihood ratio test (SAS)

  26. Comparison between Log-Rank and Wilcoxon Tests • The log-rank test weights each failure time equally. No parametric model is assumed for failure times within a stratum. • The Wilcoxon test weights each failure time by a function of the number at risk. Thus, more weight tends to be given to early failure times. As in the log-rank test, no parametric model is assumed for failure times within a stratum. • Between these two tests (Wilcoxon and log-rank tests), the Wilcoxon test will tend to be better at picking up early departures from the null hypothesis and the log-rank test will tend to be more sensitive to departures in the tail.
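The two tests can be written as one weighted statistic that differs only in the weight wj given to each event time. The sketch below assumes wj = 1 for the log-rank test and wj = nj (the total number at risk, Gehan's version) for the Wilcoxon test. Other members of the Wilcoxon family use different functions of the number at risk. Names and data are hypothetical.

```python
def weighted_logrank(times_a, events_a, times_b, events_b, weight):
    """Chi-squared statistic [sum w_j (a_j - E_aj)]^2 / sum w_j^2 V_j.

    weight -- function mapping the total number at risk n_j to w_j
    """
    all_times = list(times_a) + list(times_b)
    all_events = list(events_a) + list(events_b)
    num = 0.0
    var = 0.0
    for t in sorted({u for u, e in zip(all_times, all_events) if e}):
        n_a = sum(1 for u in times_a if u >= t)
        n = sum(1 for u in all_times if u >= t)
        a = sum(1 for u, e in zip(times_a, events_a) if u == t and e)
        d = sum(1 for u, e in zip(all_times, all_events) if u == t and e)
        w = weight(n)
        num += w * (a - d * n_a / n)
        if n > 1:
            var += w ** 2 * d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)
    return num ** 2 / var

# Hypothetical data. Group A: 1, 3.  Group B: 2, 4, 5+ ("+" = censored)
group_a = ([1, 3], [1, 1])
group_b = ([2, 4, 5], [1, 1, 0])
logrank_chi2 = weighted_logrank(*group_a, *group_b, weight=lambda n: 1.0)
gehan_chi2 = weighted_logrank(*group_a, *group_b, weight=lambda n: float(n))
print(f"log-rank: {logrank_chi2:.3f}   Gehan-Wilcoxon: {gehan_chi2:.3f}")
```

Since nj can only shrink over time, the Gehan weights are largest at the earliest event times, which is why that test is more sensitive to early differences between the curves.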

  27. Comparison with Likelihood Ratio Test in SAS • The likelihood ratio test employed in SAS assumes the data within the various strata are exponentially distributed and censoring is non-informative. Thus, this is a parametric method that smooths across the entire curve.
