Introduction to Survival Analysis. Seminar in Statistics. Presented by : Stefan Bauer, Stephan Hemri 28.02.2011. Definition. Survival analysis : method for analysing timing of events; data analytic approach to estimate the time until an event occurs.
Introduction to Survival Analysis Seminar in Statistics Presented by: Stefan Bauer, Stephan Hemri 28.02.2011
Definition • Survival analysis: • method for analysing the timing of events; • data-analytic approach to estimating the time until an event occurs. • Historically, survival time refers to the time that an individual "survives" over some period until the event of death occurs. • The event is also called a failure.
Areas of application Survival analysis is used as a tool in many different settings: • proving or disproving the value of medical treatments for diseases; • evaluating reliability of technical equipment; • monitoring social phenomena like divorce and unemployment.
Examples • Time from... • marriage to divorce; • birth to cancer diagnosis; • entry to a study to relapse.
Censoring • The survival time is not known exactly! This may occur due to the following reasons: • a person does not experience the event before the study ends; • a person is lost to follow-up during the study period; • a person withdraws from the study because of some other reason.
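A minimal sketch of how a right-censored observation is commonly recorded as a pair (observed time, status). The function name `observe` and the 0/1 status coding are assumptions for illustration, not taken from the slides; `followup_end` stands for the time at which the person leaves observation (study end, loss to follow-up, or withdrawal).

```python
from typing import Optional, Tuple


def observe(event_time: Optional[float], followup_end: float) -> Tuple[float, int]:
    """Return (observed time, status).

    status 1 = event observed, status 0 = right-censored.
    event_time is None when the event never occurred during the study.
    """
    if event_time is not None and event_time <= followup_end:
        # The event happened while the person was still under observation.
        return event_time, 1
    # Otherwise we only know that the survival time exceeds followup_end.
    return followup_end, 0


print(observe(5.0, 12.0))   # event observed during follow-up
print(observe(None, 12.0))  # no event before the study ended
print(observe(20.0, 12.0))  # event happened after observation stopped
```

In all three censoring scenarios listed on the slide, the recorded time is only a lower bound for the true survival time.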
Outcome variable • Survival time ≠ calendar time (e.g. follow-up starts for each individual on the day of surgery) • Correct starting and ending times may be unknown due to censoring
Survivor function • $S(t) = P(T > t)$: probability that the random variable T exceeds a specified time t • Fundamental to survival analysis
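Hazard function • $h(t) = \lim_{\Delta t \to 0} \frac{P(t \le T < t + \Delta t \mid T \ge t)}{\Delta t}$: instantaneous rate of failure at time t, given survival up to time t • $h(t) \ge 0$ and has no upper bound • Often called the failure rate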
Example: Hazard function Assume a large follow-up study on heart attacks in which the same event rate is expressed in different time units: • 600 heart attacks (events) per year; • 50 events per month; • 11.5 events per week; • 0.0011 events per minute. h(t) = rate of events occurring per time unit
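The unit conversions in this example can be checked with a short sketch. The conversion factors (12 months, 52 weeks and 365 days per year) are assumptions chosen to match the rounded figures on the slide.

```python
# Rate taken from the slide: 600 events per year.
EVENTS_PER_YEAR = 600.0

# Assumed number of each time unit in one year.
UNITS_PER_YEAR = {
    "year": 1.0,
    "month": 12.0,
    "week": 52.0,
    "minute": 365.0 * 24.0 * 60.0,
}


def rate_per_unit(events_per_year: float, unit: str) -> float:
    """Express a yearly event rate per the given time unit."""
    return events_per_year / UNITS_PER_YEAR[unit]


rates = {unit: rate_per_unit(EVENTS_PER_YEAR, unit) for unit in UNITS_PER_YEAR}
for unit, rate in rates.items():
    print(f"{rate:.4f} events per {unit}")
```

The numerical value of a rate depends on the chosen time unit, which is why a hazard, unlike a probability, is not bounded by 1.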
Relation between S(t) and h(t) If T is continuous: $S(t) = \exp\left(-\int_0^t h(u)\,du\right)$ and $h(t) = -\frac{dS(t)/dt}{S(t)}$
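A numerical sketch of the standard relation $S(t) = \exp(-\int_0^t h(u)\,du)$ for continuous T, and of its inverse $h(t) = -S'(t)/S(t)$. The hazard $h(t) = t$ is an illustrative assumption (a Weibull-type hazard), not taken from the slides; for it, the closed form is $S(t) = \exp(-t^2/2)$.

```python
import math


def hazard(t: float) -> float:
    """Illustrative hazard h(t) = t (an assumption, not from the slides)."""
    return t


def survivor_from_hazard(t: float, steps: int = 10_000) -> float:
    """S(t) = exp(-integral_0^t h(u) du), integral via the trapezoidal rule."""
    width = t / steps
    total = 0.5 * (hazard(0.0) + hazard(t))
    for k in range(1, steps):
        total += hazard(k * width)
    return math.exp(-total * width)


def hazard_from_survivor(t: float, eps: float = 1e-5) -> float:
    """h(t) = -(dS/dt) / S(t), derivative via central differences."""
    slope = (survivor_from_hazard(t + eps) - survivor_from_hazard(t - eps)) / (2 * eps)
    return -slope / survivor_from_hazard(t)


t = 1.5
print(survivor_from_hazard(t), math.exp(-t**2 / 2))  # numerical vs closed form
print(hazard_from_survivor(t), hazard(t))            # recovered vs true hazard
```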
Sketch of Proof • (i) Find the relationship between the density f(t) and S(t) • (ii) Express the relationship between h(t) and S(t) as a function of the density f(t) • Substitute (i) into (ii) → h(t) as a function of S(t) and vice versa
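The two steps can be written out as follows. This is a sketch of the standard argument, assuming T is continuous with distribution function $F(t) = 1 - S(t)$, density $f(t)$, and $S(0) = 1$; the numbering (i), (ii) follows the slide.

```latex
\begin{align*}
\text{(i)}\quad  f(t) &= \frac{d}{dt}F(t) = \frac{d}{dt}\bigl(1 - S(t)\bigr) = -\frac{dS(t)}{dt} \\
\text{(ii)}\quad h(t) &= \lim_{\Delta t \to 0} \frac{P(t \le T < t + \Delta t \mid T \ge t)}{\Delta t}
                       = \lim_{\Delta t \to 0} \frac{P(t \le T < t + \Delta t)}{\Delta t \, S(t)}
                       = \frac{f(t)}{S(t)} \\
\text{(i) in (ii)}\quad h(t) &= -\frac{dS(t)/dt}{S(t)} = -\frac{d}{dt}\ln S(t) \\
\Rightarrow\quad S(t) &= \exp\left(-\int_0^t h(u)\,du\right)
\end{align*}
```

The last line follows by integrating $-\frac{d}{du}\ln S(u)$ from 0 to t and using $\ln S(0) = 0$.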
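Hazard ratio Cox proportional hazards model: $h(t \mid X) = h_0(t) \exp(\beta_1 X_1 + \dots + \beta_p X_p)$ • $h_0(t)$: baseline hazard rate • X: vector of explanatory variables • $\exp(\beta_i)$: hazard ratio for the coefficient $\beta_i$ • Ratio between the predicted hazard rates of two individuals who differ by 1 unit in the variable $X_i$, all other variables being equal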
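A small sketch of why the hazard ratio for a one-unit difference in one variable equals the exponential of that variable's coefficient in a Cox proportional hazards model. The coefficients, baseline hazard value and covariate vectors are hypothetical numbers chosen for illustration only.

```python
import math
from typing import Sequence


def cox_hazard(baseline_hazard: float,
               coefficients: Sequence[float],
               covariates: Sequence[float]) -> float:
    """h(t | X) = h0(t) * exp(sum_i beta_i * X_i), evaluated at one fixed t."""
    linear_predictor = sum(b * x for b, x in zip(coefficients, covariates))
    return baseline_hazard * math.exp(linear_predictor)


# Hypothetical values, not estimates from any data set.
beta = [0.7, -0.3]       # coefficients beta_1, beta_2
h0 = 0.02                # baseline hazard h0(t) at some fixed time t
person_a = [1.0, 2.5]    # X_1 = 1
person_b = [0.0, 2.5]    # X_1 = 0, same X_2

hazard_ratio = cox_hazard(h0, beta, person_a) / cox_hazard(h0, beta, person_b)
print(hazard_ratio, math.exp(beta[0]))  # both equal exp(beta_1)
```

The baseline hazard cancels in the ratio, so the result does not depend on t or on the value chosen for `h0`.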
Basic descriptive measures • Group mean of the observed times (ignoring censorship) • Median (t for which $\hat{S}(t) = 0.5$) • Average hazard rate: $\bar{h} = \frac{\text{number of failures}}{\sum_{i=1}^{n} t_i}$
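A sketch of two of these measures on a tiny hypothetical sample, assuming the average hazard rate is defined as the number of failures divided by the sum of observed times. The median is omitted because it requires an estimate of S(t). The data values are invented for illustration.

```python
# Hypothetical observed times (e.g. weeks) and status flags:
# status 1 = failure observed, status 0 = censored.
times = [6, 7, 10, 13, 16]
status = [1, 0, 1, 1, 0]

# Group mean of the observed times, ignoring which ones are censored.
mean_time = sum(times) / len(times)

# Average hazard rate: failures per unit of observed time.
n_failures = sum(status)
average_hazard = n_failures / sum(times)

print(f"group mean: {mean_time:.2f}")
print(f"average hazard rate: {average_hazard:.4f} failures per time unit")
```

Because censored times are lower bounds, the group mean computed this way tends to underestimate the true mean survival time.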
Goals (of survival analysis) • to estimate and interpret survivor and/or hazard functions; • to compare survivor and/or hazard functions; • to assess the relationship of explanatory variables to survival times -> we need mathematical modelling (Cox model).
Computer layout Layout for multivariate data with p explanatory variables:
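The table from this slide did not survive extraction. As a hedged sketch, one common layout stores one row per individual with the observed time t, a status indicator δ (1 = failed, 0 = censored) and the p explanatory variables. The field names and all values below are hypothetical.

```python
from typing import List, NamedTuple


class Record(NamedTuple):
    individual: int
    time: float              # observed survival time t
    status: int              # delta: 1 = failed, 0 = censored
    covariates: List[float]  # X_1, ..., X_p


# Hypothetical data with p = 2 explanatory variables.
data = [
    Record(1, 6.0, 1, [1.0, 2.31]),
    Record(2, 9.0, 0, [1.0, 2.80]),
    Record(3, 13.0, 1, [0.0, 2.88]),
]

p = len(data[0].covariates)
print(f"n = {len(data)} individuals, p = {p} explanatory variables")
for row in data:
    print(row.individual, row.time, row.status, *row.covariates)
```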
Notation & terminology • Unordered data (censored and failed t's) are rearranged into ordered failure times t(i) • Ordered failures: $t_{(1)} < t_{(2)} < \dots < t_{(k)}$ • Frequency counts: • $m_i$ = # individuals who failed at $t_{(i)}$ • $q_i$ = # individuals censored in $[t_{(i)}, t_{(i+1)})$ • Risk set $R(t_{(i)})$: collection of individuals who have survived at least until time $t_{(i)}$
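The notation above can be sketched in code. The function name `analysis_layout` and the sample data are hypothetical; the function returns, for each ordered failure time $t_{(i)}$, the size of the risk set together with the counts $m_i$ and $q_i$.

```python
from typing import List, Sequence, Tuple


def analysis_layout(times: Sequence[float],
                    status: Sequence[int]) -> List[Tuple[float, int, int, int]]:
    """Return rows (t_i, n_i, m_i, q_i) for each distinct failure time.

    n_i = size of the risk set R(t_i), i.e. individuals with time >= t_i
    m_i = number of failures at t_i
    q_i = number censored in [t_i, t_(i+1))
    Observations censored before the first failure time are not counted.
    """
    failure_times = sorted({t for t, d in zip(times, status) if d == 1})
    rows = []
    for idx, t_i in enumerate(failure_times):
        # Upper end of the interval: next failure time, or infinity for the last one.
        upper = failure_times[idx + 1] if idx + 1 < len(failure_times) else float("inf")
        n_i = sum(1 for t in times if t >= t_i)
        m_i = sum(1 for t, d in zip(times, status) if d == 1 and t == t_i)
        q_i = sum(1 for t, d in zip(times, status) if d == 0 and t_i <= t < upper)
        rows.append((t_i, n_i, m_i, q_i))
    return rows


# Hypothetical sample: status 1 = failed, 0 = censored.
times = [6, 6, 6, 7, 9, 10, 10, 11, 13]
status = [1, 1, 1, 1, 0, 1, 0, 0, 1]
for row in analysis_layout(times, status):
    print(row)
```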
Example: Leukaemia remission Extended Remission Data containing: • two groups of leukaemia patients: treatment & placebo; • log WBC values of each individual (WBC: white blood cell count). Expected behaviour: the higher the WBC value, the lower the expected survival time.
Example: Analysis layout • Analysis layout for treatment group:
Example: Confounding • Confounding of treatment effect by log WBC • Log WBC suggests: Treatment group survives longer simply because of lower WBC values • Controlling for WBC necessary
Example: Conclusion • Need to consider confounding and interaction; • basic problem: comparing survival of the two groups after adjusting for confounding and interaction; • problem can be extended to the multivariate case by adding additional explanatory variables.
Summary • Survival analysis encompasses a variety of methods for analysing the timing of events; • problem of censoring: exact survival time unknown; • the mixture of complete and incomplete observations is what distinguishes survival data from other statistical data.
Summary • Relationship between S(t) and h(t): $S(t) = \exp\left(-\int_0^t h(u)\,du\right)$, $h(t) = -\frac{dS(t)/dt}{S(t)}$ • Goals: • Estimation & interpretation of S(t) and h(t) • Comparison of different S(t) and h(t) • Assessment of the relationship of explanatory variables to survival time