Statistics for Clinicians
Biostatistics course by Kevin E. Kip, Ph.D., FAHA
Professor and Executive Director, Research Center, University of South Florida, College of Nursing
Professor, College of Public Health, Department of Epidemiology and Biostatistics
Associate Member, Byrd Alzheimer’s Institute, Morsani College of Medicine
Tampa, FL, USA
SECTION 6.6 Introduction to Survival Analysis
Learning Outcome: Recognize concepts and methods used in survival analysis
Survival Analysis
• A technique to estimate the probability of “survival” (and also risk of disease) that takes into account incomplete subject follow-up.
• Calculates risks over a time period with changing incidence rates.
• Wide application in a variety of disciplines, such as engineering.
Survival Analysis
• With the Kaplan-Meier method (“product-limit method”), survival probabilities are calculated at each time interval in which an event occurs.
• The cumulative survival over the entire follow-up period is derived from the product of all interval survival probabilities.
• Cumulative incidence (risk) is the complement of cumulative survival.
K-M formula:
$$S = \prod_{k=1}^{K} \frac{N_k - A_k}{N_k}$$
Where:
K = number of time intervals
k = sequence of time interval
N_k = number of subjects at risk in interval k
A_k = number of outcome events in interval k
Survival Analysis
• With the Kaplan-Meier method, subjects with incomplete follow-up (FU) are “censored” at their last known follow-up time.
• An important assumption (often not upheld) is that censoring is “non-informative”: the survival experience of censored subjects is the same as that of subjects with complete FU.
• Non-fatal outcomes can also be studied.
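The product-limit calculation can be sketched in a few lines of Python. This is a minimal illustration, not the course worksheet: the function name `kaplan_meier` and the follow-up times are hypothetical, chosen only to mimic a 2-year study of 10 subjects with 4 deaths and 2 censored subjects. A subject censored at a given time is assumed to still be at risk for an event at that same time.

```python
from collections import Counter

def kaplan_meier(times, events):
    """Return [(time, at_risk, deaths, cumulative_survival)] for each event time.

    times  : follow-up time of each subject
    events : 1 if the outcome occurred at that time, 0 if the subject was censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    survival = 1.0
    table = []
    for t in sorted(deaths):
        # N_k: subjects still under observation at time t
        at_risk = sum(1 for follow_up in times if follow_up >= t)
        a_k = deaths[t]  # A_k: outcome events at time t
        survival *= (at_risk - a_k) / at_risk
        table.append((t, at_risk, a_k, survival))
    return table

# Hypothetical follow-up times in months (NOT the slide's worksheet data):
# 4 deaths, 2 subjects censored early, 4 subjects alive at 24 months.
times = [5, 8, 10, 14, 17, 20, 24, 24, 24, 24]
events = [1, 0, 1, 1, 0, 1, 0, 0, 0, 0]

for t, n, a, s in kaplan_meier(times, events):
    print(f"month {t:>2}: at risk = {n}, deaths = {a}, S = {s:.4f}")

two_year_survival = kaplan_meier(times, events)[-1][3]
print(f"2-year survival = {two_year_survival:.2f}, "
      f"2-year risk of death = {1 - two_year_survival:.2f}")
```

With these made-up times the cumulative survival is 0.9 × 7/8 × 6/7 × 4/5 = 0.54, so the estimated 2-year risk of death is 0.46 rather than the crude 4/10, because the censored subjects leave the risk set before later deaths occur.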
Survival Analysis
• The Life-Table method is conceptually similar to the Kaplan-Meier method.
• The primary difference is that survival probabilities are determined at pre-determined intervals (e.g., years), rather than when events occur.
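For comparison, a life-table calculation over fixed intervals might look like the sketch below. The slides do not say how subjects who withdraw within an interval are handled; the code assumes the common actuarial convention of counting each withdrawal as at risk for half the interval. The function name `life_table` and the yearly counts are hypothetical.

```python
def life_table(intervals):
    """Cumulative survival at the end of each fixed interval.

    intervals: list of (n_at_start, deaths, withdrawals), one tuple per interval.
    """
    survival = 1.0
    cumulative = []
    for n_start, deaths, withdrawals in intervals:
        # Assumption (not from the slides): a withdrawal contributes
        # half an interval of time at risk.
        effective_n = n_start - withdrawals / 2
        survival *= 1 - deaths / effective_n
        cumulative.append(survival)
    return cumulative

# Hypothetical yearly counts: (at risk at start of year, deaths, withdrawals)
yearly_counts = [(10, 2, 1), (7, 2, 1)]
for year, s in enumerate(life_table(yearly_counts), start=1):
    print(f"year {year}: cumulative survival = {s:.4f}")
```

The structure matches the Kaplan-Meier product; only the definition of the intervals and the at-risk denominator differ.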
SECTION 6.7 Calculation and Interpretation of Survival Analysis Estimates
Learning Outcome: Calculate and interpret survival analysis estimates of incidence
Survival Analysis Example:
• Assume a study of 10 subjects conducted over a 2-year period.
• A total of 4 subjects die.
• Another 2 subjects have incomplete follow-up (study withdrawal or late study entry).
What is the probability of 2-year survival, and the corresponding risk of 2-year death?
Survival Analysis (Practice) Example:
• Assume a study of 12 subjects conducted over a 3-year period.
• A total of 5 subjects die.
• Another 2 subjects have incomplete follow-up (study withdrawal or late study entry).
What is the probability of 3-year survival, and the corresponding risk of 3-year death?
Complete the worksheet below What is the probability of 3-year survival, and the corresponding risk of 3-year death? Survival _______ Death _________
Worksheet solution: What is the probability of 3-year survival, and the corresponding risk of 3-year death? Survival = 0.5346; Death = 1 - 0.5346 = 0.4654
SECTION 6.8 Logistic Regression Model
Learning Outcome: Recognize components and interpret parameters from the logistic regression model
Logistic Regression Analysis
• Conceptually similar to linear regression, but with a dichotomous outcome.
• Outcome is usually coded as “0” or “1”, with “1” referring to presence of the outcome of interest (although SAS by default models the lower value, 0).
• p represents the probability that the outcome is present (i.e., value of 1), given the particular covariate values of an individual.
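Logistic Regression Analysis
• The multiple logistic regression model can be written in different ways:
$$p = \frac{\exp(b_0 + b_1x_1 + b_2x_2 + \dots + b_px_p)}{1 + \exp(b_0 + b_1x_1 + b_2x_2 + \dots + b_px_p)}$$
$$\ln\left(\frac{p}{1-p}\right) = b_0 + b_1x_1 + b_2x_2 + \dots + b_px_p$$
where:
p = expected probability that the outcome is present
x_1 through x_p = independent variables
b_0 through b_p = regression coefficients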
Logistic Regression Analysis
b_i = change in the expected log odds of the outcome for a 1-unit change in x_i, holding the other predictors constant
The anti-log of a regression coefficient, exp(b_i), gives the odds ratio
Logistic Regression Analysis
Example: Estimate the risk of incident CVD among persons defined as obese.
$$\ln\left(\frac{p}{1-p}\right) = b_0 + b_1x_1 + b_2x_2 + \dots + b_px_p$$
$$\ln\left(\frac{p}{1-p}\right) = -2.367 + 0.658\,(\text{Obesity}) = \text{log odds}$$
exp(0.658) = 1.93 (odds ratio)
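The obesity example can be checked numerically with a short Python sketch. It uses the two coefficients shown on the slide and assumes obesity is coded 1 = obese, 0 = not obese (the coding is not stated on the slide); the helper name `predicted_probability` is illustrative.

```python
import math

# Coefficients from the slide: intercept and obesity
b0, b_obesity = -2.367, 0.658

def predicted_probability(log_odds):
    """Convert log odds to a probability: odds / (1 + odds)."""
    odds = math.exp(log_odds)
    return odds / (1 + odds)

# Assumed coding: obesity = 1 if obese, 0 otherwise
odds_ratio = math.exp(b_obesity)
p_obese = predicted_probability(b0 + b_obesity * 1)
p_not_obese = predicted_probability(b0 + b_obesity * 0)

print(f"odds ratio (obese vs. not obese) = {odds_ratio:.2f}")
print(f"predicted risk of CVD, obese     = {p_obese:.3f}")
print(f"predicted risk of CVD, not obese = {p_not_obese:.3f}")
```

The odds ratio compares the odds in the two groups; the predicted risks themselves come from applying odds / (1 + odds) to each group's log odds.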
Example: Estimate the log odds of being on a statin drug in relation to the predictors listed below.
$$\ln\left(\frac{p}{1-p}\right) = b_0 + b_1x_1 + b_2x_2 + \dots + b_px_p$$
Write out the logistic regression equation below. (Practice)
$$\ln\left(\frac{p}{1-p}\right) = $$
Example: Estimate the log odds of being on a statin drug in relation to the predictors listed below.
$$\ln\left(\frac{p}{1-p}\right) = b_0 + b_1x_1 + b_2x_2 + \dots + b_px_p$$
Write out the logistic regression equation below.
$$\ln\left(\frac{p}{1-p}\right) = -3.065 + 0.036\,(\text{age}) - 0.53\,(\text{female}) + 0.029\,(\text{BMI}) - 0.001\,(\text{physical activity}) + 1.067\,(\text{diabetes})$$
$$\ln\left(\frac{p}{1-p}\right) = b_0 + b_1x_1 + b_2x_2 + \dots + b_px_p$$
So, the predicted odds of an individual being on a statin drug
= exp[-3.065 + 0.036(age) - 0.53(female) + 0.029(BMI) - 0.001(physical activity) + 1.067(diabetes)]
and
Predicted probability = predicted odds / (1 + predicted odds).
Estimate the predicted odds and probability of an individual being on a statin drug with the following characteristics: Age = 55; male; BMI = 31.4; physical activity level = 2; diabetic
Predicted odds = exp[-3.065 + 0.036(55) - 0.53(0) + 0.029(31.4) - 0.001(2) + 1.067(1)] = exp(0.8906) = 2.437
Predicted probability = predicted odds / (1 + predicted odds) = 2.437 / 3.437 = 0.71
PRACTICE: Estimate the predicted odds and probability of an individual being on a statin drug with the following characteristics: Age = 52; female; BMI = 29.5; physical activity level = 3; non-diabetic
Predicted odds =
Predicted probability = predicted odds / (1 + predicted odds) =
Estimate the predicted odds and probability of an individual being on a statin drug with the following characteristics: Age = 52; female; BMI = 29.5; physical activity level = 3; non-diabetic
Predicted odds = exp[-3.065 + 0.036(52) - 0.53(1) + 0.029(29.5) - 0.001(3) + 1.067(0)] = exp(-0.8705) = 0.42
Predicted probability = predicted odds / (1 + predicted odds) = 0.42 / 1.42 = 0.296
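The two hand calculations above follow the same recipe, which can be written as a small Python function for checking the arithmetic. This is a sketch using the rounded coefficients shown on the slides; the names `COEF` and `predict_statin` are illustrative, and female and diabetes are assumed to be coded 1 = yes, 0 = no, as in the worked examples.

```python
import math

# Rounded coefficients as shown on the slides
COEF = {
    "intercept": -3.065,
    "age": 0.036,
    "female": -0.53,
    "bmi": 0.029,
    "physical_activity": -0.001,
    "diabetes": 1.067,
}

def predict_statin(age, female, bmi, physical_activity, diabetes):
    """Return (log odds, odds, probability) of being on a statin drug."""
    log_odds = (
        COEF["intercept"]
        + COEF["age"] * age
        + COEF["female"] * female
        + COEF["bmi"] * bmi
        + COEF["physical_activity"] * physical_activity
        + COEF["diabetes"] * diabetes
    )
    odds = math.exp(log_odds)
    return log_odds, odds, odds / (1 + odds)

# The two individuals from the slides
for label, person in [
    ("55-year-old diabetic male", (55, 0, 31.4, 2, 1)),
    ("52-year-old non-diabetic female", (52, 1, 29.5, 3, 0)),
]:
    log_odds, odds, prob = predict_statin(*person)
    print(f"{label}: log odds = {log_odds:.4f}, odds = {odds:.3f}, p = {prob:.3f}")
```

Because the coefficients are rounded to three decimals, results can differ slightly in the last digit from software output based on unrounded estimates.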
Example: Estimate the log odds of being on a statin drug in relation to the predictors listed below.
Produce odds ratio estimates of statin use for the following (Practice):
Age (per year) =
Age (per 10 years) =
Female gender =
History of diabetes =
Example: Estimate the log odds of being on a statin drug in relation to the predictors listed below.
Produce odds ratio estimates of statin use for the following:
Age (per year) = exp(0.036) = 1.04
Age (per 10 years) = exp(10 x 0.036) = 1.43
Female gender = exp(-0.530) = 0.59
History of diabetes = exp(1.067) = 2.91
Example: Estimate the log odds of being on a statin drug in relation to the predictors listed below.
Interpret odds ratio estimates of statin use for the following:
Age (per 10 years) = exp(10 x 0.036) = 1.43
History of diabetes = exp(1.067) = 2.91
Example: Estimate the log odds of being on a statin drug in relation to the predictors listed below.
Interpret odds ratio estimates of statin use for the following:
Age (per 10 years) = exp(10 x 0.036) = 1.43
For every 10-year increase in age, the adjusted odds of being on a statin drug increase 1.43-fold, holding the other predictors constant.
History of diabetes = exp(1.067) = 2.91
Persons with diabetes have 2.91 times the adjusted odds of being on a statin drug compared to persons without diabetes.
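The odds ratios above all come from exponentiating a coefficient, multiplied first by the number of units of change being described. A minimal Python sketch, with the hypothetical helper name `odds_ratio`:

```python
import math

def odds_ratio(coefficient, units=1):
    """Odds ratio for a change of `units` in a predictor: exp(units * b)."""
    return math.exp(units * coefficient)

print(f"Age, per year:       {odds_ratio(0.036):.2f}")
print(f"Age, per 10 years:   {odds_ratio(0.036, units=10):.2f}")
print(f"Female gender:       {odds_ratio(-0.530):.2f}")
print(f"History of diabetes: {odds_ratio(1.067):.2f}")
```

Note that the 10-year odds ratio is the 1-year odds ratio raised to the 10th power, not multiplied by 10, because the model is linear on the log odds scale.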
SECTION 6.9 SPSS for Logistic Regression Analysis
Learning Outcome: Use SPSS to fit and interpret a logistic regression model
SPSS: Analyze > Regression > Binary Logistic
Specify the Dependent Variable and the Covariates.
SPSS: Analyze > Descriptive Statistics > Crosstabs
Row = Hx diabetes; Col = Statin use
Odds Ratio = (odds of exposure among cases) / (odds of exposure among controls) = (17 / 88) / (24 / 372) = 0.193 / 0.0645 = 2.99
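The crude odds ratio from the crosstab can be reproduced with a few lines of Python. The cell labels below are an interpretation of the slide's arithmetic (diabetes as the exposure, statin use defining cases and controls), not labels taken from the SPSS output, and the function name `odds_ratio_2x2` is illustrative.

```python
def odds_ratio_2x2(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Crude odds ratio: odds of exposure among cases / odds of exposure among controls."""
    odds_cases = exposed_cases / unexposed_cases
    odds_controls = exposed_controls / unexposed_controls
    return odds_cases / odds_controls

# Assumed reading of the crosstab cells:
#   17 = diabetic, on statin        88 = non-diabetic, on statin
#   24 = diabetic, not on statin   372 = non-diabetic, not on statin
crude_or = odds_ratio_2x2(17, 88, 24, 372)
print(f"crude odds ratio = {crude_or:.2f}")
```

This unadjusted estimate (2.99) is close to, but not the same as, the odds ratio for diabetes from the multivariable model (2.91), which is adjusted for the other predictors.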