Lab 9: Survival Analysis
Henian Chen, M.D., Ph.D.
Applied Epidemiologic Analysis - P8400 Fall 2002
Description of Data
‘SURVIVAL65.TXT’ contains data from a study on multiple myeloma in which researchers treated 65 patients with alkylating agents. Of those patients, 48 died during the study and 17 survived. The goal of the study is to identify important prognostic factors.
TIME      survival time in months from diagnosis
STATUS    1 = dead, 0 = alive (censored)
LOGBUN    log blood urea nitrogen (BUN) at diagnosis
HGB       hemoglobin at diagnosis
PLATELET  platelets at diagnosis: 0 = abnormal, 1 = normal
AGE       age at diagnosis in years
LOGWBC    log WBC at diagnosis
FRACTURE  fractures at diagnosis: 0 = none, 1 = present
LOGPBM    log percentage of plasma cells in bone marrow
PROTEIN   proteinuria at diagnosis
CALCIUM   serum calcium at diagnosis
LIFETEST procedure
Estimates the distribution of the survival times by nonparametric methods:
1. Kaplan-Meier method (also called the product-limit method)
2. Life table method
Kaplan-Meier Estimates for Total Sample
    /* read the tab-delimited file into the data set SURVIVAL65 */
    proc import datafile='a:survival65.txt' out=survival65 dbms=tab replace;
    run;
    proc lifetest data=survival65 method=km plots=(s,lls);
      title 'Distribution of Survival Times for 65 Myeloma Patients by Kaplan-Meier Method';
      time time*status(0);   /* status=0 marks censored observations */
    run;
METHOD=KM or PL: Kaplan-Meier (KM), also called product-limit (PL), estimates
METHOD=LT or LIFE: life table estimates
By default, METHOD=PL.
PLOTS=S: plot the survival curve, i.e. the estimated survival distribution function (SDF) against time
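To make the product-limit idea concrete, here is a minimal Python sketch of the Kaplan-Meier calculation. It is an illustration only, not what PROC LIFETEST runs internally; the function name `kaplan_meier` and the five-patient toy data are hypothetical and do not come from SURVIVAL65.TXT. It uses the lab's coding (1 = dead, 0 = censored).

```python
from collections import Counter

def kaplan_meier(times, status):
    """Return [(t, S(t))] at each distinct death time.

    status follows the lab's coding: 1 = dead (event), 0 = alive (censored).
    """
    deaths = Counter(t for t, s in zip(times, status) if s == 1)
    survival = 1.0
    curve = []
    for t in sorted(deaths):
        # patients still under observation just before time t
        at_risk = sum(1 for u in times if u >= t)
        # multiply by the conditional probability of surviving past t
        survival *= 1.0 - deaths[t] / at_risk
        curve.append((t, survival))
    return curve

# Hypothetical toy data (NOT from SURVIVAL65.TXT): months and status
toy_time = [2, 3, 3, 5, 8]
toy_status = [1, 1, 0, 1, 0]
km_curve = kaplan_meier(toy_time, toy_status)
for t, s in km_curve:
    print(f"t={t}  S(t)={s:.3f}")
```

Censored patients never trigger a drop in the curve; they only leave the risk set, which is why the estimate uses their follow-up time without treating them as deaths.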
PLOTS=LLS: plot log[-log(estimated SDF)] against log(time) to show the distribution of the survival times.
Exponential distribution: the hazard function is constant and does not depend on time; the graph is approximately a straight line with slope 1.
Weibull distribution: the hazard function changes with time; the graph is approximately a straight line, but the slope is not 1.
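The slope rule can be checked numerically. The sketch below assumes the parameterization S(t) = exp(-(rate*t)^shape), under which log[-log S(t)] = shape*log(rate) + shape*log(t), so the slope against log(time) equals the shape parameter. The function name `lls_slope` and the parameter values are illustrative assumptions.

```python
import math

def lls_slope(shape, rate, t1=1.0, t2=10.0):
    """Slope of log(-log S(t)) against log(t) for S(t) = exp(-(rate*t)**shape)."""
    def lls(t):
        surv = math.exp(-((rate * t) ** shape))
        return math.log(-math.log(surv))
    return (lls(t2) - lls(t1)) / (math.log(t2) - math.log(t1))

# shape = 1 is the exponential distribution (constant hazard)
exponential_slope = lls_slope(shape=1.0, rate=0.05)
# shape != 1 is a Weibull distribution whose hazard changes with time
weibull_slope = lls_slope(shape=1.5, rate=0.05)
print(round(exponential_slope, 6), round(weibull_slope, 6))
```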
Life Table Estimates for Total Sample
    proc lifetest data=survival65 method=lt plots=(s,lls) width=10;   /* 10-month intervals */
      title 'Distribution of Survival Times for 65 Myeloma Patients by Life Table Method';
      time time*status(0);
    run;
Comparison of Two Survival Curves for Normal Platelet and Abnormal Platelet by Kaplan-Meier
    proc lifetest data=survival65 method=km plots=(s,lls);
      time time*status(0);
      strata platelet;   /* one survival curve per platelet group */
    run;
Log-rank test: suited to survival times that follow a Weibull distribution or satisfy the proportional hazards assumption. It uses weight = 1, so each failure time is weighted equally.
Wilcoxon test: suited to lognormal survival times. It uses weight = the total number at risk at each failure time, so earlier times receive greater weight than later times, placing less emphasis on the later failure times.
-2Log(LR): likelihood ratio test, which assumes that the survival times follow an exponential distribution.
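The only difference between the log-rank and Wilcoxon statistics is the weight given to each failure time. The following Python sketch shows the standard two-group weighted statistic with 1 degree of freedom; it is a simplified illustration, not the LIFETEST implementation. The function name `weighted_two_group_test` and the four-patient toy data are hypothetical.

```python
def weighted_two_group_test(times, status, group, weight):
    """Chi-square statistic (1 df) comparing group 1 with group 0.

    weight(n_at_risk) gives the weight at each death time:
    constant 1 for log-rank, n_at_risk for Wilcoxon.
    """
    death_times = sorted({t for t, s in zip(times, status) if s == 1})
    score = 0.0
    variance = 0.0
    for t in death_times:
        n = sum(1 for u in times if u >= t)                          # at risk, both groups
        n1 = sum(1 for u, g in zip(times, group) if u >= t and g == 1)
        d = sum(1 for u, s in zip(times, status) if u == t and s == 1)
        d1 = sum(1 for u, s, g in zip(times, status, group)
                 if u == t and s == 1 and g == 1)
        w = weight(n)
        score += w * (d1 - d * n1 / n)                               # observed minus expected
        if n > 1:
            variance += w * w * d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    return score * score / variance

# Hypothetical toy data (NOT from SURVIVAL65.TXT)
toy_time = [1, 2, 3, 4]
toy_status = [1, 1, 1, 1]
toy_group = [1, 0, 1, 0]

logrank_chi2 = weighted_two_group_test(toy_time, toy_status, toy_group, lambda n: 1)
wilcoxon_chi2 = weighted_two_group_test(toy_time, toy_status, toy_group, lambda n: n)
print(logrank_chi2, wilcoxon_chi2)
```

Because the number at risk shrinks over time, the Wilcoxon weights shrink with it, which is how early differences between the curves come to dominate that test.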
PHREG procedure
Procedure PHREG performs regression analysis of survival data based on the Cox proportional hazards model. It also performs conditional logistic regression analysis for matched case-control studies.
SAS Program for Cox Model
    proc import datafile='a:survival65.txt' out=survival65 dbms=tab replace;
    run;
    proc phreg data=survival65;
      /* stepwise selection; DETAILS prints each step; RL requests confidence limits for the hazard ratios */
      model time*status(0) = logbun hgb platelet age logwbc fracture logpbm protein calcium
            / selection=stepwise details rl;
    run;
TIME: survival time in months from diagnosis
STATUS: 1 = dead, 0 = alive (censored)
Cox Regression Model
    model time*status(0) = nine independent variables;
Linear Regression Model
    model time = nine independent variables;
Can we fit a linear regression model to these data? NO!
• We don’t know the distribution of the survival times.
• A linear regression model treats censored observations as if they were uncensored.
Logistic Regression Model
    model status = nine independent variables;
Can we fit a logistic regression model to these data? NO!
Logistic regression is not appropriate for survival data because it treats different strata (different time points) as one stratum (the last time point).
It is not right to ignore the “time” information when you have it. Use logistic regression only when “survival time” is not available.
Patient A: time=10 months, status=1 (dead)
id  time  status  sex    age
A      0       0    F  30.00
A      1       0    F  30.08
A      2       0    F  30.17
A      3       0    F  30.25
A      4       0    F  30.33
A      5       0    F  30.42
A      6       0    F  30.50
A      7       0    F  30.58
A      8       0    F  30.67
A      9       0    F  30.75
A     10       1    F  30.83
Logistic regression: one stratum
Survival analysis: 11 strata
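The table for Patient A can be generated mechanically, which makes the "11 strata" point explicit: survival analysis sees one record per time point, while logistic regression would see only the last one. This is a minimal Python sketch; the function name `expand_to_person_months` is hypothetical, and the starting age of 30.00 is read off the table.

```python
def expand_to_person_months(patient_id, months, final_status, sex, age_at_start):
    """One row per month of follow-up; the event appears only in the last row."""
    rows = []
    for month in range(months + 1):
        rows.append({
            "id": patient_id,
            "time": month,
            "status": final_status if month == months else 0,
            "sex": sex,
            "age": round(age_at_start + month / 12, 2),  # age advances one month per row
        })
    return rows

patient_a = expand_to_person_months("A", 10, 1, "F", 30.0)
for row in patient_a:
    print(row["id"], row["time"], row["status"], row["sex"], f'{row["age"]:.2f}')
```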
LIFEREG procedure
Procedure LIFEREG fits parametric models to survival data by maximum likelihood.
If you have a clear idea of the distribution of the survival times, you should use a parametric model.
SAS Program for 7 Parametric Models (using the SURVIVAL65.TXT data)
    proc lifereg data=survival65;
      class platelet fracture;   /* treat as categorical predictors */
      model time*status(0) = logbun hgb platelet age logwbc fracture logpbm protein calcium
            / distribution=weibull;
    run;
Values for DISTRIBUTION=:
WEIBULL      Weibull distribution
EXPONENTIAL  exponential distribution
GAMMA        generalized gamma distribution
LLOGISTIC    loglogistic distribution
LNORMAL      lognormal distribution
LOGISTIC     logistic distribution
NORMAL       normal distribution
Results of the Weibull Regression Model
Parameter      DF  Estimate  Std Error   95% Confidence Limits  Chi-Square  Pr > ChiSq
Intercept       1    7.0556     2.7719     1.6228    12.4883          6.48      0.0109
LOGBUN          1   -1.5325     0.5412    -2.5932    -0.4718          8.02      0.0046
HGB             1    0.0970     0.0621    -0.0248     0.2187          2.44      0.1185
PLATELET        1   -0.2655     0.4557    -1.1585     0.6276          0.34      0.5602
AGE             1    0.0103     0.0169    -0.0228     0.0434          0.37      0.5411
LOGWBC          1   -0.4008     0.5989    -1.5746     0.7729          0.45      0.5033
FRACTURE        1    0.3324     0.3526    -0.3587     1.0234          0.89      0.3459
LOGPBM          1   -0.3871     0.4290    -1.2280     0.4538          0.81      0.3670
PROTEIN         1   -0.0092     0.0228    -0.0540     0.0356          0.16      0.6862
CALCIUM         1   -0.0998     0.0898    -0.2758     0.0761          1.24      0.2661
Scale           1    0.8671     0.0927     0.7032     1.0694
Weibull Shape   1    1.1532     0.1233     0.9351     1.4222
-- LIFEREG models the log of survival time: an increase of one unit in LOGBUN decreases the log of survival time by 1.5325, controlling for the other variables, so expected survival time is multiplied by exp(-1.5325) = 0.216.
-- On the hazard scale, the Weibull coefficient is -(estimate)/scale = 1.5325/0.8671 = 1.7674: an increase of one unit in LOGBUN increases the log of the hazard of dying by 1.7674, i.e. it increases the hazard of dying by about 486% [exp(1.7674) = 5.86; 5.86 - 1 = 4.86].
* The coefficients of parametric (LIFEREG) models are expected to have signs opposite to those of the Cox model, because they describe survival time rather than the hazard.
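LIFEREG reports coefficients on the log survival time scale. For a Weibull model, the usual conversion to the hazard scale is log hazard ratio = -(estimate)/scale. The Python sketch below applies that conversion to the LOGBUN estimate and the Scale value from the table; the variable names are illustrative, and the inputs are the rounded values printed in the output.

```python
import math

logbun_estimate = -1.5325  # LIFEREG estimate for LOGBUN (log survival time scale)
scale = 0.8671             # LIFEREG Scale parameter for the Weibull model

# multiplicative effect on survival time of a one-unit increase in LOGBUN
time_ratio = math.exp(logbun_estimate)

# Weibull only: convert the log-time coefficient to a log hazard ratio
log_hazard_ratio = -logbun_estimate / scale
hazard_ratio = math.exp(log_hazard_ratio)

# the "Weibull Shape" row of the table is the reciprocal of Scale
weibull_shape = 1.0 / scale

print(f"time ratio       = {time_ratio:.3f}")
print(f"log hazard ratio = {log_hazard_ratio:.4f}")
print(f"hazard ratio     = {hazard_ratio:.2f}")
print(f"Weibull shape    = {weibull_shape:.4f}")
```

The sign flip in the conversion is the reason the parametric coefficients and the Cox coefficients point in opposite directions for the same predictor.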