Explore the application of survival analysis, the Cox PH model, time to event concept, limitations of conventional models, censoring types, and the Cox model's interpretation challenges and solutions. Gain insights into estimating treatment effects and the importance of the auxiliary model in survival analysis.
Survival time treatment effects Ting-Ting Chung 2017/09/12
Outline • When to use survival analysis • Introduction to the Cox PH model • Problems with the Cox PH model • Auxiliary model • IPW • IPWRA
When to use survival analysis • Time to event (the scale need not literally be time) • Will smoking reduce the time to a second heart attack among men aged 45-55 who have already had a heart attack? • What is the effect of participation in a supported work program after release from prison on the time until a subsequent arrest?
Survival analysis concepts • Time to event (the scale need not literally be time) • Censoring • Time origin
Comparison with conventional models • Logistic regression • Ignores the timing of events • Can't handle time-dependent variables • Linear regression • Can't handle censoring or time-dependent variables • Time-to-event data often follow a non-normal distribution
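Data type • Censoring • Right (SAS procedures: LIFETEST, PHREG) • The observation is terminated before the event occurs • Left (SAS procedures: ICLIFETEST, ICPHREG) • The subject experienced the event before the start of follow-up • Interval (SAS procedures: ICLIFETEST, ICPHREG) • The survival time is known only to lie between t and t+k [Figure: follow-up timelines for subjects B, C, E, and F between start of study and end of study, marking events, withdrawal or loss to follow-up, and times t and t+k]
Right censored • Type 1 • Subjects survive until the end of the study, which stops at a pre-specified time point • Type 2 • Subjects survive until the end of the study, which stops when a pre-specified number of events has occurred (e.g., a trial stops enrolling because enrollment has already exceeded expectations) • Random (a subject's departure is outside the investigator's control) • Uninformative, e.g., the subject moves away from Taiwan • Informative, e.g., in a drug trial the subject cannot tolerate the side effects and drops out; this calls for a sensitivity analysis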
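A minimal Python sketch of how right censoring shapes the recorded data, assuming each subject has an (unobservable) true event time and a follow-up cutoff. The names `Observation` and `observe` and all numbers are hypothetical and serve only as illustration.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    time: float   # observed follow-up time
    event: bool   # True if the event was seen, False if right-censored


def observe(event_time: float, censor_time: float) -> Observation:
    """Record whichever comes first: the event or the end of follow-up."""
    if event_time <= censor_time:
        return Observation(time=event_time, event=True)
    return Observation(time=censor_time, event=False)


# Hypothetical subjects: (true event time, time at which follow-up stops)
subjects = [observe(2.0, 5.0), observe(7.0, 5.0), observe(4.5, 3.0)]
for s in subjects:
    print(s)
```

For a censored subject the data only say that the survival time exceeds the recorded time, which is why linear or logistic regression on the raw values is not appropriate.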
Cox Proportional Hazards Model • The Cox model models the hazard, that is, the probability that the event will occur in the next moment given that it has not yet happened, as a function of covariates: $h_i(t \mid x_i) = h_0(t)\exp(x_i\beta)$, where $h_0(t)$ is the baseline hazard
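The hazard ratio • Example: smoking raises the hazard of a second heart attack by a factor of 1.5 relative to not smoking • Hazard ratio $= \frac{h(t \mid \text{smoke}=1, z)}{h(t \mid \text{smoke}=0, z)} = \frac{h_0(t)\exp(\beta_{\text{smoke}} + z\gamma)}{h_0(t)\exp(z\gamma)} = \exp(\beta_{\text{smoke}}) = 1.5$, where $z$ denotes the remaining covariates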
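The following Python sketch illustrates why the hazard ratio in a proportional-hazards model does not depend on time: the baseline hazard cancels in the ratio. The helper `cox_hazard`, the age coefficient, and the baseline values are assumptions made for illustration; the smoking coefficient is chosen so that the ratio equals the 1.5 quoted in the slides.

```python
import math


def cox_hazard(baseline_hazard: float, coefs: dict, x: dict) -> float:
    """Hazard under a proportional-hazards form: h0(t) * exp(x * beta)."""
    linear_predictor = sum(coefs[name] * x[name] for name in coefs)
    return baseline_hazard * math.exp(linear_predictor)


# Hypothetical coefficients; log(1.5) makes the smoking hazard ratio 1.5
coefs = {"smoke": math.log(1.5), "age": 0.02}
smoker = {"smoke": 1, "age": 50}
nonsmoker = {"smoke": 0, "age": 50}

# Try several baseline hazard values, standing in for different times t
ratios = [
    cox_hazard(h0, coefs, smoker) / cox_hazard(h0, coefs, nonsmoker)
    for h0 in (0.01, 0.05, 0.2)
]
print(ratios)
```

Whatever the baseline hazard is, the ratio stays at exp(beta_smoke), which is what "proportional hazards" means.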
Comparison with parametric models • Parametric models (exponential, Weibull, ...) • Distribution of survival time is known • The hazard function is completely specified • Semi-parametric models (Cox) • Distribution of survival time is unknown • The baseline hazard function is left unspecified
Stata command-1 • stset atime, failure(fail) (David M. Drukker, 2015)
Stata command-2 • stcox smoke age exercise diet, nolog noshow (David M. Drukker, 2015)
Two problems with the Cox model • It is hard to understand the units of the hazard ratio • How bad is it that smoking raises the hazard by a factor of 1.5? • This interpretation is only useful if the treatment enters the x term linearly • If the treatment is interacted with other covariates, the effect of the treatment varies over individuals (David M. Drukker, 2015)
Problems with the Cox model • An easier quantity to interpret is the average difference in time to second heart attack when everyone smokes instead of when no one smokes • For each individual, the effect of the treatment is a contrast of what would happen if the individual received the treatment versus what would happen if the individual did not receive the treatment • The hazard-ratio measure of the treatment effect is the ratio of the hazard of the smoking potential outcome to the hazard of the nonsmoking potential outcome (David M. Drukker, 2015)
Problems with the Cox model • Ratios of unconditional hazards are harder to estimate and more difficult to interpret than the average difference in time to second heart attack when everyone smokes instead of when no one smokes • The average treatment effect (ATE) cannot be observed directly in practice: a person cannot belong to both the smoking group and the nonsmoking group at the same time • Potential-outcome mean (POM) (David M. Drukker, 2015)
Problems with the Cox model • The "fundamental problem of causal inference" is that we only observe one of the potential outcomes • We can use the tricks of missing-data analysis to estimate treatment effects (David M. Drukker, 2015)
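In potential-outcome notation, the quantities named on this slide can be written as follows. The symbols $t_1$ (time to second heart attack if the person smokes) and $t_0$ (time if the person does not smoke) are introduced here for illustration and do not appear in the original slides.

```latex
\begin{aligned}
\mathrm{POM}_{\text{smoke}}    &= E[t_1], \\
\mathrm{POM}_{\text{nonsmoke}} &= E[t_0], \\
\mathrm{ATE} &= E[t_1 - t_0]
              = \mathrm{POM}_{\text{smoke}} - \mathrm{POM}_{\text{nonsmoke}}.
\end{aligned}
```

Only one of $t_1$ and $t_0$ is ever observed for a given person, so the ATE has to be estimated rather than computed subject by subject.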
Random-assignment case • If smoking were randomly assigned, the missing potential outcome would be missing completely at random • If the time to second heart attack was never censored and smoking was randomly assigned • The average time to second heart attack among smokers would estimate the smoking POM • The average time to second heart attack among nonsmokers would estimate the nonsmoking POM (David M. Drukker, 2015)
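A small simulation can make the random-assignment argument concrete. The sketch below assumes exponentially distributed potential outcomes with hypothetical means of 4.0 and 2.5 years, no censoring, and a fair coin deciding who smokes; none of these numbers come from the slides.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

MEAN_T0 = 4.0  # hypothetical mean time (years) if no one smoked
MEAN_T1 = 2.5  # hypothetical mean time (years) if everyone smoked

smoker_times, nonsmoker_times = [], []
for _ in range(20000):
    # Both potential outcomes exist, but only one is ever observed
    t0 = random.expovariate(1 / MEAN_T0)
    t1 = random.expovariate(1 / MEAN_T1)
    if random.random() < 0.5:      # random assignment to smoking
        smoker_times.append(t1)
    else:
        nonsmoker_times.append(t0)

pom_smoke = sum(smoker_times) / len(smoker_times)
pom_nonsmoke = sum(nonsmoker_times) / len(nonsmoker_times)
ate = pom_smoke - pom_nonsmoke
print(pom_smoke, pom_nonsmoke, ate)
```

Because assignment is independent of the potential outcomes, each group mean lands close to the corresponding POM, and their difference approximates the ATE.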
Random-assignment case • Instead of assuming that the treatment is randomly assigned, we assume that the treatment is as good as randomly assigned after conditioning on covariates • Formally, this assumption is known as conditional independence • The auxiliary model is how we condition on covariates so that the treatment is as good as randomly assigned (David M. Drukker, 2015)
Inverse-probability-weighted (IPW) estimators-1 • IPW estimators weight observations on the observed outcome variable by the inverse of the probability that it is observed to account for the missingness process • Observations that are not likely to contain missing data get a weight close to one; observations that are likely to contain missing data get a weight larger than one, potentially much larger (David M. Drukker, 2015)
Inverse-probability-weighted (IPW) estimators-2 • IPW estimators use estimates from models for the probability of treatment and the probability of censoring to correct for the missing potential outcome and the observations lost to censoring (David M. Drukker, 2015)
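stteffects ipw • First parentheses: the treatment variable and the variables that predict the probability of treatment • Second parentheses: the variables that predict the probability of censoring • stteffects ipw (smoke age exercise diet) (age exercise diet), nolog noshow • The average time to second heart attack is 1.7 years sooner when everyone in the population smokes instead of when no one smokes • The average time to second heart attack is 4.2 years when no one smokes (David M. Drukker, 2015)
teffects ipw versus stteffects ipw • First parentheses: outcome variable; second parentheses: treatment variable and the variables that predict treatment • teffects ipw (fail) (smoke age exercise diet, probit) • stteffects ipw (smoke age exercise diet) (age exercise diet), nolog noshow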
Inverse-probability-weighted regression-adjustment (IPWRA) • IPWRA estimators use the inverse of the estimated treatment-probability weights to estimate missing-data-corrected regression coefficients that are subsequently used to estimate the POMs • Censoring can be handled in the log likelihood function or by modeling the censoring process (David M. Drukker, 2015)
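stteffects ipwra • First parentheses: the variables in the outcome model • Second parentheses: the treatment variable and the variables that predict the probability of treatment • Third parentheses (optional): the variables that predict the probability of censoring • stteffects ipwra (age exercise diet) (smoke age exercise diet) (age exercise diet) • The average time to second heart attack is 1.5 years sooner when everyone in the population smokes instead of when no one smokes • The average time to second heart attack is 4.1 years when no one smokes (David M. Drukker, 2015)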
Reference • Drukker, D. M. (2015, November 12) . Estimating survival-time treatment effects from observational data.