10 likes | 108 Views
Analysing censored longitudinal data with non-ignorable missing values. Milena Falcaro and Andrew Pickles Biostatistics Group, University of Manchester, UK. Statistical issues in longitudinal studies
E N D
Analysing censored longitudinal data with non-ignorable missing values Milena Falcaro and Andrew Pickles Biostatistics Group, University of Manchester, UK Statistical issues in longitudinal studies • Longitudinal data have a 2-level hierarchical structure: repeated measures over time are nested within individuals observations are correlated. • Many longitudinal studies are affected by the presence of missing values. Analysing • only the complete data when the missing data mechanism is non-ignorable can lead • to severely biased results. • Not uncommonly, outcomes are poorly measured and censoring may also arise. • Motivating example • Data come from the Steel-Dyne Cognitive Ageing cohort of elderly men and women • The study was established in 1982 with the aim to investigate the nature, time course, extent and aetiology of changes in cognitive function • Outcome of interest: depression scores measured up to 4 times • Participants were observed at individually-varying time points • Missing values were common, non-monotone and related to the outcome under study • Floor effect (left censoring) • Dealing with non-ignorable missing values • i: index for subject • k: index for measurement occasion (k = 1, . . . , 4) • Yik: outcome variable • Dik: missing data indicator (Dik = 1 if Yik is missing and Dik = 0 otherwise) • D*ik: underlying continuous latent variable generating Dik(i.e. Dik = 1 if D*ik passes a • certain threshold and Dik = 0 otherwise) • tik : time when the kth measurement for subjectioccurred • To account for non-ignorable missing values we jointly model the repeated measures • and the missing data mechanism by assuming they depend on a common underlying • process. • Yik and Dikare assumed to be conditionally independent given the latent variable Fik. • We consider the model • where theb ’s denote regression coefficients,Xis a vector of covariates andl is a • factor loading,whereas eikY ~ N(0,sY2) and eikD~ N(0,sD2) areindependenterror • terms. • For identifiability reasons we impose l2Var(Fik) + sD2= 1orsD2= 1. • Missing data mechanism is ignorable when l = 0. • The correlation betweenFi1, . . . , Fi4 can be characterised in many different ways.We • here consider a quadratic growth curve model (index i is omitted for simplicity): • Dealing with censored observations • We handle censored observations by considering auxiliary 3-category variables • 0if the observation is left-censored • Wik =1if the observation is interval-censored • 2if the observation is right-censored • and by using ordinal probit models with two subject- and occasion-specific thresholds • t*ik(1) and t*ik(2) (Falcaro and Pickles, 2007). The assumption of normality can be relaxed • by embedding a Box-Cox transformation into the model. • The intervals can differ in length and position from subject to subject and can be made arbitrarily small. Non-censored observations are allowed for by specifying a very small interval. • Implementation of the model • The latent components of the models are assumed to be normally distributed and the • underlying components of variance are analytically combined to form the covariance • matrix of the joint distribution. Numerical integration is then undertaken over this • multivariate distribution, not over the individual random effects. • The model can be fitted by software for numerical integration of multivariate Gaussian • distributions as implemented in programs such as Mx (Neale et al., 2002). • Results • The variance of the quadratic time random effect was found to be wholly non-significant. • The factor loading l was found to be highly significant (p < 0.001), indicating that the • non-responses were indeed not at random and needed to be accounted for in the analysis. • References • Bollen, K.A. and Curran, P.J. (2006). Latent curve models: a structural equation perspective. • John Wiley & Sons, New Jersey. • Falcaro, M. and Pickles, A. (2007). A flexible model for multivariate interval-censored survival • times with complex correlation structure. Statistics in Medicine 26, 663-680. • Little, R.J.A. and Rubin, D.B. (2002). Statistical analysis with missing data, 2nd ed. John Wiley • & Sons, New York. • Neale, M.C. et al. (2002). Mx: statistical modeling, 6th ed. Dept. of Psychiatry, Richmond, • Virginia. • This work was supported by an ESRC research grant.