Latent Variable Methods for Longitudinal Data: Recent Developments & Challenges
David B. Dunson
National Institute of Environmental Health Sciences, National Institutes of Health
Latent variables in longitudinal data analysis
• Latent variables are routinely incorporated into longitudinal data models:
  • to allow heterogeneity among subjects
  • to accommodate dependency in repeated observations
  • to study the dependency structure
  • to characterize changes in a latent class (e.g., health condition) or continuous latent trait over time
  • to allow informative censoring and/or missingness (e.g., joint modeling of longitudinal & survival data)
Generalized linear mixed models
• Extend GLMs to allow dependent observations
• Generalized linear mixed model (GLMM): $\eta_{ij} = x_{ij}'\beta + z_{ij}'b_i$, with $b_i \sim N_q(0, \Sigma)$
• Frequentist & Bayesian model fitting requires integrating out the random effects - MCMC is a common solution (e.g., Zeger and Karim, 1991)
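To make the MCMC approach concrete, here is a minimal sketch of a Gibbs sampler for the simplest special case, a Gaussian random-intercept model $y_{ij} = x_{ij}'\beta + b_i + \epsilon_{ij}$. The priors, function name, and data layout are illustrative assumptions, not part of the talk.

```python
import numpy as np

def gibbs_random_intercept(y, X, subj, n_iter=2000, seed=0):
    """Gibbs sampler for y_ij = x_ij' beta + b_i + e_ij, with
    e_ij ~ N(0, s2) and b_i ~ N(0, tau2).  A flat prior on beta and
    inverse-gamma(1, 1) priors on s2, tau2 are assumed for simplicity.
    `subj` is an integer array coding subjects as 0, ..., m-1."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    m = int(subj.max()) + 1
    beta, b, s2, tau2 = np.zeros(p), np.zeros(m), 1.0, 1.0
    draws = []
    for _ in range(n_iter):
        # beta | b, s2: Gaussian full conditional (flat prior on beta)
        XtX = X.T @ X
        beta = rng.multivariate_normal(
            np.linalg.solve(XtX, X.T @ (y - b[subj])), s2 * np.linalg.inv(XtX))
        # b_i | beta, s2, tau2: independent Gaussian full conditionals
        resid = y - X @ beta
        for i in range(m):
            mask = subj == i
            v = 1.0 / (mask.sum() / s2 + 1.0 / tau2)
            b[i] = rng.normal(v * resid[mask].sum() / s2, np.sqrt(v))
        # s2, tau2 | rest: inverse-gamma draws via 1 / Gamma(shape, scale=1/rate)
        e = resid - b[subj]
        s2 = 1.0 / rng.gamma(1.0 + 0.5 * n, 1.0 / (1.0 + 0.5 * e @ e))
        tau2 = 1.0 / rng.gamma(1.0 + 0.5 * m, 1.0 / (1.0 + 0.5 * b @ b))
        draws.append((beta.copy(), s2, tau2))
    return draws
```

Even in this fully conjugate case, the chain can mix slowly when $\tau^2$ is weakly identified, foreshadowing the issues on the next slides.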
Issues with MCMC
• Random effects models typically have identifiability problems
• Naive improper priors can lead to improper posterior distributions
• Vague but proper priors can lead to major problems with mixing/computational efficiency
Developments: Computation in GLMMs
• Identifiability, improper priors, & Gibbs sampling (Gelfand & Sahu, 1999; Chen et al., 2003)
• Methods for improving computation:
  • Hierarchical centering (Gelfand et al., 1995)
  • Data augmentation (van Dyk & Meng, 2001)
  • Block updating algorithms (Chib & Carlin, 1999)
  • MCMC for non-conjugate priors (Wolfinger and Kass, 2000)
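To illustrate hierarchical centering in the notation of the random-intercept sketch above (writing the fixed effects as a single intercept $\mu$ for simplicity; this is the standard presentation of the idea, not a quote from the talk):

```latex
\text{Uncentered:} \quad y_{ij} = \mu + b_i + \epsilon_{ij}, \qquad b_i \sim N(0, \tau^2) \\
\text{Centered:} \quad\;\; y_{ij} = a_i + \epsilon_{ij}, \qquad a_i = \mu + b_i \sim N(\mu, \tau^2)
```

Sampling the $a_i$ instead of the $b_i$ breaks the strong posterior correlation between $\mu$ and the random effects when $\tau^2$ is large relative to $\sigma^2/n_i$, often dramatically improving mixing.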
Remaining Computational Issues
• Simple proper priors typically used in GLMMs (e.g., in WinBUGS) can be difficult to elicit
• Commonly used algorithms can have poor performance:
  • NLMIXED is sensitive to starting values
  • WinBUGS can have very poor mixing - a particular problem for the random effects covariance
• The inverse-Wishart prior is overly restrictive
Random Effects Covariance
• The covariance matrix has many parameters to estimate
• Shrinkage priors may improve the stability of estimates
• The inverse-Wishart prior is overly restrictive
• Recent work has focused on covariance selection & graphical modeling of normal data
Covariance Selection Models
• Gaussian covariance structure models (Wong et al., 2003; Liechty et al., 2004) allow shrinkage estimation & inference on structure
• These methods were developed for normal data, not for random effects or latent variables
Random Effects Selection
• Several authors have proposed tests for whether or not to include random effects (Commenges & Jacqmin-Gadda, 1997; Lin, 1997; Hall & Praestgaard, 2001)
• Bayesian: a point mass at 0 allows the random intercept to drop out of the model (Albert & Chib, 1997)
• Sinharay & Stern (2001) estimated Bayes factors for comparing GLMMs
Random Effects Selection
• For linear mixed models, Chen & Dunson (2003) developed a stochastic search Gibbs sampler for selection of random effects (sketched below)
• A similar idea can be applied to frailty models (Dunson & Chen, 2004)
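In rough form (the priors in the paper differ in detail, so treat this as a sketch), the approach reparameterizes the random effects covariance through a modified Cholesky decomposition and places mixture priors with a point mass at zero on the scale parameters:

```latex
\Sigma = \Lambda \Gamma \Gamma' \Lambda, \qquad
\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_q), \;\; \lambda_k \ge 0, \\
\lambda_k \sim \pi_k \, \delta_0 + (1 - \pi_k) \, N^+(\mu_k, \sigma_k^2),
```

with $\Gamma$ lower triangular with ones on the diagonal. Setting $\lambda_k = 0$ deletes the $k$-th random effect, so the Gibbs sampler searches over models by toggling each $\lambda_k$ between zero and positive values.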
Random Effects Covariance Selection
• Incorporating the random effects induces a dependency structure on the multiple outcomes
• There is often interest in inference on this dependency structure
• Cai & Dunson (2004) developed an approach for Bayesian random effects covariance selection
Remaining issues
• Current algorithms for random effects covariance selection rely on one-at-a-time updating
• Such algorithms are infeasible when the number of random effects is large
• An important area is the development of methods for efficient selection in high dimensions - likely not based on MCMC
Factor models
• Random effects models are closely related to factor analytic models for longitudinal data, e.g., $\eta_{ij} = x_{ij}'\beta + \lambda_j'\xi_i$, where $\Lambda$ is the factor loadings matrix (with rows $\lambda_j'$) and $\xi_i$ are the latent traits
• Mixing problems & uncertainty in the factor structure are challenging problems
• Recent authors have proposed Bayesian methods for factor model selection (Lopes & West, 2004)
Factor models
• GLMMs & factor models can be linked: observed predictors act as multipliers on the factors (see the sketch below)
• This formulation provides more flexibility in characterizing the covariance structure
• It is very interesting to consider selection of the covariance structure under restrictions - in most applications, only certain structures are scientifically plausible
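One standard way to make the link explicit (a sketch in the notation above, not necessarily the exact formulation in the talk) is to write the random effects as loadings times independent latent factors, so the predictors $z_{ij}$ multiply the factors:

```latex
b_i = \Lambda \xi_i, \quad \xi_i \sim N_q(0, I)
\;\; \Longrightarrow \;\;
\eta_{ij} = x_{ij}'\beta + (z_{ij}'\Lambda)\,\xi_i,
\qquad \Sigma = \mathrm{cov}(b_i) = \Lambda\Lambda'.
```

Restricting which entries of $\Lambda$ are free (e.g., lower triangular, or zeros dictated by the science) then amounts to selecting a covariance structure.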
Dynamic factor models
• Aguilar and West (2000) developed a Bayesian approach motivated by financial applications
• Recent work has focused on non-normal and mixed outcomes:
  • Dunson (2003) allowed latent traits in mixed GLMMs to vary dynamically with time
  • Miglioretti (2003) proposed a related dynamic latent class approach
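A generic sketch of the dynamic extension (the transition models in the cited papers differ in their details; the first-order autoregression below is only illustrative):

```latex
\eta_{ij} = x_{ij}'\beta + \lambda_j'\,\xi_{i t_{ij}}, \qquad
\xi_{it} = \gamma\, \xi_{i,t-1} + \delta_{it}, \quad \delta_{it} \sim N(0, \Psi),
```

so the latent traits evolve over measurement times rather than staying fixed within subject.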
Informative censoring & missing data
• Shared latent variables in the primary outcome & censoring/missing data models allow for informative censoring/missingness
• The censoring time and/or missingness indicators can be treated as additional outcomes
• Thus, latent variable methods for mixed outcomes can be applied directly (a sketch follows)
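A minimal sketch of the shared latent variable (shared parameter) idea, assuming a proportional hazards form for the dropout/censoring time; the notation is illustrative:

```latex
\eta_{ij} = x_{ij}'\beta + z_{ij}'b_i \quad \text{(longitudinal submodel)}, \\
\lambda_i(t) = \lambda_0(t) \exp(w_i'\alpha + b_i'\psi) \quad \text{(dropout hazard)},
```

with the same $b_i$ entering both submodels. $\psi = 0$ recovers noninformative dropout, so inference on $\psi$ quantifies how informative the censoring is.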
Some other developments
• Varying-coefficient models (Zhang, 2004)
• Nonparametric distributions for the random effects:
  • Frequentist: Chen et al. (2004)
  • Bayesian: Mukhopadhyay & Gelfand (1997), Kleinman & Ibrahim (1998), Dunson (2004)
Interesting areas for future research
• Fully parametric latent variable models are well developed, though computational issues remain
• Methods are needed for relaxing assumptions on latent & manifest variable distributions
• A rich class of methods can be developed by allowing latent variable distributions to change with predictors
• Related to recent work on quantile regression in SEMs (Chesher)
Bayesian semiparametrics
• The Bayesian framework is natural for latent variable models with uncertainty in factor structures & distributional forms
• It is challenging to develop nonparametric and semiparametric approaches
• Computationally efficient methods that are robust to prior specification (or at least don't require highly informative priors) are needed