Using Mixture Latent Markov Models for Analyzing Change in Longitudinal Data with the New Latent GOLD 5.0 GUI
• Jay Magidson, Ph.D.
• President, Statistical Innovations Inc., Belmont, MA, U.S.
• statisticalinnovations.com
Presented at Modern Modeling Methods (M3) 2013, University of Connecticut
Abstract
The latent Markov GUI in version 5.0 of Latent GOLD was designed to make it easy to estimate an extended class of latent Markov models. The simplicity of modeling with a single dichotomous indicator (response) and 5 time points is maintained with even hundreds of time points, multiple indicators, indicators of differing scale types (nominal, ordinal, count, continuous), observation sequences of different lengths, covariates, and mixtures, including mover-stayer structures. New longitudinal bivariate residuals (L-BVRs) are available to diagnose whether the model is picking up the most important aspects of the data, in addition to standard tools such as chi-squared tests (with bootstrap) and AIC/BIC. Informative graphical displays are provided and parameter estimation is very fast. The final goal is to obtain useful and correct answers to the research questions of interest.
Classification of Latent Class Models for Longitudinal Research Latent Markov (LM) models are cluster models for longitudinal data where persons can switch between clusters. In the corresponding growth models, persons stay in the same cluster. Clusters in LM models are called latent states, while in the growth model clusters are called latent classes. In LM models, transition probability parameters spell out how switching between states occurs from time t to t+1. Vermunt, Tran and Magidson (2008) “Latent class models in longitudinal research”, chapter 23 in Handbook of Longitudinal Research, S. Menard Editor, Academic Press.
Graphs for LM and MLM Models
X denotes a categorical latent variable with S categories (latent states).
Person i can be in a different state s = 1, 2, …, S at different times t = 0, 1, 2, 3, 4.
Xt = latent state variable at time t
LM model – 3 sets of probability parameters:
• Initial State probs (b0): P(X0 = s)
• Transition probs (bt), t = 1, …, 4
• Measurement errors (a) – measurement equivalence
The particular set of latent states (s0, s1, s2, s3, s4) defines a change pattern for person i.
Extension to mixture LM (MLM) model
The MLM model implies different change patterns for each latent class k = 1, 2, …, K.
Example for K = 2 classes, where class 2 is a ‘Stayer’ class: (1,1,1,1,1) or (2,2,2,2,2) – no change
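To make the three parameter sets concrete, here is a minimal Python sketch of how they combine into the probability of one response pattern, using the standard forward recursion for hidden Markov chains. It is an illustration under stated assumptions, not Latent GOLD's implementation: the function name `lm_pattern_prob`, the 0-based state and response coding, and all numeric values are hypothetical.

```python
from itertools import product

def lm_pattern_prob(y, init, trans, emit):
    """P(y_0, ..., y_T) under a latent Markov model (forward recursion).

    init[s]        = P(X_0 = s)                   initial state probs (b0)
    trans[t][r][s] = P(X_{t+1} = s | X_t = r)     one matrix per transition (bt)
    emit[s][j]     = P(Y_t = j | X_t = s)         same at every t (a)
    """
    n_states = len(init)
    # alpha[s] = P(y_0..y_t, X_t = s), updated one time point at a time
    alpha = [init[s] * emit[s][y[0]] for s in range(n_states)]
    for t in range(1, len(y)):
        a = trans[t - 1]
        alpha = [
            sum(alpha[r] * a[r][s] for r in range(n_states)) * emit[s][y[t]]
            for s in range(n_states)
        ]
    return sum(alpha)

# Hypothetical parameter values, for illustration only
init = [0.6, 0.4]
step = [[0.9, 0.1], [0.2, 0.8]]
trans = [step] * 4                   # 4 transitions between 5 time points
emit = [[0.8, 0.2], [0.1, 0.9]]

# The probabilities of all 2^5 response patterns must sum to 1
total = sum(lm_pattern_prob(p, init, trans, emit)
            for p in product([0, 1], repeat=5))
```

The recursion avoids summing over all S^5 change patterns explicitly, which is what keeps the computation cheap as the number of time points grows.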
Graphs for LM and MLM Models
Extension to multiple indicators is immediate!
[Figure: path diagram of the LM model with two indicators, Y and V. Latent states X0 to X4 are linked by the initial-state and transition parameters b0 to b4; each Xt points to its indicators Yt and Vt through the measurement parameters aY and aV.]
Latent GOLD Lets Users Customize Logit Models*
• Initial State probs: P(X0 = s)
  Logit model may include covariates Z (e.g., AGE, SEX)
• Transition probs: P(Xt = r | Xt-1 = s)
  Logit model may include time-varying and fixed predictors
• Measurement model probs: P(yt = j | Xt = s)
  One or more indicator (dependent) variables Y, of possibly different scale types (e.g., continuous, count, dichotomous, ordinal, nominal)
For introductory purposes, examples here are limited to analyses of a single categorical indicator.
* Equations can be customized using Latent GOLD 5.0 GUI and/or syntax.
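As a rough illustration of what a logit model for the transition probabilities looks like, the sketch below computes one row of transition probabilities from a multinomial logit with predictors. The parameterization (one intercept and one slope vector per origin/destination pair, with the first destination as reference) and all values are assumptions made for this example; Latent GOLD's own coding of effects may differ.

```python
import math

def transition_probs(prev_state, z, intercepts, slopes):
    """Row of P(X_t = s | X_{t-1} = prev_state, z) from a multinomial logit.

    intercepts[r][s] and slopes[r][s][k] are hypothetical logit parameters;
    z is a list of predictor values (time-varying or fixed).
    """
    scores = [
        intercepts[prev_state][s]
        + sum(b * zk for b, zk in zip(slopes[prev_state][s], z))
        for s in range(len(intercepts[prev_state]))
    ]
    # Softmax, shifted by the maximum score for numerical stability
    m = max(scores)
    expd = [math.exp(v - m) for v in scores]
    total = sum(expd)
    return [e / total for e in expd]

# Hypothetical parameters: 2 states, 2 predictors, destination 0 as reference
intercepts = [[0.0, -2.0], [0.0, 1.5]]
slopes = [
    [[0.0, 0.0], [0.3, 0.5]],
    [[0.0, 0.0], [-0.1, 0.2]],
]
row = transition_probs(0, [1.0, 0.0], intercepts, slopes)
```

The same construction applies to the initial-state and measurement models; only the conditioning variables change.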
Examples
• Example 1: LM model with time-homogeneous transition probabilities (loyalty data)
• Example 2: MLM model (satisfaction data)
• Example 3: MLM model with covariates (sparse panel data)
These introductory examples use 5 or 23 time points, but data with hundreds of time points are just as easy to analyze with Latent GOLD.
Example 1: Loyalty Data in Long File Format
• N = 631 respondents
• T = 5 time points
• Dichotomous Y: Choose Brand A? (1 = Yes, 0 = No)
• Y = (Y1, Y2, Y3, Y4, Y5)
• 2^5 = 32 response patterns, id = 1, 2, …, 32
• ‘freq’ is used as a case weight
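The layout described above (one record per pattern and time point, with the pattern frequency repeated as a case weight) can be sketched in Python as follows. The column names `time` and `y` and the helper `to_long_format` are hypothetical, and the frequencies are placeholders rather than the loyalty data.

```python
from itertools import product

def to_long_format(pattern_freqs):
    """Expand {pattern: frequency} into long-format records.

    Each response pattern gets one id and contributes one record per
    time point; 'freq' carries the pattern's case weight on every record.
    """
    rows = []
    for pid, (pattern, freq) in enumerate(sorted(pattern_freqs.items()), start=1):
        for t, y in enumerate(pattern, start=1):
            rows.append({"id": pid, "time": t, "y": y, "freq": freq})
    return rows

# All 2^5 = 32 possible patterns of a dichotomous response over 5 time points
patterns = list(product([0, 1], repeat=5))

# Placeholder weights (NOT the real data): every pattern observed once
pattern_freqs = {p: 1 for p in patterns}
rows = to_long_format(pattern_freqs)
```

With 32 patterns and 5 time points the long file has 160 records, however many respondents the weights add up to.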
Loyalty Model with Time-homogeneous Transitions Transition probs equal over time Initial State Probability (b0) Transition Probabilities (b) Measurement Model Probabilities (a)
Example 1: Easy to generate future predictions
The model predicts that market share for Brand A continues to increase. Forecasts are computed directly from the model parameter estimates.
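A minimal sketch of how such forecasts follow from the parameters of a time-homogeneous LM model: the state distribution is pushed forward through the transition matrix, and the predicted share is the state-weighted response probability. The function name and all parameter values below are hypothetical and are not the estimates from the loyalty example.

```python
def forecast_share(init, trans, emit_yes, horizon):
    """Predicted P(Y_t = 1) for t = 0, ..., horizon.

    init[s]     = P(X_0 = s)
    trans[r][s] = P(X_{t+1} = s | X_t = r), equal over time
    emit_yes[s] = P(Y_t = 1 | X_t = s)
    """
    n = len(init)
    state = list(init)
    shares = []
    for _ in range(horizon + 1):
        shares.append(sum(state[s] * emit_yes[s] for s in range(n)))
        # Advance the latent state distribution by one time step
        state = [sum(state[r] * trans[r][s] for r in range(n)) for s in range(n)]
    return shares

# Hypothetical parameter values, for illustration only
init = [0.7, 0.3]
trans = [[0.90, 0.10], [0.05, 0.95]]
emit_yes = [0.2, 0.8]
shares = forecast_share(init, trans, emit_yes, horizon=8)
```

Because the transition probabilities are assumed equal over time, the same matrix can be applied beyond the last observed time point.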
Example 1: LM Model Fits Better than Latent Growth Model
• The 2-state time-homogeneous latent Markov model fits well (p=.77) and has small L-BVRs.
• The 2-class latent growth model (“2-class Regression”) is rejected (p=.0084).
• The Lag1 BVR pinpoints the problem in the LC growth model as a failure to explain 1st order autocorrelation.
Example 2: Life Satisfaction
• N = 5,147 respondents
• T = 5 time points
• Dichotomous Y: Satisfied with life? (1 = No, 2 = Yes)
• Y = (Y1, Y2, Y3, Y4, Y5)
• 2^5 = 32 response patterns, id = 1, 2, …, 32
• ‘weight’ is used as a case weight
• Models:
  • Null
  • Time-heterogeneous LM
  • 2-class mixture LM
  • Restricted (Mover-Stayer)
Example 2: Model Parameters -- Time-heterogeneous Model Initial State Probability (b0) Transition Probabilities (bt) Measurement Model Probabilities (a) Estimated values for 2-state LM model
Ex. 2: Mover-Stayer Time-heterogeneous Latent Markov Model
Both the unrestricted and the Mover-Stayer 2-class MLM models fit well (p=.95 and .71, respectively), with the BIC statistic preferring the Mover-Stayer model. Again, the comparable LC growth model fails to explain 1st order autocorrelation.
2-class MLM Model Suggests Mover-Stayer Class Structure Class Size Estimated Values output for 2-state time-heterogeneous MLM model with 2 classes Initial State Transition Probabilities Measurement model
2-class MLM Model with Mover-Stayer Structure for Classes
The Estimated Values output shows that 52.25% of respondents are in the Stayer class, who tend to be mostly Satisfied with their lives throughout this 5-year period: 67.85% are in state 1 (the ‘Satisfied’ state) initially and remain in that state. In contrast, among respondents whose life satisfaction changed during this 5-year period (the ‘Mover’ class), fewer (54.82%) were in the Satisfied state during the initial year.
Output panels: Class Size; Initial State; Transition Probabilities (note that the class 1 probability of staying in the same state has been restricted to 1); Measurement model
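The mover-stayer restriction can be sketched in Python as a two-class mixture in which the Stayer class uses identity transition matrices. The class size and initial-state probabilities below are the values quoted on this slide; the Mover transition probabilities and the measurement probabilities are hypothetical placeholders, and the function names are invented for this example.

```python
from itertools import product

def lm_prob(y, init, trans, emit):
    """Forward recursion: P(y) for one latent Markov chain.
    trans holds one matrix per transition (time-heterogeneous)."""
    n = len(init)
    alpha = [init[s] * emit[s][y[0]] for s in range(n)]
    for t in range(1, len(y)):
        a = trans[t - 1]
        alpha = [sum(alpha[r] * a[r][s] for r in range(n)) * emit[s][y[t]]
                 for s in range(n)]
    return sum(alpha)

def mover_stayer_prob(y, p_stayer, init_stayer, init_mover, mover_trans, emit):
    """Mixture of a Stayer class (staying probability fixed at 1) and a Mover class."""
    n = len(emit)
    identity = [[1.0 if r == s else 0.0 for s in range(n)] for r in range(n)]
    stay = lm_prob(y, init_stayer, [identity] * (len(y) - 1), emit)
    move = lm_prob(y, init_mover, mover_trans, emit)
    return p_stayer * stay + (1.0 - p_stayer) * move

# Values quoted on the slide
p_stayer = 0.5225
init_stayer = [0.6785, 0.3215]      # index 0 = state 1 ('Satisfied')
init_mover = [0.5482, 0.4518]
# Hypothetical placeholders (not the estimated values)
mover_trans = [[[0.8, 0.2], [0.3, 0.7]]] * 4
emit = [[0.1, 0.9], [0.8, 0.2]]     # response index 0 = No, 1 = Yes

total = sum(
    mover_stayer_prob(p, p_stayer, init_stayer, init_mover, mover_trans, emit)
    for p in product([0, 1], repeat=5)
)
```

Both classes share one measurement model here, which is an assumption of the sketch.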
Example 2: Longitudinal Profile Plot for the Mover-Stayer LM model Stayer class showing 61.75% satisfied each year Mover class showing changes over time
Longitudinal-Plot with Overall Predicted Probability Appended
Example 3: Latent GOLD Longitudinal Analysis of Sparse Data
• N = 1725 pupils who were aged 11-17 at the initial measurement occasion (in 1976)
• Survey conducted annually from 1976 to 1980 and at three-year intervals after 1980
• 23 time points (T+1 = 23), where t = 0 corresponds to age 11 and the last time point to age 33
• For each subject, data are observed for at most 9 time points (the average is 7.93), which means that responses for the other time points are treated as missing. (See Figure 2)
• Dichotomous dependent variable: ‘drugs’, indicating whether the respondent used hard drugs during the past year (1 = yes; 0 = no)
• Time-varying predictors are ‘time’ (t) and ‘time_2’ (t^2); time-constant predictors are ‘male’ and ‘ethn4’ (ethnicity)
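One common way to handle time points without a response in a latent Markov likelihood is to let the latent chain keep evolving while the missing observation contributes a factor of 1. The Python sketch below illustrates that idea under a missing-at-random assumption; it is not taken from Latent GOLD, it uses time-homogeneous transitions for brevity, and all names and values are hypothetical.

```python
def lm_prob_with_missing(y, init, trans, emit):
    """P(observed part of y) under a latent Markov model.

    y[t] is 0 or 1 when observed and None when missing. A missing
    response adds no measurement information, so its factor is 1.
    """
    n = len(init)

    def e(s, obs):
        return 1.0 if obs is None else emit[s][obs]

    alpha = [init[s] * e(s, y[0]) for s in range(n)]
    for t in range(1, len(y)):
        alpha = [
            sum(alpha[r] * trans[r][s] for r in range(n)) * e(s, y[t])
            for s in range(n)
        ]
    return sum(alpha)

# Hypothetical parameter values, for illustration only
init = [0.6, 0.4]
trans = [[0.9, 0.1], [0.2, 0.8]]
emit = [[0.95, 0.05], [0.30, 0.70]]

# A sparse record: 23 time points, only a few of them observed
sparse = [None] * 23
sparse[3], sparse[4], sparse[7], sparse[10] = 0, 0, 1, 1
p_sparse = lm_prob_with_missing(sparse, init, trans, emit)
```

Treating a response as missing gives the same result as summing the likelihood over both of its possible values.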
Example 3: Latent GOLD Longitudinal Analysis of Sparse Data The plot on the left shows the overall trend in drug usage during this period is non-linear, with zero usage reported for 11 year olds, increasing to a peak in the early 20s and then declining through age 33. The plot on the right plots the results from a mixture latent Markov model suggesting that the population consists of 2 distinct segments with different growth rates, Class 2 consisting primarily of non-users.
Example 3: 2-class MLM Model Class size Initial state probabilities by class Transition probabilities by class Measurement model probabilities
Example 3: Including Gender and Ethnicity as Covariates in Model
Example 3: Including Gender and Ethnicity as Covariates in Model Adding gender and ethnicity improves the BIC. Again, the 2-class LC growth model has a very large Lag1 BVR
Example 3: Including Gender and Ethnicity as Covariates in Model
• For concreteness, we focus on 18-year-olds.
• 18-year-olds who were in the lower usage state (State 1) at age 17 have a probability of .1876 of switching to the higher usage state (State 2) if they are in Class 2, compared with only .0211 if they are in Class 1.
• Those who were in the higher usage state (State 2) at age 17 have a probability of .9589 of remaining in that state if they are in Class 2, compared with only .3636 if they are in Class 1.
• The more general pattern: Class 2 is more likely than Class 1 to move to, and remain in, a higher drug usage state.
Example 3: Including Gender and Ethnicity as Covariates in Model Parameters output for model c, showing that age and ethnicity are significant.
Example with 3 Indicators 4 order-restricted latent state model. See Vermunt, J.K. (2013, in press). Latent class scaling models for longitudinal and multilevel data sets. In: G. R. Hancock and G. B. Macready (Eds.), Advances in latent class analysis: A Festschrift in honor of C. Mitchell Dayton. Charlotte, NC: Information Age Publishing, Inc.
Summary
The latent Markov GUI in version 5.0 of Latent GOLD was designed to make it easy to estimate a very extended class of LM models. In this presentation we analyzed data based on a single dichotomous indicator. However, because of the program's structure and design, the simplicity of analysis and the speed of estimation are maintained even with hundreds of time points, multiple indicators of different scale types, and time series of different lengths.
In addition, the LG Syntax is an open system that allows more extended models, such as:
• models with parameter restrictions
• more indicator scale types (censored, truncated, counts with exposure, beta, gamma)
• models with multiple state variables
• multilevel latent Markov models, including models with continuous random effects
• step3 latent Markov modeling (new in LG 5.0)
• continuous-time latent Markov modeling (new in LG 5.0)
References and Additional Resources
Vermunt, J.K., Tran, B. and Magidson, J. (2008). Latent class models in longitudinal research. In: S. Menard (Ed.), Handbook of Longitudinal Research: Design, Measurement, and Analysis, pp. 373-385. Burlington, MA: Elsevier.
Vermunt, J.K. (2014, in press). Latent class scaling models for longitudinal and multilevel data sets. In: G. R. Hancock and G. B. Macready (Eds.), Advances in latent class analysis: A Festschrift in honor of C. Mitchell Dayton. Charlotte, NC: Information Age Publishing, Inc.
Appendix: Equations – K latent states, T+1 equidistant time points Latent Markov (LM): Initial latent state & Transition sub-models Measurement sub-model Latent Growth:
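The equations for this slide did not survive in the text. For orientation, the standard forms of the two models in the notation of the slide title (K latent states, time points t = 0, …, T) can be written as below. This is a reconstruction from the general latent Markov literature, not necessarily the exact expressions shown in the presentation; the class variable w and the number of classes L are symbols assumed here.

```latex
% Latent Markov (LM) model: K latent states, time points t = 0, ..., T
P(\mathbf{y}_i) =
  \sum_{x_0=1}^{K} \sum_{x_1=1}^{K} \cdots \sum_{x_T=1}^{K}
  \underbrace{P(x_0) \prod_{t=1}^{T} P(x_t \mid x_{t-1})}_{\text{initial state and transitions}}
  \;
  \underbrace{\prod_{t=0}^{T} P(y_{it} \mid x_t)}_{\text{measurement}}

% Latent class growth model: L latent classes w (assumed symbols),
% class membership fixed over time, response probabilities depend on t
P(\mathbf{y}_i) =
  \sum_{w=1}^{L} P(w) \prod_{t=0}^{T} P(y_{it} \mid w, t)
```

The contrast is that the LM model lets the latent variable change between time points, whereas the growth model places all change in the time dependence of the response probabilities.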
Equations: Latent Markov (LM): Initial latent state & Transition sub-models Measurement sub-model Mixture Latent Markov (MLM):
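The mixture extension adds a time-constant class variable on top of the latent Markov chain. A standard way to write it, again as a reconstruction under assumed notation (class variable w with L classes, measurement model shared across classes) rather than the slide's original expression:

```latex
% Mixture latent Markov (MLM) model: L classes w (assumed symbols), K states
P(\mathbf{y}_i) =
  \sum_{w=1}^{L} P(w)
  \sum_{x_0=1}^{K} \cdots \sum_{x_T=1}^{K}
  P(x_0 \mid w) \prod_{t=1}^{T} P(x_t \mid x_{t-1}, w)
  \prod_{t=0}^{T} P(y_{it} \mid x_t)

% Mover-stayer restriction for the stayer class:
P(x_t = x_{t-1} \mid w = \text{stayer}) = 1, \qquad t = 1, \ldots, T
```

Setting L = 1 recovers the LM model, and the mover-stayer structure is the special case in which one class has identity transition matrices.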
Longitudinal Bivariate Residuals (L-BVRs)
Longitudinal bivariate residuals quantify, for each response variable Yk, how well the model predicts the overall trend as well as the first- and second-order autocorrelations:
• BVR.Time = BVRk(time, yk): residual association between time and yk (the overall trend)
• BVR.Lag1 = BVRk(yk[t-1], yk[t]): first-order autocorrelation left unexplained by the model
• BVR.Lag2 = BVRk(yk[t-2], yk[t]): second-order autocorrelation left unexplained by the model
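The idea behind a lag-1 residual can be illustrated in Python by cross-tabulating each response with its predecessor and comparing that table with the one the model expects. The sketch below uses a plain Pearson chi-squared comparison; the exact definition and scaling of Latent GOLD's L-BVR statistics are not given in this presentation, so this is an assumption-laden illustration rather than the program's formula, and the sequences, weights, and expected counts are made up.

```python
def lag1_table(sequences, weights=None):
    """Weighted 2x2 cross-table of (y[t-1], y[t]) over all adjacent pairs.
    Pairs involving a missing value (None) are skipped."""
    if weights is None:
        weights = [1.0] * len(sequences)
    table = [[0.0, 0.0], [0.0, 0.0]]
    for seq, w in zip(sequences, weights):
        for prev, cur in zip(seq, seq[1:]):
            if prev is None or cur is None:
                continue
            table[prev][cur] += w
    return table

def pearson_chi2(observed, expected):
    """Pearson chi-squared distance between observed and expected 2x2 tables."""
    return sum(
        (observed[a][b] - expected[a][b]) ** 2 / expected[a][b]
        for a in range(2)
        for b in range(2)
    )

# Made-up sequences and case weights, for illustration only
observed = lag1_table([(0, 0, 1, 1, 1), (1, 1, 1, 0, 0)], weights=[10, 5])

# Hypothetical model-expected lag-1 counts (would come from a fitted model)
expected = [[12.0, 13.0], [8.0, 27.0]]
residual = pearson_chi2(observed, expected)
```

A model that ignores first-order autocorrelation, such as the LC growth models in the examples, would show a large value for this kind of comparison.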