1.52k likes | 1.66k Views
Modeling longitudinal data Sanja Franić Vrije Universiteit Amsterdam. Introduction Frequently, researchers are faced with the question of how to optimally utilize longitudinal data E.g., one may have collected data on children’s cognitive abilities, at ages 10, 12, 14, 16, and 18
E N D
Modeling longitudinal data Sanja Franić Vrije Universiteit Amsterdam
Introduction • Frequently, researchers are faced with the question of how to optimally utilize longitudinal data • E.g., one may have collected data on children’s cognitive abilities, at ages 10, 12, 14, 16, and 18 • Some of the possible questions:
Introduction • How large is the role of factors that act in concert across different time points to cause the observed stability of the variable of interest over time? • How large is the role of those factors that cause individual differences specific to a certain time point? • Is there a stabile driving force behind the growth or decline of a trait over time, or do novel factors relevant to the trait emerge at different time points? If so, how to detect and quantify them?
Introduction • How well can a trait at a certain time point be predicted from a measurement at the preceding time point? • How much do individuals differ in the starting level of the variable of interest (e.g., in mathematical skills prior to formal education)? • How much do individuals differ in their speed of growth or decline over time? • Does the development of a skill follow a linear curve, or is there non-linear change?
Introduction • In today’s workshop, we will cover several types of models aimed at addressing the above questions
Overview • Cholesky decomposition • Simplex model • Latent growth curve model
Example • You’ve collected longitudinal data on IQ. You applied the Wechsler Intelligence Scale for Children (WISC) at ages 10, 12, 14, and 16.
Example • Structure of the data at each time point:
Example • Subscale scores:
Example • Subscale scores: VCI PRI WMI PSI Age 10
Example • Subscale scores: VCI VCI PRI PRI WMI WMI PSI PSI Age 10 Age 12
Example • Subscale scores: VCI VCI VCI PRI PRI PRI WMI WMI WMI PSI PSI PSI Age 10 Age 12 Age 14
Example • Subscale scores: VCI VCI VCI VCI PRI PRI PRI PRI WMI WMI WMI WMI PSI PSI PSI PSI Age 10 Age 12 Age 14 Age 16
Example • Let us assume all subscale scores at all time points are continuous normally distributed variables (for IQ scores this is a reasonable assumption; with real data, one can test it) • We will demonstrate each of the three methods as applied to this example dataset; we will use path-diagrammatic representations of the data (as presented in the SEM workshops by Dylan Molenaar) • A brief explanation of path diagrams:
Path diagrams • Squares = observed (measured) variables VCI PRI WMI PSI
Path diagrams • Circles = latent variables VCI PRI g WMI PSI
Path diagrams • Single-headed arrows = causal relations in the model VCI PRI g WMI PSI
Path diagrams • Path coefficients = strength of the relationship between variables VCI .5 PRI .6 g .7 WMI .8 PSI
Path diagrams • Double-headed arrows: variances and covariances VCI .5 1 PRI .6 g .7 WMI .8 PSI
Path diagrams • Double-headed arrows: variances and covariances VCI .5 .1 1 PRI .6 g .7 WMI .8 PSI
Path diagrams • Residuals VCI .5 .1 1 PRI .6 g .7 WMI .8 PSI
Overview • Cholesky decomposition • Simplex model • Latent growth curve model
Cholesky decomposition VCI12 VCI14 VCI16 VCI10 PRI12 PRI14 PRI16 PRI10 WMI12 WMI14 WMI16 WMI10 PSI12 PSI14 PSI16 PSI10 Age 10 Age 12 Age 14 Age 16
Cholesky decomposition PSI12 PSI14 PSI16 PSI10 Age 10 Age 12 Age 14 Age 16
Cholesky decomposition 1 PS10 PSI12 PSI14 PSI16 PSI10 Age 10 Age 12 Age 14 Age 16
Cholesky decomposition 1 1 PS12 PS10 PSI12 PSI14 PSI16 PSI10 Age 10 Age 12 Age 14 Age 16
Cholesky decomposition 1 1 1 PS14 PS12 PS10 PSI12 PSI14 PSI16 PSI10 Age 10 Age 12 Age 14 Age 16
Cholesky decomposition 1 1 1 1 PS14 PS16 PS12 PS10 PSI12 PSI14 PSI16 PSI10 Age 10 Age 12 Age 14 Age 16
Cholesky decomposition • - the first factor (PS10) captures all of the variation in perceptual speed at age 10 (PSI10) and the variation in the other three observed variables (PSI12, PSI14, and PSI16) which they share with PSI10 1 1 1 1 PS14 PS16 PS12 PS10 PSI12 PSI14 PSI16 PSI10 Age 10 Age 12 Age 14 Age 16
Cholesky decomposition - this factor (PS10) represents what is common to all four observed variables 1 1 1 1 PS14 PS16 PS12 PS10 PSI12 PSI14 PSI16 PSI10 Age 10 Age 12 Age 14 Age 16
Cholesky decomposition • →factor that causes stability of the observed measure (perceptual speed) across all four ages 1 1 1 1 PS14 PS16 PS12 PS10 PSI12 PSI14 PSI16 PSI10 Age 10 Age 12 Age 14 Age 16
Cholesky decomposition • - this factor (PS12) represents what is common only to the last three observed variables 1 1 1 1 PS14 PS16 PS12 PS10 PSI12 PSI14 PSI16 PSI10 Age 10 Age 12 Age 14 Age 16
Cholesky decomposition • →factor that causes stability of the observed measure (over and above the stability caused by the first factor, PS10) across the last three ages 1 1 1 1 PS14 PS16 PS12 PS10 PSI12 PSI14 PSI16 PSI10 Age 10 Age 12 Age 14 Age 16
Cholesky decomposition • - the third factor (PS14) represents what is common only to the last two observed variables 1 1 1 1 PS14 PS16 PS12 PS10 PSI12 PSI14 PSI16 PSI10 Age 10 Age 12 Age 14 Age 16
Cholesky decomposition - the last factor (PS16) represents the variation that is unique to the last variable (PSI16) 1 1 1 1 PS14 PS16 PS12 PS10 PSI12 PSI14 PSI16 PSI10 Age 10 Age 12 Age 14 Age 16
Cholesky decomposition If we specify a model like this (e.g., in Mplus) and fit it to observed data, we will get estimates of parameters in the model – in this case, the loadings of the observed variables on the latent factors. 1 1 1 1 PS14 PS16 PS12 PS10 PSI12 PSI14 PSI16 PSI10 Age 10 Age 12 Age 14 Age 16
Cholesky decomposition If we specify a model like this (e.g., in Mplus) and fit it to observed data, we will get estimates of parameters in the model – in this case, the loadings of the observed variables on the latent factors. 1 1 1 1 PS14 PS16 PS12 PS10 λ11 λ21 λ31 λ41 PSI12 PSI14 PSI16 PSI10 Age 10 Age 12 Age 14 Age 16
Cholesky decomposition If we specify a model like this (e.g., in Mplus) and fit it to observed data, we will get estimates of parameters in the model – in this case, the loadings of the observed variables on the latent factors. 1 1 1 1 PS14 PS16 PS12 PS10 λ11 λ21 λ22 λ31 λ32 λ41 λ42 PSI12 PSI14 PSI16 PSI10 Age 10 Age 12 Age 14 Age 16
Cholesky decomposition If we specify a model like this (e.g., in Mplus) and fit it to observed data, we will get estimates of parameters in the model – in this case, the loadings of the observed variables on the latent factors. 1 1 1 1 PS14 PS16 PS12 PS10 λ11 λ21 λ22 λ31 λ32 λ41 λ33 λ42 λ43 PSI12 PSI14 PSI16 PSI10 Age 10 Age 12 Age 14 Age 16
Cholesky decomposition If we specify a model like this (e.g., in Mplus) and fit it to observed data, we will get estimates of parameters in the model – in this case, the loadings of the observed variables on the latent factors. 1 1 1 1 PS14 PS16 PS12 PS10 λ11 λ21 λ22 λ31 λ32 λ41 λ33 λ42 λ43 λ44 PSI12 PSI14 PSI16 PSI10 Age 10 Age 12 Age 14 Age 16
Cholesky decomposition These loadings (i.e., path coefficients in a path diagram) represent the strength of the relationship between the observed variables and the latent factors. 1 1 1 1 PS14 PS16 PS12 PS10 λ11 λ21 λ22 λ31 λ32 λ41 λ33 λ42 λ43 λ44 PSI12 PSI14 PSI16 PSI10 Age 10 Age 12 Age 14 Age 16
Cholesky decomposition Knowing that the variability that is common to processing speed at ages 10, 12, 14, and 16 is represented by the first latent factor, and the paths between that factor and each of the observed variables... 1 1 1 1 PS14 PS16 PS12 PS10 λ11 λ21 λ31 λ41 PSI12 PSI14 PSI16 PSI10 Age 10 Age 12 Age 14 Age 16
Cholesky decomposition ... and knowing the estimates of the path loading parameters (λs)... 1 1 1 1 PS14 PS16 PS12 PS10 λ11 λ21 λ31 λ41 PSI12 PSI14 PSI16 PSI10 Age 10 Age 12 Age 14 Age 16
Cholesky decomposition • ... we can quantify the proportion of variance in processing speed at a given time point that is due to factors that cause temporal stability across all four ages. 1 1 1 1 PS14 PS16 PS12 PS10 λ11 λ21 λ31 λ41 PSI12 PSI14 PSI16 PSI10 Age 10 Age 12 Age 14 Age 16
Cholesky decomposition • E.g., at age 12 this proportion is: λ21*var(PS10)*λ21 • = λ21*1*λ21 • = λ212. 1 1 1 1 PS14 PS16 PS12 PS10 λ11 λ21 λ31 λ41 PSI12 PSI14 PSI16 PSI10 Age 10 Age 12 Age 14 Age 16
Cholesky decomposition • E.g., at age 12 this proportion is: λ21*var(PS10)*λ21 • = λ21*1*λ21 • = λ212. • We assume here that the • variance of the observed • variable is 1. Otherwise, to • obtain the proportion, we • have to divide the λ212by the • variance of the observed • variable. 1 1 1 1 PS14 PS16 PS12 PS10 λ11 λ21 λ31 λ41 PSI12 PSI14 PSI16 PSI10 Age 10 Age 12 Age 14 Age 16
Cholesky decomposition • At age 14 it is: λ31*var(PS10)*λ31 • = λ31*1*λ31 • = λ312. 1 1 1 1 PS14 PS16 PS12 PS10 λ11 λ21 λ31 λ41 PSI12 PSI14 PSI16 PSI10 Age 10 Age 12 Age 14 Age 16
Cholesky decomposition • E.g., if these are the factor loading estimates (below), then the factors that cause stability of processing speed (PS) across all ages explain .52=.25 of the variance in PS at age 12, .32=.09 of the variance in PS at age 14, and .22=.04 of the variance in PS at age 16. 1 1 1 1 PS14 PS16 PS12 PS10 1 .5 .3 .2 PSI12 PSI14 PSI16 PSI10 Age 10 Age 12 Age 14 Age 16
Cholesky decomposition • If there are additional sources of stability arising after the initial age of measurement (age 10), those are quantified by the other path coefficients in the model. 1 1 1 1 PS14 PS16 PS12 PS10 PSI12 PSI14 PSI16 PSI10 Age 10 Age 12 Age 14 Age 16
Cholesky decomposition • For instance, if a factor emerges at age 12, which causes additional temporal stability in processing speed across the ages 12-16, over and above the stability caused by factors present at age 10, the strength of influence of that factor is quantified by the coefficients λ22, λ32, and λ42. 1 1 1 1 PS14 PS16 PS12 PS10 λ22 λ32 λ42 PSI12 PSI14 PSI16 PSI10 Age 10 Age 12 Age 14 Age 16