Longitudinal data analysis in HLM
Longitudinal vs cross-sectional HLM • Similarities: • Fixed effects • Random effects • Difference (levels of the hierarchy): • Cross-sectional HLM: individual, school, … • Longitudinal HLM: observations over time, individual, …
Characteristics of longitudinal data • Sources of variation • Within-subject variation (intra-individual variation) • Between-subject variation (inter-individual variation) • Data are often incomplete or unbalanced • OLS regression is not suitable for longitudinal data: repeated observations on the same subject are correlated, which violates its assumption of independent errors.
Limitations of traditional approach for modeling longitudinal data • Univariate repeated-measures ANOVA • Person effects are random; time effects and other factor effects are fixed. Accounting for the person effects reduces the residual variance. • Fixed time points (evenly or unevenly spaced) • It assumes a specific residual variance-covariance structure (compound symmetry): equal variances at all time points and a constant covariance between any two observations from the same person.
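For concreteness, compound symmetry with four time points can be written as the matrix below. The symbols $\sigma^2$ (common variance) and $\rho$ (common correlation) are notation assumed here for illustration; they do not appear in the slides.

```latex
\Sigma_{\text{CS}} = \sigma^{2}
\begin{pmatrix}
1    & \rho & \rho & \rho \\
\rho & 1    & \rho & \rho \\
\rho & \rho & 1    & \rho \\
\rho & \rho & \rho & 1
\end{pmatrix}
```

Every diagonal entry is the same variance and every off-diagonal entry is the same covariance $\rho\sigma^{2}$, regardless of how far apart the two time points are.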
Limitations of traditional approach for modeling longitudinal data • Univariate repeated-measures ANOVA • An alternative, less restrictive assumption is sphericity: the variance of the difference between any two time points is the same.
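Written out, sphericity is a condition on the variances of pairwise differences. In the sketch below, $\sigma_{t}^{2}$ is the variance at time $t$ and $\sigma_{tt'}$ is the covariance between times $t$ and $t'$; this notation is assumed for illustration.

```latex
\operatorname{Var}(Y_{t} - Y_{t'}) = \sigma_{t}^{2} + \sigma_{t'}^{2} - 2\sigma_{tt'} = \text{constant for all } t \neq t'
```

Compound symmetry is a special case: with a common variance $\sigma^{2}$ and common correlation $\rho$, every difference has variance $2\sigma^{2}(1-\rho)$.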
Limitations of traditional approach for modeling longitudinal data • Univariate repeated-measures ANOVA • These assumptions often do not hold for longitudinal data • People change at different rates, so variances often change over time • Covariances between observations close in time are usually greater than covariances between observations distant in time • The variance-covariance structure must be tested for the significance tests to be valid
Limitations of traditional approach for modeling longitudinal data • Multivariate repeated-measures ANOVA • Uses a general approach: no specific assumptions about variances and covariances (unstructured). • It does not allow any other structure, so the number of parameters grows quickly as the number of repeated measures increases (over-parameterization). • Subjects with missing data at any time point are deleted from the analysis.
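The over-parameterization point can be made concrete by counting free covariance parameters. The following is a minimal sketch; the function name and the chosen numbers of time points are illustrative assumptions, not part of the slides.

```python
def n_unstructured(n_times: int) -> int:
    """Free parameters in an unstructured covariance matrix.

    There are n_times variances plus n_times * (n_times - 1) / 2 covariances.
    """
    return n_times * (n_times + 1) // 2


# Compound symmetry needs only one common variance and one common covariance.
N_COMPOUND_SYMMETRY = 2

# Hypothetical numbers of repeated measures, chosen only for illustration.
for n_times in (3, 5, 10):
    print(n_times, n_unstructured(n_times), N_COMPOUND_SYMMETRY)
```

The unstructured count grows quadratically with the number of time points, while the compound-symmetry count stays fixed.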
Limitations of traditional approach for modeling longitudinal data • In addition, neither approach allows time-varying predictors
Advantages of longitudinal data analysis in HLM • Ability to deal with missing data (when data are missing at random, MAR) • No assumption of compound symmetry • More flexible: • Allows unequal numbers of measurements or unequal measurement intervals • Can include time-varying covariates
Research questions • Is there an effect of time on average (is the fixed effect of time significant)? • Does the effect of time vary across persons (is the random effect of time significant)?
A Linear Growth Model • Level 1 (within-subject model): $Y_{ti}$ is the measurement of the $i$th subject at the $t$th time point • Level 2 (between-subject model)
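The level-1 and level-2 equations did not survive in this copy of the slides. A standard two-level linear growth model consistent with the parameters the slides name ($\beta_{00}$, $\beta_{10}$, $r_{0i}$, $r_{1i}$, $e_{ti}$) would read as follows. The coefficient names $\pi_{0i}$ and $\pi_{1i}$ and the predictor name $\text{Time}_{ti}$ are assumptions.

```latex
\begin{aligned}
\text{Level 1:}\quad  & Y_{ti} = \pi_{0i} + \pi_{1i}\,\text{Time}_{ti} + e_{ti} \\
\text{Level 2:}\quad  & \pi_{0i} = \beta_{00} + r_{0i} \\
                      & \pi_{1i} = \beta_{10} + r_{1i} \\
\text{Combined:}\quad & Y_{ti} = \beta_{00} + \beta_{10}\,\text{Time}_{ti} + r_{0i} + r_{1i}\,\text{Time}_{ti} + e_{ti}
\end{aligned}
```

Substituting the level-2 equations into level 1 gives the combined form, which separates the average trajectory from each subject's deviation from it.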
An example • [Annotated equation; labels: covariance, residual, sample intercept (grand mean), sample slope (grand mean), individual intercept deviation, individual slope deviation]
In the model • Six parameters: • Fixed effects: $\beta_{00}$ and $\beta_{10}$, level 2 • Random effects: • Variances of $r_{0i}$ and $r_{1i}$ ($\tau_{00}^{2}$, $\tau_{11}^{2}$), level 2 • Covariance of $r_{0i}$ and $r_{1i}$ ($\tau_{01}$), level 2 • Residual variance of $e_{ti}$ ($\sigma_{e}^{2}$), level 1
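To see what the four random-effect parameters imply for the data, one can derive the variances and covariances of the repeated measures. The sketch below assumes a combined model of the form $Y_{ti} = \beta_{00} + \beta_{10}\,\text{Time}_{ti} + r_{0i} + r_{1i}\,\text{Time}_{ti} + e_{ti}$, with residuals independent of each other and of the random effects. It follows the slide's notation, in which $\tau_{00}^{2}$ and $\tau_{11}^{2}$ are variances.

```latex
\begin{aligned}
\operatorname{Var}(Y_{ti}) &= \tau_{00}^{2} + 2\,\text{Time}_{ti}\,\tau_{01} + \text{Time}_{ti}^{2}\,\tau_{11}^{2} + \sigma_{e}^{2} \\
\operatorname{Cov}(Y_{ti}, Y_{t'i}) &= \tau_{00}^{2} + (\text{Time}_{ti} + \text{Time}_{t'i})\,\tau_{01} + \text{Time}_{ti}\,\text{Time}_{t'i}\,\tau_{11}^{2}, \quad t \neq t'
\end{aligned}
```

Both quantities depend on time, so under these assumptions the model does not force equal variances or a constant covariance.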
Average growth trend • Initial status: the average English score at time 0 is 235.62. • Growth rate: the average English score increases by 1.50 per unit of time.

Final estimation of fixed effects:
----------------------------------------------------------------------------
                                     Standard              Approx.
 Fixed Effect         Coefficient    Error       T-ratio   d.f.    P-value
----------------------------------------------------------------------------
 For INTRCPT1, P0
    INTRCPT2, B00      235.619409    0.344737    683.476   6822    0.000
 For TIME slope, P1
    INTRCPT2, B10        1.500423    0.033980     44.156   6822    0.000
----------------------------------------------------------------------------
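As a check on how the table is read, the T-ratio is the coefficient divided by its standard error, and the two fixed effects together define the average trajectory. The following is a minimal sketch using the estimates from the output; the function names and the evaluation at time 4 are hypothetical choices for illustration.

```python
# Estimates copied from the fixed-effects output.
B00, SE_B00 = 235.619409, 0.344737  # average initial status and its standard error
B10, SE_B10 = 1.500423, 0.033980    # average growth rate per time unit and its standard error


def t_ratio(coefficient: float, standard_error: float) -> float:
    """T-ratio as reported in the table: coefficient / standard error."""
    return coefficient / standard_error


def average_score(time: float) -> float:
    """Average predicted English score at a given time (fixed effects only)."""
    return B00 + B10 * time


print(round(t_ratio(B00, SE_B00), 2), round(t_ratio(B10, SE_B10), 2))
print(average_score(0), average_score(4))  # time 4 is an arbitrary example value
```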
Random intercept-slope • Initial status: students vary significantly in English score at time 0. • Growth rate: slopes vary significantly among students. A student whose growth rate is 1 SD above average is expected to grow at 1.50 + 0.93 = 2.43 per time unit.

Final estimation of variance components:
-----------------------------------------------------------------------------
 Random Effect         Standard     Variance      df     Chi-square    P-value
                       Deviation    Component
-----------------------------------------------------------------------------
 INTRCPT1,       R0    19.99410     399.76396    6703    11430.55547    0.000
 TIME slope,     R1     0.93035       0.86556    6703     9568.06880    0.000
 level-1,        E     24.81882     615.97397
-----------------------------------------------------------------------------
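The "1 SD above average" calculation can be reproduced from the variance component of the slope. The sketch below also computes a 95% range of individual growth rates, which assumes that the slopes are normally distributed across students; that range is an illustration added here, not a result reported in the slides.

```python
import math

B10 = 1.500423           # average growth rate (fixed effect of TIME)
TAU_SLOPE_VAR = 0.86556  # variance component of the TIME slope

# Standard deviation of individual growth rates.
slope_sd = math.sqrt(TAU_SLOPE_VAR)

# Growth rate of a student whose slope is 1 SD above the average.
one_sd_above = B10 + slope_sd

# Range expected to contain about 95% of individual slopes,
# ASSUMING normally distributed random slopes.
low = B10 - 1.96 * slope_sd
high = B10 + 1.96 * slope_sd

print(round(slope_sd, 2), round(one_sd_above, 2))
print(round(low, 2), round(high, 2))
```

Printing the range lets the reader check whether it includes zero, that is, whether the model implies that some students do not grow at all.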
Reliability • Ratio of the "true" parameter variance to the "total" observed variance. A value close to zero means that most of the observed variance is due to error. • Without knowing the reliability of the estimated growth parameters, we might wrongly conclude that no relation exists when the data are simply unable to detect one.

----------------------------------------------------
 Random level-1 coefficient    Reliability estimate
----------------------------------------------------
 INTRCPT1, B0                  0.423
 TIME, B1                      0.108
----------------------------------------------------
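A common way to formalize this ratio in hierarchical linear models is to average, over subjects, the share of variance in each subject's estimated coefficient that is true parameter variance. In the sketch below, $\tau_{qq}$ denotes the variance of the random effect for coefficient $q$, $v_{qqi}$ the sampling (error) variance of subject $i$'s estimate of that coefficient, and $N$ the number of subjects. This notation is assumed here, and the slides do not state which formula the software used.

```latex
\text{reliability}(\hat{\pi}_{q}) = \frac{1}{N}\sum_{i=1}^{N} \frac{\tau_{qq}}{\tau_{qq} + v_{qqi}}
```

Under this definition, reliability falls when subjects have few observations (large $v_{qqi}$) or when true variation between subjects is small (small $\tau_{qq}$).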
Correlation of change with initial status • Choose "print variance-covariance matrices" under output settings. • Students with higher English scores at the initial time point tend to grow faster (correlation 0.413).

Tau (as correlations)
 INTRCPT1,B0    1.000    0.413
 TIME,B1        0.413    1.000
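The correlation can be converted back to the covariance parameter $\tau_{01}$ by multiplying it by the two standard deviations from the variance-components output. The following is a minimal sketch; the function name is a hypothetical helper, and the result is only as precise as the three-decimal correlation.

```python
# Values copied from the variance-components and Tau (as correlations) output.
SD_INTERCEPT = 19.99410  # standard deviation of the random intercept
SD_SLOPE = 0.93035       # standard deviation of the random TIME slope
CORR = 0.413             # correlation between intercept and slope


def covariance_from_correlation(corr: float, sd_a: float, sd_b: float) -> float:
    """Covariance implied by a correlation and two standard deviations."""
    return corr * sd_a * sd_b


tau_01 = covariance_from_correlation(CORR, SD_INTERCEPT, SD_SLOPE)
print(round(tau_01, 2))
```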
We could make it more complicated • An intercepts- and slopes-as-outcomes model • Level 1 (within-subject model): $Y_{ti}$ is the measurement of the $i$th subject at the $t$th time point • Level 2 (between-subject model)
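The equations for this model are missing from this copy of the slides, and the slides do not say which level-2 predictor was used. As a generic sketch only, an intercepts- and slopes-as-outcomes model adds a subject-level predictor to both level-2 equations. Here $X_{i}$ is a hypothetical subject-level predictor, and $\pi_{0i}$, $\pi_{1i}$, $\beta_{01}$, $\beta_{11}$ and $\text{Time}_{ti}$ are assumed names.

```latex
\begin{aligned}
\text{Level 1:}\quad & Y_{ti} = \pi_{0i} + \pi_{1i}\,\text{Time}_{ti} + e_{ti} \\
\text{Level 2:}\quad & \pi_{0i} = \beta_{00} + \beta_{01} X_{i} + r_{0i} \\
                     & \pi_{1i} = \beta_{10} + \beta_{11} X_{i} + r_{1i}
\end{aligned}
```

In this form, $\beta_{01}$ describes how initial status differs with $X_{i}$ and $\beta_{11}$ describes how the growth rate differs with $X_{i}$.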