Mixture & Multilevel Modeling. Shaunna Clark & Ryne Estabrook NIDA Workshop – October 19, 2010. Outline . Mixture Models What is mixture modeling? Growth Mixture Model Open Mx Genetic Mixture Models Other Longitudinal Mixture Models Multilevel Models What is multilevel data?
Mixture & Multilevel Modeling Shaunna Clark & Ryne Estabrook NIDA Workshop – October 19, 2010
Outline • Mixture Models • What is mixture modeling? • Growth Mixture Model • OpenMx • Genetic Mixture Models • Other Longitudinal Mixture Models • Multilevel Models • What is multilevel data? • Multilevel regression model • OpenMx
Homogeneity Vs. Heterogeneity • Most models assume homogeneity • i.e. Individuals in a sample all follow the same model • This is what we have seen so far today • But this is not always the case • Ex: Sex, Age, Alcohol Use Trajectories
What is Mixture Modeling? • Used to model unobserved heterogeneity by identifying different subgroups of individuals • Ex: IQ, Religiosity
Growth Mixture Modeling (GMM) • Muthén & Shedden, 1999; Muthén, 2001 • Setting • A single item measured repeatedly • Hypothesized trajectory classes • Individual trajectory variation within class • Aims • Estimate trajectory shapes • Estimate trajectory class probabilities • Proportion of sample in each trajectory class • Estimate variation within class
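Linear Growth Model Diagram [Path diagram: latent intercept (I) and slope (S) factors with means mInt and mSlope, variances σ²Int and σ²Slope, and covariance σInt,Slope; observed variables x1–x5 load on I with all loadings fixed at 1 and on S with loadings 0, 1, 2, 3, 4; residual variances σ²ε1–σ²ε5]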
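Linear GMM Model Diagram [Path diagram: categorical latent class variable C added to a linear growth model; latent intercept (I) and slope (S) factors with means mInt and mSlope, variances σ²Int and σ²Slope, and covariance σInt,Slope; observed variables x1–x5 load on I with all loadings fixed at 1 and on S with loadings 0, 1, 2, 3, 4; residual variances σ²ε1–σ²ε5]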
Growth Mixture Model Equations $x_{itk} = \mathrm{Intercept}_{ik} + \lambda_{tk}\,\mathrm{Slope}_{ik} + \varepsilon_{itk}$ for individual i at time t in class k, where $\varepsilon_{itk} \sim N(0, \sigma)$
LCGA Vs. GMM • LCGA – Latent Class Growth Analysis • Nagin, 1999; Nagin & Tremblay, 1999 • Same as GMM except no residual variance on growth factors • No individual variation within class (i.e. everyone in a class has the same trajectory) • LCGA is a special case of GMM
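The GMM equation and its LCGA special case can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the workshop's OpenMx code: the function name `simulate_gmm`, all parameter values, and the simplifications (slope loadings shared across classes, intercept and slope drawn independently, one residual SD for all occasions) are made up for illustration.

```python
import random

def simulate_gmm(n, class_probs, class_means, factor_sds, resid_sd, loadings, seed=0):
    """Simulate x_it = Intercept_i + lambda_t * Slope_i + e_it within latent classes.

    class_means[k] = (mean intercept, mean slope) for class k
    factor_sds[k]  = (SD of intercept, SD of slope) within class k
    Assumptions for this sketch: loadings are the same in every class and the
    intercept and slope are drawn independently (zero covariance).
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        # Draw the unobserved class membership
        k = rng.choices(range(len(class_probs)), weights=class_probs)[0]
        m_int, m_slope = class_means[k]
        sd_int, sd_slope = factor_sds[k]
        # Individual growth factors vary around the class means
        intercept = rng.gauss(m_int, sd_int)
        slope = rng.gauss(m_slope, sd_slope)
        xs = [intercept + lam * slope + rng.gauss(0.0, resid_sd) for lam in loadings]
        data.append((k, xs))
    return data

loadings = [0, 1, 2, 3, 4]              # linear growth over five occasions
probs = [0.6, 0.4]                      # hypothetical class proportions
means = [(1.0, 0.2), (3.0, 1.0)]        # hypothetical class trajectories

# GMM: individuals vary around their class trajectory
gmm_data = simulate_gmm(200, probs, means, [(0.5, 0.1), (0.5, 0.1)], 0.3, loadings)

# LCGA: growth-factor SDs set to zero, so only residual noise remains within class
lcga_data = simulate_gmm(200, probs, means, [(0.0, 0.0), (0.0, 0.0)], 0.3, loadings)
```

Setting the growth-factor SDs to zero is what makes LCGA a special case: every member of a class then shares the class intercept and slope exactly.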
Class Enumeration • Determining the number of classes • Can’t use LRT χ² • Not distributed as χ² due to boundary conditions (McLachlan & Peel, 2000) • Information Criteria: AIC (Akaike, 1974), BIC (Schwarz, 1978) • Penalize for number of parameters and sample size • Choose the model with the lowest value • Interpretation and usefulness • Profile plot • Substantive theory • Predictive validity
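As a sketch of how the information criteria are compared across candidate models, the standard definitions are AIC = -2 log L + 2p and BIC = -2 log L + p ln(n), where p is the number of free parameters and n the sample size. The log likelihoods, parameter counts, and sample size below are hypothetical numbers chosen only to show the mechanics.

```python
import math

def aic(loglik, n_params):
    """Akaike information criterion: penalty of 2 per free parameter."""
    return -2.0 * loglik + 2.0 * n_params

def bic(loglik, n_params, n_obs):
    """Bayesian information criterion: penalty grows with log sample size."""
    return -2.0 * loglik + n_params * math.log(n_obs)

# Hypothetical fits: number of classes -> (log likelihood, free parameters)
fits = {1: (-1250.0, 5), 2: (-1190.0, 9), 3: (-1185.0, 13)}
n_obs = 300  # hypothetical sample size

aic_by_k = {k: aic(ll, p) for k, (ll, p) in fits.items()}
bic_by_k = {k: bic(ll, p, n_obs) for k, (ll, p) in fits.items()}

# The preferred model under each criterion is the one with the lowest value
best_aic = min(aic_by_k, key=aic_by_k.get)
best_bic = min(bic_by_k, key=bic_by_k.get)
print("AIC prefers", best_aic, "classes; BIC prefers", best_bic, "classes")
```

With these made-up numbers the two criteria disagree, because BIC's penalty per parameter is larger than AIC's once the sample size exceeds about 7. This is one reason the slide also lists interpretation and usefulness alongside the criteria.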
Global Vs. Local Maximum [Figure: two panels plotting log likelihood (y-axis) against parameter value (x-axis), each marking a global and a local maximum]
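A common safeguard against stopping at a local maximum is to fit the model from several sets of starting values and keep the solution with the highest log likelihood. The sketch below uses a deliberately simple model, a two-class univariate normal mixture with a common SD fit by EM, rather than a growth mixture model; the data, function names, and number of starts are all assumptions for illustration.

```python
import math
import random

def normal_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def em_two_class(xs, start_means, n_iter=200):
    """EM for a two-class normal mixture with a common SD, from given start means."""
    m1, m2 = start_means
    sd, p1 = 1.0, 0.5            # fixed starting values for the SD and class proportion
    n = len(xs)
    for _ in range(n_iter):
        # E-step: posterior probability that each observation belongs to class 1
        post = []
        for x in xs:
            a = p1 * normal_pdf(x, m1, sd)
            b = (1.0 - p1) * normal_pdf(x, m2, sd)
            post.append(a / (a + b))
        # M-step: update means, common SD, and class proportion
        n1 = max(sum(post), 1e-12)
        n2 = max(n - n1, 1e-12)
        m1 = sum(p * x for p, x in zip(post, xs)) / n1
        m2 = sum((1.0 - p) * x for p, x in zip(post, xs)) / n2
        var = sum(p * (x - m1) ** 2 + (1.0 - p) * (x - m2) ** 2
                  for p, x in zip(post, xs)) / n
        sd = max(math.sqrt(var), 1e-6)
        p1 = min(max(n1 / n, 1e-12), 1.0 - 1e-12)
    loglik = sum(math.log(p1 * normal_pdf(x, m1, sd)
                          + (1.0 - p1) * normal_pdf(x, m2, sd)) for x in xs)
    return loglik, (m1, m2), sd, p1

rng = random.Random(2010)
# Simulated data: two hypothetical classes centered at 0 and 4
xs = [rng.gauss(0.0, 1.0) for _ in range(150)] + [rng.gauss(4.0, 1.0) for _ in range(150)]

# Fit from ten random sets of starting means drawn from the data range
lo, hi = min(xs), max(xs)
fits = [em_two_class(xs, (rng.uniform(lo, hi), rng.uniform(lo, hi))) for _ in range(10)]
best = max(fits, key=lambda fit: fit[0])
print("log likelihoods by start:", [round(fit[0], 2) for fit in fits])
print("best class means:", sorted(best[1]))
```

In an easy, well-separated example like this one the starts may all agree. When different starts yield noticeably different log likelihoods, the lower ones are local maxima, and a best value reached from only a single start is a signal to try more starts.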
Open Mx Example • Take it away Ryne!
Selection of Mixture Genetic Analysis Writings • Growth Mixture Model • Wu et al., 2002; Kerner & Muthén, 2009; Gillespie et al. (submitted) • Latent Class Analysis • Eaves, 1993; Muthén et al., 2006; Clark, 2010 • Additional References • McLachlan, Do, & Ambroise, 2004
Other Longitudinal Mixture models • Survival Mixture • Multiple latent classes of individuals with different survival functions • Kaplan, 2004; Masyn, 2003; Muthén & Masyn, 2005 • Longitudinal Latent Class Analysis • Models patterns of change over time, rather than functional growth form • Lanza & Collins, 2006; Feldman et al., 2009 • Latent Transition Analysis • Models transition from one state to another over time • Ex: Drinking alcohol or not over time • Graham et al., 1991; Nylund et al., 2006
What is Multilevel Data . . . • Most methods assume individuals are independent • Responses for one individual do not influence another individual’s responses • Multilevel, or nested data, arise when individuals are not independent • Ex: Twins in a family, students in a classroom • Share common experiences
. . . And Why We Should Care • Ignoring the nested structure leads to underestimated standard errors • Can lead to misinterpretation of the significance of model parameters • Large body of literature about how to handle nested data • Today, focus on multilevel techniques • General multilevel texts: • Raudenbush & Bryk, 2002; Snijders & Bosker, 1999
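Multilevel Model Equation For individual i in cluster j: • Level One (Individual) $y_{ij} = \beta_{0j} + \beta_{1j} x_{ij} + \varepsilon_{ij}$ • Level Two (Twin Pair/Family) $\beta_{0j} = \gamma_{00} + \gamma_{01} w_j + \mu_{0j}$, $\beta_{1j} = \gamma_{10} + \gamma_{11} w_j + \mu_{1j}$ • Where $\varepsilon_{ij} \sim N(0, \sigma)$, $\mu \sim N(0, \Psi)$, $\mathrm{Cov}(\varepsilon, \mu) = 0$ • $x_{ij}$ is an individual level covariate (age, weight) • $w_j$ is a cluster level covariate (maternal smoking)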
Multilevel Model Equation extensions • Can have additional levels • Ex: Individuals within nuclear families with family • Can be longitudinal • Ex: Observations within individuals within families
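Mixed Effects Vs. Multilevel Modeling • They are the same thing!! • Multilevel Model Equation: Level One (L1): $y_{ij} = \beta_{0j} + \beta_{1j} x_{ij} + \varepsilon_{ij}$ Level Two (L2): $\beta_{0j} = \gamma_{00} + \mu_{0j}$, $\beta_{1j} = \gamma_{10} + \mu_{1j}$ • Mixed Model Equation: Plug L2 into L1, some rearranging: $y_{ij} = (\gamma_{00} + \mu_{0j}) + (\gamma_{10} + \mu_{1j}) x_{ij} + \varepsilon_{ij}$, so $y_{ij} = \gamma_{00} + \gamma_{10} x_{ij} + \mu_{0j} + \mu_{1j} x_{ij} + \varepsilon_{ij}$ • Fixed Effects: $\gamma_{00} + \gamma_{10} x_{ij}$ • Random Effects: $\mu_{0j} + \mu_{1j} x_{ij}$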
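The equivalence of the two forms can be checked numerically. The sketch below computes the outcome once by building the cluster-specific coefficients at Level Two and plugging them into Level One, and once from the rearranged mixed-model form; the function names and all parameter values are hypothetical.

```python
import random

def y_multilevel(g00, g10, u0, u1, x, e):
    """Two-step form: Level Two builds the coefficients, Level One uses them."""
    b0 = g00 + u0            # cluster-specific intercept
    b1 = g10 + u1            # cluster-specific slope
    return b0 + b1 * x + e

def y_mixed(g00, g10, u0, u1, x, e):
    """Single-equation form: fixed effects plus random effects plus residual."""
    fixed = g00 + g10 * x
    random_part = u0 + u1 * x
    return fixed + random_part + e

rng = random.Random(1)
g00, g10 = 2.0, 0.5          # hypothetical fixed effects
max_diff = 0.0
for _ in range(1000):
    u0, u1 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 0.3)   # cluster random effects
    x, e = rng.uniform(0.0, 10.0), rng.gauss(0.0, 1.0)  # covariate and residual
    diff = abs(y_multilevel(g00, g10, u0, u1, x, e) - y_mixed(g00, g10, u0, u1, x, e))
    max_diff = max(max_diff, diff)
print("largest difference between the two forms:", max_diff)
```

Any difference between the two forms comes from floating-point rounding only, since the second is an algebraic rearrangement of the first.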
Multilevel Vs. Multivariate Modeling of Families • So far today we have dealt with multivariate analyses • Multivariate • Model for all variables for each family member • Family members can have different parameter values • Ex: different growth trajectories for parents vs. children • Only feasible when there is a small number of family members • Ex: twins, spouses [Path diagram: A, C, and E factors for phenotypes PA and PB]
Multilevel Modeling of Families • Model for variation within individual and between family members • Members of a cluster are assumed statistically equivalent • i.e. Same model for each family member • Can handle various family structures • Ex: Large pedigrees, families with differing numbers of siblings • Do not have to make arbitrary assignment of family members (and checking whether assignment impacted estimates) • Ex: Assigning twins to A and B
Implementation of Multilevel Models in OpenMx • OpenMx Discussion http://openmx.psyc.virginia.edu/thread/125 • Discuss more tomorrow in Dynamical Systems talk
Multilevel Genetic Articles • General • Discuss how to extend ACDE model to twins and larger family pedigrees • Guo & Wang, 2002; McArdle & Prescott, 2005; Rabe-Hesketh, Skrondal, Gjessing, 2008 • Longitudinal • McArdle, 2006 • Other • Inclusion of measured genotypes: Van den Oord, 2001
Data Considerations • Multivariate – Wide • Multiple family members per row of data • Multilevel – Long • One individual per row of data
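Moving from the wide layout to the long layout is a simple reshaping step. The sketch below assumes a hypothetical naming convention in which each family member's variables carry a suffix (`y_A`, `y_B`); the column names, helper name `wide_to_long`, and values are made up for illustration.

```python
def wide_to_long(wide_rows, id_key, member_suffixes, variables):
    """Turn one row per family into one row per individual."""
    long_rows = []
    for row in wide_rows:
        for suffix in member_suffixes:
            record = {id_key: row[id_key], "member": suffix}
            for var in variables:
                # Wide columns are assumed to be named like "y_A", "age_B"
                record[var] = row[f"{var}_{suffix}"]
            long_rows.append(record)
    return long_rows

# Wide: multiple family members per row (hypothetical twin data)
wide = [
    {"family": 1, "y_A": 10.2, "y_B": 9.8, "age_A": 21, "age_B": 21},
    {"family": 2, "y_A": 7.5, "y_B": 8.1, "age_A": 34, "age_B": 34},
]

# Long: one individual per row, with the family identifier kept as the cluster variable
long_rows = wide_to_long(wide, "family", ["A", "B"], ["y", "age"])
for record in long_rows:
    print(record)
```

In the long layout the family identifier serves as the cluster variable, so families with different numbers of members simply contribute different numbers of rows.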