Analysis of Clustered and Longitudinal Data
Module 3: Linear Mixed Models (LMMs) for Clustered Data – Two Level, Part A
Biostat 512: Module 3A - Kathy Welch, Heidi Reichert
The Linear Mixed Model (LMM)
• A Linear Mixed Model is a parametric model for a continuous outcome.
• The model is linear in the parameters.
• The model contains both fixed and random effects.
• LMMs can be used to analyze both clustered and longitudinal/repeated measures data.
• We will discuss the analysis of clustered data using LMMs in this module and cover the analysis of longitudinal and repeated measures data using LMMs in later modules.
Data Example: Rat Pup Data
Thirty female rats were randomly assigned to one of three treatment groups: high dose, low dose, and control. The objective of the study was to compare the birth weights of pups from litters born to female rats that received the drug treatment at high and low doses with the birth weights of pups from litters born to female rats that received the control treatment.
Research question: Is there an effect of drug treatment (High, Low, Control) on birth weight?
Clustered Data Example: Rat Pup Data
• The design is unbalanced
  • Number of rats receiving each treatment varies by treatment group (3 rats in the high-dose group died)
  • Number of rat pups per litter varies across the litters
• Variables include
  • Litter (litter ID number)
  • Pup_ID (rat pup ID number)
  • Weight (birth weight of the rat pup: the outcome)
  • Sex (sex of the rat pup: female or male)
  • Treatment (dose: high, low, or control)
The Rat Pup Data is Multilevel
• Level 2 (Litter): Litter 1, Litter 2, ...
• Level 1 (Rat Pup): Pup 11, Pup 21, ..., Pup n1 within Litter 1; Pup 12, Pup 22, ..., Pup n2 within Litter 2; ...
• Level 1 Variables: Birth Weight, Sex
• Level 2 Variables: Treatment
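The two-level layout can be sketched as a small data structure. This is only an illustration: the class names, field names, and all numeric values below are invented for the example and are not taken from the rat pup data set. The point is that a Level 2 variable (treatment) is stored once per litter, while Level 1 variables (weight, sex) are stored once per pup, and that flattening to one row per pup repeats the Level 2 value within a litter.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Pup:
    # Level 1 unit: one record per rat pup (values are hypothetical).
    pup_id: int
    weight: float
    sex: str


@dataclass
class Litter:
    # Level 2 unit: one record per litter; treatment is constant within a litter.
    litter_id: int
    treatment: str
    pups: List[Pup] = field(default_factory=list)


# Hypothetical example data, NOT the real study values.
litters = [
    Litter(1, "Control", [Pup(1, 6.60, "Male"), Pup(2, 7.40, "Female")]),
    Litter(2, "High", [Pup(3, 5.90, "Male")]),
]


def to_long(litters):
    """Flatten to one row per pup, copying the Level 2 variable onto each row."""
    rows = []
    for lit in litters:
        for pup in lit.pups:
            rows.append({
                "litter": lit.litter_id,
                "pup_id": pup.pup_id,
                "weight": pup.weight,
                "sex": pup.sex,
                "treatment": lit.treatment,
            })
    return rows


rows = to_long(litters)
for row in rows:
    print(row)
```

The long (one row per pup) layout is the shape that mixed-model software typically expects, with the litter ID identifying the cluster.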
Weights Vary Within and Between Litters
• Birth weights vary from pup to pup within the same litter.
• The average weight within a litter varies between litters.
Weights are Correlated Within Litters
• The weights of pups from the same litter tend to be similar.
• For some litters, the pup weights lie entirely above or below the overall average.
Summarize the Level 1 Covariate(s)
Level 1 covariate is sex

        sex |      Freq.     Percent        Cum.
------------+-----------------------------------
     Female |        151       46.89       46.89
       Male |        171       53.11      100.00
------------+-----------------------------------
      Total |        322      100.00
Summarize Weight by the Level 1 Covariate(s)
Y is Weight; Level 1 covariate is sex

Summary for variables: weight
     by categories of: female (Sex)

  female |         N      mean        sd       min       max
---------+--------------------------------------------------
       0 |       171  6.205322  .6741926      4.57      8.33
       1 |       151  5.940132  .5867458      3.68      7.73
---------+--------------------------------------------------
   Total |       322  6.080963  .6474272      3.68      8.33
------------------------------------------------------------
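As a quick arithmetic check on the summary table, the overall mean should equal the sample-size-weighted average of the two group means. The sketch below uses only the counts and means printed in the table; the variable names are made up for the example. It also computes the raw (unadjusted) female minus male difference in mean birth weight, which ignores both litter and treatment.

```python
# Counts and means copied from the summary table (female = 0 is male, 1 is female).
groups = {
    "male": (171, 6.205322),
    "female": (151, 5.940132),
}

n_total = sum(n for n, _ in groups.values())

# Weighted average of the group means reproduces the overall mean.
overall_mean = sum(n * m for n, m in groups.values()) / n_total

# Raw difference in means, not adjusted for litter or treatment.
raw_diff = groups["female"][1] - groups["male"][1]

print(f"n = {n_total}, overall mean = {overall_mean:.6f}")
print(f"female - male (raw) = {raw_diff:.6f}")
```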
Visualize Weight by the Level 1 Covariate(s)
• Use boxplots to assess the effect of sex
Summarize the Level 2 Covariates
Level 2 covariate is treatment group

  treatment |      Freq.     Percent        Cum.
------------+-----------------------------------
    Control |         10       37.04       37.04
       High |          7       25.93       62.96
        Low |         10       37.04      100.00
------------+-----------------------------------
      Total |         27      100.00
Visualize Weight by the Level 2 Covariates
• Use boxplots to assess the effect of treatment
The Linear Mixed Model (LMM) for Clustered Data
• LMMs for clustered data allow for both fixed and random effects.
• Fixed effects may be modeled at any level of the data.
• In the rat pup data, we are interested in the fixed effects of sex and treatment.
• Sex can vary from rat to rat. It is measured at Level 1.
• Treatment is constant for rats within the same litter. It is measured at Level 2.
• Random effects usually include a random intercept for each level of clustering to account for possible correlation within clusters, and to make inference to the larger population of clusters.
• In the rat pup model, we will include a random intercept term.
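The LMM for the Rat Pup Data
• We start with the simplest mixed model:

$$Y_{ij} = \underbrace{\beta_0}_{\text{fixed}} + \underbrace{b_{0j} + \varepsilon_{ij}}_{\text{random}}$$

where
• $i$ denotes a rat pup
• $j$ denotes the litter
• $\beta_0$ is the overall intercept term
• $b_{0j}$ is the random deviation from the fixed intercept for litter $j$
• $\varepsilon_{ij}$ is the random error for the $i$th rat pup in the $j$th litter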
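To make the random-intercept model concrete, here is a minimal simulation sketch: each weight is the overall intercept plus a litter-level deviation shared by all pups in the litter plus a pup-level error. Several things here are assumptions for illustration only and are not stated in the slides: the normal distributions for both random terms, the balanced design (12 pups in each of 27 litters, whereas the real data are unbalanced), the parameter values, and the function name `simulate_litters`.

```python
import random
from statistics import mean, pvariance


def simulate_litters(n_litters=27, pups_per_litter=12, beta0=6.2,
                     sd_litter=0.55, sd_resid=0.44, seed=512):
    """Simulate weights as beta0 + b0j + eps_ij (all values are illustrative)."""
    rng = random.Random(seed)
    rows = []
    for j in range(1, n_litters + 1):
        # One random deviation per litter, shared by every pup in that litter.
        b0j = rng.gauss(0.0, sd_litter)
        for i in range(1, pups_per_litter + 1):
            # Independent pup-level error (assumed normal for this sketch).
            eps_ij = rng.gauss(0.0, sd_resid)
            rows.append((j, i, beta0 + b0j + eps_ij))
    return rows


rows = simulate_litters()

# Group simulated weights by litter.
by_litter = {}
for litter, _pup, weight in rows:
    by_litter.setdefault(litter, []).append(weight)

litter_means = [mean(w) for w in by_litter.values()]
within_vars = [pvariance(w) for w in by_litter.values()]

print(f"variance of litter means:       {pvariance(litter_means):.3f}")
print(f"average within-litter variance: {mean(within_vars):.3f}")
```

Because every pup in a litter shares the same litter-level deviation, the simulated litter means spread out around the overall intercept, which is the between-litter variation the random intercept is meant to capture.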
The LMM for the Rat Pup Data

------------------------------------------------------------------------------
      weight |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _cons |   6.195284   .1090958    56.79   0.000     5.981461    6.409108
------------------------------------------------------------------------------

------------------------------------------------------------------------------
  Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
litter: Identity             |
                  var(_cons) |   .3003704    .092285      .1644887    .5485019
-----------------------------+------------------------------------------------
               var(Residual) |   .1963076    .016214      .1669676    .2308033
------------------------------------------------------------------------------
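The z statistic and confidence interval for the intercept can be reproduced from the coefficient and its standard error, which may help when reading the output. The sketch below uses only numbers printed in the table. The second part is an inference from the numbers rather than something the slides state: the asymmetric interval for var(_cons) appears consistent with a Wald interval computed on the log scale, so that calculation is shown as an assumption.

```python
import math
from statistics import NormalDist

# Fixed intercept: values copied from the output table.
coef, se = 6.195284, 0.1090958

z = coef / se                            # Wald z statistic
z_crit = NormalDist().inv_cdf(0.975)     # about 1.96 for a 95% interval
ci = (coef - z_crit * se, coef + z_crit * se)

print(f"z = {z:.2f}, 95% CI = ({ci[0]:.6f}, {ci[1]:.6f})")

# Variance of the random intercepts: values copied from the output table.
var_b, se_var_b = 0.3003704, 0.092285

# ASSUMPTION: interval formed on the log scale (delta-method SE = se / estimate),
# then back-transformed. This is inferred from the printed limits.
half_width_log = z_crit * se_var_b / var_b
ci_var = (var_b * math.exp(-half_width_log), var_b * math.exp(half_width_log))

print(f"var(_cons) 95% CI = ({ci_var[0]:.6f}, {ci_var[1]:.6f})")
```

A log-scale interval keeps both limits positive, which is sensible for a variance and explains why the printed interval is not symmetric around the estimate.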
The LMM for the Rat Pup Data
• The random portion of the model now involves two parts: the cluster-specific random deviations (the $b_{0j}$) and the subject-within-cluster-specific errors (the $\varepsilon_{ij}$).
• This LMM is commonly referred to as the Variance Components model, because it partitions the total variation in the outcome into between-cluster variation and within-cluster variation.
• The variance of the random intercepts is the between-cluster variation, also referred to as the Level 2 variance.
• The variance of the residuals is the within-cluster variation, also known as the Level 1 variance.
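The partition of total variation can be worked through numerically with the two variance estimates from the intercept-only model output (var(_cons) = .3003704 and var(Residual) = .1963076). This is a small sketch with made-up variable names; the proportions are simple arithmetic on those two estimates.

```python
# Variance component estimates from the intercept-only model output.
var_between = 0.3003704   # Level 2: variance of the random intercepts
var_within = 0.1963076    # Level 1: residual variance

total_var = var_between + var_within

# Share of the total variation attributable to each level.
share_between = var_between / total_var
share_within = var_within / total_var

print(f"total variance       = {total_var:.6f}")
print(f"between-litter share = {share_between:.3f}")
print(f"within-litter share  = {share_within:.3f}")
```

On these estimates, roughly 60% of the total variation in birth weight lies between litters and roughly 40% within litters.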
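The LMM for the Rat Pup Data
• We now add the dummy variables for the Level 2 covariate, Treatment:

$$Y_{ij} = \underbrace{\beta_0 + \beta_1\,\text{High}_j + \beta_2\,\text{Low}_j}_{\text{fixed}} + \underbrace{b_{0j} + \varepsilon_{ij}}_{\text{random}}$$

where
• $i$ denotes a rat pup
• $j$ denotes the litter
• $\text{High}_j$ and $\text{Low}_j$ are dummy variables indicating the treatment group of litter $j$
• $\beta_0$ is the overall intercept term, and represents the mean for the Control group
• $\beta_1$ and $\beta_2$ are the differences in mean birth weight for the High and Low treatment groups, respectively, compared to Control
• $b_{0j}$ is the random deviation from the treatment-specific intercept for litter $j$
• $\varepsilon_{ij}$ is the random error for the $i$th rat pup in the $j$th litter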
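The interpretation of the treatment coefficients follows from taking expectations of the model within each group. The worked step below assumes the usual convention that the random litter deviations and the random errors each have mean zero, which the slides do not state explicitly; the symbols $\beta_0$, $\beta_1$, $\beta_2$ denote the intercept and the High and Low dummy coefficients.

```latex
\begin{align*}
E[Y_{ij} \mid \text{Control}] &= \beta_0 \\
E[Y_{ij} \mid \text{High}]    &= \beta_0 + \beta_1 \\
E[Y_{ij} \mid \text{Low}]     &= \beta_0 + \beta_2
\end{align*}
```

Subtracting the first line from the second and third shows that $\beta_1$ and $\beta_2$ are the High minus Control and Low minus Control differences in mean birth weight.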
The LMM for the Rat Pup Data

------------------------------------------------------------------------------
      weight |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    treatnum |
          1  |  -.3944372   .2695682    -1.46   0.143    -.9227811    .1339067
          2  |  -.4287423   .2434727    -1.76   0.078    -.9059401    .0484555
             |
       _cons |   6.453315   .1716384    37.60   0.000      6.11691     6.78972
------------------------------------------------------------------------------

------------------------------------------------------------------------------
  Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
litter: Identity             |
                  var(_cons) |    .276991   .0905209      .1459796    .5255803
-----------------------------+------------------------------------------------
               var(Residual) |   .1965504   .0162532      .1671422    .2311328
------------------------------------------------------------------------------
The LMM for the Rat Pup Data
• The addition of the Level 2 dummies for treatment has reduced the Level 2 between-cluster variance (the estimate drops from about 0.300 to about 0.277).
• The variance of the random intercepts (the $b_{0j}$s) is smaller because the systematic variation due to treatment has been removed.
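The size of this reduction can be quantified as the proportion of a variance component removed when the covariate is added. The sketch below uses the estimates printed in the two model outputs (intercept-only, then with treatment dummies); the helper name `proportion_reduced` is made up for the example.

```python
def proportion_reduced(var_before, var_after):
    """Proportion of a variance component removed after adding covariates."""
    return (var_before - var_after) / var_before


# Level 2 (between-litter) variance: intercept-only model vs. model with treatment.
level2_change = proportion_reduced(0.3003704, 0.276991)

# Level 1 (residual) variance for the same two models.
level1_change = proportion_reduced(0.1963076, 0.1965504)

print(f"Level 2 variance reduced by {level2_change:.1%}")
print(f"Level 1 variance changed by {-level1_change:+.2%}")
```

On these estimates the between-litter variance falls by roughly 8%, while the residual variance is essentially unchanged, which is what one would expect when the added covariate varies only between litters.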
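The LMM for the Rat Pup Data
• We now add the dummy variable for the Level 1 covariate, Sex:

$$Y_{ij} = \underbrace{\beta_0 + \beta_1\,\text{High}_j + \beta_2\,\text{Low}_j + \beta_3\,\text{Female}_{ij}}_{\text{fixed}} + \underbrace{b_{0j} + \varepsilon_{ij}}_{\text{random}}$$

where
• $i$ denotes a rat pup
• $j$ denotes the litter
• $\beta_0$ is the overall intercept term, and represents the mean for Males in the Control group
• $\beta_1$ and $\beta_2$ are the differences in mean birth weight for the High and Low treatment groups, respectively, compared to Control
• $\beta_3$ is the effect of being Female compared to Male
• $b_{0j}$ is the random deviation from the treatment-specific intercept for litter $j$
• $\varepsilon_{ij}$ is the random error for the $i$th rat pup in the $j$th litter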
The LMM for the Rat Pup Data

------------------------------------------------------------------------------
      weight |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    treatnum |
          1  |   -.354683   .2893063    -1.23   0.220    -.9217129    .2123469
          2  |  -.3747049   .2617241    -1.43   0.152    -.8876746    .1382648
             |
    1.female |  -.3612726   .0477986    -7.56   0.000    -.4549561   -.2675891
       _cons |   6.606246   .1856211    35.59   0.000     6.242436    6.970057
------------------------------------------------------------------------------

------------------------------------------------------------------------------
  Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
litter: Identity             |
                  var(_cons) |   .3259097   .1037444      .1746388    .6082104
-----------------------------+------------------------------------------------
               var(Residual) |   .1636033   .0135447      .1390981    .1924257
------------------------------------------------------------------------------
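The fixed-effect estimates can be combined into predicted mean birth weights for each treatment-by-sex group. The sketch below uses only the coefficients printed in the output. The treatment levels are kept as the output labels them (`treatnum` 1 and 2, with the omitted level as the reference, which the slides describe as Control); the function name and the 0/1/2 coding of its argument are choices made for this example.

```python
# Fixed-effect estimates copied from the output table.
coef = {
    "_cons": 6.606246,       # reference group: male pups in the reference treatment
    "treatnum1": -0.354683,
    "treatnum2": -0.3747049,
    "female": -0.3612726,
}


def predicted_mean(treatnum, female):
    """Predicted mean weight; treatnum 0 is the reference level (illustrative coding)."""
    value = coef["_cons"]
    if treatnum in (1, 2):
        value += coef[f"treatnum{treatnum}"]
    if female:
        value += coef["female"]
    return value


for treatnum in (0, 1, 2):
    for female in (False, True):
        sex = "female" if female else "male"
        print(f"treatnum={treatnum}, {sex}: {predicted_mean(treatnum, female):.4f}")
```

These are predictions from the fixed part of the model only, that is, for a litter whose random deviation is zero.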
The LMM for the Rat Pup Data
• The addition of the Level 1 dummy for sex has reduced the Level 1 within-cluster variance (the estimate drops from about 0.197 to about 0.164).
• The residual variance is smaller because the systematic variation due to sex has been removed.
The LMM Accounts for Correlation
We say that given the $b_{0j}$s, the $Y_{ij}$s within a cluster are independent.
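The way a random intercept induces within-cluster correlation can be seen from a short derivation. The notation here is an assumption, since the slide's formulas did not survive: $\sigma^2_b$ denotes the variance of the random intercepts, $\sigma^2$ the residual variance, and the random intercepts and errors are taken to be mutually independent. For two different pups $i \neq i'$ in the same litter $j$:

```latex
\begin{align*}
\operatorname{Var}(Y_{ij}) &= \sigma^2_b + \sigma^2 \\
\operatorname{Cov}(Y_{ij}, Y_{i'j}) &= \operatorname{Var}(b_{0j}) = \sigma^2_b \\
\operatorname{Corr}(Y_{ij}, Y_{i'j}) &= \frac{\sigma^2_b}{\sigma^2_b + \sigma^2} \\
\operatorname{Cov}(Y_{ij}, Y_{i'j} \mid b_{0j}) &= 0
\end{align*}
```

Marginally, two pups in the same litter are correlated because they share the same random intercept; once that intercept is conditioned on, only the independent errors remain. Pups from different litters share no random term and are uncorrelated under these assumptions.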
The Linear Mixed Model (LMM) for Clustered Data
• LMMs for clustered data generally include both fixed and random effects.
• We include random intercepts for each level of clustering.
• In LMMs the random part of the model involves two parts: the $b_{0j}$s and the $\varepsilon_{ij}$s.
• The variance of the random intercepts (the $b_{0j}$s) quantifies the between-cluster variation in the outcome.
• The residual variance (variance of the $\varepsilon_{ij}$s) quantifies the within-cluster variation in the outcome.
Lab Example Rat Pup Data