Lecture 8: Generalized Linear Models for Longitudinal Data

Lecture 8:Generalized Linear Models for Longitudinal Data

Generalized Linear Models for Longitudinal Data • Generalized Linear Models: A Review • The Logistic Regression Model • Marginal model • Random effects model • Transition model • Contrasting Approaches Different assumptions about the correlation structure

GLM Examples • Linear regression • Logistic regression • Poisson regression

Note for GLMs • var(Yi) may be a function of i • Logistic: var(Yi) = i(1- i) • Poisson: var(Yi) = I • Logistic regression model • Marginal logistic regression model • Logistic model with random effects • Transition logistic model

1. Marginal Logistic Regression Model • Goal: To assess the dependence of respiratory infection on vitamin A status in the Indonesian Children’s Health Study

Parameter Interpretation

Correlation between binary outcomes Two options: • Specify pairwise correlations • Model association among binary data using the odds ratio Which is better? Option 2.

2. Logistic model with random effects Assume: • The propensity for respiratory infections varies across children, reflecting their different genetic predispositions and unmeasured influences of environmental factors • Each child has his/her own propensity for respiratory disease , but that the effect of vitamin A deficiency ( ) on this probability is the same for every child, i.e. • Given i, we further assume that the repeated observations for the ith child are independent of one another

Logistic model with random effects (cont’d) = log odds of respiratory infection for a “typical” child (with random effect )

Logistic model with random effects (cont’d) = odds of infection for a child with random effect when he/she is vitamin A deficient relative to when the same child is not vitamin A deficient = degree of heterogeneity across the children in the propensity of disease, not attributable to x Ratio of individual odds Ratio of population odds

Random effects (RE) model: Basic ideas • There is natural heterogeneity across individuals in their regression coefficients, and this heterogeneity can be explained by a probability distribution • RE models are most useful when the objective is to make inference about individuals rather than the population average • represents the effects of the explanatory variables on an individual child’s chance of infection • …this is in contrast with the marginal model coefficients, which describe the effect of explanatory variables on the population average

Parameter Interpretation of a Logistic Regression Model with Random Effects Note: The odds of respiratory infection for a hypothetical child with random effect Ui and with vitamin A deficiency, are equal to times the odds of respiratory infection for the same hypothetical child with random effect Uiwithout vitamin A deficiency.

Parameter Interpretation of a Logistic Regression Model with Random Effects(cont’d) Compare the individual odds from the previous slide: …with the population average odds below:

3. Transition models for binary responses(a.k.a. Markov Models) • i.e. the chance of respiratory infection at time tij depends on explanatory variables, and also on whether or not the child had infection 3 months earlier = change in the log odds of infection for a unit change in x, among children with outcome yij-1 at the prior visit = ratio of the odds of infection among children who did and did not have infection at the prior visit

Transition matrix We can also specify this logistic model via a transition matrix: Elements of matrix are “transition probabilities”, example:

Parameter Interpretation of a Transition Logistic Regression Model Note: The odds of respiratory infection for a hypothetical child with outcome at the previous visit equal to yij-1with vitamin A deficiency are equal to times the odds of repiratory infection for the same hypothetical child with outcome at the previous visit equal to yij-1withoutvitamin A deficiency.

In summary • Marginal model: describes the effect of explanatory variables on the chance of infection in the entire population. • Random effects model describes the effect of the explanatory variables on an individual chance of infection. • Transition model describes the effect of explanatory variables on the chance of infection adjusted by the outcome for respiratory infection at the prior visit.

In summary (cont’d) Note that this transition model: is also a marginal logistic regression model for the sub-population of children who are free of infection at the previous visit.

Contrasting Approaches • In linear models, the interpretation of β is essentially independent of the correlation structure. • In non-linear models for discrete data, such as logistic regression, different assumptions about the source of correlation can lead to regression coefficients with distinct interpretations. • Two examples: • Infant growth • Respiratory disease data

Linear Regression Model:Infant Growth

Linear Regression Model:Infant Growth (cont’d) In the linear case, it is possible to formulate the three regression approaches to have coefficients with the same interpretation. That is, coefficients from a random effect model and a transition model can have marginal interpretations as well.

Linear Random Effects Model In the linear random effects model, the regression coefficients also have a marginal interpretation: …this is because the average of the rates of growth for individuals is the same as the change in the population average weight across time in a linear model. Note:

Linear Random Effects Model (cont’d)

Transition Model, AR(1):Infant Growth This form of transition model has coefficients which have a marginal interpretation

In summary • Marginal model: describes the change in the average response for a unit change in xij for the entire population • Random effects model describes the change in the average response for a unit change in xij for a particular subject, and for the entire population • Transition model describes the change of the average response for a unit change in xijamong subjects with yij-1 in the prior visit

In summary (cont’d) That is, these coefficients have the same interpretation!, and their numerical estimates will be very similar:

A Simulation Example

A Simulation Example (cont’d)

A Simulation Example (cont’d) However, the corresponding change in absolute risk differs depending on the baseline rate The population rate of infection is the average risk, which is given by:

Implementation of Simulation Study

If we use a marginal model… • In the marginal model, we ignore the differences among children and model the population average, i.e. P(Yij=1) rather than P(Yij = 1 | Ui) • We estimate the fraction of people with infection among the people with (or without) vitamin A deficiency: • Infection rate in the sub-group that has sufficient vitamin A deficiency is estimated to be:

If we use a marginal model… (cont’d) • Odds ratio for vitamin A deficiency is: so that

Figure 7.2. Simulation of risk of infection with and without dichotomous risk factor, for 100 children in a logistic model with random intercept. Population average risks with and without infection are indicated by the vertical line on the left. The Gaussian density for the random intercepts is shown below.

Marginal Model vs. Random Effects • Marginal and Random Effects model parameters differ in the logistic model • Marginal: ratio of population odds • Random Effects: ratio of individuals’ odds • Marginal parameter values are small in absolute values than their random effects analogues • Parameters in transition models also differ from both the random effects and marginal model parameters

Lecture 8: Generalized Linear Models for Longitudinal Data