Repeated Measures in Statistical Modeling
ANOVA/Linear Model Assumptions • All observations are independent. • All observations have the same, simple variance (σ²).
A Simplified Example • We ask informants to repeat a given set of words five times to test the effect of sex on the frequency of the first formant (F1). • Are these five repetitions independent?
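One way to get a feel for the question is a small simulation. The sketch below is only an illustration under assumed numbers (a 500 Hz mean, a 50 Hz between-speaker standard deviation, and a 20 Hz repetition-to-repetition standard deviation are all hypothetical, and the sex effect is left out for simplicity). It compares the correlation between two repetitions from the same speaker with the correlation between repetitions from different speakers.

```python
import random


def pearson(xs, ys):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def simulate_f1(n_subjects=200, n_reps=5, seed=1):
    """Simulate F1 (Hz): a personal baseline per speaker plus repetition noise.

    All numbers are hypothetical assumptions, not values from the slides.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n_subjects):
        baseline = rng.gauss(500.0, 50.0)  # speaker-specific level
        data.append([baseline + rng.gauss(0.0, 20.0) for _ in range(n_reps)])
    return data


data = simulate_f1()
rep1 = [row[0] for row in data]
rep2 = [row[1] for row in data]

# Repetitions 1 and 2 from the same speaker
r_within = pearson(rep1, rep2)

# Repetition 1 paired with repetition 2 of a different speaker
rep2_shifted = rep2[1:] + rep2[:1]
r_between = pearson(rep1, rep2_shifted)

print(f"same speaker:       r = {r_within:.2f}")
print(f"different speakers: r = {r_between:.2f}")
```

Under these assumptions the same-speaker correlation should be strongly positive, while the cross-speaker correlation should sit near zero: knowing one repetition tells us a lot about the next one from the same person.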
Previous Solutions • A similar problem was noted by Saito (1998) for VARBRUL results: multiple observations originating from the same subject. • Young and Yandell's (1998) response was to add a subject effect when a person appeared to behave significantly differently from the rest of the population.
Adding a subject or repetition effect • We could add a subject effect to see if people are behaving differently. • We could add a repetition effect to see if the first repetition is significantly different.
Independence: What does it mean in a formal mathematical sense? • An intuitive definition: two observations are independent if knowing one gives us no information about the other. • A formal mathematical definition: two observations are independent if the joint probability of the two observations occurring equals the product of their individual probabilities.
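Written out, the definition reads as follows. The symbols Y_1 and Y_2 (two observations) and f (a density) are notation introduced here for illustration and do not appear in the slides.

```latex
% Discrete case: the joint probability factorizes
P(Y_1 = y_1,\; Y_2 = y_2) = P(Y_1 = y_1)\, P(Y_2 = y_2)
  \quad \text{for all } y_1, y_2

% Continuous case (e.g. formant frequencies): the joint density factorizes
f_{Y_1, Y_2}(y_1, y_2) = f_{Y_1}(y_1)\, f_{Y_2}(y_2)

% Equivalent form of the intuitive definition:
% conditioning on Y_1 does not change the distribution of Y_2
P(Y_2 = y_2 \mid Y_1 = y_1) = P(Y_2 = y_2)
  \quad \text{whenever } P(Y_1 = y_1) > 0
```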
Proposed Solution 1: Subject Effect • What does testing a subject effect mean? • Non-subject model • Y = intercept + sex effect + noise. • Subject model • Y = intercept + sex effect + subject effect + noise. • So? • We are modeling the mean effect of each subject on our response (F1.0). This does not make each subject independent. • We still have the original problem, but a more complex model. • When does it make sense to have a subject term? • When you are searching for variation in the response that your other factors do not explain and that may instead reflect the idiosyncratic behavior of one or several individuals.
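In symbols, the two models can be sketched as below. The notation is an assumption made for illustration: i indexes subjects, j indexes repetitions, and the Greek letters are not taken from the slides.

```latex
% Non-subject model
Y_{ij} = \mu + \alpha_{\mathrm{sex}(i)} + \varepsilon_{ij}

% Subject model: adds a fixed mean shift \gamma_i for each subject
Y_{ij} = \mu + \alpha_{\mathrm{sex}(i)} + \gamma_i + \varepsilon_{ij}

% In both models the noise is still assumed independent with one variance
\varepsilon_{ij} \overset{\text{iid}}{\sim} N(0, \sigma^2)
```

The subject term changes only the mean structure. The assumption on the noise term is the same in both lines, which is why the added term does not address the independence question.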
Proposed Solution 2: Repetition Effect • Add a repetition effect (i.e. code each repeated measure as the first, second, etc. repetition). • Again, adding more terms to the model does not make the observations independent. • What are we testing? • By doing this, we are testing whether there is a difference between one repetition and each of the others. This may not be useful except to test for a drift effect (e.g. the equipment is malfunctioning consistently across interviews).
What can we do? • Repeated measures provide us with more information about the response we are measuring, but also introduce a level of complexity that may not be necessary. • Average over the repetitions to produce an ‘average value’ for each factor combination. • Don’t take repeated measures.
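Averaging over the repetitions could look like the following minimal sketch. The record layout, the subject labels, the word, and the F1 values are all hypothetical and are used only to show the bookkeeping.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical long-format records: (subject, sex, word, repetition, F1 in Hz)
records = [
    ("s1", "F", "heed", 1, 310.0),
    ("s1", "F", "heed", 2, 320.0),
    ("s1", "F", "heed", 3, 330.0),
    ("s2", "M", "heed", 1, 270.0),
    ("s2", "M", "heed", 2, 280.0),
    ("s2", "M", "heed", 3, 290.0),
]


def average_over_repetitions(rows):
    """Collapse repetitions into one mean F1 per (subject, sex, word)."""
    groups = defaultdict(list)
    for subject, sex, word, _rep, f1 in rows:
        groups[(subject, sex, word)].append(f1)
    return {key: mean(values) for key, values in groups.items()}


averaged = average_over_repetitions(records)
for key, value in averaged.items():
    print(key, value)
```

After this step each subject contributes a single value per factor combination, so the repetitions no longer enter the model as separate observations.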
Statistical Solutions • If we keep the repetitions and wish to account for them, we have to modify our model’s variance assumptions. • Subject-specific models • We are focused on individuals and wish to analyze or predict information for specific individuals (an example of this would be ram fertility). • Random intercepts: each individual has their own intercept, which contributes to the mean of the response variable and also adds a noise component. • Population-average models • We are focused on inference for the entire population and don’t really care about the individuals in the study. • Usually produce statistics for each individual level (except the reference parameter).
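To make the change in variance assumptions concrete, a random-intercept model can be sketched as follows. The notation (b_i for the subject intercept, sigma_b for its standard deviation) is an assumption introduced here, not taken from the slides.

```latex
% Random-intercept model: subject i, repetition j
Y_{ij} = \mu + \alpha_{\mathrm{sex}(i)} + b_i + \varepsilon_{ij},
  \qquad b_i \sim N(0, \sigma_b^2), \quad
  \varepsilon_{ij} \sim N(0, \sigma^2), \quad
  b_i \text{ independent of } \varepsilon_{ij}

% Each observation now carries two noise components
\operatorname{Var}(Y_{ij}) = \sigma_b^2 + \sigma^2

% Repetitions from the same subject are correlated
\operatorname{Cov}(Y_{ij}, Y_{ik}) = \sigma_b^2 \quad (j \neq k),
  \qquad
  \operatorname{Corr}(Y_{ij}, Y_{ik}) = \frac{\sigma_b^2}{\sigma_b^2 + \sigma^2}

% Observations from different subjects remain uncorrelated
\operatorname{Cov}(Y_{ij}, Y_{i'k}) = 0 \quad (i \neq i')
```

The shared intercept b_i is what lets the model represent dependence among a subject's repetitions instead of assuming it away.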
Software Solutions • R implements subject-specific models through the nlme package. • SAS has several procedures: • Subject-specific models: PROC MIXED, PROC NLMIXED, PROC GLIMMIX • Population-average models: PROC GLIMMIX
Alternative Solutions • Non-probabilistic approaches • Classification trees • Algorithmic use of regression techniques. Probabilistic measures of significance are replaced with classification errors, so the underlying probabilistic assumptions (such as independence) become irrelevant.
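As a rough illustration of judging a split by classification error instead of a significance test, the sketch below fits a one-split tree (a "stump") on a single predictor. The function name `best_stump` and the F1 values and labels are hypothetical, and a real classification tree would apply such splits recursively over several predictors.

```python
def best_stump(values, labels):
    """Return (threshold, errors) for the single split on one numeric
    predictor that minimizes the number of misclassified cases."""
    pairs = sorted(zip(values, labels))
    classes = sorted(set(labels))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no split point between identical values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [lab for v, lab in pairs if v <= threshold]
        right = [lab for v, lab in pairs if v > threshold]
        # Each side predicts its majority class; errors are the minority counts
        errors = (len(left) - max(left.count(c) for c in classes)
                  + len(right) - max(right.count(c) for c in classes))
        if best is None or errors < best[1]:
            best = (threshold, errors)
    return best


# Hypothetical data: F1 (Hz) used to classify speaker sex
f1 = [270.0, 280.0, 290.0, 300.0, 310.0, 320.0, 330.0, 340.0]
sex = ["M", "M", "M", "F", "M", "F", "F", "F"]

threshold, errors = best_stump(f1, sex)
error_rate = errors / len(f1)
print(f"split at {threshold} Hz, {errors} error(s), error rate {error_rate}")
```

The split is chosen purely by counting misclassified cases; no p-value or distributional assumption enters the selection.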