ANOVA: ANalysis Of VAriance
In the general linear model
x = μ + σ²(Age) + σ²(Genotype) + σ²(Measurement) + σ²(Condition) + σ²(ε)
each of the terms σ² can be questioned. Moreover, their particular combinations can be studied:
x = μ + … + σ²(Age × Genotype) + … + σ²(Age × Genotype × Condition) + … + σ²(ε)
…discrete classes (~bins, levels, etc.) for one variable…
[Figure: axes Y and X; classes labelled Class 1 and Class 2]
Sampling
• Random
• Should provide a sufficient sample size given the signal/noise ratio
• The population from which the sample is taken should correspond to the studied general population
• When only two means are compared, ANOVA gives the same result as the ordinary two-sample t-test with pooled variance (F = t²).
• However, it also allows comparing multiple means, and thus multiple groups (factor levels), as well as multiple factors simultaneously.
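The equivalence for two groups can be checked numerically. The sketch below is only an illustration: the data are made up, and the helper names `pooled_t` and `one_way_f` are hypothetical, not taken from the slides. It assumes the classical pooled-variance t-test and a one-way design.

```python
from statistics import mean

def pooled_t(a, b):
    """Student's two-sample t statistic with pooled variance."""
    na, nb = len(a), len(b)
    ma, mb = mean(a), mean(b)
    ss_a = sum((x - ma) ** 2 for x in a)
    ss_b = sum((x - mb) ** 2 for x in b)
    pooled_var = (ss_a + ss_b) / (na + nb - 2)
    return (ma - mb) / (pooled_var * (1 / na + 1 / nb)) ** 0.5

def one_way_f(groups):
    """F = MS_factor / MS_error for a one-way design."""
    values = [x for g in groups for x in g]
    grand = mean(values)
    ss_factor = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_error = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_factor = len(groups) - 1
    df_error = len(values) - len(groups)
    return (ss_factor / df_factor) / (ss_error / df_error)

# Made-up illustration data (assumption, not from the slides).
group_a = [4.1, 5.0, 5.6, 4.8, 5.2]
group_b = [6.0, 6.4, 5.9, 7.1, 6.6]

t = pooled_t(group_a, group_b)
f = one_way_f([group_a, group_b])
print(f"t^2 = {t ** 2:.6f}, F = {f:.6f}")  # the two numbers agree
```

With more than two groups there is no single t value to compare against, which is where the F test becomes the natural tool.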
Basic terms
• Factor: an independent variable to be tested in the ANOVA design. Example: gender
• Factor level: an individual value of the variable specifying the factor; it defines a group of observations. Example: MALE
• Observation: an individual element of the dataset; the factor levels it belongs to must be unambiguously identified
• ANOVA design: a chart delineating which factors are analysed, with which levels, and in which combinations
• Factor interaction: a joint action of more than one factor that cannot be predicted from their known individual signals
• Effect: the signal of a factor or of an interaction of factors
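To make "cannot be predicted from their individual signals" concrete, here is a minimal sketch for a 2 × 2 design. The cell means and the factor names A and B are hypothetical. The interaction is taken as the part of each cell mean that is left over after adding the grand mean and the two individual (main) effects.

```python
from statistics import mean

# Hypothetical cell means for a 2 x 2 design (factor A: a1/a2, factor B: b1/b2).
cell_means = {
    ("a1", "b1"): 10.0, ("a1", "b2"): 12.0,
    ("a2", "b1"): 13.0, ("a2", "b2"): 21.0,
}
levels_a = ["a1", "a2"]
levels_b = ["b1", "b2"]

grand = mean(cell_means.values())

# Individual (main) effects: deviation of each level mean from the grand mean.
effect_a = {a: mean(cell_means[(a, b)] for b in levels_b) - grand for a in levels_a}
effect_b = {b: mean(cell_means[(a, b)] for a in levels_a) - grand for b in levels_b}

# Interaction: what the additive prediction fails to explain in each cell.
interaction = {
    (a, b): cell_means[(a, b)] - (grand + effect_a[a] + effect_b[b])
    for a in levels_a
    for b in levels_b
}

for cell, value in interaction.items():
    print(cell, value)
```

If every interaction value were zero, the two factors would act purely additively and each cell could be predicted from the individual effects alone.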
Basic terms
• Sum of squares, SS: the sum of squared individual deviations from a mean (~the cumulative estimate of the variability due to the factor in the dataset)
• Number of degrees of freedom, df: the number of independent elements that have contributed to SS
• Mean square, MS: SS/df, the normalized measure of the variability due to the factor
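As an illustration of these three quantities, the sketch below computes SS, df and MS for a one-way design with three factor levels. The numbers and level names are made up for the example. It assumes the usual one-way partition, in which the total SS splits into a factor part and an error (residual) part.

```python
from statistics import mean

# Made-up data: one factor with three levels (hypothetical labels).
data = {
    "level_1": [4.0, 5.0, 6.0],
    "level_2": [6.0, 7.0, 8.0],
    "level_3": [8.0, 9.0, 10.0],
}

values = [x for group in data.values() for x in group]
grand_mean = mean(values)

# SS: squared deviations of group means from the grand mean (factor)
# and of observations from their own group mean (error).
ss_factor = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in data.values())
ss_error = sum((x - mean(g)) ** 2 for g in data.values() for x in g)
ss_total = sum((x - grand_mean) ** 2 for x in values)

# df: number of levels minus 1, and number of observations minus number of levels.
df_factor = len(data) - 1
df_error = len(values) - len(data)

# MS = SS / df
ms_factor = ss_factor / df_factor
ms_error = ss_error / df_error

print(f"SS_factor={ss_factor}, SS_error={ss_error}, SS_total={ss_total}")
print(f"df_factor={df_factor}, df_error={df_error}")
print(f"MS_factor={ms_factor}, MS_error={ms_error}")
```

In this example SS_factor and SS_error add up to SS_total, which is a quick sanity check on any hand computation.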
ANOVA (like the t-test) takes into account:
• Mean differences (~effect magnitude)
• Variance (~noise magnitude)
• Sample size (as a measure of potential bias)
P(H0) = f(SS_F, SS_e, df)
To estimate any effect, all three components must be known for it. In ANOVA, because of the more complex designs, this is harder to ensure than in t-tests.
The core ANOVA test: F = MS_factor / MS_error
Under the null hypothesis (σ²(effect) = 0), the F value follows the F distribution with df_factor and df_error degrees of freedom, which gives the p-value for the null hypothesis.
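A minimal sketch of the test follows, using hypothetical mean squares. The p-value in general requires the F distribution's survival function (available, for example, as `scipy.stats.f.sf`). To stay dependency-free, this example is restricted to the special case df_factor = 2 (three factor levels), where the upper tail has the closed form (1 + 2F/df_error)^(−df_error/2). The function names are illustrative only.

```python
def f_statistic(ms_factor, ms_error):
    """Core ANOVA test statistic."""
    return ms_factor / ms_error

def p_value_df_factor_2(f_value, df_error):
    """P(F >= f_value) for an F(2, df_error) distribution.

    Closed form valid ONLY when df_factor == 2; other designs need the
    general F survival function.
    """
    return (1.0 + 2.0 * f_value / df_error) ** (-df_error / 2.0)

# Hypothetical inputs (assumption): three factor levels, nine observations.
ms_factor, ms_error = 12.0, 1.0
df_factor, df_error = 2, 6

f_value = f_statistic(ms_factor, ms_error)
p_value = p_value_df_factor_2(f_value, df_error)
print(f"F({df_factor}, {df_error}) = {f_value:.2f}, p = {p_value:.4f}")
# prints "F(2, 6) = 12.00, p = 0.0080"
```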
A factor effect is easier to demonstrate if:
• The mean difference is bigger
• The residual variance is smaller
• The sample is larger
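These three statements can be illustrated by watching how F reacts when one ingredient changes at a time. The sketch below uses deterministic, made-up data (no random sampling) and hypothetical helper names. It shows only the direction of the change, not a power analysis.

```python
from statistics import mean

def one_way_f(groups):
    """F = MS_factor / MS_error for a one-way design."""
    values = [x for g in groups for x in g]
    grand = mean(values)
    ss_factor = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_error = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_factor / (len(groups) - 1)) / (ss_error / (len(values) - len(groups)))

def make_groups(mean_difference, spread, n_per_group):
    """Two groups sharing the same deterministic scatter, shifted by mean_difference."""
    pattern = [-1.0, 0.0, 1.0]
    offsets = [spread * pattern[i % 3] for i in range(n_per_group)]
    return [offsets[:], [mean_difference + o for o in offsets]]

baseline = one_way_f(make_groups(mean_difference=1.0, spread=1.0, n_per_group=6))
bigger_difference = one_way_f(make_groups(mean_difference=2.0, spread=1.0, n_per_group=6))
smaller_variance = one_way_f(make_groups(mean_difference=1.0, spread=0.5, n_per_group=6))
larger_sample = one_way_f(make_groups(mean_difference=1.0, spread=1.0, n_per_group=12))

print(f"baseline          F = {baseline:.2f}")
print(f"bigger difference F = {bigger_difference:.2f}")
print(f"smaller variance  F = {smaller_variance:.2f}")
print(f"larger sample     F = {larger_sample:.2f}")
```

Each of the three changes raises F relative to the baseline. A larger sample additionally increases df_error, which lowers the critical F value needed for a given significance level.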
• Fixed effect factors: levels are deliberately arranged by the experimenter rather than randomly sampled from an infinite population of possible levels, in order to study the effects of EXACTLY THESE levels of specific research interest.
• Random effect factors: levels are sampled from a population of “possible levels” instead, in order to study the effect of the factor in general.
A simple criterion for deciding whether an effect in an experiment is random or fixed is to determine how you would select (or arrange) the levels for the respective factor in a replication of the study. For example, if you want to replicate a school study, you would choose (take a sample of) different schools from the population of schools. Thus, the factor "school" in this study would be a random factor. In contrast, if you want to compare the academic performance of boys to girls in an experiment with a fixed factor Gender, you would always arrange two groups: boys and girls. Hence, in this case the same (and in this case only) levels of the factor Gender would be chosen when you want to replicate the study.
Variance components
• The estimates of σ²(a factor) derived from the ANOVA results (MSs, Ns, etc.) allow one not only to demonstrate an effect of the factor but also to show its strength. They are especially useful for comparing multiple ANOVA results with each other.
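The slides do not say which estimator is meant. One common choice, shown here only as a sketch, is the method-of-moments estimate for a balanced one-way design with a random-effect factor: since the expected MS_factor equals σ²(ε) + n·σ²(factor) with n observations per level, σ²(factor) is estimated as (MS_factor − MS_error) / n. The data and function name are made up.

```python
from statistics import mean

def variance_components(groups):
    """Method-of-moments variance components for a balanced one-way
    random-effects design (an assumption; not specified in the slides)."""
    n = len(groups[0])
    if any(len(g) != n for g in groups):
        raise ValueError("this sketch requires a balanced design")
    k = len(groups)
    values = [x for g in groups for x in g]
    grand = mean(values)
    ms_factor = sum(n * (mean(g) - grand) ** 2 for g in groups) / (k - 1)
    ms_error = sum((x - mean(g)) ** 2 for g in groups for x in g) / (k * (n - 1))
    # Negative estimates can occur by chance; truncate them at zero.
    var_factor = max(0.0, (ms_factor - ms_error) / n)
    var_error = ms_error
    return var_factor, var_error

# Made-up data: three randomly chosen levels, three observations each.
groups = [[4.0, 5.0, 6.0], [6.0, 7.0, 8.0], [8.0, 9.0, 10.0]]
var_factor, var_error = variance_components(groups)
share = var_factor / (var_factor + var_error)
print(f"sigma^2(factor) = {var_factor:.3f}, sigma^2(error) = {var_error:.3f}")
print(f"share of variance due to the factor = {share:.3f}")
```

The share of variance due to the factor is a unit-free measure of effect strength, which is what makes results from different ANOVAs comparable.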