Area Test for Observations Indexed by Time. L. B. Green (Middle Tennessee State University), E. M. Boczko (Vanderbilt University)
Outline • Problem • The Null Hypothesis • The Statistic • Determining Significance • Comparison to Other Tests • Extending the Test
The Problem • Four observations of mouse RNA each at 2, 3, 7, and 21 days after birth. • Test to see if there is a change in metabolic regulation of fatty acid metabolism and, if so, when the change happens.
The Problem • Independent observations at each time point. • A value of zero represents “no change.” • Positive values represent an increase; negative values represent a decrease.
The Null Hypothesis • There is no change at any time point. • The observations are identically distributed, with mean (or median) of zero.
The Null Hypothesis If the null hypothesis is true, then the order of the observations is completely due to chance.
The Statistic • Create a piecewise linear function whose value at each time point is the mean (or median) of the observations at that time point. • Calculate the square of the L2 norm of this function.
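For a piecewise linear function, the squared L2 norm has a simple closed form, which may help make the statistic concrete. The notation here is an assumption chosen for illustration: t_1 < ... < t_k are the time points, m_i is the mean (or median) at t_i, f is the piecewise linear function through the points (t_i, m_i), and A is the statistic.

```latex
A \;=\; \int_{t_1}^{t_k} f(t)^2 \, dt
  \;=\; \sum_{i=1}^{k-1} \frac{t_{i+1} - t_i}{3}
        \left( m_i^2 + m_i\, m_{i+1} + m_{i+1}^2 \right)
```

Each term is the exact integral of the square of a straight line segment running from m_i to m_{i+1} over an interval of length t_{i+1} - t_i.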
The Statistic Note: The values m_i (the function's value at each time point) may be medians rather than means.
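A minimal Python sketch of the statistic, assuming the function is integrated exactly interval by interval. The name `area_statistic` and the sample values are hypothetical; the numbers are made up for illustration and are not the mouse RNA data.

```python
from statistics import mean


def area_statistic(times, groups):
    """Squared L2 norm of the piecewise linear function through (t_i, m_i).

    times  -- increasing list of time points
    groups -- one list of observations per time point
    """
    m = [mean(g) for g in groups]  # value of the function at each time point
    total = 0.0
    for i in range(len(times) - 1):
        h = times[i + 1] - times[i]  # interval length
        a, b = m[i], m[i + 1]
        # exact integral of the squared line segment from a to b
        total += h * (a * a + a * b + b * b) / 3.0
    return total


if __name__ == "__main__":
    # Illustrative values only (assumed, not from the study).
    times = [2, 3, 7, 21]
    groups = [
        [0.1, -0.2, 0.0, 0.1],
        [0.3, 0.1, 0.2, 0.4],
        [0.8, 1.1, 0.9, 1.0],
        [1.9, 2.2, 2.0, 2.1],
    ]
    print(area_statistic(times, groups))
```

Because the integral weights each segment by its length, a late change that persists over a long interval contributes more than a brief early one.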
Determining Significance Bootstrap: Sample from a distribution (constructed from the data) that does satisfy H0. Calculate new values of the test statistic and compare them to the original value. If H0 is true, the original value will not stand out from the new values.
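Determining Significance • Calculate the mean of all the data. • Re-center the data by subtracting this mean from every observation. • Repeat B times: • Choose a new set of observations from the re-centered data, with replacement. • Calculate the new value of the test statistic. • Calculate the p-value by comparing the original value of the statistic with the B new values. • Reject H0 if the p-value is below the chosen significance level.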
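The procedure above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' implementation: the function names are hypothetical, the data are re-centered at the grand mean, and the p-value is taken to be the fraction of bootstrap statistics at least as large as the original one (other conventions, such as adding one to numerator and denominator, are also common).

```python
import random
from statistics import mean


def area_statistic(times, groups):
    """Squared L2 norm of the piecewise linear function through the group means."""
    m = [mean(g) for g in groups]
    return sum(
        (times[i + 1] - times[i]) * (m[i] ** 2 + m[i] * m[i + 1] + m[i + 1] ** 2) / 3.0
        for i in range(len(times) - 1)
    )


def bootstrap_p_value(times, groups, n_boot=2000, seed=0):
    """Bootstrap p-value for H0: no change at any time point."""
    rng = random.Random(seed)
    observed = area_statistic(times, groups)

    # Pool all observations and re-center so the resampling pool has mean zero.
    pooled = [x for g in groups for x in g]
    grand_mean = mean(pooled)
    centered = [x - grand_mean for x in pooled]

    exceed = 0
    for _ in range(n_boot):
        # Draw a new data set of the same shape, with replacement,
        # ignoring the time labels (under H0 the order is due to chance).
        resampled = [[rng.choice(centered) for _ in g] for g in groups]
        if area_statistic(times, resampled) >= observed:
            exceed += 1
    return exceed / n_boot  # assumed p-value convention


def reject_h0(times, groups, alpha=0.05, n_boot=2000, seed=0):
    """Reject H0 when the bootstrap p-value falls below alpha."""
    return bootstrap_p_value(times, groups, n_boot, seed) < alpha
```

Fixing `seed` makes a run reproducible; larger values of `n_boot` give a finer-grained p-value at the cost of run time.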
Determining Significance Why sample from original data? The empirical distribution is the closest distribution we have to the true distribution.
Determining Significance Why re-center the data? We must ensure that the distribution we are sampling from satisfies H0.
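Determining Significance Reject H0 if the p-value is below the significance level. If the sample size is large, this p-value is uniformly distributed under H0, so the probability of rejecting a true H0 equals the significance level.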
Determining Significance If the sample size is small: Time points t = (0, 3, 6, 10), with four observations per time point.
Other Tests Multiple t-tests At each time point, perform a t-test to see if the mean is different from zero. Combine these results using a Bonferroni correction.
Other Tests Multiple t-tests Do not deal with time explicitly. Have very small samples at each time point. Assume normality in the data.
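For comparison, the multiple t-test approach might be sketched like this. The function names are hypothetical. Only the t statistic and the Bonferroni decision rule are shown: converting each t statistic to a p-value requires the Student t distribution with n - 1 degrees of freedom, which the Python standard library does not provide, so the p-values are taken as inputs here.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(sample):
    """t statistic for H0: mean = 0, with len(sample) - 1 degrees of freedom."""
    n = len(sample)
    return mean(sample) / (stdev(sample) / sqrt(n))


def bonferroni_reject(p_values, alpha=0.05):
    """Reject the overall null if any p-value is below alpha / (number of tests)."""
    threshold = alpha / len(p_values)
    return any(p < threshold for p in p_values)
```

With four time points and alpha = 0.05, each individual test must reach p < 0.0125, which is demanding when each test has only four observations.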
Other Tests ANOVA Test for difference in means using one-way ANOVA. Doesn’t explicitly deal with time. Null hypothesis is that means are the same, not that they are equal to zero. Assumes normality.
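The point about the ANOVA null hypothesis can be illustrated with a short sketch. The helper `anova_f` is a hypothetical name and computes only the one-way F statistic (a p-value would additionally need the F distribution). Adding the same constant to every observation leaves F unchanged, so ANOVA cannot detect a uniform shift away from zero.

```python
from statistics import mean


def anova_f(groups):
    """One-way ANOVA F statistic: between-group over within-group mean square."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = mean([x for g in groups for x in g])
    between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups) / (k - 1)
    within = sum((x - mean(g)) ** 2 for g in groups for x in g) / (n - k)
    return between / within


if __name__ == "__main__":
    groups = [[1, 2, 3], [2, 3, 4]]
    shifted = [[x + 10 for x in g] for g in groups]  # same shift at every time point
    print(anova_f(groups), anova_f(shifted))
```

The area statistic, by contrast, grows when all means move away from zero together, because it measures distance from the zero function rather than differences among the means.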
Other Tests • Area test is more powerful than multiple t-tests or ANOVA when applied to simulated data sets. • Simulated using data from distributions with means that increase linearly over time. In this case, power depends on slope of the line.
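A data set of the kind described could be generated along these lines. This is a sketch only: the slides do not state the error distribution, so normal noise with a common standard deviation is an assumption, as are the function name and parameter names.

```python
import random


def simulate_linear_trend(times, slope, n_per_time=4, sd=1.0, seed=None):
    """Simulate observations whose mean at time t is slope * t.

    Normal errors with standard deviation `sd` are an assumption made
    for illustration; the original simulations may have used other noise.
    """
    rng = random.Random(seed)
    return [[rng.gauss(slope * t, sd) for _ in range(n_per_time)] for t in times]
```

Power for a given slope can then be estimated by generating many such data sets and recording how often each test rejects H0.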
Extending the Test Use median instead of mean at each time point. Allows test to be used in cases where the existence of the mean is in doubt.
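One way to sketch the median variant is to make the summary function a parameter. The `center` argument is a hypothetical name introduced for illustration. The example shows how a single extreme observation, of the kind heavy-tailed data can produce, dominates the mean-based statistic but leaves the median-based one untouched.

```python
from statistics import mean, median


def area_statistic(times, groups, center=mean):
    """Squared L2 norm of the piecewise linear function through (t_i, center_i)."""
    m = [center(g) for g in groups]
    return sum(
        (times[i + 1] - times[i]) * (m[i] ** 2 + m[i] * m[i + 1] + m[i + 1] ** 2) / 3.0
        for i in range(len(times) - 1)
    )


if __name__ == "__main__":
    times = [0, 1]
    groups = [[0, 0, 100], [0, 0, 0]]  # one extreme value at the first time point
    print(area_statistic(times, groups, center=mean))
    print(area_statistic(times, groups, center=median))
```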
Extending the Test Two data sets. Test whether both data sets come from the same distribution and whether that distribution stays the same over time.
Extending the Test Two data sets. Distribution may change over time. For example: Comparison to a control data set. Resample within time points rather than across whole set.
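Resampling within time points might look like the sketch below. The slides do not spell out the details, so several things here are assumptions: the function name, the choice to pool the two data sets at each time point (which imposes the null that both sets share a distribution at that time), and drawing with replacement. Because each time point is resampled separately, the distribution is free to change over time.

```python
import random


def resample_within_time_points(set_a, set_b, rng):
    """Draw bootstrap copies of two data sets, one time point at a time.

    set_a, set_b -- lists of per-time-point observation lists (same time points)
    rng          -- a random.Random instance
    """
    new_a, new_b = [], []
    for obs_a, obs_b in zip(set_a, set_b):
        pooled = obs_a + obs_b  # assumed: pool both sets at this time point only
        new_a.append([rng.choice(pooled) for _ in obs_a])
        new_b.append([rng.choice(pooled) for _ in obs_b])
    return new_a, new_b
```

A two-sample statistic, for example the area statistic applied to the difference between the two sets' means at each time point (also an assumption), would then be recomputed on each resampled pair.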