Inference About a Mean Vector, Part 1 BMTRY 726 1/21/2014
Basic Ideas
• Define a “reasonable” distance measure. An estimate of a mean vector that is “far enough away” from our hypothesized mean vector provides evidence against the null hypothesis.
• Likelihood Ratio Test:
  • Specify a multivariate likelihood
  • Conduct LRT
  • Exact distribution
  • Large sample distribution
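Univariate Case
• Consider a random sample $X_1, X_2, \dots, X_n \sim \text{iid } N(\mu, \sigma^2)$.
• What is the LRT for the hypothesis $H_0: \mu = \mu_0$ versus $H_A: \mu \neq \mu_0$? (Note: $\sigma^2$ is not specified by the null hypothesis.)
• The sufficient statistics are $\bar{x} = \frac{1}{n}\sum_{j=1}^{n} x_j$ and $s^2 = \frac{1}{n-1}\sum_{j=1}^{n} (x_j - \bar{x})^2$.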
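Compute the t statistic
$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$
• If the difference between the estimated mean and the hypothesized mean is large enough, we decide against $H_0$.
• Note that when $H_0$ is true, $t \sim t_{n-1}$, or equivalently $t^2 = n(\bar{x} - \mu_0)(s^2)^{-1}(\bar{x} - \mu_0) \sim F_{1,n-1}$.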
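As a minimal sketch of the univariate computation, the t statistic can be evaluated with the Python standard library. The function name `one_sample_t` and the sample values are made up for illustration and do not come from the lecture.

```python
import math
import statistics

def one_sample_t(x, mu0):
    """Return (t, df) for testing H0: mu = mu0 from a univariate sample x."""
    n = len(x)
    xbar = statistics.mean(x)
    s = statistics.stdev(x)              # sample standard deviation, divisor n - 1
    t = (xbar - mu0) / (s / math.sqrt(n))
    return t, n - 1

# Hypothetical sample, used only to exercise the function.
sample = [1.0, 2.0, 3.0, 4.0, 5.0]
t, df = one_sample_t(sample, mu0=2.0)
print(t, df)
```

The statistic is then compared with a t critical value on n - 1 degrees of freedom, taken from a table or a statistics package.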
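Multivariate Case
Now the random sample is $X_1, X_2, \dots, X_n \sim \text{iid } N_p(\mu, \Sigma)$.
Test: $H_0: \mu = \mu_0$ versus $H_A: \mu \neq \mu_0$
Sufficient stats: $\bar{x} = \frac{1}{n}\sum_{j=1}^{n} x_j$ and $S = \frac{1}{n-1}\sum_{j=1}^{n} (x_j - \bar{x})(x_j - \bar{x})'$
Test statistic: $T^2 = n(\bar{x} - \mu_0)' S^{-1} (\bar{x} - \mu_0)$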
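Conclude $H_0: \mu = \mu_0$ is unlikely if
$$T^2 > \frac{(n-1)p}{n-p} F_{p,n-p}(\alpha)$$
where $F_{p,n-p}(\alpha)$ is the upper $\alpha$ critical value of the $F_{p,n-p}$ distribution.
The normality assumption is needed to show that $\frac{n-p}{(n-1)p} T^2$ has an $F_{p,n-p}$ distribution when the null hypothesis is true.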
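The test statistic and its F-scaled version can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the course's own code: the helper names (`sample_mean_cov`, `solve`, `hotelling_t2`) are hypothetical, the 3 x 2 data matrix is a small made-up example, and the F critical value is left to a table or a statistics package.

```python
def sample_mean_cov(X):
    """Sample mean vector and covariance matrix S (divisor n - 1)."""
    n, p = len(X), len(X[0])
    xbar = [sum(row[i] for row in X) / n for i in range(p)]
    S = [[sum((row[i] - xbar[i]) * (row[j] - xbar[j]) for row in X) / (n - 1)
          for j in range(p)] for i in range(p)]
    return xbar, S

def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    m = len(b)
    M = [list(A[i]) + [b[i]] for i in range(m)]
    for c in range(m):
        piv = max(range(c, m), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, m):
            f = M[r][c] / M[c][c]
            for k in range(c, m + 1):
                M[r][k] -= f * M[c][k]
    z = [0.0] * m
    for r in range(m - 1, -1, -1):
        z[r] = (M[r][m] - sum(M[r][k] * z[k] for k in range(r + 1, m))) / M[r][r]
    return z

def hotelling_t2(X, mu0):
    """Return (T2, F) with F = (n - p) / ((n - 1) p) * T2."""
    n, p = len(X), len(X[0])
    xbar, S = sample_mean_cov(X)
    d = [xbar[i] - mu0[i] for i in range(p)]
    z = solve(S, d)                      # z = S^{-1} (xbar - mu0)
    t2 = n * sum(d[i] * z[i] for i in range(p))
    return t2, (n - p) / ((n - 1) * p) * t2

# Made-up data (n = 3, p = 2), only to exercise the functions.
X = [[6.0, 9.0], [10.0, 6.0], [8.0, 3.0]]
t2, f_stat = hotelling_t2(X, mu0=[9.0, 5.0])
print(t2, f_stat)
```

Under the null hypothesis and normality, `f_stat` is compared with the upper critical value of an F distribution with p and n - p degrees of freedom.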
Recall that $T^2 = n(\bar{x} - \mu_0)' S^{-1} (\bar{x} - \mu_0)$ has an approximately central chi-square distribution with p d.f. when $H_0$ is true and n-p is large. Note that … but …
Calcium Content of Turnip Greens Data Test
Calcium Content of Turnip Greens Summary statistics
Calcium Content of Turnip Greens Compute
Calcium Content of Turnip Greens Compare p-value?
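LRT
Consider a LRT for testing $H_0: \mu = \mu_0$ versus $H_A: \mu \neq \mu_0$.
Recalling that the normal likelihood is
$$L(\mu, \Sigma) = (2\pi)^{-np/2} |\Sigma|^{-n/2} \exp\left\{ -\tfrac{1}{2} \sum_{j=1}^{n} (x_j - \mu)' \Sigma^{-1} (x_j - \mu) \right\}$$
Thus we want to consider
$$\Lambda = \frac{\max_{\Sigma} L(\mu_0, \Sigma)}{\max_{\mu, \Sigma} L(\mu, \Sigma)}$$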
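LRT cont’d
Under $H_0: \mu = \mu_0$, $\mu_0$ is known, giving
$$\hat{\Sigma}_0 = \frac{1}{n} \sum_{j=1}^{n} (x_j - \mu_0)(x_j - \mu_0)'$$
Under $H_0$ or $H_A$:
$$\hat{\mu} = \bar{x}, \qquad \hat{\Sigma} = \frac{1}{n} \sum_{j=1}^{n} (x_j - \bar{x})(x_j - \bar{x})' = \frac{n-1}{n} S$$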
LRT cont’d Then…
LRT cont’d Then…
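LRT cont’d
Then an approximate chi-squared test (when n is large) is: reject $H_0$ if
$$-2 \ln \Lambda = n \ln\left( \frac{|\hat{\Sigma}_0|}{|\hat{\Sigma}|} \right) > \chi^2_p(\alpha)$$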
The LRT test statistic can be rearranged to
$$\Lambda^{2/n} = \frac{|\hat{\Sigma}|}{|\hat{\Sigma}_0|}$$
which is referred to as Wilks' Lambda.
The relationship between Wilks' Lambda and Hotelling's $T^2$ is
$$\Lambda^{2/n} = \left( 1 + \frac{T^2}{n-1} \right)^{-1}$$
Thus rejecting $H_0$ for small values of the likelihood ratio is equivalent to rejecting $H_0$ for large $T^2$. This is a nice result since we know the distribution of $T^2$ under the null.
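The link between the likelihood ratio and $T^2$ can be sketched with standard algebra. This is a reconstruction of the usual argument, not a copy of the slides' missing equations; $\hat{\Sigma}_0$ and $\hat{\Sigma}$ denote the maximum likelihood estimates of the covariance matrix with $\mu$ fixed at $\mu_0$ and with $\mu$ unrestricted, respectively.

```latex
\begin{align*}
\max_{\mu,\Sigma} L(\mu,\Sigma) &= (2\pi)^{-np/2}\,|\hat{\Sigma}|^{-n/2}\,e^{-np/2},
\qquad
\max_{\Sigma} L(\mu_0,\Sigma) = (2\pi)^{-np/2}\,|\hat{\Sigma}_0|^{-n/2}\,e^{-np/2} \\
\Lambda &= \left(\frac{|\hat{\Sigma}|}{|\hat{\Sigma}_0|}\right)^{n/2} \\
\hat{\Sigma}_0 &= \hat{\Sigma} + (\bar{x}-\mu_0)(\bar{x}-\mu_0)' \\
|\hat{\Sigma}_0| &= |\hat{\Sigma}|\left(1 + (\bar{x}-\mu_0)'\hat{\Sigma}^{-1}(\bar{x}-\mu_0)\right)
  = |\hat{\Sigma}|\left(1 + \frac{T^2}{n-1}\right) \\
\Lambda^{2/n} &= \frac{|\hat{\Sigma}|}{|\hat{\Sigma}_0|} = \left(1 + \frac{T^2}{n-1}\right)^{-1}
\end{align*}
```

The fourth line uses the identity $|A + uu'| = |A|(1 + u'A^{-1}u)$ together with $\hat{\Sigma} = \frac{n-1}{n}S$, so that $(\bar{x}-\mu_0)'\hat{\Sigma}^{-1}(\bar{x}-\mu_0) = T^2/(n-1)$.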
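Properties of $T^2$
• Invariant to full rank affine transformation
Transform each data vector: $Y_j = C X_j + d$, where $C$ is a $p \times p$ full rank matrix and $d$ is a vector of constants.
Then test $H_0: \mu_Y = C\mu_0 + d$ instead of $H_0: \mu_X = \mu_0$.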
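Properties of $T^2$
Compute $\bar{y} = C\bar{x} + d$ and $S_Y = C S C'$, so that
$$T_Y^2 = n(\bar{y} - C\mu_0 - d)' (C S C')^{-1} (\bar{y} - C\mu_0 - d) = n(\bar{x} - \mu_0)' C' (C')^{-1} S^{-1} C^{-1} C (\bar{x} - \mu_0) = T_X^2$$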
Properties of $T^2$
(1) Invariant to full rank affine transformation: transform each data vector $Y_j = C X_j + d$ and test $H_0: \mu_Y = C\mu_0 + d$ instead of $H_0: \mu_X = \mu_0$.
(2) $T^2$ is the most powerful test with property (1).
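Consider reductions to one dimension
The null hypothesis $H_0: \mu = \mu_0$ is true iff all the hypotheses $H_{0a}: a'\mu = a'\mu_0$ (one for each nonzero vector $a$) are true.
Test $H_{0a}$ with
$$t_a^2 = \frac{n (a'\bar{x} - a'\mu_0)^2}{a' S a}$$
Maximizing over $a$ gives $\max_{a \neq 0} t_a^2 = T^2$.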
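Confidence Regions and Comparisons
An “exact” $(1-\alpha) \times 100\%$ confidence region for the mean vector $\mu$ is given by the set of all vectors $v$ which are “close enough” to the observed sample mean vector to satisfy
$$n(\bar{x} - v)' S^{-1} (\bar{x} - v) \le \frac{(n-1)p}{n-p} F_{p,n-p}(\alpha)$$
Equivalently, the region consists of
(1) those hypothesized mean vectors $H_0: \mu = v$ not rejected by $T^2$ when $\bar{x}$ is observed
(2) those $v$ that are close enough to $\bar{x}$ w.r.t. the squared statistical distance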
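Membership in the confidence region can be checked directly from its definition. The sketch below covers the bivariate case (p = 2) only, using the closed-form inverse of a 2 x 2 matrix. The function name `in_confidence_region`, the summary statistics, and the cutoffs are hypothetical; in practice the cutoff `crit` would be (n - 1)p / (n - p) times an F critical value from a table or a statistics package.

```python
def in_confidence_region(v, xbar, S, n, crit):
    """Bivariate check of n (xbar - v)' S^{-1} (xbar - v) <= crit."""
    d1, d2 = xbar[0] - v[0], xbar[1] - v[1]
    (s11, s12), (s21, s22) = S
    det = s11 * s22 - s12 * s21
    if det <= 0:
        raise ValueError("S must be positive definite")
    # Quadratic form d' S^{-1} d written out with the 2 x 2 inverse.
    quad = (s22 * d1 * d1 - (s12 + s21) * d1 * d2 + s11 * d2 * d2) / det
    return n * quad <= crit

# Made-up summary statistics and cutoffs, for illustration only.
xbar = [8.0, 6.0]
S = [[4.0, -3.0], [-3.0, 9.0]]
print(in_confidence_region([9.0, 5.0], xbar, S, n=3, crit=1.0))
print(in_confidence_region([9.0, 5.0], xbar, S, n=3, crit=0.5))
```

A vector v lies inside the region exactly when the test of H0: mu = v would not be rejected at the same level.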
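One at a Time…
If $X_1, X_2, \dots, X_n \sim NID(\mu, \Sigma)$ then for any vector of constants $a$, $a'\bar{X} \sim N(a'\mu, a'\Sigma a / n)$.
Consequently, an exact $(1-\alpha) \times 100\%$ confidence interval for the single quantity $a'\mu$ is
$$a'\bar{x} \pm t_{n-1}(\alpha/2) \sqrt{\frac{a' S a}{n}}$$
Consider a set of p of these one at a time confidence intervals for $\mu_1, \dots, \mu_p$:
$$\bar{x}_i \pm t_{n-1}(\alpha/2) \sqrt{\frac{s_{ii}}{n}}, \qquad i = 1, \dots, p$$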
One at a Time…
Each interval has probability $1-\alpha$ of covering the respective $\mu_i$. In general these intervals are correlated. (1) (2) (3)
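A minimal sketch of the one at a time intervals for the component means follows. The function name, the summary statistics, and the sample size are hypothetical; the t critical value is passed in by the caller (2.131 is the tabled upper 2.5% point of t with 15 degrees of freedom, rounded to three decimals).

```python
import math

def one_at_a_time_intervals(xbar, s_diag, n, t_crit):
    """Intervals xbar_i +/- t_crit * sqrt(s_ii / n) for each component mean."""
    half = [t_crit * math.sqrt(s_ii / n) for s_ii in s_diag]
    return [(m - h, m + h) for m, h in zip(xbar, half)]

# Made-up summary statistics: n = 16 observations on p = 2 variables.
intervals = one_at_a_time_intervals(
    xbar=[8.0, 6.0], s_diag=[4.0, 9.0], n=16, t_crit=2.131
)
print(intervals)
```

Each interval has the stated coverage on its own; nothing in this computation controls the joint coverage of the whole set.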
Turnip Example Consider our turnip example where
Simultaneous Confidence
A set of simultaneous $(1-\alpha) \times 100\%$ confidence intervals for all linear combinations $a'\mu$ of the elements of the mean vector is given by
$$a'\bar{x} \pm \sqrt{\frac{p(n-1)}{n-p} F_{p,n-p}(\alpha)} \sqrt{\frac{a' S a}{n}}$$
This means that the probability that at least one of these intervals fails to contain the corresponding $a'\mu$ is no larger than $\alpha$.
These simultaneous confidence intervals are good for data “snooping”.
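For an arbitrary linear combination $a'\mu$ of the elements of the mean vector
Recall that
$$\max_{a \neq 0} \frac{n (a'\bar{x} - a'\mu)^2}{a' S a} = n(\bar{x} - \mu)' S^{-1} (\bar{x} - \mu) = T^2$$
and that $P(T^2 \le c^2) = 1 - \alpha$ with $c^2 = \frac{p(n-1)}{n-p} F_{p,n-p}(\alpha)$.
This implies that, with probability $1-\alpha$,
$$a'\bar{x} - c \sqrt{\frac{a' S a}{n}} \le a'\mu \le a'\bar{x} + c \sqrt{\frac{a' S a}{n}}$$
holds for all $a$ simultaneously.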
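The simultaneous $T^2$ interval for a chosen linear combination can be sketched as below. The function name `t2_interval` and all numbers are hypothetical; the F critical value is supplied by the caller (3.10 is the tabled upper 5% point of F with 3 and 20 degrees of freedom, rounded). The vector a = (1, 0, -1) picks out the difference between the first and third means.

```python
import math

def t2_interval(a, xbar, S, n, f_crit):
    """T2 simultaneous interval for a'mu: a'xbar +/- c * sqrt(a'Sa / n),
    with c^2 = p (n - 1) / (n - p) * f_crit."""
    p = len(xbar)
    center = sum(ai * xi for ai, xi in zip(a, xbar))
    quad = sum(a[i] * S[i][j] * a[j] for i in range(p) for j in range(p))
    c = math.sqrt(p * (n - 1) / (n - p) * f_crit)
    half = c * math.sqrt(quad / n)
    return center - half, center + half

# Made-up summary statistics for p = 3 variables and n = 23 observations.
xbar = [10.0, 7.0, 4.0]
S = [[4.0, 1.0, 0.0],
     [1.0, 9.0, 2.0],
     [0.0, 2.0, 16.0]]
lo, hi = t2_interval([1.0, 0.0, -1.0], xbar, S, n=23, f_crit=3.10)
print(lo, hi)
```

Because the multiplier c is the same for every a, any number of such combinations can be examined after seeing the data without the joint coverage dropping below the nominal level.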
Turnip Example Consider simultaneous CIs for our turnip example where
Turnip Example
What if we want to know about the difference in means between $X_1$ and $X_3$? This is the linear combination $a'\mu = \mu_1 - \mu_3$ with $a' = (1, 0, -1)$.
If attention is restricted to a few quantities, a set of less conservative simultaneous CIs can be constructed (than those provided by the $T^2$ method).
Bonferroni Method: Consider making a set of k simultaneous confidence intervals for the quantities $a_1'\mu, a_2'\mu, \dots, a_k'\mu$.
Let $C_i$ be the event that the ith interval contains $a_i'\mu$, with $P(C_i) = 1 - \alpha_i$.
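Using Bonferroni's inequality, with $P(C_i) = 1 - \alpha_i$ the probability that the ith interval covers its target, we get
$$P(\text{all } C_i \text{ true}) = 1 - P(\text{at least one } C_i \text{ false}) \ge 1 - \sum_{i=1}^{k} P(C_i \text{ false}) = 1 - (\alpha_1 + \alpha_2 + \dots + \alpha_k)$$
We want this lower bound to be $1-\alpha$.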
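This can be done by choosing $\alpha_i = \alpha / k$ for $i = 1, \dots, k$.
Then the k simultaneous Bonferroni confidence intervals are
$$a_i'\bar{x} \pm t_{n-1}\left( \frac{\alpha}{2k} \right) \sqrt{\frac{a_i' S a_i}{n}}, \qquad i = 1, \dots, k$$
For example, if you want simultaneous 95% confidence intervals for k = 3 intervals, use $t_{n-1}(0.05 / (2 \cdot 3)) = t_{n-1}(0.0083)$.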
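The Bonferroni adjustment itself is simple arithmetic, sketched below. The function names are hypothetical. As an assumption made only for this illustration, the standard normal quantile from Python's `statistics.NormalDist` stands in for the t critical value, which is reasonable only when n is large; for small n a t table or a statistics package should be used instead.

```python
from statistics import NormalDist

def bonferroni_tail(alpha, k):
    """Upper-tail area used for each of k two-sided intervals: alpha / (2k)."""
    return alpha / (2 * k)

def large_sample_crit(tail):
    """Normal quantile with the given upper-tail area (large-n stand-in for t)."""
    return NormalDist().inv_cdf(1 - tail)

tail = bonferroni_tail(0.05, 3)          # 0.05 / 6 for k = 3 intervals
z_single = large_sample_crit(0.05 / 2)   # one at a time multiplier
z_bonf = large_sample_crit(tail)         # Bonferroni multiplier
print(tail, z_single, z_bonf)
```

The Bonferroni multiplier is larger than the one at a time multiplier, so each interval is wider, and it grows as the number of intervals k increases.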
Turnip Example Consider Bonferroni CIs for the 3 means rather than simultaneous CIs for our turnip example
Turnip Example
• Let’s compare the three intervals for the mean.
• How do these compare when n is large?
• What about when p (or the number of comparisons, in the case of Bonferroni intervals) is large?
Turnip Example
What if we want to know about the difference in means between pairs of components in X?