Today’s lesson (Chapter 13) • variance(W-Y) • Two sample (W, Y) standard normal test • Confidence intervals for E(W-Y) • Determining the sample size needed to have a specified probability of a Type II error and probability of a Type I error in a two sample test.
Var(W-Y) • Definition: Variance(Y)=E[(Y-EY)^2] • Fact: Expectation is a linear operator. • Variance(W-Y)=E[(W-Y-E(W-Y))^2] • Variance(W-Y)=E[(W-EW-(Y-EY))^2] • Expand the right-hand side using high school algebra.
Var(W-Y) • E[(W-EW-(Y-EY))^2]= • E[(W-EW)^2+(Y-EY)^2-2(W-EW)(Y-EY)]= • E[(W-EW)^2]+E[(Y-EY)^2]-2E[(W-EW)(Y-EY)]= • var(W)+var(Y)-2cov(W,Y).
New Facts • New Definition • cov(W,Y)=cov(Y,W)=E[(W-EW)(Y-EY)] • cov(W,Y) is the numerator of the correlation coefficient of W and Y • New Fact • var(W-Y)=var(W)+var(Y)-2cov(W,Y)
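A quick numerical check of the new fact, as a minimal sketch. The data values are made up for illustration and the helper names are hypothetical. When W and Y are observed in pairs, the identity holds exactly for sample moments as long as the same divisor is used throughout.

```python
# Illustrative paired data (made-up numbers, not from the lecture).
w = [2.0, 4.0, 4.0, 7.0, 9.0]
y = [1.0, 3.0, 6.0, 6.0, 10.0]


def mean(xs):
    return sum(xs) / len(xs)


def cov(xs, ys):
    """Average product of deviations, E[(X-EX)(Y-EY)], using divisor N."""
    mx, my = mean(xs), mean(ys)
    return sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / len(xs)


def var(xs):
    # Variance is the covariance of a variable with itself.
    return cov(xs, xs)


diffs = [a - b for a, b in zip(w, y)]

lhs = var(diffs)                        # var(W-Y) computed directly
rhs = var(w) + var(y) - 2 * cov(w, y)   # var(W)+var(Y)-2cov(W,Y)
print(lhs, rhs)
```

The two printed numbers should agree up to floating-point rounding.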
Two Sample Testing Problem • Research team has two samples, n observations from W and m observations from Y. • ASS-U-ME • W sample and Y sample are independent • W is normally distributed with mean E(W) and variance σ_W^2 • Y is normally distributed with mean E(Y) and variance σ_Y^2
Two Sample Testing Problem • Null hypothesis: E(W)=E(Y) • New parameter: E(W)-E(Y) • Alternative hypothesis may be left-sided, right-sided, or two-sided. • Test statistic: • W mean (of n observations) - Y mean (of m observations)
Distribution of Test Statistic • Distribution of W mean - Y mean • either normal or approximately normal • expected value is E(W-Y) • variance is (σ_W^2/n)+(σ_Y^2/m) • Under null hypothesis, expected value of difference of means is 0.
Deriving standard error of difference of two means • Var(W mean-Y mean)= • var(W mean)+var(Y mean)-2cov(W mean, Y mean) • W sample is assumed to be independent of Y sample, so covariance of means is 0. • var(W mean)=σ_W^2/n, n is number in W sample. • var(Y mean)=σ_Y^2/m, m is number in Y sample.
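The variance formula for the difference of two independent sample means can be checked by simulation. This is only a sketch: the standard deviations, sample sizes, seed, and number of repetitions are assumed values chosen for illustration, not numbers from the lecture.

```python
import random
import statistics

random.seed(315)  # arbitrary seed so the run is repeatable

# Assumed population values and sample sizes (illustration only).
sigma_w, n = 2.0, 4
sigma_y, m = 3.0, 9
reps = 5000

diffs = []
for _ in range(reps):
    # Draw independent W and Y samples and record the difference of means.
    w_mean = statistics.fmean(random.gauss(0.0, sigma_w) for _ in range(n))
    y_mean = statistics.fmean(random.gauss(0.0, sigma_y) for _ in range(m))
    diffs.append(w_mean - y_mean)

theory = sigma_w**2 / n + sigma_y**2 / m    # (sigma_W^2/n)+(sigma_Y^2/m)
simulated = statistics.pvariance(diffs)     # variance of simulated differences
print(theory, simulated)
```

With these assumed values the theoretical variance is 2.0, and the simulated variance should land close to it, with the gap shrinking as `reps` grows.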
Variances Known • Find null distribution of test statistic using known variances and sample sizes. • Standardize the test statistic, which is always the difference of the two sample means. • Follow standard decision sequence.
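A minimal sketch of the known-variance case, assuming a two-sided alternative. The function name `two_sample_z` and the input numbers are hypothetical; the standardization is the one described above, the difference of sample means minus its null value, divided by its standard error.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(w_mean, y_mean, var_w, var_y, n, m, null_diff=0.0):
    """Standardize the difference of two sample means with known variances."""
    se = sqrt(var_w / n + var_y / m)          # standard error under independence
    z = (w_mean - y_mean - null_diff) / se    # standard score of the test statistic
    p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_two_sided


# Made-up inputs for illustration only.
z, p = two_sample_z(10.0, 9.0, var_w=4.0, var_y=9.0, n=16, m=36)
print(round(z, 3), round(p, 3))
```

For a one-sided alternative, the p-value would use a single tail instead of doubling.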
Variances Unknown • This is a Student’s t problem. • Two possibilities • ASS-U-ME var(W)=var(Y) is reasonable; this is the classic two independent sample t-test that is usually covered in the prerequisite class. • Assumption var(W)=var(Y) is not reasonable; use unequal variance t-test.
Checking Assumption of Equal Variances • Use SPSS • statistics, compare means, independent sample t-test. • SPSS uses Levene’s test for equality of variances. • Sig. means p-value of Levene’s test • Use it as you would any observed significance level.
Choosing Equal or Unequal Variance t-test • Some statistics professors always want equal variance t-test. Answer their questions with the equal variance t-test. This typically includes Actuary Society questions. • In AMS315 life, I will tell you which test to use (use the equal variance t-test if there is no specification).
Choosing Equal or Unequal Variance t-test • In real life, I ALWAYS use the unequal variance t-test. • Some people choose the unequal variance t-test if the p-value for Levene’s test of the equality of variances is very small.
Example Problem Group I • I present you with a computer output on the comparison of average irresponsibility at time 5 for subjects who did not use marijuana at time 3 to the average irresponsibility at time 5 for subjects who did use marijuana at time 3.
Example Problem Group I • You register that this is an A vs. B comparison, with the A group being those who did not use marijuana at time 3 and the B group being those who did use marijuana at time 3. The dependent variable is irresponsibility at time 5.
Example Problem Group I • Reading the output, you learn that there were 215 subjects who did not use marijuana at time 3 and that their average irresponsibility was 10.7860, with a standard deviation of 2.5779. There were 151 subjects who did use marijuana at time 3, and their average irresponsibility was 10.8411. The standard deviation was 2.3526.
Example Problem Group I • Levene’s test for the equality of variances had sig.=0.206. • The 2-tailed sig for the equal variance test was 0.835, and for the unequal variance test it was 0.833.
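The unequal variance statistic can be sketched from the Group I summary numbers alone. The degrees of freedom below use the usual Welch-Satterthwaite approximation, which the slides do not spell out, so treat that line as an assumption about what the software does. Variable names are hypothetical.

```python
from math import sqrt

# Group I summary statistics from the computer output.
n, mean_a, sd_a = 215, 10.7860, 2.5779   # did not use marijuana at time 3
m, mean_b, sd_b = 151, 10.8411, 2.3526   # did use marijuana at time 3

a = sd_a**2 / n   # estimated variance of the A-group mean
b = sd_b**2 / m   # estimated variance of the B-group mean

se_welch = sqrt(a + b)                    # no pooling of the two variances
t_welch = (mean_a - mean_b) / se_welch

# Welch-Satterthwaite approximate degrees of freedom (assumed formula).
df_welch = (a + b) ** 2 / (a**2 / (n - 1) + b**2 / (m - 1))

print(round(se_welch, 4), round(t_welch, 3), round(df_welch, 1))
```

The statistic is small in absolute value, which is in line with the large 2-tailed sig values reported for both versions of the test.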
First Problem • Which of the following conclusions is correct about the test of the null hypothesis that expected irresponsibility at time 5 for a subject who did not use marijuana at time 3=expected irresponsibility at time 5 for a subject who did use marijuana at time 3
First Problem Continued • against the alternative hypothesis that expected irresponsibility at time 5 for a subject who did not use marijuana at time 3 was not equal to expected irresponsibility at time 5 for a subject who did use marijuana at time 3? • Usual options.
Solution • Both p-values were approximately equal and were large (0.8). • Hence, the correct decision is to accept at the 0.10 level of significance (last option).
Second Problem • What is the correct decision in the following? The null hypothesis is: Expected irresponsibility at time 5 for a subject who did not use marijuana at time 3 - expected irresponsibility at time 5 for a subject who did use marijuana at time 3 = 0, alpha=0.05;
Second Problem Continued • the alternative is expected irresponsibility at time 5 for a subject who did not use marijuana at time 3 - expected irresponsibility at time 5 for a subject who did use marijuana at time 3 is not equal to 0.
Solution to Second Problem • Read your output to find that the mean difference in the two groups was -5.50E-02. • -5.50E-02 is notation for -5.50×10^-2=-0.0550. • Check means of groups to confirm that the mean difference is that of irresponsibility for subjects who did not use marijuana at time 3 minus irresponsibility for subjects who did use (10.7860-10.8411=-0.0551, which agrees with -0.0550 up to rounding).
Solution to Second Problem • Check that this order is the same as the order that I asked you about. • Read your computer output to find that the 95 percent confidence interval for the mean difference is -0.5744 to 0.4644. • Check whether or not the value 0 specified in the problem is in the confidence interval. • It is, so accept the null hypothesis.
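The confidence interval decision can be sketched in a few lines. The interval endpoints and the standard error 0.2641 are read from the same computer output; the variable names are hypothetical. The last quantity is a consistency check: the half-width divided by the standard error should be close to the two-sided 5 percent critical value for a large number of degrees of freedom.

```python
# Values read from the computer output.
lower, upper = -0.5744, 0.4644   # 95 percent CI for the mean difference
se = 0.2641                      # standard error, equal variances assumed
null_value = 0.0                 # value specified by the null hypothesis

center = (lower + upper) / 2         # should match the mean difference
half_width = (upper - lower) / 2
implied_t_crit = half_width / se     # critical value the software used

# Accept the null hypothesis when its value lies inside the interval.
accept_null = lower <= null_value <= upper

print(center, implied_t_crit, accept_null)
```

The center reproduces the reported mean difference, and 0 lies inside the interval, so the null hypothesis is accepted at alpha=0.05.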
Third Problem • What is the value of the t-test assuming equal variances?
Solution to the Third Problem • The computer output has the mean difference (-5.50E-02). • The computer output has the standard error of the mean difference (0.2641, on the line labeled “Equal variances assumed”) • t-test is the standard score value of the test statistic (-0.0550-0)/0.2641=-0.21.
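The standard error on the “Equal variances assumed” line can be reproduced from the group sizes and standard deviations. This sketch uses the classic pooled-variance formula; variable names are hypothetical, and small differences in the last digit come from rounding in the printed output.

```python
from math import sqrt

# Group I summary statistics from the computer output.
n, mean_a, sd_a = 215, 10.7860, 2.5779   # did not use marijuana at time 3
m, mean_b, sd_b = 151, 10.8411, 2.3526   # did use marijuana at time 3

df = n + m - 2

# Pooled variance: weighted average of the two sample variances.
pooled_var = ((n - 1) * sd_a**2 + (m - 1) * sd_b**2) / df

se_pooled = sqrt(pooled_var * (1 / n + 1 / m))
t_pooled = (mean_a - mean_b - 0) / se_pooled   # null value of the difference is 0

print(round(se_pooled, 4), round(t_pooled, 2), df)
```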
Fourth Problem • How many degrees of freedom does the equal variance independent sample t-test have in this problem?
Solution • Read the computer output to find that there were 215 subjects who did not use marijuana at time 3. • There were 151 subjects who did use marijuana at time 3. • The number of degrees of freedom is n+m-2 in general. • Here, 215+151-2=364.
Fifth Problem • Which of the following is a correct decision about the test of the null hypothesis that variance of irresponsibility at time 5 for a subject who did not use marijuana at time 3 is equal to the variance of irresponsibility at time 5 for a subject who did use marijuana against the alternative that these two variances are not equal? Usual options.
Solution • Read the computer output to find that the sig of Levene’s test is 0.206. • This is larger than 0.10, the level of significance in the last option. • The answer is to accept at the 0.10 level (choose the last option).
Example Problem Group II • Each patient in a study will take a specified medicine, and the patient’s response to that medicine will be measured. Twenty patients will be randomly assigned to two groups of ten each.
Example Problem Group II • Group 1 will receive an experimental medicine. The random variable X denotes a patient’s response to the experimental medicine and is normally distributed with unknown expected value E(X) and unknown standard deviation σ.
Example Problem Group II • Group 2 will receive the best available medicine. The random variable B denotes a patient’s response to the best available medicine and is normally distributed with unknown expected value E(B) and unknown standard deviation σ. The null hypothesis in this experiment is that E(X-B)=0, and the alternative is that E(X-B)<0.
Example Problem Group II • The experiment was run. The observed x sample average was 274.9; and the observed b sample average was 473.7. The observed X group standard deviation was 233.7, and the B group standard deviation was 348.0. The resulting pooled estimate of the standard deviation was 296.5.
Group II First Problem • What is the standard deviation of the random variable X average - B average?
Solution • Var(X average)=σ^2/10. • Var(B average)=σ^2/10. • Two averages are from independent samples, and so the covariance is zero. • Var(X average-B average)=(σ^2/10)+(σ^2/10) • sd(X average-B average)=(0.2)^0.5 σ=0.447σ. • The answer is 0.447σ. NOT 0.447(296.5)!
Group II, Second Problem • Which of the following is a correct decision for accepting or rejecting the null hypothesis based on the sample averages and standard deviations given in the common paragraph? • Usual options: reject at 0.01, accept at 0.01 and reject at 0.05, accept at 0.05 and reject at 0.10, and accept at 0.10.
Solution • Calculate the t-statistic (standard score form of the test statistic). • Difference of means is 274.9-473.7=-198.8 • Estimated standard error of test statistic is 0.447(296.5)=132.54. • Standard units value=(-198.8-0)/132.54=-1.50. • Find degrees of freedom. • 10+10-2=18
Solution • Determine side of test. • Left sided test. • Stretch normal distribution critical values to values appropriate for 18 degrees of freedom. • Stretch -2.326 (0.01 level) to -2.552, -1.645 to -1.734, and -1.282 to -1.330 • Decide: Accept at 0.01; accept at 0.05; reject at 0.10. Option C is correct.
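The whole Group II decision can be sketched from the summary numbers. The critical values are the ones quoted above for 18 degrees of freedom; variable names are hypothetical. The pooled standard deviation is recomputed from the two group standard deviations, so it differs slightly from the rounded 296.5 given in the problem.

```python
from math import sqrt

n = m = 10
x_bar, b_bar = 274.9, 473.7   # observed sample averages
s_x, s_b = 233.7, 348.0       # observed group standard deviations

df = n + m - 2

# Pooled estimate of the common standard deviation sigma.
pooled_sd = sqrt(((n - 1) * s_x**2 + (m - 1) * s_b**2) / df)

se = pooled_sd * sqrt(1 / n + 1 / m)   # estimate of 0.447*sigma
t = (x_bar - b_bar - 0) / se

# Left-sided critical values for 18 degrees of freedom, as quoted in the slides.
critical = {0.01: -2.552, 0.05: -1.734, 0.10: -1.330}

# Left-sided test: reject when t falls below the critical value.
decisions = {alpha: ("reject" if t < c else "accept") for alpha, c in critical.items()}

print(round(pooled_sd, 1), round(t, 2), decisions)
```

The decisions come out as accept at 0.01, accept at 0.05, and reject at 0.10, which matches option C.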
Today’s Class • New fact about var(W-Y) • Application to testing two independent samples. • Making Student’s corrections.
Next Class • Paired t-test. • Finding smarter ways of making an A vs. B comparison.