Two Population Means Hypothesis Testing and Confidence Intervals With Unknown Standard Deviations

The Problem • µ1 and µ2 are unknown • σ1 and σ2 are not known (the usual case) OBJECTIVES • Test whether µ1 > µ2 (by a certain amount) • or whether µ1 ≠ µ2 • Determine a confidence interval for the difference in the means: µ1 - µ2

KEY ASSUMPTIONS Sampling is done from two populations. • Population 1 has mean µ1 and variance σ1². • Population 2 has mean µ2 and variance σ2². • A sample of size n1 will be taken from population 1. • A sample of size n2 will be taken from population 2. • Sampling is random and both samples are drawn independently. • Either the sample sizes will be large or the populations are assumed to be normally distributed.

Distribution of X1 - X2 • Since X1 and X2 are both assumed to be normal, or the sample sizes, n1 and n2 are assumed to be large, then because 1 and 2 are unknown, the random variable X1 -X2 has a: • Distribution -- t • Mean = 1 - 2 • Standard deviation that depends on whether or not the standard deviations of X1 and X2 (although unknown) can be assumed to be equal • Degrees of freedom that also depends on whether or not the standard deviations of X1 and X2 can be assumed to be equal

Appropriate Standard Deviation For X1 -X2 When Are s’s Are Known • Recall the appropriate standard deviation for X1 - X2 is: • Now if 1 = 2 we can simply call it  and write it as: • So if the standard deviations are unknown, we need an estimate for the common variance, 2.

Estimating σ²: Degrees of Freedom • If we can assume that the populations have equal variances, then the estimate of the common variance σ² is the weighted average of s1² and s2², weighted by their DEGREES OF FREEDOM • There are n1 - 1 degrees of freedom from the first sample and n2 - 1 degrees of freedom from the second sample, so • Total Degrees of Freedom for the hypothesis test or confidence interval = (n1 - 1) + (n2 - 1) = n1 + n2 - 2

The Appropriate Standard DeviationFor X1 - X2 When Are s’s Unknown, but Can Be Assumed to Be Equal • The best estimate for 2 then is the pooled variance, sp2: • Thus the best estimates for the variance and standard deviation of X1 - X2 are:

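t-Statistic and t-Confidence Interval Assuming Equal Variances • t-Statistic: $t = \frac{(\bar{x}_1 - \bar{x}_2) - d_0}{\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$, where $d_0$ is the hypothesized value of µ1 - µ2 • Confidence Interval: $(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,n_1+n_2-2}\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$ • Degrees of Freedom = n1 + n2 - 2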
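The equal-variance calculation can be sketched in Python from summary statistics alone. This is a minimal illustration, not part of the original slides: the function names are hypothetical, and the critical value `t_crit` is assumed to be looked up in a t-table for n1 + n2 - 2 degrees of freedom rather than computed.

```python
import math

def pooled_variance(n1, s1, n2, s2):
    """Weighted average of the sample variances, weighted by degrees of freedom."""
    return ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)

def pooled_standard_error(n1, s1, n2, s2):
    """Estimated standard deviation of the difference in sample means."""
    return math.sqrt(pooled_variance(n1, s1, n2, s2) * (1 / n1 + 1 / n2))

def pooled_t_statistic(xbar1, xbar2, n1, s1, n2, s2, hypothesized_diff=0.0):
    """t-statistic for H0: mu1 - mu2 = hypothesized_diff, assuming equal variances."""
    se = pooled_standard_error(n1, s1, n2, s2)
    return (xbar1 - xbar2 - hypothesized_diff) / se

def pooled_confidence_interval(xbar1, xbar2, n1, s1, n2, s2, t_crit):
    """(LCL, UCL) for mu1 - mu2; t_crit is the table value t(alpha/2, n1 + n2 - 2)."""
    half_width = t_crit * pooled_standard_error(n1, s1, n2, s2)
    diff = xbar1 - xbar2
    return diff - half_width, diff + half_width
```

With equal sample sizes the pooled variance reduces to the plain average of the two sample variances, which is a quick sanity check on the weighting.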

The Appropriate Standard DeviationFor X1 - X2 When Are s’s Unknown, And Cannot Be Assumed to Be Equal • If we cannot assume that the populations have equal variances, then the best estimate for 12 is s12 and the best estimate for 22 is s22. • Thus the best estimates for the variance and standard deviation of X1 - X2 are:

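t-Statistic and t-Confidence Interval Assuming Unequal Variances • t-Statistic: $t = \frac{(\bar{x}_1 - \bar{x}_2) - d_0}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$, where $d_0$ is the hypothesized value of µ1 - µ2 • Confidence Interval: $(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,DF}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$ • Total Degrees of Freedom: $DF = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{(s_1^2/n_1)^2}{n_1 - 1} + \frac{(s_2^2/n_2)^2}{n_2 - 1}}$ Round the resulting value.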
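The unequal-variance standard error and degrees of freedom can be sketched in the same way. The degrees-of-freedom expression used here is the one commonly called the Welch-Satterthwaite approximation; the slides do not name it, and the function names below are hypothetical.

```python
import math

def unequal_var_standard_error(n1, s1, n2, s2):
    """Estimated standard deviation of the difference in sample means (no pooling)."""
    return math.sqrt(s1**2 / n1 + s2**2 / n2)

def unequal_var_degrees_of_freedom(n1, s1, n2, s2):
    """Approximate (unrounded) degrees of freedom for the unequal-variance t-test."""
    v1 = s1**2 / n1
    v2 = s2**2 / n2
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

def unequal_var_t_statistic(xbar1, xbar2, n1, s1, n2, s2, hypothesized_diff=0.0):
    """t-statistic for H0: mu1 - mu2 = hypothesized_diff without assuming equal variances."""
    se = unequal_var_standard_error(n1, s1, n2, s2)
    return (xbar1 - xbar2 - hypothesized_diff) / se
```

When both samples have the same size and the same standard deviation, the degrees of freedom come out to 2(n - 1), matching the equal-variance total n1 + n2 - 2. As the sample variances diverge, the value shrinks toward the degrees of freedom of the noisier sample.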

Testing Whether the Variances Can Be Assumed to Be Equal • The following hypothesis test tests whether or not equal variances can be assumed: H0: σ1²/σ2² = 1 (They are equal) HA: σ1²/σ2² ≠ 1 (They are different) • This is an F-test! If the larger of s1² and s2² is put in the numerator, then the test is: Reject H0 if F = (larger s²)/(smaller s²) > Fα/2,DF1,DF2, where DF1 and DF2 are the degrees of freedom of the numerator and denominator samples
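A short sketch of the F-ratio step, assuming the critical value `f_crit` is read from an F-table at α/2 with the numerator and denominator degrees of freedom that the function reports. The function names are hypothetical.

```python
def f_ratio(n1, s1, n2, s2):
    """Return (F, df_numerator, df_denominator) with the larger sample variance on top."""
    var1, var2 = s1**2, s2**2
    if var1 >= var2:
        return var1 / var2, n1 - 1, n2 - 1
    return var2 / var1, n2 - 1, n1 - 1

def can_conclude_unequal_variances(n1, s1, n2, s2, f_crit):
    """True if F exceeds the table value F(alpha/2, df_numerator, df_denominator)."""
    f, _, _ = f_ratio(n1, s1, n2, s2)
    return f > f_crit
```

Putting the larger variance in the numerator means only the upper tail of the F distribution needs to be consulted, which is why the table lookup uses α/2 for a two-sided test.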

Hypothesis Test/Confidence Interval Approach With Unknown σ's • Take a sample of size n1 from population 1 • Calculate x̄1 and s1² • Take a sample of size n2 from population 2 • Calculate x̄2 and s2² • Perform an F-test to determine if the variances can be assumed to be equal • Perform the appropriate hypothesis test or construct the appropriate confidence interval

Example 1 Based on the following two random samples, • Women: Number sampled = 32, Sample Average = 75, Sample St’d Dev. = 13.92 • Men: Number sampled = 30, Sample Average = 73, Sample St’d Dev. = 11.79 • Can we conclude that women on average score better than men on civil service tests? • Construct a 95% confidence interval for the difference in average scores between women and men on civil service tests. • Because the sample sizes are large, we do not have to assume that test scores have a normal distribution to perform our analyses.

Example 1 – F-test Do an F-test to determine if variances can be assumed to be equal. H0: σW²/σM² = 1 (Equal Variances) HA: σW²/σM² ≠ 1 (Unequal Variances) • Select α = .05. • Reject H0 (Accept HA) if Larger s²/Smaller s² > F.025,DF(Larger s²),DF(Smaller s²) = F.025,31,29 = 2.09* (*Note: this is F.025,30,29, since the table does not give the value for F.025,31,29.) Calculation: sW²/sM² = (13.92)²/(11.79)² = 1.39 Since 1.39 < 2.09, we cannot conclude unequal variances. Do the equal variance t-test with 32 + 30 - 2 = 60 degrees of freedom.

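Example 1 – The Equal Variance t-Test H0: µW - µM = 0 HA: µW - µM > 0 • Select α = .05. • Reject H0 (Accept HA) if t > t.05,60 = 1.671 Calculation: sp² = (31(13.92)² + 29(11.79)²)/60 = 167.30, so t = (75 - 73 - 0)/√(167.30(1/32 + 1/30)) = 2/3.287 = .608 Since .608 < 1.671, we cannot conclude that women average better than men on the tests.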

Example 1 – 95% Confidence Interval (75 - 73) ± t.025,60 √(sp²(1/32 + 1/30)) = 2 ± 2.000(3.287) = 2 ± 6.57, that is, from -4.57 to 8.57
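As a check on the hand calculation for Example 1, the following sketch recomputes the pooled variance, t-statistic, and 95% confidence interval from the summary statistics given in the example. The variable names are illustrative, and the critical value 2.000 is assumed to be the t-table entry for .025 in the upper tail with 60 degrees of freedom.

```python
import math

# Summary statistics from Example 1
n_w, mean_w, sd_w = 32, 75.0, 13.92   # women
n_m, mean_m, sd_m = 30, 73.0, 11.79   # men

df = n_w + n_m - 2                                          # 60 degrees of freedom
sp2 = ((n_w - 1) * sd_w**2 + (n_m - 1) * sd_m**2) / df       # pooled variance
se = math.sqrt(sp2 * (1 / n_w + 1 / n_m))                    # standard error of the difference
t_stat = (mean_w - mean_m - 0) / se                          # H0: muW - muM = 0

T_CRIT_025_60 = 2.000   # assumed t-table value for a 95% interval with 60 df
half_width = T_CRIT_025_60 * se
lcl = (mean_w - mean_m) - half_width
ucl = (mean_w - mean_m) + half_width

print(f"sp^2 = {sp2:.2f}, t = {t_stat:.3f}")
print(f"95% CI: {lcl:.2f} to {ucl:.2f}")
```

The computed t-statistic and interval agree with the values .608 and 2 ± 6.57 shown in the example.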

Example 2 Based on the following random samples of basketball attendances at the Staples Center, • LA Lakers: Number sampled = 13, Sample Average = 16,675, Sample St’d Dev. = 1014.97 • LA Clippers: Number sampled = 11, Sample Average = 12,009, Sample St’d Dev. = 3276.73 • Can we conclude that the Lakers' average attendance is more than 2000 more than the Clippers' average attendance at the Staples Center? • Construct a 95% confidence interval for the difference in average attendance between Lakers and Clippers games at the Staples Center. • Since the sample sizes are small, we must assume that attendance at Lakers and Clippers games has a normal distribution to perform the analyses.

Example 2 – F-test • Do an F-test to determine if variances can be assumed to be equal. H0: σC²/σL² = 1 (Equal Variances) HA: σC²/σL² ≠ 1 (Unequal Variances) Note: The Clippers variance is the larger sample variance. • Choose α = .05. • Reject H0 (Accept HA) if Larger s²/Smaller s² > F.025,DF(Larger variance),DF(Smaller variance) = F.025,10,12 = 3.37 Calculation: sC²/sL² = (3276.73)²/(1014.97)² = 10.42 Since 10.42 > 3.37, we can conclude unequal variances. Do the unequal variance t-test.

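Degrees of Freedom for the Unequal Variance t-Test • The degrees of freedom for this test is given by: $DF = \frac{\left(\frac{s_L^2}{n_L} + \frac{s_C^2}{n_C}\right)^2}{\frac{(s_L^2/n_L)^2}{n_L - 1} + \frac{(s_C^2/n_C)^2}{n_C - 1}}$ = 11.626, using sL²/nL = (1014.97)²/13 and sC²/nC = (3276.73)²/11. This is rounded to 12 degrees of freedom.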

Example 2 – the t-Test Proceed to the hypothesis test for the difference in means with unequal variances: H0: µL - µC = 2000 HA: µL - µC > 2000 • Select α = .05. • Reject H0 (Accept HA) if t > t.05,12 = 1.782 Calculation: t = (16,675 - 12,009 - 2000)/√((1014.97)²/13 + (3276.73)²/11) = 2666/1027.29 = 2.595 Since t = 2.595 > 1.782, we can conclude that the Lakers average more than 2000 per game more than the Clippers at the Staples Center.

Example 2 – 95% Confidence Interval (16,675 - 12,009) ± t.025,12 √((1014.97)²/13 + (3276.73)²/11) = 4666 ± 2.179(1027.29) = 4666 ± 2238.47, that is, from 2427.53 to 6904.47
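The Example 2 figures can be checked the same way. This sketch recomputes the approximate degrees of freedom, the t-statistic for a hypothesized difference of 2000, and the 95% confidence interval from the summary statistics. The variable names are illustrative, and 2.179 is assumed to be the t-table value for .025 in the upper tail with 12 degrees of freedom.

```python
import math

# Summary statistics from Example 2
n_l, mean_l, sd_l = 13, 16675.0, 1014.97   # Lakers
n_c, mean_c, sd_c = 11, 12009.0, 3276.73   # Clippers

v_l = sd_l**2 / n_l   # estimated variance of the Lakers sample mean
v_c = sd_c**2 / n_c   # estimated variance of the Clippers sample mean

se = math.sqrt(v_l + v_c)
df_exact = (v_l + v_c) ** 2 / (v_l**2 / (n_l - 1) + v_c**2 / (n_c - 1))
df = round(df_exact)                          # rounded, as on the slide

t_stat = (mean_l - mean_c - 2000) / se        # H0: muL - muC = 2000

T_CRIT_025_12 = 2.179   # assumed t-table value for a 95% interval with 12 df
half_width = T_CRIT_025_12 * se
lcl = (mean_l - mean_c) - half_width
ucl = (mean_l - mean_c) + half_width

print(f"df = {df_exact:.3f} (rounded to {df}), t = {t_stat:.3f}")
print(f"95% CI: {lcl:.2f} to {ucl:.2f}")
```

The Clippers term dominates both the standard error and the degrees of freedom, which is why the result lands near the Clippers' own n - 1 = 10 rather than near n1 + n2 - 2 = 22.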

Excel Approach • F-test, t-test Assuming Equal Variances, t-test Assuming Unequal Variances are all found in Data Analysis. • Excel only performs a one-tail F-test. • Multiply this 1-tail p-value by 2 to get the p-value for the 2-tail F-test. • Formulas must be entered for the LCL and UCL of the confidence intervals. • All values for these formulas can be found in the Equal or Unequal Variance t-test Output.

Inputting/Interpreting Results From Hypothesis Tests • Express H0 and HA so that the number on the right side is positive (or 0) • The p-value returned for the two-tailed test will always be correct. • The p-value returned for the one-tail test is usually correct. It is correct if: • HA is a “>” test and the t-statistic is positive (this is the usual case; if t < 0, the true p-value is 1 – (p-value printed by Excel)) • HA is a “<” test and the t-statistic is negative (this is the usual case; if t > 0, the true p-value is 1 – (p-value printed by Excel))
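The one-tail adjustment rule above can be written as a small helper. This is a sketch only: the function name and the `">"` / `"<"` encoding of the alternative hypothesis are assumptions for illustration, and `reported_p` stands for the one-tail p-value printed by Excel.

```python
def true_one_tail_p_value(reported_p, t_stat, alternative):
    """Adjust a printed one-tail p-value using the sign of the t-statistic.

    alternative is ">" when HA is a ">" test and "<" when HA is a "<" test.
    """
    if alternative not in (">", "<"):
        raise ValueError("alternative must be '>' or '<'")
    # The printed value is correct when the sign of t agrees with the direction of HA.
    sign_agrees = t_stat > 0 if alternative == ">" else t_stat < 0
    return reported_p if sign_agrees else 1.0 - reported_p
```

For instance, a printed one-tail p-value of .27 with a positive t-statistic stays .27 for a “>” test, but becomes .73 if the t-statistic had been negative.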

Excel For Example 1 – F-Test • Go to Tools • Select Data Analysis • Select F-Test Two-Sample For Variances

Example 1 – F-Test (Cont’d) • Use Women (Column A) for Variable Range 1 • Use Men (Column B) for Variable Range 2 • Check Labels • Designate first cell for output.

Example 1 – F-Test (Cont’d) p-value for one-tail test

Example 1 – F-Test (Cont’d) • p-value for one-tail test • Enter =2*D9 to multiply the one-tail p-value by 2 and get the 2-tail p-value. • High p-value (.371671): cannot conclude unequal variances • Use the equal variance t-test

Example 1 – t-Test • Go to Tools • Select Data Analysis • Select t-Test: Two-Sample Assuming Equal Variances

Example 1 – t-Test (Cont’d) Since HA is µW - µM > 0, enter • Column A for Range 1 • Column B for Range 2 • 0 for Hypothesized Mean Difference • Check Labels • Designate first cell for output.

Example 1 – t-test (Cont’d) • p-value for the one-tail “>” test • p-value for a two-tail “≠” test • High p-value for 1-tail test! Cannot conclude average women’s score > average men’s score

Example 1 – 95% Confidence Interval • Formula for the LCL (cell G19): =(D15-E15)-TINV(.05,D20)*SQRT(D18*(1/D17+1/E17)) • For the UCL: Highlight cell G19 • Add $ signs using the F4 key • Drag to cell G20 • Change “-” to “+”

Excel For Example 2 – F-Test • Go to Tools • Select Data Analysis • Select F-Test Two-Sample For Variances

Example 2 – F-Test (Cont’d) • Use Lakers (Column B) for Variable Range 1 • Use Clippers (Column D) for Variable Range 2 • Check Labels • Designate first cell for output.

Example 2 – F-Test (Cont’d) • p-value for one-tail test • Enter =2*F9 to give the p-value for the two-tailed test • Low p-value (.000352): can conclude unequal variances • Use the unequal variance t-test

Example 2 – t-Test • Go to Tools • Select Data Analysis • Select t-Test: Two-Sample Assuming Unequal Variances

Example 2 – t-Test (Cont’d) Since HA is µL - µC > 2000, enter • Column B for Range 1 • Column D for Range 2 • 2000 for Hypothesized Mean Difference • Check Labels • Designate first cell for output.

Example 2 – t-test (Cont’d) • p-value for the one-tail “>” test • p-value for a two-tail “≠” test • Low p-value for 1-tail test (compared to α = .05)! Can conclude the Lakers average more than 2000 more people per game than the Clippers.

Example 2 – 95% Confidence Interval • Formula for the LCL (cell I14): =(F15-G15)-TINV(.05,F19)*SQRT(F16/F17+G16/G17) • For the UCL: Highlight cell I14 • Add $ signs using the F4 key • Drag to cell I15 • Change “-” to “+”

Review • Standard Errors and Degrees of Freedom when: • Variances are assumed equal • Variances are not assumed equal • F-statistic to determine if variances differ • t-statistic and confidence interval when: • Variances are assumed equal • Variances are not assumed equal • Hypothesis Tests/ Confidence Intervals for Differences in Means (Assuming Equal or Unequal Variances) • By hand • By Excel
