1. 2007 Pearson Education Chapter 5: Hypothesis Testing and Statistical Inference
2. Hypothesis Testing Hypothesis testing involves drawing inferences about two contrasting propositions (hypotheses) relating to the value of a population parameter, one of which is assumed to be true in the absence of contradictory data.
We seek evidence to determine if the hypothesis can be rejected; if not, we can only assume it to be true but have not statistically proven it true.
3. Hypothesis Testing Procedure Formulate the hypothesis
Select a level of significance, which defines the risk of drawing an incorrect conclusion that a true hypothesis is false
Determine a decision rule
Collect data and calculate a test statistic
Apply the decision rule and draw a conclusion
4. Hypothesis Formulation Null hypothesis, H0: a statement that is accepted as correct
Alternative hypothesis, H1: a proposition that must be true if H0 is false
Formulating the correct set of hypotheses depends on the burden of proof: what you wish to prove statistically should be H1
Tests involving a single population parameter are called one-sample tests; tests involving two populations are called two-sample tests.
5. Types of Hypothesis Tests One Sample Tests
H0: population parameter ≥ constant vs.
H1: population parameter < constant
H0: population parameter ≤ constant vs.
H1: population parameter > constant
H0: population parameter = constant vs.
H1: population parameter ≠ constant
Two Sample Tests
H0: population parameter (1) - population parameter (2) ≥ 0 vs.
H1: population parameter (1) - population parameter (2) < 0
H0: population parameter (1) - population parameter (2) ≤ 0 vs.
H1: population parameter (1) - population parameter (2) > 0
H0: population parameter (1) - population parameter (2) = 0 vs.
H1: population parameter (1) - population parameter (2) ≠ 0
6. Four Outcomes The null hypothesis is actually true, and the test correctly fails to reject it.
The null hypothesis is actually false, and the hypothesis test correctly reaches this conclusion.
The null hypothesis is actually true, but the hypothesis test incorrectly rejects it (Type I error).
The null hypothesis is actually false, but the hypothesis test incorrectly fails to reject it (Type II error).
7. Quantifying Outcomes Probability of Type I error (rejecting H0 when it is true) = α = level of significance
Probability of correctly failing to reject H0 = 1 - α = confidence coefficient
Probability of Type II error (failing to reject H0 when it is false) = β
Probability of correctly rejecting H0 when it is false = 1 - β = power of the test
8. Decision Rules Compute a test statistic from sample data and compare it to the hypothesized sampling distribution of the test statistic
Divide the sampling distribution into a rejection region and non-rejection region.
If the test statistic falls in the rejection region, reject H0 (concluding that H1 is true); otherwise, fail to reject H0
9. Rejection Regions
10. Hypothesis Tests and Spreadsheet Support
11. Hypothesis Tests and Spreadsheet Support (contd)
12. One Sample Tests for Means Standard Deviation Unknown Example hypothesis
H0: μ ≥ μ0 versus H1: μ < μ0
Test statistic: t = (x̄ - μ0) / (s/√n)
Reject H0 if t < -tn-1,α
13. Example For the Customer Support Survey.xls data, test the hypotheses
H0: mean response time ≥ 30 minutes
H1: mean response time < 30 minutes
Sample mean = 21.91; sample standard deviation = 19.49; n = 44 observations
Reject H0 because t = -2.75 < -t43,0.05 = -1.6811
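The rejection decision above can be reproduced in a short Python sketch; scipy supplies the t critical value, and the summary statistics are taken from the slide:

```python
from math import sqrt
from scipy.stats import t

# Summary statistics from the Customer Support Survey example
xbar, s, n = 21.91, 19.49, 44
mu0, alpha = 30, 0.05

t_stat = (xbar - mu0) / (s / sqrt(n))   # one-sample t statistic
t_crit = -t.ppf(1 - alpha, df=n - 1)    # lower-tail critical value -t(43, 0.05)
reject = t_stat < t_crit                # True: -2.75 < -1.6811
```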
14. PHStat Tool: t-Test for Mean PHStat menu > One Sample Tests > t-Test for the Mean, Sigma Unknown
15. Results
16. Using p-Values p-value = probability of obtaining a test statistic value equal to or more extreme than that obtained from the sample data when H0 is true
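As a minimal sketch, the p-value for the lower-tail t-test from the Customer Support Survey example can be computed directly; rejecting whenever the p-value is below α gives the same decision as the critical-value rule:

```python
from math import sqrt
from scipy.stats import t

# Lower-tail p-value for the Customer Support Survey t-test
t_stat = (21.91 - 30) / (19.49 / sqrt(44))
p_value = t.cdf(t_stat, df=43)   # P(T <= t_stat) when H0 is true

reject = p_value < 0.05          # same conclusion as the critical-value rule
```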
17. One Sample Tests for Proportions Example hypothesis
H0: p ≥ p0 versus H1: p < p0
Test statistic: z = (p̂ - p0) / √(p0(1 - p0)/n)
Reject if z < -zα
18. Example For the Customer Support Survey.xls data, test the hypothesis that the proportion of overall quality responses in the top two boxes is at least 0.75
H0: p ≥ 0.75
H1: p < 0.75
Sample proportion = 0.682; n = 44
The test statistic is z = -1.04; for a level of significance of 0.05, the critical value of z is -1.645. Because -1.04 > -1.645, we cannot reject the null hypothesis
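The proportion test can be verified with a small Python sketch (scipy supplies the normal critical value; the sample figures are from the slide):

```python
from math import sqrt
from scipy.stats import norm

phat, p0, n, alpha = 0.682, 0.75, 44, 0.05

z = (phat - p0) / sqrt(p0 * (1 - p0) / n)   # one-sample z statistic
z_crit = -norm.ppf(1 - alpha)               # -1.645 for alpha = 0.05
reject = z < z_crit                         # False: z is about -1.04
```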
19. PHStat Tool: One Sample z-Test for Proportions PHStat > One Sample Tests > z-Tests for the Proportion
20. Results
21. Type II Errors and the Power of a Test The probability of a Type II error, β, and the power of the test (1 - β) cannot be chosen by the experimenter.
The power of the test depends on the true value of the population mean, the level of significance used, and the sample size.
A power curve shows (1 - β) as a function of the true mean μ1.
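A power curve for the lower-tail z-test can be sketched as follows; the numbers (μ0 = 30, σ = 19.49, n = 44) are illustrative values borrowed from the survey example, and σ is treated as known for simplicity:

```python
from math import sqrt
from scipy.stats import norm

mu0, sigma, n, alpha = 30, 19.49, 44, 0.05   # illustrative values
z_alpha = norm.ppf(1 - alpha)

def power(mu1):
    """P(reject H0: mu >= mu0) when the true mean is mu1."""
    return norm.cdf((mu0 - mu1) / (sigma / sqrt(n)) - z_alpha)

# Power rises as the true mean moves farther below mu0
curve = [power(m) for m in (30, 27, 24, 21)]
```

At μ1 = μ0 the power equals α, as it must; plotting power(μ1) over a grid of μ1 values reproduces the power curve shown on the next slide.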
22. Example Power Curve
23. Two Sample Tests for Means Standard Deviation Known Example hypothesis
H0: μ1 - μ2 ≥ 0 versus H1: μ1 - μ2 < 0
Test statistic: z = (x̄1 - x̄2) / √(σ1²/n1 + σ2²/n2)
Reject if z < -zα
24. Two Sample Tests for Means Sigma Unknown and Equal Example hypothesis
H0: μ1 - μ2 ≤ 0 versus H1: μ1 - μ2 > 0
Test statistic: t = (x̄1 - x̄2) / (sp√(1/n1 + 1/n2)), where sp² is the pooled variance estimate
Reject if t > tn1+n2-2,α
25. Two Sample Tests for Means Sigma Unknown and Unequal Example hypothesis
H0: μ1 - μ2 = 0 versus H1: μ1 - μ2 ≠ 0
Test statistic: t = (x̄1 - x̄2) / √(s1²/n1 + s2²/n2), with degrees of freedom df given by the Welch approximation
Reject if t > tdf,α/2 or t < -tdf,α/2
26. Excel Data Analysis Tool: Two Sample t-Tests Tools > Data Analysis > t-test: Two Sample Assuming Unequal Variances, or t-test: Two Sample Assuming Equal Variances
Enter range of data, hypothesized mean difference, and level of significance
Tool allows you to test H0: μ1 - μ2 = d
Output is provided for upper-tail test only
For lower-tail test, change the sign on t Critical one-tail, and subtract P(T<=t) one-tail from 1.0 for correct p-value
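Outside Excel, the same two-sample test can be run with scipy; the data below are hypothetical, and the lower-tail p-value is obtained from the two-sided one by the same kind of tail adjustment the Excel output requires:

```python
from scipy import stats

# Hypothetical samples (not from the textbook data)
a = [23.1, 19.8, 25.0, 21.4, 22.7, 20.3]
b = [26.5, 24.9, 27.2, 25.8, 23.9, 26.1]

t_stat, p_two = stats.ttest_ind(a, b, equal_var=False)  # Welch's unequal-variance t-test

# Lower-tail test H1: mean(a) - mean(b) < 0
p_lower = p_two / 2 if t_stat < 0 else 1 - p_two / 2
```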
27. PHStat Tool: Two Sample t-Tests PHStat > Two Sample Tests > t-Test for Differences in Two Means
Test assumes equal variances
Must compute and enter the sample mean, sample standard deviation, and sample size
28. Comparison of Excel and PHStat Results Lower-Tail Test
29. Two Sample Test for Means With Paired Samples Example hypothesis
H0: average difference = 0 versus
H1: average difference ≠ 0
Test statistic: t = D̄ / (sD/√n), where D̄ and sD are the mean and standard deviation of the n paired differences
Reject if t > tn-1,α/2 or t < -tn-1,α/2
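A minimal paired-test sketch with hypothetical before/after data; scipy's ttest_rel implements the paired t-test on the differences and reports a two-sided p-value:

```python
from scipy import stats

# Hypothetical before/after measurements on the same 8 units
before = [10.2, 9.8, 11.1, 10.5, 9.9, 10.8, 10.1, 10.4]
after  = [10.0, 9.9, 10.8, 10.2, 9.7, 10.5, 10.2, 10.1]

t_stat, p_two = stats.ttest_rel(before, after)  # paired t-test on the differences
reject = p_two < 0.05
```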
30. Two Sample Tests for Proportions Example hypothesis
H0: p1 - p2 = 0 versus H1: p1 - p2 ≠ 0
Test statistic: z = (p̂1 - p̂2) / √(p̄(1 - p̄)(1/n1 + 1/n2)), where p̄ is the pooled proportion
Reject if z > zα/2 or z < -zα/2
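A sketch of the two-proportion z-test with hypothetical counts; the pooled proportion is used in the standard error because the two proportions are equal under H0:

```python
from math import sqrt
from scipy.stats import norm

# Hypothetical counts: x successes out of n trials in each sample
x1, n1 = 45, 100
x2, n2 = 30, 100
alpha = 0.05

p1, p2 = x1 / n1, x2 / n2
p_pool = (x1 + x2) / (n1 + n2)   # pooled proportion under H0: p1 = p2

z = (p1 - p2) / sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
reject = abs(z) > norm.ppf(1 - alpha / 2)   # two-tailed decision
```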
31. Hypothesis Tests and Confidence Intervals If a 100(1 - α)% confidence interval contains the hypothesized value, then we would not reject the null hypothesis based on this value with a level of significance α.
Example hypothesis
H0: μ ≥ μ0 versus H1: μ < μ0
If a 100(1 - α)% confidence interval does not contain μ0, then we can reject H0
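The duality can be checked numerically; this sketch builds a two-sided 95% confidence interval from the survey summary statistics and tests H0: μ = 30, the two-sided counterpart of the slide's hypothesis:

```python
from math import sqrt
from scipy.stats import t

# Two-sided 95% confidence interval for the mean response time
xbar, s, n, alpha = 21.91, 19.49, 44, 0.05
half_width = t.ppf(1 - alpha / 2, df=n - 1) * s / sqrt(n)
ci = (xbar - half_width, xbar + half_width)

# For a two-sided test of H0: mu = 30, reject iff 30 falls outside the CI
mu0 = 30
reject = not (ci[0] <= mu0 <= ci[1])
```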
32. F-Test for Differences in Two Variances Hypothesis
H0: σ1² - σ2² = 0 versus H1: σ1² - σ2² ≠ 0
Test statistic: F = s1²/s2²
Assume s1² > s2²
Reject if F > Fα/2,n1-1,n2-1 (see Appendix A.4)
Assumes both samples drawn from normal distributions
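A minimal sketch of the variance-ratio test with hypothetical sample variances (scipy supplies the F critical value; the larger variance goes in the numerator):

```python
from scipy.stats import f

# Hypothetical sample variances, labeled so that s1_sq > s2_sq
s1_sq, n1 = 25.0, 16
s2_sq, n2 = 9.0, 21
alpha = 0.10

F = s1_sq / s2_sq                                 # test statistic
F_crit = f.ppf(1 - alpha / 2, n1 - 1, n2 - 1)     # upper alpha/2 critical value
reject = F > F_crit
```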
33. Excel Data Analysis Tool: F-Test for Equality of Variances Tools > Data Analysis > F-test for Equality of Variances
Specify data ranges
Use α/2 for the significance level!
If the variance of Variable 1 is greater than the variance of variable 2, the output will specify the upper tail; otherwise, you obtain the lower tail information.
34. PHStat Tool: F-Test for Differences in Variances PHStat menu > Two Sample Tests > F-test for Differences in Two Variances
Compute and enter sample standard deviations
Enter the significance level α, not α/2 as in Excel
35. Excel and PHStat Results
36. Analysis of Variance (ANOVA) Compare the means of m different groups (factors) to determine if all are equal
H0: μ1 = μ2 = ... = μm
H1: at least one mean is different from the others
37. ANOVA Theory nj = number of observations in sample j
SST = total variation in the data
SSB = variation between groups
SSW = variation within groups
SST = SSB + SSW
38. ANOVA Test Statistic MSB = SSB/(m - 1)
MSW = SSW/(n - m)
Test statistic: F = MSB/MSW
Has an F-distribution with m-1 and n-m degrees of freedom
Reject H0 if F > Fα,m-1,n-m
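scipy's f_oneway performs the same single-factor ANOVA as Excel's tool; a sketch with hypothetical data for three groups:

```python
from scipy.stats import f_oneway

# Hypothetical observations for m = 3 groups
g1 = [20.1, 19.8, 21.0, 20.5, 19.9]
g2 = [22.4, 23.1, 22.8, 23.5, 22.0]
g3 = [20.3, 20.0, 20.8, 19.7, 20.6]

F, p = f_oneway(g1, g2, g3)   # F = MSB/MSW and its p-value
reject = p < 0.05             # at least one group mean differs
```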
39. Excel Data Analysis Tool for ANOVA Tools > Data Analysis > ANOVA: Single Factor
40. ANOVA Results
41. ANOVA Assumptions The m groups or factor levels being studied represent populations whose outcome measures are
Randomly and independently obtained
Are normally distributed
Have equal variances
Violation of these assumptions can affect the true level of significance and power of the test.
42. Nonparametric Tests Used when assumptions (usually normality) are violated. Examples:
Wilcoxon rank sum test for testing difference between two medians
Kruskal-Wallis rank test for determining whether multiple populations have equal medians.
Both supported by PHStat
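A sketch of the Kruskal-Wallis test in scipy with hypothetical data; because it works on ranks, no normality assumption is needed:

```python
from scipy.stats import kruskal

# Hypothetical samples from three populations
g1 = [12, 15, 14, 11, 13]
g2 = [22, 25, 24, 21, 23]
g3 = [12, 16, 14, 13, 15]

H, p = kruskal(g1, g2, g3)   # tests equality of the population medians
reject = p < 0.05
```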
43. Tukey-Kramer Multiple Comparison Procedure ANOVA cannot identify which means may differ from the rest
PHStat menu > Multiple Sample Tests > Tukey-Kramer Multiple Comparison Procedure
44. Chi-Square Test for Independence Test whether two categorical variables are independent
H0: the two categorical variables are independent
H1: the two categorical variables are dependent
45. Example Is gender independent of holding a CPA in an accounting firm?
46. Chi-Square Test for Independence Test statistic: χ² = Σ (fo - fe)²/fe, summed over all cells, where fo is the observed frequency and fe is the expected frequency under independence
Reject H0 if χ² > χ²α,(r-1)(c-1)
PHStat tool available in Multiple Sample Tests menu
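scipy's chi2_contingency runs the same test directly from a contingency table; the gender-by-CPA counts below are hypothetical:

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical gender x CPA contingency table (observed counts)
#                     CPA  no CPA
observed = np.array([[30, 70],    # male
                     [20, 80]])   # female

chi2, p, dof, expected = chi2_contingency(observed)
# dof = (r-1)(c-1) = 1; expected counts are computed assuming independence
```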
47. Example
48. PHStat Procedure Results
49. Design of Experiments A test or series of tests that enables the experimenter to compare two or more methods to determine which is better, or determine levels of controllable factors to optimize the yield of a process or minimize the variability of a response variable.
50. Factorial Experiments All combinations of levels of each factor are considered. With m factors at k levels, there are k^m experiments.
Example: Suppose that temperature and reaction time are thought to be important factors in the percent yield of a chemical process. Currently, the process operates at a temperature of 100 degrees and a 60 minute reaction time. In an effort to reduce costs and improve yield, the plant manager wants to determine if changing the temperature and reaction time will have any significant effect on the percent yield, and if so, to identify the best levels of these factors to optimize the yield.
51. Designed Experiment Analyze the effect of two levels of each factor (for instance, temperature at 100 and 125 degrees, and time at 60 and 90 minutes)
The different combinations of levels of each factor are commonly called treatments.
52. Treatment Combinations
53. Experimental Results
54. Main Effects Measures the difference in the response that results from different factor levels
Calculations
Temperature effect = (Average yield at high level) - (Average yield at low level)
= (B + D)/2 - (A + C)/2
= (90.5 + 81)/2 - (84 + 88.5)/2
= 85.75 - 86.25 = -0.5 percent.
Reaction time effect = (Average yield at high level) - (Average yield at low level)
= (C + D)/2 - (A + B)/2
= (88.5 + 81)/2 - (84 + 90.5)/2
= 84.75 - 87.25 = -2.5 percent.
55. Interactions When the effect of changing one factor depends on the level of other factors.
When interactions are present, we cannot estimate response changes by simply adding main effects; the effect of one factor must be interpreted relative to levels of the other factor.
56. Interaction Calculations Take the average response when both factors are at the same (high or low) levels and subtract the average response when the factors are at opposite levels.
Temperature-Time Interaction
= (Average yield, both factors at same level) - (Average yield, both factors at opposite levels)
= (A + D)/2 - (B + C)/2
= (84 + 81)/2 - (90.5 + 88.5)/2 = 82.5 - 89.5 = -7.0 percent
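The main-effect and interaction arithmetic can be checked directly; A, B, C, D are the four treatment yields from the slides (A = low temperature/low time, D = high/high):

```python
# Percent yields at the four treatment combinations
A, B, C, D = 84.0, 90.5, 88.5, 81.0

temp_effect = (B + D) / 2 - (A + C) / 2   # temperature main effect
time_effect = (C + D) / 2 - (A + B) / 2   # reaction-time main effect
interaction = (A + D) / 2 - (B + C) / 2   # temperature-time interaction
```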
57. Graphical Illustration of Interactions
58. Two-Way ANOVA Method for analyzing variation in a 2-factor experiment
SST = SSA + SSB + SSAB + SSW
where
SST = total sum of squares
SSA = sum of squares due to factor A
SSB = sum of squares due to factor B
SSAB = sum of squares due to interaction
SSW = sum of squares due to random variation (error)
59. Mean Squares MSA = SSA/(r - 1)
MSB = SSB/(c - 1)
MSAB = SSAB/(r - 1)(c - 1)
MSW = SSW/rc(k - 1),
where k = number of replications of each treatment combination.
60. Hypothesis Tests Compute F statistics by dividing each mean square by MSW.
F = MSA/MSW tests the null hypothesis that means for each treatment level of factor A are the same against the alternative hypothesis that not all means are equal.
F = MSB/MSW tests the null hypothesis that means for each treatment level of factor B are the same against the alternative hypothesis that not all means are equal.
F = MSAB/MSW tests the null hypothesis that the interaction between factors A and B is zero against the alternative hypothesis that the interaction is not zero.
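The mean-square and F computations can be sketched with hypothetical sums of squares for a 2 x 2 design with k = 3 replications (all numbers below are made up for illustration):

```python
# Hypothetical two-way ANOVA inputs
r, c, k = 2, 2, 3                            # factor A levels, factor B levels, replications
SSA, SSB, SSAB, SSW = 12.0, 4.5, 20.0, 16.0  # made-up sums of squares

MSA = SSA / (r - 1)
MSB = SSB / (c - 1)
MSAB = SSAB / ((r - 1) * (c - 1))
MSW = SSW / (r * c * (k - 1))

F_A, F_B, F_AB = MSA / MSW, MSB / MSW, MSAB / MSW   # compare to F critical values
```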
61. Excel Anova: Two-Factor with Replication
62. Results