ENGR 610 Applied Statistics Fall 2007 - Week 7 Marshall University CITE Jack Smith
Overview for Today • Review Hypothesis Testing, 9.1-9.3 • One-Sample Tests of the Mean • Go over homework problem 9.2 • Hypothesis Testing, 9.4-9.7 • Testing for the Difference between Two Means • Testing for the Difference between Two Variances • Testing for Paired Data or Repeated Measures • Testing for the Difference among Proportions • Homework assignment
Hypothesis Testing • One-Sample Tests for the Mean • Z Test (σ known) • t Test (σ unknown) • Two-tailed and one-tailed tests • p-value • Connection with Confidence Interval • Z Test for the proportion
Null hypothesis • A “no difference” claim about a population parameter under suspicion based on a sample • Tested by sample statistics and either rejected or accepted based on critical test (Z, t, F, χ²) value • Rejection implies that an alternative (the opposite) hypothesis is more probable • Analogous to a mathematical ‘proof by contradiction’ or the legal notion of ‘innocent until proven guilty’ • Only the null hypothesis involves an equality, while the alternative hypothesis deals only with inequalities
Critical Regions • Critical value of test statistic (Z, t, F, χ², …) • Based on desired level of significance • Acceptance (null hypothesis) region, and a • Rejection (alternative hypothesis) region • One-tailed or two-tailed
Type I and Type II errors • Seek proper balance between Type I and II errors • Type I error - false positive • Null hypothesis rejected when in fact it is true • Occurs with probability α • α = level of significance - chosen! • (1-α) = confidence coefficient • Type II error - false negative • Null hypothesis accepted when in fact it is false • Occurs with probability β • β = consumer’s risk • (1-β) = power of test • Depends on α, difference between hypothesized and actual parameter value, and sample size
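How β depends on α, the true difference, and the sample size can be sketched numerically for a two-tailed Z test. This is only an illustration, assuming σ is known; the function name `z_test_power` and its arguments are hypothetical and do not come from the course materials.

```python
from math import sqrt
from statistics import NormalDist


def z_test_power(mu0, mu_true, sigma, n, alpha=0.05):
    """Power (1 - beta) of a two-tailed Z test of H0: mu = mu0 (sigma known)."""
    std_normal = NormalDist()
    z_c = std_normal.inv_cdf(1 - alpha / 2)        # two-tailed critical value
    shift = (mu_true - mu0) / (sigma / sqrt(n))    # true difference in standard-error units
    # beta = probability that Z lands in the acceptance region although H0 is false
    beta = std_normal.cdf(z_c - shift) - std_normal.cdf(-z_c - shift)
    return 1 - beta


# Same true difference, larger sample: the power goes up (beta goes down).
power_small_n = z_test_power(mu0=100, mu_true=105, sigma=15, n=25)
power_large_n = z_test_power(mu0=100, mu_true=105, sigma=15, n=100)
```

When the true mean equals the hypothesized mean, the "power" returned is just α, the Type I error rate.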
Z Test (σ known) - Two-tailed • Critical value (Zc) based on chosen level of significance, α • Typically α = 0.05 (95% confidence), where Zc = 1.96 (area = 0.95/2 = 0.475) • α = 0.01 (99%) and 0.001 (99.9%) are also common, where Zc = 2.58 and 3.29 • Null hypothesis (<X> = µ) rejected if Z > Zc or Z < -Zc, where $Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$
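The two-tailed decision rule can be sketched in a few lines. The helper name `z_test_two_tailed` and the sample numbers are made up for illustration; the critical value comes from the standard library's `statistics.NormalDist` rather than a printed table.

```python
from math import sqrt
from statistics import NormalDist


def z_test_two_tailed(sample_mean, mu0, sigma, n, alpha=0.05):
    """Return (Z, Zc, reject) for H0: mu = mu0 with known sigma."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    z_c = NormalDist().inv_cdf(1 - alpha / 2)   # upper critical value; lower is -z_c
    return z, z_c, abs(z) > z_c


# Hypothetical data: n = 25, sample mean 372.5, hypothesized mean 368, sigma = 15
z, z_c, reject = z_test_two_tailed(372.5, 368, 15, 25)
print(f"Z = {z:.3f}, Zc = {z_c:.3f}, reject H0: {reject}")
```

Here Z falls between -Zc and Zc, so the null hypothesis is not rejected at α = 0.05.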
Z Test (σ known) - One-tailed • Critical value (Zc) based on chosen level of significance, α • Typically α = 0.05 (95% confidence), but where Zc = 1.645 (area = 0.95 - 0.50 = 0.45) • Null hypothesis (<X> ≤ µ) rejected if Z > Zc, where $Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$
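A short sketch of why the choice of tails matters: with all of α in the upper tail, the critical value drops from about 1.96 to about 1.645, so a borderline Z can be significant in a one-tailed test but not in a two-tailed one. The function name and the numbers are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist


def z_test_upper_tail(sample_mean, mu0, sigma, n, alpha=0.05):
    """Return (Z, Zc, reject) for H0: mu <= mu0 against H1: mu > mu0."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    z_c = NormalDist().inv_cdf(1 - alpha)   # entire alpha sits in the upper tail
    return z, z_c, z > z_c


# Hypothetical data giving Z of about 1.8
z_one, zc_one, reject_one = z_test_upper_tail(373.4, 368, 15, 25)
zc_two = NormalDist().inv_cdf(1 - 0.05 / 2)   # two-tailed critical value for comparison
reject_two = abs(z_one) > zc_two
```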
t Test (σ unknown) - Two-tailed • Critical value (tc) based on chosen level of significance, α, and degrees of freedom, n-1 • Typically α = 0.05 (95% confidence), where, for example, tc = 2.045 (upper area = 0.05/2 = 0.025), for n-1 = 29 • Null hypothesis rejected if t > tc or t < -tc, where $t = \frac{\bar{X} - \mu}{S / \sqrt{n}}$
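The same calculation with the sample standard deviation S in place of σ gives the t statistic. Python's standard library has no t distribution, so this sketch takes the critical value as an argument (here the slide's tc = 2.045 for 29 degrees of freedom). The function name and the sample numbers are hypothetical.

```python
from math import sqrt


def t_test_two_tailed(sample_mean, mu0, s, n, t_c):
    """Return (t, df, reject) for H0: mu = mu0 with unknown sigma.

    t_c is the upper critical value looked up in a t table for df = n - 1.
    """
    t = (sample_mean - mu0) / (s / sqrt(n))
    df = n - 1
    return t, df, abs(t) > t_c


# Hypothetical sample: n = 30, mean 5.2, S = 0.5, hypothesized mean 5.0
t, df, reject = t_test_two_tailed(5.2, 5.0, 0.5, 30, t_c=2.045)
print(f"t = {t:.3f} with df = {df}, reject H0: {reject}")
```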
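Z Test on Proportion • Use normal approximation to binomial distribution, where $Z = \frac{p_s - p}{\sqrt{p(1-p)/n}}$, with sample proportion $p_s = X/n$ and hypothesized proportion $p$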
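A minimal sketch of the proportion test under the normal approximation, assuming X successes in n trials and a hypothesized proportion p0. The names `proportion_z_test`, `x`, and `p0` are illustrative, and the approximation is usually considered reasonable only when both n·p0 and n·(1-p0) are at least about 5.

```python
from math import sqrt
from statistics import NormalDist


def proportion_z_test(x, n, p0, alpha=0.05):
    """Return (Z, reject) for a two-tailed test of H0: p = p0."""
    p_s = x / n                                # sample proportion
    z = (p_s - p0) / sqrt(p0 * (1 - p0) / n)   # standard error uses the hypothesized p0
    z_c = NormalDist().inv_cdf(1 - alpha / 2)
    return z, abs(z) > z_c


# Hypothetical data: 60 successes in 100 trials, H0: p = 0.5
z_prop, reject_prop = proportion_z_test(60, 100, 0.5)
```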
p-value vs critical value • Use probabilities corresponding to values of test statistic (Z, t,…) • If the p-value ≥ α, accept null hypothesis • If the p-value < α, reject null hypothesis • E.g., compare p to α instead of t to tc • More direct • Does not necessarily assume distribution is normal
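For a Z statistic, the p-value comparison can be sketched as follows. The function name `two_tailed_p_value` is an assumption for illustration; the same idea applies to t, F, or χ² statistics with the corresponding distribution.

```python
from statistics import NormalDist


def two_tailed_p_value(z):
    """Probability of a standard normal value at least as extreme as z, in either tail."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


alpha = 0.05
p = two_tailed_p_value(1.5)   # Z = 1.5 from a hypothetical sample
reject = p < alpha            # same decision as comparing |Z| with Zc = 1.96
```

A Z of exactly 1.96 gives a p-value of about 0.05, which is why the two decision rules agree.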
Connection with Confidence Interval • Compute the Confidence Interval for the sample statistic (e.g., the mean) as in Ch 8 • If the hypothesized population parameter is within the interval, accept the null hypothesis, otherwise reject it • Equivalent to a two-tailed test • Double α for half-interval (one-tail) test
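The equivalence between the confidence interval and the two-tailed test can be checked directly. This sketch assumes a known σ (Z interval); the function name and the numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def mean_confidence_interval(sample_mean, sigma, n, alpha=0.05):
    """(1 - alpha) confidence interval for the mean with known sigma."""
    half_width = NormalDist().inv_cdf(1 - alpha / 2) * sigma / sqrt(n)
    return sample_mean - half_width, sample_mean + half_width


# Hypothetical data: n = 25, sample mean 372.5, sigma = 15, hypothesized mean 368
mu0 = 368
low, high = mean_confidence_interval(372.5, 15, 25)
reject = not (low <= mu0 <= high)   # mu0 inside the interval means H0 is not rejected
```

The hypothesized mean lies inside the interval exactly when |Z| ≤ Zc for the same α.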
Z Test for the Difference between Two Means • Random samples from independent groups with normal distributions and known σ1 and σ2 • Any linear combination (e.g., the difference) of normal distributions (µk, σk) is also normal • CLT: Populations 1 & 2 the same
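Because the difference of two independent normal sample means is itself normal with variance σ1²/n1 + σ2²/n2, the test statistic can be sketched as below. The function name and the numbers are assumptions, and the hypothesized difference defaults to zero (H0: µ1 = µ2).

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z_test(mean1, sigma1, n1, mean2, sigma2, n2, diff0=0.0, alpha=0.05):
    """Return (Z, reject) for H0: mu1 - mu2 = diff0 with known sigma1 and sigma2."""
    std_error = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    z = ((mean1 - mean2) - diff0) / std_error
    z_c = NormalDist().inv_cdf(1 - alpha / 2)
    return z, abs(z) > z_c


# Hypothetical groups: means 52 and 50, sigma 4 and 3, 25 observations each
z_diff, reject_diff = two_sample_z_test(52, 4, 25, 50, 3, 25)
```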
t Test for the Difference between Two Means (Equal Variances) • Random samples from independent groups with normal distributions, but with equal and unknown σ1 and σ2 • Using the pooled sample variance • H0: µ1 = µ2
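A sketch of the pooled-variance calculation, using the standard formulas Sp² = [(n1-1)S1² + (n2-1)S2²] / (n1+n2-2) and df = n1+n2-2. The function name and sample values are hypothetical, and the critical value 2.101 is the usual table entry for df = 18 with upper-tail area 0.025.

```python
from math import sqrt


def pooled_t_test(mean1, s1, n1, mean2, s2, n2, t_c):
    """Return (t, df, reject) for H0: mu1 = mu2, assuming equal population variances."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df
    t = (mean1 - mean2) / sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t, df, abs(t) > t_c


# Hypothetical groups: means 52 and 48, both S = 4, 10 observations each
t_pooled, df_pooled, reject_pooled = pooled_t_test(52, 4, 10, 48, 4, 10, t_c=2.101)
```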
t Test for the Difference between Two Means (Unequal Variances) • Random samples from independent groups with normal distributions, with unequal and unknown σ1 and σ2 • Using the Satterthwaite approximation to the degrees of freedom (df) • Use Excel Data Analysis tool!
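For readers who want to see what the tool computes, here is a sketch of the separate-variance t statistic with the Satterthwaite degrees of freedom in its commonly stated form. The function name and inputs are illustrative assumptions; the resulting df is generally not an integer and is typically rounded down before a table lookup.

```python
from math import sqrt


def separate_variance_t(mean1, s1, n1, mean2, s2, n2):
    """Return (t, df) using the Satterthwaite approximation for unequal variances."""
    v1 = s1**2 / n1   # estimated variance of the first sample mean
    v2 = s2**2 / n2   # estimated variance of the second sample mean
    t = (mean1 - mean2) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df


# Hypothetical groups with clearly different spreads and sizes
t_sep, df_sep = separate_variance_t(52, 4, 10, 48, 8, 20)
```

With equal sample sizes and equal sample variances, the approximation reduces to the pooled value df = n1 + n2 - 2.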
F Test for the Difference between Two Variances • Based on F distribution - a ratio of χ² distributions, assuming normal distributions • Null hypothesis not rejected if FL(α, n1-1, n2-1) ≤ F ≤ FU(α, n1-1, n2-1), where FL(α, n1-1, n2-1) = 1/FU(α, n2-1, n1-1), and where FU is given in Table A.7 (using nearest df)
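The reciprocal rule for the lower critical value can be sketched as follows, taking F as the ratio of the two sample variances, S1²/S2². The upper critical values are passed in because the standard library has no F distribution; the 4.03 used here is an assumed table entry for 9 and 9 degrees of freedom with upper-tail area 0.025 and should be checked against Table A.7. All names are hypothetical.

```python
def f_test_two_tailed(var1, var2, f_upper_12, f_upper_21):
    """Return (F, F_lower, reject) for H0: equal population variances.

    f_upper_12: upper critical value with (n1-1, n2-1) degrees of freedom
    f_upper_21: upper critical value with (n2-1, n1-1) degrees of freedom
    """
    f_stat = var1 / var2
    f_lower = 1.0 / f_upper_21   # lower critical value from the swapped-df upper value
    reject = f_stat < f_lower or f_stat > f_upper_12
    return f_stat, f_lower, reject


# Hypothetical samples of 10 observations each, so both lookups use (9, 9) df
f_stat, f_lower, reject_f = f_test_two_tailed(16.0, 40.0, f_upper_12=4.03, f_upper_21=4.03)
```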
Mean Test for Paired Data or Repeated Measures • Based on a one-sample test of the corresponding differences (Di) • Z Test for known population σD • t Test for unknown σD (with df = n-1) • H0: µD = 0
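A sketch of the paired t test as a one-sample test on the differences. The measurements below are invented for illustration, and the critical value 2.776 is the usual table entry for df = 4 with upper-tail area 0.025.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_test(first, second, t_c):
    """Return (t, df, reject) for H0: mean difference = 0 on paired samples."""
    diffs = [a - b for a, b in zip(first, second)]   # D_i for each pair
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))       # stdev uses the n - 1 divisor
    return t, n - 1, abs(t) > t_c


# Hypothetical repeated measures on five subjects
before = [10, 12, 9, 11, 13]
after = [8, 11, 9, 9, 10]
t_paired, df_paired, reject_paired = paired_t_test(before, after, t_c=2.776)
```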
χ² Test for the Difference among Two or More Proportions • Uses contingency table to compute $\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$, summed over all cells • (fe)i = ni·p or ni·(1-p) are the expected frequencies, where p = X/n, and (fo)i are the observed frequencies • For more than 1 factor, (fe)ij = ni·pj, where pj = Xj/n • Uses the upper-tail critical χ² value, with the df = number of groups - 1 • For more than 1 factor, df = (factors - 1)*(groups - 1)
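The expected-frequency bookkeeping for a two-row contingency table (successes and failures per group) can be sketched as below. The function name and counts are hypothetical; 3.841 is the standard upper-tail χ² critical value for df = 1 at α = 0.05.

```python
def chi_square_proportions(successes, sizes):
    """Return (chi2, df) for H0: all groups share the same proportion."""
    p = sum(successes) / sum(sizes)   # pooled estimate p = X / n
    chi2 = 0.0
    for x_i, n_i in zip(successes, sizes):
        # each group contributes a "success" cell and a "failure" cell
        for observed, expected in ((x_i, n_i * p), (n_i - x_i, n_i * (1 - p))):
            chi2 += (observed - expected) ** 2 / expected
    return chi2, len(sizes) - 1


# Hypothetical data: 30 of 100 successes in group 1, 50 of 100 in group 2
chi2, df_chi = chi_square_proportions([30, 50], [100, 100])
reject_chi = chi2 > 3.841
```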
Other Tests • χ² Test for the Difference between Variances • Follows directly from the χ² confidence interval for the variance (standard deviation) in Ch 8. • Very sensitive to non-Normal distributions, so not a robust test. • Wilcoxon Rank Sum Test between Two Medians
Homework • Work through rest of Appendix 9.1 • Work and hand in Problems 9.69, 9.71, 9.74 • Read Chapter 10 • Design of Experiments: One Factor and Randomized Block Experiments