
Inferential Statistics: Hypothesis Testing


darice

Presentation Transcript


  1. Inferential Statistics: Hypothesis Testing. Testing Population Variances. Analysis of Variance – ANOVA

  2. Content • Estimation • Estimate population means • Estimate population proportion • Estimate population variance • Hypothesis testing • Testing population means • Testing categorical data / proportion • Testing population variances • Hypothesis about many population means

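  3. Testing Population Variances • Single population variance • Chi-square test • Two population variances • F-test • Analysis of Variance: tests many population means • One-way ANOVA • Two-way ANOVA

  4. Single Population Variance • Assumption • Normal distribution of population • Test the population variance σ² from a sample against a specified population variance σ₀². • Hypothesis • H0: σ² = σ₀² • H1: σ² ≠ σ₀² (two-tailed), or σ² > σ₀² or σ² < σ₀² (one-tailed)

  5. Single Population Variance Test statistic (chi-square): χ² = (n − 1)s² / σ₀², with n − 1 degrees of freedom, where s² is the sample variance. Critical Region • Two-tailed: χ² < χ²(1−α/2),n−1 or χ² > χ²(α/2),n−1 • Right-tailed: χ² > χ²(α),n−1 • Left-tailed: χ² < χ²(1−α),n−1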

  6. Example 1 A fluorescent lamp factory knows that the lifespan of its lamps is normally distributed with a variance of 10,000 hr². In an inspection, 20 sample lamps are tested and the sample variance is found to be 12,000 hr². Can it be concluded that the variance of the lamps' lifespan has changed at significance level 0.05? Hypothesis H0: σ² = 10,000 H1: σ² ≠ 10,000 α = 0.05

  7. Example 1 Calculate test statistic χ² = (20 − 1)(12,000) / 10,000 = 22.8 Degrees of freedom = 20 − 1 = 19 χ²(1−0.025),19 = 8.91, χ²(0.025),19 = 32.85 The calculated chi-square: 8.91 < 22.8 < 32.85, so it does not fall in the two-tailed critical region. Do not reject H0. There is not enough evidence that the variance of the lamps' lifespan has changed at significance level 0.05.
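The arithmetic in this example can be checked with a few lines of Python. This is a minimal sketch using only the numbers given on the slides; the function name `chi_square_statistic` is hypothetical, and the critical values are copied from the slide rather than computed.

```python
def chi_square_statistic(n, sample_var, hypothesized_var):
    """Chi-square statistic for a single-variance test: (n - 1) * s^2 / sigma0^2."""
    return (n - 1) * sample_var / hypothesized_var


# Values from the lamp example (slides 6 and 7).
chi2 = chi_square_statistic(n=20, sample_var=12_000, hypothesized_var=10_000)

# Critical values for df = 19, alpha = 0.05 (two-tailed), as read from the slide.
lower_crit, upper_crit = 8.91, 32.85

# Two-tailed rule: reject H0 only if the statistic falls outside the interval.
reject_h0 = chi2 < lower_crit or chi2 > upper_crit
print(f"chi-square = {chi2:.1f}, reject H0: {reject_h0}")
```

The statistic lies between the two critical values, so H0 is not rejected.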

  8. Example 2 A company claims that the standard deviation of its thermometers does not exceed 0.5 °C. To verify this, 16 thermometers are sampled. The sample standard deviation is found to be 0.7 °C. Is the claim true at significance level 0.01? Hypothesis H0: σ² ≤ 0.25 H1: σ² > 0.25 α = 0.01

  9. Example 2 Calculate test statistic χ² = (16 − 1)(0.7²) / 0.25 = 29.4 Degrees of freedom = 16 − 1 = 15 χ²(0.01),15 = 30.58 The calculated chi-square: 29.4 < 30.58, so it does not fall in the right-tailed critical region. Do not reject H0. At significance level 0.01 there is not enough evidence against the claim that the standard deviation does not exceed 0.5 °C.
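Because H1 is σ² > 0.25, only the upper tail of the chi-square distribution matters here. The sketch below repeats the calculation for the thermometer example; the variable names are illustrative and the critical value is taken from the slide.

```python
n = 16            # number of thermometers sampled
sample_sd = 0.7   # observed standard deviation (°C)
claimed_sd = 0.5  # claimed upper bound on the standard deviation (°C)

# The test works on variances, so square both standard deviations.
chi2 = (n - 1) * sample_sd**2 / claimed_sd**2

# Upper-tail critical value for df = 15, alpha = 0.01, as read from the slide.
upper_crit = 30.58

# Right-tailed rule: reject H0 only if the statistic exceeds the critical value.
reject_h0 = chi2 > upper_crit
print(f"chi-square = {chi2:.1f}, reject H0: {reject_h0}")
```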

  10. Two Populations Variances • Compare two variances • Sir Ronald Fisher found that • Given that s1² and s2² are the variances of the first and second sample groups of size n1 and n2 respectively, • and both sample groups are randomly selected from normally distributed populations, • the ratio of the two sample variances follows a distribution whose skewness changes with the degrees of freedom of the samples, which are n1 − 1 and n2 − 1. • This distribution has become known as the Fisher Distribution or F-Distribution

  11. http://en.wikipedia.org/wiki/F-distribution

  12. Application and Limitation • The F-test is used to assess model fit, for example in ANOVA and linear regression analysis, where it evaluates the error (i.e. variance) of the model. • Applicable to 2 populations only • If the populations are not normally distributed, the test will be inaccurate.

  13. Two Populations Variances Test statistic: F = s1² / s2², with df1 = n1 − 1 and df2 = n2 − 1

  14. Two Populations Variances Alternate form

  15. Example 1 The weekly bicycle sales counts of two retailers are as follows. Retailer A: 65 46 57 43 58 Retailer B: 52 41 43 47 32 49 57 If the weekly sales count is normally distributed, test whether the variances of the sales counts of the two retailers are different at significance level 0.02. Hypothesis H0: σA² = σB² H1: σA² ≠ σB² α = 0.02

  16. Example 1 • Calculate variances • sA² = 82.7, sB² = 66.143 • Calculate test statistic • From H0, σA² = σB², so F = sA² / sB² = 82.7 / 66.143 ≈ 1.25 • Degrees of freedom • df1 = 5 − 1 = 4, df2 = 7 − 1 = 6

  17. Example 1 Critical region Do not reject H0. There is not enough evidence that the variances of the sales counts of the two retailers are different at significance level 0.02.
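The sample variances and the variance ratio can be reproduced from the raw sales counts. This is a small sketch using Python's standard `statistics` module, whose `variance` function uses the n − 1 denominator; the variable names are illustrative. The critical values for this example did not survive in the transcript, so the sketch stops at the test statistic.

```python
from statistics import variance

# Weekly sales counts from slide 15.
retailer_a = [65, 46, 57, 43, 58]
retailer_b = [52, 41, 43, 47, 32, 49, 57]

# statistics.variance computes the sample variance (divides by n - 1).
var_a = variance(retailer_a)
var_b = variance(retailer_b)

# F statistic: ratio of the two sample variances.
f_stat = var_a / var_b

# Degrees of freedom for numerator and denominator.
df1 = len(retailer_a) - 1
df2 = len(retailer_b) - 1

print(f"s_A^2 = {var_a:.3f}, s_B^2 = {var_b:.3f}, F = {f_stat:.3f}, df = ({df1}, {df2})")
```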

  18. Example 2 According to a previous study, the variance of the time required for female workers to assemble a product is less than that of male workers. To re-verify the study, 11 male workers and 14 female workers are sampled. The observed standard deviations of the assembly time are 6.1 for males and 5.3 for females. Assuming the assembly time is normally distributed, test whether the result of the previous study is accurate at significance level 0.01. Hypothesis H0: σm² ≤ σf² H1: σm² > σf² α = 0.01

  19. Example 2 • Calculate test statistic • F = sm² / sf² = 6.1² / 5.3² = 1.325 • Degrees of freedom • df1 = 11 − 1 = 10, df2 = 14 − 1 = 13 • Critical region F0.01(10,13) = 4.10 • Calculated F is 1.325 < 4.10 • Do not reject H0 • At significance level 0.01 there is not enough evidence that the variance of the time required for female workers to assemble a product is less than that of male workers
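When only standard deviations are reported, they must be squared before forming the F ratio. A minimal sketch of this example's calculation follows, with the critical value copied from the slide and illustrative variable names.

```python
sd_male, n_male = 6.1, 11
sd_female, n_female = 5.3, 14

# H1 is sigma_m^2 > sigma_f^2, so the male variance goes in the numerator.
f_stat = sd_male**2 / sd_female**2
df1, df2 = n_male - 1, n_female - 1

# Upper-tail critical value F_0.01(10, 13), as read from the slide.
f_crit = 4.10

# Right-tailed rule: reject H0 only if F exceeds the critical value.
reject_h0 = f_stat > f_crit
print(f"F = {f_stat:.3f}, df = ({df1}, {df2}), reject H0: {reject_h0}")
```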

  20. Analysis of Variance (ANOVA) • Tests whether any of several population means differ from each other • One-way ANOVA: 1 independent variable with 3 or more groups • Dependent variable is assumed to be of interval or ratio scale • Also used with ordinal scale data • Can describe the effect of the independent variable on the dependent variable • Two-way ANOVA: two independent variables, one dependent variable • MANOVA: two or more dependent variables • Can describe interaction between two independent variables

  21. One-way ANOVA • Tests the means (of the dependent variable) between groups as specified by an independent variable that is organized in 3 or more groups (categorical) • Occupation: Student, Lecturer, Doctor (1 variable, 3 groups) • Salary: dependent variable • Assumptions • Dependent variable is either interval or ratio (continuous) • Dependent variable is approximately normally distributed for each category of the independent variable • There is equality of variances between the independent groups (homogeneity of variances). • Independence of cases.

  22. One-way ANOVA Concept • Total Variance = Between-Group Variance + Within-Group Variance • Between-Group Variance • Describes the differences between the group means, which is the effect of the variable of interest • Within-Group Variance • Describes the differences between observations and their own group mean, which is the effect caused by other factors, called Error H0: μ1 = μ2 = μ3 = … = μk H1: at least one pair of means differs (μi ≠ μj for some i ≠ j)

  23. One-way ANOVA Table • k: number of groups • n: total number of observations • SST = SSB + SSW

  24. One-way ANOVA Table Tj: sum of the observations in group j T: sum of all observations nj: number of observations in group j k: number of groups xij: the ith data (row) of the jth group (column) x̄j: the mean of group j x̄: overall mean
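The ANOVA table itself did not survive in the transcript. The block below restates the standard one-way ANOVA quantities using the symbols defined on this slide; it is a reconstruction that agrees with the arithmetic of the worked example that follows, not a copy of the original table.

```latex
\begin{aligned}
SST &= \sum_{j=1}^{k} \sum_{i=1}^{n_j} \left(x_{ij} - \bar{x}\right)^2, &
SSB &= \sum_{j=1}^{k} n_j \left(\bar{x}_j - \bar{x}\right)^2, &
SSW &= \sum_{j=1}^{k} \sum_{i=1}^{n_j} \left(x_{ij} - \bar{x}_j\right)^2 = SST - SSB, \\
MSB &= \frac{SSB}{k - 1}, &
MSW &= \frac{SSW}{n - k}, &
F &= \frac{MSB}{MSW}.
\end{aligned}
```

The between-group row has k − 1 degrees of freedom, the within-group row has n − k, and the total has n − 1.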

  25. Example 1 The survey results on the attitude of executives in small, medium, and large companies toward management administration are shown in the table. Test whether the attitudes of executives from companies of different sizes are different at significance level 0.05.

  26. Example 1 Hypothesis H0: μ1 = μ2 = μ3 H1: at least one pair of means differs α = 0.05 Calculate test statistic SST = (7−6)² + (7−6)² + (5−6)² + (4−6)² + (7−6)² + (4−6)² + (4−6)² + (2−6)² + (2−6)² + (3−6)² + (10−6)² + (10−6)² + (9−6)² + (6−6)² + (10−6)² = 114

  27. Example 1 SSB = 5(6−6)² + 5(3−6)² + 5(9−6)² = 0 + 45 + 45 = 90 SSW = SST − SSB = 114 − 90 = 24 MSB = 90 / (3−1) = 45 MSW = 24 / (15−3) = 2 F = MSB / MSW = 45 / 2 = 22.5

  28. Example 1 Degrees of freedom dfB = 3 − 1 = 2, dfW = 15 − 3 = 12 F0.05(2,12) = 3.89 The calculated F is 22.5 > 3.89 Reject H0 and accept H1 The attitudes of executives from companies of different sizes are different at significance level 0.05
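The whole calculation can be replayed in Python. The data table is missing from the transcript, so the sketch below recovers the 15 observations from the SST expansion on slide 26 and assumes they are listed group by group, five per group. That assumption is consistent with the group means 6, 3 and 9 used for SSB. The group labels are placeholders.

```python
from statistics import mean

# Observations recovered from the SST expansion; grouping is an assumption.
groups = {
    "group 1": [7, 7, 5, 4, 7],
    "group 2": [4, 4, 2, 2, 3],
    "group 3": [10, 10, 9, 6, 10],
}

all_values = [x for values in groups.values() for x in values]
grand_mean = mean(all_values)
k = len(groups)        # number of groups
n = len(all_values)    # total number of observations

# Sums of squares: total, between groups, within groups.
sst = sum((x - grand_mean) ** 2 for x in all_values)
ssb = sum(len(v) * (mean(v) - grand_mean) ** 2 for v in groups.values())
ssw = sum((x - mean(v)) ** 2 for v in groups.values() for x in v)

# Mean squares and the F statistic.
msb = ssb / (k - 1)
msw = ssw / (n - k)
f_stat = msb / msw

# Critical value F_0.05(2, 12), as read from the slide.
f_crit = 3.89
print(f"SST={sst}, SSB={ssb}, SSW={ssw}, F={f_stat}, reject H0: {f_stat > f_crit}")
```

Computing SSW directly from the group means, rather than as SST − SSB, gives an independent check that the decomposition SST = SSB + SSW holds for these data.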

  29. Post-hoc Test One-way ANOVA does not tell which pairs have different means. A post-hoc test (also called post-hoc analysis or multiple comparison) is used to identify the pairs that differ. *Not needed if ANOVA does not reject H0 (means are not different)

  30. Post-hoc Test Methods requiring equality of variances 1. Least Significant Difference (LSD) 2. Waller-Duncan 3. S-N-K (Student-Newman-Keuls) 4. Dunnett 5. Bonferroni 6. Sidak 7. Scheffe 8. R-E-G-W F 9. Tukey's HSD 10. R-E-G-W Q 11. Tukey's-b 12. Duncan 13. Hochberg's GT2 14. Gabriel Methods not requiring equality of variances 1. Tamhane's T2 2. Dunnett's T3 3. Games-Howell 4. Dunnett's C

  31. Least Significant Difference (LSD) • Fisher's Least Significant Difference, proposed by R.A. Fisher to compare multiple pairs at the same time • Calculate LSD = t(α/2, n−k) × √(MSW (1/ni + 1/nj)) • If ni = nj = m then LSD = t(α/2, n−k) × √(2 MSW / m) • Compare |x̄i − x̄j| to the LSD value • If |x̄i − x̄j| > LSD then the means of the pair are different: μi ≠ μj • Otherwise, the means are not different: μi = μj

  32. From Example 1 The sample sizes are equal in the 3 groups α = 0.05, n − k = 15 − 3 = 12, t(0.025, 12) = 2.18 Comparison At significance level 0.05, the attitude score of executives from small companies is higher than that of executives from medium companies, and the attitude score of executives from large companies is higher than that of executives from medium and small companies.
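The pairwise comparison can be sketched as follows. MSW = 2 and the group means 6, 3 and 9 come from the ANOVA example. Mapping those means to small, medium and large companies is an assumption inferred from the stated conclusion. The LSD formula used is the standard equal-sample-size form, t × √(2·MSW/m), where m is the per-group sample size.

```python
from itertools import combinations
from math import sqrt

# Group means from the ANOVA example; the size labels are an assumption.
group_means = {"small": 6, "medium": 3, "large": 9}
msw = 2            # within-group mean square from the ANOVA example
m = 5              # observations per group
t_crit = 2.18      # t(0.025, 12), as read from the slide

# Least significant difference for equal group sizes.
lsd = t_crit * sqrt(2 * msw / m)

# A pair differs when the absolute difference in means exceeds the LSD.
significant = {
    (a, b): abs(group_means[a] - group_means[b]) > lsd
    for a, b in combinations(group_means, 2)
}
print(f"LSD = {lsd:.2f}")
for pair, differs in significant.items():
    print(pair, "differ" if differs else "do not differ")
```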

  33. Tukey's Honestly Significant Difference (HSD) • Requires the sample sizes to be the same • HSD = q(α, k, dfW) × √(MSW / m), where m is the sample size of each group • k = number of groups, dfW = n − k, q is obtained from the q-table • Compare |x̄i − x̄j| to the HSD value • If |x̄i − x̄j| > HSD then the means of the pair are different: μi ≠ μj • Otherwise, the means are not different: μi = μj

  34. From Example 1 The sample sizes are equal in the 3 groups, so HSD is applicable α = 0.05, k = 3, n − k = 15 − 3 = 12, q(0.05, 3, 12) = 3.77 Comparison At significance level 0.05, the attitude score of executives from small companies is higher than that of executives from medium companies, and the attitude score of executives from large companies is higher than that of executives from medium and small companies.
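A sketch of the HSD comparison follows, using the standard form HSD = q × √(MSW/m). MSW and the group means come from the ANOVA example, q is copied from the slide, and the mapping of means to company sizes is an assumption inferred from the stated conclusion.

```python
from itertools import combinations
from math import sqrt

group_means = {"small": 6, "medium": 3, "large": 9}  # size labels assumed
msw = 2          # within-group mean square from the ANOVA example
m = 5            # observations per group (equal sizes are required for HSD)
q_crit = 3.77    # q(0.05, 3, 12), as read from the slide

# Honestly significant difference.
hsd = q_crit * sqrt(msw / m)

# Absolute mean difference for every pair, compared against HSD.
differences = {
    (a, b): abs(group_means[a] - group_means[b])
    for a, b in combinations(group_means, 2)
}
print(f"HSD = {hsd:.3f}")
for pair, diff in differences.items():
    print(pair, diff, "differ" if diff > hsd else "do not differ")
```

The HSD threshold is larger than the LSD threshold for the same data, which reflects Tukey's stricter control of the error rate across all pairs.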

  35. Scheffe • Scheffe, or the S-Method, is applicable to different sample sizes • S = √((k − 1) × F(α; dfB, dfW) × MSW × (1/ni + 1/nj)) • k = number of groups, dfB = k − 1, dfW = n − k, MSW from ANOVA • Compare |x̄i − x̄j| to the S value • If |x̄i − x̄j| > S then the means of the pair are different: μi ≠ μj • Otherwise, the means are not different: μi = μj

  36. From Example 1 α = 0.05, k = 3, dfB = 2, dfW = 12, F0.05(2,12) = 3.89

  37. From Example 1 Comparison At significance level 0.05, the attitude score of executives from small companies is higher than that of executives from medium companies, and the attitude score of executives from large companies is higher than that of executives from medium and small companies.
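The Scheffe threshold can be sketched in the same way, using the standard form S = √((k − 1)·F·MSW·(1/ni + 1/nj)). The function name `scheffe_threshold` is hypothetical. MSW and the group means come from the ANOVA example, the F critical value is the 3.89 given on slide 28, and the size labels are an assumption inferred from the stated conclusion.

```python
from itertools import combinations
from math import sqrt


def scheffe_threshold(k, f_crit, msw, n_i, n_j):
    """Scheffe critical difference for one pair of groups (sizes may differ)."""
    return sqrt((k - 1) * f_crit * msw * (1 / n_i + 1 / n_j))


group_means = {"small": 6, "medium": 3, "large": 9}  # size labels assumed
group_sizes = {"small": 5, "medium": 5, "large": 5}
msw = 2          # within-group mean square from the ANOVA example
f_crit = 3.89    # F_0.05(2, 12), as given on slide 28
k = len(group_means)

# Compare each pair's absolute mean difference with its own threshold.
results = {}
for a, b in combinations(group_means, 2):
    s = scheffe_threshold(k, f_crit, msw, group_sizes[a], group_sizes[b])
    results[(a, b)] = (abs(group_means[a] - group_means[b]), s)

for pair, (diff, s) in results.items():
    print(pair, f"diff={diff}, S={s:.3f}", "differ" if diff > s else "do not differ")
```

Because the threshold is computed per pair from ni and nj, the same function works when group sizes are unequal, which is the case where Scheffe is preferred over HSD.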

  38. Example 2 Four teaching methods are applied to 4 groups of students. Based on the exam scores in the table, test whether the four methods give different results at significance level 0.01

  39. Example 2 Hypothesis H0: μ1 = μ2 = μ3 = μ4 H1: at least one pair of means differs α = 0.01 SSB

  40. Example 2 SSW of each group

  41. One-way ANOVA Table • k: number of groups • n: total number of observations • SST = SSB + SSW

  42. Example 2 F0.01(3,22) = 4.82 The calculated F is 7.01 > 4.82 Reject H0 and accept H1 The four teaching methods give different results at significance level 0.01

  43. Two-way ANOVA • Used to determine the effect of 2 factors/treatments (independent variables) on one dependent variable • Occupation: Student, Lecturer, Doctor • Age: less than 20, 20-30, 31-40, 41 or older • Salary: dependent variable • Assumptions • Dependent variable is either interval or ratio (continuous) • The dependent variable is approximately normally distributed for each combination of levels of the two independent variables • Homogeneity of variances of the groups formed by the different combinations of levels of the two independent variables. • Independence of cases
