
Applications of Pearson’s Chi-Square Test in Statistical Inference

Exploring three key applications of Pearson’s chi-square test: testing goodness of fit, independence, and equality of proportions. Learn statistical hypotheses, randomization plans, calculations, critical values, practical significance, assumptions, and more.


Presentation Transcript


  1. Chapter 17: Statistical Inference for Frequency Data
     I. Three Applications of Pearson's χ²
        - Testing goodness of fit
        - Testing independence
        - Testing equality of proportions

  2. A. Testing Goodness of Fit
     1. Statistical hypotheses
        H0: O_Pop 1 = E_Pop 1, . . . , O_Pop k = E_Pop k
        H1: O_Pop j ≠ E_Pop j for some j
     2. Randomization plan
        - One random sample of n elements
        - Each element is classified in terms of membership in one of k mutually exclusive categories

  3. B. Testing Independence
     1. Statistical hypotheses
        H0: p(A and B) = p(A)p(B)
        H1: p(A and B) ≠ p(A)p(B)
     2. Randomization plan
        - One random sample of n elements
        - Each element is classified in terms of two variables, denoted by A and B, where each variable has two or more categories

  4. C. Testing Equality of Proportions
     1. Statistical hypotheses
        H0: p_1 = p_2 = . . . = p_c
        H1: p_j ≠ p_j′ for some j and j′
     2. Randomization plan
        - c random samples, where c ≥ 2
        - For each sample, elements are classified in terms of membership in one of r = 2 mutually exclusive categories

  5. II. Testing Goodness of Fit
     A. Chi-Square Distribution

  6. B. Pearson's chi-square statistic
        χ² = Σ_(j=1)^k (O_j – E_j)²/E_j
     1. O_j and E_j denote, respectively, the observed and expected frequencies; k denotes the number of categories.
     2. The critical value of chi-square is χ²_(α; ν) with ν = k – 1 degrees of freedom.

  7. C. Grade-Distribution Example
     1. Is the distribution of grades for summer-school students in a statistics class different from that for the fall and spring semesters?

        Grade   Fall and Spring Proportion   Summer Obs. Frequency
        A            .12                          15
        B            .23                          21
        C            .47                          30
        D            .13                           6
        F            .05                           0
        Total       1.00                          72

  8. 2. The statistical hypotheses are
        H0: O_Pop 1 = E_Pop 1, . . . , O_Pop 5 = E_Pop 5
        H1: O_Pop j ≠ E_Pop j for some j
     3. Pearson's chi-square statistic is
        χ² = Σ_(j=1)^k (O_j – E_j)²/E_j
     4. The critical value of chi-square for α = .05, k = 5 categories, and ν = 5 – 1 = 4 degrees of freedom is χ²_(.05; 4) = 9.488.

  9. Table 1. Computation of Pearson's Chi-Square for n = 72 Summer-School Students

        (1)      (2)    (3)    (4)               (5)         (6)
        Grade    O_j    p_j    np_j = E_j        O_j – E_j   (O_j – E_j)²/E_j
        A        15     .12    72(.12) =  8.6      6.4        4.763
        B        21     .23    72(.23) = 16.6      4.4        1.166
        C        30     .47    72(.47) = 33.8     –3.8        0.427
        D         6     .13    72(.13) =  9.4     –3.4        1.230
        F         0     .05    72(.05) =  3.6     –3.6        3.600
        Total    72    1.00               72.0               χ² = 11.186*
        *p < .025
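
      The computation in Table 1 can be reproduced in a few lines of Python. The sketch below assumes SciPy is available; scipy.stats.chisquare expects expected counts, so the fall/spring proportions are first scaled by n = 72. (Using unrounded expected frequencies gives χ² ≈ 11.11; Table 1's 11.186 reflects expected frequencies rounded to one decimal place.)

      ```python
      import numpy as np
      from scipy import stats

      # Observed summer-school grade frequencies (A, B, C, D, F)
      observed = np.array([15, 21, 30, 6, 0])

      # Fall/spring proportions define the expected frequencies under H0
      proportions = np.array([.12, .23, .47, .13, .05])
      expected = observed.sum() * proportions                    # np_j = E_j

      chi2, p_value = stats.chisquare(f_obs=observed, f_exp=expected)
      critical = stats.chi2.ppf(1 - .05, df=len(observed) - 1)   # critical value, df = 4

      print(f"chi-square = {chi2:.3f}, p = {p_value:.4f}")             # about 11.11
      print(f"critical value (alpha = .05, df = 4) = {critical:.3f}")  # about 9.488
      ```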

  10. 5. When e parameters of a theoretical distribution must be estimated, the degrees of freedom are k – 1 – e.
      D. Practical Significance
      1. Cohen's ŵ
         ŵ = √[ Σ_(j=1)^k (p̂_Oj – p̂_Ej)²/p̂_Ej ]
         where p̂_Oj and p̂_Ej denote, respectively, the observed and expected proportions in the jth category.

  11. 2. A simpler, equivalent formula for Cohen's ŵ is
         ŵ = √(χ²/n)
      3. Cohen's guidelines for interpreting w
         0.1 is a small effect
         0.3 is a medium effect
         0.5 is a large effect
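
      Applied to the grade-distribution example (χ² = 11.186 from Table 1, n = 72), a minimal sketch:

      ```python
      import math

      chi2 = 11.186   # Pearson chi-square from Table 1
      n = 72          # number of summer-school students

      w_hat = math.sqrt(chi2 / n)
      print(f"Cohen's w = {w_hat:.2f}")   # about 0.39, between a medium and a large effect
      ```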

  12. E. Yates' Correction
      1. When ν = 1, Yates' correction can be applied to the O_j – E_j terms so that the sampling distribution of the test statistic, which is discrete, better approximates the continuous chi-square distribution.
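
      The corrected formula is not reproduced in the transcript; the sketch below assumes the usual form of Yates' correction, in which .5 is subtracted from each |O_j – E_j| before squaring.

      ```python
      import numpy as np

      def yates_chi2(observed, expected):
          """Pearson chi-square with Yates' continuity correction (for nu = 1)."""
          o = np.asarray(observed, dtype=float)
          e = np.asarray(expected, dtype=float)
          return np.sum((np.abs(o - e) - 0.5) ** 2 / e)
      ```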

  13. F. Assumptions of the Goodness-of-Fit Test
      1. Every observation is assigned to one and only one category.
      2. The observations are independent.
      3. If ν = 1, every expected frequency should be at least 10. If ν > 1, every expected frequency should be at least 5.

  14. III. Testing Independence
      A. Statistical Hypotheses
         H0: p(A and B) = p(A)p(B)
         H1: p(A and B) ≠ p(A)p(B)
      B. Chi-Square Statistic for an r × c Contingency Table with i = 1, . . . , r Rows and j = 1, . . . , c Columns
         χ² = Σ_(i=1)^r Σ_(j=1)^c (O_ij – E_ij)²/E_ij

  15. C. Computational Example: Is Success on an Employment-Test Item Independent of Gender?

                         Observed                    Expected
                      b1 Fail  b2 Pass  Total     b1 Fail  b2 Pass
      a1 Men            84       18      102        88.9     13.1
      a2 Women          93        8      101        88.1     12.9
      Total            177       26      203

  16. D. Computation of Expected Frequencies
      1. A and B are statistically independent if p(a_i and b_j) = p(a_i)p(b_j).
      2. The expected frequency for the cell in row i and column j is
         E_ij = (row i total)(column j total)/n

  17.                  Observed                    Expected
                      b1 Fail  b2 Pass  Total     b1 Fail  b2 Pass
      a1                84       18      102        88.9     13.1
      a2                93        8      101        88.1     12.9
      Total            177       26      203

      E_11 = (102)(177)/203 = 88.9    E_12 = (102)(26)/203 = 13.1
      E_21 = (101)(177)/203 = 88.1    E_22 = (101)(26)/203 = 12.9

  18. E. Degrees of Freedom for an r × c Contingency Table
      df = k – 1 – e
         = rc – 1 – [(r – 1) + (c – 1)]
         = rc – 1 – r + 1 – c + 1
         = rc – r – c + 1
         = (r – 1)(c – 1) = (2 – 1)(2 – 1) = 1
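
      SciPy's chi2_contingency bundles these steps (expected frequencies, degrees of freedom, test statistic, p value) for the employment-test table; a sketch, assuming SciPy is available. By default it applies Yates' correction to 2 × 2 tables, so correction=False is passed to obtain the uncorrected Pearson statistic.

      ```python
      import numpy as np
      from scipy.stats import chi2_contingency

      # Rows: a1 men, a2 women; columns: b1 fail, b2 pass
      table = np.array([[84, 18],
                        [93,  8]])

      chi2, p_value, dof, expected = chi2_contingency(table, correction=False)

      print(expected)   # approximately [[88.9, 13.1], [88.1, 12.9]], matching the table above
      print(f"chi-square = {chi2:.2f}, df = {dof}, p = {p_value:.4f}")   # about 4.3 with 1 df
      ```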

  19. F. Strength of Association and Practical Significance
      1. Cramér's V̂
         V̂ = √[ χ²/(n(s – 1)) ]
         where s is the smaller of the number of rows and columns.

  20. 2. Practical significance, Cohen's ŵ
         ŵ = √(χ²/n)
      3. For a contingency table, an alternative formula for ŵ is
         ŵ = V̂ √(s – 1)

  21. G. Three-by-Three Contingency Table
      1. Motivation and education of conscientious objectors during WWII

                           College   High School   Grade School   Total
      Coward                  12          25             35         72
      Partly Coward           19          23             30         72
      Not Coward              71          56             24        151
      Total                  102         104             89        295

  22. 2. Strength of association, Cramér's V̂
      3. Practical significance, Cohen's ŵ
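
      The numerical results for this slide are not preserved in the transcript; the sketch below (assuming SciPy) computes χ², Cramér's V̂, and Cohen's ŵ for the conscientious-objector table so the reader can fill them in.

      ```python
      import numpy as np
      from scipy.stats import chi2_contingency

      # Rows: coward, partly coward, not coward; columns: college, high school, grade school
      table = np.array([[12, 25, 35],
                        [19, 23, 30],
                        [71, 56, 24]])

      chi2, p_value, dof, expected = chi2_contingency(table)   # no Yates correction when df > 1

      n = table.sum()               # 295
      s = min(table.shape)          # smaller of the number of rows and columns
      cramers_v = np.sqrt(chi2 / (n * (s - 1)))
      cohens_w = np.sqrt(chi2 / n)  # equivalently cramers_v * sqrt(s - 1)

      print(f"chi-square = {chi2:.2f}, df = {dof}")                       # about 36.7 with 4 df
      print(f"Cramer's V = {cramers_v:.2f}, Cohen's w = {cohens_w:.2f}")  # about 0.25 and 0.35
      ```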

  23. H. Assumptions of the Independence Test
      1. Every observation is assigned to one and only one cell of the contingency table.
      2. The observations are independent.
      3. If ν = 1, every expected frequency should be at least 10. If ν > 1, every expected frequency should be at least 5.

  24. IV. Testing Equality of c ≥ 2 Proportions
      A. Statistical Hypotheses
         H0: p_1 = p_2 = . . . = p_c
         H1: p_j ≠ p_j′ for some j and j′
      1. Computational example: three samples, each of n = 100 residents of nursing homes, were surveyed. Variable A was resident satisfaction; variable B was age heterogeneity in the home.

  25. Table 2. Nursing Home Data

                                Age Heterogeneity
                            Low b1      Medium b2    High b3
      Satisfied a1          O = 56      O = 58       O = 38
                            E = 50.67   E = 50.67    E = 50.67
      Not Satisfied a2      O = 44      O = 42       O = 62
                            E = 49.33   E = 49.33    E = 49.33
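
      A sketch of the test for Table 2, again assuming SciPy. With c = 3 samples and r = 2 categories, the statistic has (r – 1)(c – 1) = 2 degrees of freedom.

      ```python
      import numpy as np
      from scipy.stats import chi2_contingency

      # Rows: satisfied, not satisfied; columns: low, medium, high age heterogeneity (n = 100 each)
      table = np.array([[56, 58, 38],
                        [44, 42, 62]])

      chi2, p_value, dof, expected = chi2_contingency(table)

      print(expected)   # about 50.67 in the satisfied row, 49.33 in the not-satisfied row
      print(f"chi-square = {chi2:.2f}, df = {dof}, p = {p_value:.4f}")   # about 9.7 with 2 df
      ```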

  26. B. Assumptions of the Equality-of-Proportions Test
      1. Every observation is assigned to one and only one cell of the contingency table.

  27. 2. The observations are independent.
      3. If ν = 1, every expected frequency should be at least 10. If ν > 1, every expected frequency should be at least 5.
      C. Test of Homogeneity of Proportions
      1. An extension of the test of equality of proportions to the case in which variable A has r > 2 rows.

  28. 2. Statistical hypotheses for columns j and j′
         H0: p_1j = p_1j′, p_2j = p_2j′, . . . , p_rj = p_rj′
         H1: p_ij ≠ p_ij′ for some row i
