
Lesson #29: 2 × 2 Contingency Tables



Presentation Transcript


  1. Lesson #29: 2 × 2 Contingency Tables

  2. In general, contingency tables are used to present data that have been “cross-classified” by two categorical variables. Begin with a 2 × 2 table, where both variables are dichotomous.

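  3. General layout of a 2 × 2 table (rows: Variable 1, columns: Variable 2):

     |                     | Variable 2, level 1 | Variable 2, level 2 | Total       |
     |---------------------|---------------------|---------------------|-------------|
     | Variable 1, level 1 | a                   | b                   | a+b         |
     | Variable 1, level 2 | c                   | d                   | c+d         |
     | Total               | a+c                 | b+d                 | n = a+b+c+d |

     In the table, we have observed frequencies (a, b, c, and d). These can also be denoted by $O_i$, $i = 1, 2, 3, 4$.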

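  4. Example: arthritis by exercise level.

     |                | Arthritis: Yes | Arthritis: No | Total |
     |----------------|----------------|---------------|-------|
     | Exercise: High | 35             | 91            | 126   |
     | Exercise: Low  | 82             | 115           | 197   |
     | Total          | 117            | 206           | 323   |

     $$OR = \frac{(35)(115)}{(91)(82)} = 0.54$$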
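
The odds ratio on this slide is the cross-product ratio ad/(bc) of the 2 × 2 table. A minimal Python sketch of that arithmetic follows; the function name `odds_ratio` is an illustrative choice, not something from the lesson.

```python
def odds_ratio(a, b, c, d):
    """Cross-product odds ratio ad / (bc) for the 2 x 2 table [[a, b], [c, d]]."""
    if b == 0 or c == 0:
        # The ratio is undefined when either off-diagonal cell is empty.
        raise ValueError("odds ratio is undefined when b or c is zero")
    return (a * d) / (b * c)


# Arthritis (Yes, No) by exercise level: High = (35, 91), Low = (82, 115).
arthritis_or = odds_ratio(35, 91, 82, 115)
print(round(arthritis_or, 2))  # 0.54
```

A value below 1 means the odds of arthritis are lower in the first row (high exercise) than in the second row (low exercise).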

  5. We can also test for an association between the two variables. This is called a test of independence, or a test of homogeneity. The null hypothesis can be stated in any of these equivalent ways:
     - there is no association between the two variables, or
     - the two variables are independent, or
     - the distributions of one variable are homogeneous over levels of the other.

  6. To perform the test, we first need to calculate the expected frequency, $E_i$, in each cell. This indicates how many observations we expect to see in that cell if the null hypothesis is true. Recall that if two events A and B are independent, $P(A \text{ and } B) = P(A)\,P(B)$.

  7. P(an observation being in any cell) = P(being in that row and being in that column). Then, under $H_0$, this equals P(being in that row) × P(being in that column). Thus, under $H_0$, we can estimate this by

     $$\frac{\text{row total}}{n} \times \frac{\text{column total}}{n}$$

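  8. To get the expected number in any cell, multiply the probability of being in that cell by n:

     $$E_i = n \times \frac{\text{row total}}{n} \times \frac{\text{column total}}{n} = \frac{(\text{row total})(\text{column total})}{n}$$

     This is done for all 4 cells in the 2 × 2 table.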
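
The expected-count rule can be sketched in a few lines of Python. This is an illustration under the independence assumption stated above; the helper name `expected_counts` is hypothetical, and the input is the arthritis table from slide 4.

```python
def expected_counts(table):
    """Expected cell counts under H0: (row total)(column total) / n for each cell."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


# Observed arthritis counts: rows = exercise (High, Low), columns = arthritis (Yes, No).
observed = [[35, 91], [82, 115]]
expected = expected_counts(observed)
for row in expected:
    print([round(e, 2) for e in row])
# [45.64, 80.36]
# [71.36, 125.64]
```

The expected counts keep the same row and column totals as the observed table, which is a quick sanity check on the calculation.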

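  9. The test statistic is then:

     $$\chi^2 = \sum_{i=1}^{4} \frac{(O_i - E_i)^2}{E_i} \sim \chi^2_{(1)} \text{ under } H_0$$

     Reject $H_0$ if $\chi^2 > \chi^2_{(1),\,1-\alpha}$.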

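  10. Observed frequencies:

      |       | Yes | No  | Total |
      |-------|-----|-----|-------|
      | High  | 35  | 91  | 126   |
      | Low   | 82  | 115 | 197   |
      | Total | 117 | 206 | 323   |

      Expected frequencies:

      |       | Yes   | No     | Total |
      |-------|-------|--------|-------|
      | High  | 45.64 | 80.36  | 126   |
      | Low   | 71.36 | 125.64 | 197   |
      | Total | 117   | 206    | 323   |

      For example, $E_1 = \dfrac{(126)(117)}{323} = 45.64$.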

  11. Reject $H_0$ if $\chi^2 > \chi^2_{(1),\,0.95} = 3.841$. Here $\chi^2 = 6.38 > 3.841$, so we reject $H_0$. Conclusion: arthritis is less likely among those who exercised.
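
The whole test can be reproduced with the Python standard library. The sketch below assumes the 5% critical value 3.841 quoted on the slide. The p-value helper is an addition that is not in the lesson: it relies on the fact that a chi-square variable with 1 degree of freedom is a squared standard normal. All function names are illustrative.

```python
import math


def pearson_chi_square(observed):
    """Sum of (O - E)^2 / E over all cells, with E = (row total)(column total) / n."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n
            stat += (o - e) ** 2 / e
    return stat


def p_value_df1(stat):
    """Upper-tail probability of a chi-square(1) variable: P(Z^2 > stat) = erfc(sqrt(stat / 2))."""
    return math.erfc(math.sqrt(stat / 2.0))


observed = [[35, 91], [82, 115]]
stat = pearson_chi_square(observed)
critical_value = 3.841  # chi-square(1) critical value at alpha = 0.05, as quoted on the slide
reject_h0 = stat > critical_value

print(round(stat, 2))  # 6.38
print(reject_h0)       # True
print(p_value_df1(stat))
```

Comparing the statistic to the critical value and comparing the p-value to 0.05 lead to the same decision.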

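  12. For a 2 × 2 table, there is a “shortcut” method:

      |            | Variable 2 |     | Total       |
      |------------|------------|-----|-------------|
      | Variable 1 | a          | b   | a+b         |
      |            | c          | d   | c+d         |
      | Total      | a+c        | b+d | n = a+b+c+d |

      $$\chi^2 = \frac{n\,(ad - bc)^2}{(a+b)(c+d)(a+c)(b+d)}$$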

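  13. Applying the shortcut to the arthritis data:

      |                | Arthritis: Yes | Arthritis: No | Total |
      |----------------|----------------|---------------|-------|
      | Exercise: High | 35             | 91            | 126   |
      | Exercise: Low  | 82             | 115           | 197   |
      | Total          | 117            | 206           | 323   |

      $$\chi^2 = \frac{323\,[(35)(115) - (91)(82)]^2}{(126)(197)(117)(206)} = 6.38$$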
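
As a check, the shortcut can be coded directly from the cell counts and margins and compared with the cell-by-cell sum of (O - E)²/E. This is a sketch assuming the standard 2 × 2 shortcut n(ad - bc)² / [(a+b)(c+d)(a+c)(b+d)]; both function names are illustrative.

```python
import math


def chi_square_shortcut(a, b, c, d):
    """2 x 2 shortcut: n (ad - bc)^2 / ((a+b)(c+d)(a+c)(b+d))."""
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("every row and column total must be positive")
    return n * (a * d - b * c) ** 2 / denominator


def chi_square_cellwise(a, b, c, d):
    """Same statistic computed as the sum of (O - E)^2 / E over the four cells."""
    n = a + b + c + d
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    cells = ((a, 0, 0), (b, 0, 1), (c, 1, 0), (d, 1, 1))
    return sum((o - rows[i] * cols[j] / n) ** 2 / (rows[i] * cols[j] / n)
               for o, i, j in cells)


shortcut = chi_square_shortcut(35, 91, 82, 115)
cellwise = chi_square_cellwise(35, 91, 82, 115)
print(round(shortcut, 2))                # 6.38
print(math.isclose(shortcut, cellwise))  # True
```

The shortcut avoids computing the four expected counts, which is convenient by hand; in code the two routes give the same statistic up to floating-point rounding.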
