Please turn off cell phones, pagers, etc. The lecture will begin shortly.

Presentation Transcript


  1. Please turn off cell phones, pagers, etc. The lecture will begin shortly.

  2. Lecture 24. This lecture will cover one topic from Chapter 13: • Test for independence in a 2×2 table (Section 13.3)

  3. 1. Test for independence in a 2×2 table

     Last time, we introduced the idea of hypothesis testing. Hypothesis testing helps us to decide whether an interesting effect that we see in a sample is “strong enough” to conclude that it also holds in the population.

     The first step in hypothesis testing is to identify the null hypothesis and the alternative hypothesis.
     • The null hypothesis is that the interesting effect that we see in the sample is not real, i.e. that it does not hold in the population.
     • The alternative hypothesis is that the effect is real.

     Notice that the null and alternative hypotheses are statements about the population, not about the sample!

  4. Example

     Sample n = 900 likely voters and ask: “Will you vote for X?”

     | Response        | Freq |
     |-----------------|------|
     | Yes             | 520  |
     | No or undecided | 380  |
     | Total           | 900  |

     The sample rate of support is 520 / 900 = 57.8%.

     Is there enough evidence to conclude that X will win?
     • Null: The rate of support in the population is 50%.
     • Alternative: The rate of support in the population is not 50%.

     The margin of error is 100/sqrt(900) = 3.3%, and the range of likely values is 57.8 – 3.3 = 54.5% to 57.8 + 3.3 = 61.1%. Because 50% falls outside this range, we can safely reject the null hypothesis.
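The arithmetic in this polling example can be checked with a short script. This is only a sketch: the function name `margin_of_error_pct` and the variable names are illustrative assumptions, and the rule used is the slide's approximate margin of error, 100/sqrt(n) percentage points.

```python
import math

def margin_of_error_pct(n):
    """Approximate margin of error in percentage points: 100 / sqrt(n)."""
    return 100.0 / math.sqrt(n)

n = 900      # sample size from the slide
yes = 520    # respondents who answered "Yes"

support_pct = 100.0 * yes / n       # sample rate of support, in percent
moe = margin_of_error_pct(n)        # margin of error, in percentage points
low, high = support_pct - moe, support_pct + moe

# Reject the null hypothesis (50% support) if 50 lies outside the range.
null_pct = 50.0
reject_null = not (low <= null_pct <= high)

print(f"support = {support_pct:.1f}%, margin of error = {moe:.1f}%")
print(f"range of likely values: {low:.1f}% to {high:.1f}%")
print("reject null" if reject_null else "do not reject null")
```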

  5. Relationship between two binary variables

     |         | Response: Yes | Response: No |
     |---------|---------------|--------------|
     | Group 1 | A             | B            |
     | Group 2 | C             | D            |

     Today we introduce a new test that will help us to decide whether a relationship that we observe between two binary variables in a 2×2 table also exists in the population. This is a test for independence in a 2×2 table. It is commonly called a chi-square test.

     The explanatory variable is Group (1 or 2). The outcome variable is the response (yes or no).

  6. Null hypothesis

     |         | Yes | No |
     |---------|-----|----|
     | Group 1 | A   | B  |
     | Group 2 | C   | D  |

     The null hypothesis for the chi-square test is:
     • The explanatory variable and the outcome are independent.
     • The proportions of “Yes” in Group 1 and Group 2 in the population are the same.
     • The population relative risk is 1.0.
     • The population odds ratio is 1.0.

     These different ways of expressing the null hypothesis are all equivalent.

  7. Alternative hypothesis

     |         | Yes | No |
     |---------|-----|----|
     | Group 1 | A   | B  |
     | Group 2 | C   | D  |

     The alternative hypothesis for the chi-square test is:
     • The explanatory variable and the outcome are related.
     • The proportions of “Yes” in Group 1 and Group 2 in the population are not the same.
     • The population relative risk is not 1.0.
     • The population odds ratio is not 1.0.

     Again, these expressions are equivalent.

  8. Step 1: Compute the marginal totals

     |         | Response: Yes | Response: No | Total   |
     |---------|---------------|--------------|---------|
     | Group 1 | A             | B            | A+B     |
     | Group 2 | C             | D            | C+D     |
     | Total   | A+C           | B+D          | A+B+C+D |

     • First, compute the row totals.
     • Next, compute the column totals.
     • Finally, compute the grand total.
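Step 1 can be sketched in a few lines of Python. The function name `marginal_totals` and the example counts (30, 20, 10, 40) are hypothetical and chosen only for illustration; they do not come from the lecture.

```python
def marginal_totals(a, b, c, d):
    """Return (row totals, column totals, grand total) for the 2x2 table
    with first row (a, b) and second row (c, d)."""
    row_totals = (a + b, c + d)    # Group 1 total, Group 2 total
    col_totals = (a + c, b + d)    # "Yes" total, "No" total
    grand_total = a + b + c + d
    return row_totals, col_totals, grand_total

# Hypothetical counts, for illustration only.
rows, cols, n = marginal_totals(30, 20, 10, 40)
print(rows, cols, n)
```

The row totals and the column totals each add up to the grand total, which is a quick check that the table was entered correctly.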

  9. Step 2: Compute the “expected frequencies”

     The expected frequencies represent our best guess for what the four frequencies (A, B, C, D) would be if the null hypothesis were true. We must compute the expected frequency for each cell.

     The expected frequency for a cell is that cell’s row total, multiplied by that cell’s column total, divided by the grand total:

     $$\text{expected frequency} = \frac{\text{row total} \times \text{column total}}{\text{grand total}}$$

  10. Let’s call the row totals R1 and R2, the column totals C1 and C2, and the grand total N.

      |         | Response: Yes | Response: No | Total |
      |---------|---------------|--------------|-------|
      | Group 1 | (R1×C1) / N   | (R1×C2) / N  | R1    |
      | Group 2 | (R2×C1) / N   | (R2×C2) / N  | R2    |
      | Total   | C1            | C2           | N     |

      The expected frequency for the cell in the first row and first column is (R1 × C1) / N. The expected frequency for the cell in the first row and second column is (R1 × C2) / N. …and so on.
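The expected-frequency rule, (row total × column total) / grand total, might be coded as follows. The function name `expected_frequencies` and the example counts are assumptions made for this sketch, not part of the lecture.

```python
def expected_frequencies(a, b, c, d):
    """Expected frequencies (A', B', C', D') for a 2x2 table under the
    null hypothesis of independence."""
    r1, r2 = a + b, c + d      # row totals R1, R2
    c1, c2 = a + c, b + d      # column totals C1, C2
    n = r1 + r2                # grand total N
    return (r1 * c1 / n, r1 * c2 / n,
            r2 * c1 / n, r2 * c2 / n)

# Hypothetical counts, for illustration only.
exp_a, exp_b, exp_c, exp_d = expected_frequencies(30, 20, 10, 40)
print(exp_a, exp_b, exp_c, exp_d)
```

The expected frequencies keep the same row and column totals as the observed table; only the split within each row changes.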

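  11. Now we have two tables:
      • The original table of observed frequencies
      • The new table of expected frequencies

      Observed frequencies:

      |         | Response: Yes | Response: No |
      |---------|---------------|--------------|
      | Group 1 | A             | B            |
      | Group 2 | C             | D            |

      Expected frequencies:

      |         | Response: Yes | Response: No |
      |---------|---------------|--------------|
      | Group 1 | A’            | B’           |
      | Group 2 | C’            | D’           |

      If the observed and expected frequencies are “far apart,” then we have evidence that the null hypothesis is false. So we must compute an overall “distance” between the two sets of frequencies.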

  12. Step 3: Compute the “distances”

      The “distance” between the observed frequency and expected frequency for a cell is given by this formula:

      $$\text{distance} = \frac{(\text{observed} - \text{expected})^2}{\text{expected}}$$

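  13. Observed frequencies:

      |         | Response: Yes | Response: No |
      |---------|---------------|--------------|
      | Group 1 | A             | B            |
      | Group 2 | C             | D            |

      Expected frequencies:

      |         | Response: Yes | Response: No |
      |---------|---------------|--------------|
      | Group 1 | A’            | B’           |
      | Group 2 | C’            | D’           |

      Cell A: $\text{distance} = (A - A')^2 / A'$
      Cell B: $\text{distance} = (B - B')^2 / B'$
      Cell C: $\text{distance} = (C - C')^2 / C'$
      Cell D: $\text{distance} = (D - D')^2 / D'$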

  14. Step 4: Add up the distances

      The “total distance” is the sum of the four distances. This is also called the Pearson chi-square statistic, and is often denoted by $X^2$:

      $$X^2 = \frac{(A - A')^2}{A'} + \frac{(B - B')^2}{B'} + \frac{(C - C')^2}{C'} + \frac{(D - D')^2}{D'}$$

      Larger values of the chi-square statistic provide greater evidence that the null hypothesis is false, i.e. that a relationship exists between the two variables.
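Steps 1 through 4 can be combined into one small function. The sketch below assumes the table is passed as a list of two rows; the name `pearson_chi_square` and the example counts are hypothetical.

```python
def pearson_chi_square(observed):
    """Pearson chi-square statistic for a 2x2 table given as
    [[A, B], [C, D]]."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)

    x2 = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Expected frequency: row total x column total / grand total.
            exp = row_totals[i] * col_totals[j] / n
            # "Distance" for this cell: (observed - expected)^2 / expected.
            x2 += (obs - exp) ** 2 / exp
    return x2

# Hypothetical counts, for illustration only.
example = [[30, 20], [10, 40]]
x2 = pearson_chi_square(example)
print(round(x2, 2))
```

In this made-up table every observed count is 10 away from its expected value, so the four distances are 100/20, 100/30, 100/20 and 100/30.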

  15. Step 5: Make the decision

      We can safely reject the null hypothesis if the chi-square statistic is 3.84 or greater.
      • If $X^2 \ge 3.84$, then reject the null hypothesis. You may safely conclude that a relationship exists.
      • If $X^2 < 3.84$, then do not reject the null hypothesis. There is not enough evidence to conclude that a relationship exists.

  16. Comments
      • Notice that if the chi-square statistic is less than 3.84, we do not say that we “accept the null hypothesis.” A small chi-square value does not prove that the null hypothesis is true; it only means that there is insufficient evidence to reject the null hypothesis.
      • This chi-square test with a cutoff of 3.84 is reliable only if the sample size is “large enough.” In this case, “large enough” means that all of the expected frequencies (A’, B’, C’ and D’) should be at least 5.0.
      • If any of the expected frequencies is less than 5.0, do not use this chi-square test.
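The decision rule and the sample-size caveat can be written down together. This is a sketch only: the function name, the constant names, and the returned strings are assumptions for illustration; the cutoff 3.84 and the minimum expected frequency 5.0 are the values given on the slides.

```python
CUTOFF = 3.84          # cutoff for the chi-square statistic (from the slides)
MIN_EXPECTED = 5.0     # smallest acceptable expected frequency (from the slides)

def chi_square_decision(x2, expected):
    """Apply the decision rule to a chi-square statistic x2, given the
    four expected frequencies as a sequence."""
    if min(expected) < MIN_EXPECTED:
        # The test is not reliable when an expected frequency is below 5.0.
        return "test not applicable"
    if x2 >= CUTOFF:
        return "reject the null hypothesis"
    # Never "accept the null": there is simply not enough evidence.
    return "do not reject the null hypothesis"

# Hypothetical inputs, for illustration only.
print(chi_square_decision(16.67, [20.0, 30.0, 20.0, 30.0]))
print(chi_square_decision(0.50, [20.0, 30.0, 20.0, 30.0]))
print(chi_square_decision(10.0, [3.0, 7.0, 12.0, 28.0]))
```

Checking the expected frequencies first mirrors the last comment above: when any of them is below 5.0, the value of the statistic should not be interpreted at all.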

  17. Example

      “Should it be possible for a pregnant woman to obtain a legal abortion if the woman wants it for any reason?”

      Legal abortion for any reason:

      |       | Yes | No  | Total |
      |-------|-----|-----|-------|
      | Men   | 215 | 269 | 484   |
      | Women | 172 | 244 | 416   |
      | Total | 387 | 513 | 900   |

      In this sample, the rate of support is slightly higher among men than among women (OR = 1.13). Can we safely conclude that this is also true for the population?
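The sample proportions and the odds ratio quoted for this table can be reproduced as follows. The variable names are illustrative assumptions; the odds ratio is computed in the usual way for a 2×2 table, as (A × D) / (B × C).

```python
# Observed counts from the slide.
men_yes, men_no = 215, 269
women_yes, women_no = 172, 244

# Sample proportion of "Yes" in each group.
p_men = men_yes / (men_yes + men_no)
p_women = women_yes / (women_yes + women_no)

# Sample odds ratio: odds of "Yes" among men divided by odds among women.
odds_ratio = (men_yes * women_no) / (men_no * women_yes)

print(f"men: {p_men:.1%}, women: {p_women:.1%}, OR = {odds_ratio:.2f}")
```

An odds ratio a little above 1.0 matches the slightly higher sample rate of support among men; the chi-square test asks whether a difference this small could plausibly arise when the population odds ratio is exactly 1.0.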

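  18. Compute the expected frequencies:

      Observed frequencies (legal abortion for any reason):

      |       | Yes | No  | Total |
      |-------|-----|-----|-------|
      | Men   | 215 | 269 | 484   |
      | Women | 172 | 244 | 416   |
      | Total | 387 | 513 | 900   |

      A’ = (484 × 387) / 900 = 208.12
      B’ = (484 × 513) / 900 = 275.88
      C’ = (416 × 387) / 900 = 178.88
      D’ = (416 × 513) / 900 = 237.12

      Expected frequencies:

      |       | Yes    | No     |
      |-------|--------|--------|
      | Men   | 208.12 | 275.88 |
      | Women | 178.88 | 237.12 |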

  19. Compute the distances:

      Observed frequencies:

      |       | Yes | No  |
      |-------|-----|-----|
      | Men   | 215 | 269 |
      | Women | 172 | 244 |

      Expected frequencies:

      |       | Yes    | No     |
      |-------|--------|--------|
      | Men   | 208.12 | 275.88 |
      | Women | 178.88 | 237.12 |

      $(215 - 208.12)^2 / 208.12 = 0.23$
      $(269 - 275.88)^2 / 275.88 = 0.17$
      $(172 - 178.88)^2 / 178.88 = 0.26$
      $(244 - 237.12)^2 / 237.12 = 0.20$

      $X^2 = 0.23 + 0.17 + 0.26 + 0.20 = 0.86$

  20. Conclusion

      Because $X^2 = 0.86$ is less than 3.84, we cannot reject the null hypothesis.

      The rate of support is greater among men than among women in this sample. But there is not enough evidence to conclude that the rate of support is greater among men than among women in the population.
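The whole worked example can be reproduced end to end with a short script. This is a sketch under the lecture's stated rules (cutoff 3.84, all expected frequencies at least 5.0); the variable names are assumptions, and no continuity correction is applied, so library routines that apply one by default may report a different statistic.

```python
# Observed counts: rows are Men, Women; columns are Yes, No.
observed = [[215, 269],
            [172, 244]]

row_totals = [sum(row) for row in observed]          # 484, 416
col_totals = [sum(col) for col in zip(*observed)]    # 387, 513
n = sum(row_totals)                                  # 900

# Step 2: expected frequency = row total x column total / grand total.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Step 3: per-cell distance = (observed - expected)^2 / expected.
distances = [[(o - e) ** 2 / e for o, e in zip(obs_row, exp_row)]
             for obs_row, exp_row in zip(observed, expected)]

# Step 4: the chi-square statistic is the sum of the four distances.
x2 = sum(sum(row) for row in distances)

# Step 5: decide, after checking that the test is applicable.
test_applicable = min(min(row) for row in expected) >= 5.0
reject_null = test_applicable and x2 >= 3.84

print([[round(e, 2) for e in row] for row in expected])
print([[round(d, 2) for d in row] for row in distances])
print(round(x2, 2), "reject" if reject_null else "do not reject")
```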
