
Session IV: Contingency Tables Tests of Association and Independence (Zar, Chapters 23, 24.10)



Presentation Transcript


  1. Session IV: Contingency Tables • Tests of Association and Independence • (Zar, Chapters 23, 24.10)

  2. Chapter 23: Contingency Tables • Another (and last look!!) at peas!

  3. Remember the pea color & texture: • Put the data into a matrix:

  4. What was the hypothesis from the last session? • Answer: 9 : 3 : 3: 1, or ? • (1) Y is dominant! • Test the hypothesis with:

  5. (2) S is dominant! • Test the hypothesis with: Then use $\chi^2$ for the answer.

  6. (3) Color & Texture are independent • Problem: How to test this w/o assuming • proportions in the hypothesis? Estimate from the Marginal Totals

  7. Given the marginal probability estimates, if the H0 of independence is true, what are the probabilities in the table “cells”?

  8. Given the marginal probability estimates • and table estimates, • how do we get the expected number? Expected # = probability x total number
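The rule "expected # = probability x total number" can be sketched in a few lines. This is a minimal illustration, not the slide's own calculation: the counts are hypothetical (they are not the pea data), and the function name `expected_counts` is an assumption. Under independence the cell probability is (row total / n) x (column total / n), so the expected count reduces to row total x column total / n.

```python
def expected_counts(table):
    """Expected cell counts under independence.

    cell probability = (row total / n) * (column total / n),
    expected count   = cell probability * n = row total * column total / n.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


# Hypothetical 2 x 2 counts, chosen only for illustration.
observed = [[30, 10],
            [20, 40]]
expected = expected_counts(observed)
for row in expected:
    print(row)
```

Note that the expected counts always reproduce the observed row and column totals, which is why the marginal totals cost degrees of freedom.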

  9. In General: 2 x 2 Tables: DF: 2 x 2 - 1 - 1 - 1 = 1 (4 cells, minus 1 for the total, minus 1 for the row, minus 1 for the column)

  10. In general:

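  11. Where: Estimates of the “cells”: $\hat{f}_{ij} = R_i C_j / n$, with $R_i$ the total of row i, $C_j$ the total of column j, and n the grand total. Degrees of Freedom: $rc - 1 - (r - 1) - (c - 1) = (r - 1)(c - 1)$ (total cells, minus 1 for the total, minus r - 1 for the rows, minus c - 1 for the columns)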
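For an r x c table, the chi-square statistic and its degrees of freedom can be computed as in the sketch below. The function name `chi_square_independence` and the counts are assumptions made for illustration; the statistic is the usual sum of (observed - expected)^2 / expected over all cells, with (r - 1)(c - 1) degrees of freedom.

```python
def chi_square_independence(table):
    """Return (chi-square, degrees of freedom) for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, f in enumerate(row):
            f_hat = row_totals[i] * col_totals[j] / n  # expected count
            chi2 += (f - f_hat) ** 2 / f_hat
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


# Hypothetical 2 x 3 counts.
observed = [[10, 20, 30],
            [20, 20, 20]]
chi2, df = chi_square_independence(observed)
print(chi2, df)
```

The statistic is then compared with the critical value of chi-square for `df` degrees of freedom.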

  12. Special cases:

  13. Example 23.1: Hair Color and Gender Therefore, reject H0

  14. Okay, we have now rejected the null hypothesis: H0: The rows and the columns are independent. And accepted the alternative hypothesis: HA: The rows and the columns are not independent. What now? Look for sub-hypotheses? Look at the individual chi-squares! [Cell labels, gender by hair color: MBk, MBr, MBl, MR, FBk, FBr, FBl, FR]
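One way to "look at the individual chi-squares" is to list each cell's contribution (observed - expected)^2 / expected together with the direction of the deviation. The sketch below assumes a hypothetical 2 x 4 table (two genders by four hair colors); the counts are invented and are not those of Example 23.1, and `cell_contributions` is an illustrative name.

```python
def cell_contributions(table):
    """Per-cell chi-square contribution and direction of the deviation."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    result = []
    for i, row in enumerate(table):
        out_row = []
        for j, f in enumerate(row):
            f_hat = row_totals[i] * col_totals[j] / n
            if f > f_hat:
                direction = "too many"
            elif f < f_hat:
                direction = "too few"
            else:
                direction = "as expected"
            out_row.append(((f - f_hat) ** 2 / f_hat, direction))
        result.append(out_row)
    return result


# Hypothetical counts: rows = gender, columns = four hair colors.
observed = [[30, 40, 15, 15],
            [50, 30, 30, 10]]
for row in cell_contributions(observed):
    print([(round(x, 3), d) for x, d in row])
```

Columns whose cells contribute little and deviate in the same direction are natural candidates for combining before retesting.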

  15. It makes sense to combine hair color with similar gender ratios. Blonds vs. Non-Blonds? But are all the Non-Blonds the same? H0: Black=Brown=Red

  16. Now H0: Blonds=Non-Blonds Hair Color independent of Gender (1) Too Few Male Blonds (3) Too Many Male NBl (2) Too Many Female Blonds (4) Too Few Female NBl

  17. The 2 x 2 Table re-examined: • (1) The problem of counts vs. estimates (2) The Yates Correction The subtraction of 0.5 is “conservative”. Rounding would be better.

  18. (3) The Computing formulae
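The slide's formulae did not survive extraction. As a sketch, the widely used computing formulae for a 2 x 2 table are n(f11 f22 - f12 f21)^2 / (R1 R2 C1 C2) for the uncorrected statistic and n(|f11 f22 - f12 f21| - n/2)^2 / (R1 R2 C1 C2) with the Yates correction; it is an assumption that these are the forms the slide showed. The counts and function names are hypothetical.

```python
def chi2_2x2(table):
    """Uncorrected chi-square for a 2 x 2 table via the computing formula."""
    (f11, f12), (f21, f22) = table
    n = f11 + f12 + f21 + f22
    margins = (f11 + f12) * (f21 + f22) * (f11 + f21) * (f12 + f22)
    return n * (f11 * f22 - f12 * f21) ** 2 / margins


def chi2_2x2_yates(table):
    """Yates-corrected chi-square for a 2 x 2 table."""
    (f11, f12), (f21, f22) = table
    n = f11 + f12 + f21 + f22
    margins = (f11 + f12) * (f21 + f22) * (f11 + f21) * (f12 + f22)
    # Subtracting n/2 here is the same as subtracting 0.5 from each |f - f_hat|.
    diff = max(abs(f11 * f22 - f12 * f21) - n / 2, 0)
    return n * diff ** 2 / margins


# Hypothetical counts.
observed = [[30, 10],
            [20, 40]]
print(chi2_2x2(observed), chi2_2x2_yates(observed))
```

The corrected value is never larger than the uncorrected one, which is the sense in which the correction is conservative.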

  19. (4) The Cochran/Haber Correction • The Cochran/Haber (Haber (1980)) correction gives better results when routinely employed than does either the Yates-corrected or non-corrected chi-square calculation. • (a) Determine each of the four expected frequencies, denoting the smallest as $\hat{f}$. • (b) Calculate the absolute difference between the smallest expected frequency and its corresponding observed frequency $f$: $d = |f - \hat{f}|$. • (c) If $f \le 2\hat{f}$, define D = the largest multiple of 0.5 that is < d; • (d) If $f > 2\hat{f}$, define D = d - 0.5. • (e) Calculate $\chi^2_H = n^3 D^2 / (R_1 R_2 C_1 C_2)$.
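Steps (a) through (e) can be sketched as follows. This assumes the usual statement of the procedure: the branch conditions are f <= 2 f-hat and f > 2 f-hat, and the final statistic is n^3 D^2 / (R1 R2 C1 C2). The counts and the name `chi2_cochran_haber` are hypothetical.

```python
from math import ceil


def chi2_cochran_haber(table):
    """Cochran/Haber-corrected chi-square for a 2 x 2 table (sketch)."""
    (f11, f12), (f21, f22) = table
    r1, r2 = f11 + f12, f21 + f22
    c1, c2 = f11 + f21, f12 + f22
    n = r1 + r2
    # (a) expected frequencies paired with their observed counts
    cells = [(r1 * c1 / n, f11), (r1 * c2 / n, f12),
             (r2 * c1 / n, f21), (r2 * c2 / n, f22)]
    f_hat, f = min(cells)  # cell with the smallest expected frequency
    # (b) absolute difference for that cell
    d = abs(f - f_hat)
    if f <= 2 * f_hat:
        # (c) largest multiple of 0.5 strictly below d (never negative)
        big_d = max((ceil(2 * d) - 1) / 2, 0.0)
    else:
        # (d) same adjustment as the Yates correction
        big_d = d - 0.5
    # (e) corrected statistic
    return n ** 3 * big_d ** 2 / (r1 * r2 * c1 * c2)


# Hypothetical counts.
observed = [[3, 9],
            [13, 5]]
print(chi2_cochran_haber(observed))
```

For this table the smallest expected frequency is 5.6 with observed count 9, so d = 3.4 and D = 3.0, slightly larger than the Yates value of 2.9.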

  20. Uncorrected: Yates: Cochran: Use the Blond/Non-Blond Example:

  21. (5) The Fisher – Exact Method 2 x 2 tables only Hypergeometric Distribution • Just like the binomial, we need • this probability + all more extreme.

  22. To find those more extreme, look for the smallest expected frequency as in Cochran’s Method. Form successive tables as in the next example:

  23. From 2 to 1 And 1 to 0 Prob=0.02923 + 0.003498 + 0.0001499 = 0.03288. Reject H0: Independence or ?
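The "this probability + all more extreme" calculation can be sketched with the hypergeometric distribution. The table below is hypothetical and does not reproduce the probabilities quoted on the slide; `fisher_one_tailed` is an illustrative name. With the margins fixed, the probability of a table with upper-left count k is C(R1, k) C(R2, C1 - k) / C(n, C1), and the tail is summed in the direction of the observed deviation.

```python
from math import comb


def fisher_one_tailed(table):
    """One-tailed Fisher exact probability for a 2 x 2 table (sketch)."""
    (a, b), (c, d) = table
    r1, r2 = a + b, c + d
    c1 = a + c
    n = r1 + r2

    def prob(k):
        # Hypergeometric probability that the upper-left cell equals k.
        return comb(r1, k) * comb(r2, c1 - k) / comb(n, c1)

    k_min = max(0, c1 - r2)   # smallest feasible upper-left count
    k_max = min(r1, c1)       # largest feasible upper-left count
    if a <= r1 * c1 / n:      # observed at or below expectation: lower tail
        tail = range(k_min, a + 1)
    else:                     # observed above expectation: upper tail
        tail = range(a, k_max + 1)
    return sum(prob(k) for k in tail)


# Hypothetical counts; the successive tables have upper-left cell 2, 1, 0.
observed = [[2, 8],
            [9, 3]]
print(fisher_one_tailed(observed))
```

Stepping the upper-left cell is equivalent to stepping any other cell, because the fixed margins determine the remaining three counts.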

  24. A heterogeneity chi-square analysis of 2 x 2 contingency tables. Example 6.4a (Edition 2) Or see Example 23.8a (Edition 4) H0: The four samples are homogeneous. HA: The four samples are heterogeneous.

  25.  Accept H0, Experiments Homogeneous
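A heterogeneity chi-square for several 2 x 2 tables can be sketched as below. It assumes the usual procedure: compute an uncorrected chi-square for each table and sum them (k degrees of freedom), compute the chi-square of the pooled table (1 degree of freedom), and take the difference (k - 1 degrees of freedom). The tables are hypothetical, not those of the cited example, and the function names are illustrative.

```python
def chi2_2x2(table):
    """Uncorrected chi-square for one 2 x 2 table."""
    (a, b), (c, d) = table
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def heterogeneity_chi2(tables):
    """Return (heterogeneity chi-square, degrees of freedom)."""
    total = sum(chi2_2x2(t) for t in tables)          # df = k
    pooled_table = [[sum(t[i][j] for t in tables) for j in range(2)]
                    for i in range(2)]
    pooled = chi2_2x2(pooled_table)                   # df = 1
    return total - pooled, len(tables) - 1            # df = k - 1


# Four hypothetical experiments.
experiments = [[[30, 10], [20, 40]],
               [[28, 12], [22, 38]],
               [[33, 9], [18, 41]],
               [[29, 11], [21, 39]]]
het, df = heterogeneity_chi2(experiments)
print(het, df)
```

A small heterogeneity chi-square relative to its critical value supports pooling the experiments.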

  26. THE LOG-LIKELIHOOD RATIO The log-likelihood ratio was introduced in Session 2. In tables, it is calculated: $G = 2 \sum_i \sum_j f_{ij} \ln(f_{ij} / \hat{f}_{ij})$. Since G is approximately distributed as $\chi^2$, Table B.1 may be used with (r-1)(c-1) degrees of freedom.
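A minimal sketch of the log-likelihood ratio for a contingency table, using the standard form G = 2 * sum of f * ln(f / f-hat), with the expected counts taken from the marginal totals. The counts are hypothetical and the name `g_statistic` is an assumption; cells with an observed count of zero contribute nothing to the sum.

```python
from math import log


def g_statistic(table):
    """Return (G, degrees of freedom) for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    g = 0.0
    for i, row in enumerate(table):
        for j, f in enumerate(row):
            if f > 0:  # f * ln(f / f_hat) tends to 0 as f tends to 0
                f_hat = row_totals[i] * col_totals[j] / n
                g += f * log(f / f_hat)
    df = (len(table) - 1) * (len(table[0]) - 1)
    return 2 * g, df


# Hypothetical 2 x 3 counts.
observed = [[10, 20, 30],
            [20, 20, 20]]
g, df = g_statistic(observed)
print(g, df)
```

For moderately large samples G and the Pearson chi-square are usually close; here they differ only in the second decimal place.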

  27. Log-likelihood Example: Hair Color

  28. Three and Higher Dimensional Tables • Ex: Problem III: • Row: Remission • Column: Age (< 50, > 50) • Tier: LI (< 8%, > 8%) • [Figure: three-dimensional table; counts 10, 12, 16, 17 visible; row #2 not complete]

  29. 3-D: Types of Independence: • Notation: row i; column j; tier l; $f_{ijl}$ = # in row i, column j, tier l • [Figure: cube with cells $f_{111}$, $f_{112}$, $f_{121}$, $f_{122}$, $f_{211}$, $f_{221}$] • Mutual
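A test of mutual independence in a three-dimensional table can be sketched as below, assuming the standard expected frequency (row total x column total x tier total) / n^2 and rct - r - c - t + 2 degrees of freedom. The table is indexed as `table[i][j][l]` (row, column, tier); the counts are hypothetical, not the remission data, and the function name is illustrative.

```python
def mutual_independence_chi2(table):
    """Return (chi-square, df) for mutual independence in an r x c x t table."""
    r, c, t = len(table), len(table[0]), len(table[0][0])
    row_totals = [sum(sum(col) for col in table[i]) for i in range(r)]
    col_totals = [sum(sum(table[i][j]) for i in range(r)) for j in range(c)]
    tier_totals = [sum(table[i][j][l] for i in range(r) for j in range(c))
                   for l in range(t)]
    n = sum(row_totals)
    chi2 = 0.0
    for i in range(r):
        for j in range(c):
            for l in range(t):
                f_hat = row_totals[i] * col_totals[j] * tier_totals[l] / n ** 2
                chi2 += (table[i][j][l] - f_hat) ** 2 / f_hat
    df = r * c * t - r - c - t + 2
    return chi2, df


# Hypothetical 2 x 2 x 2 counts, indexed [row][column][tier].
observed = [[[12, 8], [15, 5]],
            [[9, 11], [6, 14]]]
print(mutual_independence_chi2(observed))
```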

  30. Partial R vs C&T: Column & Tier Spread Out as a single variable
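Spreading the columns and tiers out as a single variable turns the partial-independence test into an ordinary two-dimensional test, as in this sketch. It assumes the table is indexed `table[i][j][l]` (row, column, tier); the counts and function names are hypothetical. The resulting test has (r - 1)(ct - 1) degrees of freedom.

```python
def chi_square_2d(table):
    """Return (chi-square, df) for an ordinary two-dimensional table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, f in enumerate(row):
            f_hat = row_totals[i] * col_totals[j] / n
            chi2 += (f - f_hat) ** 2 / f_hat
    return chi2, (len(table) - 1) * (len(table[0]) - 1)


def rows_vs_columns_and_tiers(table3d):
    """Test rows against the combined column-and-tier variable."""
    # Each row becomes one flat list of its c * t column/tier cells.
    flat = [[f for col in row for f in col] for row in table3d]
    return chi_square_2d(flat)


# Hypothetical 2 x 2 x 2 counts, indexed [row][column][tier].
observed = [[[12, 8], [15, 5]],
            [[9, 11], [6, 14]]]
print(rows_vs_columns_and_tiers(observed))
```

The other two partial tests follow by reordering the indices so that the variable of interest comes first.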

  31. C vs R & T T vs R & C

  32. Pairwise

  33. Testing Proportions: A standard problem for which contingency tables and the independence null hypothesis are used is a test of proportions (more about this in Session 7). The problem in its simplest form is a 2 by 2 contingency table, similar to the following table, comparing two groups (control and treatment, say) for which there are two outcomes: positive/negative, alive/dead, remission/no remission.

  34. The control group percent positive = The treatment group percent positive = The hypothesis of independence with Fisher-Exact or the chi-square approximation is a test of this hypothesis

  35. Power Calculations and Sample Size Determination: Proportion positive control (p1) • Proportion positive treatment (p2) • Sample Size (n) • Type I Error (α) = Pr(Rejecting H0 | H0 is true) • Type II Error (β) = Pr(Rejecting HA | HA is true) • Or 1-β = Pr(Accepting HA | HA is true) = power • Given any four of these (p1, p2, n, α, 1-β), the fifth one is determined.

  36. This is using the formula of Casagrande JT, Pike MC (An Improved Formula for Calculating Sample Sizes, for Comparing Two Binomial Distributions, Biometrics, 34, 483-486): This equation can be solved for the parameter that is not defined. With the t-distribution, iteration is required.
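The slide's equation did not survive extraction. The sketch below solves for the per-group sample size using the form in which the Casagrande and Pike formula is commonly quoted: n = A [1 + sqrt(1 + 4|p1 - p2| / A)]^2 / (4 (p1 - p2)^2), with A = [z_alpha sqrt(2 p-bar q-bar) + z_beta sqrt(p1 q1 + p2 q2)]^2 and p-bar the mean of p1 and p2. It is an assumption that this matches the slide exactly, so check it against the paper before relying on it. The function name, the default values, and the `two_sided` switch are illustrative.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_per_group(p1, p2, alpha=0.05, power=0.80, two_sided=False):
    """Approximate per-group n for comparing two proportions (sketch)."""
    normal = NormalDist()
    # Normal deviates for the Type I error rate and for the power (1 - beta).
    z_alpha = normal.inv_cdf(1 - (alpha / 2 if two_sided else alpha))
    z_beta = normal.inv_cdf(power)
    p_bar = (p1 + p2) / 2
    delta = abs(p1 - p2)
    a = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
         + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    n = a * (1 + sqrt(1 + 4 * delta / a)) ** 2 / (4 * delta ** 2)
    return ceil(n)  # round up to a whole subject


# Hypothetical design: 20% positive in control, 40% in treatment.
print(sample_size_per_group(0.20, 0.40))
```

Solving for any of the other quantities (for example, power given n) means inverting the same equation, which can be done numerically by searching over the unknown.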
