Introduction to Biostatistics (BIO/EPI 540) Contingency Tables Acknowledgement: Thanks to Professor Pagano (Harvard School of Public Health) for lecture material
Contingency Tables • Nominal data that are grouped into categories are often presented in the form of contingency tables • Rows denote levels of one variable (e.g. disease) • Columns denote the levels of the other variable (e.g. exposure)
Example – Discrete Outcomes Consider whether the rate of caesarean sections differs between subjects who receive electronic fetal monitoring (EFM) and those who do not. Sample: 5,824 deliveries, of which 2,850 were EFM exposed and 2,974 were not. 358 of the 2,850 had c-sections, as did 229 of the 2,974. Binomial with very large n.
Example – Discrete Outcomes Do the c-section rates differ? Chi-square test • Proceed as usual: • 1. If there is no difference (null hypothesis), what do we expect to see? • 2. How does this compare to what we have observed? (statistic & its distribution)
Data-Contingency table If the c-section rate is the same in both populations, then ignore column classification and go with totals.
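The table this slide refers to can be rebuilt from the counts quoted in the example. The Python sketch below is illustrative only: the variable names are assumptions, not part of the lecture material, and the "no c-section" cells are obtained by subtraction from the group sizes.

```python
# Observed counts from the EFM example (rows: outcome, columns: exposure).
observed = {
    "c-section":    {"EFM": 358,        "no EFM": 229},
    "no c-section": {"EFM": 2850 - 358, "no EFM": 2974 - 229},
}
columns = ["EFM", "no EFM"]

row_totals = {row: sum(cells.values()) for row, cells in observed.items()}
col_totals = {col: sum(observed[row][col] for row in observed) for col in columns}
grand_total = sum(row_totals.values())

# Under the null hypothesis the column classification is ignored,
# so the common c-section rate is estimated from the row totals alone.
pooled_rate = row_totals["c-section"] / grand_total

print(f"{'':14}{'EFM':>8}{'no EFM':>8}{'Total':>8}")
for row, cells in observed.items():
    print(f"{row:14}{cells['EFM']:>8}{cells['no EFM']:>8}{row_totals[row]:>8}")
print(f"{'Total':14}{col_totals['EFM']:>8}{col_totals['no EFM']:>8}{grand_total:>8}")
print(f"Pooled c-section rate: {pooled_rate:.4f}")
```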
2x2 Table – Null Hypothesis • Ho: The proportion of C-sections among patients receiving EFM is identical to the proportion of C-sections among patients who do not receive EFM • Ha: The proportion of C-sections among patients receiving EFM is different from the proportion of C-sections among patients who do not receive EFM
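Probability of c-section From the totals we can estimate: P(c-section) = (358 + 229) / 5,824 = 587 / 5,824 ≈ 0.1008
Expected counts under Ho What do we expect to see if EFM has no effect? EFM exposed (2,850 mothers): 2,850 × 587 / 5,824 ≈ 287.3 c-sections. No EFM (2,974 mothers): 2,974 × 587 / 5,824 ≈ 299.7 c-sections.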
Observed and Expected counts – Contingency Table Expected, if independence of row and column classification is true, in boxes:
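A minimal sketch of how the expected counts could be computed for every cell, assuming the usual independence rule (row total × column total / grand total). The list layout and names are illustrative choices, not taken from the slides.

```python
# Observed 2x2 table from the EFM example.
observed = [[358, 229],      # c-section:    EFM, no EFM
            [2492, 2745]]    # no c-section: EFM, no EFM

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count under independence: row total * column total / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Print each observed count followed by its expected count in brackets.
for obs_row, exp_row in zip(observed, expected):
    print("  ".join(f"{o:5d} [{e:8.2f}]" for o, e in zip(obs_row, exp_row)))
```

The expected counts keep the same row and column totals as the observed table, which is a quick sanity check on the arithmetic.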
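Chi Square Test Goodness-of-fit statistic: $\chi^2 = \sum_{i} \frac{(O_i - E_i)^2}{E_i}$, where $O_i$ and $E_i$ are the observed and expected counts in cell $i$. For a 2x2 table, compare with the chi-square distribution with 1 degree of freedom (Table page A-26)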
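Continuity correction factor In 2x2 tables (only) we apply a continuity correction factor: $\chi^2 = \sum_{i} \frac{(|O_i - E_i| - 0.5)^2}{E_i}$, with $O_i$ the observed and $E_i$ the expected count in cell $i$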
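Example For the EFM and c-section example above: the continuity-corrected statistic is approximately $\chi^2 \approx 37.4$ on 1 degree of freedom, so p < 0.001 and Ho is rejected. Note: This is a two-sided test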
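The calculation for this example can be sketched in Python using only the standard library. The function name `chi_square_2x2` and its `continuity` flag are hypothetical, introduced here for illustration. The p-value uses the identity that a chi-square variable with 1 degree of freedom is a squared standard normal, so its upper tail equals `erfc(sqrt(x / 2))`.

```python
import math

def chi_square_2x2(observed, continuity=True):
    """Chi-square statistic and p-value (1 df) for a 2x2 table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    statistic = 0.0
    for i, r in enumerate(row_totals):
        for j, c in enumerate(col_totals):
            expected = r * c / n
            diff = abs(observed[i][j] - expected)
            if continuity:
                # Continuity correction: shrink |O - E| by 0.5, never below 0.
                diff = max(diff - 0.5, 0.0)
            statistic += diff ** 2 / expected
    # For 1 degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value

efm_table = [[358, 229], [2492, 2745]]
statistic, p_value = chi_square_2x2(efm_table)
uncorrected, _ = chi_square_2x2(efm_table, continuity=False)
print(f"corrected chi-square   = {statistic:.2f}, p = {p_value:.2e}")
print(f"uncorrected chi-square = {uncorrected:.2f}")
```

With a sample this large the correction changes the statistic only slightly, and the conclusion is the same either way.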
Equivalent Tests • The above example can be analyzed equivalently using a two-sample test of proportions (Chapter 14.6) • The two-sample test of proportions (Z test) and the chi-square test are mathematically equivalent for a 2x2 table: the squared Z statistic equals the chi-square statistic computed without the continuity correction
Assumptions – Chi Square test • The chi-square test is an asymptotic test, i.e. it is valid only when the sample size is large • The chi-square test treats the row totals and column totals of the data as fixed (i.e. not random)
Assumptions – 2 sample test of proportions • The Z test is also an asymptotic test: it assumes that the Central Limit Theorem holds for sample means (here, sample proportions), so it is appropriate only when the sample size is large • The Z test assumes that the proportions in each group being compared are random variables
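The equivalence can be checked numerically on the EFM data. The sketch below assumes the pooled-variance form of the two-sample Z statistic and compares its square with the chi-square statistic computed without the continuity correction. Variable names are illustrative.

```python
import math

x1, n1 = 358, 2850   # c-sections and deliveries, EFM exposed
x2, n2 = 229, 2974   # c-sections and deliveries, no EFM

# Two-sample Z test of proportions with a pooled estimate under Ho.
p1, p2 = x1 / n1, x2 / n2
pooled = (x1 + x2) / (n1 + n2)
standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
z = (p1 - p2) / standard_error

# Chi-square statistic for the same data, without continuity correction.
observed = [[x1, x2], [n1 - x1, n2 - x2]]
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)
chi_square = sum(
    (observed[i][j] - r * c / n) ** 2 / (r * c / n)
    for i, r in enumerate(row_totals)
    for j, c in enumerate(col_totals)
)

print(f"z = {z:.3f}, z^2 = {z ** 2:.3f}, chi-square = {chi_square:.3f}")
```

The two printed values agree up to floating-point rounding. Once the continuity correction is applied, the chi-square statistic is slightly smaller than z squared.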
Extending to multiple categories: r x c Tables e.g. Accuracy of Death Certificates
e.g. tabi 157 18 54 \ 268 44 34
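The `tabi` line above appears to be a Stata immediate-tabulate command for a 2x3 table. As a cross-check, the same chi-square statistic can be sketched in Python with the standard library. The function name `chi_square_rxc` is hypothetical, no continuity correction is applied (the slides reserve it for 2x2 tables), and the closed-form p-value used here holds only for exactly 2 degrees of freedom.

```python
import math

def chi_square_rxc(observed):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    statistic = 0.0
    for i, r in enumerate(row_totals):
        for j, c in enumerate(col_totals):
            expected = r * c / n
            statistic += (observed[i][j] - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df

# Counts copied from the tabi command: two rows, three columns.
table = [[157, 18, 54],
         [268, 44, 34]]
statistic, df = chi_square_rxc(table)

# With exactly 2 degrees of freedom the upper tail probability is exp(-x / 2).
p_value = math.exp(-statistic / 2) if df == 2 else None
print(f"chi-square = {statistic:.2f} on {df} df, p = {p_value:.2e}")
```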
Summary • Contingency tables • Analysis of 2x2 tables • Analysis of r x c tables • Equivalence between the chi-square test and the two-sample test of proportions