Chapter 18 Cross-Tabulated Counts

shilah

Presentation Transcript


  1. Chapter 18 Cross-Tabulated Counts

  2. In Chapter 18: • 18.1 Types of Samples • 18.2 Naturalistic and Cohort Samples • 18.3 Chi-Square Test of Association • 18.4 Test for Trend • 18.5 Case-Control Samples • 18.6 Matched Pairs

  3. §18.1 Types of Samples • The prior chapter considered categorical response variables with two possible outcomes • This chapter considers categorical variables with any number of possible outcomes

  4. Types of Samples, cont. Data may be generated by: I. Naturalistic Samples. An SRS with data then cross-classified according to the explanatory variable and response variable. II. Purposive Cohort Samples. Fixed numbers of individuals selected according to the explanatory factor. III. Case-Control Samples. Fixed numbers of individuals selected according to the outcome variable.

  5. Naturalistic Samples: Take an SRS from the population; then cross-classify individuals with respect to explanatory and response variables.

  6. Purposive Cohort Samples: Select predetermined numbers of exposed and nonexposed individuals; then ascertain outcomes in individuals.

  7. Case-Control Samples: Identify individuals who are positive for the outcome (cases); then sample the population for individuals who are negative for the outcome (controls).

  8. §18.2 Naturalistic and Cohort Samples • Data from a naturalistic sample are shown in this 5-by-2 table • Let us always put the explanatory variable in the rows of such tables (for uniformity) • Totals are tallied in the table margins

  9. Marginal Distributions • For naturalistic samples only, describe the marginal distributions • These may be reported graphically or in terms of percentages • Top figure: column marginal distribution • Bottom figure: row marginal distribution

  10. Conditional Percents • The relationship between the row variable and column variable is explored with conditional percents. Two types of conditional percents: • Row percents: use in cohort and naturalistic samples (describe prevalence and incidence) • Column percents: use in case-control samples
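
The two kinds of conditional percents can be sketched in a few lines of Python. This is only an illustration: the function names and the 2-by-3 table of counts are hypothetical and do not come from the slides.

```python
def row_percents(table):
    """Express each cell as a percentage of its row total."""
    return [[100.0 * cell / sum(row) for cell in row] for row in table]


def column_percents(table):
    """Express each cell as a percentage of its column total."""
    col_totals = [sum(col) for col in zip(*table)]
    return [[100.0 * cell / col_totals[j] for j, cell in enumerate(row)]
            for row in table]


# Hypothetical counts: rows = explanatory groups, columns = outcome levels.
counts = [[30, 50, 20],
          [10, 60, 30]]

row_pct = row_percents(counts)      # each row sums to 100
col_pct = column_percents(counts)   # each column sums to 100
```

Row percents condition on the explanatory variable, which is why they suit cohort and naturalistic samples; column percents condition on the outcome, as in case-control samples.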

  11. Incidence and Prevalence (Naturalistic and Cohort Samples only) • The top table demonstrates R-by-C table notation (R rows and C columns) • For naturalistic and cohort samples, row percents in column 1 represent group incidences or prevalences

  12. Prevalences - Example This table shows prevalence by education level Example of calculation, prevalence group 1:

  13. Relative Risks, R-by-2 Tables Let group 1 represent the least exposed group Relative risks are calculated as follows:

  14. RRs, R-by-2 Tables, Example This table lists RR for the illustrative data Notice the downward dose-response in RRs Example of calculation
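
A minimal sketch of the prevalence and relative-risk calculations for an R-by-2 table, assuming the usual definition in which each group's proportion is divided by the proportion in group 1. The counts are hypothetical (the illustrative table did not survive in this transcript), and the function names are made up for the example.

```python
def prevalences(table):
    """Proportion with the outcome (column 1) in each row of an R-by-2 table."""
    return [positive / (positive + negative) for positive, negative in table]


def relative_risks(table):
    """Relative risk of each group versus group 1 (the first row)."""
    p = prevalences(table)
    return [p_i / p[0] for p_i in p]


# Hypothetical R-by-2 counts: [outcome positive, outcome negative] per group.
table = [[40, 60],
         [30, 70],
         [20, 80]]

p_hat = prevalences(table)   # 0.40, 0.30, 0.20
rr = relative_risks(table)   # 1.00, 0.75, 0.50
```

With these made-up counts the RRs fall steadily from group 1 to group 3, the same kind of downward dose-response pattern the slide points out.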

  15. Responses with More than Two Levels of Outcome Efficacy of Echinacea. A randomized controlled clinical trial pitted echinacea vs. placebo in the treatment of upper respiratory symptoms in children. The response variable was severity of illness classified as: mild, moderate or severe. Source: JAMA 2003, 290(21), 2824-30

  16. Echinacea, Conditional Distributions • Row percents are calculated to determine the incidence of each outcome. • Example of calculation, top right table cell (data on prior slide): % severe w/echinacea = 48 / 329 × 100% = 14.6% • Conclusion: the treatment group fared slightly worse than the control group: 14.6% of the treatment group experienced severe symptoms compared to 10.9% of the control group.

  17. §18.3 Chi-Square Test of Association A. Hypotheses. H0: no association in population versus Ha: association in population B. Test statistic. C. P-value. Convert the X²stat to a P-value with Table E or a software program.

  18. Chi-Square Test - Example Data below reveal a negative association between smoking and education level. Let us test H0: no association in the population vs. Ha: association in the population.

  19. χ², Expected Frequencies

  20. Chi-Square Statistic - Example
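
The formulas on these two slides were lost in the transcript. The sketch below assumes the standard Pearson calculation: each expected count is (row total × column total) / table total, the statistic sums (observed − expected)² / expected over all cells, and df = (R − 1)(C − 1). The 5-by-2 counts are hypothetical and are not the smoking-by-education data.

```python
def expected_counts(table):
    """Expected counts under H0: row total * column total / table total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[rt * ct / n for ct in col_totals] for rt in row_totals]


def pearson_chi_square(table):
    """Return (chi-square statistic, degrees of freedom) for an R-by-C table."""
    expected = expected_counts(table)
    stat = sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, expected)
               for o, e in zip(obs_row, exp_row))
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


# Hypothetical 5-by-2 table (rows = groups, columns = outcome yes / no).
observed = [[20, 80],
            [30, 70],
            [25, 75],
            [15, 85],
            [10, 90]]

chi2_stat, df = pearson_chi_square(observed)
```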

  21. Chi-Square Test, P-value • X²stat = 13.20 with 4 df • Using Table E, find the row for 4 df • Find the chi-square values in this row that bracket 13.20 • Bracketing values are 11.14 (P = .025) and 13.28 (P = .01) • Thus, .01 < P < .025 (closer to .01)

  22. Illustrative example: X²stat = 13.20 with 4 df. The P-value = AUC in the tail beyond X²stat
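
The tail area can also be computed directly rather than bracketed from a table. The sketch below relies on a closed form that holds only when df is even (which covers the 4 df here); for odd df a statistics library would be needed. The function name is hypothetical.

```python
import math


def chi_square_upper_tail(x, df):
    """Area under the chi-square curve to the right of x (even df only).

    For df = 2k the upper tail equals exp(-x/2) * sum_{j<k} (x/2)^j / j!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form applies only to even, positive df")
    half = x / 2.0
    return math.exp(-half) * sum(half ** j / math.factorial(j)
                                 for j in range(df // 2))


p_value = chi_square_upper_tail(13.20, 4)   # roughly 0.0103
```

The result falls between .01 and .025 and sits close to .01, in line with the Table E bracketing.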

  23. Chi-Square By Computer Here are results for the illustrative data from WinPepi > Compare2.exe > Program F Categorical Data

  24. Yates’ Continuity Corrected Chi-Square Statistic • Two different chi-square statistics are used in practice • Pearson’s chi-square statistic (covered) is X²stat = Σ (O − E)² / E • Yates’ continuity-corrected chi-square statistic is X²stat,c = Σ (|O − E| − 0.5)² / E, where O = observed count and E = expected count, summed over all cells • The continuity-corrected method produces smaller chi-square statistics and larger P-values.
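
To see how the two statistics differ, here is a small sketch for a 2-by-2 table. It assumes the textbook form of the correction, in which 0.5 is subtracted from each |observed − expected| before squaring; the counts and the function name are hypothetical.

```python
def chi_square_2x2(table, yates=False):
    """Pearson chi-square for a 2-by-2 table, optionally with Yates' correction."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    stat = 0.0
    for i, obs_row in enumerate(table):
        for j, observed in enumerate(obs_row):
            expected = row_totals[i] * col_totals[j] / n
            diff = abs(observed - expected)
            if yates:
                diff -= 0.5   # continuity correction
            stat += diff ** 2 / expected
    return stat


# Hypothetical counts: rows = exposure groups, columns = outcome yes / no.
counts = [[30, 70],
          [15, 85]]

pearson = chi_square_2x2(counts)               # about 6.45
corrected = chi_square_2x2(counts, yates=True) # about 5.62
```

The corrected statistic is the smaller of the two, so it yields the larger P-value.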

  25. Chi-Square, cont. 1. How the chi-square works. When observed values = expected values, the chi-square statistic is 0. As the differences between observed and expected values get larger, the chi-square statistic grows and evidence against H0 mounts. 2. Avoid chi-square tests in small samples. Do not use a chi-square test when more than 20% of the cells have expected values that are less than 5.
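
The small-sample rule is easy to check mechanically. The sketch below assumes expected counts are computed as (row total × column total) / table total; both tables and the function names are hypothetical.

```python
def fraction_small_expected(table, threshold=5.0):
    """Fraction of cells whose expected count is below the threshold."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    expected = [rt * ct / n for rt in row_totals for ct in col_totals]
    return sum(1 for e in expected if e < threshold) / len(expected)


def chi_square_is_reasonable(table):
    """Apply the rule: no more than 20% of cells with expected count < 5."""
    return fraction_small_expected(table) <= 0.20


small_table = [[2, 8], [3, 7]]       # expected counts 2.5 and 7.5
large_table = [[20, 80], [30, 70]]   # expected counts 25 and 75

small_ok = chi_square_is_reasonable(small_table)   # False
large_ok = chi_square_is_reasonable(large_table)   # True
```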

  26. Chi-Square, cont. 3. Supplement chi-squares with measures of association. Chi-square statistics do not measure the strength of association. Use descriptive statistics or RRs to quantify “strength”. 4. For 2-by-2 tables, chi-square and z tests (Ch 17) produce identical P-values. The relationship between the statistics is: X²stat = (zstat)²
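
The link between the two tests can be checked numerically. This sketch assumes the Chapter 17 z test is the usual two-proportion test with a pooled standard error and that the chi-square is Pearson's uncorrected statistic on a 2-by-2 table; the counts are hypothetical.

```python
import math


def z_two_proportions(x1, n1, x2, n2):
    """z statistic for H0: p1 = p2, using the pooled proportion."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se


def pearson_2x2(a, b, c, d):
    """Pearson chi-square for the 2-by-2 table [[a, b], [c, d]] (shortcut form)."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Hypothetical data: 30 of 100 positive in group 1, 15 of 100 in group 2.
z_stat = z_two_proportions(30, 100, 15, 100)
chi2_stat = pearson_2x2(30, 70, 15, 85)
```

Squaring z_stat reproduces chi2_stat up to floating-point rounding, which is why the two-sided z test and the chi-square test give the same P-value.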

  27. §18.4 Test for Trend: See pp. 431–436

  28. §18.5 Case-Control Samples Case-control sampling method • Identify all cases in the population • From the same source population, randomly select a series of non-cases (controls) • Ascertain the exposure status of cases and controls • Cross-tabulate the exposure status of cases and controls This provides an efficient way to study rare outcomes

  29. Incidence Density Sampling: As cases are identified in the population, select at random one or more noncases (controls) for each case at the time of occurrence. This advanced concept allows students to see that case-control studies are a type of longitudinal “time-failure” design.

  30. Case-Control Illustrative Example • Cases: men diagnosed with esophageal cancer • Controls: noncases selected at random from electoral lists in the same region • Exposure = alcohol consumption dichotomized at 80 g/day Interpretation: The rate ratio associated with high alcohol consumption is about 5.6

  31. (1 − α)100% CI for the OR: Note use of the natural logarithmic scale

  32. 90% CI for the OR – Example
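
The slide formulas are missing from the transcript, so the sketch below assumes the common log-scale interval: exp(ln OR ± z × SE), with SE = sqrt(1/a + 1/b + 1/c + 1/d). The four cell counts are assumptions for illustration only (the slide's table did not survive); they were picked so the odds ratio lands near the 5.6 quoted earlier. The function name is hypothetical.

```python
import math
from statistics import NormalDist


def odds_ratio_ci(exp_cases, exp_controls, unexp_cases, unexp_controls,
                  confidence=0.95):
    """Odds ratio and log-scale confidence interval for a 2-by-2 table."""
    odds_ratio = (exp_cases * unexp_controls) / (exp_controls * unexp_cases)
    se_ln = math.sqrt(1 / exp_cases + 1 / exp_controls
                      + 1 / unexp_cases + 1 / unexp_controls)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    lower = math.exp(math.log(odds_ratio) - z * se_ln)
    upper = math.exp(math.log(odds_ratio) + z * se_ln)
    return odds_ratio, lower, upper


# Assumed counts (not taken from the transcript).
or_hat, lo90, hi90 = odds_ratio_ci(96, 109, 104, 666, confidence=0.90)
```

Working on the log scale keeps the limits positive and makes the interval symmetric around ln OR rather than around the OR itself.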

  33. Case-Control - Example: Results from WinPepi > Compare2.exe > A. WinPepi uses a slightly different formula than ours; the Mid-P results are similar to ours.

  34. Case-Control Studies with Multiple Levels of Exposure With an ordinal exposure, compare each exposure level to the non-exposed group (next slide):

  35. Case-Control, Ordinal Levels of Exposure: Note the dose-response relationship
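
Comparing each exposure level with the non-exposed group amounts to one odds ratio per level. A minimal sketch, with hypothetical counts and a made-up function name:

```python
def odds_ratios_vs_reference(cases, controls):
    """Odds ratio for each exposure level versus level 0 (non-exposed).

    cases[i] and controls[i] are the counts at exposure level i.
    """
    return [(cases[i] * controls[0]) / (controls[i] * cases[0])
            for i in range(len(cases))]


# Hypothetical counts by increasing exposure level (index 0 = non-exposed).
cases = [20, 30, 40]
controls = [80, 60, 40]

ors = odds_ratios_vs_reference(cases, controls)   # 1.0, 2.0, 4.0
```

Odds ratios that climb with exposure level, as in this made-up example, are what a dose-response relationship looks like in case-control data.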

  36. §18.6 Matched Pairs • With matched-pair samples, each participant is carefully matched to a unique individual as part of the selection process • This technique is used to mitigate confounding by the matching factor • Both cohort and case-control samples may avail themselves of matching

  37. Here’s the notation for matched-pair case-control data: The odds ratio associated with exposure is: The confidence interval is:

  38. Matched Pairs - Example: A matched case-control study found 45 pairs in which the case but not the control had a low fruit/veg diet; it found 24 pairs in which the control but not the case had a low fruit/veg diet. The odds ratio suggests 88% higher risk in low fruit/veg consumers.

  39. Matched Pair Example, cont. Data are compatible with ORs between 1.14 and 3.07 WinPepi’s PairEtc.exe program A calculates exact confidence intervals for ORs from matched-pair data. Hand calculated limits will be similar except in small samples.
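
The matched-pair odds ratio and its limits can be reproduced from the two discordant-pair counts alone. The sketch assumes the usual hand formula, OR = (case-only exposed pairs) / (control-only exposed pairs) with limits exp(ln OR ± z × sqrt(1/45 + 1/24)), and assumes a 95% confidence level, which the transcript does not state. The function name is hypothetical.

```python
import math
from statistics import NormalDist


def matched_pair_or_ci(case_only, control_only, confidence=0.95):
    """Odds ratio and log-scale CI from discordant-pair counts."""
    odds_ratio = case_only / control_only
    se_ln = math.sqrt(1 / case_only + 1 / control_only)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    lower = math.exp(math.log(odds_ratio) - z * se_ln)
    upper = math.exp(math.log(odds_ratio) + z * se_ln)
    return odds_ratio, lower, upper


# 45 pairs: case exposed, control not; 24 pairs: control exposed, case not.
or_hat, lower, upper = matched_pair_or_ci(45, 24)
```

The odds ratio is 45 / 24 = 1.875, and the limits come out close to the 1.14 and 3.07 reported on the slide; small differences in the last digit reflect rounding of intermediate values.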

  40. Hypothesis Test, Matched Pairs A. H0: OR = 1 B. McNemar’s test statistic. C. P-values. Convert zstat to P-value with Table B or Table F If fewer than 5 discordancies are expected, use an exact binomial procedure (see text).

  41. Hypothesis Test, Example
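
The example itself is missing from the transcript. As a sketch, McNemar's statistic can be applied to the discordant-pair counts from the fruit/veg example, assuming the uncorrected form z = (b − c) / sqrt(b + c) and a two-sided P-value; whether the slide used these data or a continuity correction cannot be recovered. The function name is hypothetical.

```python
import math


def mcnemar_z(case_only, control_only):
    """McNemar's z statistic and two-sided P-value (no continuity correction)."""
    z = (case_only - control_only) / math.sqrt(case_only + control_only)
    # Two-sided normal tail area: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return z, p_two_sided


# Discordant pairs from the matched-pairs example: 45 and 24.
z_stat, p_value = mcnemar_z(45, 24)
```

Here z_stat is about 2.53, giving a two-sided P-value a little above .01, which agrees with a confidence interval for the OR that excludes 1.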
