Topic 8. Association between Two Qualitative Variables: Chi-square Test
Example • Daniel B. Mark (1994, NEJM) reported on the quality of life of heart attack patients in Canada and the United States. The study looked at many outcomes a year after the heart attack. There was no significant difference in the patients’ survival rate. Another key outcome was the patients’ own assessment of their quality of life relative to what it had been before the heart attack. Here are the data for the patients who survived a year. A table of this kind is called a (2-way) contingency table.
Questions Regarding the Previous Example • Identify the two variables in the example. Are they qualitative? Which one is the response variable, and which is the explanatory variable? • Find the distribution of quality of life for patients from Canada. • Find the distribution of quality of life for patients from the US. • Is there a significant difference between the two distributions?
Layout of an r × c Contingency Table • A contingency table with r rows and c columns (rc cells in total) looks like this:
Expected Counts for the Quality of Life Example • Keep at least 2 decimal places.
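The expected count for a cell is its row total times its column total, divided by the grand total. A minimal Python sketch of that calculation follows. The function name `expected_counts` and the 2 × 3 table are hypothetical illustrations, not the quality-of-life data from the slide.

```python
def expected_counts(observed):
    """Expected count per cell: (row total * column total) / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


# Hypothetical 2 x 3 table of observed counts (NOT the data from the slide).
observed = [
    [20, 30, 50],
    [30, 30, 40],
]

for row in expected_counts(observed):
    # Keep at least 2 decimal places when reporting expected counts.
    print([f"{e:.2f}" for e in row])
```

Because both hypothetical rows have the same total, every column's expected count is split evenly between them.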
The Chi-square Test of Homogeneity • Hypotheses: H0: homogeneity (the distribution of the response is the same in every population) against H1: not homogeneous. • Test statistic: $\chi^2 = \sum (O - E)^2 / E$, summed over all cells, where O is the observed cell frequency and E is the expected cell frequency. • P-value: found with the chi-square table using degrees of freedom = (r – 1)(c – 1), where r is the number of rows and c is the number of columns. Chi-square tests are always right-sided: the P-value is the area of the right tail under the chi-square curve beyond the value of the test statistic.
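The whole test can be sketched in a few lines of Python using only the standard library. The function names `chi_square_sf` and `homogeneity_test` and the 2 × 3 table are hypothetical; the right-tail area uses the closed-form expressions for integer degrees of freedom in place of a printed chi-square table.

```python
import math


def chi_square_sf(x, df):
    """Right-tail area P(X >= x) for a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df = 2m: exp(-x/2) * sum_{j=0}^{m-1} (x/2)^j / j!
        terms = (half ** j / math.factorial(j) for j in range(df // 2))
        return math.exp(-half) * sum(terms)
    # Odd df = 2m + 1:
    # erfc(sqrt(x/2)) + exp(-x/2) * sum_{j=1}^{m} (x/2)^(j - 1/2) / Gamma(j + 1/2)
    m = df // 2
    tail = sum(half ** (j - 0.5) / math.gamma(j + 0.5) for j in range(1, m + 1))
    return math.erfc(math.sqrt(half)) + math.exp(-half) * tail


def homogeneity_test(observed):
    """Return (chi-square statistic, degrees of freedom, P-value) for an r x c table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n  # expected count for cell (i, j)
            statistic += (o - e) ** 2 / e
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df, chi_square_sf(statistic, df)


# Hypothetical 2 x 3 table of observed counts (NOT the data from the slide).
observed = [
    [20, 30, 50],
    [30, 30, 40],
]
statistic, df, p_value = homogeneity_test(observed)
print(f"chi-square = {statistic:.2f}, df = {df}, P-value = {p_value:.3f}")
```

H0 is rejected at significance level α when the P-value is below α.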
Example: Quality of Life • Test the claim that the distribution of quality of life is the same in Canada and the United States. Use a chi-square test at a significance level of 0.05.
The Chi-square Test for Goodness-of-Fit • Hypotheses: H0: p1 = p10, p2 = p20, …, pk = pk0 against H1: at least one pi ≠ pi0 (called the goodness-of-fit test). • Test statistic: $\chi^2 = \sum_{i=1}^{k} (O_i - E_i)^2 / E_i$, where k = number of categories, $O_i$ is the observed count in category i, and $E_i = n p_{i0}$ is the expected count. • P-value: large values of the test statistic suggest rejection of the null hypothesis, so goodness-of-fit tests are always right-tailed, and the P-value is the area of the right tail under the chi-square curve (with k – 1 degrees of freedom) beyond the value of the test statistic.
Example: Goodness-of-Fit Test • A die is tossed 300 times with the following results. Is the die fair or biased? • We test: H0: p1 = p2 = … = p6 = 1/6 (die is fair) against H1: at least one pi is different from 1/6 (die is biased)
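The setup for the die can be written out using only n = 300 and the fair-die null hypothesis; the observed counts $O_i$ come from the results table. The significance level 0.05 is an assumption here, since the slide does not state one for this example. The value 11.070 is the standard chi-square table entry for 6 – 1 = 5 degrees of freedom.

```latex
E_i = n\,p_{i0} = 300 \times \tfrac{1}{6} = 50, \qquad i = 1, \dots, 6

\chi^2 = \sum_{i=1}^{6} \frac{(O_i - 50)^2}{50}, \qquad
\text{reject } H_0 \text{ at } \alpha = 0.05 \text{ if } \chi^2 > \chi^2_{0.05,\,5} = 11.070
```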
Example: Fisher’s Reexamination of Mendel’s Data • Mendel crossed 556 smooth, yellow male peas with wrinkled, green female peas. According to now-established genetic theory, the relative frequencies of the progeny should be as given in the following table.

Type              Observed count   Relative frequency
Smooth yellow     315              9/16
Smooth green      108              3/16
Wrinkled yellow   102              3/16
Wrinkled green    31               1/16

• Are the data consistent with the Mendelian expected ratios of 9:3:3:1 for the four types?
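One way to work this example is sketched below in Python, using the observed counts and the 9:3:3:1 ratios given above. The variable names are illustrative. The significance level 0.05 is an assumption, since the slide does not state one, and 7.815 is the standard table value for 3 degrees of freedom. The P-value line uses the closed-form right-tail area that holds for exactly 3 degrees of freedom.

```python
import math

# Observed counts and theoretical ratios, as given in the example.
observed = {
    "Smooth yellow": 315,
    "Smooth green": 108,
    "Wrinkled yellow": 102,
    "Wrinkled green": 31,
}
ratio = {
    "Smooth yellow": 9,
    "Smooth green": 3,
    "Wrinkled yellow": 3,
    "Wrinkled green": 1,
}

n = sum(observed.values())         # total number of progeny
ratio_total = sum(ratio.values())  # 9 + 3 + 3 + 1 = 16

# Expected counts E_i = n * p_i0.
expected = {t: n * ratio[t] / ratio_total for t in observed}

# Chi-square statistic: sum of (O_i - E_i)^2 / E_i over the k = 4 categories.
statistic = sum((observed[t] - expected[t]) ** 2 / expected[t] for t in observed)
df = len(observed) - 1             # k - 1 = 3

# Right-tail area for df = 3 only: erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2).
p_value = (math.erfc(math.sqrt(statistic / 2))
           + math.sqrt(2 * statistic / math.pi) * math.exp(-statistic / 2))

CRITICAL_VALUE_05_DF3 = 7.815      # chi-square table entry, alpha = 0.05, df = 3 (assumed alpha)

for t in observed:
    print(f"{t:16s} O = {observed[t]:3d}  E = {expected[t]:.2f}")
print(f"chi-square = {statistic:.3f}, df = {df}, P-value = {p_value:.3f}")
print("Reject H0" if statistic > CRITICAL_VALUE_05_DF3 else "Do not reject H0")
```

The statistic comes out near 0.60, far below 7.815, so at the assumed 0.05 level these counts are consistent with the 9:3:3:1 ratios.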
When Can We Safely Use the Chi-square Test? • All expected cell counts are at least 1, and no more than 20% of them are less than 5.
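This rule of thumb is easy to check mechanically once the expected counts are known. A small Python sketch follows; the helper name `chi_square_conditions_ok` is hypothetical, and apart from the Mendel expected counts (556 times 9/16, 3/16, 3/16, 1/16) the inputs are made up for illustration.

```python
def chi_square_conditions_ok(expected):
    """True if every expected count is >= 1 and at most 20% of them are < 5."""
    cells = list(expected)
    if not cells:
        return False
    all_at_least_one = all(e >= 1 for e in cells)
    small = sum(1 for e in cells if e < 5)
    # "No more than 20% are less than 5", written with integers to avoid rounding.
    return all_at_least_one and 5 * small <= len(cells)


# Expected counts from the Mendel example.
print(chi_square_conditions_ok([312.75, 104.25, 104.25, 34.75]))

# Hypothetical expected counts that break the rule.
print(chi_square_conditions_ok([0.8, 6.0, 7.0]))       # one count is below 1
print(chi_square_conditions_ok([4.0, 4.0, 6.0, 7.0]))  # half of the counts are below 5
```

For a two-way table, flatten the rows into one list of expected counts before calling the helper.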