Section 4.4: Contingency Tables and Association
• Contingency table
  • What and why a contingency table
  • Marginal distribution
  • Conditional distribution
• Simpson’s Paradox
  • What is it?
  • What causes it?
Contingency tables are for summarizing bivariate (or multivariate) qualitative data.

sex     height  shoe  eyes   hair        hand
male    70      9     brown  brown       right
male    71      11    blue   blond       left
male    73      11.5  blue   blond       right
female  64      7     brown  black       right
male    66      7.5   brown  lightbrown  right
female  63      6.5   brown  black       right
female  64      6.5   blue   red         right
male    72      10    brown  blond       left
male    66      8.5   green  lightbrown  right
female  67      8     brown  lightbrown  right
male    74      11.5  brown  brown       left
male    72      12    blue   brown       right
female  68      8.5   blue   lightbrown  right
male    78      12    blue   blond       right
male    70      12    green  blond       right
female  68      8     blue   red         both
female  68      9.5   green  brown       left
female  66      7     blue   blond       right
male    66      10    brown  brown       right
::::    ::      ::    :::::  :::::       :::::
Contingency table results:
Rows: hair
Columns: eyes

Often it is arbitrary which variable gets to be the row variable.
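As a minimal sketch of how such a table is tallied, the Python below counts hair color against eye color for only the 19 students listed in the data excerpt. The full data set is not reproduced in the text, so these counts will not match a table built from the whole class. The names `pairs`, `counts`, and `table` are illustrative choices, not part of the original slides.

```python
from collections import Counter

# (eyes, hair) for the 19 students visible in the data excerpt only.
pairs = [
    ("brown", "brown"), ("blue", "blond"), ("blue", "blond"),
    ("brown", "black"), ("brown", "lightbrown"), ("brown", "black"),
    ("blue", "red"), ("brown", "blond"), ("green", "lightbrown"),
    ("brown", "lightbrown"), ("brown", "brown"), ("blue", "brown"),
    ("blue", "lightbrown"), ("blue", "blond"), ("green", "blond"),
    ("blue", "red"), ("green", "brown"), ("blue", "blond"),
    ("brown", "brown"),
]

# Count each (hair, eyes) combination; a Counter returns 0 for unseen pairs.
counts = Counter((hair, eyes) for eyes, hair in pairs)
hair_levels = sorted({hair for _, hair in pairs})
eye_levels = sorted({eyes for eyes, _ in pairs})

# Rows: hair, Columns: eyes
table = {h: {e: counts[(h, e)] for e in eye_levels} for h in hair_levels}

# Print the table with one row per hair color.
print("hair".ljust(12) + "".join(e.rjust(7) for e in eye_levels))
for h in hair_levels:
    print(h.ljust(12) + "".join(str(table[h][e]).rjust(7) for e in eye_levels))
```

Swapping the roles of hair and eyes transposes the table without changing any count, which is why the choice of row variable is often arbitrary.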
Displaying three variables (sex, eye color, hair color). We will focus on two variables.

Contingency table results for sex=female:
Rows: eyes
Columns: hair

Contingency table results for sex=male
Survival of the 793 adult male passengers, by 1st-, 2nd-, and 3rd-class fare: http://www.encyclopedia-titanica.org/titanic-statistics.html
Relative frequency marginal distribution (shown in parentheses):
• The margins show the relative amount in each row or column.
• The relative frequencies in each margin add to one.
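The Titanic counts are not reproduced in the text, so the sketch below illustrates marginal distributions with a stand-in: the hair-by-eyes counts tallied from the 19 students listed in the data excerpt. The variable names are illustrative assumptions. Each margin is a row or column total divided by the grand total.

```python
# Hair-by-eyes counts tallied from the 19 listed students (a stand-in for
# the Titanic table, whose counts are not given in the text).
counts = {
    "black":      {"blue": 0, "brown": 2, "green": 0},
    "blond":      {"blue": 4, "brown": 1, "green": 1},
    "brown":      {"blue": 1, "brown": 3, "green": 1},
    "lightbrown": {"blue": 1, "brown": 2, "green": 1},
    "red":        {"blue": 2, "brown": 0, "green": 0},
}

total = sum(sum(row.values()) for row in counts.values())

# Row margin: share of all students with each hair color.
row_marginal = {h: sum(row.values()) / total for h, row in counts.items()}

# Column margin: share of all students with each eye color.
eye_levels = list(next(iter(counts.values())))
col_marginal = {
    e: sum(row[e] for row in counts.values()) / total for e in eye_levels
}

for h, p in row_marginal.items():
    print(f"hair={h}: {p:.3f}")
for e, p in col_marginal.items():
    print(f"eyes={e}: {p:.3f}")
```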
Conditional Distribution
Either the rows or the columns add to one (100%), depending on which variable is conditioned on.
Percentages conditioned on survival status
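A minimal sketch of both ways to condition, again using the hair-by-eyes counts from the 19 listed students as a stand-in for the Titanic table (whose counts are not in the text). The helper names `condition_on_rows` and `condition_on_columns` are hypothetical. Conditioning on the row variable makes each row add to one; conditioning on the column variable makes each column add to one.

```python
# Hair-by-eyes counts tallied from the 19 listed students (stand-in data).
counts = {
    "black":      {"blue": 0, "brown": 2, "green": 0},
    "blond":      {"blue": 4, "brown": 1, "green": 1},
    "brown":      {"blue": 1, "brown": 3, "green": 1},
    "lightbrown": {"blue": 1, "brown": 2, "green": 1},
    "red":        {"blue": 2, "brown": 0, "green": 0},
}


def condition_on_rows(table):
    """Divide each cell by its row total (assumes no row total is zero)."""
    result = {}
    for r, row in table.items():
        row_total = sum(row.values())
        result[r] = {c: n / row_total for c, n in row.items()}
    return result


def condition_on_columns(table):
    """Divide each cell by its column total (assumes no column total is zero)."""
    col_totals = {}
    for row in table.values():
        for c, n in row.items():
            col_totals[c] = col_totals.get(c, 0) + n
    return {
        r: {c: n / col_totals[c] for c, n in row.items()}
        for r, row in table.items()
    }


eyes_given_hair = condition_on_rows(counts)     # each row adds to one
hair_given_eyes = condition_on_columns(counts)  # each column adds to one

print(eyes_given_hair["blond"])
print({h: row["blue"] for h, row in hair_given_eyes.items()})
```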
• What proportion of the passengers were women and children?
• What proportion of the passengers were lost?
• What proportion of the women and children were lost?
• Of the passengers who were lost, what proportion were women and children?
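The four questions above correspond to two marginal proportions and two conditional proportions with different denominators. The sketch below shows the arithmetic with placeholder counts that are invented purely for illustration; they are NOT the Titanic figures, which are not given in the text. All names are hypothetical.

```python
# PLACEHOLDER counts invented for illustration; NOT the Titanic figures.
counts = {
    ("women_children", "survived"): 30,
    ("women_children", "lost"): 10,
    ("adult_men", "survived"): 20,
    ("adult_men", "lost"): 40,
}

total = sum(counts.values())
wc = sum(n for (g, s), n in counts.items() if g == "women_children")
lost = sum(n for (g, s), n in counts.items() if s == "lost")
wc_and_lost = counts[("women_children", "lost")]

# Marginal proportions: the denominator is everyone.
p_wc = wc / total
p_lost = lost / total

# Conditional proportions: the denominator is the group being conditioned on.
p_lost_given_wc = wc_and_lost / wc    # among women and children, share lost
p_wc_given_lost = wc_and_lost / lost  # among the lost, share women and children

print(p_wc, p_lost, p_lost_given_wc, p_wc_given_lost)
```

The last two values share a numerator but divide by different totals, so they generally differ.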
Simpson’s Paradox: Example
Hypothetical graduate school acceptance data: men do better.
But if a third variable is accounted for, the story changes… Women actually do better.
Simpson’s Paradox represents a situation in which an association between two variables inverts or goes away when a third variable is introduced to the analysis. See: http://users.humboldt.edu/rizzardi/Handouts.dir/SimpsonParadoxExample.xlsx
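The reversal can be reproduced with a small sketch. The counts below are invented for illustration and are not the numbers from the slides or the linked spreadsheet; the department labels and function names are likewise hypothetical. Women have the higher acceptance rate within each department, yet men have the higher rate once the departments are pooled.

```python
# Invented counts for illustration: (admitted, applicants) per department.
data = {
    "Dept A": {"men": (80, 100), "women": (18, 20)},
    "Dept B": {"men": (4, 20), "women": (30, 100)},
}


def rate(admitted, applicants):
    """Acceptance rate as a proportion."""
    return admitted / applicants


def overall_rate(sex):
    """Acceptance rate for one sex after pooling all departments."""
    admitted = sum(dept[sex][0] for dept in data.values())
    applicants = sum(dept[sex][1] for dept in data.values())
    return rate(admitted, applicants)


# Within each department (third variable accounted for).
for name, dept in data.items():
    print(name, "men:", rate(*dept["men"]), "women:", rate(*dept["women"]))

# Pooled over departments (third variable ignored).
print("Overall men:", overall_rate("men"), "women:", overall_rate("women"))
```

In this made-up data the reversal occurs because most women apply to the department with the lower acceptance rate, while most men apply to the department with the higher one, so the pooled rates mix the two departments in very different proportions.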