Contingency analysis
Hypothesis testing • Sample → test statistic • Null hypothesis → null distribution • Compare: how unusual is this test statistic? • P < 0.05: reject H0 • P > 0.05: fail to reject H0
Using one tail in the χ² • We always use only one tail for a χ² test • Why?
χ² = 0: data match null expectation exactly • χ² > 0: data deviate from null expectation in some way
Reality vs. result • Reject H0 when H0 is true: Type I error • Reject H0 when H0 is false: correct • Do not reject H0 when H0 is true: correct • Do not reject H0 when H0 is false: Type II error
If null hypothesis is really true… • Do not reject H0: correct answer • Reject H0: Type I error (figure: distribution of the test statistic)
If null hypothesis is really false… • Do not reject H0: Type II error • Reject H0: correct (figure: distribution of the test statistic)
Errors and statistics • These are theoretical: you usually don’t know for sure if you’ve made an error • Pr[Type I error] = α (the significance level) • Pr[Type II error] = … • Requires power analysis • Depends on sample size
Contingency analysis • Estimates and tests for an association between two or more categorical variables
Odds ratio • Odds of success = probability of success divided by the probability of failure: O = p / (1 − p)
Estimating the odds ratio • Estimated odds of success = estimated probability of success divided by the estimated probability of failure: Ô = p̂ / (1 − p̂)
Example • Out of 48 bottles of wine, 40 were French • Estimated odds = (40/48) / (8/48) = 5 • Interpretation: people are about 5 times more likely to buy a French wine than not
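The odds calculation for this example can be sketched in a few lines of Python. The function name odds is an illustrative choice, not something from the slides.

```python
def odds(successes, total):
    """Estimated odds of success: p_hat / (1 - p_hat)."""
    p_hat = successes / total
    return p_hat / (1 - p_hat)

# 40 French bottles out of 48 sold
french_odds = odds(40, 48)
print(round(french_odds, 2))
```

Odds above 1 mean success is more likely than failure; odds below 1 mean the reverse.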
O < 1: failure more likely • O = 1: success and failure equally likely • O > 1: success more likely
Odds ratio • The odds of success in one group divided by the odds of success in a second group: OR = O1 / O2
Estimating the odds ratio • The estimated odds of success in one group divided by the estimated odds of success in a second group
Music and wine buying • Group 1 = French music, Group 2 = German music • Success = French wine
Group 2 • Out of 34 bottles of wine, 12 were French
Music and wine buying • Group 1 = French music, Group 2 = German music • Success = French wine • Estimated odds ratio = (40/8) / (12/22) ≈ 9.2 • Interpretation: the odds of buying French wine are about 9 times higher in Group 1 than in Group 2
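A minimal sketch of the odds ratio for the two music groups, using the counts from the example (40 of 48 bottles French with French music, 12 of 34 with German music). The helper names odds and odds_ratio are illustrative.

```python
def odds(successes, total):
    """Estimated odds of success: p_hat / (1 - p_hat)."""
    p_hat = successes / total
    return p_hat / (1 - p_hat)

def odds_ratio(s1, n1, s2, n2):
    """Odds of success in group 1 divided by odds of success in group 2."""
    return odds(s1, n1) / odds(s2, n2)

# Group 1 = French music, Group 2 = German music, success = French wine
or_hat = odds_ratio(40, 48, 12, 34)
print(round(or_hat, 2))
```

An odds ratio of 1 would mean the odds of buying French wine are the same under both kinds of music.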
OR < 1: success more likely in Group 2 • OR = 1: success equally likely in both groups • OR > 1: success more likely in Group 1
Hypothesis testing • Contingency analysis • Is there a difference in odds between two groups?
Hypothesis testing • Contingency analysis • Is there an association between two categorical variables?
Contingency analysis • Is there a difference in the odds of buying French wine depending on the music that is playing? • Is there an association between wine bought and music playing? • Is the nationality of the wine independent of the music playing when it is sold?
Hypotheses • H0: The nationality of the bottle of wine is independent of the nationality of the music played when it is sold. • HA: The nationality of the bottle of wine sold depends on the nationality of the music being played when it is sold.
Calculating the expectations • With independence, Pr[French wine AND French music] = Pr[French wine] × Pr[French music]
Calculating the expectations • Pr[French wine] = 52/82 = 0.634 • Pr[French music] = 48/82 = 0.585 • By H0, Pr[French wine AND French music] = (0.634)(0.585) = 0.371
Calculating the expectations • By H0, Pr[French wine AND French music] = 0.371, so the expected number of French bottles sold with French music = 0.371 × 82 = 30.4
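The same calculation can be applied to every cell at once. This sketch assumes the 2 x 2 table implied by the example counts: rows are music (French, German), columns are wine (French, not French).

```python
# Observed counts: rows = music (French, German), columns = wine (French, not French)
observed = [[40, 8],
            [12, 22]]

row_totals = [sum(row) for row in observed]        # bottles sold under each music type
col_totals = [sum(col) for col in zip(*observed)]  # bottles of each wine type
n = sum(row_totals)                                # all bottles

# Under independence, expected count = (row total) * (column total) / n
expected = [[r * c / n for c in col_totals] for r in row_totals]

for row in expected:
    print([round(e, 1) for e in row])
```

Multiplying the row and column totals and dividing by n is the same as multiplying the two marginal probabilities and then scaling by the total count.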
Degrees of freedom • For a χ² contingency test, df = # categories − 1 − # parameters, which works out to df = (# columns − 1)(# rows − 1) • For the music/wine example, df = (2 − 1)(2 − 1) = 1
Conclusion • χ² = 20.0 >> χ²(df = 1, α = 0.05) = 3.84, so we can reject the null hypothesis of independence and say that the nationality of the wine sold did depend on what music was played.
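A sketch of the test statistic and its P-value for the music/wine table, using only the Python standard library. The P-value shortcut via math.erfc is valid only for df = 1, and the function names are illustrative. Computed from unrounded expectations, the statistic should land close to the 20.0 quoted above.

```python
import math

# Rows = music (French, German), columns = wine (French, not French)
observed = [[40, 8],
            [12, 22]]

def expected_counts(table):
    """Expected counts under independence: row total * column total / n."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    n = sum(rows)
    return [[r * c / n for c in cols] for r in rows]

def chi_square(table):
    """Sum over cells of (observed - expected)^2 / expected."""
    exp = expected_counts(table)
    return sum((o - e) ** 2 / e
               for row_o, row_e in zip(table, exp)
               for o, e in zip(row_o, row_e))

def p_value_df1(x):
    """Right-tail probability of a chi-square distribution with df = 1 only."""
    return math.erfc(math.sqrt(x / 2))

chi2 = chi_square(observed)
p = p_value_df1(chi2)
print(round(chi2, 1), p < 0.05)
```

Only the right tail is used because any deviation from the expectations, in either direction, makes the statistic larger.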
Assumptions • This χ² test is just a special case of the χ² goodness-of-fit test, so the same rules apply. • No expected count may be less than 1, and no more than 20% of the expected counts may be less than 5
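These two rules are easy to check mechanically. The sketch below assumes the expected counts are held in a list of lists; the function name expectations_ok is hypothetical.

```python
def expectations_ok(expected):
    """True if no expected count is below 1 and at most 20% are below 5."""
    cells = [e for row in expected for e in row]
    none_below_one = min(cells) >= 1
    share_below_five = sum(e < 5 for e in cells) / len(cells)
    return none_below_one and share_below_five <= 0.20

# Expected counts for the music/wine table (row total * column total / 82)
wine_expected = [[48 * 52 / 82, 48 * 30 / 82],
                 [34 * 52 / 82, 34 * 30 / 82]]
print(expectations_ok(wine_expected))
```

In a 2 x 2 table a single expected count below 5 is already 25% of the cells, so the check fails as soon as any one cell drops below 5.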
Fisher’s exact test • For 2 x 2 contingency analysis • Does not make assumptions about the size of expectations • JMP will do it, but cumbersome to do by hand
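For readers without JMP, here is a minimal standard-library sketch of Fisher's exact test for a 2 x 2 table. It assumes one common two-sided convention: sum the probabilities of all tables with the same margins that are no more probable than the observed one. The function name is illustrative.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P-value for the table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2
    denom = comb(n, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo = max(0, col1 - row2)   # smallest feasible top-left count
    hi = min(row1, col1)       # largest feasible top-left count
    p_obs = prob(a)
    # Small tolerance so ties with the observed table are counted
    return sum(p for p in (prob(x) for x in range(lo, hi + 1))
               if p <= p_obs * (1 + 1e-9))

# Music/wine table: rows = music (French, German), columns = wine (French, not French)
p_value = fisher_exact_two_sided(40, 8, 12, 22)
print(p_value < 0.05)
```

Because the probabilities are computed exactly from the counts, nothing here depends on the size of the expected values.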
Other extensions you might see • Yates correction for continuity • G-test • Read about these in your book
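As a rough illustration of the first two extensions, the sketch below computes the Yates-corrected χ² and the G statistic for the music/wine table using their standard textbook formulas. It assumes every observed count is greater than zero, and the function names are illustrative.

```python
import math

# Rows = music (French, German), columns = wine (French, not French)
observed = [[40, 8],
            [12, 22]]

def expected_counts(table):
    """Expected counts under independence: row total * column total / n."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    n = sum(rows)
    return [[r * c / n for c in cols] for r in rows]

def cell_pairs(table):
    """Yield (observed, expected) for every cell."""
    exp = expected_counts(table)
    for row_o, row_e in zip(table, exp):
        yield from zip(row_o, row_e)

def yates_chi_square(table):
    """Chi-square with each |O - E| reduced by 0.5 before squaring."""
    return sum((abs(o - e) - 0.5) ** 2 / e for o, e in cell_pairs(table))

def g_statistic(table):
    """G = 2 * sum of O * ln(O / E); assumes every observed count is > 0."""
    return 2 * sum(o * math.log(o / e) for o, e in cell_pairs(table))

yates = yates_chi_square(observed)
g = g_statistic(observed)
print(round(yates, 1), round(g, 1))
```

Both statistics are compared with the same χ² distribution with df = 1, so the critical value of 3.84 applies here as well.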