Comparing Two Proportions

A new twist on some old ideas…
• When comparing two proportions, we treat the proportions and their standard errors in much the same way as we did previously when comparing two sample means
• Make sure you are aware of the slightly different form the formulae take
• Ultimately, this will usually boil down to a test based on a z-score

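Means and Standard Deviations
The difference in sample proportions is
$D = \hat{p}_1 - \hat{p}_2$
When both sample sizes $n_1$ and $n_2$ are large, $D$ is approximately Normal with mean
$\mu_D = p_1 - p_2$
and standard deviation
$\sigma_D = \sqrt{\frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2}}$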
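A minimal Python sketch of these quantities may help. The function name and the counts are illustrative assumptions, not values from the slides. Because the population proportions are unknown in practice, the sketch plugs the sample proportions into the large-sample formula sqrt(p1(1-p1)/n1 + p2(1-p2)/n2), which gives the standard error of D.

```python
import math

def diff_and_se(x1, n1, x2, n2):
    """Return D = p1_hat - p2_hat and its estimated standard error.

    x1, x2 are success counts; n1, n2 are sample sizes.
    """
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    d = p1_hat - p2_hat
    # Large-sample formula with sample proportions substituted for p1, p2
    se_d = math.sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    return d, se_d

# Hypothetical counts, chosen only to illustrate the calculation
d, se_d = diff_and_se(60, 200, 45, 250)
print(f"D = {d:.3f}, SE_D = {se_d:.4f}")
```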

Significance tests, confidence intervals, margins of error etc…
• These concepts are still essentially the same
• Sometimes a Wilson estimate will be used; note the slight change in the formulae
• Make sure you grasp the subtle difference between the standard deviation and the standard error. The standard error is the standard deviation with the unknown population proportions replaced by their sample estimates. SE often “creeps in” unannounced.
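As a rough illustration of how the formulae change, the sketch below assumes the "plus four" form of the Wilson estimate for two samples, in which one success and one failure are added to each sample. The function name and the counts are hypothetical; check the textbook for the exact form used in the course.

```python
import math

def plus_four_diff(x1, n1, x2, n2):
    """Difference of plus-four (Wilson-type) estimates and its standard error.

    Assumption: each sample gets one extra success and one extra failure,
    so p_tilde = (x + 1) / (n + 2).
    """
    p1_t = (x1 + 1) / (n1 + 2)
    p2_t = (x2 + 1) / (n2 + 2)
    d_t = p1_t - p2_t
    se_t = math.sqrt(p1_t * (1 - p1_t) / (n1 + 2) + p2_t * (1 - p2_t) / (n2 + 2))
    return d_t, se_t

# Hypothetical small samples, where the adjustment matters most
d_t, se_t = plus_four_diff(5, 10, 2, 10)
print(f"D~ = {d_t:.3f}, SE = {se_t:.3f}")
```

The adjustment pulls each estimate slightly toward 0.5, which makes little difference for large samples.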

Examples…
• Look at Example 8.8
• Are men more frequent binge drinkers? How confident can you be of this result?
D = 0.227 − 0.170 = 0.057, or 5.7 percentage points
Choose a 95% confidence level: z* = 1.960, SE_D = 0.00622
m = z* × SE_D = 1.960 × 0.00622 = 0.012
The margin of error is 0.012, or 1.2 percentage points, so we can conclude that the proportion of binge drinkers among male college students is 5.7 percentage points higher than among female college students, with a margin of error of 1.2 percentage points, 19 times out of 20.
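The arithmetic in this example can be checked with a short Python sketch. It uses only the numbers quoted on the slide (0.227, 0.170 and SE_D = 0.00622); the variable names are illustrative, and z* is obtained from the standard library's Normal distribution rather than a table.

```python
from statistics import NormalDist

p_men, p_women = 0.227, 0.170   # sample proportions from the slide
se_d = 0.00622                  # standard error of D from the slide
confidence = 0.95

d = p_men - p_women
# z* leaves (1 - confidence) / 2 in each tail of the standard Normal
z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
m = z_star * se_d               # margin of error
lower, upper = d - m, d + m

print(f"z* = {z_star:.3f}")
print(f"D = {d:.3f}, margin of error = {m:.3f}")
print(f"95% CI: ({lower:.3f}, {upper:.3f})")
```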

How to abuse statistics!
• A well-meaning senate member of the university gets hold of the previous stats. She concludes that since 17% of female students and nearly 23% of male students are binge drinkers, the ratio of female to male binge drinkers is 17/23, or roughly 0.75 (75%). This, she argues, implies that men are 25% more likely to be binge drinkers than women, so we should enforce a dry-campus rule.
• Comment on this conclusion.

Pooled estimates…
• An estimate is pooled when it combines the data from two or more data sets
• For example, in our analysis of binge drinkers on campus we tacitly assumed that the difference of 5.7 percentage points between male and female drinkers was significant. How can we justify this?

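The incidence of binge drinking could be calculated via:
$\hat{p} = \frac{X_1 + X_2}{n_1 + n_2}$
where $X_1$ and $X_2$ are the numbers of binge drinkers in the two samples and $n_1$ and $n_2$ are the sample sizes
• This is a pooled estimate for the overall incidence of binge drinking, since it pools the results for both groups
• The estimate for the standard error takes on a slightly different form:
$SE_{Dp} = \sqrt{\hat{p}(1-\hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$
The subscript “p” signifies pooled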
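One way to justify the significance claim is a z-test built on the pooled estimate, sketched below in Python. The slides do not give the sample counts, so the counts used here are assumptions chosen to reproduce the quoted proportions 0.227 and 0.170. The pooled proportion is (X1 + X2)/(n1 + n2), and the pooled standard error is sqrt(p(1-p)(1/n1 + 1/n2)).

```python
import math

# ASSUMED counts (not given in the slides), consistent with 0.227 and 0.170
x_men, n_men = 1630, 7180
x_women, n_women = 1684, 9916

p_men = x_men / n_men
p_women = x_women / n_women
d = p_men - p_women

# Pooled estimate: combine both samples, as under H0: p1 = p2
p_pooled = (x_men + x_women) / (n_men + n_women)
se_pooled = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n_men + 1 / n_women))

z = d / se_pooled
# Two-sided P-value 2 * P(Z > |z|); erfc stays accurate far out in the tail
p_value = math.erfc(abs(z) / math.sqrt(2))

print(f"pooled p = {p_pooled:.4f}, SE_Dp = {se_pooled:.5f}")
print(f"z = {z:.2f}, two-sided P-value = {p_value:.3g}")
```

With these assumed counts the z-score is far out in the tail of the Normal distribution, which is the sense in which the observed difference is significant.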

Relative Risk
• Our “zealous” senate member was actually computing a relative risk (RR) when she expressed the ratio of the two proportions of binge drinkers on campus.
• RR is a useful statistic when comparing populations. In our example, 0.227/0.170 ≈ 1.34, so we could conclude that male college students are about 1.34 times as likely to be binge drinkers as female students.
• RR is expressed as
$RR = \frac{\hat{p}_1}{\hat{p}_2}$
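A short Python sketch of the relative risk calculation, using the two proportions quoted earlier (0.227 for men, 0.170 for women). The function name is illustrative. Note that the value of RR depends on which group goes in the numerator.

```python
def relative_risk(p1, p2):
    """Ratio of two proportions: how many times as likely group 1 is as group 2."""
    return p1 / p2

p_men, p_women = 0.227, 0.170

rr_men_vs_women = relative_risk(p_men, p_women)
rr_women_vs_men = relative_risk(p_women, p_men)

print(f"RR (men vs women) = {rr_men_vs_women:.2f}")
print(f"RR (women vs men) = {rr_women_vs_men:.2f}")
```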

Group Work…
• 8.37
• 8.42
• 8.48

In conclusion…
• Read the summary on page 595 carefully and note the similarity between the ideas developed here and those developed in Chapters 5 and 7
• Be sure to note the role of z-scores in this chapter
• Review the new terms: pooled data and relative risk
