Chapter 22 Comparing Two Proportions
Comparing Two Proportions • Comparisons between two percentages are much more common (and interesting) than questions about isolated percentages. • Why? We often want to know how two groups differ, whether a treatment is better than a placebo control, or whether this year’s results are better than last year’s.
Example Suppose you are interested in whether men and women differ in how often they wash their hands in public restrooms.
Another Standard Deviation… • In order to examine the difference between two proportions, we need another standard deviation formula… • Recall that standard deviations don’t add, but variances do.
Example from Chapter 16 Two empty fields are used as parking lots for concerts and festivals. The number of vehicles that can park in Lot A has a mean of 219 and a standard deviation of 13. Lot B can hold an average of 193 cars with a standard deviation of 11. a. What is the expected difference in the number of vehicles parked in the two lots? b. Find the standard deviation of that difference.
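A quick check of the parking-lot example, as a minimal sketch. It assumes the two lots fill independently of each other; the slide does not say so, but adding variances requires it. The variable names are illustrative only.

```python
import math

# Values from the parking-lot example.
mean_a, sd_a = 219, 13   # Lot A
mean_b, sd_b = 193, 11   # Lot B

# a. Means subtract directly.
expected_diff = mean_a - mean_b

# b. Assumption: the lots are independent, so the variances ADD
#    (even though we are looking at a difference).
var_diff = sd_a**2 + sd_b**2
sd_diff = math.sqrt(var_diff)

print(f"Expected difference: {expected_diff} vehicles")
print(f"SD of the difference: {sd_diff:.2f} vehicles")
```

Note that the standard deviation of the difference (the square root of 290, about 17) is smaller than 13 + 11 = 24, which is why standard deviations cannot simply be added.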
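The Standard Deviation of the Difference Between Two Proportions • Proportions observed in independent random samples are independent. Thus, we can add their variances. So… • The standard deviation (really “standard error”) of the difference between two sample proportions is $SE(\hat{p}_1 - \hat{p}_2) = \sqrt{\dfrac{\hat{p}_1\hat{q}_1}{n_1} + \dfrac{\hat{p}_2\hat{q}_2}{n_2}}$, where $\hat{q}_1 = 1 - \hat{p}_1$ and $\hat{q}_2 = 1 - \hat{p}_2$.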
Assumptions and Conditions • SRS (or RAT): EACH sample is an SRS from its own population (or 2 experimental groups randomly assigned to treatments) • 10% Condition: n1 and n2 are both < 10% of their respective populations • Sample Size Condition (normality): Both groups are big enough that at least 5 successes and 5 failures have been observed in each. • Independent Samples: The two groups we’re comparing must be independent of each other (i.e., drawn independently).
The Sampling Distribution • We already know that for large enough samples, each of our proportions has an approximately Normal sampling distribution. • The same is true of their difference.
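Two-Proportion z-Interval • When the conditions are met, we are ready to find the confidence interval for the difference of two proportions: • The confidence interval is $(\hat{p}_1 - \hat{p}_2) \pm z^* \times SE(\hat{p}_1 - \hat{p}_2)$, where $SE(\hat{p}_1 - \hat{p}_2) = \sqrt{\dfrac{\hat{p}_1\hat{q}_1}{n_1} + \dfrac{\hat{p}_2\hat{q}_2}{n_2}}$ and $z^*$ is the critical value for the chosen confidence level.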
Conclusion Statement We are ___% confident that the true proportion of [p1 in context] is between ___% and ___% [more/less] than the proportion of [p2 in context].
Back to the example… Suppose you are interested in whether men and women differ in how often they wash their hands in public restrooms. Researchers monitored the behavior of public restroom users at major venues such as Turner Field and Grand Central Station and found that 2393 out of 3206 men washed their hands and 2802 of 3130 women washed their hands. Create a 95% confidence interval to describe the difference.
Example At Community Hospital, the burn center is experimenting with a new plasma compress treatment. A random sample of 316 patients with minor burns received the plasma compress treatment. Of these patients, it was found that 259 had no visible scars after treatment. Another random sample of 419 patients with minor burns received no plasma compress treatment. For this group, it was found that 94 had no visible scars after treatment. What is a 95% confidence interval of the difference in proportion of people who had no visible scars between the plasma compress treatment & control group?
pT: proportion of people who received the plasma compress treatment and had no visible scars
pN: proportion of people who did NOT receive the plasma compress treatment and had no visible scars
Since these are all burn patients (they come from the same population), we can add 316 + 419 = 735. If the populations are not the same, you MUST list them separately.
• Have 2 independent, randomly assigned treatment groups
• 735 < 10% of all burn patients
• nTpT = 259, nTqT = 57, nNpN = 94, nNqN = 325 -> all > 5, so we can use the Normal model (all the p’s and q’s have hats)
2-Proportion Z-Interval
We are 95% confident that the true proportion of people who received the plasma compress treatment and had no visible scars is between 53.7% and 65.4% more than the proportion of those who didn’t receive the treatment.
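The interval for the burn-treatment example can be reproduced with a short script. This is a minimal sketch, not the slide author's method: the function name `two_prop_z_interval` and its parameters are made up for illustration, and the critical value z* comes from Python's standard-library `statistics.NormalDist`.

```python
import math
from statistics import NormalDist

def two_prop_z_interval(x1, n1, x2, n2, confidence=0.95):
    """Confidence interval for p1 - p2 from success counts x1, x2
    and sample sizes n1, n2 (illustrative helper)."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    # Unpooled standard error: each group keeps its own proportion.
    se = math.sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    # Critical value z* for the requested confidence level.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    diff = p1_hat - p2_hat
    return diff - z_star * se, diff + z_star * se

# Burn example: 259 of 316 treated, 94 of 419 untreated had no visible scars.
low, high = two_prop_z_interval(259, 316, 94, 419)
print(f"95% CI for pT - pN: {low:.1%} to {high:.1%}")
```

The result should agree with the 53.7% to 65.4% stated above. The same function could be called with the hand-washing counts from the earlier example (2393 of 3206 men, 2802 of 3130 women).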
Ch22 (page 433) #6 • In 1995, 24.8% of 550 white adults surveyed reported that they smoke cigarettes, while 25.7% of the 550 black adults surveyed were smokers. • Create a 90% confidence interval for the difference in percentages of smokers among black and white American adults. • Does this survey indicate a race-based difference in smoking among American adults?
Hypothesis Testing • The typical hypothesis test for the difference in two proportions is the one of no difference. • In symbols, H0: p1 – p2 = 0 • Or H0: p1 = p2
Hypothesis statements (be sure to define both p1 & p2!):
H0: p1 = p2, or equivalently H0: p1 - p2 = 0
Ha: p1 > p2, or equivalently Ha: p1 - p2 > 0
Ha: p1 < p2, or equivalently Ha: p1 - p2 < 0
Ha: p1 ≠ p2, or equivalently Ha: p1 - p2 ≠ 0
Standard Deviation/Error: • Remember that when you find the SD in a hypothesis test, you use the p from the H0 • Since we are hypothesizing that there is no difference between the two proportions, that means that p1 and p2 are the same, and so are their standard deviations. • Since this is the case, we combine (pool) the counts to get one overall proportion.
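Pooling • The pooled proportion is $\hat{p}_{pooled} = \dfrac{Success_1 + Success_2}{n_1 + n_2}$, where $Success_1 = n_1\hat{p}_1$ and $Success_2 = n_2\hat{p}_2$. If the numbers of successes are not whole numbers, round them first. (This is the only time you should round values in the middle of a calculation.)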
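Standard Error Formula • We then put this pooled value into the formula, substituting it for both sample proportions in the standard error formula: $SE_{pooled}(\hat{p}_1 - \hat{p}_2) = \sqrt{\dfrac{\hat{p}_{pooled}\hat{q}_{pooled}}{n_1} + \dfrac{\hat{p}_{pooled}\hat{q}_{pooled}}{n_2}}$, where $\hat{q}_{pooled} = 1 - \hat{p}_{pooled}$.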
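To see the rounding rule in action, here is a minimal sketch that borrows the survey figures from the smoking exercise (24.8% and 25.7% of 550 each). That exercise asks for an interval, which does not pool; the numbers are reused here only because their success counts are not whole numbers. Variable names are illustrative.

```python
import math

# Figures borrowed from the smoking exercise, for illustration only.
n1, p1_hat = 550, 0.248   # white adults
n2, p2_hat = 550, 0.257   # black adults

# n * p-hat gives 136.4 and 141.35, which are not whole numbers,
# so round the success counts before pooling.
x1 = round(n1 * p1_hat)
x2 = round(n2 * p2_hat)

# Pool the counts into one overall proportion.
p_pooled = (x1 + x2) / (n1 + n2)
q_pooled = 1 - p_pooled

# The pooled value replaces BOTH sample proportions in the SE formula.
se_pooled = math.sqrt(p_pooled * q_pooled / n1 + p_pooled * q_pooled / n2)

print(f"successes: {x1} and {x2}")
print(f"pooled proportion: {p_pooled:.4f}")
print(f"pooled standard error: {se_pooled:.4f}")
```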
Two-Proportion z-Test summary • [P] Define p1 and p2 (in words) • [H] We are testing the hypothesis H0: p1 = p2 • [no difference between p1 and p2] • Alternative hypothesis either • HA: p1 > p2 or HA: p1 < p2 or HA: p1 ≠ p2 • [A] The assumptions and conditions for the two-proportion z-test are the same as for the two-proportion z-interval. • [N] Name the test [2-proportion z-test]
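Two-Proportion z-Test (cont.) • [T] State the significance level (usually α = .05) • Because we hypothesize that the proportions are equal, we pool them to find $\hat{p}_{pooled} = \dfrac{Success_1 + Success_2}{n_1 + n_2}$ • We use the pooled value to estimate the standard error: $SE_{pooled}(\hat{p}_1 - \hat{p}_2) = \sqrt{\dfrac{\hat{p}_{pooled}\hat{q}_{pooled}}{n_1} + \dfrac{\hat{p}_{pooled}\hat{q}_{pooled}}{n_2}}$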
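Two-Proportion z-Test (cont.) • Now we find the test statistic: $z = \dfrac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{SE_{pooled}(\hat{p}_1 - \hat{p}_2)}$ • Usually the hypothesized difference p1 – p2 = 0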
Two-Proportion z-Test (cont.) • [O] Use the Normal model to obtain a P-value. • [M] Make a decision • Since the P-value ([state P-value]) is [less than / greater than] α ([state α]), I will [reject / fail to reject] the null hypothesis. • [S] State a conclusion in context. • There [is / is not] sufficient evidence to suggest that [state HA in words].
Example A forest in Oregon has an infestation of spruce moths. In an effort to control the moth, one area has been regularly sprayed from airplanes. In this area, a random sample of 495 spruce trees showed that 81 had been killed by moths. A second nearby area receives no treatment. In this area, a random sample of 518 spruce trees showed that 92 had been killed by the moth. Do these data indicate that the proportion of spruce trees killed by the moth is different for these areas?
pt: the proportion of trees killed by moths in the treated area
pu: the proportion of trees killed by moths in the untreated area
Conditions:
• Both samples of spruce trees are SRSs and were independently selected
• Since ntpt = 81, ntqt = 414, nupu = 92, nuqu = 426 are all > 5, we can use the Normal model (all these p’s and q’s have hats)
• It is reasonable to assume that 1013 is less than 10% of all spruce trees.
H0: pt = pu
Ha: pt ≠ pu
2-proportion Z-test
xt = 81, xu = 92, nt = 495, nu = 518, α = .05
P-value = 0.5547
Since the P-value (.5547) > α (.05), I fail to reject H0. There is not sufficient evidence to suggest that the proportion of spruce trees killed by the moth is different for the treated and untreated areas.
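The P-value for the spruce moth example can be checked with a short script. This is a sketch under stated assumptions rather than the calculator routine the slides presumably used: the function name `two_prop_z_test` and its `alternative` argument are hypothetical, and the Normal model comes from Python's standard-library `statistics.NormalDist`.

```python
import math
from statistics import NormalDist

def two_prop_z_test(x1, n1, x2, n2, alternative="two-sided"):
    """Pooled two-proportion z-test of H0: p1 = p2 (illustrative helper).
    Returns the z statistic and the P-value."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    # Under H0 the proportions are equal, so pool the counts.
    p_pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat) / se
    cdf = NormalDist().cdf
    if alternative == "greater":      # Ha: p1 > p2
        p_value = 1 - cdf(z)
    elif alternative == "less":       # Ha: p1 < p2
        p_value = cdf(z)
    elif alternative == "two-sided":  # Ha: p1 != p2
        p_value = 2 * (1 - cdf(abs(z)))
    else:
        raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")
    return z, p_value

# Spruce moth example: 81 of 495 treated trees and 92 of 518 untreated trees killed.
z, p_value = two_prop_z_test(81, 495, 92, 518)
alpha = 0.05
print(f"z = {z:.2f}, P-value = {p_value:.4f}")
print("reject H0" if p_value < alpha else "fail to reject H0")
```

The P-value should land close to the 0.5547 reported above; small differences in the last digit can come from rounding in the original calculation.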