AP STATISTICS LESSON 12-1: INFERENCE FOR A POPULATION PROPORTION
ESSENTIAL QUESTION: What are the procedures for creating significance tests and confidence intervals for population proportion problems?
Objectives:
• To create confidence intervals for population proportions.
• To carry out significance tests for population proportions.
Introduction We often want to answer questions about the proportion of some outcome in a population, or to compare proportions across several populations.
Population Proportion Problems (Page 685)
• Example 12.1 Risky Behavior in the Age of AIDS (estimating a single population proportion)
• Example 12.2 Does Preschool Make a Difference? (comparing two population proportions)
• Example 12.3 Extracurriculars and Grades (comparing more than two population proportions)
Inference for a Population Proportion We are interested in the unknown proportion p of a population that has some outcome. For convenience, call the outcome we are looking for a “success.”
Sample Proportion
$\hat{p} = \dfrac{\text{count of successes in the sample}}{\text{count of observations in the sample}}$
Read the sample proportion $\hat{p}$ as “p-hat.”
Conditions for Inference
• As always, inference is based on the sampling distribution of a statistic.
• The mean of the sampling distribution of $\hat{p}$ is p. That is, the sample proportion $\hat{p}$ is an unbiased estimator of the population proportion p.
• The standard deviation of $\hat{p}$ is $\sqrt{p(1-p)/n}$, provided that the population is at least 10 times as large as the sample.
• If the sample size is large enough that both np and n(1 – p) are at least 10, the distribution of $\hat{p}$ is approximately normal.
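A quick simulation can make these facts about the sampling distribution concrete. The sketch below is illustrative only: the values p = 0.3, n = 100, and 2000 repetitions are assumptions chosen for the demonstration and do not come from the lesson.

```python
import random
from math import sqrt
from statistics import mean, pstdev

random.seed(1)  # fixed seed so the run is repeatable

# Assumed values for illustration (np = 30 and n(1 - p) = 70 are both >= 10)
p, n, reps = 0.3, 100, 2000

def sample_proportion(p, n):
    """Draw n independent success/failure outcomes and return p-hat."""
    successes = sum(random.random() < p for _ in range(n))
    return successes / n

p_hats = [sample_proportion(p, n) for _ in range(reps)]

sim_mean = mean(p_hats)             # should be close to p
sim_sd = pstdev(p_hats)             # should be close to sqrt(p(1 - p)/n)
theory_sd = sqrt(p * (1 - p) / n)

print(f"mean of p-hat: {sim_mean:.4f} (theory {p})")
print(f"sd of p-hat:   {sim_sd:.4f} (theory {theory_sd:.4f})")
```

The simulated mean and standard deviation should land close to the theoretical values, though the exact digits depend on the random seed.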
z Statistic
$z = \dfrac{\hat{p} - p}{\sqrt{p(1-p)/n}}$
The statistic z has approximately the standard normal distribution N(0,1) if the sample is not too small and the sample is not a large part of the population.
Working Without p
• To test the null hypothesis $H_0: p = p_0$ that the unknown p has a specific value $p_0$, just replace p by $p_0$ in the z statistic and in checking the values of np and n(1 – p).
• In a confidence interval for p, we have no specific value to substitute. In large samples, $\hat{p}$ will be close to p, so we replace p by $\hat{p}$ in determining the values of np and n(1 – p). We also replace the standard deviation by the standard error of $\hat{p}$, $SE = \sqrt{\hat{p}(1-\hat{p})/n}$, to get a confidence interval of the form estimate ± $z^*$ SE.
Conditions for Inference About a Proportion
• The data are an SRS from the population of interest.
• The population is at least 10 times as large as the sample.
• For a test of $H_0: p = p_0$, the sample size n is so large that both $np_0$ and $n(1 - p_0)$ are 10 or more. For a confidence interval, n is so large that both the count of successes $n\hat{p}$ and the count of failures $n(1 - \hat{p})$ are 10 or more.
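The numerical conditions above can be checked mechanically. The following is a minimal sketch; the function names `conditions_for_test` and `conditions_for_interval` are hypothetical helpers invented for illustration. The SRS condition concerns how the data were collected and cannot be verified by code.

```python
def conditions_for_test(n, p0, population_size=None):
    """Large-sample checks for a test of H0: p = p0 (hypothetical helper)."""
    checks = {
        "n*p0 >= 10": n * p0 >= 10,
        "n*(1 - p0) >= 10": n * (1 - p0) >= 10,
    }
    if population_size is not None:
        checks["population >= 10*n"] = population_size >= 10 * n
    return checks

def conditions_for_interval(successes, n, population_size=None):
    """Large-sample checks for a confidence interval (hypothetical helper)."""
    failures = n - successes
    checks = {
        "successes >= 10": successes >= 10,
        "failures >= 10": failures >= 10,
    }
    if population_size is not None:
        checks["population >= 10*n"] = population_size >= 10 * n
    return checks

# 170 successes in a sample of 2673 (the counts used in Example 12.5)
print(conditions_for_interval(170, 2673))

# A small sample fails the test conditions: 20 * 0.1 = 2 expected successes
print(conditions_for_test(20, 0.1))
```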
Example 12.4 (Page 688): Are the Conditions Met?
• The sampling design was in fact a complex stratified sample, and the survey used inference procedures for that design. The overall effect is close to an SRS, however.
• The number of adult heterosexuals (the population) is much larger than 10 times the sample size, n = 2673.
• The sample contains 170 successes and 2673 − 170 = 2503 failures, both far more than 10.
Inference for a Population Proportion
• Draw an SRS of size n from a large population with unknown proportion p of successes. An approximate level C confidence interval for p is
$\hat{p} \pm z^* \sqrt{\hat{p}(1-\hat{p})/n}$
where $z^*$ is the upper (1 – C)/2 standard normal critical value.
• To test the hypothesis $H_0: p = p_0$, compute the z statistic
$z = \dfrac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}}$
Inference for a Population Proportion (continued…)
In terms of a variable Z having the standard normal distribution, the approximate P-value for a test of $H_0$ against
• $H_a: p > p_0$ is $P(Z \ge z)$
• $H_a: p < p_0$ is $P(Z \le z)$
• $H_a: p \ne p_0$ is $2P(Z \ge |z|)$
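The three P-value rules can be written as one small function. This is a sketch rather than a reference implementation: the names `upper_tail` and `p_value` and the string labels for the alternatives are assumptions made for illustration. It relies on the symmetry of the standard normal curve, so that P(Z ≤ z) = P(Z ≥ −z).

```python
from math import erfc, sqrt

def upper_tail(z):
    """P(Z >= z) for standard normal Z; erfc keeps very small tails from rounding to 0."""
    return 0.5 * erfc(z / sqrt(2))

def p_value(z, alternative):
    """Approximate P-value for H0: p = p0 given the z statistic.

    alternative is one of "greater" (Ha: p > p0), "less" (Ha: p < p0),
    or "two-sided" (Ha: p != p0). These labels are illustrative choices.
    """
    if alternative == "greater":
        return upper_tail(z)
    if alternative == "less":
        return upper_tail(-z)          # P(Z <= z) by symmetry
    if alternative == "two-sided":
        return 2 * upper_tail(abs(z))
    raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")

print(round(p_value(1.96, "two-sided"), 3))
```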
Example 12.5 (Page 690): Estimating Risky Behavior
The National AIDS Behavioral Surveys found that 170 of a sample of 2673 adult heterosexuals had multiple partners. That is, $\hat{p} = 170/2673 = 0.0636$. A 99% confidence interval for the proportion p of all adult heterosexuals with multiple partners uses the standard normal critical value $z^* = 2.576$ (use the bottom row of Table C for standard normal critical values).
We are 99% confident that the percent of adult heterosexuals who had more than one sexual partner in the past year lies between about 5.1% and 7.6%.
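The interval in Example 12.5 can be reproduced in a few lines. This is a sketch under the large-sample formula from the lesson; the function name `proportion_ci` is a hypothetical helper, and the critical value is computed with Python's `statistics.NormalDist` rather than read from Table C.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(successes, n, confidence):
    """Approximate level-C interval p-hat +/- z* * SE (hypothetical helper)."""
    p_hat = successes / n
    # z* is the upper (1 - C)/2 critical value of the standard normal distribution
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - z_star * se, p_hat + z_star * se

low, high = proportion_ci(170, 2673, 0.99)
print(f"{low:.3f} to {high:.3f}")  # about 0.051 to 0.076
```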
Example 12.6 (Page 691): Binge Drinking in College
Binge drinking is defined as 5 or more drinks for men (4 or more for women) on at least one occasion within two weeks. In a representative sample of 17,592 students at 140 colleges (treated as an SRS), 7741 students identified themselves as binge drinkers. Does this constitute strong evidence that more than 40% of all college students engage in binge drinking?
Answer: The P-value tells us that there is virtually no chance of obtaining a sample proportion as far from 0.40 as $\hat{p} = 0.44$ if $H_0: p = 0.40$ is true. We reject $H_0$ and conclude that more than 40% of U.S. college students have engaged in binge drinking.
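To see why the P-value is "virtually" zero, the one-sided test of $H_0: p = 0.40$ against $H_a: p > 0.40$ can be worked through numerically. The sketch below uses only the counts given in the example; the variable names are illustrative, and `math.erfc` is used for the tail area because the z statistic is so large that 1 minus the normal cdf would round to exactly 0 in floating point.

```python
from math import erfc, sqrt

n, successes, p0 = 17592, 7741, 0.40

p_hat = successes / n                       # about 0.44
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)  # standard deviation uses p0, not p-hat
p_value = 0.5 * erfc(z / sqrt(2))           # P(Z >= z) for the one-sided alternative

print(f"p-hat = {p_hat:.4f}, z = {z:.2f}, P-value = {p_value:.1e}")
```

The z statistic comes out near 10.8, far beyond the range of any standard normal table.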
Example 12.7 (Page 692): Is That Coin Fair?
A coin that is balanced should come up heads half the time in the long run. The French naturalist Count Buffon tossed a coin 4040 times and got 2048 heads ($\hat{p} = 0.5069$). Is this evidence that Buffon’s coin was not balanced? (Hint: use the P-value for the two-sided test.)
Answer: We failed to find good evidence against $H_0: p = 0.5$. We cannot conclude that $H_0$ is true, that is, that the coin is perfectly balanced.
NOTE: The test of significance only shows that the results of Buffon’s 4040 tosses can’t distinguish this coin from one that is perfectly balanced. To see what values of p are consistent with the sample results, use a confidence interval.
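The two-sided test for Buffon's coin can be sketched as follows, using only the counts stated in the example. The variable names are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist

n, heads, p0 = 4040, 2048, 0.5

p_hat = heads / n
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

# Two-sided alternative Ha: p != 0.5, so double the tail area beyond |z|
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"z = {z:.2f}, P-value = {p_value:.2f}")
```

A z statistic below 1 and a P-value well above common significance levels match the conclusion that the data give no good evidence against $H_0$.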
Example 12.8 (Page 693): Confidence Interval for p
We are 95% confident that the probability of a head is between 0.4915 and 0.5223. The confidence interval is more informative than the test in Example 12.7.
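The arithmetic behind this interval, written out with Buffon's counts (2048 heads in 4040 tosses) and the 95% critical value $z^* = 1.960$, looks roughly like this:

```latex
\hat{p} \pm z^* \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}
  = 0.5069 \pm 1.960 \sqrt{\frac{(0.5069)(0.4931)}{4040}}
  \approx 0.5069 \pm 0.0154
```

Because 0.5 lies inside the interval, the interval agrees with the two-sided test, and it also shows which other values of p the data cannot rule out.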
Choosing the Sample Size
In planning a study, we may want to choose a sample size that will allow us to estimate the parameter within a given margin of error:
$m = z^* \sqrt{\hat{p}(1-\hat{p})/n}$
Here $z^*$ is the standard normal critical value for the level of confidence we want. Because the margin of error involves the sample proportion of successes $\hat{p}$, we need to guess this value when choosing n. Call our guess p*. Here are two ways to get p*.
Ways to Get p*
• Use a guess p* based on a pilot study or on past experience with similar studies. You should do several calculations that cover the range of $\hat{p}$ values you might get.
• Use p* = 0.5 as the guess. The margin of error m is largest when $\hat{p} = 0.5$, so this guess is conservative in the sense that if we get any other $\hat{p}$ when we do our study, we will get a margin of error smaller than planned.
Sample Size for Desired Margin of Error
To determine the sample size n that will yield a level C confidence interval for a population proportion p with a specified margin of error m, set the following expression for the margin of error to be less than or equal to m, and solve for n:
$z^* \sqrt{p^*(1-p^*)/n} \le m$
where p* is a guessed value for the sample proportion. The margin of error will be less than or equal to m if you take the guess p* to be 0.5.
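Solving the inequality for n takes two algebra steps: square both sides (both are positive), then isolate n.

```latex
z^* \sqrt{\frac{p^*(1-p^*)}{n}} \le m
\;\Longrightarrow\;
\frac{p^*(1-p^*)}{n} \le \left(\frac{m}{z^*}\right)^2
\;\Longrightarrow\;
n \ge \left(\frac{z^*}{m}\right)^2 p^*(1-p^*)
```

With the conservative guess p* = 0.5 this becomes $n \ge \left(\frac{z^*}{2m}\right)^2$. Since $p^*(1-p^*) \le 1/4$ for every p*, this choice never underestimates the required n. Round the result up to the next whole number.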
Choosing p*
The method for finding the guess p* does not matter that much in most cases. The n you get doesn’t change much when you change p* as long as p* is not too far from 0.5. So use the conservative guess p* = 0.5 if you expect the true p to be roughly between 0.3 and 0.7. If the true p is close to 0 or 1, using p* = 0.5 as your guess will give a sample much larger than you need. So try to use a better guess from a pilot study when you suspect that p will be less than 0.3 or greater than 0.7.
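A short table shows how little n moves for guesses between 0.3 and 0.7, and how much it drops near 0 or 1. The sketch below assumes a 3% margin of error at 95% confidence purely for illustration; neither value comes from the lesson, and `sample_size` is a hypothetical helper name.

```python
from math import ceil
from statistics import NormalDist

def sample_size(m, p_star, confidence=0.95):
    """Smallest n with z* * sqrt(p*(1 - p*)/n) <= m (hypothetical helper)."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z_star / m) ** 2 * p_star * (1 - p_star))

m = 0.03  # assumed margin of error for the illustration
for p_star in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(f"p* = {p_star:.1f}  ->  n = {sample_size(m, p_star)}")
```

Under these assumptions the guesses 0.3 and 0.7 need roughly 900 observations against about 1070 for p* = 0.5, while p* = 0.1 needs fewer than 400.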
Example 12.9 (Page 696): Determining Sample Size for Election Polling
Find the sample size for a 2.5% margin of error (sample size n = 1537) and for a 2% margin of error (n = 2401).
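These sample sizes can be checked with exact arithmetic. The slide does not state the confidence level or the guess p*, so the sketch below assumes 95% confidence with the table value z* = 1.96 and the conservative guess p* = 0.5. `Fraction` is used instead of floating point because the 2% case lands exactly on a whole number, where rounding error could push the result up by one.

```python
from fractions import Fraction
from math import ceil

# Assumptions (not stated on the slide): 95% confidence, conservative guess
Z_STAR = Fraction("1.96")
P_STAR = Fraction("0.5")

def poll_sample_size(margin):
    """Smallest whole n satisfying n >= (z*/m)^2 * p*(1 - p*), computed exactly."""
    m = Fraction(margin)
    return ceil((Z_STAR / m) ** 2 * P_STAR * (1 - P_STAR))

print(poll_sample_size("0.025"))  # 1537
print(poll_sample_size("0.02"))   # 2401
```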