Understanding Confidence Intervals for Population Proportions

Chapter 19 Confidence intervals for proportions math2200

Sample proportion • A good estimate of the population proportion • Natural sampling variability • How does the sample proportion vary from sample to sample? • Approximately normal • About 95% of all samples have $\hat{p}$ within 2 SD of p • Population proportion, p, is unknown • Standard deviation, $SD(\hat{p}) = \sqrt{pq/n}$ with $q = 1 - p$, is therefore unknown as well

We can estimate! • Standard error estimates standard deviation: $SE(\hat{p}) = \sqrt{\hat{p}\hat{q}/n}$, where $\hat{q} = 1 - \hat{p}$ • If $\hat{p}$ estimates p accurately, by the 68-95-99.7 Rule, we know • about 68% of all samples will have $\hat{p}$’s within 1 SE of p • about 95% of all samples will have $\hat{p}$’s within 2 SEs of p • about 99.7% of all samples will have $\hat{p}$’s within 3 SEs of p
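As a minimal sketch of the standard error and the 68-95-99.7 bands described above: the function names and the sample figures ($\hat{p} = 0.40$, n = 500) are hypothetical, chosen only for illustration.

```python
import math


def standard_error(p_hat: float, n: int) -> float:
    """Estimate SD(p-hat) by plugging the sample proportion into sqrt(p*q/n)."""
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 <= p_hat <= 1.0:
        raise ValueError("p_hat must be between 0 and 1")
    return math.sqrt(p_hat * (1.0 - p_hat) / n)


def se_bands(p_hat: float, n: int) -> dict:
    """Return the 1-, 2-, and 3-SE bands around p_hat (68-95-99.7 Rule)."""
    se = standard_error(p_hat, n)
    return {k: (p_hat - k * se, p_hat + k * se) for k in (1, 2, 3)}


# Hypothetical sample: 200 successes out of 500, so p_hat = 0.40.
bands = se_bands(0.40, 500)
for k, (low, high) in bands.items():
    print(f"within {k} SE: ({low:.3f}, {high:.3f})")
```

The bands are centered on $\hat{p}$ because p itself is unknown; they only describe p well if $\hat{p}$ is a reasonable estimate.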

What can we say about p? • The true value of p is $\hat{p}$. (Wrong!) • The true value of p is close to $\hat{p}$. (OK!) • We are 95% confident that the true value of p is between $\hat{p} - 2\,SE(\hat{p})$ and $\hat{p} + 2\,SE(\hat{p})$ • This is a 95% confidence interval for p • In this specific context, we can also call it a one-proportion z-interval

Margin of error • The extent of the confidence interval on either side of $\hat{p}$ is called the margin of error • The margin of error for our 95% confidence interval is 2 SE. • For the 99.7% confidence interval, the margin of error is 3 SE. • The more confident we want to be, the larger the margin of error must be. • Certainty versus precision • Confidence interval: estimate ± margin of error

What Does “95% Confidence” Really Mean? • Each confidence interval uses a sample statistic to estimate a population parameter. • But, since samples vary, the statistics we use, and thus the confidence intervals we construct, vary as well.

What Does “95% Confidence” Really Mean? (cont.) • The figure to the right shows that some of our confidence intervals capture the true proportion (the green horizontal line), while others do not.

What Does “95% Confidence” Really Mean? (cont.) • Our confidence is in the process of constructing the interval, not in any one interval itself. • Thus, we expect 95% of all “95%-confidence intervals” to contain the true parameter that they are estimating.
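The claim that the confidence lies in the process can be checked by simulation. The sketch below draws many samples from a population whose true proportion is known, builds a 95% interval from each, and counts how often the interval captures p. The values p = 0.5, n = 100, 2000 samples, and the seed are all assumptions made for illustration.

```python
import math
import random


def coverage_rate(p, n, num_samples, z_star=1.96, seed=0):
    """Fraction of simulated intervals p_hat +/- z* SE that contain the true p."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(num_samples):
        # Draw one sample of size n and count the successes.
        successes = sum(rng.random() < p for _ in range(n))
        p_hat = successes / n
        me = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
        if p_hat - me <= p <= p_hat + me:
            hits += 1
    return hits / num_samples


rate = coverage_rate(p=0.5, n=100, num_samples=2000)
print(f"{rate:.1%} of the simulated intervals captured p")
```

The printed rate should land near 95%, though not exactly on it: any single interval either contains p or does not, and the 95% describes the long-run share that do.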

Margin of Error: Certainty vs. Precision • We can claim, with 95% confidence, that the interval $\hat{p} \pm 2\,SE(\hat{p})$ contains the true population proportion. • The extent of the interval on either side of $\hat{p}$ is called the margin of error (ME). • In general, confidence intervals have the form estimate ± ME. • The more confident we want to be, the larger our ME needs to be.


Margin of Error: Certainty vs. Precision (cont.) • To be more confident, we wind up being less precise. • We need more values in our confidence interval to be more certain. • Because of this, every confidence interval is a balance between certainty and precision. • The tension between certainty and precision is always there. • Fortunately, in most cases we can be both sufficiently certain and sufficiently precise to make useful statements.
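To make the trade-off concrete, the sketch below holds one hypothetical sample fixed ($\hat{p} = 0.50$, n = 400, both assumed for illustration) and widens the interval from 1 SE to 2 SE to 3 SE, following the 68-95-99.7 Rule.

```python
import math

# Number of SEs paired with its approximate confidence level (68-95-99.7 Rule).
LEVELS = {1: "68%", 2: "95%", 3: "99.7%"}


def margin_of_error(p_hat, n, num_se):
    """Margin of error as a multiple of the standard error of p_hat."""
    return num_se * math.sqrt(p_hat * (1 - p_hat) / n)


# Hypothetical sample: p_hat = 0.50 from n = 400 respondents.
margins = {k: margin_of_error(0.50, 400, k) for k in LEVELS}
for k, label in LEVELS.items():
    print(f"{label} confidence: 0.50 +/- {margins[k]:.3f}")
```

With the sample held fixed, the only way to gain certainty is to accept a wider, less precise interval.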

Margin of Error: Certainty vs. Precision • The choice of confidence level is somewhat arbitrary, but keep in mind this tension between certainty and precision when selecting your confidence level. • The most commonly chosen confidence levels are 90%, 95%, and 99% (but any percentage can be used).

Critical value • The number of SEs affects the margin of error: $ME = z^* \times SE(\hat{p})$ • This number is called the critical value, denoted by z* • For a 95% confidence interval, the more precise critical value is 1.96 (rather than 2) • For a 90% confidence interval, the critical value is 1.645
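Critical values need not be looked up in a table. As a sketch, the standard library's `statistics.NormalDist` (Python 3.8 or later) can invert the standard Normal CDF; the helper name `critical_value` is hypothetical.

```python
from statistics import NormalDist


def critical_value(confidence_level: float) -> float:
    """Return z* such that the central `confidence_level` of N(0, 1) lies in (-z*, z*)."""
    if not 0.0 < confidence_level < 1.0:
        raise ValueError("confidence_level must be strictly between 0 and 1")
    tail = (1.0 - confidence_level) / 2.0  # area left in each tail
    return NormalDist().inv_cdf(1.0 - tail)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: z* = {critical_value(level):.3f}")
```

The leftover area is split evenly between the two tails because the interval is symmetric around the estimate.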

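Critical Values (cont.) • Example: For a 90% confidence interval, the critical value is 1.645, because 90% of the area under the standard Normal curve lies within 1.645 standard deviations of the mean.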

Margin of error • A confidence interval that is too wide is not very useful • How large a margin of error can we tolerate? • Reducing the level of confidence makes the margin of error smaller (not a good idea, though) • You should think about this ahead of time, when you design your study. Choose a larger sample! • Generally, a margin of error of 5% or less is acceptable

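Sample size • Suppose a candidate is planning a poll and wants to estimate voter support within 3% with 95% confidence. How large a sample does she need? • We need $ME = z^* \sqrt{\hat{p}\hat{q}/n}$ to be 0.03 with $z^* = 1.96$, that is, $0.03 = 1.96\sqrt{\hat{p}\hat{q}/n}$ • But we do not know $\hat{p}$ before we conduct the poll! • However, $\hat{p}\hat{q} = \hat{p}(1 - \hat{p})$ is never larger than 0.25, its value at $\hat{p} = 0.5$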

Sample size • To be conservative, we set $\hat{p} = 0.5$, the value that makes the margin of error largest • Solving $0.03 = 1.96\sqrt{(0.5)(0.5)/n}$ for n, we have $n = (1.96)^2 (0.25) / (0.03)^2 \approx 1067.1$ • So, we will need at least 1068 respondents to keep the margin of error as small as 3% with a confidence level of 95% • In practice, you may need more since many people do not respond • If the response rate is too low, then the study might be a voluntary response study, which can be biased • To cut the SE in half, we must quadruple the sample size n
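The sample-size calculation can be sketched as a small function. The name `required_sample_size` and its parameter names are hypothetical; the defaults (z* = 1.96 and the conservative guess 0.5) follow the poll example.

```python
import math


def required_sample_size(margin: float, z_star: float = 1.96, p_guess: float = 0.5) -> int:
    """Smallest n with z* * sqrt(p*q/n) <= margin, using p_guess in place of p-hat."""
    if not 0.0 < margin < 1.0:
        raise ValueError("margin must be strictly between 0 and 1")
    n_exact = z_star ** 2 * p_guess * (1.0 - p_guess) / margin ** 2
    return math.ceil(n_exact)  # always round up: rounding down would exceed the margin


print(required_sample_size(0.03))   # 3% margin at 95% confidence
print(required_sample_size(0.015))  # half the margin needs about four times the sample
```

Because n appears under a square root, halving the margin of error from 3% to 1.5% raises the requirement from 1068 to 4269 respondents, roughly a fourfold increase.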

Assumptions • Independence • Plausible independence condition • Randomization condition • Sampled at random? • From a properly randomized experiment? • 10% condition • If the sample is smaller than 10% of the population, sampling without replacement can be treated as roughly the same as sampling with replacement

Assumptions • Sample size assumption • Normal approximation comes from the CLT • Sample size must be large enough • Check the success/failure condition: commonly, at least 10 successes and 10 failures ($n\hat{p} \ge 10$ and $n\hat{q} \ge 10$)
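Two of these conditions are numerical and can be checked mechanically. The sketch below assumes the common textbook thresholds (at least 10 successes and 10 failures; a sample no larger than 10% of the population). The function name and the example counts are hypothetical, and independence and randomization still have to be judged from the study design rather than from the data.

```python
def check_conditions(successes, n, population_size=None):
    """Check the success/failure condition and, if a population size is given, the 10% condition."""
    failures = n - successes
    results = {
        # Assumed threshold: at least 10 successes and at least 10 failures.
        "success_failure": successes >= 10 and failures >= 10,
    }
    if population_size is not None:
        # Assumed threshold: sample is no more than 10% of the population.
        results["ten_percent"] = n <= 0.10 * population_size
    return results


# Hypothetical poll: 285 "yes" answers among 537 adults from a population of millions.
print(check_conditions(285, 537, population_size=200_000_000))
# Hypothetical small study: 5 successes in 50 trials fails the success/failure check.
print(check_conditions(5, 50))
```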

Example • In May 2002, the Gallup Poll asked 537 randomly sampled adults “generally speaking, do you believe the death penalty is applied fairly or unfairly in this country today?” • 53% answered “fairly” • 7% said “don’t know” • What can we conclude from this survey?

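Example • Plausible independence? (Yes.) • Randomization condition? (Yes.) • 10% condition? (Yes.) • Success/failure condition (Yes.) • Let’s find a 95% confidence interval for the proportion answering “fairly”, with $\hat{p} = 0.53$, $n = 537$, and $z^* = 1.96$ • Standard error: $SE(\hat{p}) = \sqrt{(0.53)(0.47)/537} \approx 0.0215$ • Margin of error: $ME = 1.96 \times 0.0215 \approx 0.042$ • 95% confidence interval: $0.53 \pm 0.042$, or about (0.488, 0.572)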
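The arithmetic for the Gallup example can be reproduced in a few lines. This is a sketch: the function name is hypothetical, the inputs (53% of 537 respondents) come from the survey description, and z* = 1.96 is the 95% critical value given earlier.

```python
import math


def one_proportion_z_interval(p_hat, n, z_star=1.96):
    """Return (SE, ME, lower, upper) for the interval p_hat +/- z* * SE."""
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    me = z_star * se
    return se, me, p_hat - me, p_hat + me


se, me, lower, upper = one_proportion_z_interval(p_hat=0.53, n=537)
print(f"SE = {se:.4f}, ME = {me:.4f}")
print(f"95% CI: ({lower:.3f}, {upper:.3f})")
```

The interval runs from about 48.8% to 57.2%, so the data do not rule out a true proportion slightly below one half.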

A deeper understanding of confidence intervals • How should we interpret the 95% confidence? • Randomness is from sample to sample • This does not say p is random • We are the ones who are uncertain, not the parameter. • The interval itself is random (varies from sample to sample) • This interval is about p, not about $\hat{p}$ • The sample proportion varies • Report the confidence interval or margin of error

Always pay attention to the required assumptions • Independence • Sample size • In practice, watch out for biased samples

What have we learned? • Finally we have learned to use a sample to say something about the world at large. • This process (statistical inference) is based on our understanding of sampling models, and will be our focus for the rest of the book. • In this chapter we learned how to construct a confidence interval for a population proportion. • And, we learned that interpretation of our confidence interval is key—we can’t be certain, but we can be confident.
