Chapter 19

Chapter 19 Confidence Intervals for Proportions

Sampling Distribution Models • [Diagram: a sample is drawn from the population; the sample statistic is used for inference about the unknown population parameter p.]

Sampling Distribution of $\hat{p}$ • If our two conditions hold then we know: • Shape: Approximately Normal • Center: The mean is p. • Spread: The standard deviation is $SD(\hat{p}) = \sqrt{pq/n}$, where q = 1 – p.

Sampling Distribution of $\hat{p}$ • Recall the two conditions: • 10% Condition: The size of the sample should be less than 10% of the size of the population. • Success/Failure Condition: np and n(1 – p) should both be greater than 10.
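The two conditions can be checked mechanically. The following is a minimal sketch; the function name and the numbers passed to it are hypothetical and chosen only for illustration.

```python
def conditions_hold(n, p, population_size):
    """Check the 10% Condition and the Success/Failure Condition."""
    # 10% Condition: the sample is less than 10% of the population.
    ten_percent = n < 0.10 * population_size
    # Success/Failure Condition: np and n(1 - p) are both greater than 10.
    success_failure = n * p > 10 and n * (1 - p) > 10
    return ten_percent and success_failure


# Hypothetical inputs, not taken from the slides.
print(conditions_hold(n=200, p=0.30, population_size=50_000))  # True
print(conditions_hold(n=20, p=0.30, population_size=50_000))   # False
```

In the second call the sample is too small: np is only about 6, so the Normal model should not be trusted.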

Sampling Distribution of $\hat{p}$ • So if we know the population proportion p and the sample size is big enough, then we can reason about the possible values of $\hat{p}$ by using the Normal model. • We can find the probability of obtaining a particular $\hat{p}$. • We can determine whether observing a particular $\hat{p}$ is unlikely or not.

Sampling Distribution of $\hat{p}$ • For example, by using the 68-95-99.7 Rule we can say something like this: • 95% of the time the sample proportion, $\hat{p}$, will be between $p - 2\sqrt{pq/n}$ and $p + 2\sqrt{pq/n}$.

Inference • Unfortunately the population parameter, p, is usually unknown. We would like to use a sample to tell us something about p. • Use the sample proportion, $\hat{p}$, as our best guess to make inferences about the population proportion p.

War on Terrorism • According to a March 2, 2006 ABC News/Washington Post poll of 1,000 adult Americans, 46% of those surveyed disapprove of the way that Bush is handling the US campaign against terrorism. • This poll was conducted by The Washington Post, so let’s assume (hope) that they randomized correctly and obtained a representative sample. • What is the population here? • What is the population parameter?

War on Terrorism • So what is the sampling distribution of the sample proportion of US adults who disapprove of the way that Bush is handling the US campaign against terrorism? • What do we know? • n = 1,000 • $\hat{p}$ = 0.46 • $\hat{p}$ is approximately $N(p, \sqrt{pq/n})$ if conditions hold (do they?)

War on Terrorism • What don’t we know? • p: we don’t know the actual proportion of US adults who disapprove of the way that Bush is handling the US campaign against terrorism. • Since we don’t know p, we also don’t know $SD(\hat{p}) = \sqrt{pq/n}$. • What can we do? • We can use $\hat{p}$ to find an estimate of $SD(\hat{p})$.

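Estimation • We expect that p and $\hat{p}$ should be similar, so we can use $\hat{p}$ to estimate p. • When we use $\hat{p}$ to estimate the standard deviation, the result is called the standard error of $\hat{p}$: $SE(\hat{p}) = \sqrt{\hat{p}\hat{q}/n}$, where $\hat{q} = 1 - \hat{p}$.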

What does this tell us? • Once again using the 68-95-99.7 rule we know that: • About 68% of the time (i.e. for about 68% of random samples), $\hat{p}$ will be no more than 1 $SE(\hat{p})$ away from p. • About 95% of the time (i.e. for about 95% of random samples), $\hat{p}$ will be no more than 2 $SE(\hat{p})$ away from p. • About 99.7% of the time (i.e. for about 99.7% of random samples), $\hat{p}$ will be no more than 3 $SE(\hat{p})$ away from p.

What does this tell us? • Let’s think about the second interval (95%). • Start with $\hat{p}$ (because we know this value) and go out about 2 $SE(\hat{p})$ in either direction. • We can be 95% sure (confident) that the interval $\hat{p} \pm 2\,SE(\hat{p})$ will contain p.

War on Terrorism • Here $SE(\hat{p}) = \sqrt{(0.46)(0.54)/1000} \approx 0.016$, so 2 $SE(\hat{p}) \approx 0.032$. • About 95% of the time (i.e. for about 95% of random samples), $\hat{p}$ will be no more than 0.032 away from p. • Start at $\hat{p}$ = 0.46 and go out about 0.032 in each direction. • We can be 95% confident that this interval will contain p. • We are 95% confident that the true proportion of US adults who disapprove of the way Bush is handling the campaign against terrorism is between 42.8% and 49.2%.

Interpretation • Plausible values for the population parameter p. • 95% confidence in the process that produced this interval.

Statistical Confidence • Two things can happen when we create the interval as above: • p can either be in the interval (which will happen in about 95% of the intervals). • p can be outside the interval (which will happen in about 5% of the intervals). • One thing that can’t happen: • The parameter value can’t change!!

Statistical Confidence • We don’t know which is true. • So, we rely on our statistical confidence. • The best we can say is, “We are 95% confident that the true population proportion lies within the interval we construct.”

Statistical Confidence • WE ARE NOT SAYING that p is in our interval 95% of the time. • That phrasing implies that p is “moving around”, which we already said cannot happen (remember, p is some unknown fixed value). • If we were to calculate lots of intervals, the population parameter would be in about 95% of them.

Confidence Intervals • Confidence intervals come from the fact that we could take multiple samples and calculate multiple 95% confidence intervals and, if we were using the same method to find all the intervals, we would expect that about 95% of the intervals we constructed would contain the true parameter (population proportion).

95% Confidence • If one were to repeatedly sample 1000 registered voters at random and compute a 95% confidence interval for each sample, about 95% of the intervals produced would contain the population proportion p.
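The repeated-sampling idea can be tried out in a small simulation. This is only a sketch under stated assumptions: the "true" p of 0.46, the number of trials, and the seed are all hypothetical choices, and in practice p is unknown.

```python
import math
import random


def coverage(p, n, trials, z_star=1.96, seed=0):
    """Fraction of simulated 95% intervals that contain the true p."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Draw one random sample of size n and count the successes.
        successes = sum(rng.random() < p for _ in range(n))
        p_hat = successes / n
        # Margin of error: critical value times the standard error.
        me = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
        if p_hat - me <= p <= p_hat + me:
            hits += 1
    return hits / trials


# Hypothetical true proportion; real studies never get to see this value.
rate = coverage(p=0.46, n=1000, trials=2000)
print(f"fraction of intervals containing p: {rate:.3f}")
```

The printed fraction should land close to 0.95. Note that p stays fixed throughout; only the intervals move from sample to sample.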

Confidence Intervals • So, what does the interval look like? • Confidence intervals for the population proportion have the form $\hat{p} \pm ME$. • For 95% confidence intervals, $ME \approx 2\,SE(\hat{p})$.

Confidence Intervals • ME is the margin of error • The extent of the interval on either side of our estimate • In general, ME = z*(SE($\hat{p}$)), where z* is called a critical value.

Construction of CI • $\hat{p}$, the point estimate, is the center of the interval. It merely shifts the interval along the axis. • z*, the critical value, is the number of standard errors needed to form the desired CI. This will depend on the level of confidence you want.

How to find z* • Need z-tables • Based on Normal model • Between what two z-values do 95% of the observations lie on N(0,1)?

How to find z* • The z-values for a 95% confidence interval are not exactly 2 and –2. • We use these numbers as an approximation. • 1.96 and –1.96 are more exact. • So, a 95% CI for the population proportion looks like $\hat{p} \pm 1.96\sqrt{\hat{p}\hat{q}/n}$.

How to find z* • What does a 99% CI for p look like? • What does a 90% CI for p look like? • What does an 80% CI for p look like?
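Instead of a z-table, the critical values can be computed from the standard Normal model. A minimal sketch using Python's standard library follows; the helper name critical_value is hypothetical.

```python
from statistics import NormalDist


def critical_value(confidence):
    """z* such that the central `confidence` area of N(0,1) lies in (-z*, z*)."""
    tail = (1 - confidence) / 2  # area left over in each tail
    return NormalDist().inv_cdf(1 - tail)


for level in (0.80, 0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: z* = {critical_value(level):.3f}")
# 80% confidence: z* = 1.282
# 90% confidence: z* = 1.645
# 95% confidence: z* = 1.960
# 99% confidence: z* = 2.576
```

Each interval then has the same form as the 95% one, with 1.96 replaced by the matching z*.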

Construction of CI • SE($\hat{p}$), the standard error, is the estimate for SD($\hat{p}$). • ME = z*(SE($\hat{p}$)), the margin of error, is ½ the width of the CI. This merely determines the width of the interval. • What happens to ME if n increases?

Construction of CI • So, the CI for p looks like $\hat{p} \pm z^* \sqrt{\hat{p}\hat{q}/n}$

Confidence Intervals • Now we know what the interval looks like, but how do we know we can do all this? • It is based on the Normal model • Did we check any assumptions beforehand? • What were the assumptions needed for the sampling distribution of the sample proportion?

Confidence Intervals • We don’t know p or q. • So, check the following assumptions: • Random sample • Were data sampled randomly or are they from a randomized experiment? • Independence • Do data values affect one another? • n < 10% of population size • Success/Failure • Since p and q are unknown, check that n$\hat{p}$ and n$\hat{q}$ are both greater than 10.

Steps to Forming a CI 1) Describe the population parameter of concern. • Ex: p = proportion of US adults who disapprove of the way Bush is handling the campaign against terrorism 2) Specify the confidence interval criteria a) check assumptions • random sample • independence • n < 10% of the population • success/failure

Steps to a Confidence Interval b) state the level of confidence c) determine the critical value, z* 3) Collect and present sample information a) collect the data from the population b) find the point estimate, $\hat{p}$ 4) Determine the confidence interval a) find the standard error, $SE(\hat{p}) = \sqrt{\hat{p}\hat{q}/n}$

Steps to a Confidence Interval b) find the margin of error, ME = z*(SE($\hat{p}$)) c) find the interval, $\hat{p} \pm ME$ d) describe your results; interpret the interval • I am ___% confident that the true population proportion falls within the interval I constructed.
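Steps 3 and 4 can be sketched as a short function. This is an illustration, not a definitive implementation: the name proportion_ci is hypothetical, the count of 460 is inferred from the 46% of 1,000 reported in the poll, and the assumptions in step 2 are taken as already checked.

```python
import math
from statistics import NormalDist


def proportion_ci(successes, n, confidence=0.95):
    """One-proportion z-interval; assumes the conditions were checked first."""
    p_hat = successes / n                                    # point estimate
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # critical value
    se = math.sqrt(p_hat * (1 - p_hat) / n)                  # standard error
    me = z_star * se                                         # margin of error
    return p_hat - me, p_hat + me


# 460 of 1,000 is an assumed count matching the reported 46%.
lower, upper = proportion_ci(460, 1000)
print(f"95% CI: ({lower:.3f}, {upper:.3f})")  # 95% CI: (0.429, 0.491)
```

The interval is slightly narrower than the one built with 2 standard errors because z* = 1.96 is used instead of the rounded value 2.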

Example • Ch. 19 #7 • True or False? • For a given sample size, higher confidence means a smaller margin of error. • For a specified confidence level, larger samples provide smaller margins of error. • For a fixed margin of error, larger samples provide greater confidence. • For a given confidence level, halving the margin of error requires a sample twice as large.

Example a) For a given sample size, higher confidence means a smaller margin of error. Solution ME = z*(SE($\hat{p}$)) = z* $\sqrt{\hat{p}\hat{q}/n}$ - fixed n implies fixed SE - higher confidence implies higher z* (see Table T) - so, with fixed SE and increasing z*, ME increases; the statement is FALSE

Example b) For a specified confidence level, larger samples provide smaller margins of error. Solution ME = z*(SE($\hat{p}$)) = z* $\sqrt{\hat{p}\hat{q}/n}$ - a specified confidence level implies fixed z* - a bigger sample means increasing n, which implies smaller SE($\hat{p}$) - so, with fixed z* and decreasing SE($\hat{p}$), ME decreases; the statement is TRUE

Example c) For a fixed margin of error, larger samples provide greater confidence. Solution ME = z*(SE($\hat{p}$)) = z* $\sqrt{\hat{p}\hat{q}/n}$ - fixed ME - larger samples imply smaller SE($\hat{p}$) - so for ME to remain the same, z* must increase - increasing z* implies larger confidence, so the statement is TRUE

Example d) For a given confidence level, halving the margin of error requires a sample twice as large. Solution ME = z*(SE($\hat{p}$)) = z* $\sqrt{\hat{p}\hat{q}/n}$ - given confidence level implies fixed z* - halving the margin of error means dividing ME by 2 - if you divide one side by 2, you must divide the other by 2: ME/2 = z* $\sqrt{\hat{p}\hat{q}/n}$ / 2 = z* $\sqrt{\hat{p}\hat{q}/(4n)}$ - so, if you divide ME by 2, you need to multiply the sample size by 4, not 2; the statement is FALSE
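The effect of sample size on ME in parts b) and d) can be checked numerically. A small sketch follows; the value 0.5 for the sample proportion and the sample sizes are hypothetical and used only for illustration.

```python
import math


def margin_of_error(p_hat, n, z_star=1.96):
    """ME = z* times the standard error of the sample proportion."""
    return z_star * math.sqrt(p_hat * (1 - p_hat) / n)


p_hat = 0.5  # hypothetical sample proportion
base = margin_of_error(p_hat, 400)
doubled = margin_of_error(p_hat, 800)
quadrupled = margin_of_error(p_hat, 1600)

print(f"n = 400:  ME = {base:.4f}")
print(f"n = 800:  ME = {doubled:.4f}")
print(f"n = 1600: ME = {quadrupled:.4f}")
```

Doubling n only shrinks ME by a factor of the square root of 2; it takes four times the sample to cut ME in half.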

Example • Ch. 19, #20 • A city ballot includes a local initiative that would legalize gambling. The issue is hotly contested and two groups decide to conduct polls to predict the outcome. The local newspaper finds that 53% of 1200 randomly selected voters plan to vote “yes”, while a college statistics class finds 54% of 450 randomly selected voters are in support. Both groups will create 95% confidence intervals.

Example a) Without finding the confidence intervals, explain which one will have the larger margin of error. Because the class’s sample size is smaller, its margin of error will be larger.

Example b) Find both confidence intervals. Newspaper: (50.2%, 55.8%) We are 95% confident that the true proportion of people who will vote to legalize gambling is between 50.2% and 55.8%. Class: (49.4%, 58.6%) We are 95% confident that the true proportion of people who will vote to legalize gambling is between 49.4% and 58.6%.

Example c) Which group concludes that the outcome is too close to call? Why? The students should conclude that the outcome is too close to call because 50% is in their interval, meaning that it is plausible that p is 50% or lower. The newspaper’s interval lies entirely above 50%.
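The two intervals in part b) and the comparison in part c) can be reproduced with a few lines of code. This is a sketch: the helper name ci_95 is hypothetical, and z* = 1.96 is used as in the slides.

```python
import math


def ci_95(p_hat, n, z_star=1.96):
    """95% confidence interval for a proportion using z* = 1.96."""
    me = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - me, p_hat + me


# Sample proportions and sample sizes from the exercise.
polls = {"Newspaper": (0.53, 1200), "Class": (0.54, 450)}

for name, (p_hat, n) in polls.items():
    lower, upper = ci_95(p_hat, n)
    # The outcome is too close to call when 50% is inside the interval.
    too_close = lower <= 0.50 <= upper
    print(f"{name}: ({lower:.1%}, {upper:.1%}), too close to call: {too_close}")
```

The class has the larger point estimate but the smaller sample, so its interval is wider and reaches below 50%.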

Cautions about Confidence Intervals • Do NOT suggest that the parameter p varies! • Do NOT imply you are certain about the parameter p! • Be sure to remember that the confidence interval is about the parameter, NOT the sample proportion(s)!

Sample Size and the ME • How precise should our margin of error be? • We know that we cannot be exact, but we don’t want our margin of error to be too large. • If it is too large, it may not be useful.

Sample Size and the ME • There are two ways to adjust our ME. • You can reduce your confidence level. • As you reduce confidence, the value of z* decreases. • However, confidence levels less than 80% are rarely used in real studies. 95% and 99% are more common. • You can change your sample size. • If we look at the equation for ME, we see that changing the sample size will change ME. • In many cases, we may want to know how large a sample we should take to guarantee a certain ME.

Sample Size and the ME • Determining sample size. • We know that ME = z* $\sqrt{\hat{p}\hat{q}/n}$ • We can manipulate this equation with algebra to solve for n: $n = (z^*)^2\,\hat{p}\hat{q} / ME^2$

Sample Size and the ME • This will allow us to calculate the minimum sample size needed to have a certain margin of error. • The worst case scenario, the one that needs the largest sample size, is when p = 0.5. So, if we use this value for $\hat{p}$, we will be safe, meaning that we won’t choose a sample size too small to meet our required margin of error. • If you get a decimal, always round up!

Sample Size and the ME • Example: • Suppose that we want to estimate the proportion of ISU students who like Stat 101 within 3% with 95% confidence. How large a sample is needed? • ME = 0.03, z* = 1.960, and $\hat{p}$ = 0.5 • $n = (1.960)^2(0.5)(0.5)/(0.03)^2 \approx 1067.1$, which rounds up to 1,068.
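The sample size calculation can be sketched in code as follows. The function name required_sample_size is hypothetical; the defaults use z* = 1.96 and the worst-case value 0.5 described above.

```python
import math


def required_sample_size(me, z_star=1.96, p_guess=0.5):
    """Smallest n whose margin of error is at most `me`."""
    n = (z_star ** 2) * p_guess * (1 - p_guess) / me ** 2
    return math.ceil(n)  # always round up to a whole respondent


print(required_sample_size(0.03))  # 1068
```

Passing a value of p_guess other than 0.5 gives a smaller n, which is why 0.5 is the safe choice when nothing is known about p in advance.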
