Confidence Interval for the Population Mean.
What a way to start a section of notes, but anyway. Imagine you are at ground level in front of my house at the curb. The picture below is the view of a sprinkler turned on full blast. The one flaw in the picture is that a real sprinkler does not shoot in both directions at once. It shoots left and then right. I show both for illustrative purposes.
When I put my sprinkler in the center of my yard I can cover the middle 95% of the front yard. Let’s think about an experiment we could undertake. Say you are outside my house late at night when all the lights are out and you are blindfolded. Then we spin you around a lot. Your job then is to put the sprinkler down in the yard. What is the probability that the center of the yard will get wet? Did you say 95%? Sure you did and here is why. If, when in the center, 95% of the yard can be hit, then putting the sprinkler at different places in the yard would mean that 95% of the time the center of the yard should be hit. Hope this helps you understand confidence intervals. If not, well, sorry.
Overview • In this section we study one of the two basic inference methods: confidence intervals (hypothesis testing is the other). • Confidence intervals are used when our interest is estimating an unknown population parameter.
Another story Say there are five people in a room and the ages of the people are 18, 19, 20, 21, 22. If this is a population, the population mean is 20. Now let’s think about samples of size 2. Say I got the first two people – 18, and 19. The sample mean would be 18.5 – this is not the population mean. Some samples of size two will have sample mean = population mean, some won’t. In the real world we do not know the population mean, but the properties of the distribution of sample means help us learn about the population.
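To see how often a sample of size two hits the population mean, here is a minimal sketch that lists every possible pair from the five ages. It assumes the two people are drawn without replacement; the variable names are my own, not from the notes.

```python
from itertools import combinations
from statistics import mean

ages = [18, 19, 20, 21, 22]      # the whole population from the story
population_mean = mean(ages)     # 20

# Every possible sample of size 2 (assumed drawn without replacement)
# and the mean of each one.
sample_means = [mean(pair) for pair in combinations(ages, 2)]

# How many samples land exactly on the population mean?
hits = sum(1 for m in sample_means if m == population_mean)

# The average of all the sample means.
mean_of_sample_means = mean(sample_means)

print(len(sample_means), hits, mean_of_sample_means)
```

Only 2 of the 10 possible samples have a mean of exactly 20, yet the sample means average out to the population mean.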
From my example you see that sometimes the sample mean will not be the population mean. So, a confidence interval builds in a margin of error around our point estimate in the hope that the interval will include the population mean. The way we calculate the interval is 1) Take the sample mean. 2) Calculate another value, which I will explain later. 3) Get two numbers: the sample mean minus the other value, and the sample mean plus the other value. This interval, from a low value to a high value, is hoped to contain the true unknown population mean.
From the last slide I now reiterate some ideas. The line represents sample means. In our sample we get the one represented by the vertical marker. Then we calculate another value, which I show you later. Subtract this number from the sample mean to get the lower limit of the interval, and add it to the sample mean to get the upper limit. (Figure: a number line showing, from left to right, the lower limit, the sample mean, and the upper limit.)
Overview • An example of when we do confidence intervals is when we want to estimate the unknown population mean. • The inference is based on the sampling distribution of the statistic of interest.
Overview • In a previous section we saw that the sampling distribution of sample means has properties based on the parameters of the population. Namely, the sampling distribution of sample means • 1) has a normal distribution (at least approximately, when the sample size is large enough or the population itself is normal) • 2) has the same mean as the mean of the population from which the sample is drawn, and • 3) has a standard error equal to the standard deviation of the population from which the sample was drawn divided by the square root of the sample size. • These properties will be exploited in this section.
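Properties 2 and 3 can be written compactly in symbols. The notation here is an assumption on my part, since the notes state the properties in words: μ and σ are the population mean and standard deviation, n is the sample size, and the subscript x-bar marks the sampling distribution of the sample mean.

```latex
\[
\mu_{\bar{x}} = \mu,
\qquad
\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}
\]
```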
Overview Note: the distribution of sample means is “thinner” than the distribution of the variable in the population because of property 3 on the previous screen. (Figure: two curves with the same center, one for the quantitative variable in the population and a narrower one for sample means. The center is the mean value of the variable in the population as well as the mean of the sampling distribution. Two values are marked on the narrower curve: one standard error on the low side of the mean and one standard error on the high side of the mean.)
Estimating with confidence - overview • When we do not know the value of a population parameter, we may want to estimate it. • The population mean is estimated by the sample mean – in fact we saw before that the sample mean is a point estimate of the population mean.
Estimating with confidence - overview • Now, when we look just at the sampling distribution of the sample mean, we know this is the long run pattern of the sample mean. • This is similar to the idea that we don’t know what will come up on the next flip of a coin, but we know heads will come up 50% of the time.
Estimating with confidence - confidence interval • A property we learned earlier, combined with our more precise notion of the 68 - 95 - 99.7 rule, is that 95% of sample means lie within 1.96 standard errors of the mean. • Imagine 1.96 standard errors is the length of my sprinkler in one direction. If in the center 95% of the yard can get wet, then by putting the sprinkler at other parts of the yard the center will get wet 95% of the time.
Estimating with confidence - confidence interval This is a visual of where the middle 95% of sample means will fall. (Figure: the distribution of sample means. The mean of the distribution of sample means is the population mean. Start at the population mean and subtract 1.96 times the standard error to reach the left end of the middle 95%. Start at the population mean and add 1.96 times the standard error to reach the right end.)
Estimating with confidence - confidence interval Even if we do not know the value of the population mean, the center of the distribution of sample means will still be located at the population mean. 95% of the sample means will be within 1.96 standard errors from the mean. 1.96 standard errors can be thought of as a distance. We will use this distance to help us make up an interval where we think the unknown population mean will be located.
Estimating with confidence - confidence interval • To get a confidence interval for the unknown population mean we • 1. Calculate the sample mean. • 2. Calculate 1.96 standard errors. • 3. Take the sample mean and subtract 1.96 standard errors. Take the sample mean and add 1.96 standard errors.
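The three steps can be collected into one expression. The symbols are my own shorthand and not from the slides: x-bar is the sample mean, σ is the population standard deviation (assumed known), and n is the sample size, so σ/√n is one standard error.

```latex
\[
\left( \bar{x} - 1.96\,\frac{\sigma}{\sqrt{n}},\;\;
       \bar{x} + 1.96\,\frac{\sigma}{\sqrt{n}} \right)
\]
```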
Estimating with confidence - confidence interval This is the same slide as on slide 14, but with new information. (Figure: the distribution of sample means, with one example sample mean marked.) 95% of the sample means are within 1.96 standard errors of the true mean. Using the same 1.96 standard errors as a length, when we get a sample mean and place the same interval around it, the interval should contain the unknown population mean 95% of the time.
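The 95% claim can be checked by simulation, much like dropping the sprinkler in the yard many times. This is only a sketch under assumptions I made up for illustration: a normal population with mean 50 and standard deviation 10, samples of size 25, and 4000 repetitions. The function name `coverage_rate` is hypothetical.

```python
import random
from math import sqrt

def coverage_rate(mu=50.0, sigma=10.0, n=25, trials=4000, seed=1):
    """Fraction of intervals (x_bar +/- 1.96 standard errors) containing mu.

    mu, sigma, n, trials and seed are illustrative values, not from the notes.
    """
    rng = random.Random(seed)
    margin = 1.96 * sigma / sqrt(n)   # the "sprinkler length"
    covered = 0
    for _ in range(trials):
        # Draw one sample and compute its mean.
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = sum(sample) / n
        # Does the interval around this sample mean reach the true mean?
        if x_bar - margin <= mu <= x_bar + margin:
            covered += 1
    return covered / trials

rate = coverage_rate()
print(f"Share of intervals that contain the population mean: {rate:.3f}")
```

The printed share should land close to 0.95, though it will not be exactly 0.95 because the simulation itself is random.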
Example Say a company is interested in customer satisfaction. It has created a survey such that from the consumer the company gets a score that measures satisfaction. The company would like to know the population mean score. Say the population standard deviation is 20 (this is a heroic assumption, but let’s use it) and a sample of size 100 has been taken. The standard error of the sampling distribution of sample means is then 20 divided by the square root of 100, or 20/10 = 2. 1.96 standard errors is then 1.96(2) = 3.92.
Estimating with confidence Say the sample mean (x bar) = 82 (I pulled this number out of thin air; in problems given to you the context will dictate what to use). The 95% confidence interval is found by two calculations: 1) the low value of the interval is 82 - 3.92 = 78.08 and 2) the high value of the interval is 82 + 3.92 = 85.92. The interval is typically reported by writing (78.08, 85.92). The way we report what this interval means is to say: “We can be 95% confident that the true unknown population mean is in the interval (78.08, 85.92).” But what we really mean is, “we got these numbers by a method that gives correct results 95% of the time.”
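The arithmetic of this example can be reproduced in a few lines. This is a minimal sketch; the function name `z_interval` and its parameter names are my own, and it assumes the population standard deviation is known, as the example does.

```python
from math import sqrt

def z_interval(x_bar, sigma, n, z_star=1.96):
    """Return (low, high) for a confidence interval around x_bar.

    Assumes the population standard deviation sigma is known.
    """
    standard_error = sigma / sqrt(n)   # 20 / 10 = 2 in the example
    margin = z_star * standard_error   # 1.96 * 2 = 3.92 in the example
    return x_bar - margin, x_bar + margin

low, high = z_interval(x_bar=82, sigma=20, n=100)
print(f"({low:.2f}, {high:.2f})")      # prints (78.08, 85.92)
```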
Estimating with confidence - critical z • The z of 1.96 was the z to get a 95% confidence interval. 1.96 is called the critical z, or z*. (Figure: a normal curve split into areas of .025, .475, .475, and .025 from left to right.) .025 is the area to the right of the critical z = 1.96. Let’s call this area to the right of the critical z the upper p critical value.
Estimating with confidence - critical z • What if we want a 90% confidence interval? (Figure: a normal curve split into areas of .05, .45, .45, and .05 from left to right.) The z we should use is 1.645.
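If a z table is not handy, the critical z can be looked up from the upper tail area. The sketch below uses `NormalDist` from Python's standard `statistics` module; the helper name `critical_z` is hypothetical, and the confidence level is passed as a proportion such as 0.95.

```python
from statistics import NormalDist

def critical_z(confidence):
    """Critical z for a two-sided interval; confidence is a proportion."""
    upper_tail = (1 - confidence) / 2          # .025 for 95%, .05 for 90%
    # The z value with area (1 - upper_tail) to its left.
    return NormalDist().inv_cdf(1 - upper_tail)

print(round(critical_z(0.95), 2))    # prints 1.96
print(round(critical_z(0.90), 3))    # prints 1.645
```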
Estimating with confidence Let’s redo the example we did before, but do a 90% confidence interval. Sample mean (x bar) = 82. Standard error = 2, and 1.645 standard errors is 3.29. The 90% confidence interval is found by two calculations: 1) the low value of the interval is 82 - 3.29 = 78.71 and 2) the high value of the interval is 82 + 3.29 = 85.29. Or (78.71, 85.29).
Estimating with confidence Note that when we went from a 95% to a 90% interval the interval shrank. The 90% interval leaves us less confident and we get a smaller interval. This also means a 95% interval leaves us more confident and gives us a bigger interval. If you want to be 100% sure the interval includes the unknown mean, guess the interval is between minus infinity and infinity. You can be sure the number is in that rather large interval, but this is not very practical.
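To put numbers on the trade-off, here is a small sketch that compares the full width of the 90% and 95% intervals from the customer satisfaction example (population standard deviation 20, sample size 100). The function name `interval_width` is my own.

```python
from math import sqrt

def interval_width(sigma, n, z_star):
    """Full width of the interval: two margins of error."""
    return 2 * z_star * sigma / sqrt(n)

width_90 = interval_width(sigma=20, n=100, z_star=1.645)
width_95 = interval_width(sigma=20, n=100, z_star=1.96)

print(f"90% width: {width_90:.2f}")   # prints 90% width: 6.58
print(f"95% width: {width_95:.2f}")   # prints 95% width: 7.84
```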
99% confidence interval The z to use if you want a 99% confidence interval is 2.575 (2.576 if you round the exact value to three decimals), so ignore what the dude says on the soundtrack.
Estimating with confidence - summary • A C% confidence interval means we can be C% confident the unknown parameter lies within z* standard errors of the sample mean. • This really means we arrived at these numbers by a method that gives correct results C% of the time. • Here C is the Confidence Coefficient.
Level of significance - alpha The book uses the Greek letter alpha to stand for what is called the level of significance. Alpha = 1 - Confidence Coefficient. Well, if we are, for example, 95% confident the interval includes the unknown mean, then alpha is 5%: the method we used produces an interval that misses the unknown mean 5% of the time.