Sampling Distributions
OBJECTIVE Set Up a Sampling Distribution, CLT, & Applications
RELEVANCE To see how sampling can be used to predict population values.
The U.S. census is only done once every 10 years because it is impractical to do it often. • Therefore, the sample becomes very important.
The Sampling Issue…… • The goal of the survey is to get the same results that would be obtained if every member of the population had answered. • It is important that every member of the population has an equal chance of being chosen.
Activity…… • Get in groups of 4. • Find the mean height of your group. • Now find the mean height of the class. • Is the class mean the same as your group’s mean?
Sample Mean vs. Population Mean…… • Not every sample mean will be the same as the population mean, but if you take good samples the means will be very close.
What is a Sampling Distribution? Sampling Distribution: the distribution of the values of a sample statistic (such as the sample mean) obtained from repeated samples, all of the same size and all drawn from the same population.
Example…… • Consider the following set: {0,2,4,6,8}. a. Make a list of all possible samples of size 2 that can be drawn from this set (sampling with replacement, counting order). b. Construct a sampling distribution of the sample means for samples of size 2. c. Graph the histogram of the population and of the sampling distribution. What do you notice?
a. {0,2,4,6,8} Sets of 2
(0,0) (2,0) (4,0) (6,0) (8,0)
(0,2) (2,2) (4,2) (6,2) (8,2)
(0,4) (2,4) (4,4) (6,4) (8,4)
(0,6) (2,6) (4,6) (6,6) (8,6)
(0,8) (2,8) (4,8) (6,8) (8,8)
b. 1st find the means for each sample……
(0,0) 0   (2,0) 1   (4,0) 2   (6,0) 3   (8,0) 4
(0,2) 1   (2,2) 2   (4,2) 3   (6,2) 4   (8,2) 5
(0,4) 2   (2,4) 3   (4,4) 4   (6,4) 5   (8,4) 6
(0,6) 3   (2,6) 4   (4,6) 5   (6,6) 6   (8,6) 7
(0,8) 4   (2,8) 5   (4,8) 6   (6,8) 7   (8,8) 8
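Sample Space • Notice that each of these 25 samples is equally likely to occur. • Therefore, the probability of each sample is 1/25 = 0.04. • A sample mean produced by several samples is more likely: for example, a mean of 4 comes from 5 of the samples, so its probability is 5/25 = 0.20.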
The sampling distribution of the sample means (SDSM) Notice it is symmetric and mound-shaped, already roughly NORMAL even though the population itself is flat!
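The enumeration in this example can be sketched in a few lines of Python. This is only an illustration, assuming samples are drawn with replacement and that order counts (as in the list of 25 pairs); the variable names are made up for this sketch.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

population = [0, 2, 4, 6, 8]

# All ordered samples of size 2, drawn with replacement (5 * 5 = 25 samples).
samples = list(product(population, repeat=2))

# Count how many of the 25 samples produce each sample mean.
mean_counts = Counter(Fraction(a + b, 2) for a, b in samples)

# Each sample has probability 1/25, so P(mean) = count / 25.
sampling_distribution = {
    mean: Fraction(count, len(samples))
    for mean, count in sorted(mean_counts.items())
}

for mean, prob in sampling_distribution.items():
    print(f"sample mean = {mean}: P = {prob} ({float(prob):.2f})")
```

The probabilities rise from 1/25 at the extremes to 5/25 at the center, which is the mound shape seen in the histogram.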
Example – You Try…… • Let’s say I picked out all the grades for the last quiz that were either 57, 67, 77, 87, or 97 and put them in a pile. Find every possible combination of quiz grades I could get if I picked 2 quizzes from this pile. • NOTE: There will be 25 possible combinations.
Now let's find the mean for each pair
(57, 57) (67, 57) (77, 57) (87, 57) (97, 57)
(57, 67) (67, 67) (77, 67) (87, 67) (97, 67)
(57, 77) (67, 77) (77, 77) (87, 77) (97, 77)
(57, 87) (67, 87) (77, 87) (87, 87) (97, 87)
(57, 97) (67, 97) (77, 97) (87, 97) (97, 97)
There are 25 possible combinations (the mean of each pair is shown beneath it)
(57, 57) (67, 57) (77, 57) (87, 57) (97, 57)
   57       62       67       72       77
(57, 67) (67, 67) (77, 67) (87, 67) (97, 67)
   62       67       72       77       82
(57, 77) (67, 77) (77, 77) (87, 77) (97, 77)
   67       72       77       82       87
(57, 87) (67, 87) (77, 87) (87, 87) (97, 87)
   72       77       82       87       92
(57, 97) (67, 97) (77, 97) (87, 97) (97, 97)
   77       82       87       92       97
Each pair has a 1/25 chance of being selected. • Let’s make a chart.
SDSM…… If all possible random samples, each of size n, are taken from any population with mean $\mu$ and st. deviation $\sigma$, then the SDSM will:
• Have a sampling distribution mean equal to the population mean: $\mu_{\bar{x}} = \mu$
• Have a sampling distribution standard deviation equal to the population st. dev. divided by the square root of the sample size: $\sigma_{\bar{x}} = \sigma / \sqrt{n}$
The shape of the distribution…… If the population has a normal distribution, then the sampling distribution of the sample means will also be normal. If the population is NOT a normal distribution, then the Central Limit Theorem tells us that the sampling distribution will still be approximately normal, provided the sample size is large enough.
The CLT…… • Definition: The SDSM will more closely resemble the normal distribution as the sample size increases. • The CLT can be used to answer questions about sample means in the same manner that the normal distribution can be used to answer questions about individual values. • The CLT is used when the sampled population is NOT normal. The sampling distribution will be approximately normal when the sample size is large enough (n ≥ 30 is a common rule of thumb).
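A small simulation can make the CLT statement concrete. The sketch below is an illustration only: it assumes a strongly right-skewed population (an exponential distribution, chosen here purely as an example) and uses skewness as a rough measure of how far the distribution of sample means is from the symmetric normal shape. The function names and the number of trials are made up for this sketch.

```python
import random
import statistics

def skewness(values):
    """Skewness of a list of numbers; about 0 for a symmetric shape like the normal."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)

def simulated_sample_means(n, trials, rng):
    """Means of `trials` random samples of size n from an exponential population."""
    return [
        sum(rng.expovariate(1.0) for _ in range(n)) / n
        for _ in range(trials)
    ]

rng = random.Random(0)  # fixed seed so the run is repeatable
skew_by_n = {}
for n in (2, 10, 30):
    means = simulated_sample_means(n, trials=5000, rng=rng)
    skew_by_n[n] = skewness(means)
    print(f"n = {n:2d}: skewness of the sample means = {skew_by_n[n]:.2f}")
```

The skewness should shrink toward 0 as n grows, which is the CLT at work: the sample means look more and more normal even though the population does not.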
The Standard Error of the Mean…… • The symbol used to represent the standard deviation of the sample means, also known as the standard error of the mean, is $\sigma_{\bar{x}}$.
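The SDSM follows these rules…….
1. The mean of the sample means equals the population mean: $\mu_{\bar{x}} = \mu$
2. The standard deviation of the sample means is $\sigma_{\bar{x}} = \sigma / \sqrt{n}$. This measures the spread. (Note: “n” is the size of each sample)
3. Shape: a. A normal parent population produces a normal sampling distribution. b. When the parent population is NOT normal, the CLT says the sampling distribution is approximately normal if the sample size is large enough.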
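The first two rules can be checked directly on a small population by listing every possible sample. The sketch below is an illustration, assuming the population {0,2,4,6,8} from the earlier example and samples of size 2 drawn with replacement; variable names are chosen for this sketch.

```python
import math
import statistics
from itertools import product

population = [0, 2, 4, 6, 8]
n = 2  # size of each sample

mu = statistics.fmean(population)      # population mean
sigma = statistics.pstdev(population)  # population standard deviation

# Mean of every possible ordered sample of size n (with replacement).
sample_means = [statistics.fmean(s) for s in product(population, repeat=n)]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.pstdev(sample_means)

print(f"mean of sample means = {mean_of_means:.4f}, mu = {mu:.4f}")
print(f"sd of sample means   = {sd_of_means:.4f}, "
      f"sigma / sqrt(n) = {sigma / math.sqrt(n):.4f}")
```

Both pairs of numbers agree, which matches rules 1 and 2.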
Let’s show how this works using an example….. • Consider all possibilities of sample size 2 of {2,4,6}. Find the probability distribution of the population with the histogram and then find the sampling distribution of the sample means and draw the histogram.
Now, let’s do a sampling distribution of sets of 2 from this population we just described.
The sets of 2 and their means……
(2,2) 2   (4,2) 3   (6,2) 4
(2,4) 3   (4,4) 4   (6,4) 5
(2,6) 4   (4,6) 5   (6,6) 6
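Sampling Distribution…… • Find the mean of the sampling distribution: $\mu_{\bar{x}} = \mu = (2 + 4 + 6)/3 = 4$ • Find the st. dev. of the sampling dist: the population has $\sigma = \sqrt{8/3} \approx 1.63$, so $\sigma_{\bar{x}} = \sigma / \sqrt{n} = 1.63 / \sqrt{2} \approx 1.15$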
The Histogram…… Now, take a look at the shape of the histogram of the sampling distribution. It is symmetric and mound-shaped, so it is already roughly normal.
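Since the histograms themselves are not reproduced here, the sketch below prints rough text versions of both. It is only an illustration, assuming samples of size 2 drawn with replacement from {2,4,6}; the helper name `text_histogram` is made up for this sketch.

```python
from collections import Counter
from itertools import product

population = [2, 4, 6]

def text_histogram(values):
    """Return one line per distinct value: the value, a bar, and its probability."""
    counts = Counter(values)
    total = len(values)
    return [
        f"{value:>4}  {'#' * counts[value]:<4} {counts[value]}/{total}"
        for value in sorted(counts)
    ]

# Mean of every ordered sample of size 2 (with replacement): 9 samples in all.
sample_means = [(a + b) / 2 for a, b in product(population, repeat=2)]

print("Population:")
print("\n".join(text_histogram(population)))
print("Sample means (n = 2):")
print("\n".join(text_histogram(sample_means)))
```

The population histogram is flat, while the histogram of the sample means peaks in the middle.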
Sample Question • A certain population has a mean of 437 and a standard deviation of 63. Many samples of size 49 are randomly selected and the means are calculated. • A. What value would you expect to find for the mean of all these samples? • B. What value would you expect to find for the st. deviation of all these samples? • C. What shape would you expect the distribution of all these sample means to have?
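One way to check answers to A and B is to apply the two SDSM rules directly. The sketch below is an illustration with variable names chosen here; the comment on shape uses the common n ≥ 30 rule of thumb, which is an assumption rather than something stated in the question.

```python
import math

mu = 437    # population mean
sigma = 63  # population standard deviation
n = 49      # size of each sample

# A. The mean of the sample means equals the population mean.
mean_of_sample_means = mu

# B. The standard deviation of the sample means is sigma / sqrt(n).
standard_error = sigma / math.sqrt(n)

# C. With n = 49 (above the usual n >= 30 rule of thumb), the CLT suggests
#    the distribution of sample means is approximately normal.
print(f"A. mean of sample means = {mean_of_sample_means}")
print(f"B. standard error       = {standard_error}")
```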
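Remember…… • Use “normalcdf(lower z, upper z)” (abbreviated “ncdf” in these notes) to find the area, or probability, under the curve. • Change all “real” values to z-scores unless the mean is 0 and the st. deviation is 1. • Mean of the sample means = population mean: $\mu_{\bar{x}} = \mu$ • St. Error of the Mean: $\sigma_{\bar{x}} = \sigma / \sqrt{n}$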
Why is Sample Size Important? What happens as the sample size increases? Answer: As the sample size increases, the standard deviation of the sample means (the standard error) decreases. This means that the sample means vary less from sample to sample, so each one is likely to be closer to the population mean. Remember, less variation is better. • Larger sample size: smaller variation • Smaller sample size: larger variation
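The effect of sample size on the standard error can be seen numerically. The sketch below is an illustration, assuming a hypothetical population standard deviation of 20 and a few sample sizes picked for convenience.

```python
import math

sigma = 20  # hypothetical population standard deviation
sample_sizes = [4, 16, 64, 256]

# Standard error of the mean for each sample size: sigma / sqrt(n).
standard_errors = [sigma / math.sqrt(n) for n in sample_sizes]

for n, se in zip(sample_sizes, standard_errors):
    print(f"n = {n:3d}: standard error = {se}")
```

Each time the sample size is multiplied by 4, the standard error is cut in half, because n sits under a square root.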
Example……Follow the steps • A normal population has a population mean of 100 and a population st. deviation of 20. If a sample of size 16 is selected, what is the probability that this sample will have a mean value between 90 and 110? • Draw the normal distribution curve and shade it. • You need to change 90 and 110 to z-scores. • Then use normalcdf(z, z) to find the probability.
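The z-score formula will be a little bit different now because the population st. deviation must be replaced by the standard error of the mean. You now use $z = \dfrac{\bar{x} - \mu}{\sigma / \sqrt{n}}$. Let’s change the mean values of 90 and 110: $z = \dfrac{90 - 100}{20 / \sqrt{16}} = -2$ and $z = \dfrac{110 - 100}{20 / \sqrt{16}} = 2$.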
Now use normalcdf from where you started shading to where you stopped shading: ncdf(-2,2) = 0.9545
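For readers without a graphing calculator, the same steps can be sketched in Python. This is an illustration only: it uses `statistics.NormalDist` from the standard library (Python 3.8+) in place of the calculator's normalcdf, and the variable names are chosen here.

```python
import math
from statistics import NormalDist

mu, sigma, n = 100, 20, 16

# Standard error of the mean: sigma / sqrt(n).
standard_error = sigma / math.sqrt(n)

# Convert the two sample-mean values to z-scores.
z_low = (90 - mu) / standard_error
z_high = (110 - mu) / standard_error

# Area under the standard normal curve between the two z-scores.
standard_normal = NormalDist()  # mean 0, standard deviation 1
probability = standard_normal.cdf(z_high) - standard_normal.cdf(z_low)

print(f"z-scores: {z_low}, {z_high}")
print(f"P(90 < sample mean < 110) = {probability:.4f}")
```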
Example……You Try • Kindergarten children have heights that are approximately normally distributed with a population mean of 39 inches and a population standard deviation of 2 inches. A sample of 25 is taken. What is the probability that this sample will have a mean value between 38.5 inches and 40 inches?
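After working the problem by hand, the result can be checked with a small helper. The sketch below is an illustration; the function name `prob_sample_mean_between` is made up here, and it assumes the sample means are normally distributed with standard deviation sigma / sqrt(n).

```python
import math
from statistics import NormalDist

def prob_sample_mean_between(mu, sigma, n, low, high):
    """P(low < sample mean < high), assuming the sample means are normal."""
    standard_error = sigma / math.sqrt(n)
    sample_mean_dist = NormalDist(mu, standard_error)
    return sample_mean_dist.cdf(high) - sample_mean_dist.cdf(low)

# Kindergarten heights: mu = 39 in, sigma = 2 in, sample of 25 children.
p = prob_sample_mean_between(mu=39, sigma=2, n=25, low=38.5, high=40)
print(f"P(38.5 < sample mean < 40) = {p:.4f}")
```

Here the standard error is 2/5 = 0.4, so the two limits correspond to z = -1.25 and z = 2.5.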
Cutoff Example • If the population mean of a distribution is 39 and the population st. deviation is 2, within what limits does the middle 90% of sample means fall for a sample of 100? • Hint: This is a cutoff score in the middle. First, you find the z-scores. Next, you substitute them back into the z-score formula.
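Answer…… Find the z-score for the middle 90%: z = InvNorm(.5 - .90/2) = InvNorm(.05), so z = ±1.645 (±1.64 if rounded down). Now, plug these into the formula with the new standard deviation for a sample: $\bar{x} = \mu \pm z \cdot \sigma / \sqrt{n} = 39 \pm 1.645 \cdot (2 / \sqrt{100}) \approx 39 \pm 0.33$, so the limits are about 38.67 and 39.33.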
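The cutoff calculation can also be sketched in Python. This is an illustration only: `NormalDist.inv_cdf` from the standard library plays the role of the calculator's InvNorm, and the variable names are chosen here.

```python
import math
from statistics import NormalDist

mu, sigma, n = 39, 2, 100
middle = 0.90  # the middle 90% of sample means

# Standard error of the mean: sigma / sqrt(n).
standard_error = sigma / math.sqrt(n)

# z-score that leaves (1 - 0.90) / 2 = 0.05 in the upper tail.
z = NormalDist().inv_cdf(0.5 + middle / 2)

# Substitute back: sample mean = mu +/- z * standard error.
lower = mu - z * standard_error
upper = mu + z * standard_error

print(f"z = +/- {z:.3f}")
print(f"middle 90% of sample means: {lower:.2f} to {upper:.2f}")
```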