Chapter 8 Sampling Distribution of a Statistic
Binomial Distribution • A binomial random variable X is the total number of successes in n independent Bernoulli trials, where the probability of success on each trial is p. We say X is B(n, p).
Approximating the Binomial with the Normal • We can use the normal distribution to approximate the binomial when np ≥ 5 and nq ≥ 5. • If X is B(n, p) and np ≥ 5 and nq ≥ 5, then X can be approximated by $N(np, \sqrt{npq})$, a normal distribution with mean $np$ and standard deviation $\sqrt{npq}$.
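As a rough check on this approximation, the sketch below compares an exact binomial probability with its normal counterpart. The values n = 50, p = 0.3, and k = 12 are hypothetical, chosen only so that np and nq are both at least 5; the helper names `binom_cdf` and `normal_cdf` are illustrative.

```python
import math

def binom_cdf(k, n, p):
    """Exact P(X <= k) for X ~ B(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def normal_cdf(x, mu, sigma):
    """P(Y <= x) for Y normal with mean mu and standard deviation sigma."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

# Hypothetical values: np = 15 and nq = 35, so both conditions hold.
n, p = 50, 0.3
q = 1 - p
mu = n * p                      # mean of the binomial
sigma = math.sqrt(n * p * q)    # standard deviation of the binomial

k = 12
exact = binom_cdf(k, n, p)
approx = normal_cdf(k, mu, sigma)
print(f"exact P(X <= {k}) = {exact:.4f}")
print(f"normal approximation = {approx:.4f}")
```

The two numbers should be close but not identical; the agreement generally improves as np and nq grow.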
Homework 15 • Read: pages 499-504, 509-510, 522, 525 • LDI: 8.5, 8.6, 8.7 (LDI 8.1 and 8.2 are EC) • EX: 8.1, 8.4, 8.7, 8.8, 8.11, 8.12
Definition • The sampling distribution of a statistic is the distribution of values of the statistic in all possible samples of the same size n taken from the same population.
Proportion of Women • In rural China, many villages are experiencing a lack of women. This suggests that the proportion of women in the population is less than 50%. • We want to estimate the proportion of women in rural villages in China, so we’ll take a sample.
What Do We Expect of Sample Proportions? • The values of the sample proportion vary from random sample to random sample in a predictable way. • The shape of the distribution of the sample proportion is approximately symmetric and bell-shaped. • BISIM
What Do We Expect of Sample Proportions? • The center of the distribution of the sample proportion values is at the true population proportion p. • With a larger sample size n, the sample proportion values tend to be closer to the true population proportion p. The values vary less around p.
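These expectations can be explored with a small simulation. The sketch below is not the BISIM program mentioned in the slides; it is a stand-in that assumes a hypothetical true proportion p = 0.5 and draws 2000 random samples at each of three sample sizes.

```python
import random
import statistics

def sample_proportions(p, n, reps, rng):
    """Simulate `reps` sample proportions, each from n Bernoulli(p) trials."""
    return [sum(rng.random() < p for _ in range(n)) / n for _ in range(reps)]

rng = random.Random(1)   # fixed seed so the run is repeatable
p = 0.5                  # hypothetical true population proportion

results = {}
for n in (25, 100, 400):
    phats = sample_proportions(p, n, 2000, rng)
    results[n] = (statistics.mean(phats), statistics.stdev(phats))
    print(f"n = {n:3d}: mean of p-hats = {results[n][0]:.3f}, "
          f"SD of p-hats = {results[n][1]:.3f}")
```

The mean of the simulated p-hats should sit near p at every sample size, while their standard deviation should shrink as n grows.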
The Definition of p-hat • Let X be a binomial RV that is the number of successes out of n trials; then $\hat{p} = X/n$. • Also recall the definitions of μ and σ for a binomial distribution: $\mu = np$ and $\sigma = \sqrt{npq}$.
Linear Transformation • Notice that the distribution of p-hat is just a linear transformation of the binomial distribution. The random variable now takes on the values $0/n, 1/n, 2/n, \ldots, n/n$.
So, what are the mean and SD of the distribution of p-hat? If we are going to model this distribution with a normal distribution, then we need to know the values of mu and sigma. • Recall the rules for linear transformations.
Linear Transformation Rules • If X represents the original values, $\bar{x}$ is the average of the original values, $s_X$ is the standard deviation of the original values, and the new values are a linear transformation of X, $Y = aX + b$, then the new mean is given by $\bar{y} = a\bar{x} + b$ and the new standard deviation by $s_Y = |a|\,s_X$.
Normal Approximation of Binomial • We can use the normal distribution to approximate the binomial when np ≥ 5 and nq ≥ 5. • If X is B(n, p) and np ≥ 5 and nq ≥ 5, then X can be approximated by $N(np, \sqrt{npq})$, a normal distribution with mean $np$ and standard deviation $\sqrt{npq}$.
Normal Approximation of p-hat (page 519) • If n is sufficiently large (np≥5 and nq≥5), the distribution of p-hat will be approximately normal.
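The mean and standard deviation of this normal model follow from the linear transformation rules. A sketch of the derivation, assuming X is B(n, p) with mean $np$ and standard deviation $\sqrt{npq}$, and writing $\hat{p} = X/n$ as $aX + b$ with $a = 1/n$ and $b = 0$:

```latex
\begin{align*}
\mu_{\hat{p}}    &= a\,\mu_X + b = \frac{1}{n}\,(np) = p \\
\sigma_{\hat{p}} &= |a|\,\sigma_X = \frac{1}{n}\sqrt{npq} = \sqrt{\frac{pq}{n}} \\
\hat{p}          &\approx N\!\left(p,\ \sqrt{\frac{pq}{n}}\right)
                   \quad \text{when } np \ge 5 \text{ and } nq \ge 5
\end{align*}
```

Here the second argument of $N(\cdot,\cdot)$ is read as a standard deviation, matching the use of mu and sigma elsewhere in these slides.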
p and q • Keep in mind that p + q = 1 • Also that q = 1 – p
Example 8.3 • Suppose that, of all voters in a state, 30% are in favor of Proposition A. • If we sample 400 voters, what is the approximate probability that fewer than 25% will be in favor of Proposition A? • What is the approximate probability that the proportion of sampled voters in favor will be between 25% and 35%? • Let X = the number of voters out of 400 that are in favor of Proposition A. What is the exact distribution of X and the approximate distribution of X?
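One way to work Example 8.3 numerically is sketched below, assuming the normal model for p-hat with mean p and standard deviation $\sqrt{pq/n}$. The helper `normal_cdf` is an illustrative name, not something from the textbook.

```python
import math

def normal_cdf(x, mu, sigma):
    """P(Y <= x) for Y normal with mean mu and standard deviation sigma."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

p, n = 0.30, 400
q = 1 - p

# The conditions np >= 5 and nq >= 5 hold: np = 120, nq = 280.
sd_phat = math.sqrt(p * q / n)     # standard deviation of p-hat

# P(p-hat < 0.25)
p_less = normal_cdf(0.25, p, sd_phat)

# P(0.25 < p-hat < 0.35)
p_between = normal_cdf(0.35, p, sd_phat) - normal_cdf(0.25, p, sd_phat)

# X itself is exactly B(400, 0.30) and approximately normal with:
mu_x = n * p                       # mean 120
sd_x = math.sqrt(n * p * q)        # standard deviation sqrt(84)

print(f"SD of p-hat          = {sd_phat:.4f}")
print(f"P(p-hat < 0.25)      = {p_less:.4f}")
print(f"P(0.25 < p-hat < .35) = {p_between:.4f}")
print(f"X is approx. normal with mean {mu_x:.0f} and SD {sd_x:.2f}")
```

Under this model the first probability comes out near 0.015 and the second near 0.97, since 25% and 35% are each a little more than two standard deviations from 30%.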
Let’s Do It • LDI 8.5: Page 521
68-95-99.7 Rule • What percentage of p-hats fall within 2 standard deviations of the mean, p? • About 95% of all random samples should result in a sample proportion p-hat that is within two standard deviations of the population proportion p.
Works Both Ways • If 95% of the p-hats are within 2 standard deviations of p, then 95% of the time p should be within 2 standard deviations of p-hat. • Standard Error: $SE(\hat{p}) = \sqrt{\hat{p}(1-\hat{p})/n}$
Standard Error • In practice we do not know the standard deviation of the sampling distribution of p-hat, because it depends on the unknown p. So, we have to estimate it with the standard error. In this case it is an estimate of the average distance of the possible p-hat values from the population proportion p.
Basic Idea • We are quite confident that the true population proportion is in the interval that is plus or minus two standard errors of p-hat.
Example 8.4 • Suppose that a random sample of 400 voters yields a p-hat of 28%. Calculate the standard error and interpret it. • Knowing that the true population proportion p is usually within two standard errors of the observed proportion p-hat, give the interval of values that we can be quite confident contains the true population proportion of voters who favor Proposition A.
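The arithmetic for Example 8.4 can be sketched as follows, assuming the standard error of p-hat is estimated by $\sqrt{\hat{p}(1-\hat{p})/n}$ and the interval is p-hat plus or minus two standard errors.

```python
import math

p_hat, n = 0.28, 400

# Standard error: the estimated standard deviation of p-hat.
se = math.sqrt(p_hat * (1 - p_hat) / n)

# Interval of plus or minus two standard errors around p-hat.
lower = p_hat - 2 * se
upper = p_hat + 2 * se

print(f"standard error = {se:.4f}")
print(f"interval       = ({lower:.3f}, {upper:.3f})")
```

The standard error is about 0.022, so the interval runs from roughly 23.5% to 32.5%.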
What it does not say! • Note again that the 95% here is a probability associated with the method. We say that 95% of the time the interval will work in capturing the true value of p. But once we have an interval, there is no more discussion about the probability of the parameter being contained in the interval.
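The idea that the 95% belongs to the method can be illustrated by repeating the procedure many times. The simulation below is a sketch under assumed values: a hypothetical true proportion p = 0.30, samples of size 400, and 2000 repetitions.

```python
import math
import random

rng = random.Random(2)   # fixed seed so the run is repeatable
p, n, reps = 0.30, 400, 2000   # hypothetical true p, sample size, repetitions

hits = 0
for _ in range(reps):
    # Draw one random sample and compute its sample proportion.
    p_hat = sum(rng.random() < p for _ in range(n)) / n
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    # Does the interval p-hat +/- 2 SE capture the true p?
    if p_hat - 2 * se <= p <= p_hat + 2 * se:
        hits += 1

coverage = hits / reps
print(f"proportion of intervals capturing p = {coverage:.3f}")
```

Each individual interval either contains p or it does not; the figure of about 95% describes how often the procedure succeeds across many samples.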
Let’s Do It • LDI 8.6: page 523 • LDI 8.7: page 525
Homework 16 • Read: pages 531-532, 534-541, 543-544 • LDI: 8.8, 8.9, 8.10, 8.11 • Exercises: 8.17, 8.18, 8.22, 8.25, 8.28, 8.30
Sampling Distribution of the Mean (x-bar) • The sampling distribution of the mean is the distribution of values of the sample mean in all possible random samples of the same size n taken from the same population.
Sampling Distribution of the Mean (x-bar) • The distribution of x-bar will be approximately normal if the sample size is large enough, no matter what the shape of the original distribution. • If the original distribution is normal, then the distribution of x-bar will be exactly normal.
Let’s Do It! • Let’s simulate the distribution of x-bar using these programs. We’ll use XBARINT for LDI 8.8 (page 532). • We’ll then try it again using the AGE data as our distribution to sample from. That is done using XBARSIM. • LDI 8.9: page 541
Sampling Distribution of the Mean • If the original distribution has mean μ and standard deviation σ, then for large enough samples the distribution of x-bar will be approximately $N(\mu, \sigma/\sqrt{n})$.
Sampling Distribution of the Mean • If the original distribution is normal with mean μ and standard deviation σ, then the distribution of x-bar will be exactly $N(\mu, \sigma/\sqrt{n})$.
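A quick simulation can show the mean and spread of x-bar for a non-normal population. This sketch is not the XBARINT or XBARSIM program from the slides; it assumes a hypothetical right-skewed population (exponential with mean 1 and standard deviation 1) and samples of size 36.

```python
import math
import random
import statistics

rng = random.Random(3)   # fixed seed so the run is repeatable
mu, sigma = 1.0, 1.0     # an exponential population with rate 1 has mean 1, SD 1
n, reps = 36, 3000       # hypothetical sample size and number of samples

# Compute the sample mean of each of `reps` random samples of size n.
xbars = [statistics.mean(rng.expovariate(1.0) for _ in range(n))
         for _ in range(reps)]

mean_xbar = statistics.mean(xbars)
sd_xbar = statistics.stdev(xbars)

print(f"mean of x-bars = {mean_xbar:.3f} (population mean = {mu})")
print(f"SD of x-bars   = {sd_xbar:.3f} (sigma/sqrt(n) = {sigma / math.sqrt(n):.3f})")
```

Even though the population is skewed, the simulated x-bar values should center near the population mean, with a standard deviation close to sigma divided by the square root of n.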
The Standard Error of the Mean (SEM) • Again, we will rarely know the standard deviation of x-bar ($\sigma/\sqrt{n}$), so we instead have to estimate it using the sample standard deviation s. We replace σ with s to get an estimated standard deviation of x-bar: $SEM = s/\sqrt{n}$.
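The SEM calculation is short enough to sketch directly. The ten values below are made-up illustration data, not the AGE data from the course.

```python
import math
import statistics

# Hypothetical sample of ten ages, for illustration only.
sample = [23, 31, 27, 45, 38, 29, 52, 34, 41, 30]

n = len(sample)
x_bar = statistics.mean(sample)    # sample mean
s = statistics.stdev(sample)       # sample standard deviation (divides by n - 1)
sem = s / math.sqrt(n)             # standard error of the mean

print(f"n = {n}, x-bar = {x_bar}, s = {s:.3f}, SEM = {sem:.3f}")
```

For this sample the mean is 35 and the SEM is about 2.83, so x-bar values from samples like this one would typically land within a few years of the population mean.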
Let’s Do It • LDI 8.10: page 543 • LDI 8.11: page 544