590 likes | 841 Views
Statistics. Chapter 7: Inferences Based on a Single Sample: Estimation with Confidence Interval. Where We’ve Been. Populations are characterized by numerical measures called parameters Decisions about population parameters are based on sample statistics
E N D
Statistics Chapter 7: Inferences Based on a Single Sample: Estimation with Confidence Interval
Where We’ve Been • Populations are characterized by numerical measures called parameters • Decisions about population parameters are based on sample statistics • Inferences involve uncertainty reflected in the sampling distribution of the statistic McClave, Statistics, 11th ed. Chapter 7: Inferences Based on a Single Sample: Estimation by Confidence Intervals
Where We’re Going • Estimate a parameter based on a large sample • Use the sampling distribution of the statistic to form a confidence interval for the parameter • Select the proper sample size when estimating a parameter McClave, Statistics, 11th ed. Chapter 7: Inferences Based on a Single Sample: Estimation by Confidence Intervals
Statistics inAction Speed ---- Can a high school football player improve his sprint time? McClave, Statistics, 11th ed. Chapter 1: Statistics, Data and Statistical Thinking 4
7.1: Identifying the Target Parameter • The unknown population parameter that we are interested in estimating is called the target parameter. McClave, Statistics, 11th ed. Chapter 7: Inferences Based on a Single Sample: Estimation by Confidence Intervals
7.2: Large-Sample Confidence Interval for a Population Mean • A point estimator of a population parameter is a rule or formula that tells us how to use the sample data to calculate a single number that can be used to estimate the population parameter. McClave, Statistics, 11th ed. Chapter 7: Inferences Based on a Single Sample: Estimation by Confidence Intervals
7.2: Large-Sample Confidence Interval for a Population Mean • Suppose a sample of 225 college students watch an average of 28 hours of television per week, with a standard deviation of 10 hours. • What can we conclude about all college students’ television time? McClave, Statistics, 11th ed. Chapter 7: Inferences Based on a Single Sample: Estimation by Confidence Intervals
7.2: Large-Sample Confidence Interval for a Population Mean • Assuming a normal distribution for television hours, we can be 95%* sure that *In the standard normal distribution, exactly 95% of the area under the curve is in the interval -1.96 … +1.96 McClave, Statistics, 11th ed. Chapter 7: Inferences Based on a Single Sample: Estimation by Confidence Intervals
7.2: Large-Sample Confidence Interval for a Population Mean • An interval estimator orconfidence intervalis a formula that tell us how to use sample data to calculate an interval that estimates a population parameter. McClave, Statistics, 11th ed. Chapter 7: Inferences Based on a Single Sample: Estimation by Confidence Intervals
7.2: Large-Sample Confidence Interval for a Population Mean • The confidence coefficient is the probability that a randomly selected confidence interval encloses the population parameter. • The confidence level is the confidence coefficient expressed as a percentage. (90%, 95% and 99% are very commonly used.) 95% sure Z=1.96 McClave, Statistics, 11th ed. Chapter 7: Inferences Based on a Single Sample: Estimation by Confidence Intervals
7.2: Large-Sample Confidence Interval for a Population Mean • The area outside the confidence interval is called 95 % sure So we are left with (1 – 95)% = 5% = uncertainty about µ McClave, Statistics, 11th ed. Chapter 7: Inferences Based on a Single Sample: Estimation by Confidence Intervals
7.2: Large-Sample Confidence Interval for a Population Mean • Large-Sample 100(1-)% Confidence Interval for µ • If is unknown and n is large, the confidence interval becomes s is the sample standard deviation McClave, Statistics, 11th ed. Chapter 7: Inferences Based on a Single Sample: Estimation by Confidence Intervals
7.2: Large-Sample Confidence Interval for a Population Mean • If n is large (i.e., n30), the sampling distribution of the sample mean is normal, and s is a good estimate of McClave, Statistics, 11th ed. Chapter 7: Inferences Based on a Single Sample: Estimation by Confidence Intervals
EXAMPLE 7.1A LARGE-SAMPLE CONFIDENCE INTERVAL FOR —Mean Number of unoccupied Seats per Flight Problem Unoccupied seats on flights cause airlines to lose revenue. Suppose a large airline wants to estimate its average number of unoccupied seats per flight over the past year. To accomplish this, the records of 225 flights are randomly selected, and the number of unoccupied seats is noted for each of the sampled flights. (The data are saved in the AIRNOSHOWS file.) Descriptive statistics for the data are displayed in the MINITAB printout of Figure 7.6. Estimate , the mean number of unoccupied seats per flight during the past year, using a 90% confidence interval.
Solution The general form of the 90% confidence interval for a population mean is From Figure 7.6, we find (after rounding) that Since we do not know the value of (the standard deviation of the number of unoccupied seats per flight for all flights of the year), we use our best approximation—the sample standard deviation, s = 4.1 shown on the MINITAB printout. Then the 90% confidence interval is approximately or from 11.15 to 12.05.That is, at the 90% confidence level, we estimate the mean number of unoccupied seats per flight to be between 11.15 and 12.05 during the sampled year. This result is verified (except for rounding) on the right side of the MINITAB printout in Figure 7.6. FIGURE 7.6 MINITAB printout with descriptive statistics and 90% confidence interval for Example 7.1
Interpretation of a Confidence Interval for a Population Mean “100(1-)% Confidence Interval for µ” means we can be 100(1-)% confident that μlies between the lower and upper bounds of the confidence interval McClave, Statistics, 11th ed. Chapter 7: Inferences Based on a Single Sample: Estimation by Confidence Intervals 16
7.2: Large-Sample Confidence Interval for a Population Mean Sometimes we yield a confidence interval that is too wide for use. So, to reduce the width of the interval to obtain a more precise estimate of μ. One way, to decrease the confidence coefficient, 1-, narrowing the interval Alternative, to decrease the width of an interval without sacrificing “confidence” is to increase the sample size n McClave, Statistics, 11th ed. Chapter 7: Inferences Based on a Single Sample: Estimation by Confidence Intervals 17
Statistics inAction Revisited Estimating the Mean Decrease in sprint time McClave, Statistics, 11th ed. Chapter 1: Statistics, Data and Statistical Thinking 18
7.3: Small-Sample Confidence Interval for a Population Mean Large Sample Small Sample • Sampling Distribution on is normal • Known or large n Standard Normal (z)Distribution • Sampling Distribution on is unknown • Unknown and small n Student’s t Distribution (with n-1 degrees of freedom) McClave, Statistics, 11th ed. Chapter 7: Inferences Based on a Single Sample: Estimation by Confidence Intervals
7.3: Small-Sample Confidence Interval for a Population Mean Large Sample Small Sample McClave, Statistics, 11th ed. Chapter 7: Inferences Based on a Single Sample: Estimation by Confidence Intervals
7.3: Small-Sample Confidence Interval for a Population Mean * If not, see Chapter 14 McClave, Statistics, 11th ed. Chapter 7: Inferences Based on a Single Sample: Estimation by Confidence Intervals
7.3: Small-Sample Confidence Interval for a Population Mean • Suppose a sample of 25 college students watch an average of 28 hours of television per week, with a standard deviation of 10 hours. • What can we conclude about all college students’ television time? McClave, Statistics, 11th ed. Chapter 7: Inferences Based on a Single Sample: Estimation by Confidence Intervals
7.3: Small-Sample Confidence Interval for a Population Mean • Assuming a normal distribution for television hours, we can be 95% sure that McClave, Statistics, 11th ed. Chapter 7: Inferences Based on a Single Sample: Estimation by Confidence Intervals
EXAMPLE 7.2A SMALL-SAMPLE CONFIDENCE INTERVAL FOR —Destructive Sampling Problem Some quality control experiments require destructive sampling (i.e., the test to determine whether the item is defective destroys the item) in order to measure a particular characteristic of the product. The cost of destructive sampling often dictates small samples. Suppose a manufacturer of printers for personal computers wishes to estimate the mean number of characters printed before the printhead fails. The printer manufacturer tests n = 15 printheads and records the number of characters printed until failure for each. These 15 measurements (in millions of characters) are listed in Table 7.5, followed by a MINITAB summary statistics printout in Figure 7.10.
EXAMPLE 7.2A SMALL-SAMPLE CONFIDENCE INTERVAL FOR —Destructive Sampling Problem (續) a. Form a 99% confidence interval for the mean number of characters printed before the printhead fails. Interpret the result. b. What assumption is required for the interval you found in part a to be valid? Is that assumption reasonably satisfied? FIGURE 7.10 MINITAB printout with descriptive statistics and 99% confidence interval for Example 7.2
Solution a. For this small sample (n = 15) we use the t-statistic to form the confidence interval.We use a confidence coefficient of .99 and n – 1 = 14 degrees of freedom tofind in Table VI: and s = .19. Substituting these (rounded) values into the confidence interval formula, we obtain This interval is highlighted in Figure 7.10.Our interpretation is as follows: The manufacturer can be 99% confident thatthe printhead has a mean life of between 1.09 and 1.39 million characters. If the manufacturerwere to advertise that the mean life of its printheads is (at least) 1 millioncharacters, the interval would support such a claim. Our confidence is derived fromthe fact that 99% of the intervals formed in repeated applications of thisprocedurewill contain
Solution b. Since n is small, we must assume that the number of characters printed before the printhead fails is a random variable from a normal distribution. That is, we assume mthat the population from which the sample of 15 measurements is selected is distributed normally. One way to check this assumption is to graph the distribution of data in Table 7.5. If the sample data are approximately normally distributed, then the population from which the sample is selected is very likely to be normal. A MINITAB stem-and-leaf plot for the sample data is displayed in Figure 7.11. The distribution is mound shaped and nearly symmetric. Therefore, the assumption of normality appears to be reasonably satisfied. FIGURE 7.11 MINITAB stem-and-leaf display of data in Table 7.4
EXAMPLE 7.3ESTIMATING A POPULATION PROPORTION—Fraction Who Trustthe President Problem Public-opinion polls are conducted regularly to estimate the fraction of U.S. citizens who trust the president. Suppose 1,000 people are randomly chosen and 637 answer that they trust the president. How would you estimate the true fraction of all U.S. citizens who trust the president?
Solution What we have really asked is how you would estimate the probability p of success in a binomial experiment in which p is the probability that a person chosen trusts the president. One logical method of estimating p for the population is to use the proportion of successes in the sample. That is, we can estimate p by calculating.
7.4: Large-Sample Confidence Interval for a Population Proportion • Sampling distribution of • The mean of the sampling distribution is p, the population proportion. • The standard deviation of the sampling distribution is where • For large samples the sampling distribution is approximately normal. Large is defined as McClave, Statistics, 11th ed. Chapter 7: Inferences Based on a Single Sample: Estimation by Confidence Intervals
7.4: Large-Sample Confidence Interval for a Population Proportion • Sampling distribution of McClave, Statistics, 11th ed. Chapter 7: Inferences Based on a Single Sample: Estimation by Confidence Intervals
7.4: Large-Sample Confidence Interval for a Population Proportion We can be 100(1-)% confident that where and McClave, Statistics, 11th ed. Chapter 7: Inferences Based on a Single Sample: Estimation by Confidence Intervals
7.4: Large-Sample Confidence Interval for a Population Proportion A nationwide poll of nearly 1,500 people … conducted by the syndicated cable television show Dateline: USA found that more than 70 percent of those surveyed believe there is intelligent life in the universe, perhaps even in our own Milky Way Galaxy. What proportion of the entire population agree, at the 95% confidence level? McClave, Statistics, 11th ed. Chapter 7: Inferences Based on a Single Sample: Estimation by Confidence Intervals
EXAMPLE 7.4 A LARGE-SAMPLE CONFIDENCE INTERVAL FOR p—ProportionOptimistic about the Economy Problem Many public polling agencies conduct surveys to determine the current consumer sentiment concerning the state of the economy. For example, the Bureau of Economic and Business Research (BEBR) at the University of Florida conducts quarterly surveys to gauge consumer sentiment in the Sunshine State. Suppose that BEBR randomly samples 484 consumers and finds that 257 are optimistic about the state of the economy. Use a 90% confidence interval to estimate the proportion of all consumers in Florida who are optimistic about the state of the economy. Based on the confidence interval, can BEBR infer that the majority of Florida consumers are optimistic about the economy?
Solution The number x of the 484 sampled consumers who are optimistic about the Florida economy is a binomial random variable if we can assume that the sample was randomly selected from the population of Florida consumers and that the poll was conducted identically for each consumer sampled. The point estimate of the proportion of Florida consumers who are optimistic about the economy is We first check to be sure that the sample size is sufficiently large that the normal distribution provides a reasonable approximation to the sampling distribution of We require the number of successes in the sample, and the number of failures, both to be at least 15. Since the number of successes is and the number of failures is we may conclude that the normal approximation is reasonable. We now proceed to form the 90% confidence interval for p, the true proportion of Florida consumers who are optimistic about the state of the economy:
Solution (續) (This interval is also shown on the MINITAB printout of Figure 7.13.) Thus, we can be 90% confident that the proportion of all Florida consumers who are confident about the economy is between .494 and .568. As always, our confidence stems from the fact that 90% of all similarly formed intervals will contain the true proportion p and not from any knowledge about whether this particular interval does. FIGURE 7.13 Portion of MINITAB printout with 90% confidence interval for p
Solution (續) Can we conclude on the basis of this interval that the majority of Florida consumers are optimistic about the economy? If we wished to use this interval to infer that a majority is optimistic, the interval would have to support the inference that p exceeds .5—that is, that more than 50% of Florida consumers are optimistic about the economy. Note that the interval contains some values below .5 (as low as .494) as well as some above .5 (as high as .568).Therefore, we cannot conclude on the basis of this 90% confidence interval that the true value of p exceeds .5.
7.4: Large-Sample Confidence Interval for a Population Proportion • If p is close to 0 or 1, Wilson’s adjustment for estimating p yields better results where McClave, Statistics, 11th ed. Chapter 7: Inferences Based on a Single Sample: Estimation by Confidence Intervals
7.4: Large-Sample Confidence Interval for a Population Proportion Suppose in a particular year the percentage of firms declaring bankruptcy that had shown profits the previous year is .002. If 100 firms are sampled and one had declared bankruptcy, what is the 95% CI on the proportion of profitable firms that will tank the next year? McClave, Statistics, 11th ed. Chapter 7: Inferences Based on a Single Sample: Estimation by Confidence Intervals
EXAMPLE 7.5USING THE ADJUSTED CONFIDENCE INTERVAL PROCEDURE—Proportion Who Are Victims of a Violent Crime Problem According to True Odds: How Risk Affects Your Everyday Life (Walsh, 1997), the probability of being the victim of a violent crime is less than .01. Suppose that, in a random sample of 200 Americans, 3 were victims of a violent crime. Use a 95% confidence interval to estimate the true proportion of Americans who were victims of a violent crime.
Solution Let p represent the true proportion of Americans who were victims of a violent crime. Since p is near 0, an “extremely large” sample is required to estimate its value by the usual large-sample method. Since we are unsure whether the sample size of 200 is large enough, we will apply Wilson’s adjustment outlined in the box. The number of “successes” (i.e., number of victims of a violent crime) in the sample is x = 3 Therefore, the adjusted sample proportion is
Solution(續) Note that this adjusted sample proportion is obtained by adding a total of four observations—two “successes” and two “failures”—to the sample data. Substituting into the equation for a 95% confidence interval, we obtain or (.004, .046). Consequently, we are 95% confident that the true proportion of Americans who are victims of a violent crime falls between .004 and .046.
Statistics inAction Revisited Estimating the Proportion of Sprinters who Improve after Speed Training McClave, Statistics, 11th ed. Chapter 1: Statistics, Data and Statistical Thinking 43
7.5: Determining the Sample Size • To be within a certain sampling error (SE) of µ with a level of confidence equal to 100(1-)%, we can solve for n: McClave, Statistics, 11th ed. Chapter 7: Inferences Based on a Single Sample: Estimation by Confidence Intervals
7.5: Determining the Sample Size • The value of will almost always be unknown, so we need an estimate: • s from a previous sample • approximate the range, R, and use R/4 • Round the calculated value of nupwards to be sure you don’t have too small a sample. McClave, Statistics, 11th ed. Chapter 7: Inferences Based on a Single Sample: Estimation by Confidence Intervals
7.5: Determining the Sample Size • Suppose we need to know the mean driving distance for a new composite golf ball within 3 yards, with 95% confidence. A previous study had a standard deviation of 25 yards. How many golf balls must we test? McClave, Statistics, 11th ed. Chapter 7: Inferences Based on a Single Sample: Estimation by Confidence Intervals
7.5: Determining the Sample Size Suppose we need to know the mean driving distance for a new composite golf ball within 3 yards, with 95% confidence. A previous study had a standard deviation of 25 yards. How many golf balls must we test? McClave, Statistics, 11th ed. Chapter 7: Inferences Based on a Single Sample: Estimation by Confidence Intervals
EXAMPLE 7.6SAMPLE SIZE FOR ESTIMATING —Mean Inflation Pressure of Footballs Problem Suppose the manufacturer of official NFL footballs uses a machine to inflate the new balls to a pressure of 13.5 pounds. When the machine is properly calibrated, the mean inflation pressure is 13.5 pounds, but uncontrollable factors cause the pressures of individual footballs to vary randomly from about 13.3 to 13.7 pounds. For quality control purposes, the manufacturer wishes to estimate the mean inflation pressure to within .025 pound of its true value with a 99% confidence interval. What sample size should be specified for the experiment?
Solution We desire a 99% confidence interval that estimates with a sampling error of SE = .025 For a 99% confidence interval, we have To estimate we note that the range of observations is R = 13.7 – 13.3 = .4 and we use Next, we employ the formula derived in the box to find the sample size n: We round this up to n = 107 Realizing that was approximated by R/4 we might even advise that the sample size be specified as n = 110 to be more certain of attaining the objective of a 99% confidence interval with a sampling error of .025 pound or less.
7.5: Determining the Sample Size To estimate p, use the sample proportion from a prior study, or use p = .5. Round the value of n upward to ensure the sample size is large enough to produce the required level of confidence. • For a confidence interval on the population proportion, p, we can solve for n: McClave, Statistics, 11th ed. Chapter 7: Inferences Based on a Single Sample: Estimation by Confidence Intervals