Section 8.1 Estimating μ When σ Is Known
In this section, we develop techniques for estimating the population mean μ using sample data. We assume that the population standard deviation σ is known.
Estimating μ When σ Is Known
Assumptions:
1. We have a simple random sample of size n from a population of x values.
2. The value of σ, the population standard deviation of x, is known.
3. If the x distribution is normal, then our methods work for any sample size n.
4. If the x distribution is unknown, a sample size of at least 30 (sometimes even more) is required.
Point Estimate
• A point estimate is an estimate of a population parameter given by a single number.
• We use x̄ (the sample mean) as a point estimate for μ (the population mean).
• x̄ is the point estimate for μ.
Examples of Point Estimates
• x̄ is used as a point estimate for μ.
• s is used as a point estimate for σ.
Margin of Error
• The margin of error is the magnitude of the difference between the point estimate and the true parameter value.
• The margin of error using x̄ as a point estimate for μ is |x̄ − μ|.
Confidence Level
A confidence level, c, is a measure of the degree of assurance we have in our results. The value of c may be any number between zero and one. Typical values for c include 0.90, 0.95, and 0.99.
Critical Value for a Confidence Level c
The critical value is the value z_c such that the area under the standard normal curve falling between −z_c and z_c is equal to c.
Confidence Level
The area under the normal curve from −z_c to z_c is the probability that the standardized normal variable z lies in that interval. This means that
$$P(-z_c < z < z_c) = c$$
Example: Find the critical value
Find a number z_0.90 such that 90% of the area under the standard normal curve lies between −z_0.90 and z_0.90. That is, we will find z_0.90 such that P(−z_0.90 < z < z_0.90) = 0.90.
Solution
The standard normal table gives areas to the left of a z value, so we first convert the area between −z and z into a left-tail area. If A is the area between −z and z, then (1 − A)/2 is the area to the left of −z. In our case the area between −z and z is 0.90, so the corresponding area in the left tail is (1 − 0.90)/2 = 0.05.
Example cont.
• According to Appendix Table 3, 0.0500 lies exactly halfway between two values in the table (0.0505 and 0.0495).
• Averaging the z values associated with these areas (−1.64 and −1.65) gives −z_0.90 = −1.645, so z_0.90 = 1.645.
• z_0.90 = 1.645 is the critical value for a confidence level of c = 0.90.
• We have P(−1.645 < z < 1.645) = 0.90.
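The table lookup above can be cross-checked numerically. The following is a minimal sketch that uses Python's standard-library `statistics.NormalDist` in place of the printed table; the function name `critical_value` is a hypothetical helper, not something from the slides.

```python
from statistics import NormalDist

def critical_value(c: float) -> float:
    """Return z_c such that P(-z_c < z < z_c) = c for a standard normal z."""
    if not 0.0 < c < 1.0:
        raise ValueError("confidence level c must be strictly between 0 and 1")
    left_tail = (1.0 - c) / 2.0               # area to the left of -z_c
    return -NormalDist().inv_cdf(left_tail)   # inv_cdf gives -z_c; flip the sign

z90 = critical_value(0.90)
print(round(z90, 3))  # 1.645
```

The same call with c = 0.95 or c = 0.99 gives values that round to the usual table entries 1.96 and 2.58.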
Some Common Levels of Confidence and Their Corresponding Critical Values
Level of confidence c    Critical value z_c
0.90                     1.645
0.95                     1.96
0.99                     2.58
An estimate is not very valuable unless we have some kind of measure of how “good” it is. The language of probability can give us an idea of the size of the margin of error caused by using the sample mean x̄ as an estimate for the population mean μ. Remember that x̄ is a random variable. Each time we draw a sample of size n from a population, we can get a different value for x̄. According to the central limit theorem, if the sample size is large, then x̄ has a distribution that is approximately normal with mean μ, the population mean we are trying to estimate. The standard deviation of x̄ is $\sigma/\sqrt{n}$. If x has a normal distribution, these results are true for any sample size.
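The behavior of x̄ described here can be illustrated with a small simulation. This is only a sketch under stated assumptions: the population is taken to be exponential with mean 2 and standard deviation 2 (a deliberately non-normal choice that is not from the slides), and `simulate_sample_means` is a hypothetical helper name.

```python
import math
import random
import statistics

def simulate_sample_means(n, repetitions, seed=1):
    """Draw `repetitions` samples of size n and return the list of sample means."""
    rng = random.Random(seed)
    means = []
    for _ in range(repetitions):
        # Assumed population: exponential with rate 0.5, so mu = 2 and sigma = 2.
        sample = [rng.expovariate(0.5) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return means

means = simulate_sample_means(n=40, repetitions=5000)
print("mean of sample means:   ", round(statistics.fmean(means), 3))
print("std dev of sample means:", round(statistics.stdev(means), 3))
print("sigma / sqrt(n):        ", round(2.0 / math.sqrt(40), 3))
```

With these settings the mean of the sample means should land near μ = 2 and their standard deviation near σ/√n, even though the underlying population is skewed.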
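This information, together with our work on confidence levels, leads us to the probability statement
$$P\left(-z_c \frac{\sigma}{\sqrt{n}} < \bar{x} - \mu < z_c \frac{\sigma}{\sqrt{n}}\right) = c \qquad (1)$$
Comment: To derive Equation (1), we start with the probability statement
$$P(-z_c < z < z_c) = c$$
When the sample size is large (or x is normal), we can use the central limit theorem and replace z by
$$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$$
Finally, we multiply all parts of the inequality by $\sigma/\sqrt{n}$ to obtain Equation (1).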
Equation (1) uses the language of probability to give us an idea of the size of the margin of error for the corresponding confidence level c. In other words, Equation (1) states that the probability is c that our point estimate x̄ is within a distance $z_c \sigma/\sqrt{n}$ of the population mean μ. The margin of error (or absolute error) using x̄ as a point estimate for μ is |x̄ − μ|. In most practical problems, μ is unknown, so the margin of error is also unknown. However, Equation (1) allows us to compute an error tolerance E, which serves as a bound on the margin of error.
At confidence level c, we can say that the point estimate x̄ differs from the population mean μ by a maximal margin of error
$$E = z_c \frac{\sigma}{\sqrt{n}} \qquad (2)$$
Note: Formula (2) for E is based on the fact that the sampling distribution for x̄ is exactly normal, with mean μ and standard deviation $\sigma/\sqrt{n}$. This occurs whenever the x distribution is normal with mean μ and standard deviation σ. If the x distribution is not normal, then according to the central limit theorem, large samples produce an x̄ distribution that is approximately normal with mean μ and standard deviation $\sigma/\sqrt{n}$.
Using Equations (1) and (2), we conclude that
$$P(-E < \bar{x} - \mu < E) = c \qquad (3)$$
and
$$P(\bar{x} - E < \mu < \bar{x} + E) = c \qquad (4)$$
Equation (4) states that there is a chance of c that the interval from x̄ − E to x̄ + E contains the population mean μ. We call this interval a c confidence interval for μ. A c confidence interval for μ is an interval computed from sample data in such a way that c is the probability of generating an interval containing the actual value of μ.
We may get a different confidence interval for each different sample that is taken. Some intervals will contain the population mean μ and others will not. However, in the long run, the proportion of confidence intervals that contain μ is c.
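The long-run interpretation can be checked by simulation. The sketch below assumes a normal population with μ = 50 and σ = 10 (illustrative values, not from the slides) and counts how often the interval from x̄ − E to x̄ + E captures μ; `coverage` is a hypothetical helper name.

```python
import math
import random
from statistics import NormalDist, fmean

def coverage(mu, sigma, n, c, trials, seed=0):
    """Return the fraction of simulated c confidence intervals that contain mu."""
    rng = random.Random(seed)
    z_c = NormalDist().inv_cdf((1 + c) / 2)   # critical value for level c
    E = z_c * sigma / math.sqrt(n)            # maximal margin of error
    hits = 0
    for _ in range(trials):
        x_bar = fmean([rng.gauss(mu, sigma) for _ in range(n)])
        if x_bar - E < mu < x_bar + E:
            hits += 1
    return hits / trials

# Assumed population parameters, chosen only for illustration.
print(coverage(mu=50.0, sigma=10.0, n=25, c=0.95, trials=4000))
```

The printed proportion should be close to 0.95, and it moves closer as the number of trials grows.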
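How to Find a Confidence Interval for μ When σ Is Known
Let x be a random variable appropriate to your application. Obtain a simple random sample (of size n) of x values, from which you compute the sample mean x̄. The value of σ is already known (perhaps from a previous study). If you cannot assume x has a normal distribution, use a sample size of 30 or more.
Confidence Interval for μ When σ Is Known
$$\bar{x} - E < \mu < \bar{x} + E$$
where x̄ is the sample mean of a simple random sample, $E = z_c \sigma/\sqrt{n}$, c is the confidence level (0 < c < 1), and z_c is the critical value for confidence level c based on the standard normal distribution.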
Example: Confidence interval for μ with σ known
Create a 95% confidence interval for the mean driving time between Philadelphia and Boston. Assume that the mean driving time of 64 trips was x̄ = 6.4 hours and that the standard deviation is σ = 0.9 hours.
Solution
The interval from x̄ − E to x̄ + E will be a 95% confidence interval for μ. In this case, x̄ = 6.4 hours, σ = 0.9 hours, c = 95% (so z_c = 1.96), and n = 64, so
$$E = z_c \frac{\sigma}{\sqrt{n}} = 1.96 \cdot \frac{0.9}{\sqrt{64}} = 0.2205$$
and
6.4 − 0.2205 < μ < 6.4 + 0.2205
6.1795 < μ < 6.6205
We are 95% sure that the true mean driving time is between 6.18 and 6.62 hours.
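The arithmetic in this example can be reproduced with a few lines of Python. This is a minimal sketch: `z_interval` is a hypothetical helper name, and it computes z_c from the standard library rather than reading it from the table, so E differs from the hand calculation only in later decimal places.

```python
import math
from statistics import NormalDist

def z_interval(x_bar, sigma, n, c):
    """Return (lower, upper) endpoints of a c confidence interval for mu, sigma known."""
    z_c = NormalDist().inv_cdf((1 + c) / 2)   # critical value for level c
    E = z_c * sigma / math.sqrt(n)            # maximal margin of error
    return x_bar - E, x_bar + E

# Driving-time example: x_bar = 6.4 h, sigma = 0.9 h, n = 64, c = 0.95.
low, high = z_interval(6.4, 0.9, 64, 0.95)
print(f"{low:.2f} < mu < {high:.2f}")  # 6.18 < mu < 6.62
```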
Comments on the term confidence interval
It is important to realize that the endpoints x̄ ± E are really statistical variables. The equation
$$P(\bar{x} - E < \mu < \bar{x} + E) = c$$
states that we have a chance c of obtaining a sample such that the interval, once it is computed, will contain the parameter μ. Of course, after the confidence interval is numerically fixed, it either does or does not contain μ. So the probability is 1 or 0 that the interval, when it is fixed, will contain μ. A nontrivial probability statement can be made only about variables, not constants. Therefore, the above equation really states that if we repeat the experiment many times and get lots of confidence intervals (for the same sample size), then the proportion of all intervals that will turn out to contain the mean μ is c.
We may get different confidence intervals for different samples.
Guided Exercise
Walter jogs 3 miles. He knows that the standard deviation is σ = 2.40 minutes for his jogging times. For a random sample of 90 jogging times, his mean time was x̄ = 22.50 minutes. Let μ be the mean jogging time for the entire distribution of Walter’s 3-mile jogging times over the past several years. Find the 0.99 confidence interval for μ.
(a) What is the value of z_0.99? (Tables) z_0.99 = 2.58.
(b) Is the x̄ distribution approximately normal? Yes, we know this from the central limit theorem, since the sample size n = 90 is large.
Guided Exercise cont.
(c) What is the value of E? $E = z_c \sigma/\sqrt{n} = 2.58 \cdot 2.40/\sqrt{90} \approx 0.65$.
(d) What are the end points for a 0.99 confidence interval for μ? 22.50 − 0.65 = 21.85 and 22.50 + 0.65 = 23.15.
(e) How can we interpret the confidence interval? We are 99% certain that the interval from 21.85 to 23.15 is an interval that contains the population mean time μ.
PROCEDURE How to find the sample size n for estimating μ when σ is known
When estimating the mean, how large a sample must be used in order to assure a given level of confidence? Use the formula:
$$n = \left(\frac{z_c \sigma}{E}\right)^2$$
If n is not a whole number, increase n to the next higher whole number. Note that n is the minimal sample size for a specified confidence level and maximal error of estimate E.
Note: Use the standard deviation, s, of a preliminary sample of size 30 or larger to estimate σ.
Example
Determine the sample size necessary to determine (with 99% confidence) the mean time it takes to drive from Philadelphia to Boston. We wish to be within 15 minutes of the true time. Assume that a preliminary sample of 45 trips had a standard deviation of 0.8 hours.
Solution:
z_0.99 = 2.58
E = 15 minutes = 0.25 hours
Since the preliminary sample is large enough, we can assume that the population standard deviation is approximately equal to 0.8 hours.
Minimum sample size:
$$n = \left(\frac{z_c \sigma}{E}\right)^2 = \left(\frac{2.58 \cdot 0.8}{0.25}\right)^2 \approx 68.16$$
Round to the next higher whole number. To be 99% confident in our results, the minimum sample size is 69.
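The sample-size procedure is short enough to sketch in code. The helper name `min_sample_size` is hypothetical, and the critical value is passed in as the rounded table value 2.58 to match the hand calculation; a more precise critical value can shift the result slightly.

```python
import math

def min_sample_size(z_c, sigma, E):
    """Smallest whole n with z_c * sigma / sqrt(n) <= E, i.e. ceil((z_c * sigma / E)^2)."""
    if E <= 0:
        raise ValueError("maximal error of estimate E must be positive")
    n_exact = (z_c * sigma / E) ** 2
    return math.ceil(n_exact)   # round up to the next whole number

# Driving-time example: z_0.99 = 2.58 (table value), sigma ~ 0.8 h, E = 0.25 h.
print(min_sample_size(2.58, 0.8, 0.25))  # 69
```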
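Example
A study is designed to find the mean weight of salmon caught by an Alaskan fishing company. A recent study of a random sample of 50 salmon showed s = 2.15 lb. How large a sample should be taken to be 99% confident that the sample mean is within 0.20 lb of the true mean weight μ?
Solution:
Since the preliminary sample has more than 30 salmon, we use s = 2.15 lb as an estimate for σ. With z_0.99 = 2.58 and E = 0.20 lb,
$$n = \left(\frac{z_c \sigma}{E}\right)^2 = \left(\frac{2.58 \cdot 2.15}{0.20}\right)^2 \approx 769.2$$
We conclude that a sample size of 770 or larger is enough to satisfy the specifications.
Assignment 19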