Learn about normal distribution, statistical estimation, sampling distribution, and confidence intervals in statistical inference. Explore examples and calculations on normal curves for population and sample mean.
Chapter 7 Statistical Inference and Sampling
Normal Curve for Population
• Individual observations, X's, follow a normal distribution with mean μ and standard deviation σ. That is, X is a normal random variable.
• [Figure: normal curve of the population, centered at μ on the x axis]
• The corresponding standard normal variable Z is obtained by: Z = (X – μ)/σ
Examples on Normal Curve for Population
• The estimated miles-per-gallon ratings of a class of trucks are normally distributed with a mean of 12.8 and a standard deviation of 3.2. What is the probability that one of these trucks selected at random would get between 13 and 15 miles per gallon?
[Figure: normal curve with X = 12.8, 13, 15 matched to Z = 0, z1, z2]
z1 = (13 – 12.8)/3.2 ≈ 0.06, so the area from the mean to z1 = 0.0239
z2 = (15 – 12.8)/3.2 ≈ 0.69, so the area from the mean to z2 = 0.2549
So, the area from z1 to z2 = 0.2549 – 0.0239 = 0.2310
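The table lookup in the truck example can be cross-checked numerically. The following is a minimal sketch using Python's standard-library `statistics.NormalDist`; the variable names are illustrative and not part of the original slides. Because it uses unrounded z values, the result (close to 0.229) differs slightly from the table-based 0.2310.

```python
from statistics import NormalDist

# Miles-per-gallon ratings: normal with mean 12.8 and standard deviation 3.2
mpg = NormalDist(mu=12.8, sigma=3.2)

# z values, for comparison with the normal table lookup
z1 = (13 - 12.8) / 3.2  # rounded to 0.06 when using the table
z2 = (15 - 12.8) / 3.2  # rounded to 0.69 when using the table

# P(13 < X < 15) = F(15) - F(13), where F is the cumulative distribution function
prob_13_to_15 = mpg.cdf(15) - mpg.cdf(13)

print(f"z1 = {z1:.4f}, z2 = {z2:.4f}, P(13 < X < 15) = {prob_13_to_15:.4f}")
```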
Examples on Normal Curve for Population
• The examination committee of the American Society for Quality passes 40% of those that take the exam. If the scores follow a normal distribution with an average score of 75 and a standard deviation of 16, what is the minimum passing score?
[Figure: normal curve centered at X = 75 (Z = 0) with the upper 40% of the area shaded to the right of the passing score]
The area from the mean to z = 0.50 – 0.40 = 0.10
So, z = 0.26 [From Normal Dist. Table]
Minimum passing score: X = μ + zσ = 75 + 0.26 × 16 = 79.16
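The passing-score example works backward from an area to a value of X. A short sketch of the same idea, assuming Python's standard-library `statistics.NormalDist` and illustrative variable names: since 40% of candidates pass, 60% of scores lie below the cutoff. The unrounded z is about 0.253, so the cutoff comes out near 79.05 rather than the table-based 79.16.

```python
from statistics import NormalDist

# Exam scores: normal with mean 75 and standard deviation 16
scores = NormalDist(mu=75, sigma=16)

pass_rate = 0.40

# The cutoff is the score with 60% of the area below it (40% above it)
cutoff = scores.inv_cdf(1 - pass_rate)

# Corresponding standard normal value: z = (X - mu) / sigma
z = (cutoff - scores.mean) / scores.stdev

print(f"z = {z:.4f}, minimum passing score = {cutoff:.2f}")
```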
Estimation
• Statistical estimation is the process of estimating a population parameter from the corresponding sample statistic.
• Example: Population means (μ) are usually unknown and have to be estimated from sample means (X-bar).
Two Approaches to Statistical Estimation
• Point estimate: A single value that represents the best estimate of the population value. For example, the sample mean (X-bar) is the best point estimate of the population mean (μ). Similarly, the sample standard deviation (s) is the best point estimate of the population standard deviation (σ). That is, μ is estimated by X-bar, and σ is estimated by s.
• Interval estimation: Builds on the point estimate to arrive at a range of values that we are confident contains the population parameter. The range of values is called a confidence interval. For example, the confidence interval for the population mean (μLL ≤ μ ≤ μUL) can be estimated from the sample mean.
[Figure: interval from μLL to μUL centered at X-bar] Note that μLL and μUL are equidistant from X-bar and are both estimated from X-bar.
Distribution of X-bar
• X-bar is a random variable, because different samples drawn from the same population on a specific characteristic will result in different values of X-bar.
• Since the sample mean, X-bar, is used to estimate the population mean, μ, we need to understand how X-bar behaves. That is, if we observe values of X-bar indefinitely, where will they center and how will they spread out?
• For large samples, X-bar is approximately normally distributed regardless of the shape of the sampled population. That is, if we observe values of X-bar indefinitely and plot these values in a graph, we will obtain an approximately normal curve. If the population itself is normal, X-bar is normal for any sample size.
• The distribution of X-bar is based on the Central Limit Theorem. The Central Limit Theorem states that when obtaining large samples (generally n ≥ 30) from any population, the sample mean, X-bar, will follow an approximately normal distribution.
• The probability distribution of X-bar is called the sampling distribution of X-bar.
Sampling Distribution of the Sample Mean
[Figure: X-bar follows a normal distribution, centered at μ, with standard deviation σ/SQRT(n)]
• The mean of the distribution of X-bar is denoted by μX-bar and equals μ. That is, μX-bar = μ.
• The standard deviation of the distribution of X-bar (denoted by σX-bar) equals σ/SQRT(n). That is, σX-bar = σ/SQRT(n). This standard deviation is called the standard error.
• The corresponding standard normal variable Z of X-bar is obtained by: Z = (X-bar – μ)/(σ/SQRT(n))
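The two properties μX-bar = μ and σX-bar = σ/SQRT(n) can be illustrated with a small simulation. The sketch below is an assumption-laden illustration, not part of the slides: it picks a skewed exponential population with mean 1 and standard deviation 1 (a hypothetical choice), draws 5000 samples of size 36, and compares the center and spread of the resulting X-bar values with the theoretical ones.

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(7)  # fixed seed so the run is repeatable

n = 36               # sample size (a "large" sample, n >= 30)
replications = 5000  # number of samples drawn
pop_mean = 1.0       # an exponential population with rate 1 has mean 1
pop_sd = 1.0         # and standard deviation 1

# Draw many samples and record one X-bar per sample
x_bars = [
    mean([random.expovariate(1.0) for _ in range(n)])
    for _ in range(replications)
]

center = mean(x_bars)               # should be close to mu
spread = stdev(x_bars)              # should be close to sigma / sqrt(n)
standard_error = pop_sd / sqrt(n)   # theoretical value, 1/6 here

print(f"mean of X-bar values:   {center:.4f} (theory {pop_mean:.4f})")
print(f"st. dev. of X-bar values: {spread:.4f} (theory {standard_error:.4f})")
```

A histogram of `x_bars` would look roughly bell-shaped even though the individual observations are strongly skewed, which is the point of the Central Limit Theorem.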
Normal Curves for Population and Sample Mean
• Population (mean = μ, standard deviation = σ); random sample (mean = X-bar, standard deviation = s).
• [Figure: population curve] X = a value from this population. Assumes the individual observations follow a normal distribution.
• [Figure: sampling distribution curve] X-bar follows a normal distribution, centered at μX-bar = μ, with standard deviation σX-bar = σ/SQRT(n).
Example of Normal Population and Sampling Distribution of Mean
• The life span of Good Old Everglo Bulbs follows a normal distribution with a mean life of 400 hours and a standard deviation of 30 hours. a) What percentage of bulbs sold would you expect to last more than 445 hours? b) What is the probability that 4 bulbs selected at random will have an average life span of more than 445 hours?
[Figure: population curve with μ = 400, σ = 30, shaded area P(X > 445); sampling distribution curve with μX-bar = μ = 400, σX-bar = 30/SQRT(4), shaded area P(X-bar > 445)]
a) z = (445 – 400)/30 = 1.5, so P(X > 445) = P(Z > 1.5) = 0.5 – 0.4332 = 0.0668, or about 6.68% of bulbs
b) σX-bar = 30/SQRT(4) = 15 and z = (445 – 400)/15 = 3.0, so P(X-bar > 445) = P(Z > 3.0) = 0.5 – 0.4987 = 0.0013
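The contrast between a single bulb and the mean of four bulbs comes down to which standard deviation is used: σ for X, and σ/SQRT(n) for X-bar. A minimal sketch with Python's standard-library `statistics.NormalDist` (variable names are illustrative):

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 400, 30, 4

# Distribution of one bulb's life span, X
single_bulb = NormalDist(mu, sigma)

# Sampling distribution of X-bar for n = 4: same mean, standard error sigma / sqrt(n)
sample_mean = NormalDist(mu, sigma / sqrt(n))

p_single = 1 - single_bulb.cdf(445)  # P(X > 445)
p_mean = 1 - sample_mean.cdf(445)    # P(X-bar > 445)

print(f"P(X > 445)     = {p_single:.4f}")
print(f"P(X-bar > 445) = {p_mean:.4f}")
```

Averaging shrinks the spread, so a sample mean above 445 is far less likely than a single bulb lasting that long.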
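Confidence Intervals (CI) for Population Mean
[Figure: sampling distribution of X-bar centered at μX-bar = μ with standard error σX-bar = σ/SQRT(n), and an interval from μLL to μUL around an observed X-bar]
• According to the distribution of X-bar, the mean of all possible values of X-bar gives the population mean. Then why estimate the population mean? In practice we observe only one sample, and a single X-bar will rarely equal μ exactly.
• A CI for μ builds on the sample mean to arrive at a range of values that is expected to include the population mean. The boundaries of this range are called confidence limits. There are two confidence limits: the lower limit (μLL) and the upper limit (μUL).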
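Confidence Intervals (CI) for Population Mean
[Figure: normal curve centered at X-bar with standard error σX-bar = σ/SQRT(n); μLL corresponds to –zα/2 and μUL to +zα/2]
• How can we obtain μLL and μUL? We know that Z = (X-bar – μ)/(σ/SQRT(n)).
For μLL, μLL = X-bar – zα/2 × σ/SQRT(n)
For μUL, μUL = X-bar + zα/2 × σ/SQRT(n)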
Confidence Intervals (CI) for Population Mean
[Figure: normal curve with the confidence level as the central area between –zα/2 (at μLL) and +zα/2 (at μUL), and the significance level split between the two tails. Confidence Level + Significance Level = 1]
• How to obtain Z values? The values (–z) and (+z) are equidistant from the center of the curve. The area from (–z) to (+z) is called the confidence level (CL). The significance level equals (1 – CL) and is denoted by α (alpha). We can obtain Z values if we know either the significance level or the confidence level. To obtain the Z value, we need to know the area from the center of the curve to the Z value. This area equals (CL/2). Use the Normal Distribution Table to obtain the Z value.
• When the population standard deviation, σ, is known, the standardized X-bar follows the standard normal (Z) distribution. Therefore, we use the following to calculate the CI for the population mean when σ is known: X-bar – zα/2 × σ/SQRT(n) ≤ μ ≤ X-bar + zα/2 × σ/SQRT(n)
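The table lookup for zα/2 can also be done numerically. The sketch below assumes Python's standard-library `statistics.NormalDist`; the helper name `z_critical` is hypothetical. It uses the fact that the area to the left of +zα/2 is 1 – α/2, which is the same as 0.5 + CL/2.

```python
from statistics import NormalDist

def z_critical(confidence_level):
    """Return z such that the area between -z and +z equals the confidence level."""
    alpha = 1 - confidence_level  # significance level
    # Area to the left of +z is 1 - alpha/2 (equivalently 0.5 + CL/2)
    return NormalDist().inv_cdf(1 - alpha / 2)

for cl in (0.90, 0.95, 0.99):
    print(f"CL = {cl:.2f}: z = {z_critical(cl):.3f}")
```

Tables rounded to two decimals give 1.65 (or 1.64) for 90% and 1.96 for 95%, which matches these values after rounding.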
Examples on CI for Population Mean When σ Is Known
• A random sample of 100 observations is obtained from a normally distributed population with a standard deviation of 10. What is a 95% confidence interval for the mean of the population if the sample mean is 40?
[Figure: normal curve with central area 0.95 (0.475 on each side of the center) between –zα/2 and zα/2 = 1.96]
X-bar = 40, n = 100, σ = 10, zα/2 = 1.96
CI = 40 ± 1.96 × 10/SQRT(100) = 40 ± 1.96, so 38.04 ≤ μ ≤ 41.96
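Examples on CI for Population Mean When σ Is Known
• Find the 90% confidence interval for the mean of a normally distributed population using the following data. Assume a standard deviation of 5.
Data: 49, 50, 43, 65, 52, 45, 60, 38, 62
[Figure: normal curve with central area 0.90 (0.45 on each side of the center) between –zα/2 and zα/2 = 1.65]
X-bar = 464/9 = 51.56, n = 9, σ = 5, zα/2 = 1.65
CI = 51.56 ± 1.65 × 5/SQRT(9) = 51.56 ± 2.75, so 48.81 ≤ μ ≤ 54.31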
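Both known-σ examples follow the same recipe: X-bar ± zα/2 × σ/SQRT(n). The following sketch wraps that recipe in a hypothetical helper, `z_interval`, using only the Python standard library. It computes z without rounding, so the second interval's upper limit comes out near 54.30 instead of the 54.31 obtained with the table value 1.65.

```python
from math import sqrt
from statistics import NormalDist, mean

def z_interval(x_bar, sigma, n, confidence_level):
    """Confidence interval for the population mean when sigma is known."""
    alpha = 1 - confidence_level
    z = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2}
    margin = z * sigma / sqrt(n)             # z times the standard error
    return x_bar - margin, x_bar + margin

# Example 1: X-bar = 40, sigma = 10, n = 100, 95% confidence
low1, high1 = z_interval(40, 10, 100, 0.95)

# Example 2: nine observations, sigma assumed to be 5, 90% confidence
data = [49, 50, 43, 65, 52, 45, 60, 38, 62]
low2, high2 = z_interval(mean(data), 5, len(data), 0.90)

print(f"95% CI: {low1:.2f} to {high1:.2f}")
print(f"90% CI: {low2:.2f} to {high2:.2f}")
```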
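CI for Population Mean When σ Is Unknown
[Figure: curve centered at X-bar with estimated standard error s/SQRT(n); μLL corresponds to –tα/2,n-1 and μUL to +tα/2,n-1]
• When σ is unknown, (1) the standardized X-bar follows a t distribution instead of the standard normal (Z) distribution, and (2) σ is estimated by the sample standard deviation, s. We know that t = (X-bar – μ)/(s/SQRT(n)).
For μLL, μLL = X-bar – tα/2,n-1 × s/SQRT(n)
For μUL, μUL = X-bar + tα/2,n-1 × s/SQRT(n)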
CI for Population Mean When σ Is Unknown (Cont.)
• However, when the sample size is large (n ≥ 30), t values get closer to z values. Also, many t tables do not list every value for more than 30 degrees of freedom. Therefore, for convenience's sake, when n ≥ 30 and σ is unknown, we use the z distribution instead. That is, X-bar – zα/2 × s/SQRT(n) ≤ μ ≤ X-bar + zα/2 × s/SQRT(n)
• How to obtain t values? We need two parameters: (1) the area to the right of the t value, α, and (2) the degrees of freedom = n – 1.
[Figure: t curve with area α to the right of tα]
Examples on How to Obtain t Values
• For a t distribution with 20 degrees of freedom, what is the t value such that the following are true?
• 10% of the area under the t distribution is to the right of the t value. t0.10, 20 = 1.325 [Figure: t curve with areas 0.10 and 0.05 to the right of t0.10, 20 and t0.05, 20]
• 90% of the area under the t distribution is to the right of the t value (that is, 10% is to the left). t0.90, 20 = –t0.10, 20 = –1.325
• 5% of the area under the t distribution is to the left of the t value. –t0.05, 20 = –1.725
Examples on CI for Population Mean When σ Is Unknown
• A random sample of size 20 is selected from a normally distributed population. The sample mean is 50 and the sample standard deviation is 10. Find a 90% confidence interval for the population mean.
[Figure: t curve with central area 0.90 (0.45 on each side of the center) and α/2 = 0.05 in each tail]
X-bar = 50, n = 20, s = 10, tα/2, n-1 = t0.05, 19 = 1.729
CI = 50 ± 1.729 × 10/SQRT(20) = 50 ± 3.87, so 46.13 ≤ μ ≤ 53.87
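Examples on CI for Population Mean When σ Is Unknown
• Find the 95% confidence interval for the mean of a normally distributed population using the following data.
Data: 49, 50, 43, 65, 52, 45, 60, 38, 62
[Figure: t curve with central area 0.95 (0.475 on each side of the center) and α/2 = 0.025 in each tail]
X-bar = 464/9 = 51.56, n = 9, s = 9.15, tα/2, n-1 = t0.025, 8 = 2.306
CI = 51.56 ± 2.306 × 9.15/SQRT(9) = 51.56 ± 7.03, so approximately 44.5 ≤ μ ≤ 58.6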
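The unknown-σ examples use X-bar ± tα/2,n-1 × s/SQRT(n). Python's standard library has no t distribution, so the sketch below takes the critical t value from the table as an input (1.729 and 2.306, as quoted in the examples) and computes everything else. The helper name `t_interval` is hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def t_interval(x_bar, s, n, t_crit):
    """Confidence interval for the mean when sigma is unknown.

    t_crit is t_{alpha/2, n-1}, looked up in a t table by the caller.
    """
    margin = t_crit * s / sqrt(n)  # t times the estimated standard error
    return x_bar - margin, x_bar + margin

# Example 1: X-bar = 50, s = 10, n = 20, 90% confidence, t_{0.05, 19} = 1.729
low1, high1 = t_interval(50, 10, 20, t_crit=1.729)

# Example 2: raw data, 95% confidence, t_{0.025, 8} = 2.306
data = [49, 50, 43, 65, 52, 45, 60, 38, 62]
x_bar = mean(data)
s = stdev(data)  # sample standard deviation (divides by n - 1)
low2, high2 = t_interval(x_bar, s, len(data), t_crit=2.306)

print(f"90% CI: {low1:.2f} to {high1:.2f}")
print(f"X-bar = {x_bar:.2f}, s = {s:.2f}, 95% CI: {low2:.2f} to {high2:.2f}")
```

With the same nine observations, the t interval is much wider than the earlier known-σ interval, both because s = 9.15 is larger than the assumed σ = 5 and because t values exceed z values at small n.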
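Margin of Error, E, and Determination of the Sample Size
• The general formula for constructing a CI is: CI = statistic ± (critical value) × (standard error of the statistic)
• That is, CI = statistic ± (margin of error). For the population mean, E = zα/2 × σ/SQRT(n).
• Solving for the sample size gives n = (zα/2 × σ/E)^2, rounded up to the next whole number.
• Sample size for unknown σ: (1) σ is estimated by s, or (2) σ is approximated by (H – L)/4, where H and L are the highest and lowest values.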
Examples on Determination of the Sample Size
• A national retail association wants to estimate the average amount of dollars lost each month due to theft in its member stores. Past records show that the highest and lowest dollar amounts lost due to theft were $1325 and $25, respectively. If it wants to be 95% confident that the error in its estimate is no more than $100, how many stores would need to be included in the sample to produce an estimate of the desired accuracy?
[Figure: normal curve with central area 0.95 (0.475 on each side of the center) between –zα/2 and zα/2 = 1.96]
n = ?, E = 100, zα/2 = 1.96, σ ≈ (H – L)/4 = (1325 – 25)/4 = 325
n = (1.96 × 325/100)^2 = 40.58, so 41 stores are needed
Examples on Determination of the Sample Size
• A national retail association wants to estimate the average amount of dollars lost each month due to theft in its member stores. Nine of its member stores lost the following dollar amounts last month. If it wants to be 95% confident that the error in its estimate is no more than $5, how many stores would need to be included in the sample to produce an estimate of the desired accuracy?
Data: 49, 50, 43, 65, 52, 45, 60, 38, 62
n = ?, E = 5, zα/2 = 1.96, σ = ? Estimate σ by s = 9.15
n = (1.96 × 9.15/5)^2 = 12.87, so 13 stores are needed
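Both sample-size examples rearrange the margin of error, E = zα/2 × σ/SQRT(n), into n = (zα/2 × σ/E)^2 and round up. A minimal sketch of that calculation, with a hypothetical helper name `required_sample_size` and the z value 1.96 taken from the examples:

```python
from math import ceil

def required_sample_size(z, sigma, margin_of_error):
    """Smallest n for which z * sigma / sqrt(n) does not exceed the margin of error."""
    return ceil((z * sigma / margin_of_error) ** 2)

# Range-based estimate: sigma is approximated by (H - L) / 4
sigma_from_range = (1325 - 25) / 4
n_range = required_sample_size(1.96, sigma_from_range, 100)

# Pilot-sample estimate: sigma is estimated by s = 9.15 from the nine stores
n_pilot = required_sample_size(1.96, 9.15, 5)

print(f"sigma from range = {sigma_from_range}, stores needed = {n_range}")
print(f"sigma from pilot sample = 9.15, stores needed = {n_pilot}")
```

Rounding up rather than to the nearest integer keeps the achieved margin of error at or below the target.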