Inference About One Population Chapter 12
12.1 Introduction • In this chapter we utilize the approach developed before to describe a population. • Identify the parameter to be estimated or tested. • Specify the parameter’s estimator and its sampling distribution. • Construct a confidence interval estimator or perform a hypothesis test.
12.1 Introduction • We shall develop techniques to estimate and test three population parameters. • Population mean $\mu$ • Population variance $\sigma^2$ • Population proportion $p$
12.2 Inference About a Population Mean When the Population Standard Deviation Is Unknown • Recall that when $\sigma$ is known we use the following statistic to estimate and test a population mean: $z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}}$ • When $\sigma$ is unknown, we use its point estimator $s$, and the z-statistic is then replaced by the t-statistic: $t = \dfrac{\bar{x} - \mu}{s/\sqrt{n}}$
The t-Statistic • When the sampled population is normally distributed, the t statistic is Student t distributed.
The t-Statistic • The t distribution is mound-shaped and symmetrical around zero. • The “degrees of freedom” (a function of the sample size) determine how spread out the distribution is compared to the normal distribution. • [Figure: t densities centered at 0 for d.f. = v1 and d.f. = v2, with v1 < v2]
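A minimal sketch of how the degrees of freedom affect the spread, assuming the standard closed-form Student t density. The function name `t_pdf` and the evaluation point t = 3 are illustrative choices, not part of the slides.

```python
import math
from statistics import NormalDist


def t_pdf(t, df):
    """Density of the Student t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)


# Compare the density far from zero (t = 3) for several degrees of freedom.
# Fewer degrees of freedom put more density in the tails than the normal curve.
for df in (2, 20, 200):
    print("d.f. =", df, "density at t = 3:", round(t_pdf(3.0, df), 5))
print("standard normal density at 3:", round(NormalDist().pdf(3.0), 5))
```

As the degrees of freedom grow, the tail density should shrink toward the standard normal value.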
Testing $\mu$ when $\sigma$ is unknown • Example 12.1: Productivity of newly hired trainees
Testing $\mu$ when $\sigma$ is unknown • Example 12.1 • In order to determine the number of workers required to meet demand, the productivity of newly hired trainees is studied. • It is believed that trainees can process and distribute more than 450 packages per hour within one week of hiring. • Can we conclude that this belief is correct, based on productivity observations of 50 trainees (see file Xm12-01)?
Testing $\mu$ when $\sigma$ is unknown • Example 12.1 – Solution • The problem objective is to describe the population of the number of packages processed in one hour. • The data are interval. • The hypotheses are $H_0: \mu = 450$ and $H_1: \mu > 450$. We want to prove that the trainees reach 90% productivity of experienced workers. • The t statistic is $t = \dfrac{\bar{x} - \mu}{s/\sqrt{n}}$, with d.f. $= n - 1 = 49$.
Testing $\mu$ when $\sigma$ is unknown • Solution continued (solving by hand) • The rejection region is $t > t_{\alpha, n-1}$, where $t_{\alpha, n-1} = t_{.05, 49} \approx t_{.05, 50} = 1.676$.
Testing $\mu$ when $\sigma$ is unknown • The test statistic is $t = 1.89$, and the rejection region is $t > 1.676$. • Since 1.89 > 1.676, we reject the null hypothesis in favor of the alternative. • There is sufficient evidence to infer that the mean productivity of trainees one week after being hired is greater than 450 packages at the .05 significance level.
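The by-hand test can be sketched as follows. This is an illustration only: the file Xm12-01 is not reproduced here, so `toy_sample` is made-up data used to exercise the formula, and the critical value is taken from a t-table rather than computed. The names `t_statistic` and `reject_upper_tail` are hypothetical.

```python
import math
from statistics import mean, stdev


def t_statistic(sample, mu0):
    """t = (xbar - mu0) / (s / sqrt(n)), with s the sample standard deviation."""
    n = len(sample)
    return (mean(sample) - mu0) / (stdev(sample) / math.sqrt(n))


def reject_upper_tail(t, t_critical):
    """One-sided test of H1: mu > mu0. Reject H0 when t exceeds the table value."""
    return t > t_critical


# Hypothetical data (NOT the Xm12-01 file), only to show the calculation.
toy_sample = [2, 4, 4, 4, 5, 5, 7, 9]
toy_t = t_statistic(toy_sample, mu0=4)

# Values reported in the example: t = 1.89 and t_.05,49 ~ 1.676.
slide_decision = reject_upper_tail(1.89, 1.676)
print("toy t statistic:", round(toy_t, 4))
print("reject H0 for the example's values:", slide_decision)
```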
Testing $\mu$ when $\sigma$ is unknown • The p-value is .0323. • Since .0323 < .05, we reject the null hypothesis in favor of the alternative. • There is sufficient evidence to infer that the mean productivity of trainees one week after being hired is greater than 450 packages at the .05 significance level.
Estimating $\mu$ when $\sigma$ is unknown • Confidence interval estimator of $\mu$ when $\sigma$ is unknown: $\bar{x} \pm t_{\alpha/2}\,\dfrac{s}{\sqrt{n}}$, with d.f. $= n - 1$
Estimating $\mu$ when $\sigma$ is unknown • Example 12.2 • An investor is trying to estimate the return on investment in companies that won quality awards last year. • A random sample of 83 such companies is selected, and the return he would have earned by investing in them is calculated. • Construct a 95% confidence interval for the mean return.
Estimating $\mu$ when $\sigma$ is unknown • Solution (solving by hand) • The problem objective is to describe the population of annual returns from buying shares of quality award-winners. • The data are interval. • From the file Xm12-02 we compute $\bar{x}$ and $s$, and from the t-table we determine $t_{.025, 82} \approx t_{.025, 80}$.
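A short sketch of the interval calculation, under stated assumptions: n = 83 comes from the example, but the sample mean and standard deviation below are hypothetical placeholders (the Xm12-02 values are not shown here), and 1.990 is the usual t-table entry for $t_{.025, 80}$. The function name is illustrative.

```python
import math


def mean_confidence_interval(xbar, s, n, t_critical):
    """Return (LCL, UCL) for xbar +/- t_critical * s / sqrt(n)."""
    half_width = t_critical * s / math.sqrt(n)
    return xbar - half_width, xbar + half_width


# xbar and s are hypothetical placeholders, not the Xm12-02 results.
lcl, ucl = mean_confidence_interval(xbar=10.0, s=5.0, n=83, t_critical=1.990)
print("LCL:", round(lcl, 4), "UCL:", round(ucl, 4))
```

Passing the critical value in as an argument keeps the sketch free of any table-lookup code.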
Checking the required conditions • We need to check that the population is normally distributed, or at least not extremely nonnormal. • There are statistical methods to test for normality (one to be introduced later in the book). • From the sample histograms we see…
[Figures: histogram for Xm12-01 (Packages); histogram for Xm12-02 (Returns)]
12.3 Inference About a Population Variance • Sometimes we are interested in making inferences about the variability of processes. • Examples: • The consistency of a production process for quality control purposes. • Investors use variance as a measure of risk. • To draw inferences about variability, the parameter of interest is $\sigma^2$.
12.3 Inference About a Population Variance • The sample variance $s^2$ is an unbiased, consistent and efficient point estimator for $\sigma^2$. • The statistic $\chi^2 = \dfrac{(n-1)s^2}{\sigma^2}$ has a distribution called Chi-squared, with $n - 1$ degrees of freedom, if the population is normally distributed. • [Figure: Chi-squared densities for d.f. = 5 and d.f. = 10]
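Testing and Estimating a Population Variance • From the following probability statement $P(\chi^2_{1-\alpha/2} < \chi^2 < \chi^2_{\alpha/2}) = 1 - \alpha$ we have (by substituting $\chi^2 = (n-1)s^2/\sigma^2$) $\dfrac{(n-1)s^2}{\chi^2_{\alpha/2}} < \sigma^2 < \dfrac{(n-1)s^2}{\chi^2_{1-\alpha/2}}$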
Testing the Population Variance • Example 12.3 (operations management application) • A container-filling machine is believed to fill 1-liter containers so consistently that the variance of the fills will be less than 1 cc (.001 liter). • To test this belief, a random sample of 25 1-liter fills was taken and the results recorded (Xm12-03). • Do these data support the belief that the variance is less than 1 cc at the 5% significance level?
Testing the Population Variance • Solution • The problem objective is to describe the population of 1-liter fills from a filling machine. • The data are interval, and we are interested in the variability of the fills. • The complete test is: $H_0: \sigma^2 = 1$, $H_1: \sigma^2 < 1$. We want to know whether the process is consistent.
Testing the Population Variance • Solving by hand • Note that $(n-1)s^2 = \sum (x_i - \bar{x})^2 = \sum x_i^2 - (\sum x_i)^2/n$ • From the sample (Xm12-03) we can calculate $\sum x_i = 24{,}996.4$ and $\sum x_i^2 = 24{,}992{,}821.3$ • Then $(n-1)s^2 = 24{,}992{,}821.3 - (24{,}996.4)^2/25 = 20.78$ • The test statistic is $\chi^2 = (n-1)s^2/\sigma^2 = 20.78/1 = 20.78$ • There is insufficient evidence to infer that the variance is less than 1.
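Testing the Population Variance • With $\alpha = .05$ (so $1 - \alpha = .95$), the rejection region is $\chi^2 < \chi^2_{.95, 24} = 13.8484$. • Since $\chi^2 = 20.8$ is not less than 13.8484, do not reject the null hypothesis.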
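The hand calculation for Example 12.3 can be checked with a few lines. The sums, the sample size, the hypothesized variance of 1 and the critical value 13.8484 are taken from the example; the function name is an illustrative choice.

```python
def sum_of_squared_deviations(sum_x, sum_x2, n):
    """(n - 1) s^2 computed from the shortcut: sum(x^2) - (sum(x))^2 / n."""
    return sum_x2 - sum_x ** 2 / n


n = 25
ss = sum_of_squared_deviations(24_996.4, 24_992_821.3, n)

sigma2_null = 1.0          # variance under H0
chi2 = ss / sigma2_null    # test statistic (n - 1) s^2 / sigma^2
critical = 13.8484         # table value of chi-squared(.95, 24) from the example

# Lower-tail test: reject H0 only when the statistic falls below the critical value.
reject = chi2 < critical
print("(n-1)s^2 =", round(ss, 2), "| chi2 =", round(chi2, 2), "| reject H0:", reject)
```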
Estimating the Population Variance • Example 12.4 • Estimate the variance of fills in Example 12.3 with 99% confidence. • Solution • We have $(n-1)s^2 = 20.78$. From the Chi-squared table we have $\chi^2_{\alpha/2, n-1} = \chi^2_{.005, 24} = 45.5585$ and $\chi^2_{1-\alpha/2, n-1} = \chi^2_{.995, 24} = 9.88623$
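Estimating the Population Variance • The confidence interval estimate is LCL $= (n-1)s^2/\chi^2_{\alpha/2} = 20.78/45.5585 \approx .456$ and UCL $= (n-1)s^2/\chi^2_{1-\alpha/2} = 20.78/9.88623 \approx 2.10$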
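A small sketch of the interval arithmetic for Example 12.4, using the quantity $(n-1)s^2 = 20.78$ and the two table values quoted in the example. The function name is hypothetical.

```python
def variance_confidence_interval(ss, chi2_upper, chi2_lower):
    """Return (LCL, UCL) = (ss / chi2_{alpha/2}, ss / chi2_{1 - alpha/2})."""
    return ss / chi2_upper, ss / chi2_lower


# ss = (n - 1) s^2 and both chi-squared table values are taken from the example.
lcl, ucl = variance_confidence_interval(20.78, chi2_upper=45.5585, chi2_lower=9.88623)
print("LCL:", round(lcl, 4), "UCL:", round(ucl, 4))
```

Note that the larger table value produces the lower limit, because it appears in the denominator.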
12.4 Inference About a Population Proportion • When the population consists of nominal data, the only inference we can make is about the proportion of occurrence of a certain value. • The parameter p was used before to calculate these probabilities under the binomial distribution.
12.4 Inference About a Population Proportion • Statistic and sampling distribution • The statistic used when making inference about $p$ is $\hat{p} = x/n$, where $x$ is the number of successes and $n$ is the sample size. • Under certain conditions [$np > 5$ and $n(1-p) > 5$], $\hat{p}$ is approximately normally distributed, with $\mu = p$ and $\sigma^2 = p(1-p)/n$.
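Testing and Estimating the Proportion • Test statistic for $p$: $z = \dfrac{\hat{p} - p}{\sqrt{p(1-p)/n}}$ • Interval estimator for $p$ ($1-\alpha$ confidence level): $\hat{p} \pm z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}$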
Additional example: Testing the Proportion • Example 12.5 (Predicting the winner on election day) • Voters are asked by a certain network to participate in an exit poll in order to predict the winner on election day. • Based on the data presented in Xm12-05 (where 1 = Democrat and 2 = Republican), can the network conclude that the Republican candidate will win the state college vote?
Testing the Proportion • Solution • The problem objective is to describe the population of votes in the state. • The data are nominal. • The parameter to be tested is $p$. • Success is defined as “Vote Republican”. • The hypotheses are: $H_0: p = .5$, $H_1: p > .5$ (more than 50% vote Republican)
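Testing the Proportion • Solving by hand • The rejection region is $z > z_{\alpha} = z_{.05} = 1.645$. • From the file we count 407 successes. The number of voters participating is 765. • The sample proportion is $\hat{p} = 407/765 = .532$ • The value of the test statistic is $z = \dfrac{\hat{p} - p}{\sqrt{p(1-p)/n}} = \dfrac{.532 - .5}{\sqrt{.5(1-.5)/765}} = 1.77$ • The p-value is $P(Z > 1.77) = .0382$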
Testing the Proportion • There is sufficient evidence to reject the null hypothesis in favor of the alternative hypothesis. • At the 5% significance level we can conclude that more than 50% voted Republican.
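The exit-poll test can be reproduced as a sketch with the counts given in the example (407 successes out of 765) and the normal approximation from `statistics.NormalDist`. The function name `proportion_z_test` is illustrative.

```python
import math
from statistics import NormalDist


def proportion_z_test(successes, n, p0):
    """Upper-tail z test of H0: p = p0 against H1: p > p0."""
    p_hat = successes / n
    z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)
    p_value = 1 - NormalDist().cdf(z)
    return p_hat, z, p_value


p_hat, z, p_value = proportion_z_test(successes=407, n=765, p0=0.5)
z_critical = NormalDist().inv_cdf(0.95)  # z_.05, about 1.645

print("p_hat:", round(p_hat, 3), "z:", round(z, 2), "p-value:", round(p_value, 4))
print("reject H0:", z > z_critical)
```

The p-value here is computed from the unrounded z, which is why it can differ slightly from a table lookup at z = 1.77.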
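Estimating the Proportion • Nielsen Ratings • In a survey of 2000 TV viewers at 11:40 p.m. on a certain night, 226 indicated they watched “The Tonight Show”. • Estimate the number of TVs tuned to the Tonight Show on a typical night, if there are 100 million potential television sets. Use a 95% confidence level. • Solution: the sample proportion is $\hat{p} = 226/2000 = .113$, so the interval estimate of $p$ is $\hat{p} \pm z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n} = .113 \pm 1.96\sqrt{.113(.887)/2000} = .113 \pm .014$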
Estimating the Proportion • Solution • A confidence interval estimate of the number of viewers who watched the Tonight Show: LCL = .099(100 million) = 9.9 million; UCL = .127(100 million) = 12.7 million
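A sketch of the Nielsen calculation, using the survey counts and the 100 million sets stated in the example. The function name is hypothetical, and the critical value is obtained from `statistics.NormalDist` instead of a z-table.

```python
import math
from statistics import NormalDist


def proportion_confidence_interval(successes, n, confidence):
    """Return (LCL, UCL) for p_hat +/- z_{alpha/2} * sqrt(p_hat (1 - p_hat) / n)."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


lcl, ucl = proportion_confidence_interval(successes=226, n=2000, confidence=0.95)
potential_sets = 100_000_000  # from the example

print("proportion interval:", round(lcl, 3), "to", round(ucl, 3))
print("sets (millions):", round(lcl * potential_sets / 1e6, 1),
      "to", round(ucl * potential_sets / 1e6, 1))
```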
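Selecting the Sample Size to Estimate the Proportion • Recall: The confidence interval for the proportion is $\hat{p} \pm z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}$ • Thus, to estimate the proportion to within $W$, we can write $W = z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}$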
Selecting the Sample Size to Estimate the Proportion • The required sample size is $n = \left(\dfrac{z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})}}{W}\right)^2$
Sample Size to Estimate the Proportion • Example • Suppose we want to estimate the proportion of customers who prefer our company’s brand to within .03 with 95% confidence. • Find the sample size. • Solution • $W = .03$; $1 - \alpha = .95$, therefore $\alpha/2 = .025$, so $z_{.025} = 1.96$ • Since the sample has not yet been taken, the sample proportion is still unknown. • We proceed using either one of the following two methods:
Sample Size to Estimate the Proportion • Method 1: • There is no knowledge about the value of $\hat{p}$. • Let $\hat{p} = .5$. This results in the largest possible $n$ needed for a $1-\alpha$ confidence interval of the form $\hat{p} \pm W$. • If the sample proportion does not equal .5, the actual $W$ will be narrower than .03 with the $n$ obtained by the formula below. • Method 2: • There is some idea about the value of $\hat{p}$. • Use that value of $\hat{p}$ to calculate the sample size. • With $\hat{p} = .5$: $n = \left(\dfrac{1.96\sqrt{.5(1-.5)}}{.03}\right)^2 = 1{,}067.1$, which is rounded up to 1,068.
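Both methods can be sketched with one function. W = .03 and the 95% confidence level are from the example; the prior guess of .2 used for Method 2 is purely hypothetical, and the function name is an illustrative choice.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(w, confidence, p_guess=0.5):
    """Smallest n with z_{alpha/2} * sqrt(p (1 - p) / n) <= w, for the guessed p."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * math.sqrt(p_guess * (1 - p_guess)) / w) ** 2)


# Method 1: no knowledge about p_hat, so use .5 (the most conservative choice).
n_method1 = sample_size_for_proportion(w=0.03, confidence=0.95)

# Method 2: some idea about p_hat; .2 is a hypothetical guess, not from the slides.
n_method2 = sample_size_for_proportion(w=0.03, confidence=0.95, p_guess=0.2)

print("Method 1 (p = .5):", n_method1)
print("Method 2 (p = .2, hypothetical):", n_method2)
```

Because $p(1-p)$ peaks at $p = .5$, any other guess yields a smaller required sample, which is why Method 1 gives the largest n.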