Chapter 12

Chapter 12 Inference About A Population

Inference With Variance Unknown…
• Previously, we looked at estimating and testing the population mean when the population standard deviation ($\sigma$) was known or given:
$$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$$
• But in general we do not know the actual population standard deviation and have to estimate it from the data.
• As soon as we do this, the “Z-statistic” used in all formulas changes to a “t-statistic” [Student t-statistic], provided you are sampling from a normal distribution.

Inference With Variance Unknown…
• When $\sigma$ is unknown, we use its point estimator $s$.
• Note that the t statistic has one parameter called “degrees of freedom”. [the normal has 2 parameters, $\mu$ and $\sigma$]
• The degrees of freedom for the single-mean problems we are working on is given by d.f. = $\nu = n - 1$.

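Testing when $\sigma$ is unknown…
• When the population standard deviation is unknown and the population is normal, the test statistic for testing hypotheses about $\mu$ is:
$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$$
• which is Student t distributed with $\nu = n - 1$ degrees of freedom.
• The confidence interval estimator of $\mu$ is given by:
$$\bar{x} \pm t_{\alpha/2}\,\frac{s}{\sqrt{n}}, \qquad \nu = n - 1$$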

Example 12.1… IDENTIFY
• In a clinical trial, if the average time for a drug to take effect is greater than 450 minutes, it is declared ineffective. Thus, each new drug must be subjected to the following hypothesis test. If the null hypothesis is rejected in favor of the alternative hypothesis, the drug is not approved.
• H0: $\mu \le 450$
• H1: $\mu > 450$
• In general we would use a 5% level of significance, and in this example we are going to randomly sample 50 patients.

Example 12.1… COMPUTE
• Our test statistic is:
$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$$
• With n = 50 data points, we have $\nu = n - 1 = 49$ degrees of freedom. Our hypothesis under question is:
• H1: $\mu > 450$
• Our rejection region becomes:
$$t > t_{\alpha,\nu} = t_{.05,\,49} \approx 1.677$$
• Thus we will reject the null hypothesis in favor of the alternative if our calculated test statistic falls in this region.

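Example 12.1… COMPUTE
• From the data, we calculate $\bar{x} = 460.38$, $s = 38.83$ and thus:
$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}} = \frac{460.38 - 450}{38.83/\sqrt{50}} \approx 1.89$$
• Since $1.89 > t_{.05,\,49} \approx 1.677$,
• we reject H0 in favor of H1. That is, there is sufficient evidence to conclude that the new drug is not effective.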
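
The arithmetic in Example 12.1 can be checked with a few lines of Python. This is only a sketch: the function name `one_sample_t` is illustrative rather than taken from the slides, and the critical value 1.677 is entered by hand from a t table for 49 degrees of freedom at the 5% level.

```python
import math


def one_sample_t(xbar, mu0, s, n):
    """t statistic for a single mean when sigma is unknown."""
    return (xbar - mu0) / (s / math.sqrt(n))


# Summary statistics quoted in Example 12.1
t_stat = one_sample_t(xbar=460.38, mu0=450, s=38.83, n=50)

# Assumption: one-tail table value t_{.05, 49}, typed in by hand
T_CRIT = 1.677

reject_h0 = t_stat > T_CRIT
print(f"t = {t_stat:.2f}, reject H0: {reject_h0}")
```

Hard-coding the table value keeps the sketch dependency-free; a statistics package would normally supply the critical value or a p-value directly.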

Example 12.2… IDENTIFY
• How do we estimate the mean time for a drug to take effect when the standard deviation is unknown? Same problem but different data.
• Assume we take a random sample of n = 83 patients and measure the time it takes for the drug to take effect. We want to construct a 95% confidence interval for the mean time, i.e. what is:
$$\bar{x} \pm t_{\alpha/2}\,\frac{s}{\sqrt{n}}$$

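Example 12.2… COMPUTE
• From the data, we calculate the sample mean $\bar{x}$ and the sample standard deviation $s$.
• For the critical-value term, with $\nu = n - 1 = 82$ degrees of freedom: $t_{\alpha/2,\,\nu} = t_{.025,\,82} \approx 1.99$
• and so:
$$\bar{x} \pm t_{\alpha/2}\,\frac{s}{\sqrt{n}} = 15.02 \pm 1.82$$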

Example 12.2… INTERPRET
• We are 95% confident that the population mean, $\mu$, i.e. the mean time for the drug to become effective, lies between 13.20 minutes and 16.84 minutes.
• If the sample size had been n = 21, what value of t would you use?
• If the sample size had been n = 5, what value of t would you use?
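
The two questions above come down to looking up $t_{\alpha/2}$ with $\nu = n - 1$ degrees of freedom. As a rough sketch of what a t table does, the code below integrates the Student t density numerically and inverts it by bisection, using only the standard library. The function names (`t_pdf`, `t_cdf`, `t_critical`) and the numerical settings are illustrative choices, not part of the slides.

```python
import math


def t_pdf(x, df):
    """Density of the Student t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(x, df, steps=2000):
    """P(T <= x), using Simpson's rule on [0, |x|] and symmetry about 0."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area


def t_critical(upper_tail, df):
    """Value t such that P(T > t) = upper_tail, found by bisection."""
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if 1 - t_cdf(mid, df) > upper_tail:
            lo = mid   # tail area still too large: move right
        else:
            hi = mid
    return (lo + hi) / 2


# 95% confidence means alpha/2 = .025 in each tail
for n in (83, 21, 5):
    print(f"n = {n:2d}, df = {n - 1:2d}, t = {t_critical(0.025, n - 1):.3f}")
```

The smaller the sample, the larger the critical value, so intervals built from small samples are wider for the same $s$.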

Check Requisite Conditions…
• Inference based on the Student t distribution is robust, which means that if the population is nonnormal, the results of the t-test and confidence interval estimate are still valid provided that the population is “not extremely nonnormal”.
• To check this requirement, draw a histogram of the data and see how “bell shaped” the resulting figure is. If a histogram is extremely skewed, that could be considered “extremely nonnormal” and hence t-statistics would not be valid in this case.
• There are formal statistical tests available to test the hypothesis that your data comes from a normal distribution. It is always wise to check this, especially if the sample size is small.

Inference About Population Variance…
• If we are interested in drawing inferences about a population’s variability, the parameter we need to investigate is the population variance: $\sigma^2$
• The sample variance ($s^2$) is an unbiased, consistent and efficient point estimator for $\sigma^2$. Moreover,
• the statistic $\chi^2 = \dfrac{(n-1)s^2}{\sigma^2}$ has a chi-squared distribution,
• with $n - 1$ degrees of freedom.

Testing & Estimating Population Variance
• The test statistic used to test hypotheses about $\sigma^2$ is:
$$\chi^2 = \frac{(n-1)s^2}{\sigma^2}$$
• (which is chi-squared with $\nu = n - 1$ degrees of freedom).
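
To make the variance test concrete, here is a small sketch of the chi-squared statistic. The numbers are hypothetical and do not come from the slides: a sample of n = 25 with $s^2 = 1.5$, testing H1: $\sigma^2 > 1$ at the 5% level. The critical value 36.415 is the table value $\chi^2_{.05,\,24}$, entered by hand.

```python
def chi_squared_statistic(s2, sigma0_sq, n):
    """Test statistic (n - 1) * s^2 / sigma0^2 for a population variance."""
    return (n - 1) * s2 / sigma0_sq


# Hypothetical inputs, chosen only for illustration
n = 25
s2 = 1.5          # sample variance
sigma0_sq = 1.0   # variance stated in the null hypothesis

chi2_stat = chi_squared_statistic(s2, sigma0_sq, n)

# Assumption: upper-tail table value chi-squared_{.05, 24}, typed in by hand
CHI2_CRIT = 36.415

reject_h0 = chi2_stat > CHI2_CRIT
print(f"chi-squared = {chi2_stat:.2f}, reject H0: {reject_h0}")
```

With these made-up inputs the statistic falls just short of the rejection region, so H0 would not be rejected.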

Inference: Population Proportion…
• Test statistic for p:
$$z = \frac{\hat{p} - p}{\sqrt{p(1-p)/n}}$$
• The confidence interval estimator for p is given by:
$$\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$
• (both of which require that np > 5 and n(1–p) > 5)
• If this condition is not satisfied we can still work the problem with a different statistical approach.
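
The proportion formulas can be sketched with Python's standard library, which provides the normal quantile through `statistics.NormalDist`. The sample values below (n = 100, $\hat{p}$ = 0.40, hypothesized p = 0.50) are hypothetical, and the function names are illustrative.

```python
import math
from statistics import NormalDist


def large_sample_ok(n, p):
    """Check the requirement np > 5 and n(1 - p) > 5."""
    return n * p > 5 and n * (1 - p) > 5


def proportion_z(p_hat, p0, n):
    """z statistic for testing a hypothesized proportion p0."""
    return (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)


def proportion_ci(p_hat, n, confidence=0.95):
    """Confidence interval p_hat +/- z_{alpha/2} * sqrt(p_hat(1 - p_hat)/n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# Hypothetical sample, for illustration only
n, p_hat, p0 = 100, 0.40, 0.50

if large_sample_ok(n, p0):
    z_stat = proportion_z(p_hat, p0, n)
    low, high = proportion_ci(p_hat, n)
    print(f"z = {z_stat:.2f}, 95% CI = ({low:.3f}, {high:.3f})")
```

Note that the test statistic uses the hypothesized p in the standard error, while the interval uses $\hat{p}$, matching the two formulas above.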

Determine the Sample Size necessary to estimate the population proportion within ± B with 95% confidence…
• Two methods: in each case we choose a value for $\hat{p}$, then
• solve the equation for n:
$$n = \left(\frac{z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})}}{B}\right)^2$$
• Method 1: no knowledge of even a rough value of $\hat{p}$. This is a ‘worst case scenario’, so we substitute $\hat{p}$ = .50
• Method 2: we have some idea about the value of $\hat{p}$. This is a better scenario and we substitute in our estimated value.

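Selecting the Sample Size…
• Method 1: no knowledge of value of $\hat{p}$, use 50%:
$$n = \left(\frac{1.96\sqrt{(.50)(.50)}}{B}\right)^2 = \frac{0.9604}{B^2}$$
• Method 2: some idea about a possible value, say 20%:
$$n = \left(\frac{1.96\sqrt{(.20)(.80)}}{B}\right)^2 = \frac{0.6147}{B^2}$$
• Thus, we can sample fewer people if we already have a reasonable estimate of the population proportion before starting.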
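
The comparison between the two methods can be sketched numerically. The bound B = 0.03 used below is a hypothetical choice, since no value of B appears in the text, and the function name is illustrative. The result is rounded up because a sample size must be a whole number that still meets the bound.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(bound, p_guess, confidence=0.95):
    """Smallest n with z_{alpha/2} * sqrt(p(1 - p)/n) <= bound."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n = (z * math.sqrt(p_guess * (1 - p_guess)) / bound) ** 2
    return math.ceil(n)


B = 0.03  # assumption: bound chosen only for illustration

n_worst_case = sample_size_for_proportion(B, 0.50)  # Method 1
n_informed = sample_size_for_proportion(B, 0.20)    # Method 2

print(f"Method 1 (p = .50): n = {n_worst_case}")
print(f"Method 2 (p = .20): n = {n_informed}")
```

The product $\hat{p}(1-\hat{p})$ is largest at .50, which is why Method 1 always gives the largest required sample for a given bound.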
