Estimating Population Mean: T-Procedures Robustness and Confidence Intervals

SECTION 10.2 Estimating a Population Mean

What’s the difference between what we did in Section 10.1 and what we are beginning in Section 10.2?
• In reality, the standard deviation σ of the population is unknown, so the procedures from the last section cannot be applied directly. However, the logic behind those procedures still applies.
• To be more realistic, we estimate σ from the collected data using the sample standard deviation s.

Conditions for Inference about a Population Mean
• The data are an SRS from the population.
• Observations from the population have a Normal distribution with an unknown mean (μ) and unknown standard deviation (σ).
• The individual observations are assumed to be independent when calculating a confidence interval. When we are sampling without replacement from a finite population, it is sufficient to verify that the population is at least 10 times the sample size.

CAUTION
• Be sure to check that the conditions for constructing a confidence interval for the population mean are satisfied before you perform any calculations.

ROBUSTNESS
• ROBUST: A procedure is robust if its confidence level does not change very much when certain assumptions are violated.
• Fortunately for us, the t-procedures are robust in certain situations.
• Therefore . . .

This is when we use the t-procedures:
• It is more important for the data to be an SRS from the population than for the population to have a Normal distribution.
• If n is less than 15, the data must be close to Normal (no outliers, no clear skewness) to use t-procedures.
• If n is at least 15, the t-procedures can be used unless there are outliers or strong skewness.
• If n≥30, t-procedures can be used even in the presence of strong skewness, but outliers must still be examined.
• Essentially, as long as there are no serious departures from Normality (especially outliers), the t-procedures still work quite well.
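
The sample-size guidelines above can be sketched as a small decision helper. This is only an illustration: the function name `t_procedure_advice` and its boolean inputs are hypothetical, and judging outliers or skewness still requires looking at a plot of the data.

```python
def t_procedure_advice(n, outliers, strong_skew, roughly_normal):
    """Return a short recommendation based on the rules of thumb above.

    All four arguments are judgments the analyst makes from the data
    (for example from a boxplot or Normal probability plot).
    """
    if n < 15:
        # Small samples: only safe when the data look close to Normal.
        if roughly_normal and not outliers:
            return "use t"
        return "do not use t"
    if n < 30:
        # Moderate samples: fine unless outliers or strong skewness appear.
        if outliers or strong_skew:
            return "do not use t"
        return "use t"
    # Large samples: skewness is tolerated, outliers still need a look.
    if outliers:
        return "examine outliers first"
    return "use t"


print(t_procedure_advice(10, outliers=False, strong_skew=False, roughly_normal=True))
print(t_procedure_advice(20, outliers=False, strong_skew=True, roughly_normal=False))
print(t_procedure_advice(45, outliers=False, strong_skew=True, roughly_normal=False))
```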

Standard Error
• In this setting, the sample mean x̄ has a Normal sampling distribution whose mean equals the population mean μ.
• Since we do not know σ, we replace the standard deviation of x̄, σ/√n, with s/√n. This is called the standard error of the sample mean.
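
As a rough illustration of the standard error s/√n, the sketch below computes it with Python's standard library, using the bacteria counts from the milk example in this section. The function name `standard_error` is made up for this example.

```python
import math
import statistics


def standard_error(sample):
    """Standard error of the sample mean: s / sqrt(n)."""
    n = len(sample)
    s = statistics.stdev(sample)  # sample standard deviation (divides by n - 1)
    return s / math.sqrt(n)


# Bacteria counts per milliliter from the milk example in this section.
milk = [5370, 4890, 5100, 4500, 5260, 5150, 4900, 4760, 4700, 4870]
print(round(standard_error(milk), 2))
```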

Degrees of Freedom
• Commonly listed as df
• Equal to n-1
• When a t-distribution has k degrees of freedom, we write it as t(k)
• When the actual df does not appear in Table C, use the greatest df available that is less than your desired df
• This gives a slightly wider confidence interval than needed for the stated confidence level, so the interval is conservative
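
The table-lookup rule above can be sketched in a few lines. The list of df rows below is an assumption about how a typical Table C is laid out (1 through 30, then 40, 50, 60, 80, 100, 1000); check it against the table you are actually using. The name `conservative_df` is hypothetical.

```python
# Assumed df rows of a typical t table; adjust to match your own Table C.
TABLE_DF = list(range(1, 31)) + [40, 50, 60, 80, 100, 1000]


def conservative_df(df, table_df=TABLE_DF):
    """Return the largest tabled df that does not exceed the actual df."""
    candidates = [k for k in table_df if k <= df]
    if not candidates:
        raise ValueError("df must be at least the smallest tabled value")
    return max(candidates)


print(conservative_df(9))   # df appears in the table, so it is used as-is
print(conservative_df(36))  # falls between rows, so the smaller row is used
```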

Density Curves for t Distributions
• Bell-shaped and symmetric
• Greater spread than the standard Normal curve
• As the degrees of freedom (or sample size) increase, the t density curves look more and more like the standard Normal curve
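
To see these properties numerically, the following sketch evaluates the t density and the standard Normal density at the center and in the tail. It uses only the standard library; the helper names `t_pdf` and `normal_pdf` are chosen for this example.

```python
import math


def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def normal_pdf(x):
    """Density of the standard Normal distribution."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)


# Lower peak and heavier tails for small df; both approach the Normal values.
for df in (2, 9, 30, 100):
    print(df, round(t_pdf(0, df), 4), round(t_pdf(3, df), 4))
print("Normal", round(normal_pdf(0), 4), round(normal_pdf(3), 4))
```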

Confidence Intervals
• A level C confidence interval for μ is x̄ ± t* · s/√n
• t* is the upper (1-C)/2 critical value for the t(n-1) distribution
• We find t* using the table or our calculator
• t*=invT(area to left of t*, df)
• We interpret these intervals the same way we did in the last chapter.
• This interval is exactly correct when the population distribution is Normal and is approximately correct for large n in other cases.
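
For readers without a calculator at hand, here is a rough standard-library stand-in for invT. It is a sketch, not how the calculator works internally: it integrates the t density numerically with Simpson's rule and then searches for the critical value by bisection. The names `t_cdf` and `inv_t` are hypothetical; a statistics package such as SciPy provides an equivalent through `scipy.stats.t.ppf`.

```python
import math


def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    return math.exp(log_coef) / math.sqrt(df * math.pi) * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=2000):
    """Area to the left of x, via Simpson's rule on [0, |x|] plus symmetry."""
    a = abs(x)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area


def inv_t(area_left, df):
    """Value t with the given area to its left (bisection search)."""
    lo, hi = -1000.0, 1000.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < area_left:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# For confidence level C, the area to the left of t* is (1 + C) / 2.
C = 0.90
print(round(inv_t((1 + C) / 2, 9), 3))
```

For C = 0.90 and df = 9 the result should agree with the tabled critical value to about three decimal places.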

INFERENCE TOOLBOX (p 631)
DO YOU REMEMBER WHAT THE STEPS ARE???
Steps for constructing a CONFIDENCE INTERVAL:
• 1. PARAMETER: Identify the population of interest and the parameter you want to draw a conclusion about.
• 2. CONDITIONS: Choose the appropriate inference procedure. VERIFY conditions (SRS, Normality, Independence) before using it.
• 3. CALCULATIONS: If the conditions are met, carry out the inference procedure.
• 4. INTERPRETATION: Interpret your results in the context of the problem. CONCLUSION, CONNECTION, CONTEXT (meaning that our conclusion about the parameter connects to our work in part 3 and includes appropriate context)

Example: GOT MILK?
A milk processor monitors the number of bacteria per milliliter in raw milk received for processing. A random sample of 10 one-milliliter specimens from milk supplied by one producer gives the following data:
5370, 4890, 5100, 4500, 5260, 5150, 4900, 4760, 4700, 4870
Construct a 90% confidence interval.
• We want to estimate μ = the mean number of bacteria per milliliter in all of the milk from this supplier.
• Since we don’t know σ, we should construct a one-sample t interval for μ.
• We must be confident that the data are an SRS from the producer’s milk. We must learn how the sample was chosen to see if it can be regarded as an SRS (we are only told that it is a “random sample”).
• A boxplot and a Normal probability plot of the data show no outliers and no strong skewness. This gives us little reason to doubt the Normality of the population from which this sample was drawn. In practice, we would probably rely on the fact that past measurements of this type have been roughly Normal.
• Since these measurements came from a random sample of specimens, they should be independent (assuming that there were many, at least 100, one-milliliter specimens available at the milk processing facility).

Example: GOT MILK? Cont.
• Entering these data into a calculator gives x̄=4950 and s=268.45, with df = 10-1 = 9. So a 90% confidence interval for the mean bacteria count per milliliter in this producer’s milk is x̄ ± t* · s/√n = 4950 ± 1.833 · 268.45/√10 = 4950 ± 155.6 = (4794.4, 5105.6).
• We are 90% confident that the actual mean number of bacteria per milliliter of milk from this supplier is between 4794.4 and 5105.6, because the method we used yields intervals that capture the true mean in 90% of all possible samples.
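
The calculation in this example can be checked with a short script. This is a sketch that assumes the critical value t* = 1.833 is read from the t table for df = 9 and 90% confidence; the variable names are chosen for illustration.

```python
import math
import statistics

# Bacteria counts per milliliter from the example.
counts = [5370, 4890, 5100, 4500, 5260, 5150, 4900, 4760, 4700, 4870]
t_star = 1.833  # assumed table value: df = 9, 90% confidence

n = len(counts)
x_bar = statistics.mean(counts)
s = statistics.stdev(counts)          # sample standard deviation
margin = t_star * s / math.sqrt(n)    # t* times the standard error
lower, upper = x_bar - margin, x_bar + margin

print(f"x-bar = {x_bar}, s = {s:.2f}, df = {n - 1}")
print(f"90% CI: ({lower:.1f}, {upper:.1f})")
```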

Paired t Procedures
• Recall that matched pairs studies are a form of block design in which just two treatments are being compared.
• Also, experiments are rarely done on randomly selected subjects. Random selection allows us to generalize results to a larger population, but random assignment of treatments to subjects allows us to compare treatments.
• Be careful to distinguish a matched pairs setting from a two-sample setting.
• The real key is independence: in a two-sample setting the two samples are independent of each other, while in a matched pairs setting the two observations in each pair are not.
• TREAT THE DIFFERENCES from a matched pairs study as a single sample.
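
A minimal sketch of "treat the differences as a single sample" follows. The before and after values are made-up numbers used only to show the mechanics, not data from any study, and the function name `paired_t_interval` is hypothetical. The critical value 2.776 is the table value for df = 4 and 95% confidence.

```python
import math
import statistics


def paired_t_interval(first, second, t_star):
    """One-sample t interval applied to the pairwise differences."""
    if len(first) != len(second):
        raise ValueError("matched pairs need equal-length lists")
    diffs = [b - a for a, b in zip(first, second)]  # one difference per pair
    n = len(diffs)
    d_bar = statistics.mean(diffs)
    s_d = statistics.stdev(diffs)
    margin = t_star * s_d / math.sqrt(n)
    return d_bar - margin, d_bar + margin


# Hypothetical measurements on the same five subjects (illustration only).
before = [10, 12, 9, 11, 13]
after = [12, 15, 10, 14, 14]

lower, upper = paired_t_interval(before, after, t_star=2.776)  # df = 5 - 1 = 4
print(f"95% CI for the mean difference: ({lower:.3f}, {upper:.3f})")
```

Note that the degrees of freedom come from the number of pairs, not the total number of measurements.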

TECHNOLOGY
• As always, you will be allowed unrestricted use of your calculator on quizzes and tests (as well as the actual AP Exam). For this reason, ALWAYS be certain to write down the values of key numbers that are being used (means, standard deviations, degrees of freedom, significance levels, etc.) along with results of the calculator procedures in order to receive full credit.
• The calculator information is available in your book on pages 661-662.
• We are now using the T Interval instead of the Z Interval
• Plug in exactly what you are asked for
