Sampling Distributions & Confidence Intervals: Central Limit Theorem Overview

Lecture 8 Dan Piett STAT 211-019 West Virginia University

Last Week • Continuous Distributions • Normal Distributions • Normal Probabilities • Normal Percentiles

Overview • 8.1 Distribution of X-bar and the Central Limit Theorem • 8.2 Large Sample Confidence Intervals for Mu • 8.3 Small Sample Confidence Intervals for Mu

Section 8.1 The Sampling Distribution of the Sample Mean and the Central Limit Theorem

Distribution of the Mean • Suppose we generate multiple samples of size n from a population; we will get a sample mean from each sample. • These sample means will have their own distribution. • The sample mean is a random variable with its own mean and standard deviation, a.k.a. the standard error. (See notation on board) • This is known as the Sampling Distribution of the Sample Mean • Some questions to think about: • What is the shape of the sampling distribution? • What are the mean and standard error of the sampling distribution?
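The repeated-sampling idea above can be tried in a short simulation. This is only an illustrative sketch: the population (normal, mean 50, standard deviation 20), the sample size of 25, the number of repetitions, and the seed are all assumptions chosen for the demo, not values from the lecture notation.

```python
import random
import statistics

def simulate_sample_means(mu, sigma, n, reps, seed=0):
    """Draw `reps` samples of size n from N(mu, sigma) and return their means."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(reps)
    ]

# Assumed demo values: mu = 50, sigma = 20, n = 25, 5000 repeated samples.
means = simulate_sample_means(mu=50, sigma=20, n=25, reps=5000)

print("mean of sample means:", round(statistics.fmean(means), 2))
print("sd of sample means:  ", round(statistics.stdev(means), 2))
print("sigma / sqrt(n):     ", 20 / 25 ** 0.5)
```

The first printed value should land close to the population mean, and the second close to sigma/sqrt(n), which is the standard error discussed on the next slide.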

The Central Limit Theorem • The distribution of the sample mean depends on the shape of the distribution of X • If X is Normal • The distribution of the sample mean is normal • Mean mu • (The same as the mean of X) • Standard error sigma/sqrt(n) • (The standard deviation of X divided by the square root of the sample size) • What if X is not normal?

Central Limit Theorem Contd • So what if X is not Normal? Assume X~? • The shape of the distribution of the sample mean will depend on the sample size • If n<20 • We cannot be certain of the distribution of the sample mean. It is not necessarily normal or even approximately normal • If n≥20 • The Central Limit Theorem states that the distribution of the sample mean will be approximately normal • Mean mu • (The same as the mean of X) • Standard error sigma/sqrt(n) • (The standard deviation of X divided by the square root of the sample size)
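A small simulation can make the non-normal case concrete. The sketch below assumes an exponential population with mean 20 (a strongly right-skewed shape picked only for illustration) and compares the skewness of the sample means for a very small and a large sample size. The sample sizes, repetition count, and seed are demo assumptions.

```python
import random
import statistics

def skewness(values):
    """Sample skewness: the average cubed z-score (near 0 for a symmetric shape)."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean(((v - m) / s) ** 3 for v in values)

def exp_sample_means(n, reps, rng, mean=20.0):
    """Means of `reps` samples of size n from an exponential population."""
    return [
        statistics.fmean(rng.expovariate(1 / mean) for _ in range(n))
        for _ in range(reps)
    ]

rng = random.Random(1)  # fixed seed so the demo is repeatable
small = exp_sample_means(n=2, reps=4000, rng=rng)
large = exp_sample_means(n=50, reps=4000, rng=rng)

print("skewness of sample means, n = 2: ", round(skewness(small), 2))
print("skewness of sample means, n = 50:", round(skewness(large), 2))
```

The skewness for n = 50 should come out much closer to 0 than for n = 2, which is the sense in which the sample mean approaches normality as n grows.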

Examples • Give the sampling distribution of the sample mean for the following distributions: • X is normally distributed with a mean of 50 and a standard deviation of 20. What is the distribution of the sample mean of n = 25? • X is not normal (right-skewed) with a mean of 20 and a standard deviation of 10. What is the distribution of the sample mean of n = 100? • X is normally distributed with a mean of 100 and a standard deviation of 18. What is the distribution of the sample mean of n = 9?
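The arithmetic for these three examples can be sketched as follows. The helper name `standard_error` is made up for this illustration. The shape comes from the rules above: the sample mean is normal in the first and third examples because X is normal, and approximately normal in the second because n = 100 is large.

```python
import math

def standard_error(sigma, n):
    """Standard error of the sample mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

# (population mean, population sd, sample size) for the three examples
examples = [(50, 20, 25), (20, 10, 100), (100, 18, 9)]

for mu, sigma, n in examples:
    se = standard_error(sigma, n)
    print(f"n = {n}: mean = {mu}, standard error = {se:g}")
```

The standard errors work out to 4, 1, and 6, while the mean of the sample mean stays equal to the population mean in each case.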

Probabilities • Since the sample mean follows a normal distribution (under the appropriate conditions), we can calculate probabilities much like last week. • All methods are exactly the same except now we calculate the z-score using our new mean and standard error: z = (x-bar - mu) / (sigma/sqrt(n)).

Example: Back to SAT Scores • From last week we said that SAT Math Scores are Normally distributed with a mean of 500 and a standard deviation of 100. • X~N(500,100) • What is the sampling distribution of the sample mean of a class of 25 students? • What is the probability that A RANDOMLY SELECTED STUDENT scores above a 600 on the SAT Math section? • What is the probability that THE MEAN SCORE OF THE 25 STUDENTS is above 600?
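One way to check the two SAT probabilities is with Python's standard-library `statistics.NormalDist`. This is a sketch, not the table-based method used in class, and it reads the second number in N(500,100) as the standard deviation, as stated in the slide.

```python
import math
from statistics import NormalDist

mu, sigma, n = 500, 100, 25

one_student = NormalDist(mu, sigma)                # distribution of X
class_mean = NormalDist(mu, sigma / math.sqrt(n))  # distribution of X-bar, SE = 20

p_student = 1 - one_student.cdf(600)  # z = (600 - 500) / 100 = 1
p_mean = 1 - class_mean.cdf(600)      # z = (600 - 500) / 20 = 5

print(f"P(X > 600)     = {p_student:.4f}")
print(f"P(X-bar > 600) = {p_mean:.2e}")
```

A single student scores above 600 roughly 16% of the time, but a class average above 600 sits five standard errors above the mean, so its probability is essentially zero.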

Section 8.2 Large Sample Confidence Intervals

Point Estimator • Suppose we do not know the true population mean. How can we estimate it? • We could take a sample and use a statistic such as the sample mean to estimate it • This is known as a point estimate • But it is unlikely that our sample mean is going to exactly match the population mean even under perfect conditions • For this reason, it is better to state that we believe the true mean is between two numbers a and b • a < µ < b • We can estimate a and b using a confidence interval

Confidence Intervals • Our confidence intervals will always be of the form: • Sample Statistic ± critical value * error term • For the population mean, our sample statistic is x-bar • Our critical value will be either Z or t (I will explain t later) • Our error term will be the standard error of the sample mean

Large Sample Confidence Intervals • Recall if n is “large” (≥20), X-bar’s dist. is approximately normal with mean mu and standard error sigma/sqrt(n) • 95% CI: x-bar ± 1.96 * s/sqrt(n), where the sample standard deviation s stands in for sigma

Example • Find the 95% confidence interval for the mean SAT Math Score for x-bar = 502, s = 8, n = 36 • 502 ± 1.96*8/6 • (499.387, 504.613) • Conclusion: • We are 95% confident that the true mean SAT Math Score is between 499.387 and 504.613 • Always be sure to state your conclusion
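The same interval can be computed in a few lines. The function name `z_interval` is a hypothetical helper introduced here for illustration; it implements the form sample statistic ± critical value * standard error.

```python
import math

def z_interval(xbar, s, n, z=1.96):
    """Large-sample confidence interval: xbar ± z * s / sqrt(n)."""
    margin = z * s / math.sqrt(n)
    return xbar - margin, xbar + margin

low, high = z_interval(xbar=502, s=8, n=36)
print(f"({low:.3f}, {high:.3f})")  # prints (499.387, 504.613)
```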

Confidence Levels • In the previous slide, we used a confidence level of 95%. • This corresponded to a critical value of 1.96 • We commonly use 3 different confidence levels: • 90% • Critical Value of 1.645 • 95% • Critical Value of 1.96 • 99% • Critical Value of 2.576
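Critical values like these come from the standard normal distribution: for a given confidence level, the critical value leaves half of the remaining area in each tail. As a sketch, the standard-library inverse CDF can be used in place of the Z table. `z_critical` is a name made up for this example.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Two-sided critical value: z with area (1 - confidence) / 2 in each tail."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: z = {z_critical(level):.3f}")
```

To three decimals this gives 1.645, 1.960, and 2.576 for the 90%, 95%, and 99% levels.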

Notes on the Error Term • The error term can be affected by 3 different things • The sample size, n • The larger n, the smaller the error term • The standard deviation, s • The smaller s, the smaller the error term • The confidence level • The higher the confidence level, the larger the critical value, the larger the error term • We cannot choose s in practice, but we can choose the confidence level and often n
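These three effects can be seen by changing one input at a time. The sketch below starts from assumed baseline values (s = 8, n = 36, 95% confidence) and varies each in turn. `margin_of_error` is a hypothetical helper name.

```python
import math

def margin_of_error(s, n, z):
    """Error term of the interval: critical value times s / sqrt(n)."""
    return z * s / math.sqrt(n)

base = margin_of_error(s=8, n=36, z=1.96)
more_data = margin_of_error(s=8, n=144, z=1.96)        # 4 times the sample size
less_spread = margin_of_error(s=4, n=36, z=1.96)       # half the standard deviation
more_confidence = margin_of_error(s=8, n=36, z=2.576)  # 99% instead of 95%

print(f"baseline:        {base:.3f}")
print(f"n = 144:         {more_data:.3f}")
print(f"s = 4:           {less_spread:.3f}")
print(f"99% confidence:  {more_confidence:.3f}")
```

Because n sits under a square root, quadrupling the sample size only halves the error term, the same effect as halving s.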

Section 8.3 Small Sample Confidence Intervals

Small Sample Confidence Intervals • For “large” sample sizes, we have the convenience of knowing that the distribution of the sample mean is approximately normal • We can use the Z table (1.645, 1.96, 2.576) • For “small” sample sizes (<20) we have to do a little more work and we must know that X is normal • Our Central Limit Theorem rules do not apply • Two Cases • The population standard deviation is known • The population standard deviation is unknown

Case 1: Sigma is known • Good news! • In this case we handle our confidence intervals in the exact same way we would for a large sample • Example: • Compute a 99% confidence interval for the population mean when x-bar= 43.2, sigma = 18, n = 16 • (31.6, 54.8) • We are 99% confident that the population mean is between 31.6 and 54.8

Case 2: Sigma is unknown • When the population standard deviation is unknown we need to make a slight adjustment to our formula. • The adjustment is in the critical value. Rather than using our Z values (1.645, 1.96, 2.576) we will be using t values • t values come from the t-distribution. You will only need to know the t-distribution for inference, not probability.

T-distribution • t values can be found on Table F • Notes about t • t is mound shaped and symmetric about the mean, 0 • Just like the standard normal • It looks much like the standard normal, but with heavier tails. • Finding a t value on the table requires a parameter, the degrees of freedom • Degrees of freedom are equal to n – 1 • As n increases, t approaches Z

Example • Construct a 95% confidence interval for the mean weight of apples (in grams): • x-bar = 183, s = 14.1, n = 16 • First find t • Alpha/2 = .025 (because of 95% confidence) • df = n-1 = 15 • t = 2.13 • 183 ± 2.13*14.1/sqrt(16) • (175.5, 190.5) • We are 95% confident that the mean weight of apples is between 175.5 and 190.5 grams
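The apple example can be reproduced with a short sketch. Python's standard library has no t-distribution, so the critical value is passed in by hand as if read from the t table. `t_interval` is a hypothetical helper name, and 2.13 is the table value used in the example above.

```python
import math

def t_interval(xbar, s, n, t_crit):
    """Small-sample interval: xbar ± t * s / sqrt(n), with t read from a t table."""
    margin = t_crit * s / math.sqrt(n)
    return xbar - margin, xbar + margin

# t_crit = 2.13 is the table value for df = 15 and alpha/2 = .025 (supplied by hand).
low, high = t_interval(xbar=183, s=14.1, n=16, t_crit=2.13)
print(f"({low:.1f}, {high:.1f})")  # prints (175.5, 190.5)
```

Using 1.96 in place of 2.13 would give a narrower interval. The larger t value is the price paid for estimating the standard deviation from a small sample.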
