ENGR 224/STAT 224 Probability and Statistics Lecture 20
Constructing Confidence Intervals • Recall, the (1-α)*100% confidence interval for μ is $\bar{x} - z_{\alpha/2}\,\sigma/\sqrt{n} < \mu < \bar{x} + z_{\alpha/2}\,\sigma/\sqrt{n}$. • We assumed that the population standard deviation σ is known, or that it could at least be approximated by the sample standard deviation s when n is large. • What happens if σ is unknown and n is small (n < 30)?
Central Limit Theorem • Recall that when n is sufficiently large, our random variable, the sample mean, can be approximated by a normal distribution according to the CLT. • We transformed this nonstandard normal problem to a standard problem through the use of the standard z-score $z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$.
Student’s t-distribution • When n is small, the classical CLT no longer guarantees that the sample mean is approximately normally distributed, so it cannot help us here. • If we have a small sample (n < 30) and wish to construct a confidence interval for the mean, we can use Student’s t-distribution, provided the sample is drawn from a normally distributed population.
Student’s t-distribution • Since σ is unknown, we must use s (the sample standard deviation) as a point estimate of σ. • We convert the nonstandard problem to a standard t-distributed problem through the use of the t-score $t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$, which has n-1 degrees of freedom.
Properties of Student’s t-distribution • Continuous with mean 0 • Symmetric and bell-shaped • Shape depends upon the degrees of freedom, which is one less than the sample size: df = n-1. • Lower in the center and higher in the tails than the normal; as n → ∞, the t-distribution approaches the standard normal N(0, 1). • See the table inside the back cover of the text. • Example: $t_{14,\,0.025} = 2.145$
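Confidence Interval for the mean when σ is unknown and n is small The (1-α)*100% confidence interval for the population mean μ is $\bar{x} - t_{n-1,\,\alpha/2}\, s/\sqrt{n} < \mu < \bar{x} + t_{n-1,\,\alpha/2}\, s/\sqrt{n}$. The margin of error E is in this case $E = t_{n-1,\,\alpha/2}\, s/\sqrt{n}$. N.B. The sample is assumed to be drawn from a normal population.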
Example: Heat Capacity of Coal The following are the heat-producing capabilities of coal from a particular mine (in millions of calories per ton): 8,500 8,330 8,480 7,960 8,030. Construct a 99% confidence interval for the true mean heat capacity. Solution: sample mean is 8260.0, sample std. dev. is 251.9, degrees of freedom = 4, α = 0.01, $t_{4,\,0.005} = 4.604$, so 7741.4 < μ < 8778.6.
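The coal example can be checked numerically. The following is a minimal Python sketch using only the standard library; the critical value 4.604 for t with 4 degrees of freedom and upper-tail area 0.005 is taken from a standard t-table (an assumption, since the standard library has no t-quantile function), and the function name `t_interval` is hypothetical.

```python
import math
import statistics


def t_interval(data, t_crit):
    """Return (mean, sample std. dev., margin of error, lower, upper).

    t_crit must be the table value t_{n-1, alpha/2} for the chosen
    confidence level; it is supplied by the caller, not computed here.
    """
    n = len(data)
    xbar = statistics.mean(data)
    s = statistics.stdev(data)          # sample std. dev. (divides by n-1)
    margin = t_crit * s / math.sqrt(n)  # E = t * s / sqrt(n)
    return xbar, s, margin, xbar - margin, xbar + margin


# Heat capacities from the slide (millions of calories per ton).
heat = [8500, 8330, 8480, 7960, 8030]

# Assumed table value: t_{4, 0.005} = 4.604 (99% confidence, df = 4).
xbar, s, margin, lower, upper = t_interval(heat, t_crit=4.604)
print(f"mean = {xbar:.1f}, s = {s:.1f}, E = {margin:.1f}")
print(f"99% CI: {lower:.1f} < mu < {upper:.1f}")
```

The bounds agree with the interval 7741.4 < μ < 8778.6 given in the solution.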
Practice Problem 33 (p 277) Using R, we find:
> boxplot(with(ex07.33,C1))
> mean(with(ex07.33,C1))
[1] 438.2941
> sqrt(var(with(ex07.33,C1)))
[1] 15.14416
> qt(.975,16)
[1] 2.119905
> qt(.975,length(with(ex07.33,C1))-1)*sqrt(var(with(ex07.33,C1)))/sqrt(length(with(ex07.33,C1)))
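The final R command computes the margin of error, but its output is not shown. As a rough cross-check, the same quantity can be rebuilt in Python from the summary values printed above. This is only a sketch: the data set ex07.33 itself is not reproduced here, and the sample size n = 17 is inferred from the 16 degrees of freedom passed to qt.

```python
import math

# Summary values copied from the R session above.
xbar = 438.2941      # sample mean
s = 15.14416         # sample standard deviation
t_crit = 2.119905    # qt(.975, 16)
n = 17               # inferred: df = n - 1 = 16

# Margin of error and the resulting 95% confidence interval for the mean.
margin = t_crit * s / math.sqrt(n)
lower, upper = xbar - margin, xbar + margin
print(f"E = {margin:.3f}")
print(f"95% CI: {lower:.2f} < mu < {upper:.2f}")
```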
Shortcut for finding $z_{\alpha/2}$ • Recall that as n → ∞ the Student’s t-distribution approaches the normal distribution. • Look at the t-table inside the back cover: the last row gives the values of $t_{n-1,\,\alpha/2}$ as n becomes large, which are essentially $z_{\alpha/2}$. • Therefore, for some common values of α we are able to find $z_{\alpha/2}$ quite quickly. • $z_{0.025} = 1.960$, $z_{0.10} = 1.282$
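The two z values quoted above can also be obtained without a table. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `z_upper` is hypothetical.

```python
from statistics import NormalDist


def z_upper(tail_area):
    """Return z such that the area to the right of z under N(0, 1) is tail_area."""
    return NormalDist().inv_cdf(1.0 - tail_area)


z_025 = z_upper(0.025)  # used for a 95% confidence interval
z_10 = z_upper(0.10)    # used for an 80% confidence interval
print(f"z_0.025 = {z_025:.3f}")
print(f"z_0.10  = {z_10:.3f}")
```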
Prediction Interval for a Single Future Value • Given some sample data that we already have, if one more observation is made, could you predict its value? • The natural point estimate of the next value is the sample mean. • But, as with a point estimate of the mean, the point prediction alone tells us nothing about its reliability.
Example The weights of gidgets in the gadget are normally distributed. A sample of size 5 yielded a mean of 10 and a standard deviation of 2. Construct both a 90% confidence interval for the mean weight and a 90% prediction interval for the weight of the next gidget to be measured.
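One way to work this example is sketched below in Python. It assumes the standard textbook forms for a normal population: the confidence interval $\bar{x} \pm t_{n-1,\,\alpha/2}\, s/\sqrt{n}$ and the prediction interval $\bar{x} \pm t_{n-1,\,\alpha/2}\, s\sqrt{1 + 1/n}$. The critical value $t_{4,\,0.05} = 2.132$ is read from a standard t-table; neither the formula for the prediction interval nor this table value appears on the slide, so treat both as assumptions.

```python
import math

n = 5            # sample size
xbar = 10.0      # sample mean
s = 2.0          # sample standard deviation
t_crit = 2.132   # assumed table value t_{4, 0.05} for 90% confidence

# Confidence interval for the mean: uncertainty in xbar only.
ci_margin = t_crit * s / math.sqrt(n)
ci = (xbar - ci_margin, xbar + ci_margin)

# Prediction interval for one new observation: uncertainty in xbar
# plus the variability of the new observation itself.
pi_margin = t_crit * s * math.sqrt(1 + 1 / n)
pi = (xbar - pi_margin, xbar + pi_margin)

print(f"90% CI for the mean:     {ci[0]:.2f} to {ci[1]:.2f}")
print(f"90% PI for next gidget:  {pi[0]:.2f} to {pi[1]:.2f}")
```

The prediction interval is wider than the confidence interval because a single future weight varies around the mean, while the sample mean averages that variation out.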
Confidence Interval for Proportion • The objective of many surveys is to determine the proportion, p, of the population that possesses a particular attribute. • If the size of the population is N, and X members have this attribute, then, as we already know, p = X/N is the population proportion. • The idea here is to take a sample of size n and count how many items in the sample have this attribute; call it x. Calculate the sample proportion, $\hat{p} = x/n$. We would like to use the sample proportion as an estimate for the population proportion.
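Confidence Interval for Proportion Given that we know the distribution, the expected value and the variance of the point estimator (for large n, $\hat{p}$ is approximately normal with mean p and variance p(1-p)/n), we can estimate the probability that a certain interval contains the true population parameter. Therefore, converting the nonstandard problem to a standard problem we obtain $P\left(-z_{\alpha/2} < \frac{\hat{p} - p}{\sqrt{p(1-p)/n}} < z_{\alpha/2}\right) \approx 1 - \alpha$.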
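Confidence Interval for Proportion • Solving the quadratic for p, the 100(1-α)% confidence interval for p is $\frac{\hat{p} + \frac{z_{\alpha/2}^2}{2n} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n} + \frac{z_{\alpha/2}^2}{4n^2}}}{1 + z_{\alpha/2}^2/n}$.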
Confidence Interval for Proportion (Large samples) For large values of n, the margin of error at the (1-α)*100% level of confidence is approximately $E = z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}$. At the (1-α)*100% level of confidence, the confidence interval for the population proportion is $\hat{p} - E < p < \hat{p} + E$.
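To see how the interval obtained from the quadratic compares with the large-sample interval, here is a small Python sketch. The counts x = 40 and n = 100 are made up for illustration and do not come from the lecture; the function names are hypothetical as well. The first function implements the usual closed-form solution of the quadratic (often called the score or Wilson interval), the second the simpler large-sample form $\hat{p} \pm z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}$.

```python
import math
from statistics import NormalDist


def score_interval(x, n, conf=0.95):
    """Interval from solving the quadratic in p (score / Wilson form)."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    p_hat = x / n
    center = p_hat + z * z / (2 * n)
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n))
    denom = 1 + z * z / n
    return (center - half) / denom, (center + half) / denom


def large_sample_interval(x, n, conf=0.95):
    """Large-sample interval p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    p_hat = x / n
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# Hypothetical survey: 40 of 100 sampled items have the attribute.
score = score_interval(40, 100)
large = large_sample_interval(40, 100)
print(f"score interval:        {score[0]:.3f} to {score[1]:.3f}")
print(f"large-sample interval: {large[0]:.3f} to {large[1]:.3f}")
```

For these made-up counts the two intervals differ only in the third decimal place, which is why the simpler form is commonly used when n is large.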
Determining Sample Size • In calculating the confidence interval for the population proportion we used the margin of error $E = z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}$. • We might instead want to know how large a sample we should use if we are willing to accept a margin of error E with a degree of confidence of 1-α.
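Determining Sample Size • If we already have an idea of the proportion (either through a pilot study or previous results), one can use $n = \hat{p}(1-\hat{p})\left(z_{\alpha/2}/E\right)^2$, rounded up to the next integer. • If we have no idea of what the proportion is, then we use the worst case $\hat{p} = 0.5$, giving $n = 0.25\left(z_{\alpha/2}/E\right)^2$.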
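The sample-size calculation can be sketched in Python as follows, assuming the large-sample margin of error $E = z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}$ solved for n. The target margin of 0.03, the pilot proportion of 0.4, and the function name `sample_size` are illustrative assumptions, not values from the lecture.

```python
import math
from statistics import NormalDist


def sample_size(margin, conf=0.95, p_guess=None):
    """Smallest n whose large-sample margin of error is at most `margin`.

    p_guess is a prior estimate of the proportion; if None, the
    worst case p = 0.5 is used, which maximizes p * (1 - p).
    """
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    p = 0.5 if p_guess is None else p_guess
    return math.ceil(p * (1 - p) * (z / margin) ** 2)


# Hypothetical targets: margin of error 0.03 at 95% confidence.
n_no_prior = sample_size(0.03)                  # no idea of the proportion
n_with_prior = sample_size(0.03, p_guess=0.4)   # pilot study suggested 0.4
print(n_no_prior, n_with_prior)
```

Rounding up with `math.ceil` keeps the achieved margin of error at or below the target; a prior estimate away from 0.5 reduces the required sample size.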
Overview Confidence Intervals Sections 7.2, 7.3 of text
Homework Reread 7.1, 7.2, 7.3; read 7.4