STAT131/171 W9L1 Confidence Intervals. by Anne Porter alp@uow.edu.au. Lecture Outline. Review Central Limit theorem Intuitive ideas associated with confidence intervals CI for means Using Z Using t CI for proportions Hypothesis tests. Variability of Sample Means.
STAT131/171 W9L1 Confidence Intervals, by Anne Porter, alp@uow.edu.au
Lecture Outline
• Review Central Limit Theorem
• Intuitive ideas associated with confidence intervals
• CI for means
  • Using Z
  • Using t
• CI for proportions
• Hypothesis tests
Variability of Sample Means • Sample means are random quantities • Sample means change from sample to sample • They also have a mean and standard deviation
Even with a bimodal distribution: if we repeatedly take sufficiently large samples from this distribution and plot the means of these samples, we know that...
Central Limit Theorem (Large Sample Normality)
• Given a random sample $X_1, X_2, \ldots, X_n$ from any distribution with mean $\mu$ and finite variance $\sigma^2$, then irrespective of the distribution of the parent population, the distribution of the sample mean $\bar{X}$ approaches the shape of a normal distribution when the sample size is large and, irrespective of sample size, has mean $\mu_{\bar{X}} = \mu$ and standard deviation $\sigma_{\bar{X}} = \sigma/\sqrt{n}$.
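The theorem can be checked informally by simulation. The sketch below is only an illustration: the bimodal population (an equal mix of two normal distributions), the sample size of 40 and the seed are all assumptions chosen for the demonstration, not values from the lecture. It compares the mean and standard deviation of many sample means with the population mean and with the population standard deviation divided by the square root of n.

```python
import random
import statistics

def bimodal_draw(rng):
    # Hypothetical bimodal population: equal mix of N(0, 1) and N(6, 1).
    centre = 0.0 if rng.random() < 0.5 else 6.0
    return rng.gauss(centre, 1.0)

def sample_means(draw, n, reps, rng):
    """Return `reps` sample means, each computed from a sample of size n."""
    return [statistics.fmean(draw(rng) for _ in range(n)) for _ in range(reps)]

MU = 3.0               # mean of the assumed mixture
SIGMA = 10.0 ** 0.5    # sd of the assumed mixture: sqrt(1 + 3**2)

rng = random.Random(131)   # fixed seed so the run is repeatable
n = 40
means = sample_means(bimodal_draw, n, 5000, rng)

mean_of_means = statistics.fmean(means)
sd_of_means = statistics.stdev(means)

print(f"mean of sample means: {mean_of_means:.3f} (population mean {MU})")
print(f"sd of sample means:   {sd_of_means:.3f} (sigma/sqrt(n) = {SIGMA / n ** 0.5:.3f})")
```

Plotting a histogram of `means` would show a single bell-shaped hump even though the parent population has two.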
Activity 1: Intuitive notions of confidence intervals
• When we take a sample and use it to estimate the population mean, we know that the point estimate will have a certain amount of error associated with it.
• So in preference to providing a point estimate of the population parameter, we generally provide an interval estimate.
Distribution of measurements
How confident would you be of each of the following statements based on the sample of data plotted? The mean
• is exactly 1.146
• is between 1.141 and 1.151
• is between 1.10 and 1.18
• is between 1.05 and 1.24
• is between 0.0 and 2.2
(Source: Griffiths et al., 1998, p. 290)
Confidence
• Our confidence increases as the width of the interval increases.
• We have much greater confidence that $\mu$ is between 0.0 and 2.2 than that it lies in any of the other intervals given.
• All intervals have some risk of not containing the true population mean $\mu$, but there is greater confidence attached to the wider interval.
The width of the confidence interval relates to the variability in the sample and the sample size
• Does greater variability in the sample imply needing a wider or narrower confidence interval? Wider.
• As the sample size n increases, does the sample mean become more or less stable? More stable. As n increases, the estimate of $\mu$ is more accurate.
Confidence intervals for the mean, $\sigma$ known
• Decisions through Data Video Unit 20: Confidence Intervals
This film clip explores how confidence intervals are constructed for the mean of a normal population based on a sample drawn from that population. It assumes that the $\sigma$ of the population is known and hence uses a standard normal distribution. In this instance the two-sided confidence interval is of the form $\bar{x} \pm z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$.
Confidence intervals, $\sigma$ unknown
• Generally $\sigma$ is not known and needs to be estimated from the sample by the sample standard deviation $s$. In this instance, provided the population is normally distributed, the t distribution can be used to construct a two-sided confidence interval $\bar{x} \pm t_{n-1,\,\alpha/2}\,\frac{s}{\sqrt{n}}$.
Other forms are sometimes used; think rather than apply by rote.
Confidence Intervals
• The two-tailed $100(1-\alpha)\%$ confidence interval is constructed using the values of Z and t which leave $\alpha/2$ in each tail, that is, beyond $|Z|$ or beyond $|t|$, according to which distribution of the means is in use.
[Figure: density curve with an area of $\alpha/2$ shaded in each tail]
Example 1
• What is the Z score corresponding to a 90% confidence interval about the mean ($\alpha = 0.1$)?
For a two-sided CI this means $0.1/2 = 0.05$, i.e. 5%, in either tail, so $Z = \pm 1.645$.
Example 2
• What is the Z score corresponding to a 95% confidence interval about the mean?
$\alpha/2 = 0.025$, so $Z = \pm 1.96$.
Example 3
• What is the Z score corresponding to a 99% confidence interval about the mean?
$\alpha/2 = 0.005$, so $Z = \pm 2.576$ (2.575 in some tables).
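The three Z values in Examples 1 to 3 can be reproduced without tables. The following is a minimal sketch using Python's standard library; the helper name `z_critical` is an illustrative choice, not something from the lecture.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Z value leaving alpha/2 in each tail for a two-sided interval."""
    alpha = 1.0 - confidence
    # inv_cdf gives the point with the stated probability below it.
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} CI: alpha/2 = {(1 - level) / 2:.3f}, z = {z_critical(level):.3f}")
```

By symmetry of the normal curve, the lower critical value is simply the negative of the one returned.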
Example 4
• What is the t score corresponding to a 90% confidence interval about the mean, given a sample of size 20 from a normal population?
A 90% confidence interval means that there will be what percentage in each tail of the distribution? $\alpha/2 = 0.05$, i.e. 5%.
How many degrees of freedom will there be? $df = \nu = n - 1 = 20 - 1 = 19$.
When the variance of the population is unknown
• With $\nu = n - 1 = 19$ degrees of freedom and $\alpha/2 = 0.05$, i.e. 5% in the top tail, the probability below $t_p$ is 0.95.
• Then $t_{19,\,0.1/2} = 1.729$, or equivalently $t_{0.95} = 1.729$.
[Figure: t density with upper-tail area p beyond $t_p$; the t-table columns are the tail areas 0.1, 0.05, 0.025, 0.01, 0.005]
Example 5
• What is the t score corresponding to a 95% confidence interval about the mean, given a sample of size 20 from a normal population?
A 95% confidence interval means that there will be what percentage in each tail of the distribution? $\alpha/2 = 0.025$, or 2.5%.
How many degrees of freedom will there be? $df = \nu = n - 1 = 20 - 1 = 19$.
When the variance of the population is unknown
• With $\nu = n - 1 = 19$ degrees of freedom and $\alpha/2 = 0.025$, $t_{19,\,0.05/2} = 2.093$.
(Think! It no longer matters whether it is labelled $\alpha/2$; just read off the amount required in the tail.)
[Figure: t density with upper-tail area p beyond $t_p$; the t-table columns are the tail areas 0.1, 0.05, 0.025, 0.01, 0.005]
Example 6
• What is the t score corresponding to a 99% confidence interval about the mean, given a sample of size 20 from a normal population?
When the variance of the population is unknown
• With $\nu = n - 1 = 20 - 1 = 19$ degrees of freedom and $\alpha/2 = 0.005$, $t_{19,\,0.01/2} = 2.861$.
[Figure: t density with upper-tail area p beyond $t_p$]
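The table look-ups in Examples 4 to 6 can be cross-checked numerically. Python's standard library has no t distribution, so the sketch below integrates the t density with Simpson's rule and searches for the critical value by bisection. This is an illustrative approach under those assumptions, with hypothetical function names; in practice a statistics library would normally supply the quantile directly.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=1000):
    """P(T > t) for t >= 0: Simpson's rule on [0, t], then symmetry about 0."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

def t_critical(confidence, n):
    """t value leaving alpha/2 in each tail, with n - 1 degrees of freedom."""
    tail = (1 - confidence) / 2
    df = n - 1
    lo, hi = 0.0, 50.0
    for _ in range(50):          # bisection: halve the bracket each pass
        mid = (lo + hi) / 2
        if t_upper_tail(mid, df) > tail:
            lo = mid             # too much area in the tail, move right
        else:
            hi = mid
    return (lo + hi) / 2

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} CI, n = 20: t = {t_critical(level, 20):.3f}")
```

The same function with `n=5` gives the value used for the small height samples later in the lecture.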
Example 7
• A sample mean IQ is 105, based on a sample of 49 students. The IQ test is standardised and is known to have a standard deviation of 15. What is the 95% confidence interval for the population mean? Will we use Z or t?
In this case we have both a large sample and a known standard deviation, so we use Z:
$105 \pm 1.96 \times \frac{15}{\sqrt{49}} = 105 \pm 4.2 = (100.8, 109.2)$
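As a check on the arithmetic in Example 7, the interval can be computed from the sample mean plus or minus z times sigma over the square root of n. This is a small sketch; `z_interval` is a hypothetical helper name, and the inputs are the values stated in the example.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(sample_mean, sigma, n, confidence=0.95):
    """Two-sided CI for a mean when the population sd is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / sqrt(n)      # z * standard error of the mean
    return sample_mean - margin, sample_mean + margin

lower, upper = z_interval(sample_mean=105, sigma=15, n=49)
print(f"({lower:.2f}, {upper:.2f})")  # prints (100.80, 109.20)
```

Note that the standard error alone, 15/7 (about 2.14), is not the margin of error; it must be multiplied by 1.96 to give 4.2.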
Example 8
• Find the mean and standard deviation of the height of 5 students in STAT131.
Sample 1: 178, 170, 165, 154, 165 cm; $\bar{x} = 166.4$, $s = 8.7$
Sample 2: 181, 172, 190, 168, 168 cm; $\bar{x} = 175.8$, $s = 9.55$
Sample 3: 184, 185, 175, 180, 163 cm; $\bar{x} = 177.4$, $s = 8.96$
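The summary statistics for the three height samples can be verified with a few lines of Python. This is a sketch using the standard library; `statistics.stdev` is the sample standard deviation (divisor n - 1), which is the S quoted in this example.

```python
import statistics

samples = {
    "Sample 1": [178, 170, 165, 154, 165],
    "Sample 2": [181, 172, 190, 168, 168],
    "Sample 3": [184, 185, 175, 180, 163],
}

# Map each sample name to (mean, sample standard deviation).
summary = {
    name: (statistics.mean(heights), statistics.stdev(heights))
    for name, heights in samples.items()
}

for name, (mean, sd) in summary.items():
    print(f"{name}: mean = {mean:.1f} cm, s = {sd:.2f} cm")
```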
Example 8 (continued)
• We want to find a confidence interval for the mean height from each sample of 5 students in STAT131.
Sample 1: 178, 170, 165, 154, 165 cm
Sample 2: 181, 172, 190, 168, 168 cm
Sample 3: 184, 185, 175, 180, 163 cm
Will we use a t or Z distribution? Why? Although the sample size is small, height can be assumed to be normally distributed; as $\sigma$ is unknown, use t.
Example 9
• Find the 95% confidence interval for the population mean of heights.
Sample 1: 178, 170, 165, 154, 165; $\bar{x} = 166.4$, $s = 8.7$
What is n? $n = 5$
What is $t_{n-1,\,\alpha/2}$? $t_{4,\,0.025} = 2.776$
$166.4 \pm 2.776 \times \frac{8.7}{\sqrt{5}} = (155.6, 177.2)$
Example 10: Confidence interval for $\mu$
• Find the 95% confidence interval for the population mean of heights.
Sample 2: 181, 172, 190, 168, 168; $\bar{x} = 175.8$, $s = 9.55$
What is n? $n = 5$
What is $t_{n-1,\,\alpha/2}$? $t_{4,\,0.025} = 2.776$
What is the confidence interval for $\mu$? $175.8 \pm 2.776 \times \frac{9.55}{\sqrt{5}} = (163.94, 187.66)$
Example 11: CI for $\mu$
• Find the 95% confidence interval for the population mean of heights.
Sample 3: 184, 185, 175, 180, 163; $\bar{x} = 177.4$, $s = 8.96$, $n = 5$, $t_{4,\,0.025} = 2.776$
$177.4 \pm 2.776 \times \frac{8.96}{\sqrt{5}} = (166.28, 188.52)$
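The three t intervals in Examples 9 to 11 all follow the same recipe: sample mean plus or minus t times s over the square root of n. The sketch below applies it to each sample; the t value 2.776 is taken from the examples (4 degrees of freedom, 2.5% in each tail) rather than computed, and `t_interval` is a hypothetical helper name.

```python
import statistics
from math import sqrt

T_4_025 = 2.776  # tabled t value for df = 4 with 0.025 in each tail

def t_interval(sample, t_value):
    """Two-sided CI for a mean when the population sd is estimated by s."""
    mean = statistics.mean(sample)
    margin = t_value * statistics.stdev(sample) / sqrt(len(sample))
    return mean - margin, mean + margin

samples = {
    "Sample 1": [178, 170, 165, 154, 165],
    "Sample 2": [181, 172, 190, 168, 168],
    "Sample 3": [184, 185, 175, 180, 163],
}

intervals = {name: t_interval(heights, T_4_025) for name, heights in samples.items()}

for name, (lower, upper) in intervals.items():
    print(f"{name}: ({lower:.2f}, {upper:.2f})")
```

Small differences in the second decimal place from the hand calculations come from rounding s before use.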
Example 12: Interpretation of CI
• Interpret these confidence intervals
• Sample 1: CI (155.60, 177.20)
• Sample 2: CI (163.94, 187.66)
• Sample 3: CI (166.28, 188.52)
• These all represent 95% confidence intervals. That is, if the process of sampling were repeated, 95% of the confidence intervals should cover the true population parameter $\mu$. We do not actually know whether any particular one of these three intervals contains the true parameter.
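The repeated-sampling interpretation can be illustrated by simulation. In the sketch below the population (normal with mean 172 cm and standard deviation 9 cm), the number of repetitions and the seed are assumptions made up for the demonstration; only the sample size of 5 and the t value 2.776 come from the examples. It counts how often a 95% t interval covers the true mean.

```python
import random
import statistics
from math import sqrt

T_4_025 = 2.776  # t value for df = 4 with 0.025 in each tail

def t_interval(sample, t_value):
    mean = statistics.mean(sample)
    margin = t_value * statistics.stdev(sample) / sqrt(len(sample))
    return mean - margin, mean + margin

def coverage(mu, sigma, n, reps, rng):
    """Fraction of simulated intervals that contain the true mean mu."""
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        lower, upper = t_interval(sample, T_4_025)
        if lower <= mu <= upper:
            hits += 1
    return hits / reps

rng = random.Random(171)  # fixed seed so the run is repeatable
rate = coverage(mu=172.0, sigma=9.0, n=5, reps=10000, rng=rng)
print(f"proportion of intervals covering mu: {rate:.3f}")
```

The proportion should come out close to 0.95, while any single interval either contains the true mean or does not.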
Central Limit Theorem (Large Sample Normality)
• Given a random sample $X_1, X_2, \ldots, X_n$ from any distribution with mean $\mu$ and finite variance $\sigma^2$, then irrespective of the distribution of the parent population, the distribution of the sample mean $\bar{X}$ approaches the shape of a normal distribution when the sample size is large, with mean $\mu_{\bar{X}} = \mu$ and standard deviation $\sigma_{\bar{X}} = \sigma/\sqrt{n}$.
• We will draw on this theorem when finding confidence intervals for proportions.
Symbols
• Let $p$ = population proportion or binomial probability
• Let $\hat{p}$ = corresponding sample proportion
• Using the central limit theorem, if large samples (n > 30) are taken, the distribution of all possible values of $\hat{p}$ is approximately a normal curve with mean $\mu_{\hat{p}} = p$ and standard deviation $\sigma_{\hat{p}} = \sqrt{p(1-p)/n}$.
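Sampling distribution of proportions
• Mean: $\mu_{\hat{p}} = p$
• Variance: $\sigma^2_{\hat{p}} = \frac{p(1-p)}{n}$
• Standard deviation: $\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}$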
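The mean and standard deviation of the sampling distribution of a proportion, p and the square root of p(1 - p)/n, can be checked by simulation. In this sketch the population proportion 0.3, the sample size 50 and the seed are hypothetical values chosen only for illustration.

```python
import random
import statistics
from math import sqrt

def sample_proportion(p, n, rng):
    """Proportion of successes in n independent trials with success probability p."""
    return sum(rng.random() < p for _ in range(n)) / n

P = 0.3     # assumed population proportion
N = 50      # assumed sample size
rng = random.Random(9)   # fixed seed so the run is repeatable

p_hats = [sample_proportion(P, N, rng) for _ in range(5000)]

mean_p_hat = statistics.fmean(p_hats)
sd_p_hat = statistics.stdev(p_hats)
theory_sd = sqrt(P * (1 - P) / N)

print(f"mean of sample proportions: {mean_p_hat:.4f} (p = {P})")
print(f"sd of sample proportions:   {sd_p_hat:.4f} (theory {theory_sd:.4f})")
```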
$100(1-\alpha)\%$ CI of the population proportion $p$
• When n is large, so that we can assume a normal distribution of sample proportions, the confidence interval for a proportion is given by $\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$.
An example: Class Roll • Using sample proportions to estimate a population • Take a sample of 30 students (needs to be at least 30 for proportions), see if each is present or absent • Use the sample to estimate the proportion of students in class • The proportions, together with the sample size completely summarise the data.
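For the class roll, the large-sample interval is the sample proportion plus or minus z times the square root of p-hat(1 - p-hat)/n. The sketch below uses a made-up count of 24 students present out of 30; that count is an assumption for illustration only, not data from the lecture, and `proportion_interval` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(successes, n, confidence=0.95):
    """Large-sample (normal approximation) CI for a population proportion."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Hypothetical roll: 24 of the 30 sampled students are present.
lower, upper = proportion_interval(successes=24, n=30)
print(f"p_hat = {24 / 30:.2f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

With a sample of only 30 the interval is wide, which reflects how little a small roll call pins down the true attendance proportion.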