Day 3 QMIM: Confidence Interval for a Population Mean of Large and Small Samples, by Binam Ghimire
Objective
• To be able to calculate best estimates of the mean and standard deviation of a population
• To be able to calculate confidence intervals for a population mean for large and small samples
Why Sampling? (1)
• The alternative to sampling is to test the entire population.
• The advantage of testing the entire population is accuracy.
• The disadvantages are that it is 1) expensive, 2) time consuming, 3) sometimes destructive and 4) sometimes impossible, or the total population is unknown.
Why Sampling? (2)
• Examples: car crash test; water-resistance test (the deep-dive watch from Rolex)
Point Estimates
• Symbols

                  Population   Sample
Mean              μ            x̄
Std. deviation    σ            s
Size              N            n
Estimate for the Population Parameter: Conditions
• The sample should be part of the population
• The sample should represent the population
• The sample should be random
• The larger the sample, the better
Confidence Interval Estimations
• Provides a range of values
• Based on observations from a sample
• Gives information about closeness to the unknown population parameter
• Stated in terms of probability
• Common levels: 90%, 95%, 99%
Confidence Interval Estimations
• The probability that the population parameter falls within a certain range
• The range runs from a lower limit to a higher limit
Confidence Interval Estimations
[Figure: a confidence interval, showing the lower confidence limit, the point estimate, the upper confidence limit, and the width of the confidence interval]
Confidence Interval Estimations
For the largest possible sample, the margin of error is 0. (This happens when the sample data are the population data.) If so, then μ = x̄. In practice x̄ comes from a sample, so it will not be exactly equal to μ; it will be either lower or higher than μ. The confidence level (1 − α) is therefore never 100%.
$$\mu = \bar{x} \pm \text{Error} \quad \ldots (1)$$
Confidence Interval Estimations: Central Limit Theorem
We can get close to μ by finding the means of multiple samples. If the population is normally distributed, then the "sampling distribution of the means" will also be normal. If the population is not normally distributed, then whether the sampling distribution of the means is approximately normal depends on the size of the sample: the larger the sample, the closer to normal it becomes. This is the Central Limit Theorem. The spread of the sampling distribution also depends on the sample size: the larger the sample size, the narrower the spread (that is, the smaller the standard deviation).
Confidence Interval Estimations: Central Limit Theorem
Source: Oakshott, L. (2006) Essential Quantitative Methods for Business, Management and Finance, 3rd Ed., Palgrave, p. 226
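The effect of sample size on the sampling distribution can be checked with a small simulation. The sketch below is illustrative only: the exponential population, the seed, the sample sizes and the helper name `sample_means` are all assumptions chosen for the demonstration, not part of the original slides.

```python
import random
import statistics


def sample_means(population, n, num_samples, rng):
    """Draw num_samples random samples of size n (with replacement) and return their means."""
    return [statistics.fmean(rng.choices(population, k=n)) for _ in range(num_samples)]


rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical, clearly non-normal population (exponential, heavily skewed).
population = [rng.expovariate(1.0) for _ in range(20_000)]

# Standard deviation of the sample means for each sample size.
spreads = {}
for n in (5, 30, 100):
    means = sample_means(population, n, 2_000, rng)
    spreads[n] = statistics.stdev(means)
    print(f"n = {n:3d}: spread of the sample means = {spreads[n]:.4f}")
```

The printed spread should shrink as n grows, which is the narrowing described above; plotting a histogram of `means` for each n would also show the shape becoming more bell-like.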
Confidence Interval Estimations: Standard Error
$$\mu = \bar{x} \pm \text{Error} \quad \ldots (1)$$
The error depends on two factors: 1) the standard deviation (σ) and 2) the sample size (n). We call this error, which is the standard deviation of the sampling distribution, the standard error. So we may rewrite the equation above as
$$\mu = \bar{x} \pm \text{Standard Error} \quad \ldots (2)$$
$$\text{Standard Error} = \frac{\sigma}{\sqrt{n}}$$
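Confidence Interval Estimations: Standard Error
• The standard error of the sample mean is the standard deviation of the distribution of sample means.
• When the population σ is known: $$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$
• When the population σ is unknown, the sample standard deviation s is used in its place: $$s_{\bar{x}} = \frac{s}{\sqrt{n}}$$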
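As a quick illustration of how the standard error shrinks with sample size, here is a minimal sketch. The function name `standard_error` and the input values are assumptions made for the example.

```python
import math


def standard_error(std_dev, n):
    """Standard error of the sample mean: the standard deviation divided by sqrt(n)."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    return std_dev / math.sqrt(n)


# Illustrative values: a standard deviation of 5 with growing sample sizes.
for n in (25, 100, 400):
    print(f"n = {n:3d}: standard error = {standard_error(5.0, n):.2f}")
```

Quadrupling the sample size halves the standard error, because n enters through its square root.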
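Confidence Interval Estimations: Deriving the Formula
When the sample size is larger than 30, we can apply the normal distribution to calculate the limits of the confidence interval (by the Central Limit Theorem, the sampling distribution of the mean is approximately normal for samples of this size). Start from the Z score:
$$Z = \frac{x - \mu}{\sigma}$$
Replacing x by x̄, and σ by the standard error (σ/√n), and rearranging for μ, we get
$$\mu = \bar{x} \pm Z \frac{\sigma}{\sqrt{n}} \quad \ldots (3)$$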
Common Levels of Confidence: Checking the Z Table
• For a 99% confidence interval, Z = _____
• For a 95% confidence interval, Z = _____
• For a 90% confidence interval, Z = _____
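Table lookups can be cross-checked in software. The sketch below uses `NormalDist` from Python's standard `statistics` module; the helper name `z_critical` is an assumption for illustration. For a two-sided interval, the area left outside the interval (α) is split equally between the two tails, so the lookup is made at 1 − α/2.

```python
from statistics import NormalDist


def z_critical(confidence):
    """Two-sided critical Z value for a confidence level such as 0.95."""
    alpha = 1 - confidence
    # Half of alpha sits in each tail, so look up the 1 - alpha/2 quantile.
    return NormalDist().inv_cdf(1 - alpha / 2)


for level in (0.99, 0.95, 0.90):
    print(f"{level:.0%} confidence: Z = {z_critical(level):.3f}")
```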
Exercise
• A sample of 100 items is drawn from a population. The sample mean is 200g and the standard deviation is 5g. Find:
  a) a 95% confidence interval estimate for the population mean;
  b) the sample size required if a sampling error of only ± 0.5g is allowed.
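One way to check the answers to this exercise is sketched below. It assumes a 95% confidence level for both parts and uses the sample standard deviation in place of σ, which is reasonable for a sample of 100. The function names `z_interval` and `required_sample_size` are illustrative assumptions. Part b) rearranges Error = Z·s/√n to n = (Z·s/Error)² and rounds up.

```python
import math
from statistics import NormalDist


def z_interval(mean, std_dev, n, confidence=0.95):
    """Large-sample confidence interval: mean +/- Z * std_dev / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * std_dev / math.sqrt(n)
    return mean - margin, mean + margin


def required_sample_size(std_dev, max_error, confidence=0.95):
    """Smallest n for which Z * std_dev / sqrt(n) does not exceed max_error."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * std_dev / max_error) ** 2)


# Values from the exercise: n = 100, mean = 200g, standard deviation = 5g.
low, high = z_interval(200, 5, 100)
n_needed = required_sample_size(5, 0.5)
print(f"95% CI: {low:.2f}g to {high:.2f}g")
print(f"Sample size for +/- 0.5g: {n_needed}")
```

The sample size is rounded up rather than to the nearest integer, since rounding down would leave the error slightly above the allowed ± 0.5g.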
Small Samples: t-Distribution
• For sample sizes below 30, use the t-distribution
• Properties of the t-distribution (Student's t-distribution):
  • Symmetrical (bell shaped)
  • Less peaked, with fatter tails, than a normal distribution
  • Defined by a single parameter, the degrees of freedom (df), where df = n – 1
  • As df increases, the t-distribution approaches the normal distribution
t Table
• The numbers within the table are t-values, not probabilities
• The numbers in the first column are the degrees of freedom (ν) of the sample
• This is the freedom that you have in choosing the values of the sample. If you were given the mean of a sample of 8 values, you would be free to choose 7 of the 8 but not the 8th one. Therefore there are 7 degrees of freedom
• The number of degrees of freedom is therefore n – 1
• For a very large sample, the t and Z distributions are nearly the same
• For a 95% confidence interval you will choose the 0.025 level, because the remaining 5% is split equally between the two tails
t-Distribution
• The formula is the same as for Z; we just replace Z by t, and use the sample standard deviation s:
$$\mu = \bar{x} \pm t \frac{s}{\sqrt{n}}$$
Exercise
A sample of size 10 has a mean of 494.6 and a standard deviation of 23.03. Calculate the 95% confidence interval estimate for the population mean.
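A sketch of how this exercise could be checked follows. Python's standard library has no t-distribution, so the critical value is entered by hand: 2.262 is the standard t-table entry for 9 degrees of freedom (n – 1 = 9) at the 0.025 level. The function name `t_interval` is an assumption for illustration.

```python
import math


def t_interval(mean, std_dev, n, t_value):
    """Small-sample confidence interval: mean +/- t * std_dev / sqrt(n)."""
    margin = t_value * std_dev / math.sqrt(n)
    return mean - margin, mean + margin


# Table value for df = 9 at the 0.025 level (95% two-sided); entered manually.
T_9_DF_95 = 2.262

low, high = t_interval(494.6, 23.03, 10, T_9_DF_95)
print(f"95% CI: {low:.2f} to {high:.2f}")
```

Using the Z value of 1.96 instead would give a narrower interval; the larger t-value reflects the extra uncertainty of estimating the standard deviation from only 10 observations.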