Small Sample Confidence Intervals for Population Means, Large-sample Confidence Intervals for population proportions & Determining Sample Size. MSIT3000 Lecture 12. Objectives. Calculate & interpret small sample CI’s for population means.
Small Sample Confidence Intervals for Population Means, Large-sample Confidence Intervals for Population Proportions & Determining Sample Size. MSIT3000 Lecture 12
Objectives • Calculate & interpret small sample CI’s for population means. • Calculate & interpret large sample CI’s for population proportions. • Determine the appropriate sample size based on a desired margin of error. Section: 5.2, 5.3, 5.4, [4.9]
Problems with the large-sample CI: • The CLT allows us to treat x-bar as if it is normally distributed when n ≥ 30. But does that mean we can automatically assume it is ok to use s as a substitute for σ? • What happens to the distribution of x-bar when x is “nearly” normal, but n is less than 30?
Solution: • When x is nearly normally distributed, the standardized x-bar follows the t-distribution rather than the standard normal. • The t-distribution was published by Gosset in 1908. He worked for Guinness, a company that didn’t want him to publish. He therefore used the pseudonym “Student” to do so anyway. We still use “Student’s t-statistic”. • Formula: t = (x-bar - μ)/(s/√n) • Note that there are two random variables in the formula: x-bar & s. Compare the formula for t to the formula for z.
Points regarding the t-distribution: • The t-distribution is not defined solely by its mean and standard deviation. • The t becomes closer and closer to z as n increases. In the limit, as n approaches ∞, the t becomes z-distributed. • Compare the bottom row of the t-table to the z-table. We need the relevant “degrees of freedom” = n-1 in order to choose the correct value for t. • NB! You read the t-table very differently from the z-table. [It’s organized “backwards” from the standard normal table!]
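The convergence of t to z can be checked numerically. The sketch below is illustrative only: the t values are copied by hand from the 95% column (upper-tail area 0.025) of a standard t-table, and the name T_TABLE_95 is invented here, not taken from the text.

```python
from statistics import NormalDist

# Two-sided 95% critical values from a standard t-table
# (upper-tail area 0.025); keys are degrees of freedom, df = n - 1.
T_TABLE_95 = {5: 2.571, 10: 2.228, 30: 2.042, 120: 1.980}

# The matching z value, i.e. the bottom (df = infinity) row of the t-table.
z_95 = NormalDist().inv_cdf(0.975)

for df, t_crit in T_TABLE_95.items():
    # The gap t - z shrinks as the degrees of freedom grow.
    print(f"df={df:>3}  t={t_crit:.3f}  t - z = {t_crit - z_95:.3f}")
print(f"z = {z_95:.3f}")
```

The gap between t and z shrinks steadily as df grows, which is why the bottom row of the t-table reproduces the familiar z values.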
Actually constructing a small-sample CI for the population mean: • Choose the confidence level • usually 90%, 95% or 99%. • CI = x-bar ± t(s/√n) • More terminology: • Note 1: Margin of Error (MOE) = t(s/√n) • Note 2: Standard Error (SE) = s/√n
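A minimal sketch of the calculation, assuming the critical value t is looked up in the t-table by hand and passed in. The function name small_sample_ci and the ten data points are hypothetical, chosen only for illustration.

```python
from math import sqrt
from statistics import mean, stdev

def small_sample_ci(data, t_crit):
    """Return (lower, upper) for x-bar +/- t * s / sqrt(n)."""
    n = len(data)
    x_bar = mean(data)
    se = stdev(data) / sqrt(n)   # standard error, using the sample s
    moe = t_crit * se            # margin of error
    return x_bar - moe, x_bar + moe

# Hypothetical sample of n = 10 measurements.
sample = [9.8, 10.2, 10.4, 9.9, 10.1, 10.3, 9.7, 10.0, 10.2, 9.9]

# 95% confidence with df = n - 1 = 9 gives t = 2.262 in the t-table.
low, high = small_sample_ci(sample, t_crit=2.262)
print(f"95% CI: [{low:.3f}, {high:.3f}]")
```

Changing the confidence level only changes the t value read from the table; the rest of the calculation stays the same.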
Interpretation • What is the probability that a calculated small-sample 95% CI has captured the population mean? • If we are going to construct 1000 small-sample 95% C.I.s, approximately how many times would we expect to miss?
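One way to explore these questions is by simulation. The sketch below assumes a normal population with a hypothetical mean of 50 and standard deviation of 8, draws 1000 samples of size 10, and counts how many 95% intervals fail to cover the true mean. Before the data are drawn, each interval has a 95% chance of capturing the mean, so roughly 50 misses out of 1000 would be expected.

```python
import random
from math import sqrt
from statistics import mean, stdev

def count_misses(trials=1000, n=10, t_crit=2.262, mu=50.0, sigma=8.0, seed=1):
    """Count small-sample 95% CIs that fail to contain the true mean mu."""
    rng = random.Random(seed)    # fixed seed so the run is repeatable
    misses = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = mean(sample)
        moe = t_crit * stdev(sample) / sqrt(n)
        if not (x_bar - moe <= mu <= x_bar + moe):
            misses += 1
    return misses

print(count_misses(), "of 1000 intervals missed the population mean")
```

The exact count varies with the seed, but it should stay in the neighbourhood of 5% of the trials.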
Proportions: Why are they useful? • Political polls. • Consumer surveys. • Product reviews. • Reputation. • Competitor analysis. • Quality control. • Conventional wisdom has it that CIs for proportions are more common in the real business world than CIs for means.
Logic behind the method: • The logic is the same as behind any other CI. • We assume a proportion in the population (p) and assume that the sample proportion (p*) is an estimator for p. • We further assume that p* is approximately normally distributed.
Difficulty 1: Sample size • The population proportion is clearly bounded by 0 and 1. What do we do if the CI includes 0 or 1? • We do not use this methodology unless our sample size (n) is so large that: • np* ≥ 5 and n(1-p*) ≥ 5 • Alternative: do not use this methodology unless our sample size (n) is so large that p* is not within three standard deviations of 0 or 1. • This is just as common, but we choose to standardize on method 1.
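The method 1 check is easy to automate. In this sketch the function name large_enough is hypothetical. Because np* equals the number of successes and n(1-p*) equals the number of failures, the check works directly on the counts.

```python
def large_enough(successes, n, threshold=5):
    """Method 1 check: np* >= 5 and n(1 - p*) >= 5.

    np* is just the number of successes and n(1 - p*) the number of
    failures, so comparing the counts avoids floating-point rounding.
    """
    failures = n - successes
    return successes >= threshold and failures >= threshold

print(large_enough(514, 1000))  # a poll-sized sample
print(large_enough(3, 1000))    # too few successes for the normal approximation
```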
What happens as n → ∞? • How is a proportion distributed for a fixed sample size (n)? • What does the distribution look like? • {This is the topic of text section 4.9} • In the following four graphs, p=.5
Difficulty 2: Unknown standard deviation • The standard deviation of p* is √(pq/n), where q = 1-p. We obviously don’t know p, so what do we do? • We use p* to estimate the standard deviation. NB! when we calculate the CI, we do NOT use the t-table! • Note also that the standard deviation cannot be greater than (½)√(1/n), which happens when p=q=½.
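The upper bound on the standard deviation can be verified numerically. This is a sketch with n = 1000 chosen arbitrarily; sd_of_proportion is a hypothetical helper name.

```python
from math import sqrt

def sd_of_proportion(p, n):
    """Standard deviation of the sample proportion: sqrt(pq/n), q = 1 - p."""
    return sqrt(p * (1 - p) / n)

n = 1000
upper_bound = 0.5 * sqrt(1 / n)   # the claimed maximum, reached at p = q = 1/2

for p in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(f"p={p:.1f}  sd={sd_of_proportion(p, n):.5f}  bound={upper_bound:.5f}")
```

The bound follows from pq = p(1-p), which peaks at ¼ when p = ½, so using p = ½ gives a conservative (largest possible) standard deviation.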
Polling Example: Iraq • 514 out of 1000 support war in Iraq without UN allies. • Calculate a 95% CI for the population proportion that supports war in Iraq. • 477 out of the same sample wish to allow more time for inspections. • Calculate a 95% CI for this proportion.
Calculate support for war first: • P* = 514/1000 = 0.514 • S(p*) ≈ √(p*(1-p*)/n) = √((.514×.486)/1000) = 0.0158 • CL=95% => z=1.960 • MOE = 1.960×0.0158 = 0.0310 • Finally … CI = [0.48, 0.54]
Support for inspections: • P* = 477/1000 = 0.477 • S(p*) ≈ √(p*(1-p*)/n) = √((.477×.523)/1000) = 0.0158 • CL=95% => z=1.960 • MOE = 1.960×0.0158 = 0.0310 • Finally … CI = [0.45, 0.51] • This is what CNN means when they use the phrase “statistical dead heat”.
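Both polling calculations can be reproduced in a few lines. This is a sketch, not the text's own procedure: the function name proportion_ci is invented here, and z comes from Python's statistics.NormalDist instead of a printed table.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(successes, n, confidence=0.95):
    """Large-sample CI for a proportion: p* +/- z * sqrt(p*(1 - p*)/n)."""
    p_star = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # about 1.960 for 95%
    moe = z * sqrt(p_star * (1 - p_star) / n)
    return p_star - moe, p_star + moe

war = proportion_ci(514, 1000)
inspections = proportion_ci(477, 1000)
print(f"war:         [{war[0]:.2f}, {war[1]:.2f}]")
print(f"inspections: [{inspections[0]:.2f}, {inspections[1]:.2f}]")

# The two intervals overlap, which is the "statistical dead heat".
overlap = war[0] <= inspections[1] and inspections[0] <= war[1]
print("intervals overlap:", overlap)
```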
CNN typically had a MOE of 4% for their polls. • This is due to the sample size. • We know that: MOE = z√(pq/n) • So we can solve for n: • n = z²pq/MOE² • n = 1.96²(.5)(.5)/(0.04)² = 600.25, which rounds up to 601.
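The same arithmetic as a small function. This is a sketch: the name sample_size_for_proportion is hypothetical, and p defaults to 0.5 because that is the most conservative choice when p is unknown.

```python
from math import ceil

def sample_size_for_proportion(max_moe, z=1.96, p=0.5):
    """Smallest n with z * sqrt(pq/n) <= max_moe, i.e. n = z^2 * pq / MOE^2."""
    n_exact = z**2 * p * (1 - p) / max_moe**2
    return ceil(n_exact)   # always round up: a fraction of a respondent is not possible

print(sample_size_for_proportion(0.04))  # the 4% case above
print(sample_size_for_proportion(0.03))  # a tighter 3% margin needs more respondents
```

Because n grows with 1/MOE², halving the margin of error roughly quadruples the required sample.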
Terms from the text: • Width: This is simply the upper bound of the confidence interval minus the lower bound. • Bound: The largest MOE we are willing to use. • [We prefer “largest MOE” or “maximum MOE”] • Note: The text uses the word “bound” differently in sentences 1 & 2 above.
Sample Size Issues • Factors determining sample size: • Maximum MOE • Cost • How do we balance a small max MOE with the cost of samples? • MC=MB (in theory) • Satisfactory maximum MOE & cost (in practice)
Procedure • Assume there is a budget for the study. • Determine a maximum MOE (M). • Set up the equation: MOE = M • & solve for n • Check: is this n feasible with the specified budget?
Sample Size for Confidence Intervals for Population Means • MOE = M • z(s/√n) = M • n = (z×s/M)² Don’t memorize this. Remember the method.
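A minimal sketch of the method for means. The function name and the numbers (s = 15 from a pilot study, maximum MOE of 2) are hypothetical, used only to show the steps.

```python
from math import ceil

def sample_size_for_mean(s, max_moe, z=1.96):
    """Solve z * s / sqrt(n) = M for n, then round up: n = (z * s / M)^2."""
    return ceil((z * s / max_moe) ** 2)

# Hypothetical planning values: s = 15, maximum MOE M = 2, 95% confidence.
# (1.96 * 15 / 2)^2 = 14.7^2 = 216.09, so 217 observations are needed.
print(sample_size_for_mean(s=15, max_moe=2))
```

The resulting n can then be compared against the study budget, as in the procedure described earlier.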
Sample Size for Confidence Intervals for Population Proportions • MOE = M • z√(pq/n) = M • n = (z/M)²(pq)
Conclusion • Objectives addressed: • Calculate & interpret small sample CI’s for population means. • Calculate & interpret large sample CI’s for population proportions. • Determine the appropriate sample size based on a desired margin of error.
Problems: • Small-sample CI for means: • Exam 2A # 15, 16, 20 • Text: (5.16), 5.19 • Proportions & Sample-size: • Text: 5.28, (5.32), 5.41, (5.47), 5.48