MSIT3000 Lecture 12

ophelia
Presentation Transcript


  1. Small Sample Confidence Intervals for Population Means, Large-sample Confidence Intervals for population proportions & Determining Sample Size MSIT3000 Lecture 12

  2. Objectives • Calculate & interpret small sample CI’s for population means. • Calculate & interpret large sample CI’s for population proportions. • Determine the appropriate sample size based on a desired margin of error. Section: 5.2, 5.3, 5.4, [4.9]

  3. Problems with the large-sample CI: • The CLT allows us to treat x-bar as if it is normally distributed when n ≥ 30. But does that mean we can automatically assume it is ok to use s as a substitute for σ? • What happens to the distribution of x-bar when x is “nearly” normal, but n is less than 30?

  4. Solution: • When s is substituted for σ, the standardized x-bar is nearly normally distributed, but follows the t-distribution rather than the standard normal. • The t-distribution was published by Gosset in 1908. He worked for Guinness, a company that didn’t want him to publish. He therefore used the pseudonym “Student” to do so anyway. We still use “Student’s t-statistic”. • Formula: t = (x-bar − μ)/(s/√n) • Note that there are two random variables in the formula: x-bar & s. Compare the formula for t to the formula for z.

  5. Points regarding the t-distribution: • The t-distribution is not defined solely by its mean and standard deviation. • The t becomes closer and closer to z as n increases. In the limit, as n approaches ∞, the t becomes z-distributed. • Compare the bottom row of the t-table to the z-table. We need the relevant “degrees of freedom” = n-1 in order to choose the correct value for t. • NB! You read the t-table very differently from the z-table. [It’s organized “backwards” from the standard normal table!]
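
A small sketch of the "t approaches z" point, assuming the two-sided 95% critical values below are copied correctly from a standard t-table (the dictionary name `T_95` and the chosen degrees of freedom are illustrative, not from the lecture). The z value is computed with Python's `statistics.NormalDist`.

```python
from statistics import NormalDist

# Two-sided 95% critical values as listed in a standard t-table (df -> t).
T_95 = {1: 12.706, 5: 2.571, 10: 2.228, 30: 2.042, 120: 1.980}

# The z critical value for the same confidence level (about 1.960).
z_95 = NormalDist().inv_cdf(0.975)

for df, t in T_95.items():
    # The gap between t and z shrinks as the degrees of freedom grow.
    print(f"df = {df:>3}: t = {t:.3f}, excess over z = {t - z_95:.3f}")
print(f"z = {z_95:.3f}")
```

The last row of most t-tables (df = ∞) repeats the z value, which is why the bottom row can be compared directly with the z-table.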

  6. Actually constructing a small-sample CI for the population mean: • Choose the confidence level • usually 90%, 95% or 99%. • CI = x-bar ± t(s/√n) • More terminology: • Note 1: Margin of Error (MOE) = t(s/√n) • Note 2: Standard Error (SE) = s/√n
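
The construction above can be sketched in a few lines of Python. This is only an illustration: the function name `small_sample_ci` and the sample values are hypothetical, and the t value is assumed to have been read from a t-table with n − 1 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev

def small_sample_ci(data, t_crit):
    """Return (lower, upper) for x-bar ± t(s/√n).

    t_crit must be read from a t-table using n - 1 degrees of freedom.
    """
    n = len(data)
    x_bar = mean(data)
    se = stdev(data) / sqrt(n)   # Standard Error, s/√n (s uses the n - 1 divisor)
    moe = t_crit * se            # Margin of Error, t(s/√n)
    return x_bar - moe, x_bar + moe

# Hypothetical sample of n = 10 measurements (not from the lecture).
sample = [9.8, 10.2, 10.4, 9.9, 10.0, 10.1, 9.7, 10.3, 10.0, 9.6]

# t for 95% confidence with 9 degrees of freedom, from a t-table.
low, high = small_sample_ci(sample, t_crit=2.262)
print(f"95% CI: [{low:.3f}, {high:.3f}]")
```

Passing the t value in as an argument keeps the sketch free of third-party libraries and mirrors the table lookup done by hand.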

  7. Interpretation • What is the probability that a calculated small-sample 95% CI has captured the population mean? • If we are going to construct 1000 small-sample 95% C.I.s, approximately how many times would we expect to miss?
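
One way to explore the second question is a simulation. The sketch below is an assumption-laden illustration, not part of the lecture: it draws 1000 samples of size 10 from a normal population with hypothetical parameters (`mu`, `sigma`), builds a 95% t-interval from each, and counts the intervals that miss the true mean. With a 95% confidence level, about 5% of 1000, or roughly 50, misses would be expected.

```python
import random
from math import sqrt
from statistics import mean, stdev

def count_misses(trials=1000, n=10, mu=50.0, sigma=5.0, t_crit=2.262, seed=1):
    """Count how many of `trials` 95% t-intervals fail to contain mu."""
    rng = random.Random(seed)   # fixed seed so the run is repeatable
    misses = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        moe = t_crit * stdev(sample) / sqrt(n)
        # The interval misses when x-bar is farther than one MOE from mu.
        if abs(mean(sample) - mu) > moe:
            misses += 1
    return misses

misses = count_misses()
print(f"{misses} of 1000 intervals missed; 5% of 1000 is 50")
```

The exact count varies with the seed, but it should stay in the neighborhood of 50.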

  8. Proportions: Why are they useful? • Political polls. • Consumer surveys. • Product reviews. • Reputation. • Competitor analysis. • Quality control. • Conventional wisdom has it that CIs for proportions are more common in the real business world than CIs for means.

  9. Logic behind the method: • The logic is the same as behind any other CI. • We assume a proportion in the population (p) and assume that the sample proportion (p*) is an estimator for p. • We further assume that p* is approximately normally distributed.

  10. Difficulty 1: Sample size • The population proportion is clearly bounded by 0 and 1. What do we do if the CI includes 0 or 1? • We do not use this methodology unless our sample size (n) is so large that: • np* ≥ 5 and n(1-p*) ≥ 5 • Alternative: do not use this methodology unless our sample size (n) is so large that p* is not within three standard deviations of 0 or 1. • This is just as common, but we choose to standardize on method 1.
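
Both sample-size checks are simple enough to sketch in code. The function names `counts_rule` and `three_sd_rule` are made up for this illustration, and the second example (n = 40, p* = 0.05) is hypothetical.

```python
from math import sqrt

def counts_rule(n, p_star, threshold=5):
    """Method 1: require n·p* ≥ 5 and n·(1 − p*) ≥ 5."""
    return n * p_star >= threshold and n * (1 - p_star) >= threshold

def three_sd_rule(n, p_star):
    """Alternative: p* must be more than three standard deviations from 0 and 1."""
    sd = sqrt(p_star * (1 - p_star) / n)
    return p_star - 3 * sd > 0 and p_star + 3 * sd < 1

# A large poll passes both checks; a small sample with a rare outcome fails both.
for n, p_star in [(1000, 0.514), (40, 0.05)]:
    print(n, p_star, counts_rule(n, p_star), three_sd_rule(n, p_star))
```

The two rules usually agree, which is consistent with the remark that either one is in common use.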

  11. What happens as n → ∞? • How is a proportion distributed for a fixed sample size (n)? • What does the distribution look like? • {This is the topic of text section 4.9} • In the following four graphs, p=.5

  12. n = 10

  13. n = 30

  14. n = 1000
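
Since the graphs themselves are not reproduced in this transcript, the following sketch computes the same distributions numerically. It assumes the sample proportion p* is the number of successes in n independent trials divided by n, so its exact distribution is binomial; the helper name `p_star_distribution` and the ±0.05 window are choices made for this illustration.

```python
from math import comb, sqrt

def p_star_distribution(n, p):
    """Map each possible sample proportion k/n to its binomial probability."""
    return {k / n: comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)}

results = {}
for n in (10, 30, 1000):
    dist = p_star_distribution(n, 0.5)
    sd = sqrt(0.5 * 0.5 / n)   # standard deviation of p* when p = .5
    # Probability that p* lands within 0.05 of the true proportion.
    within = sum(pr for x, pr in dist.items() if abs(x - 0.5) <= 0.05)
    results[n] = within
    print(f"n = {n:>4}: sd of p* = {sd:.4f}, P(|p* - 0.5| <= 0.05) = {within:.3f}")
```

As n grows, the distribution concentrates around p and looks increasingly like a normal curve, which is the behavior the three graphs illustrate.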

  15. Difficulty 2: Unknown standard deviation • The standard deviation of p* is √(pq/n). We obviously don’t know p, so what do we do? • We use p* to estimate the standard deviation. NB! when we calculate the CI, we do NOT use the t-table! • Note also that the standard deviation cannot be greater than (½)√(1/n), which happens when p=q=½.
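
The upper bound on the standard deviation can be checked numerically. This is a minimal sketch that scans a grid of p values for a hypothetical n = 1000; the name `sd_p_star` is introduced only for the illustration.

```python
from math import sqrt

def sd_p_star(p, n):
    """Standard deviation of the sample proportion, √(pq/n) with q = 1 − p."""
    return sqrt(p * (1 - p) / n)

n = 1000
upper_bound = 0.5 * sqrt(1 / n)           # (½)√(1/n)
grid = [i / 100 for i in range(1, 100)]   # p = 0.01, 0.02, ..., 0.99

# Find the p on the grid that gives the largest standard deviation.
worst_p = max(grid, key=lambda p: sd_p_star(p, n))
print(worst_p, sd_p_star(worst_p, n), upper_bound)
```

Because pq is largest at p = q = ½, using p = ½ gives a conservative (largest possible) standard deviation when nothing is known about p.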

  16. Polling Example: Iraq • 514 out of 1000 support war in Iraq without UN allies. • Calculate a 95% CI for the population proportion that supports war in Iraq. • 477 out of the same sample wish to allow more time for inspections. • Calculate a 95% CI for this proportion.

  17. Calculate support for war first: • P* = 514/1000 = 0.514 • S(p*) = √(p*(1-p*)/n) = √((.514*.486)/1000) = 0.0158 • CL=95% => z=1.960 • MOE = 1.960*0.0158 = 0.0310 • Finally … CI = [0.48, 0.54]

  18. Support for inspections: • P* = 477/1000 = 0.477 • S(p*) = √(p*(1-p*)/n) = √((.477*.523)/1000) = 0.0158 • CL=95% => z=1.960 • MOE = 1.960*0.0158 = 0.0310 • Finally … CI = [0.45, 0.51] • This is what CNN means when they use the phrase “statistical dead heat”.
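
The two polling calculations can be reproduced with a short script. This is a sketch under the large-sample assumptions stated earlier; the function name `proportion_ci` is hypothetical, and z is taken from Python's `statistics.NormalDist` rather than from a printed table.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(successes, n, confidence=0.95):
    """Large-sample CI for a proportion: p* ± z·√(p*(1 − p*)/n)."""
    p_star = successes / n
    z = NormalDist().inv_cdf(0.5 + confidence / 2)   # about 1.960 for 95%
    moe = z * sqrt(p_star * (1 - p_star) / n)
    return p_star - moe, p_star + moe

war = proportion_ci(514, 1000)
inspections = proportion_ci(477, 1000)
print(f"war:         [{war[0]:.2f}, {war[1]:.2f}]")
print(f"inspections: [{inspections[0]:.2f}, {inspections[1]:.2f}]")

# The intervals overlap when the lower one's upper bound exceeds the other's lower bound.
print("overlap:", inspections[1] > war[0])
```

The overlap between the two intervals is what the "statistical dead heat" phrase refers to: the data cannot separate the two proportions at this confidence level.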

  19. CNN typically had a MOE of 4% for their polls. • This is due to the sample size. • We know that: MOE = z√(pq/n) • So we can solve for n: • n = z²pq/MOE² • n = 1.96²(.5)(.5)/(0.04)² = 600.25 → 601.
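
The same calculation as a small function, assuming the conservative choice p = q = ½ used on the slide. The name `sample_size_for_proportion` is hypothetical, and the 3% case is an extra illustration that does not appear in the lecture.

```python
from math import ceil

def sample_size_for_proportion(moe, z=1.96, p=0.5):
    """Smallest n with z·√(pq/n) ≤ moe, i.e. n = z²pq / MOE² rounded up."""
    return ceil(z**2 * p * (1 - p) / moe**2)

n_4_percent = sample_size_for_proportion(0.04)   # 600.25 rounds up to 601
n_3_percent = sample_size_for_proportion(0.03)
print(n_4_percent, n_3_percent)
```

The result is always rounded up, never to the nearest integer, because rounding down would leave the MOE slightly above the target.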

  20. Terms from the text: • Width: This is simply the upper bound of the confidence interval minus the lower bound. • Bound: The largest MOE we are willing to use. • [We prefer “largest MOE” or “maximum MOE”] • Note: The text uses the word “bound” differently in sentences 1 & 2 above.

  21. Sample Size Issues • Factors determining sample size: • Maximum MOE • Cost • How do we balance a small max MOE with the cost of samples? • MC=MB (in theory) • Satisfactory maximum MOE & cost (in practice)

  22. Procedure • Assume there is a budget for the study. • Determine a maximum MOE (M). • Set up the equation: MOE = M • & solve for n • Check: is this n feasible with the specified budget?

  23. Sample Size for Confidence Intervals for Population Means • MOE = M • z(s/√n) = M • n = (z*s/M)² Don’t memorize this. Remember the method.
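
A brief sketch of the method for means, assuming a planning value for s is available (for example from a pilot study). The function name and the numbers s = 15 and M = 2 are hypothetical and chosen only to show the arithmetic.

```python
from math import ceil, sqrt

def sample_size_for_mean(max_moe, s, z=1.96):
    """Solve z(s/√n) = M for n and round up: n = (z·s/M)²."""
    return ceil((z * s / max_moe) ** 2)

# Hypothetical planning values: s = 15, maximum MOE M = 2, 95% confidence.
n = sample_size_for_mean(max_moe=2, s=15)
achieved_moe = 1.96 * 15 / sqrt(n)   # MOE actually obtained with the rounded-up n
print(n, round(achieved_moe, 3))
```

Checking the achieved MOE against M is a quick way to confirm the algebra without memorizing the formula.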

  24. Sample Size for Confidence Intervals for Population proportions • MOE = M • z√(pq/n) = M • n = (z/M)²(pq)

  25. Conclusion • Objectives addressed: • Calculate & interpret small sample CI’s for population means. • Calculate & interpret large sample CI’s for population proportions. • Determine the appropriate sample size based on a desired margin of error.

  26. Problems: • Small-sample CI for means: • Exam 2A # 15, 16, 20 • Text: (5.16), 5.19 • Proportions & Sample-size: • Text: 5.28, (5.32), 5.41, (5.47), 5.48
