Week 8

Week 8. Confidence Intervals for Means and Proportions. Inference: the data are a single sample, but interest lies in the underlying population, not the specific sample. The sample gives information about the population, and the randomness of the sample makes that information uncertain. Drawing conclusions in this way is called inference about the population.

glenna-hunt

Presentation Transcript


  1. Week 8 Confidence Intervals for Means and Proportions

  2. Inference • Data are a single sample • Interested in underlying population, not specific sample • Sample gives information about population • Randomness of the sample implies uncertainty • Called inference about population

  3. Types of inference • Focus on value of population parameter • e.g. mean or proportion (probability) • Estimation • What is the value of the parameter? • Hypothesis testing • Is the parameter equal to a specific value (usually zero)?

  4. Point estimate • To estimate population parameter, use corresponding sample statistic • e.g. sample mean x̄ to estimate population mean μ • Likely to be an error in estimate • e.g. error = x̄ − μ • How big is error likely to be?

  5. Error distribution • Error is random • Simulation from an ‘approx’ population could build up error distribution • Shows how large error from actual sample data is likely to be

  6. Example • Silkworm survival after arsenic poisoning • How long will 1/4 survive? • What is upper quartile?

  7. Simulation • Approx population (same mean & sd as data) • Target = UQ from normal = 293.3 sec

  8. Simulation (cont) • Sample UQs ≠ target • Simulation shows error distribution • Error in estimate (292 sec) unlikely to be more than 10 sec.

  9. Error distn for proportion • Simulation is not needed

  10. Standard error of proportion • Approx error distn • bias = 0 • standard error = √(π(1 − π)/n), estimated by √(p(1 − p)/n) (π = population proportion, p = sample proportion)

  11. Teens and interracial dating 1997 USA Today/Gallup Poll of teenagers across country: 57% of the 497 teens who go out on dates say they’ve been out with someone of another race or ethnic group. • Point estimate: p = 0.57 • Bias = 0 • Standard error = √(0.57 × 0.43/497) ≈ 0.022
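
As a rough check of the standard error quoted for this poll, the sketch below evaluates √(p(1 − p)/n) for p = 0.57 and n = 497. The function name `se_proportion` is an illustrative choice, not something from the presentation.

```python
import math


def se_proportion(p, n):
    """Estimated standard error of a sample proportion: sqrt(p(1 - p)/n)."""
    return math.sqrt(p * (1 - p) / n)


# Values from the poll: 57% of 497 teens.
se = se_proportion(0.57, 497)
print(round(se, 3))  # about 0.022
```

The estimate uses the sample proportion p in place of the unknown population proportion, which is the usual plug-in step when the population value is not available.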

  12. Error distn (interracial dating) • Error distn is approx normal with mean = 0, sd = 0.022 • [Figure: normal curve of the error, axis marked at 0, ±0.022, ±0.044, ±0.066] • Error in estimate, p = 0.57, is • unlikely to be more than 0.05 • almost certainly less than 0.07
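
The statements "unlikely to be more than 0.05" and "almost certainly less than 0.07" can be checked against the normal approximation. This is a minimal sketch assuming the error is exactly normal with mean 0 and sd 0.022; the helper name `prob_error_within` is hypothetical.

```python
from statistics import NormalDist

# Assumed error distribution: normal, mean 0, sd 0.022 (values from the slide).
error_dist = NormalDist(mu=0.0, sigma=0.022)


def prob_error_within(limit):
    """P(-limit < error < limit) under the normal approximation."""
    return error_dist.cdf(limit) - error_dist.cdf(-limit)


print(round(prob_error_within(0.05), 3))  # a little under 0.98
print(round(prob_error_within(0.07), 3))  # above 0.99
```

Under this approximation an error larger than 0.05 has a probability of only a few percent, and an error larger than 0.07 is rarer still.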

  13. Interval estimates Survey 150 randomly selected students and 41% think marijuana should be legalized. If we report between 33% and 49% of all students at the college think that marijuana should be legalized, how confident can we be that we are correct? Confidence interval: an interval of estimates that is ‘likely’ to capture the population value.

  14. 95% confidence interval • Legalise? p = 0.41, n = 150 • 70-95-100 rule of thumb • s.e. = √(0.41 × 0.59/150) ≈ 0.0402 • Prob(error < 2 × 0.0402) is approx 95% • We are 95% confident that π is between 0.41 − 0.0803 and 0.41 + 0.0803, i.e. between 0.33 and 0.49 (the 95% confidence interval)
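
The arithmetic on this slide can be sketched as follows. The snippet computes √(p(1 − p)/n) and the "estimate ± 2 s.e." interval for p = 0.41 and n = 150; for these values the formula gives a standard error of roughly 0.040. The function name `proportion_ci` is an assumption made for illustration.

```python
import math


def proportion_ci(p, n, multiplier=2.0):
    """Return (lower, upper, se) for the interval p +/- multiplier * s.e."""
    se = math.sqrt(p * (1 - p) / n)
    return p - multiplier * se, p + multiplier * se, se


lower, upper, se = proportion_ci(0.41, 150)
print(round(se, 4))                      # roughly 0.040
print(round(lower, 2), round(upper, 2))  # 0.33 0.49
```

Passing `multiplier=1.96` instead of the rule-of-thumb 2 changes the limits only in the third decimal place here.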

  15. Interpreting 95% C.I. • Confidence interval is function of sample data • Random • It may not include population parameter (π here) • In repeated samples, about 95% of CIs calculated as described will include π • We therefore say we are 95% confident that our single CI will include π
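
The repeated-sampling interpretation can be illustrated with a small simulation. This is a sketch under assumed values: a population proportion of 0.41, samples of size 150, 2000 repetitions and a fixed seed are all choices made here for illustration, not figures from the presentation.

```python
import math
import random


def coverage(pi, n, reps, seed=1):
    """Fraction of simulated 'p +/- 2 s.e.' intervals that contain pi."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        # Draw one random sample of size n and count the successes.
        successes = sum(rng.random() < pi for _ in range(n))
        p = successes / n
        se = math.sqrt(p * (1 - p) / n)
        if p - 2 * se <= pi <= p + 2 * se:
            hits += 1
    return hits / reps


print(coverage(pi=0.41, n=150, reps=2000))  # should be close to 0.95
```

Each simulated sample produces a different interval; the proportion of intervals that capture the true value is what the "95%" refers to, not the probability for any one interval already computed.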

  16. Teens and interracial dating 1997 USA Today/Gallup Poll of teenagers across country: 57% of the 497 teens who go out on dates say they’ve been out with someone of another race or ethnic group. • Point estimate: p = 0.57 • Standard error = √(0.57 × 0.43/497) ≈ 0.022 • 95% C.I. is 0.57 − 0.044 to 0.57 + 0.044, i.e. 0.526 to 0.614 • We would prefer more decimals!

  17. Teens and interracial dating • 95% C.I. is 0.526 to 0.614 • We do not know whether π is between 0.526 and 0.614 • However 95% of CIs calculated in this way will work • We are therefore 95% confident that π is in (0.526, 0.614)

  18. St error & width of 95% C.I. • Smallest s.e. and C.I. width when: • n is large • p is close to 0 or 1 • Biggest s.e. and C.I. width when: • n is small • p is close to 0.5

  19. Margin of error • Public opinion polls usually estimate several popn proportions. • Each has its own “± 2 s.e.” describing accuracy • n = 350

  20. Margin of error (cont) • n = 350 • Maximum possible “± 2 s.e.” (which occurs when p = 0.5) is the “margin of error” for the poll
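
A minimal sketch of the margin-of-error idea for n = 350: "± 2 s.e." is evaluated at several values of p, and the largest value occurs at p = 0.5. The helper name `two_se` and the grid of p values are illustrative assumptions.

```python
import math


def two_se(p, n):
    """The '+/- 2 s.e.' half-width for a sample proportion p from n respondents."""
    return 2 * math.sqrt(p * (1 - p) / n)


n = 350
# The half-width is largest at p = 0.5, so that value serves for every question.
margin_of_error = two_se(0.5, n)
print(round(margin_of_error, 3))  # about 0.053

for p in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(p, round(two_se(p, n), 3))
```

Because p(1 − p) never exceeds 0.25, the maximum half-width simplifies to 1/√n, which is why a single margin of error can be quoted for a whole poll.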

  21. Requirements for C.I. • Sample should be randomly selected from population • “Large” sample size: at least 10 successes and 10 failures (though some say only 5 are needed) • If population is finite, it should be at least 10 times the sample size

  22. Case Study: Nicotine Patches vs Zyban • Study: New England Journal of Medicine (3/4/99) • 893 participants randomly allocated to four treatment groups: placebo, nicotine patch only, Zyban only, and Zyban plus nicotine patch. • Participants blinded: • all used a patch (nicotine or placebo) • all took a pill (Zyban or placebo). • Treatments used for nine weeks.

  23. Nicotine Patches vs Zyban (cont) Conclusions: • Zyban is effective (no overlap of Zyban and no-Zyban CIs) • Nicotine patch is not particularly effective (overlap of patch and no-patch CIs)

  24. Error distribution for mean • Again, a simulation is unnecessary to find the error distribution (approx)

  25. Standard error of mean • Approx error distn • bias = 0 • standard error = σ/√n, estimated by s/√n (σ = population sd, s = sample sd)

  26. Mean hours watching TV Poll: Class of 175 students. In a typical day, about how much time do you spend watching television? • Summary: n = 175, Mean = 2.09, Median = 2.000, StDev = 1.644 • Point estimate: x̄ = 2.09 hours • Bias = 0 • Standard error = 1.644/√175 ≈ 0.124

  27. Standard devn & standard error • Sample standard deviation • is approx σ • stays similar if n increases • Standard error of mean • is usually less than σ • decreases as n increases • Don’t get mixed up between the two!

  28. Error distn (hours watching TV) • Error distn is approx normal with mean = 0, sd = 0.124 • [Figure: normal curve of the error, axis marked at 0, ±0.124, ±0.248, ±0.372] • Error in estimate, x̄ = 2.09 hours, is • unlikely to be more than 0.25 hrs • almost certainly less than 0.4 hrs

  29. General form for 95% C.I. • If • error distn is normal • zero bias & we can find s.e. • then Prob(error is in ± 2 s.e.) is approx 0.95 • [Figure: normal error distn, axis marked in multiples of s.e.] • 95% confidence interval: estimate ± 2 s.e. • 95% confidence interval: estimate ± 1.96 s.e. (if really sure error distn is normal)

  30. 95% confidence interval • Mean hrs watching TV? x̄ = 2.09 hrs, n = 175 • 70-95-100 rule of thumb • Prob(error < 2 × 0.124) is approx 95% • We are 95% confident that μ is between 2.09 − 0.248 and 2.09 + 0.248, i.e. between 1.84 and 2.34 hours (the 95% C.I.)
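
The same "estimate ± 2 s.e." recipe for a mean can be sketched as below, using the TV-watching summary values (mean 2.09, sd 1.644, n = 175). The function name `mean_ci` is an illustrative assumption.

```python
import math


def mean_ci(xbar, s, n, multiplier=2.0):
    """Return (lower, upper, se) for the interval xbar +/- multiplier * s / sqrt(n)."""
    se = s / math.sqrt(n)
    return xbar - multiplier * se, xbar + multiplier * se, se


lower, upper, se = mean_ci(xbar=2.09, s=1.644, n=175)
print(round(se, 3))                      # about 0.124
print(round(lower, 2), round(upper, 2))  # 1.84 2.34
```

The sample standard deviation (1.644) describes the spread of individual answers, while the standard error (about 0.124) describes the likely error in the mean; only the second shrinks as n grows.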

  31. Requirements for C.I. • Sample should be randomly selected from population • “Large” sample size  —  n > 30 is often recommended • If finite population, at least 10 times sample size

  32. Problem with small n • Known σ • Interval x̄ ± 1.96 σ/√n works fine • Unknown σ (replaced by s) • Variable width • Less likely to include μ • Confidence level less than 95%

  33. C.I. for mean, small n • Solution is to replace 1.96 (or 2) by a bigger number. • Look up t-tables with (n - 1) ‘degrees of freedom’

  34. Example: Mean Forearm Length • Data: From random sample of n = 9 men: 25.5, 24.0, 26.5, 25.5, 28.0, 27.0, 23.0, 25.0, 25.0 • x̄ = 25.5, s.e. = 0.507 • df = 8, t8 = 2.31 • 95% C.I.: 25.5 ± 2.31(0.507) => 25.5 ± 1.17 => 24.33 to 26.67 cm
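
The forearm example can be reproduced from the nine data values. This sketch takes the t multiplier 2.31 (8 degrees of freedom) directly from the slide rather than computing it, since the Python standard library has no t-distribution quantile function; the variable names are illustrative.

```python
import math
import statistics

# Forearm lengths (cm) for the random sample of n = 9 men.
forearm_cm = [25.5, 24.0, 26.5, 25.5, 28.0, 27.0, 23.0, 25.0, 25.0]

n = len(forearm_cm)
xbar = statistics.mean(forearm_cm)
s = statistics.stdev(forearm_cm)   # sample sd, divisor n - 1
se = s / math.sqrt(n)

t_crit = 2.31                      # t-table value for df = n - 1 = 8, as on the slide
margin = t_crit * se
lower, upper = xbar - margin, xbar + margin

print(round(xbar, 2), round(se, 3))      # 25.5 0.507
print(round(lower, 2), round(upper, 2))  # 24.33 26.67
```

Using 2 instead of 2.31 would give a narrower interval, which is the under-coverage problem that the t multiplier corrects for small samples.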

  35. Which Students Sleep More? Q: How many hours of sleep did you get last night, to the nearest half hour?
      Class                      n     Mean   StDev   SE Mean
      Stat 10 (stat literacy)    25    7.66   1.34    0.27
      Stat 13 (stat methods)     148   6.81   1.73    0.14
      • Notes: • CI for Stat 10 is wider (smaller sample size) • Two intervals do not overlap
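
The two notes on this slide can be checked from the summary table. This is a sketch that assumes the rule-of-thumb multiplier of 2 for both classes (for n = 25 a t multiplier would be slightly larger); the dictionary layout and function name are illustrative.

```python
import math

# (n, mean, sd) for each class, taken from the summary table.
classes = {
    "Stat 10": (25, 7.66, 1.34),
    "Stat 13": (148, 6.81, 1.73),
}


def interval(n, mean, sd, multiplier=2.0):
    """Approximate 95% interval: mean +/- multiplier * sd / sqrt(n)."""
    se = sd / math.sqrt(n)
    return mean - multiplier * se, mean + multiplier * se


intervals = {name: interval(*stats) for name, stats in classes.items()}
lo10, hi10 = intervals["Stat 10"]
lo13, hi13 = intervals["Stat 13"]

# Two intervals overlap only if each one starts before the other ends.
overlap = lo10 <= hi13 and lo13 <= hi10

for name, (lo, hi) in intervals.items():
    print(name, round(lo, 2), round(hi, 2), "width", round(hi - lo, 2))
print("overlap:", overlap)
```

The gap between the two intervals is small, so the conclusion that they do not overlap is sensitive to rounding and to the choice of multiplier.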

  36. Interpreting 95% C.I. • Confidence interval is function of sample data • Random • It may not include population parameter (π or μ) • In repeated samples, about 95% of CIs calculated as described will include the parameter • We therefore say we are 95% confident that our single CI will include the parameter
