Sampling and Confidence Intervals Math/Econ 108, Spring 2009
Coin tossing: Assume a coin is tossed 10 times and heads come up 7 times. Should this lead us to suspect that the coin is not fair? We expect 5 heads. The other outcomes that are at least as unlikely as 7 heads are 8, 9, or 10 heads and 0, 1, 2, or 3 heads. The probability of at least 7 heads or at most 3 heads is 34%.
Coin tossing: Assume a coin is tossed 100 times and heads come up 70 times. Should this lead us to suspect that the coin is not fair? We expect 50 heads. The outcomes that are at least as unlikely as 70 heads are 70 or more heads and 30 or fewer heads. The probability of at least 70 heads or at most 30 heads is 0.008%. (That’s less than 1 in 10,000 trials.)
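The two tail probabilities above (34% for 10 tosses, 0.008% for 100 tosses) can be checked by counting outcomes exactly. The following is a minimal sketch, assuming a fair coin and independent tosses; the function name `two_sided_tail` is made up for this illustration and does not come from the slides.

```python
from math import comb

def two_sided_tail(n, heads):
    """Probability of a result at least as far from n/2 as `heads`, for a fair coin."""
    distance = abs(heads - n / 2)
    # Count every outcome whose number of heads is at least that far from n/2.
    extreme = sum(comb(n, k) for k in range(n + 1) if abs(k - n / 2) >= distance)
    # Each of the 2**n toss sequences is equally likely.
    return extreme / 2 ** n

print(f"{two_sided_tail(10, 7):.2%}")    # 7 heads in 10 tosses
print(f"{two_sided_tail(100, 70):.4%}")  # 70 heads in 100 tosses
```

The same 70% share of heads is unremarkable in 10 tosses and extremely rare in 100, which is the point of the comparison.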
This is the idea behind the question:
A certain town is served by two hospitals. In the larger hospital about 45 babies are born each week, and in the smaller hospital about 15 babies are born each week. As you know, about 50 percent of all babies are boys. However, the exact percentage varies from week to week. Sometimes it may be higher than 50 percent, sometimes lower. For a period of one year, each hospital recorded the weeks on which more than 60 percent of the babies born were boys. Which hospital do you think recorded more such weeks?
• The larger hospital
• The smaller hospital
• About the same (there is no reason to expect that one hospital will have more such weeks than the other)
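The hospital question can be worked the same way as the coin tosses. This sketch assumes exactly 15 and exactly 45 births every week (the question only says "about") and a 50% chance of a boy at each birth; the function name is hypothetical.

```python
from math import comb

def prob_more_than_60_percent_boys(births):
    """Chance that strictly more than 60% of `births` are boys, with P(boy) = 0.5."""
    # 5 * k > 3 * births is the test k / births > 0.6 in exact integer arithmetic.
    favorable = sum(comb(births, k) for k in range(births + 1) if 5 * k > 3 * births)
    return favorable / 2 ** births

small = prob_more_than_60_percent_boys(15)  # smaller hospital
large = prob_more_than_60_percent_boys(45)  # larger hospital
print(f"smaller hospital: {small:.1%} of weeks")
print(f"larger hospital:  {large:.1%} of weeks")
```

Under these assumptions the smaller hospital should record more such weeks, because a small sample strays from 50% more easily than a large one.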
Coin tossing: Assume a coin is tossed 1000 times and heads come up 530 times. Should this lead us to suspect that the coin is not fair? We expect 500 heads. The outcomes that are at least as unlikely as 530 heads are 530 or more heads and 470 or fewer heads.
The probability of at least 530 heads or at most 470 heads is 6.2%.
The probability of at least 532 heads or at most 468 heads is 4.6%.
The probability of at least 535 heads or at most 465 heads is 2.9%.
A common (though arbitrary) cut-off is to say that anything with a probability of less than 5% is “surprising.” If the probability is more than 5%, it is not “surprising.” (Though some people use 1% as the cut-off for what is truly “surprising.”)
Let us assume that in a presidential preference poll of 1000 likely voters, Obama and McCain are each chosen by exactly 500 of those polled. What is the probability that Obama is really preferred by at least 53% or at most 47% of the voters? The probability is about 5%. We feel pretty confident that Obama’s true share of the voters is somewhere between 47% and 53%. We say that the Margin of Error (MoE) is ±3%.
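The 5% cut-off and the ±3% margin are linked: for 1000 tosses of a fair coin, one can search for the smallest deviation from 500 heads whose two-sided probability drops below the cut-off. This is a rough sketch using exact binomial counts; the function names are invented for the illustration.

```python
from math import comb

def tail_beyond(n, deviation):
    """P(|heads - n/2| >= deviation) for n tosses of a fair coin (n even, deviation >= 1)."""
    upper = sum(comb(n, k) for k in range(n // 2 + deviation, n + 1))
    return 2 * upper / 2 ** n  # the lower tail mirrors the upper tail

def smallest_surprising_deviation(n, cutoff=0.05):
    """Smallest deviation from n/2 whose two-sided probability is below the cutoff."""
    for deviation in range(1, n // 2 + 1):
        if tail_beyond(n, deviation) < cutoff:
            return deviation
    return None

d = smallest_surprising_deviation(1000)
print(d, f"{d / 1000:.1%}")  # deviation in heads, and as a share of 1000 tosses
```

The deviation found this way, expressed as a share of the sample, is what gets rounded and reported as the margin of error.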
Margin of error with sample size = 1000 at 99% confidence is ±4%
Margin of error with sample size = 1000 at 95% confidence is ±3%
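The slides do not say how these two figures were computed. One common way to get them is the normal-approximation formula z·sqrt(p(1−p)/n), sketched here under the assumptions of simple random sampling and a true share of p = 0.5 (the worst case); `margin_of_error` is a name chosen for this example.

```python
from math import sqrt
from statistics import NormalDist

def margin_of_error(sample_size, confidence, share=0.5):
    """Normal-approximation margin of error for a sampled proportion."""
    # z is the two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * sqrt(share * (1 - share) / sample_size)

for level in (0.95, 0.99):
    print(f"{level:.0%} confidence: ±{margin_of_error(1000, level):.1%}")
```

Asking for more confidence from the same 1000 respondents widens the interval, which is why the 99% margin is larger than the 95% margin.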
Cook Political Report poll, February 15–18, 2007: Clinton 42%, Obama 20%, Edwards 16%, Richardson 5%. Margin of Error = ±5%
This does not mean:
• Obama and Edwards are “statistically tied” or in a “statistical dead heat”
• The poll is only meaningful if one leads the other by more than 5%
• Because Richardson polled 5% with a margin of error of 5%, his true value could be 0%.
Difference of percentages:    0%     1%     2%     3%     4%     5%
1% margin of error           50.0   83.6   97.5   99.8   100    100
2% margin of error           50.0   68.8   83.7   92.9   97.5   99.3
3% margin of error           50.0   62.8   74.3   83.7   90.5   94.9
4% margin of error           50.0   59.7   68.8   76.9   83.7   89.0
5% margin of error           50.0   57.8   65.2   72.2   78.4   83.7
6% margin of error           50.0   56.5   62.8   68.8   74.3   79.3
Obama ahead by 4% with a 5% margin of error means 78.4% probability that Obama really does lead Edwards. See Wikipedia article on Margin of Error.
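The slides do not give the formula behind this table. The sketch below is one calculation that approximately reproduces its entries, under assumptions that are mine rather than the source's: the margin of error is at 95% confidence, the sampling error is normal, and the race is treated as a two-way contest, so the lead has twice the standard error of a single share. The function name is hypothetical.

```python
from statistics import NormalDist

def probability_of_true_lead(lead, margin_of_error):
    """Approximate chance that the poll leader is really ahead (two-way race assumed)."""
    normal = NormalDist()
    z95 = normal.inv_cdf(0.975)              # about 1.96
    share_se = margin_of_error / z95         # standard error of one candidate's share
    lead_se = 2 * share_se                   # assumption: lead = 2 * share - 1
    return normal.cdf(lead / lead_se)

# Lead and margin of error are both in percentage points.
print(f"{probability_of_true_lead(4, 5):.1%}")
```

In a multi-candidate poll such as the one above, the two-way assumption is only an approximation, so the table values are best read as rough guides.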
Before we begin: what do we expect to see? Add last column. In Excel, =CHIDIST(sum, 5) gives the right-tail chi-square probability of the sum with 5 degrees of freedom.
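For readers without Excel, the same right-tail chi-square probability can be computed directly. This is a sketch only: it assumes `sum` on the slide is the usual chi-square statistic, the total of (observed − expected)² / expected over six categories, and the counts below are hypothetical, not the data from the class exercise. The function `chi2_sf` is written for this example using the closed-form tail for whole-number degrees of freedom.

```python
import math

def chi2_sf(x, df):
    """Right-tail probability P(X >= x) for a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) times a finite sum of (x/2)**r / r!.
        term, total = 1.0, 1.0
        for r in range(1, df // 2):
            term *= (x / 2) / r
            total += term
        return math.exp(-x / 2) * total
    # Odd df: two-sided normal tail plus a finite correction sum.
    term, total = math.sqrt(x), 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    density = math.exp(-x / 2) / math.sqrt(2 * math.pi)
    return math.erfc(math.sqrt(x / 2)) + 2 * density * total

# Hypothetical counts for six equally likely categories (not from the slides).
observed = [8, 12, 9, 11, 6, 14]
expected = [sum(observed) / len(observed)] * len(observed)
statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(statistic, chi2_sf(statistic, df=5))
```

A probability above the 5% cut-off would mean the observed counts are not "surprising" under the expected distribution.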
Census versus Sampling: “Census Sampling Confusion” by Ivars Peterson
• Why do we conduct a census every ten years?
• Why is it difficult to get an accurate count?
• What was the proposed means of adjusting the count?
• What are the arguments for and against such a statistical adjustment?