Intro to Hypothesis Testing March 2012
Geometry Warm-Up • There is a maximum of one obtuse angle in a triangle, but can you prove it? • To prove something like this, we mathematicians must do a proof by contradiction. We play devil’s advocate and assume the alternative is true.
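One way the contradiction argument might be written out, assuming ordinary Euclidean geometry (angle sum of 180°). The labels A, B and C for the three angles are ours, not the slide's:

```latex
\begin{align*}
&\text{Assume, for contradiction, that two angles are obtuse: } A > 90^\circ,\ B > 90^\circ. \\
&\text{Then } A + B > 180^\circ. \\
&\text{But } A + B + C = 180^\circ \text{ and } C > 0^\circ, \text{ so } A + B < 180^\circ. \\
&\text{These two statements cannot both hold, so a triangle has at most one obtuse angle.}
\end{align*}
```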
Tying that in to inference • If we want to demonstrate that there is a statistical difference between two sets of data, we have to play devil’s advocate and assume that there is no difference. • Then we test whether the data are consistent with that assumption. If they are not, we reject it.
Standard Model for Inference • Hypotheses • Model • Mechanics • Conclusion
Hypotheses • A hypothesis is a claim about an unknown population model. Use correct notation. Example: p = 0.25. Do not write p̂ = 0.25, because p (not p̂) is the parameter. • There are always two hypotheses to make. 1. H0: The “Null Hypothesis.” This is where you play devil’s advocate. To illustrate that there is a statistical difference between two sets of data, you want the data to give evidence against this hypothesis.
Hypotheses 2. HA: The “Alternative Hypothesis.” This is sometimes called the claim. This is what you are attempting to show to be true. You must decide if you are doing a one- or two-tailed test (more on that in a minute). We base this decision on what we are interested in learning about the situation, the “Why” of the study. We do not base this on what we believe the results will show us.
Hypotheses • From before, we know that all models are wrong, but that some models can be useful. • We make our decision to reject or not to reject the null hypothesis in order to make sense of the data.
Model • We need to decide which inference procedure to use. For now, that will be a one-proportion z-test, but as we continue to learn about inference, we will add more options. • List assumptions and check conditions. Inference methods are based on sampling distribution models. If our data do not meet the requirements to use the sampling distribution model in question, then we cannot proceed. • Name the test that you will be using.
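The slide does not list the conditions themselves. As a rough sketch, the checks below are the ones commonly taught for a one-proportion z-test (success/failure counts of at least 10, and a sample no larger than 10% of the population). The function name, the numbers passed in, and the population size are hypothetical, and the randomization condition cannot be checked in code at all.

```python
def check_one_prop_conditions(n, p0, population_size=None):
    """Return a dict of condition checks for a one-proportion z-test.

    Assumed (commonly taught) conditions, not taken from the slides:
      - success/failure: n*p0 >= 10 and n*(1 - p0) >= 10
      - 10% condition:   n <= 10% of the population (if its size is known)
    """
    checks = {
        "success_failure": n * p0 >= 10 and n * (1 - p0) >= 10,
    }
    if population_size is not None:
        checks["ten_percent"] = n <= 0.10 * population_size
    return checks


# Hypothetical study: 200 sampled from a population of 10,000, H0: p = 0.25
result = check_one_prop_conditions(n=200, p0=0.25, population_size=10_000)
print(result)
```

If any check comes back False, the normal sampling model is in doubt and the z-test should not be run as is.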
Mechanics • Write down all statistics using proper notation. For now, these statistics are sample size, number of successes and sample proportion. Later there will be more statistics than these. • Draw a curve representing the sampling distribution model. Mark the parameter from the null hypothesis and the observed statistic on the curve and shade the appropriate tail.
Mechanics • Calculate the value of the test statistic. Show the formula, substitute all proper values and give the final result. Your TI calculator will compute the test statistic for you, but it is important for this class and the AP test that you be able to show where the statistic comes from. • Find the P-value. The TI can often do this for you, but we will also use tables.
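For the one-proportion z-test, the test statistic and P-value take the standard form below. The symbols p_0 (the null value), q_0, x (number of successes) and Z (a standard normal variable) are notation assumed here, not taken from the slides:

```latex
\[
\hat{p} = \frac{x}{n}, \qquad
z = \frac{\hat{p} - p_0}{\sqrt{\dfrac{p_0 q_0}{n}}}, \qquad q_0 = 1 - p_0
\]
\[
\text{P-value} =
\begin{cases}
P(Z \ge z) & \text{if } H_A: p > p_0 \\
P(Z \le z) & \text{if } H_A: p < p_0 \\
2\,P(Z \ge |z|) & \text{if } H_A: p \ne p_0
\end{cases}
\]
```

Note that the standard deviation in the denominator uses p_0, not p̂, because the whole calculation assumes the null hypothesis is true.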
Conclusion • Link the P-value to the decision. We need to be clear how the calculated P-value led to the decision to reject or not to reject H0. • “Since P < .05 …” • “Since our P-value was so low …” • “The P-value of --- means that the observed results are unlikely …” • State the decision about H0 clearly. You either reject or fail to reject. WE DO NOT EVER EVER accept the null.
Conclusion • Interpret the decision in context. You must clearly explain what was learned with regard to the original context. Examples: • “Because P < .05, I reject the null hypothesis. There is strong evidence that more than 50% of all voters favor the amendment.” • “The high P-value indicates that these results could be reasonably explained by sampling error, so I fail to reject the null hypothesis. We do not have evidence of a change in the percentage of teens who smoke.”
Notes • The P-value is not the probability that the null hypothesis is true. We assume the null is true. With that assumption, we find the probability of getting a statistic at least as extreme as the one we observed. • A low P-value means that our results would be very unusual if the null were true, so we reject the null model. • A high P-value would cause us to fail to reject. • Make sure that you use parameter notation in writing hypotheses. Use μ, not y-bar. Use p, not p-hat.
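To make the first note concrete, here is a small sketch using made-up data: 15 heads in 20 coin flips, with H0: p = 0.5 and HA: p > 0.5. It uses the exact binomial model instead of the normal approximation, purely to show that the P-value is a probability about the data computed while assuming H0 is true. The function name and the data are hypothetical.

```python
from math import comb


def upper_tail_p_value(n, x, p0):
    """P(X >= x) for X ~ Binomial(n, p0).

    This is the chance of a result at least as extreme as x,
    computed under the assumption that H0 (p = p0) is true.
    """
    return sum(comb(n, k) * p0**k * (1 - p0) ** (n - k) for k in range(x, n + 1))


# Hypothetical data: 15 heads in 20 flips, H0: p = 0.5, HA: p > 0.5
p_value = upper_tail_p_value(n=20, x=15, p0=0.5)
print(round(p_value, 4))  # prints 0.0207
```

A result this lopsided would happen only about 2% of the time with a fair coin. That is a statement about the data given the null, not the probability that the coin is fair.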
How low will you go? • Traditionally, statisticians have assigned an α-level, or a level of significance, to make that determination. • If α = .01 then a P-value under .01 is significant. • If α = .05 then a P-value under .05 is significant. • If α = .10 then a P-value under .10 is significant. • These are not ironclad laws. Sometimes you have the option of choosing your own α-level. There is no wrong choice. You should always determine the α-level before you do the test.
Example 1 • A 1996 report from the US Consumer Product Safety Commission claimed that at least 90% of all American homes have at least one smoke detector. A city’s fire department has been running a public safety campaign to increase awareness. The fire department wants to determine if the effort has raised the local level above the 90% national rate. Building inspectors visited 400 randomly selected homes and determined that 376 have smoke detectors. Is there strong evidence to conclude that more than 90% of the city’s homes have smoke detectors?
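A sketch of the mechanics for Example 1, assuming a one-proportion z-test with H0: p = 0.90 and HA: p > 0.90 (one-tailed, because the question asks about "more than 90%"). The variable names are ours, and the standard library's normal model stands in for the TI calculator or a table.

```python
from math import sqrt
from statistics import NormalDist

# Data and null value from Example 1
n, successes, p0 = 400, 376, 0.90

p_hat = successes / n                  # sample proportion, 0.94
sd = sqrt(p0 * (1 - p0) / n)           # SD of p-hat under H0, about 0.015
z = (p_hat - p0) / sd                  # test statistic
p_value = 1 - NormalDist().cdf(z)      # upper tail, since HA: p > 0.90

print(f"p_hat = {p_hat:.2f}, z = {z:.2f}, P-value = {p_value:.4f}")
```

With z near 2.67 and a P-value of roughly 0.004, well under any of the usual α-levels, we would reject H0: there is strong evidence that more than 90% of the city's homes have smoke detectors.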
Example 2 • There are supposed to be 20% orange M&M’s in a bag of candy. If a bag of 122 has 21 orange, does this contradict the company’s claim?
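A sketch of the mechanics for Example 2, assuming a one-proportion z-test with H0: p = 0.20 and HA: p ≠ 0.20 (two-tailed, because "contradict" allows a difference in either direction). Treating the bag as a random sample of M&M's is an assumption, and the variable names are ours.

```python
from math import sqrt
from statistics import NormalDist

# Data and null value from Example 2
n, orange, p0 = 122, 21, 0.20

p_hat = orange / n                         # sample proportion, about 0.172
sd = sqrt(p0 * (1 - p0) / n)               # SD of p-hat under H0
z = (p_hat - p0) / sd                      # test statistic
p_value = 2 * NormalDist().cdf(-abs(z))    # both tails, since HA: p != 0.20

print(f"p_hat = {p_hat:.3f}, z = {z:.2f}, P-value = {p_value:.2f}")
```

With z near -0.77 and a P-value of about 0.44, a bag like this is quite plausible when the true proportion is 20%, so we fail to reject H0. The data do not contradict the company's claim.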