Lesson 11 - 1 Significance Tests: The Basics
Knowledge Objectives • Explain why significance testing looks for evidence against a claim rather than in favor of the claim • Define null hypothesis and alternative hypothesis • Define P-value • Define significance level • Define statistical significance (statistical significance at level α)
Construction Objectives • Explain the difference between a one-sided hypothesis and a two-sided hypothesis. • Identify the three conditions that need to be present before doing a significance test for a mean. • Explain what is meant by a test statistic. Give the general form of a test statistic. • Explain the difference between the P-value approach to significance testing and the statistical significance approach.
Vocabulary • Hypothesis – a statement or claim regarding a characteristic of one or more populations • Hypothesis Testing – procedure, based on sample evidence and probability, used to test hypotheses • Null Hypothesis – H0, a statement to be tested; assumed to be true until evidence indicates otherwise • Alternative Hypothesis – H1 (also written Ha), a claim to be tested (what we will test to see if evidence supports the possibility) • Level of Significance – α, the probability of making a Type I error
Steps in Hypothesis Testing • A claim is made • Evidence (sample data) is collected to test the claim • The data are analyzed to assess the plausibility (not proof!!) of the claim • Note: Hypothesis testing is also called Significance testing
Hypotheses: Null H0 & Alternative Ha • Think of the null hypothesis as the status quo • Think of the alternative hypothesis as something has changed or is different than expected • We cannot prove the null hypothesis! We can only find enough evidence to reject the null hypothesis, or fail to.
Hypotheses Cont • Our hypotheses will only involve population parameters (we know the sample statistics!) • The alternative hypothesis can be • one-sided: μ > 0 or μ < 0 (which allows a statistician to detect movement in a specific direction) • two-sided: μ ≠ 0 (things have changed) • Read the problem statement carefully to decide which is appropriate • The null hypothesis is usually “=”, but if the alternative is one-sided, the null could be too
Three Ways – H0 versus Ha
[Figure: critical regions for the three cases, with cutoffs marked a and b]
1. Equal versus less than (left-tailed test) • H0: the parameter = some value (or more) • H1: the parameter < some value
2. Equal hypothesis versus not equal hypothesis (two-tailed test) • H0: the parameter = some value • H1: the parameter ≠ some value
3. Equal versus greater than (right-tailed test) • H0: the parameter = some value (or less) • H1: the parameter > some value
Example 1: A manufacturer claims that there are at least two scoops of cranberries in each box of cereal.
• Parameter to be tested: number of scoops of cranberries in each box of cereal
• If the sample mean is too low, that is a problem
• If the sample mean is too high, that is not a problem
• Test Type: left-tailed test (the “bad case” is when there are too few)
• H0: Scoops = 2 (or more) (s ≥ 2)
• Ha: Less than two scoops (s < 2)
Example 2: A manufacturer claims that there are exactly 500 mg of a medication in each tablet.
• Parameter to be tested: amount of a medication in each tablet
• If the sample mean is too low, that is a problem
• If the sample mean is too high, that is a problem too
• Test Type: two-tailed test (a “bad case” is when there is too little; a “bad case” is also when there is too much)
• H0: Amount = 500 mg
• Ha: Amount ≠ 500 mg
Example 3: A pollster claims that at most 56% of all Americans are in favor of an issue.
• Parameter to be tested: population proportion p in favor of the issue
• If p-hat is too low, that is not a problem
• If p-hat is too high, that is a problem
• Test Type: right-tailed test (the “bad case” is when the sample proportion is too high)
• H0: p = 56% (or less) (p ≤ 0.56)
• Ha: p > 56%
(The hypotheses are written in terms of the population proportion p, not the sample proportion p-hat, because hypotheses only involve population parameters.)
Conditions for Significance Tests
• SRS: simple random sample from the population of interest
• Normality
  • For means: population normal, or sample size large enough for the CLT to apply, or use t-procedures
  • t-procedures: boxplot or normality plot to check for shape and any outliers (outliers are a killer)
  • For proportions: np ≥ 10 and n(1-p) ≥ 10
• Independence: population size N such that N > 10n
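The two numeric conditions above (large counts for proportions and N > 10n for independence) are easy to check mechanically. The sketch below is only an illustration: the function names and the sample values (n = 100, p = 0.56, N = 4001) are assumptions, not part of the lesson.

```python
def large_counts_ok(n: int, p: float) -> bool:
    """Normality check for proportions: np >= 10 and n(1 - p) >= 10."""
    return n * p >= 10 and n * (1 - p) >= 10


def independence_ok(population_size: int, n: int) -> bool:
    """Independence check as stated on the slide: N > 10n."""
    return population_size > 10 * n


# Hypothetical values chosen only to exercise the checks.
print(large_counts_ok(n=100, p=0.56))
print(independence_ok(population_size=4001, n=400))
```

The SRS condition cannot be verified from numbers alone; it has to come from how the data were collected.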
Test Statistics Principles that apply to most tests: • The test is based on a statistic that compares the value of the parameter as stated in H0 with an estimate of the parameter from the sample data • Values of the estimate far from the parameter value in the direction specified by Ha give evidence against H0 • To assess how far the estimate is from the parameter, standardize the estimate. In many common situations, the test statistic has the form:
test statistic = (estimate – hypothesized value) / (standard deviation of the estimate, i.e., SE)
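As a minimal sketch of that general form, the function below standardizes an estimate against the value stated in H0. The function name and the numbers used (estimate 52, hypothesized value 50, SE 1) are hypothetical and chosen only to make the arithmetic obvious.

```python
def standardized_statistic(estimate: float, hypothesized_value: float,
                           standard_error: float) -> float:
    """(estimate - hypothesized value) / standard deviation of the estimate."""
    return (estimate - hypothesized_value) / standard_error


# Hypothetical numbers: the estimate sits two standard errors above H0's value.
print(standardized_statistic(estimate=52, hypothesized_value=50, standard_error=1.0))
```

A positive result means the estimate lies above the hypothesized value, a negative result below it; the size says how many standard errors away it is.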
Example 4 Several cities have begun to monitor paramedic response times. In one such city, the mean response time to all accidents involving life-threatening injuries last year was μ=6.7 minutes with σ=2 minutes. The city manager shares this info with the emergency personnel and encourages them to “do better” next year. At the end of the following year, the city manager selects a SRS of 400 calls involving life-threatening injuries and examines response times. For this sample the mean response time was x-bar = 6.48 minutes. Do these data provide good evidence that the response times have decreased since last year? List parameter, hypotheses and conditions check
Example 4 cont
• Parameter: μ, the mean response time (in minutes) to calls involving life-threatening injuries
• H0: μ = 6.7 minutes (unchanged)
• Ha: μ < 6.7 minutes (they got “better”)
• Conditions Check:
1) SRS: stated in problem statement
2) Normality: n = 400 suggests the CLT would apply to x-bar
3) Independence: n = 400 means we must assume over 4000 calls each year that involve life-threatening injuries
Hypothesis Testing Approaches • P-Value • Logic: Assuming H0 is true, if the probability of getting a sample mean as extreme or more extreme than the one obtained is small, then we reject the null hypothesis (accept the alternative). • Classical (Statistical Significance) • Logic: If the sample mean is too many standard deviations from the mean stated in the null hypothesis, then we reject the null hypothesis (accept the alternative) • Confidence Intervals • Logic: If the sample mean lies in the confidence interval about the status quo, then we fail to reject the null hypothesis
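Confidence Interval Approach • Test Statistic: z0 = (x-bar – μ0) / (σ/√n) • Critical value: z* = invnorm(1 – α/2) • [Figure: normal curve centered at μ0; the fail-to-reject (FTR) region runs from LB to UB, at –z*α/2 and z*α/2; the reject regions are the two tails outside LB and UB]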
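The confidence interval approach can be sketched with Python's standard library, where `NormalDist().inv_cdf` plays the role of the calculator's invnorm. The numbers come from Example 4; α = 0.05 and the two-sided setup are assumptions made for illustration (Example 4 itself is a one-sided question).

```python
from math import sqrt
from statistics import NormalDist


def fail_to_reject_interval(mu0, sigma, n, alpha):
    """Return (LB, UB), the fail-to-reject region for x-bar, centered on mu0."""
    z_star = NormalDist().inv_cdf(1 - alpha / 2)  # same role as invnorm(1 - alpha/2)
    margin = z_star * sigma / sqrt(n)
    return mu0 - margin, mu0 + margin


# Example 4 numbers; alpha = 0.05 is an assumed significance level.
lb, ub = fail_to_reject_interval(mu0=6.7, sigma=2, n=400, alpha=0.05)
x_bar = 6.48
verdict = "fail to reject H0" if lb <= x_bar <= ub else "reject H0"
print(f"LB = {lb:.3f}, UB = {ub:.3f}, x-bar = {x_bar}: {verdict}")
```

A sample mean inside (LB, UB) is consistent with the status quo; one outside it falls in a reject region.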
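Classical Approach • Test Statistic: z0 = (x-bar – μ0) / (σ/√n) • [Figure: reject regions by test type: left-tailed, z0 below –zα; two-tailed, z0 below –zα/2 or above zα/2; right-tailed, z0 above zα]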
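The classical approach compares z0 with a critical value that depends on the tail of the test. The sketch below is one way to write that comparison; the function name, the tail labels, and the values z0 = -1.8 and α = 0.05 are assumptions for illustration only.

```python
from statistics import NormalDist


def in_reject_region(z0, alpha, tail):
    """Classical approach: is z0 in the reject region for the given tail?"""
    std_normal = NormalDist()
    if tail == "left":
        return z0 < -std_normal.inv_cdf(1 - alpha)           # z0 < -z_alpha
    if tail == "right":
        return z0 > std_normal.inv_cdf(1 - alpha)            # z0 > z_alpha
    if tail == "two":
        return abs(z0) > std_normal.inv_cdf(1 - alpha / 2)   # |z0| > z_alpha/2
    raise ValueError("tail must be 'left', 'right', or 'two'")


# Illustrative values only: z0 = -1.8 and alpha = 0.05 are assumptions.
for tail in ("left", "right", "two"):
    print(tail, in_reject_region(-1.8, 0.05, tail))
```

With these values the left-tailed test rejects H0 while the two-tailed test does not, because the two-tailed test splits α between both tails and so needs a more extreme z0.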
P-value • P-value is the probability of getting a value as extreme or more extreme than the one observed, if H0 is true (measures the tails) • Small P-values are evidence against H0 • observed value is unlikely to occur if H0 is true • Large P-values fail to give evidence against H0
P-Value Approach • Test Statistic: z0 = (x-bar – μ0) / (σ/√n) • [Figure: P-value is the highlighted area: the tail beyond z0 for a one-sided test, or both tails beyond –|z0| and |z0| for a two-sided test] • Probability(getting a result at least as far from μ0 as the point estimate) = p-value • P-value is the area in the tails!!
Example 5: P-Values For each α and observed significance level (p-value) pair, indicate whether the null hypothesis would be rejected.
a) α = .05, p = .10: α < P, fail to reject H0
b) α = .10, p = .05: P < α, reject H0
c) α = .01, p = .001: P < α, reject H0
d) α = .025, p = .05: α < P, fail to reject H0
e) α = .10, p = .45: α < P, fail to reject H0
Example 4 cont
• What is the P-value associated with the data in Example 4?
z0 = (x-bar – μ0) / (σ/√n) = (6.48 – 6.7) / 0.10 = -2.2
P(z < z0) = P(z < -2.2) = 0.0139 (unusual!)
• What if the sample mean was 6.61?
z0 = (x-bar – μ0) / (σ/√n) = (6.61 – 6.7) / 0.10 = -0.9
P(z < z0) = P(z < -0.9) = 0.1841 (not unusual!)
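The two calculations in Example 4 can be reproduced with the standard library. This is a sketch under the slide's own setup (known σ, left-tailed test); the function name is an assumption.

```python
from math import sqrt
from statistics import NormalDist


def left_tail_p_value(x_bar, mu0, sigma, n):
    """Return (z0, P-value) for a left-tailed z test with known sigma."""
    z0 = (x_bar - mu0) / (sigma / sqrt(n))
    return z0, NormalDist().cdf(z0)  # area to the left of z0


# Example 4: mu0 = 6.7, sigma = 2, n = 400, for both sample means.
for x_bar in (6.48, 6.61):
    z0, p = left_tail_p_value(x_bar, mu0=6.7, sigma=2, n=400)
    print(f"x-bar = {x_bar}: z0 = {z0:.1f}, P-value = {p:.4f}")
```

The results match the table values on the slide: about 0.0139 for x-bar = 6.48 and about 0.1841 for x-bar = 6.61.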
Two-sided Test P-value • P-value is the sum of both tail areas in the two sided test case
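Because the standard normal curve is symmetric, the sum of both tail areas is twice the area beyond |z0|. A minimal sketch, reusing z0 = -2.2 from Example 4 purely as an illustration (Example 4 itself is a one-sided test); the function name is an assumption.

```python
from statistics import NormalDist


def two_sided_p_value(z0):
    """Sum of both tail areas: P(Z <= -|z0|) + P(Z >= |z0|)."""
    return 2 * NormalDist().cdf(-abs(z0))


# z0 = -2.2 is borrowed from Example 4 only to show the doubling.
print(f"{two_sided_p_value(-2.2):.4f}")
```

The two-sided P-value is double the one-sided value of 0.0139 found for the same z0.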
Statistical Significance Definition • Statistically significant means simply that the observed result is not likely to happen just by chance • Significant in the statistical sense does not mean important • Very large samples can make very small differences statistically significant, but not practically important
Statistical Significance – P-value When using a P-value, we compare it with a level of significance, α, decided at the start of the test. • Not significant when α < P: Fail to Reject H0 • Significant when α ≥ P: Reject H0
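The decision rule above is a single comparison. The sketch below applies it to the five (α, p) pairs from Example 5; the function name is an assumption.

```python
def decision(p_value, alpha):
    """Significant (reject H0) when alpha >= P; otherwise fail to reject H0."""
    return "reject H0" if p_value <= alpha else "fail to reject H0"


# The (alpha, p) pairs a) through e) from Example 5.
pairs = [(0.05, 0.10), (0.10, 0.05), (0.01, 0.001), (0.025, 0.05), (0.10, 0.45)]
for label, (alpha, p) in zip("abcde", pairs):
    print(f"{label}) alpha = {alpha}, p = {p}: {decision(p, alpha)}")
```

Note that α is passed in as a fixed input: it should be chosen before the data are examined, not adjusted afterward.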
Statistical Significance Interpretation Remember the three C’s: Conclusion, connection, context • Conclusion: Either we have evidence to reject H0 in favor of Ha or we fail to reject • Connection: connect your calculated values to your conclusion • Context: Always put it in terms of the problem (don’t use generalized statements)
Statistical Significance Warnings • If you are going to draw a conclusion based on statistical significance, then the significance level α should be stated before the data are produced • Deceptive users of statistics might set an α level after the data have been analyzed to manipulate the conclusion • This is just as inappropriate as choosing an alternative hypothesis to be one-sided in a particular direction after looking at the data • P-values give a better sense of how strong the evidence against H0 is
Summary and Homework • Summary • Significance test assesses evidence provided by data against H0 in favor of Ha • Ha can be two-sided (different, ≠) or one-sided (specific direction, < or >) • Same three conditions as with confidence intervals • Test statistic is usually a standardized value • If the P-value (the probability of getting a value as extreme or more extreme than the one observed, given that H0 is true) is small, we reject H0 • Homework • 11.3, 11.6 – 11.8, 11.12 – 11.14, 11.19