Section 6.2 Tests of Significance
Test of Significance • Statistical Inference: • To draw conclusions about a population • To estimate a population parameter • 3 most common ways to make inferences • Point Estimation (x-bar or p-hat) • Confidence Intervals (plausible values of the parameter) • Tests of Significance (also called Hypothesis Tests) • Tests of significance are the focus of this section
Test of Significance • Test of Significance: • Assess the evidence provided by data in favor of some claim about the population • Start by setting up a hypothesis • This is a statement about a population parameter • Results of the test are expressed in terms of a probability • Usually called a p-value • This probability (p-value) measures how well the data and the hypothesis agree.
Test of Significance • Is the observed effect “too large” or “too unusual” to be due to chance variation? • An approach to significance testing • Define a research question in terms of a parameter and specify the claimed parameter value. • Conduct an experiment or survey to obtain data. • Plot data and compute observed statistic (x-bar or p-hat). • Check sampling distribution for normality. • Using the sampling distribution of the statistic with center given by claimed parameter value, calculate the probability of getting differences more extreme than the observed effect • If the probability is small enough, declare the observed effect statistically significant (i.e., “too large” to be due to chance).
Test of Significance – An example • The mean yield of corn in the US is about 135 bushels per acre. A survey of 40 randomly selected farmers this year gives a sample mean yield of x-bar = 138.8 bushels per acre. • We want to know whether this is good evidence that the national mean this year is not 135 bushels per acre
Test of Significance – An example • Farmers' corn yield • We want to measure the average corn yield µ for the US this year. • Sample: n = 40 farmers • Statistic: sample mean x-bar = 138.8 • Population std. dev.: σ = 10 • n = 40 ≥ 30, so according to the CLT, x-bar is approximately Normal with mean µ and standard deviation σ/sqrt(n) = 10/sqrt(40) • Question: Is it possible that, even though we observed x-bar = 138.8, the true µ is in fact 135 bushels per acre? • (Is it likely that observing x-bar = 138.8 is due to pure chance?)
Test of Significance – An example • Using the z-calculation and Table A, the probability of observing x-bar = 138.8 or larger while the true µ is 135, with sd = 10/sqrt(40), is only 0.008: z = (138.8 - 135)/(10/sqrt(40)) = 2.40, and P(Z ≥ 2.40) = 0.008
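The tail probability on this slide can be checked numerically. The sketch below is only an illustration: it uses Python's standard-library `statistics.NormalDist` in place of Table A, the variable names are made up for this example, and σ = 10 is taken from the slide's sd = 10/sqrt(40).

```python
from math import sqrt
from statistics import NormalDist

mu_0 = 135.0      # claimed population mean (bushels per acre)
sigma = 10.0      # population standard deviation, as used on the slide
n = 40            # sample size
x_bar = 138.8     # observed sample mean

# Standard deviation of the sampling distribution of x-bar
sd_x_bar = sigma / sqrt(n)

# Standardize the observed mean, then take the upper-tail area
z = (x_bar - mu_0) / sd_x_bar
p_upper = 1.0 - NormalDist().cdf(z)

print(f"z = {z:.2f}, P(x-bar >= {x_bar}) = {p_upper:.4f}")
```

The result should agree with the table-based value of about 0.008, up to rounding of z.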
Test of Significance – An example • The probability 0.008 is very small • We would observe x-bar ≥ 138.8 only about eight times per 1000 samples if the true µ = 135 • Hence, it is not likely that the true µ is 135, given that a sample mean of 138.8 was observed • Two possibilities • µ = 135 and we observed something very unusual • µ is not 135 but is some other value that makes the observed data more probable
Test of Significance – An example • What if a sample mean x-bar = 136.6 is observed? • The probability of observing 136.6 or larger given µ = 135 is 0.15 • We would observe x-bar ≥ 136.6 about 15 times per 100 samples • This value 0.15 is not as extreme as 0.008, and it could easily happen • We don’t have strong evidence to believe µ ≠ 135. • Conclusion: A sample outcome that would be extreme if a hypothesis were true is evidence that the hypothesis is not true. In other words, if the sample result is extreme, we likely have evidence against our original hypothesis.
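The "times per 100 samples" reading can be illustrated with a small simulation. This is a rough sketch, not part of the original slides: it assumes x-bar is drawn directly from its Normal sampling distribution with mean 135 and standard deviation 10/sqrt(40), and the function name `tail_fraction` is hypothetical.

```python
import random
from math import sqrt

random.seed(0)  # fixed seed so the run is reproducible

mu_0, sigma, n = 135.0, 10.0, 40
sd_x_bar = sigma / sqrt(n)

def tail_fraction(observed_mean, trials=100_000):
    """Fraction of simulated sample means at or above observed_mean when mu = mu_0."""
    hits = 0
    for _ in range(trials):
        # Assumption: x-bar is drawn directly from its Normal sampling distribution
        simulated_mean = random.gauss(mu_0, sd_x_bar)
        if simulated_mean >= observed_mean:
            hits += 1
    return hits / trials

frac_1366 = tail_fraction(136.6)
frac_1388 = tail_fraction(138.8)
print(f"x-bar >= 136.6: {frac_1366:.3f} of samples")
print(f"x-bar >= 138.8: {frac_1388:.3f} of samples")
```

The first fraction should land near 0.15 and the second near 0.008, which is the contrast the two slides draw between an unsurprising and a surprising sample mean.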
Test of Significance – An example • In this example, we considered 2 possibilities for the value of µ: 1. µ = 135 2. µ ≠ 135 These are the hypotheses • The first step in a test of significance is to state a claim that we will try to find evidence against
Stating hypothesis • Null Hypothesis (H0) • The statement being tested in a test of significance is called the null hypothesis • Usually the null hypothesis • is a statement of “no effect” or “no difference”, • is a statement about a population, • is expressed in terms of a (some) parameter(s). • Example H0: µ = 0
Stating hypothesis • Alternative Hypothesis (Ha) • The name we give to the statement we hope (or suspect) is true. • Example Ha: µ ≠ 0 • Hypotheses always refer to some population or model, not a particular outcome • We must decide whether the alternative hypothesis (Ha) should be one-sided or two-sided
Stating hypothesis • One-sided alternative hypotheses: • Example: Ha: µ > 0 or Ha: µ < 0 • Two-sided alternative hypothesis: • Example: Ha: µ ≠ 0
Stating hypothesis • Choosing one-sided or two-sided Hypothesis • The alternative hypothesis should express the hopes or suspicions we had in mind when we decided to collect the data. • You are cheating if you first look at the data and then frame Ha to fit what the data show. Choose a one or two-sided Ha before you look at the data. • If you do not have a specific direction in advance, use a two-sided alternative
Stating hypothesis • Example: Your company hopes to reduce the mean time (µ) required to process customer orders. At present, this mean is 3.8 days. You study the process and eliminate some unnecessary steps. • Q: Did you succeed in decreasing the average process time? Target: to show that the mean is now less than 3.8 days. • So the alternative hypothesis is one-sided: Ha: µ < 3.8 • The null hypothesis is the “no change” value: H0: µ = 3.8
Stating hypothesis • The mean area of several thousand apartments in a new development is advertised to be 1250 sqft. A tenant group thinks that the apartments are smaller than advertised. They hire an engineer to measure a sample of apartments to test their suspicion. • H0: µ = 1250 vs. Ha: µ < 1250
Stating hypothesis • Experiments concerning learning in animals sometimes measure how long it takes a mouse to find its way through a maze. The mean time is 18 seconds for one particular maze. A researcher thinks that a loud noise will cause the mice to complete the maze faster. She measures how long each of 10 mice takes with a noise as stimulus. • H0: µ = 18 vs. Ha: µ < 18
Stating hypothesis • Last year, your company’s service technicians took an average of 2.6 hours to respond to trouble calls from business customers who purchased service contracts. Do this year’s data show a different average response time? • H0: µ = 2.6 vs. Ha: µ ≠ 2.6
Test of Significance • The test is based on a statistic that estimates the parameter appearing in the hypothesis • Usually this is the same estimate we would use in a confidence interval for the parameter • When H0 is true, we expect the estimate to take a value near the parameter value specified by H0 • Values of the estimate far from the parameter value specified by H0 give evidence against H0 • The alternative hypothesis (Ha) determines which directions count against H0
Test of Significance • Test statistics • A test statistic measures compatibility between the null hypothesis and the data • Many test statistics can be thought of as a distance between a sample estimate of a parameter and the value of the parameter specified by the null hypothesis
Test of Significance • Example: • The Census Bureau reports that households spend an average of 31% of their total spending on housing. A homebuilders association in Cleveland wonders if the national finding applies in their area. They interview a sample of 40 households in the Cleveland metropolitan area to learn what percent of their spending goes toward housing. Take µ to be the mean percent of spending devoted to housing among all Cleveland households. Assume that σ = 9.6%. • What are the null and alternative hypotheses? • H0: µ = 31% vs. Ha: µ ≠ 31%
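Test of Significance • Example: Spending on housing • Sample: n = 40 households • Sample mean: x-bar = 28.6% • Pop. std. dev.: σ = 9.6% • Test: H0: µ = 31% vs. Ha: µ ≠ 31% • The test statistic is the standardized version of the distance between the sample mean and the parameter value given in H0: z = (x-bar - µ0)/(σ/sqrt(n)) = (28.6 - 31)/(9.6/sqrt(40)) = -1.58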
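As a quick check of the standardization step, the sketch below computes the z statistic for the housing example. The sample mean of 28.6% is the value implied by the z = -1.58 quoted later in the example together with σ = 9.6% and n = 40; treat it as an assumption if your copy of the slides shows a different figure.

```python
from math import sqrt

mu_0 = 31.0    # H0 value: national mean percent of spending on housing
sigma = 9.6    # population standard deviation given in the example (percent)
n = 40         # households sampled
x_bar = 28.6   # ASSUMED sample mean, chosen to agree with z = -1.58

# Distance between the estimate and the H0 value, in standard-error units
standard_error = sigma / sqrt(n)
z = (x_bar - mu_0) / standard_error

print(f"standard error = {standard_error:.3f}, z = {z:.2f}")
```

A negative z simply says the sample mean fell below the value claimed by H0; how surprising that is depends on the alternative hypothesis.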
Test of Significance • Example: Spending on Housing • The Central Limit Theorem assures us that x-bar is approximately Normal • This implies that the test statistic z has an approximately standard Normal distribution • To move from the test statistic z to a probability, we must do Normal probability calculations.
Test of Significance P-values • A test of significance assesses the evidence against the null hypothesis and provides a numerical summary of this evidence in terms of a probability • The idea is that “surprising” or “unusual” outcomes are evidence against H0 • A surprising or unusual outcome is one that is far from what we would expect if H0 were true
Test of Significance P-Values • A test of significance finds the probability of getting an outcome as extreme or more extreme than the actually observed outcome • The direction or directions that count as “far from what we would expect” are determined by the alternative hypothesis • Definition: The probability, computed assuming that H0 is true, that the test statistic would take a value as or more extreme than that actually observed is called the P-value of the test • The smaller the P-value, the stronger the evidence against H0 provided by the data in our sample
Test of Significance P-values • To calculate the P-value, we must use the sampling distribution of the test statistic • Our test statistic z follows a standard Normal distribution, so Normal calculations are all we will need in Chapter 6. (Matters change in Chapter 7.)
Test of Significance P-values • Example: continue Spending on housing • Previously, we calculated z = -1.58 • If the null hypothesis is true, we expect z to take a value not too far from 0 • Because the alternative is two-sided, values of z far from 0 in either direction count as evidence against H0 and in favor of Ha
Test of Significance P-Values • Example: Continued • So the P-value is: • P(Z < -1.58) + P(Z > 1.58) = 2*P(Z > 1.58) = 2*0.057 = 0.114 • What is the P-value if Ha: µ < 31%? • P-value is: • P(Z < -1.58) = 0.057
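The two P-values above can be reproduced without a table. This is a minimal sketch using the standard-library `statistics.NormalDist`; the variable names are illustrative only.

```python
from statistics import NormalDist

z = -1.58                  # test statistic from the housing example
std_normal = NormalDist()  # standard Normal: mean 0, sd 1

# Two-sided alternative (Ha: mu != 31%): both tails count as extreme
p_two_sided = 2 * std_normal.cdf(-abs(z))

# One-sided alternative (Ha: mu < 31%): only the lower tail counts
p_lower = std_normal.cdf(z)

print(f"two-sided P-value = {p_two_sided:.3f}")
print(f"one-sided P-value = {p_lower:.3f}")
```

Using `-abs(z)` for the two-sided case means the same line works whether the observed z is positive or negative.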
Test of Significance Statistical Significance • Statistical software automates the task of calculating the test statistic and its P-value • You must still decide which test is appropriate and whether to use a one-sided or two-sided test • You must also decide what conclusion the computer’s numbers support • We know that smaller P-values indicate stronger evidence against the null hypothesis
Test of Significance • How strong is strong enough? • One approach is to announce in advance how much evidence against H0 we will require to reject H0 • We compare the P-value with a significance level that says “this evidence is strong enough” • The significance level is denoted by α • If we choose α = 0.05, we are requiring that the data give evidence against H0 so strong that it would happen no more than 5% of the time when H0 is true • If the P-value is as small as or smaller than α, we say that the data are statistically significant at level α
Test of Significance Statistical Significance • P-value ≤ α => Data are significant. Reject H0 • P-value > α => Data are not significant at the given significance level. There is not enough evidence to reject H0
Test of Significance Statistical Significance • “Significant” in the statistical sense does not mean “important” • The term is used to indicate only that the evidence against the null hypothesis reached the standard set by α
Test of Significance Test Statistic • Previously, we calculated P-value = 0.057 • With Ha: µ < 31% • Is this significant at the α = 0.10 level? • 0.057 < 0.10 => There is enough evidence to reject H0 at this α-level • Is there statistical significance at the α = 0.05 level? • 0.057 > 0.05 => There is not enough evidence to reject H0 at this α-level
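The comparison of a P-value with several significance levels can be written as a tiny helper. This is a sketch only; the function name `is_significant` is hypothetical, and it follows the "as small as or smaller than α" rule stated earlier in the section.

```python
def is_significant(p_value: float, alpha: float) -> bool:
    """Return True when the P-value is as small as or smaller than alpha."""
    return p_value <= alpha

p_value = 0.057  # one-sided P-value from the housing example

for alpha in (0.10, 0.05, 0.01):
    decision = "reject H0" if is_significant(p_value, alpha) else "do not reject H0"
    print(f"alpha = {alpha:.2f}: {decision}")
```

The same data lead to different decisions at different levels, which is one reason to report the P-value itself and not only the reject-or-not outcome.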
Test of Significance Test for a population mean • There are four steps in carrying out a significance test • State the hypotheses (H0 vs. Ha) • Calculate the test statistic (usually a standardized x-bar) • Find the P-value and make a decision • State your conclusion in the context of your specific setting
Test of Significance Test for a population mean • We have an SRS of size n drawn from a Normal population with unknown mean µ • We want to test the hypothesis that µ has a specified value • Call the specified value µ0 1. State the null hypothesis • The null hypothesis is H0: µ = µ0
Test of Significance Test for a population mean • The test is based on the sample mean x-bar 2. Calculate the test statistic • Because Normal calculations require standardized variables, we will use the standardized sample mean as our test statistic: z = (x-bar - µ0)/(σ/sqrt(n)), where σ is the population standard deviation • This one-sample z statistic has the standard Normal distribution when H0 is true
Test of Significance Test for a population mean 3. Find the P-value and make a decision • The P-value of the test is the probability that z takes a value at least as extreme as the value for our sample • What counts as extreme is determined by the alternative hypothesis Ha • One-sided Ha: µ < µ0 or Ha: µ > µ0 • Two-sided Ha: µ ≠ µ0
Test of Significance 3. Find the P-value and make a decision • P-value ≤ α => Data are significant. Reject H0 • P-value > α => Data are not significant at the given significance level. There is not enough evidence to reject H0 4. State your conclusion within the context of the problem
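For reference, steps 2 and 3 can be written compactly in standard notation. This is a summary sketch, with Z denoting a standard Normal variable and z the observed value of the one-sample z statistic; the symbols follow the usual textbook convention and are not copied from the slides.

```latex
\[
z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}
\]
\[
\text{P-value} =
\begin{cases}
P(Z \ge z)        & \text{if } H_a\colon \mu > \mu_0 \\
P(Z \le z)        & \text{if } H_a\colon \mu < \mu_0 \\
2\,P(Z \ge |z|)   & \text{if } H_a\colon \mu \ne \mu_0
\end{cases}
\]
```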
Test of Significance z Test for a Population Mean • Example: • A manufacturer of cereal wants to test the performance of one of its filling machines • The machine is designed to discharge a mean amount of µ = 12 ounces per box • The manufacturer wants to detect any departure from this setting • Suppose the sample yields the following results • n = 100 observations (boxes) • x-bar = 11.85 ounces • σ = 0.5 ounces
Test of Significance Solution: • State the null and alternative hypotheses, specify an α-level: H0: µ = 12 vs. Ha: µ ≠ 12, α = 0.01 • Calculate the test statistic: z = (11.85 - 12)/(0.5/sqrt(100)) = -3.0 • Find the P-value: 2*P(Z > 3.0) = 2*0.0013 = 0.0026 • Conclusion: P-value = 0.0026 < α = 0.01. Reject H0. At this significance level, the sample provides enough evidence to believe that the mean amount of cereal the machine discharges is different from 12 ounces per box.
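The four steps for the cereal example can be sketched as one small function. This is an illustration under stated assumptions, not the slides' own method: the function name `one_sample_z_test` and its `alternative` labels are hypothetical, the standard deviation 0.5 is treated as the known population value σ, and α = 0.01 is assumed to match the 99% interval used later in the section.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z_test(x_bar, mu_0, sigma, n, alternative="two-sided"):
    """Return (z, p_value) for H0: mu = mu_0 with known sigma.

    alternative is "two-sided", "less" (Ha: mu < mu_0), or "greater" (Ha: mu > mu_0).
    """
    z = (x_bar - mu_0) / (sigma / sqrt(n))
    cdf = NormalDist().cdf
    if alternative == "two-sided":
        p_value = 2 * cdf(-abs(z))
    elif alternative == "less":
        p_value = cdf(z)
    elif alternative == "greater":
        p_value = 1 - cdf(z)
    else:
        raise ValueError(f"unknown alternative: {alternative!r}")
    return z, p_value

# Cereal example: n = 100, x-bar = 11.85, standard deviation 0.5, H0: mu = 12
z, p_value = one_sample_z_test(11.85, 12.0, 0.5, 100)
alpha = 0.01  # ASSUMED level, matching the 99% interval used later in the section
print(f"z = {z:.2f}, P-value = {p_value:.4f}, reject H0: {p_value <= alpha}")
```

The computed P-value is close to 0.0027; the 0.0026 on the slide comes from doubling the rounded table entry 0.0013. Either way the P-value is well below 0.01.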
Test of Significance • Remember • Tests of significance assess the evidence against H0 • If the evidence is strong, we can confidently reject H0 in favor of the alternative • Failing to find evidence against H0 means only that the data are consistent with H0, not that we have clear evidence that H0 is true
Test of Significance Two-sided Significance Tests and Confidence Intervals: • A 95% confidence interval captures the true value of µ in 95% of all samples • If we are 95% confident that the true µ lies in our interval, we are also confident that values of µ that fall outside our interval are incompatible with the data • That sounds like the conclusion of a test of significance!
Test of Significance Two-sided Significance Tests and Confidence Intervals: • There is a close connection between 95% confidence intervals and significance at the 5% level • The same connection holds between 99% confidence intervals and significance at the 1% level, and so on • So we can use confidence intervals to conduct a two-sided hypothesis test (in these simple cases)
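Test of Significance Two-sided Significance Tests and Confidence Intervals: • A level α two-sided significance test rejects H0: µ = µ0 exactly when the value µ0 falls outside a level 1 - α confidence interval for µ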
Test of Significance Two-sided Significance Tests and Confidence Intervals: Example cont’d: • Take α = 0.01 • n = 100 observations • x-bar = 11.85 ounces • σ = 0.5 ounces • 99% confidence interval: x-bar ± z*·σ/sqrt(n) = 11.85 ± 2.576(0.5/sqrt(100)) = (11.72, 11.98)
Test of Significance Two-sided Significance Tests and Confidence Intervals: • Hypotheses: H0: µ = 12 vs. Ha: µ ≠ 12 • Conclusion: • 12 is not in (11.72, 11.98) • We reject H0 at significance level α = 0.01 • Does this match the previous conclusion?
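The interval-based version of the test can be checked in a few lines. This sketch assumes the standard deviation 0.5 is the known population value σ and uses `NormalDist.inv_cdf` from the standard library to get the critical value in place of a table lookup; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, x_bar, sigma = 100, 11.85, 0.5   # cereal example values
mu_0 = 12.0                          # value claimed by H0
confidence = 0.99                    # 99% interval, pairing with alpha = 0.01

# Critical value z* leaving (1 - confidence)/2 in each tail
z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
margin = z_star * sigma / sqrt(n)
lower, upper = x_bar - margin, x_bar + margin

# Two-sided test via the interval: reject H0 when mu_0 falls outside it
reject_h0 = not (lower <= mu_0 <= upper)
print(f"{confidence:.0%} CI: ({lower:.2f}, {upper:.2f}); reject H0: {reject_h0}")
```

Because 12 lies outside the interval, the decision agrees with the one reached from the P-value at the 1% level.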
Test of Significance P-value versus Fixed α • A P-value is more informative than a “reject-or-not” finding at a fixed significance level • Assessing significance at a fixed level is easier, because no probability calculation is required • Simply look up a critical value in a table • Because the practice of statistics almost always employs software that calculates P-values automatically, tables of critical values are basically outdated
Section 6.2 Summary • A test of significance is intended to assess the evidence provided by data against a null hypothesis (H0) and in favor of an alternative hypothesis (Ha). The test provides a method for ruling out chance as an explanation for data that deviate from what we expect under H0.
Section 6.2 Summary • The hypotheses (H0 and Ha) are stated in terms of population parameters. Usually H0 is a statement that no effect is present, and Ha says that a parameter differs from its null value, in a specific direction (one-sided alternative) or in either direction (two-sided alternative).