Chapter 9 Hypothesis Testing

Table of Contents
9.1 Introduction to Statistical Tests
9.2 Testing the Mean µ
9.3 Testing a Proportion p
9.4 Tests Involving Paired Differences (Dependent Samples)

9.1 Introduction to Statistical Tests

Why we do hypothesis testing
When analyzing statistical data, we often want to determine whether a statement about a population parameter is supported or rejected by the data we have, and to what degree.

Definitions
Null hypothesis H0: This is the statement that is under investigation or being tested. Usually the null hypothesis represents a statement of “no effect,” “no difference,” or, put another way, “things haven’t changed.”
Alternate hypothesis H1: This is the statement you will adopt in the situation where the evidence (data) is so strong that you reject H0.
A statistical test is designed to assess the strength of the evidence (data) against the null hypothesis.

Nuts & bolts
H0 is always of the form H0: µ = # or H0: p = #
H1 is always of the form H1: µ < # or H1: p < #, H1: µ > # or H1: p > #, or H1: µ ≠ # or H1: p ≠ #

There are only 6 possible H0, H1 combinations. Note that the number used in H0’s comparison is the same number used in H1’s comparison.

Types of statistical tests
A statistical test is:
left-tailed if H1 states that the parameter is less than the value claimed in H0;
right-tailed if H1 states that the parameter is greater than the value claimed in H0;
two-tailed if H1 states that the parameter is different from (or not equal to) the value claimed in H0.
Note: H1 determines the type of test we’re using.

Hypothesis Tests of µ, Given x Is Normal and σ Is Known
There are calculator functions that handle hypothesis testing, but we will work with them formally beginning with section 9.2. For now, we will just be working with concepts.

Definition
test statistic: $z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$
Here x̄ is the sample mean, µ is the mean claimed in H0, σ is the known population standard deviation, and n is the sample size.

Example 2 (a)
We’ll be testing what we have (x̄ = 105.0) against an established mean µ = 115 to see whether what we have indicates that the sample comes from a population with a mean µ < 115.
H0: µ = 115
H1: µ < 115

Example 2 (b) and (c) deal with essentially the same topic. Refer back to example 2b on page 301. In that problem, we found the probability that the mean of a sample of size 5 would lie between 8 and 12 for an x-distribution whose mean was µ = 10.2 and whose standard deviation was σ = 1.4.

In this problem, we want to find the probability that an x-distribution with mean µ = 115 and standard deviation σ = 12 would produce a sample of size 6 whose mean is less than or equal to x̄ = 105.0.
normalcdf(-1E99,105.0,115,12/√(6))
.0206133484
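The same left-tail area can be checked away from the calculator. The following is a minimal sketch, assuming Python's standard-library `statistics.NormalDist` as a stand-in for `normalcdf`; the variable names are illustrative and do not come from the textbook.

```python
from math import sqrt
from statistics import NormalDist

mu_0 = 115      # mean claimed by H0
sigma = 12      # known population standard deviation
n = 6           # sample size
x_bar = 105.0   # observed sample mean

# Under H0, the sample mean is normal with mean mu_0 and
# standard deviation sigma / sqrt(n).
sampling_dist = NormalDist(mu=mu_0, sigma=sigma / sqrt(n))

# Left-tail area P(sample mean <= 105.0), the analogue of
# normalcdf(-1E99, 105.0, 115, 12/sqrt(6)).
p_value = sampling_dist.cdf(x_bar)

# Standardized test statistic z = (x_bar - mu_0) / (sigma / sqrt(n)).
z = (x_bar - mu_0) / (sigma / sqrt(n))

print(f"z = {z:.4f}, P-value = {p_value:.4f}")
```

The printed P-value should agree with the calculator's .0206133484 to the displayed precision.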

What we have just computed (.0206133484) is called the P-value for a left-tailed test against a null hypothesis of µ = 115 for the case when a size 6 sample produces a sample mean of x̄ = 105.0.

Definition
Assuming H0 is true, the probability that the test statistic will take on values as extreme as or more extreme than the observed test statistic (computed from sample data) is called the P-value of the test. The smaller the P-value computed from the sample data, the stronger the evidence against H0.
The computation of the P-value depends on whether the test is left-tailed (as in example 2c), right-tailed, or two-tailed.
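To make the dependence on the tail concrete, here is a small sketch for a z test statistic. It assumes Python's standard-library `statistics.NormalDist`; the function name `p_value_from_z` and the tail labels are hypothetical and chosen only for illustration.

```python
from statistics import NormalDist


def p_value_from_z(z, tail):
    """P-value for a standard normal test statistic z.

    tail is "left", "right", or "two" and mirrors the form of H1.
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "left":         # H1: parameter < value claimed in H0
        return std_normal.cdf(z)
    if tail == "right":        # H1: parameter > value claimed in H0
        return 1 - std_normal.cdf(z)
    if tail == "two":          # H1: parameter != value claimed in H0
        return 2 * (1 - std_normal.cdf(abs(z)))
    raise ValueError("tail must be 'left', 'right', or 'two'")


# z for Example 2, computed from x_bar = 105.0, mu = 115, sigma = 12, n = 6.
z = (105.0 - 115) / (12 / 6 ** 0.5)
print(p_value_from_z(z, "left"))   # the left-tailed case of example 2c
print(p_value_from_z(z, "two"))    # twice the one-tail area
```

The two-tailed value is simply double the one-tail area because the normal curve is symmetric.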

Refer to the diagram on pages 405–6. If you plan on using the TI-83/84 functions when we get to section 9.2, then the only thing you will need to glean from the diagram is that the computation of the P-value will depend on the nature of the alternate hypothesis.

Notice that we either say that we reject or do not reject H0. We will not use the word ‘accept’ in these problems.

Definitions
The level of significance α is the probability of rejecting H0 when it is true. This is the probability of a type I error.
The probability of making a type II error (not rejecting H0 when it is false) is denoted by the Greek letter β.
Note: We will not be dealing with β in this course.

How to conclude a test using the P-value and the level of significance α
If P-value ≤ α, we reject the null hypothesis and say that the data are statistically significant at the level α.
If P-value > α, we do not reject the null hypothesis.

A handy mnemonic
Since P is an uppercase letter and α is a lowercase letter, we need P to be bigger than α (P > α) in order for H0 to be happy (i.e., we do not reject H0). If P is not bigger than α (P ≤ α), then H0 is unhappy (i.e., we reject H0).
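The decision rule is short enough to write as a function. The sketch below is illustrative only: the name `conclude` is hypothetical, and the α values in the usage lines are chosen to show both outcomes rather than taken from the example.

```python
def conclude(p_value, alpha):
    """Return the test conclusion for a P-value and significance level alpha."""
    if p_value <= alpha:
        # "H0 is unhappy": P is not bigger than alpha.
        return f"Reject H0: the data are statistically significant at level {alpha}."
    # "H0 is happy": P is bigger than alpha.
    return f"Do not reject H0 at level {alpha}."


# P-value from Example 2, rounded; both alpha values are illustrative.
print(conclude(0.0206, 0.05))
print(conclude(0.0206, 0.01))
```

Note that the same P-value can lead to different conclusions at different levels of significance, which is why α should be chosen before the test is run.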

So what’s the deal with α?
α represents the degree to which we are willing to challenge the null hypothesis. If α is small (e.g., 1%), then we don’t want to “rock the boat”, or, in other words, we are desensitized to samples that are only somewhat “off” from the null hypothesis. If α is large (e.g., 10%), then we are more sensitized to samples that are “off” from our null hypothesis.

Two examples can be found in medicine.
When a general property has been well established (e.g., physiological data, treatment success rates), the medical community is typically resistant to accepting any changes in these well-established values just because one data collector has been able to observe different values. When testing a well-established general medical property, the medical community will typically insist on low values for α (e.g., α = 0.01).

However, when data has been collected from a patient, a medical professional might want to be careful not to suggest too strongly that the patient’s data is representative of the general population (e.g., blood sugar level, white blood cell count). When testing the average of several of the patient’s samples against the general population’s parameters, the medical professional might want to choose a higher α so as to be alerted to potential medical risks for the patient.

Basic components of a statistical test
A statistical test can be thought of as a package of five basic ingredients.
1. Null hypothesis H0, alternate hypothesis H1, and preset level of significance α
If the evidence (sample data) against H0 is strong enough, we reject H0 and adopt H1. The level of significance α is the probability of rejecting H0 when it is in fact true.
2. Test statistic and sampling distribution
These are the mathematical tools used to measure compatibility of sample data and the null hypothesis.
3. P-value
This is the probability, computed under the assumption that H0 is true, of obtaining a test statistic from the sampling distribution that is as extreme as or more extreme (as specified by H1) than the sample test statistic computed from the data.
4. Test conclusion
If P-value ≤ α, we reject H0 and say that the data are significant at level α. If P-value > α, we do not reject H0.
5. Interpretation of the test results
Give a simple explanation of your conclusions in the context of the application.

9.2 Testing the Mean µ

Calculator function
The TI-83/84 calculator has a function to perform a hypothesis test when σ is known.
STAT|TESTS|Z-Test…

Example 3
… Do the data indicate that the mean sunspot activity during the Spanish colonial period was higher than 41? Use α = 0.05.
H0: µ = 41
H1: µ > 41
Just as with the interval functions, the test functions give the option of entering raw data or summary statistics. For this example, let’s enter the data into L1.

Example 3
H0: µ = 41
H1: µ > 41
α = 0.05

Z-Test (input screen, with Data and >µ0 selected)
Inpt: Data Stats
µ0:41
σ:35
List:L1
Freq:1
µ: ≠µ0 <µ0 >µ0
Calculate Draw

Z-Test (output screen)
µ>41
z=1.091437547
p=.1375402372 (the P-value)
x̄=47.04
Sx=37.52284637
n=40

Since “big P is greater than little α” (0.1375 > 0.05), then “H0 is happy.” Therefore, we do not reject H0 at the 5% level of significance.
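The calculator's z and p can be reproduced from the summary values on the output screen. This is a sketch, assuming Python's standard-library `statistics.NormalDist`; the function name `z_test_right_tailed` is hypothetical, and the raw sunspot data in L1 is not reproduced here, so the sample mean is copied from the screen.

```python
from math import sqrt
from statistics import NormalDist


def z_test_right_tailed(x_bar, mu_0, sigma, n):
    """Return (z, P-value) for H0: mu = mu_0 against H1: mu > mu_0, sigma known."""
    z = (x_bar - mu_0) / (sigma / sqrt(n))
    p_value = 1 - NormalDist().cdf(z)  # area to the right of z
    return z, p_value


# Values from Example 3: sample mean and n from the output screen,
# mu_0 and sigma from the input screen.
z, p_value = z_test_right_tailed(x_bar=47.04, mu_0=41, sigma=35, n=40)
alpha = 0.05

print(f"z = {z:.6f}, P-value = {p_value:.6f}")
print("reject H0" if p_value <= alpha else "do not reject H0")
```

Both numbers should match the Z-Test output screen to the displayed precision, and the decision line reproduces the "H0 is happy" conclusion.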

Calculator function
The TI-83/84 calculator has a function to perform a hypothesis test when σ is unknown.
STAT|TESTS|T-Test…

Example 4
… Do the data indicate that the mean remission time using the drug 6-mP is different (either way) from 12.5 weeks? Use α = 0.01.
H0: µ = 12.5
H1: µ ≠ 12.5
For this example, let’s enter the data into L2.

Example 4
H0: µ = 12.5
H1: µ ≠ 12.5
α = 0.01

T-Test (input screen, with Data and ≠µ0 selected)
Inpt: Data Stats
µ0:12.5
List:L2
Freq:1
µ: ≠µ0 <µ0 >µ0
Calculate Draw

T-Test (output screen)
µ≠12.5
t=2.105902924
p=.0480466063 (the P-value)
x̄=17.0952381
Sx=9.999523798
n=21

Since “big P is greater than little α” (0.0480 > 0.01), then “H0 is happy.” Therefore, we do not reject H0 at the 1% level of significance.
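The t statistic and two-tailed P-value can also be recomputed from the summary values on the output screen. The sketch below rests on two assumptions: Python's standard library has no Student's t distribution, so the tail area is approximated by integrating the t density with Simpson's rule (a swapped-in numerical technique, not a claim about how the calculator works), and the helper names `t_pdf` and `t_upper_tail` are hypothetical.

```python
from math import gamma, pi, sqrt


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coeff = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)


def t_upper_tail(t, df, steps=2000):
    """Approximate P(T >= t) for t >= 0 via Simpson's rule on [0, t].

    steps must be even.
    """
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    # Half of the distribution lies above 0; subtract the area on [0, t].
    return 0.5 - total * h / 3


# Summary values copied from the Example 4 output screen.
x_bar, s, n, mu_0 = 17.0952381, 9.999523798, 21, 12.5
alpha = 0.01

t = (x_bar - mu_0) / (s / sqrt(n))
p_value = 2 * t_upper_tail(abs(t), df=n - 1)  # two-tailed: H1 is mu != mu_0

print(f"t = {t:.6f}, P-value = {p_value:.6f}")
print("reject H0" if p_value <= alpha else "do not reject H0")
```

The test uses n − 1 = 20 degrees of freedom, and the doubled tail area reflects the "different (either way)" wording of the alternate hypothesis.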

Note
The part called Testing µ Using Critical Regions (Traditional Method) is unnecessary if you are using the TI-83/84 to perform your hypothesis tests. This section just shows a different calculation route used to get the same results as before. In fact, example 5 is just a redo of example 3 using the critical region method.

9.3 Testing a Proportion p

Just as with intervals, our technique for testing a proportion p hinges on both the number of successes and the number of failures exceeding 5.

Calculator function
The TI-83/84 calculator has a function to perform a hypothesis test on a proportion.
STAT|TESTS|1-PropZTest…

Example 6
… Under the old method, it is known that only 30% of the patients who undergo this operation recover their eyesight. … Can we justify the claim that the new method is better than the old one? (Use a 1% level of significance.)
Is p for this sample (using the new method) greater than for the old method?
H0: p = 0.30
H1: p > 0.30

Example 6
H0: p = 0.30
H1: p > 0.30
α = 0.01

1-PropZTest (input screen, with >p0 selected)
p0:0.30
x:88
n:225
prop ≠p0 <p0 >p0
Calculate Draw

1-PropZTest (output screen)
prop>.3
z=2.982311167
p=.0014304744 (the P-value)
p̂=.3911111111
n=225

Since “big P is less than little α” (0.0014 < 0.01), then “H0 is unhappy.” Therefore, we reject H0 at the 1% level of significance, and we choose H1: p > 0.30.
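The proportion test can be reproduced from the three inputs on the calculator screen. This is a sketch, assuming Python's standard-library `statistics.NormalDist`; the variable names are illustrative, and applying the "exceeding 5" condition to the expected counts under H0 (n·p0 and n·(1 − p0)) is an assumption about how that condition is checked.

```python
from math import sqrt
from statistics import NormalDist

p_0 = 0.30    # proportion claimed by H0
x = 88        # number of successes in the sample
n = 225       # sample size
alpha = 0.01

# Assumed form of the normal-approximation condition: expected
# successes and failures under H0 both exceed 5.
assert n * p_0 > 5 and n * (1 - p_0) > 5

p_hat = x / n                                   # sample proportion
z = (p_hat - p_0) / sqrt(p_0 * (1 - p_0) / n)   # standard error uses p_0
p_value = 1 - NormalDist().cdf(z)               # right-tailed: H1 is p > p_0

print(f"p_hat = {p_hat:.6f}, z = {z:.6f}, P-value = {p_value:.6f}")
print("reject H0" if p_value <= alpha else "do not reject H0")
```

The standard error is built from p0 rather than p̂ because the P-value is computed under the assumption that H0 is true.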

We will skip example 7 because it involves using the traditional method of critical regions, which is an unnecessary approach given the output from the calculator.

9.4 Tests Involving Paired Differences (Dependent Samples)

In this section, we will be concerned with paired data samples. Paired data samples are such that every individual in one sample is naturally paired with a unique individual in the other sample. One type of paired data occurs in “before and after” situations, where the same object or item is measured both before and after a treatment.

Nuts & bolts
When we have two samples that are paired, x1 and x2, we will calculate a new set of data d = x1 – x2 which consists of the differences between paired values. Then we will run a t test of the d differences against a mean of µd = 0, indicating that the x1 and x2 means are basically the same.
The alternate hypothesis will be one of:
µd > 0 to test whether the x1 mean is greater than the x2 mean;
µd < 0 to test whether the x1 mean is less than the x2 mean;
µd ≠ 0 to test whether the x1 and x2 means are just different.

Note that we will always use a t test when testing paired differences, and that the null hypothesis will always be H0: µd = 0.
Also note that the assignment of data to x1 and x2 will directly affect our choice of alternate hypothesis. E.g., suppose we have assigned sample A to x1 and sample B to x2, and that we want to test whether B has a greater mean than A. That means that we’re testing whether d = x1 – x2 < 0 on average, or µd < 0. Had we assigned A and B the other way around, we’d be testing whether µd > 0.

Example 9
… Table 9-8 indicates the results for a random sample of nine patients. … From the given data, can we conclude that the counseling sessions reduce anxiety? Use a 0.01 level of significance.
Notice in table 9-8 that the book has already decided to define the difference data d as the speculated higher mean minus the speculated lower mean. This means that we’ll be testing the data against
H0: µd = 0
H1: µd > 0

Example 9
Let’s proceed with this problem as if the last column in table 9-8 had not been calculated for us. Let’s put the scores before counseling into L3 and the scores after counseling into L4.
After our original data has been entered into L3 and L4, let’s move to the header for L5. (You can tell that you are in L5’s header when the header ‘L5’ is highlighted.) Type in the expression L3–L4 and press [ENTER]. You should notice that L5 matches the difference column in table 9-8.

Example 9
Now let’s run a t test on L5.
H0: µd = 0
H1: µd > 0
α = 0.01

T-Test (input screen, with Data and >µ0 selected)
Inpt: Data Stats
µ0:0
List:L5
Freq:1
µ: ≠µ0 <µ0 >µ0
Calculate Draw

T-Test (output screen)
µ>0
t=4.362281022
p=.0012026345 (the P-value)
x̄=33.33333333
Sx=22.92378677
n=9

Since “big P is less than little α” (0.0012 < 0.01), then “H0 is unhappy.” Therefore, we reject H0 at the 1% level of significance, and we choose H1: µd > 0. The data indicate a reduction in anxiety level.
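The paired test can be sketched in two steps: form the differences, then run a one-sample t test on them against µd = 0. Several things here are assumptions: the helper names are hypothetical, the scores in table 9-8 are not reproduced (so the mean and standard deviation of the differences are copied from the output screen), and because Python's standard library has no Student's t distribution, the tail area is approximated with Simpson's rule.

```python
from math import gamma, pi, sqrt


def paired_differences(x1, x2):
    """Return d = x1 - x2 for each pair, like entering L3-L4 into L5."""
    if len(x1) != len(x2):
        raise ValueError("paired samples must have the same length")
    return [a - b for a, b in zip(x1, x2)]


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coeff = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)


def t_upper_tail(t, df, steps=2000):
    """Approximate P(T >= t) for t >= 0 via Simpson's rule (steps even)."""
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


# Summary of the differences d, copied from the Example 9 output screen.
d_bar, s_d, n = 33.33333333, 22.92378677, 9
alpha = 0.01

t = (d_bar - 0) / (s_d / sqrt(n))        # H0: mu_d = 0
p_value = t_upper_tail(t, df=n - 1)      # right-tailed: H1 is mu_d > 0

print(f"t = {t:.6f}, P-value = {p_value:.6f}")
print("reject H0" if p_value <= alpha else "do not reject H0")
```

With the before and after scores held in two lists, `paired_differences(before, after)` plays the role of the L3–L4 expression, and swapping the argument order flips the sign of every difference and therefore the direction of H1.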
