
Week 2 Lecture Notes PSYC2022: Winter 2019

Conduct a significance test to determine if the true mean number of hours of work for Canadians with poor health is different from the usual 40 hours per week.

sjennifer


  1. Week 2 Lecture Notes PSYC2022: Winter 2019

  2. Introduction to Hypothesis Testing (Significance Test) Consider the following problem: Researchers claimed that the true mean number of hours of work for all Canadians with poor health is different from 40 hours (the usual hours of work per week). • The above statement is a hypothesis; more specifically, it is a research hypothesis. A hypothesis is: • a statement about a population. • a prediction that a parameter describing some characteristic of a variable (e.g., the true mean, $\mu$) takes a particular numerical value or falls in a certain range of values. To conduct a significance test: • Researchers (you) use data to summarize the evidence about a hypothesis. • With data, you can compare the point estimates of parameters to the values predicted by the hypothesis.

  3. Example: Hours of Work for Canadians with Poor Health. Significance Test (Hypothesis Testing) for a Quantitative Mean. Researchers claimed that the true mean number of hours of work for all Canadians with poor health is different from 40 hours (the usual hours of work per week). To test their hypothesis, the researchers relied on the results from the Canadian Community Health Survey (CCHS, 2012) for a random sample of 56 Canadians with poor health. • The next slides use R to obtain descriptive statistics and a graphical display describing the distribution of hours of work for this random sample of 56 Canadians with poor health.

  4. Read the Data Hours of Work with Poor Health (CCHS, 2012) in R

  5. Histogram of Hours of Work for Canadians with Poor Health (CCHS, 2012) in R

  6. Install the mosaic and yaml packages in R. Purpose: Obtain Descriptive Statistics

  7. Descriptive Statistics of Hours of Work for Canadians with Poor Health (CCHS, 2012) in R

  8. Descriptive Statistics and Histogram for: Hours of Work for Canadians with Poor Health (CCHS, 2012) Based on the results from the Canadian Community Health Survey (CCHS, 2012) for a random sample of 56 Canadians with poor health, the hours of work had a mean of 33.91 and a standard deviation of 13.45. • The Canadians with poor health in the sample worked 33.91 hours per week, on average; individual values differ from this mean by about 13.45 hours, on average. • The estimated standard error for the mean is: $se = s/\sqrt{n} = 13.45/\sqrt{56} = 1.80$. • The distribution of hours of work for Canadians with poor health is somewhat skewed to the left.
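The lecture does its computing in R, and the raw CCHS data are not reproduced in these notes. As a rough cross-check, the standard error can be recomputed from the two summary numbers on the slide. This is a minimal Python sketch; the variable names are illustrative only.

```python
import math

# Summary statistics quoted on the slide (CCHS 2012 sample).
n = 56       # sample size
s = 13.45    # sample standard deviation, in hours

# Estimated standard error of the sample mean: se = s / sqrt(n).
se = s / math.sqrt(n)
print(f"estimated standard error: {se:.2f}")
```

The unrounded value is slightly below 1.80, which matters a little when the t-score is reported to two decimals.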

  9. Hours of Work for Canadians with Poor Health: Significance Test (Hypothesis Testing) for a Quantitative Mean. Researchers claimed that the true mean number of hours of work for all Canadians with poor health is different from 40 hours (the usual hours of work per week). To test their hypothesis, the researchers relied on the results from the Canadian Community Health Survey (CCHS, 2012) for a random sample of 56 Canadians with poor health. The hours of work for the 56 randomly selected Canadians with poor health had a mean of 33.91 ($\bar{x} = 33.91$) and a standard deviation of 13.45 ($s = 13.45$). The estimated standard error for the mean is $se = s/\sqrt{n} = 13.45/\sqrt{56} = 1.80$. The idea to think about is: • How far from the hypothesized mean value of 40 hours ($\mu_0 = 40$) does the sample mean need to be for the researchers to be able to support their claim? • In other words, how many estimated standard errors (se) does the sample mean need to be away from the hypothesized mean value of 40 hours ($\mu_0 = 40$) so that the researchers could support their claim?

  10. Idea of Hypothesis Testing • All hypothesis tests boil down to the same question: “Is an observed difference (the difference between the observed sample statistic and the hypothesized value) or pattern too large to be attributed to chance?” • We measure “how large” by putting our sample results in the context of a sampling distribution model (e.g., the Normal model or the t distribution).

  11. The Five Steps in Hypothesis Testing Steps in conducting Hypothesis Testing: • State the null and the alternative hypothesis. • Check the necessary assumptions. • Identify the test-statistic. Find the value of the test-statistic. • Find the p-value of the test-statistic. • State (if any) a conclusion.

  12. Step 1 in Hypothesis Testing: State the Null and Alternative Hypothesis Step 1. State the null and the alternative hypothesis for conducting a significance test for a quantitative mean: • The null hypothesis is the current belief: Ho: $\mu = \mu_0$. In our example it has the form Ho: $\mu = 40$. • The alternative hypothesis is what the researcher(s) [you] want to show: Ha: $\mu \neq \mu_0$. In our example it has the form Ha: $\mu \neq 40$. This means we have a two-sided test: either $\mu < 40$ or $\mu > 40$. * The goal here is to provide evidence against Ho (i.e., evidence that supports Ha).

  13. Step 2 in Hypothesis Testing: Check the Necessary Assumptions Step 2. Check the two necessary assumptions for conducting a significance test for a mean: • The sample of cases is randomly selected. • The random sample came from an approximately Normal distribution. In our example: • According to the CCHS (2012) results, the sample of 56 Canadians with poor health is a random sample. • We need to assume that the data came from a population that has a Normal distribution. We can check the normality assumption by graphing a histogram of the data. The distribution of hours of work for Canadians with poor health is somewhat left-skewed. However, recall that the t-distribution works reasonably well when there is a slight departure from normality.

  14. The t distribution (William Gosset) • The density of the t distribution was derived by William Gosset. • Recall: When the population standard deviation $\sigma$ is unknown, its value is estimated by the sample standard deviation $s$; the value of $s$ differs from one random sample to another and with the sample size (n). • The t distribution is bell-shaped and symmetric about its mean of 0. • Its standard deviation is a bit larger than 1, and its value depends on the degrees of freedom, df = n - 1 (one less than the sample size). • The t distribution has a slightly different spread for different values of df. • The t distribution is wider than the standard normal distribution, Z, when the sample size is small. • When df is about 60 or more, the two distributions (Z and t) are nearly identical. • We can think of the t distribution with df = $\infty$ (infinity) as the standard normal distribution, Z, because as n (the sample size) increases, $s$ becomes an increasingly precise estimate of $\sigma$.
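To see how the spread of the t distribution shrinks toward that of Z, one can use the standard result that a t distribution with df > 2 has standard deviation sqrt(df / (df - 2)). The slide does not state this formula, so treat it as a supplementary fact; the function name `t_sd` is illustrative. A minimal Python sketch:

```python
import math

def t_sd(df):
    """Standard deviation of a t distribution; finite only for df > 2."""
    if df <= 2:
        raise ValueError("the t distribution has a finite variance only for df > 2")
    return math.sqrt(df / (df - 2))

# The spread shrinks toward 1 (the standard deviation of Z) as df grows.
for df in (5, 10, 30, 55, 60, 1000):
    print(df, round(t_sd(df), 3))
```

For df = 55, as in the CCHS example, the standard deviation is already within about 2% of 1.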

  15. Step 3 in Hypothesis Testing: Identify the Test-statistic and Find its Value Step 3. Identify the test-statistic for conducting a significance test for a quantitative mean, and find its value: • We need to find the value of the test-statistic, which summarizes how far (how many estimated standard errors, se) the point estimate is from the hypothesized value $\mu_0$. In our example, we want to see how many estimated standard errors (se) $\bar{x}$ = 33.91 is away from $\mu_0$ = 40. • For small sample sizes (n < 60), after checking the assumptions (random sample and normality condition), the sampling distribution of the sample mean is modelled by a t-distribution with mean $\mu$ and estimated standard error $se = s/\sqrt{n}$. This can be written as: $\bar{x} \sim t(\text{mean} = \mu,\ se = s/\sqrt{n})$. Under Ho: $\mu = \mu_0$, the test-statistic is the t-score $t = (\bar{x} - \mu_0)/se$ (t distribution with df = n - 1). In our example (recall n = 56 with $\bar{x}$ = 33.91 and s = 13.45): Assume Ho: $\mu = 40$ is true: $\bar{x} \sim t(\text{mean} = \mu_0 = 40,\ se = 13.45/\sqrt{56} = 1.80)$. The observed test-statistic is the t-score: $t = (33.91 - 40)/1.80 = -3.38$ with df = 56 - 1 = 55.
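The t-score can be recomputed from the summary statistics alone. The following Python sketch (the lecture itself uses R; the variable names here are illustrative) also shows why the hand calculation gives -3.38 while the R output on later slides reports -3.39: the hand calculation rounds se to 1.80 before dividing.

```python
import math

# Summary statistics and hypothesized mean from the slides.
n, xbar, s, mu0 = 56, 33.91, 13.45, 40.0

se = s / math.sqrt(n)                       # estimated standard error
t_rounded_se = (xbar - mu0) / round(se, 2)  # hand calculation with se = 1.80
t_unrounded = (xbar - mu0) / se             # no intermediate rounding
df = n - 1

print(f"t with se rounded to 1.80: {t_rounded_se:.2f}")
print(f"t with unrounded se:       {t_unrounded:.2f}")
print(f"df = {df}")
```

Either way, the sample mean lies about 3.4 estimated standard errors below the hypothesized value of 40.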

  16. Step 4 in Hypothesis Testing: Find the p-value of the Test-statistic Step 4. Find the p-value of the test-statistic when conducting a significance test for a quantitative mean: • The p-value is the probability, computed assuming Ho is true, of getting a result (e.g., a sample mean $\bar{x}$) at least as extreme (e.g., unusual, unlikely, or rare) as the one we observed, that is, a result providing evidence against Ho that is as strong or stronger. • The more extreme t-scores (large in absolute value) are the ones that denote a farther departure of the observed value (e.g., the sample mean $\bar{x}$) from the parameter value ($\mu_0$) in Ho. • In the two-sided test, e.g., Ha: $\mu \neq \mu_0$, the p-value is the two-tailed probability. This is the probability that the sample mean falls at least as far from $\mu_0$ in either direction as the observed value of $\bar{x}$.

  17. More About p-values p-value: • It is a conditional probability. • It is not the probability that Ho (the null hypothesis: current belief) is true. • It is: P(observed statistic value [or one even more extreme] | Ho). We condition on Ho (the null hypothesis) because Ho gives the parameter value that we need in order to find the required probability. • The p-value serves as a measure of the strength of the evidence against the null hypothesis (but it should not serve as a hard and fast rule for decision). • If p-value = 0.03 (for example), all we can say is that, if Ho were true, there would be a 3% chance of observing the statistic value we actually observed (or one even more inconsistent with the null value).

  18. Step 4 in Hypothesis Testing: Find the p-value of the Test-statistic Step 4. Find the p-value of our example’s test-statistic: • p-value = Area below t = -3.38 PLUS Area above t = 3.38, with df = n - 1 = 56 - 1 = 55. Take the absolute value of this t-score: |-3.38| = 3.38. p-value = 2 x Area above t = 3.38 with df = 55. • Note: we cannot find df = 55 in our t-table. In that case, we use a nearby smaller df as the reference distribution (df = 50). • To find the p-value, in the row df = 50 (our reference distribution) search for a t-score close to 3.38. Note that along df = 50, as the t-scores increase the right-tail probability decreases. The last entry we have is t = 3.26 with a right-tail probability of 0.001. This means the area above the t-value of 3.38 is less than 0.001. • Thus, we have: p-value = 2 x (less than 0.001) = less than 0.002. Interpretation of this p-value: If Ho were true, there would be less than a 0.2% chance of observing the statistic value we actually observed or one even more inconsistent with the null value.

  19. Step 4 in Hypothesis Testing: Find the p-value of the Test-statistic. Use Online Applet: https://istats.shinyapps.io/tdist/ Step 4. Find the p-value of a test-statistic using the online applet: p-value = Area below t = -3.38 PLUS Area above t = 3.38, with df = n - 1 = 56 - 1 = 55. Take the absolute value of this t-score: |-3.38| = 3.38. p-value = 2 x Area above t = 3.38 with df = 55. p-value ≈ 2 x 0.0006 = 0.0012 (about 0.0013 when the tail area is not rounded before doubling). Interpretation of this p-value: If Ho were true, there would be about a 0.13% chance of observing the statistic value we actually observed or one even more inconsistent with the null value.
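For readers without access to the applet or a t-table, the two-tailed area can be approximated directly from the t density. The sketch below is Python using only the standard library; the function names and the choice of Simpson's rule are illustrative assumptions, not the method used in the lecture, which relies on the applet and on R.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """P(T <= x), using composite Simpson's rule on [0, |x|] and symmetry about 0."""
    a = abs(x)
    h = a / steps                      # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3               # P(0 <= T <= |x|)
    return 0.5 + area if x >= 0 else 0.5 - area

def two_sided_p(t, df):
    """Two-tailed p-value: 2 x area above |t|."""
    return 2 * (1 - t_cdf(abs(t), df))

p = two_sided_p(-3.38, 55)
print(f"two-sided p-value: {p:.4f}")
```

The result should land close to the applet's answer and well below the table-based bound of 0.002.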

  20. Step 5 in Hypothesis Testing: State a Conclusion (if any) Step 5. Conclusion: • We use the p-value to state the conclusion to our research question. • When Ho is true, the p-value is roughly equally likely to fall anywhere between 0 and 1. • When Ho is false, the p-value tends to be closer to 0 than to 1. • The smaller the p-value, the stronger the evidence we have against Ho, and thus the stronger the evidence in support of Ha (i.e., sufficient evidence to support our claim).

  21. Step 5 in Hypothesis Testing: State a Conclusion (if any) Step 5. Conclusion: • But how small a p-value is small? • We need to choose an $\alpha$-level. • The $\alpha$-level is a boundary value that defines the rejection region. It is the cutoff probability: if Ho is true, results in the rejection region (an observed value, e.g., a sample mean $\bar{x}$, at the boundary or something more extreme) occur with probability $\alpha$. • The $\alpha$-level is also called the significance level. • Often we use an $\alpha$-level of 0.05 (or 0.01 in more serious cases, e.g., criminal trials). • The $\alpha$-level is the same concept as the error probability in the confidence interval.

  22. Step 5 in Hypothesis Testing: State a Conclusion (if any) Step 5. Conclusion: • But how small a p-value is small? • We need to choose an $\alpha$-level: a number such that if: • p-value ≤ $\alpha$-level, we reject Ho; we can conclude Ha (we have evidence to support our claim). We often describe this as a statistically significant result at the specified $\alpha$-level. • p-value > $\alpha$-level, we fail to reject Ho; we cannot conclude Ha (we do not have enough evidence to support our claim; thus Ho is plausible, but we do not accept Ho). We often describe this as a result that is not statistically significant at the specified $\alpha$-level. • The default $\alpha$-level (significance level) is typically $\alpha$ = 0.05 (but it can be a different value based on context; it is usually not higher than 0.10).
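The decision rule on this slide is simple enough to write down as a small function. This is a Python sketch; the function name `decide` and the wording of the returned strings are illustrative, not part of the lecture.

```python
def decide(p_value, alpha=0.05):
    """Apply the rule: reject Ho when p-value <= alpha, otherwise fail to reject."""
    if not 0 <= p_value <= 1:
        raise ValueError("a p-value must lie between 0 and 1")
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie strictly between 0 and 1")
    if p_value <= alpha:
        return "reject Ho: statistically significant at this alpha-level"
    return "fail to reject Ho: not statistically significant at this alpha-level"

print(decide(0.0013))   # the two-sided p-value reported for the CCHS example
print(decide(0.20))     # a hypothetical large p-value, for contrast
```

Note that the second outcome is worded as "fail to reject", never "accept", in line with the slide.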

  23. Step 5 in Hypothesis Testing: State a Conclusion (if any) Step 5. In our example, the conclusion is: p-value < 0.002, which is less than $\alpha$ = 0.05; we reject Ho: $\mu = 40$ and conclude Ha: $\mu \neq 40$. The result is statistically significant at $\alpha$ = 0.05. We have very strong evidence to conclude that the true mean hours of work for Canadians with poor health is different from 40. Make a Directional Conclusion: • Based on the sign of the test-statistic, modify your conclusion. For example, instead of concluding that the true mean differs from some hypothesized value, state that the true mean is: • “less than” the hypothesized value if the sign of the test-statistic is negative; • “more than” the hypothesized value if the sign of the test-statistic is positive. • In our example, the directional conclusion is: We have very strong evidence to conclude that the true mean hours of work for Canadians with poor health is less than 40 (we say less than because the observed t-score is -3.38).

  24. The Five Parts of a Significance Test for a Mean Using an Online Applet: https://istats.shinyapps.io/Inference_mean/

  25. The Five Parts of a Significance Test (Two-Sided Test) for a Quantitative Mean in R Ho: $\mu = 40$; Ha: $\mu \neq 40$; t = -3.39, df = 55, p-value = 0.0013 < $\alpha$ = 0.05. We reject Ho: $\mu = 40$ and conclude Ha: $\mu \neq 40$. We have strong evidence to conclude that the mean hours of work for Canadians with poor health is different from 40. Directional Conclusion: We have strong evidence to conclude that the mean hours of work for Canadians with poor health is less than 40.

  26. Confidence Interval (CI) is Long-Run Proportion Correct • A confidence interval constructed from any particular sample either does or does not contain the population parameter, for example the true mean $\mu$. • If we repeatedly selected random samples of that size and each time constructed a 95% confidence interval, then in the long run about 95% of the intervals (19 out of 20) would contain the population parameter, for example the true mean $\mu$. • On average, a 95% confidence interval fails to contain the population parameter only about 5% of the time (about 1 time out of 20). • Note that different samples have different $\bar{x}$'s (sample means), and so give different confidence intervals. • So our confidence is in the procedure rather than in an individual interval.
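The long-run interpretation can be illustrated with a small simulation. The sketch below is Python and rests on stated assumptions: the population is taken to be Normal with a hypothetical true mean of 40 and standard deviation 13.45, the sample size matches the example (n = 56), and 2.004 is the t-table critical value for df = 55 at 95% confidence. None of these simulation settings come from the lecture.

```python
import math
import random
import statistics

random.seed(2019)            # fixed seed so the run is repeatable

MU, SIGMA = 40.0, 13.45      # hypothetical "true" population values (assumption)
N = 56                       # same sample size as the CCHS example
T_CRIT = 2.004               # t critical value, df = 55, 95% confidence (assumption)
REPS = 4000                  # number of simulated samples

covered = 0
for _ in range(REPS):
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    xbar = statistics.fmean(sample)
    se = statistics.stdev(sample) / math.sqrt(N)
    # Does this sample's 95% CI contain the true mean?
    if xbar - T_CRIT * se <= MU <= xbar + T_CRIT * se:
        covered += 1

coverage = covered / REPS
print(f"proportion of intervals containing the true mean: {coverage:.3f}")
```

Each individual interval either contains 40 or it does not; only the proportion across many repetitions is close to 0.95.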

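  27. Idea of CI for a Population Mean • Once a sample is selected, if $\bar{x}$ does fall within 1.96 standard errors ($1.96\,\sigma_{\bar{x}}$) of $\mu$, then the interval from $\bar{x} - 1.96\,\sigma_{\bar{x}}$ to $\bar{x} + 1.96\,\sigma_{\bar{x}}$ contains $\mu$. • So, with probability 0.95, a value of $\bar{x}$ occurs such that the interval $\bar{x} \pm 1.96\,\sigma_{\bar{x}}$ contains the population parameter $\mu$. • On the other hand, the probability is 0.05 [1 - confidence level (0.95) = 0.05] that $\bar{x}$ does not fall within $1.96\,\sigma_{\bar{x}}$ of $\mu$. In that case, the interval from $\bar{x} - 1.96\,\sigma_{\bar{x}}$ to $\bar{x} + 1.96\,\sigma_{\bar{x}}$ does not contain $\mu$. • 0.05 is called the error probability and it is denoted by $\alpha$ (the Greek letter alpha).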

  28. Equivalence Between Confidence Intervals and Test Decisions (Two-sided Test) Significance test: Ho: $\mu = 40$; Ha: $\mu \neq 40$; t = -3.39, df = 55, p-value < $\alpha$ = 0.05. We reject Ho: $\mu = 40$ and conclude Ha: $\mu \neq 40$. We have strong evidence to conclude that the mean hours of work for Canadians with poor health is different from 40. Directional Conclusion: We have strong evidence to conclude that the mean hours of work for Canadians with poor health is less than 40. Confidence interval: 95% CI for $\mu$: (30.31, 37.51). The 95% CI does not contain the value 40; thus, we reject Ho: $\mu = 40$ at $\alpha$ = 0.05. This is the same conclusion as the significance test. Furthermore, all values in the 95% CI are below the hypothesized value of 40. This suggests that the true mean hours of work per week for Canadians with poor health is less than 40 hours (between 30.31 and 37.51).
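The reported interval can be reproduced by hand from the summary statistics. The worked calculation below assumes the t-table critical value $t_{0.025,\,55} \approx 2.004$, which the slides do not state, and keeps an extra decimal in the standard error so that the endpoints match the R output.

```latex
\bar{x} \pm t_{0.025,\,55}\cdot\frac{s}{\sqrt{n}}
  = 33.91 \pm 2.004 \times \frac{13.45}{\sqrt{56}}
  \approx 33.91 \pm 2.004 \times 1.797
  \approx 33.91 \pm 3.60
  = (30.31,\ 37.51)
```

Because 40 lies more than 3.60 hours above the sample mean, it falls outside the interval, which is the same statement as |t| exceeding 2.004 in the two-sided test at $\alpha$ = 0.05.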

  29. The Five Parts of a Significance Test (One-Sided Test) for a Quantitative Mean in R Ho: $\mu = 40$; Ha: $\mu < 40$; t = -3.39, df = 55, p-value = 0.0006 (the p-value is one-tailed; it is 0.0013/2 from the two-sided test). p-value < $\alpha$ = 0.05. We reject Ho: $\mu = 40$ and conclude Ha: $\mu < 40$. We have strong evidence to conclude that the mean hours of work for Canadians with poor health is less than 40.
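The halving step relies on the symmetry of the t distribution. Writing $T_{55}$ for a t-distributed variable with 55 degrees of freedom (notation introduced here for illustration), the relation is:

```latex
p_{\text{one-sided}}
  = P(T_{55} \le -3.39)
  = \tfrac{1}{2}\,P\bigl(|T_{55}| \ge 3.39\bigr)
  = \tfrac{1}{2}\,(0.0013)
  \approx 0.0006
```

Halving is valid here only because the observed t-score is negative, which is the direction stated in Ha: $\mu < 40$. Had the observed t-score been positive, the one-sided p-value for this Ha would instead be 1 minus half of the two-sided p-value.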

  30. Decision Errors in Tests • When Ho is true, a Type I error occurs if Ho is rejected. The probability of making a Type I error is denoted by $\alpha$. • When Ho is false, a Type II error occurs if Ho is not rejected. The probability of making a Type II error is denoted by $\beta$.

  31. What type of error could we be making in our CCHS (2012) example? Researchers claimed that the true mean number of hours of work for all Canadians with poor health is less than 40 hours (the usual hours of work per week). To test their hypothesis, the researchers relied on the statistics obtained from the Canadian Community Health Survey (2012) for a random sample of 56 Canadians with poor health. Ho: $\mu = 40$; Ha: $\mu < 40$; t = -3.39, df = 55, p-value < 0.05. We reject Ho: $\mu = 40$ and conclude Ha: $\mu < 40$. We have strong evidence to conclude that the mean hours of work for Canadians with poor health is less than 40. This means we could be making a Type I error. We decided that the true mean is less than 40 hours based on our data (as evidence against Ho: $\mu = 40$); however, it could be that the hypothesized mean value of 40 is the true value (i.e., Ho: $\mu = 40$ could be true).
