Psychology 10

Psychology 10. Analysis of Psychological Data March 10, 2014. The plan for today. A little more on power. Review of the formal steps of hypothesis testing. The chi square test for goodness of fit. The chi square test for contingency tables. In class exercises (from last class).

Presentation Transcript


  1. Psychology 10 Analysis of Psychological Data March 10, 2014

  2. The plan for today • A little more on power. • Review of the formal steps of hypothesis testing. • The chi square test for goodness of fit. • The chi square test for contingency tables.

  3. In class exercises (from last class) • What would happen to power if we used a one-tailed test? • What would happen to power if we used an alpha level of .01 (still one tailed)? • What would happen to power if our standard deviation were only 10? • What would happen to power if our sample size increased to 100?

  4. Original power calculation • Sigma = 15, N = 25, so standard error = 3. • Critical means = 100 ± 1.96 × 3 = (94.12, 105.88). • Area above 105.88? • Z = (105.88 – 105) / 3 = 0.2933. • Area above .29 = .3859. • Other tail: (94.12 – 105) / 3 = -3.627. • Area below -3.63 = .0002. • So power is .3859 + .0002 = .3861.
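The two-tailed calculation above can be checked numerically. The following is a minimal sketch, not part of the original slides. It assumes a null mean of 100 and a true mean of 105, which are the values used in the slide's arithmetic, and the variable names are made up for illustration. Because it uses the exact normal CDF instead of a table rounded to two decimal places, the result differs slightly from .3861.

```python
from math import sqrt
from statistics import NormalDist

# Values taken from the slide; variable names are illustrative.
mu_null, mu_true = 100, 105
sigma, n = 15, 25
z_crit = 1.96                      # two-tailed critical z for alpha = .05

se = sigma / sqrt(n)               # standard error of the mean: 3
lower = mu_null - z_crit * se      # 94.12
upper = mu_null + z_crit * se      # 105.88

# Sampling distribution of the mean when the true mean is 105.
alt = NormalDist(mu=mu_true, sigma=se)
power = (1 - alt.cdf(upper)) + alt.cdf(lower)

print(f"critical means: ({lower:.2f}, {upper:.2f})")
print(f"power: {power:.4f}")
```

The small gap between this value and the slide's .3861 comes from rounding Z to .29 before the table lookup.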

  5. Power with one-tailed test • Critical mean is 1.645 × 3 above the null mean of 100: 104.935. • Z = (104.935 – 105) / 3 = -.02167. • Area above -.02 = .508. • Note that power has improved.

  6. Power with standard deviation reduced to 10 • If sigma were only 10, the standard error would be 2 instead of 3. • Critical mean: 100 + 1.645 × 2 = 103.29. • Z = (103.29 – 105) / 2 = -0.855. • Area above -0.86 = .8051. • Power is much higher.

  7. Power with N increased to 100 • If sigma = 10 and N = 100, the standard error of the mean is 1. • Critical mean = 100 + 1.645 × 1 = 101.645. • Z = (101.645 – 105) / 1 = -3.355. • Area above -3.36 = .999. • Power is much higher.
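The one-tailed variations above all follow the same recipe, so they can be wrapped in one function. This is a sketch under stated assumptions: `one_tailed_power` is a hypothetical helper name, the null and true means (100 and 105) are taken from the slides' arithmetic, and the critical z comes from the exact inverse normal CDF, so results differ from the table-based slide values in the third decimal place. The alpha = .01 scenario is the exercise the slides ask about but do not work through; its value here comes from this sketch, not from the source.

```python
from math import sqrt
from statistics import NormalDist

def one_tailed_power(mu_null, mu_true, sigma, n, alpha=0.05):
    """Power of an upper-tailed z test of the mean (hypothetical helper)."""
    se = sigma / sqrt(n)
    z_crit = NormalDist().inv_cdf(1 - alpha)     # about 1.645 for alpha = .05
    critical_mean = mu_null + z_crit * se
    # Probability that the sample mean exceeds the critical mean
    # when the true mean is mu_true.
    return 1 - NormalDist(mu=mu_true, sigma=se).cdf(critical_mean)

scenarios = {
    "one-tailed, sigma 15, N 25": one_tailed_power(100, 105, 15, 25),
    "alpha .01, sigma 15, N 25": one_tailed_power(100, 105, 15, 25, alpha=0.01),
    "sigma 10, N 25": one_tailed_power(100, 105, 10, 25),
    "sigma 10, N 100": one_tailed_power(100, 105, 10, 100),
}
for label, power in scenarios.items():
    print(f"{label}: {power:.3f}")
```

The stricter alpha level is the only change in the list that lowers power, because it pushes the critical mean further from the null mean.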

  8. Things that affect power • Things that affect power fall into two broad classes: • Things that reduce the critical value; • Things that increase the value of the test statistic.

  9. Things that affect power (cont.) • The critical value will decrease if: • You convince yourself that a one-tailed test is appropriate; • You use a less stringent alpha level. • The test statistic will increase if: • The effect is larger; • The variability is reduced; • The sample size increases.

  10. Formal steps of hypothesis testing • Identify an interest, and state the research and null hypotheses. • Identify a relevant test statistic and its sampling distribution under the null hypothesis. • Set the alpha level. • Collect data and make a decision about the null hypothesis. • Evaluate assumptions.

  11. A new hypothesis test • We are going to work with some examples of a very different kind of hypothesis test. • Notice that we will still follow the same steps.

  12. Testing goodness of fit • The Chair of a psychology department suspects that some sections of a class are more popular with students than others. • There are three sections offered at the same time by three different professors. • The chair plans to examine enrollment in these sections to determine if there are differences in popularity.

  13. The research and null hypotheses • The chair’s interest is in the possibility that some sections are more popular than others. • H1: There are differences in the proportions of students who will enroll in each section. • H0: There are no differences in the proportions of students who will enroll in each section.

  14. Test statistic and sampling distribution • The test statistic is the chi-square statistic, χ² = Σ (O – E)² / E, where O is the observed count and E is the expected count if the null hypothesis is true, summed over all cells. • If the null hypothesis is true, the chi-square statistic has a chi-square distribution, with degrees of freedom equal to the number of cells minus one.

  15. Are we ready to gather data? • We must set the alpha level. • Let’s set α = .05. • Here are the data: the three sections enrolled 32, 25, and 10 students.

  16. Expected counts • There are 32 + 25 + 10 = 67 students. • If the null hypothesis were true, 1/3 of those students would enroll in each section of the class. • Hence, the expected count for each instructor is 67/3 = 22 1/3.

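  17. Calculating the statistic • χ² = (32 – 22.33)² / 22.33 + (25 – 22.33)² / 22.33 + (10 – 22.33)² / 22.33 = 4.18 + 0.32 + 6.81 = 11.31.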

  18. Evaluating the statistic • Does that meet our definition of “unusual?” • Degrees of freedom = 2. • Why? Because if you know that there are 67 students and that the first two cells have 32 and 25 students, the third cell has no further information. It must have 10 students. It is no longer “free” to vary. • Using table B.8.

  19. Evaluating the statistic (cont.) • From the table, our critical value is 5.99. • The value we obtained was 11.31. • 11.31 > 5.99, so we reject the null hypothesis. • This allows us to conclude that there are indeed differences in popularity.
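The goodness-of-fit test for the enrollment example can be reproduced in a few lines. This is a minimal sketch rather than the course's own code: `chi_square_gof` is a hypothetical helper name, the observed counts and the critical value of 5.99 are taken from the slides, and equal expected counts reflect the null hypothesis of no difference between sections.

```python
def chi_square_gof(observed, expected):
    """Sum of (O - E)^2 / E over all cells."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [32, 25, 10]                      # enrollments from the slides
total = sum(observed)                        # 67
expected = [total / len(observed)] * len(observed)   # 22.33 per section under H0

chi2 = chi_square_gof(observed, expected)
df = len(observed) - 1
critical = 5.99                              # alpha = .05, df = 2, from the slide

print(f"chi-square = {chi2:.2f}, df = {df}")
print("reject H0" if chi2 > critical else "retain H0")
```

Every expected count here is well above 5, so the sample-size assumption listed on the next slide is met.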

  20. Assumptions of the chi-square test • Observations must be independent. • There should be no cells with expected frequencies less than 5.

  21. Chi-square test for independence • A group of researchers were interested in the effect of smoking on mortality. • They identified a group of women in Britain in 1972, and classified them as smokers or nonsmokers. • In 1995, they tracked down the same women, and observed whether they were alive or dead.

  22. Hypotheses • The research hypothesis here is that the mortality rate will differ for smokers and nonsmokers. • The null hypothesis is that the mortality rate is the same for both populations. • Let’s set our alpha level at .01.

  23. The smoking data

  24. Expected counts • 369 out of 1314 people died. That’s a proportion of .2808219178. • 945 out of 1314 people lived. That’s a proportion of .7191780822. • If smoking makes no difference, we would expect .2808219178 of the 582 smokers to die; that’s 163.4384. • We would expect the remaining 418.5616 smokers to survive.

  25. Expected counts (cont.) • If smoking makes no difference, we would expect .2808219178 of the 732 nonsmokers to die; that’s 205.5616. • We would expect the remaining 526.4384 nonsmokers to survive.
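The expected counts on the last two slides all come from one rule: row total times column total, divided by the grand total. The sketch below applies that rule to the marginal totals quoted on the slides. The dictionary layout and names are illustrative choices, not part of the original material.

```python
# Marginal totals from the slides.
row_totals = {"smokers": 582, "nonsmokers": 732}
col_totals = {"dead": 369, "alive": 945}
n = sum(row_totals.values())                 # 1314

# Expected count under independence: row total * column total / N.
expected = {
    (group, outcome): r * c / n
    for group, r in row_totals.items()
    for outcome, c in col_totals.items()
}

for cell, value in expected.items():
    print(cell, round(value, 4))
```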

  26. The smoking data

  27. Calculating the chi-square statistic

  28. Evaluating the chi-square statistic • Degrees of freedom = (number of columns – 1) × (number of rows – 1). • Here, that’s (2 – 1) × (2 – 1) = 1. • From the table, the critical value when we have one degree of freedom and alpha of .01 is 6.63. • We reject the null hypothesis and conclude that smoking and mortality are related.
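The test of independence can be sketched as follows. One important assumption: the observed cell counts did not survive in this transcript, so the counts below are the commonly reproduced figures for this study, chosen because they match the totals quoted on the slides (582 smokers, 732 nonsmokers, 369 dead, 945 alive). Treat them, and the resulting statistic, as illustrative rather than as the values shown in class.

```python
# ASSUMPTION: the observed cell counts are not in this transcript. The values
# below are the commonly reproduced counts for this study; they agree with the
# slide totals (582 smokers, 732 nonsmokers, 369 dead, 945 alive).
observed = [[139, 443],    # smokers:    dead, alive
            [230, 502]]    # nonsmokers: dead, alive

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

chi2 = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / n    # expected count under H0
        chi2 += (o - e) ** 2 / e

df = (len(observed) - 1) * (len(observed[0]) - 1)
critical = 6.63                                  # alpha = .01, df = 1, from the slide
print(f"chi-square = {chi2:.2f}, df = {df}, reject H0: {chi2 > critical}")
```

Under these assumed counts, fewer smokers died than the independence model expects, which is the direction of the effect the following slides go on to question.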

  29. But wait a minute…

  30. Simpson’s paradox • We got a significant result because smokers are dying less frequently than expected, and nonsmokers are dying more frequently. • So we should all start smoking in order to live long, healthy lives, right? • An important third variable was not considered in our analysis.

  31. Smoking data paradox • In 1972, smoking for women was a relatively new and radical thing in Britain. • Accordingly, a disproportionate number of the smokers in the study were young… • …and a disproportionate number of nonsmokers in the study were old. • Guess who was more likely to have died 23 years later?

  32. Next time • Inference about the mean when sigma is not known.

  33. In class exercise • Benford’s law: for many numbers, the relative frequency of the first significant figure is approximately log10(1 + 1/x), where x is the value of the digit. • That results in the following predicted proportions for the digits 1 through 9: .301, .176, .125, .097, .079, .067, .058, .051, .046
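The predicted proportions follow directly from the formula. A short sketch, using only the standard library, that evaluates log10(1 + 1/d) for each leading digit d:

```python
from math import log10

# Benford's law: P(first digit = d) = log10(1 + 1/d) for d = 1..9.
benford = {d: log10(1 + 1 / d) for d in range(1, 10)}

for d, p in benford.items():
    print(d, f"{p:.3f}")
```

The nine proportions sum to 1 because the terms telescope: the sum of log10((d + 1)/d) for d = 1 to 9 is log10(10).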

  34. Example: government jobs data • The Bureau of Labor Statistics publishes information about the number of persons employed in various occupations, by state. • Intuitively, it seems as if each digit should occur with roughly equal frequency. • However, this might be a variable that follows Benford’s law.

  35. Take a stand • How many people would expect roughly equal frequency for each digit? • How many people think this data set will follow Benford’s law? • Let’s set our alpha level.

  36. The data • New York, May 2006, N = 694.

  37. Expected counts
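The expected counts for both hypotheses follow from N = 694. The sketch below computes them under Benford's law and under equal frequencies. The observed digit counts are not included in this transcript, so the `chi_square` helper (a hypothetical name) is defined but not applied to the class data; a reader with the counts could pass them in.

```python
from math import log10

n = 694                                   # sample size from the slide
digits = range(1, 10)

# Expected counts under each hypothesis: proportion * N.
benford_expected = [n * log10(1 + 1 / d) for d in digits]
uniform_expected = [n / 9 for _ in digits]

def chi_square(observed, expected):
    """Sum of (O - E)^2 / E; observed counts must be supplied by the reader."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

for d, b, u in zip(digits, benford_expected, uniform_expected):
    print(d, round(b, 1), round(u, 1))
```

With nine digit categories, either test has 9 – 1 = 8 degrees of freedom.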

  38. Answers: • The chi-square for the Benford’s law fit is 10.605. • The chi-square for the Uniform fit is 423.323.
