Psychology 10 Analysis of Psychological Data March 5, 2014
The plan for today • Two more examples of hypothesis tests. • The importance of assumptions in hypothesis testing. • Effect sizes. • Power.
Another example • I am interested in the idea that a particular way of treating depression is effective. • I have identified a test of depression that has a mean of 100 and a standard deviation of 10 in the population of depressed persons. • I plan to administer my treatment to a sample of 25 depressed clients.
Depression example (cont.) • Research hypothesis: • H1: μ ≠ 100. • Null hypothesis: • H0: μ = 100. • Alpha level: • Suppose the treatment is expensive and invasive, so we want to be really confident. • α = .01.
Depression example (cont.) • Calculating the statistic: • My 25 scores add up to 2625. • M = 2625 / 25 = 105. • σM = 10 / √25 = 2.0. • Z = (105 – 100) / 2.0 = 2.5. • Evaluating the statistic: • Critical Z = ±2.575 (two-tailed, α = .01). • Decision: because 2.5 is smaller than 2.575, we fail to reject the null hypothesis. • Conclusion: • We have not found convincing evidence that the treatment is effective.
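The steps of the depression example can be sketched in Python. This is only an illustration: the function name `z_test` and its arguments are made up for this sketch, and the critical value is taken from the standard library's normal distribution rather than from a printed table, so it comes out as about 2.576 instead of the rounded 2.575.

```python
from math import sqrt
from statistics import NormalDist


def z_test(sample_mean, mu0, sigma, n, alpha):
    """Two-tailed one-sample Z test (hypothetical helper for illustration).

    Returns the Z statistic, the positive critical value, and the decision.
    """
    se = sigma / sqrt(n)                          # standard error of the mean
    z = (sample_mean - mu0) / se                  # test statistic
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical value
    return z, z_crit, abs(z) > z_crit


# Depression example: 25 scores summing to 2625, mu0 = 100, sigma = 10, alpha = .01
z, z_crit, reject = z_test(2625 / 25, 100, 10, 25, 0.01)
print(f"Z = {z:.2f}, critical Z = {z_crit:.3f}, reject H0: {reject}")
```

With these numbers the statistic falls just short of the critical value, which matches the decision on the slide.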
Another example of hypothesis testing • A professor has been using the same final exam for his introductory statistics class for years, and knows that historically the mean has been 82.3, and the standard deviation has been 7.9. • He believes that the performance of his class this year is worse than previous years, and decides to test that hypothesis.
Another example (cont.) • What are the research and null hypotheses? • What decision do we need to make before we proceed with data collection? • Let’s set the two-tailed alpha level at .05.
Another example (cont.) • We gather data. N = 76, M = 79.7. • How large must our Z statistic be for us to reject the null hypothesis? • Zcritical = ±1.96. • SE = 7.9 / √76 ≈ 0.906. • Z = (79.7 – 82.3) / 0.906 ≈ -2.87. • What do we conclude? • Because |Z| = 2.87 is larger than 1.96, we reject the null hypothesis.
Interpretation • Because we rejected the null hypothesis, we have found evidence that the mean of the population represented by this class differs from the historical population mean. • Specifically, the current mean is lower than the historical mean.
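The exam example can be checked numerically as well. The sketch below recomputes the standard error and Z from the slide's numbers, and additionally reports a two-tailed p-value. The p-value is not part of the slides; it is included here only as another way of seeing how far the statistic lies beyond the critical value.

```python
from math import sqrt
from statistics import NormalDist

# Numbers from the exam example
mu0, sigma = 82.3, 7.9        # historical mean and standard deviation
n, sample_mean = 76, 79.7     # this year's class

se = sigma / sqrt(n)                          # standard error of the mean
z = (sample_mean - mu0) / se                  # Z statistic
p_two_tailed = 2 * NormalDist().cdf(-abs(z))  # area in both tails beyond |Z|

print(f"SE = {se:.3f}, Z = {z:.2f}, two-tailed p = {p_two_tailed:.4f}")
```

The negative sign of Z is what tells us the direction of the difference: the current mean is below the historical mean.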
Assumptions of the Z test • Whenever we conduct a hypothesis test, at some stage in the process we identify a statistic and its sampling distribution. • The null hypothesis is part of what allows us to specify the sampling distribution. • Some set of assumptions is also always necessary.
Assumptions (cont.) • Violation of assumptions always represents an alternative explanation for why we got an unusual test statistic. • Whenever we learn about a new statistical testing procedure, we must also learn its assumptions.
Assumptions of the Z test • In decreasing order of importance, the assumptions of the Z test are: • The individual observations are independent of one another. • The population standard deviation is known. • The population distribution is normal or the sample is sufficiently large that the sampling distribution of the mean is normal.
Evaluating the assumptions • Independence must be evaluated by considering the procedure that generated the data. • (The population standard deviation will rarely, if ever, actually be known.) • Normality may be evaluated by considering (graphically) whether the sample looks like it came from a normal population.
Importance of the assumptions • Independence is absolutely essential; without independence, the test is meaningless. • Knowledge of the population standard deviation is essential to the validity of this test. • Normality is somewhat less important. As the sample size becomes large, normality becomes unimportant.
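To see why independence matters so much, consider a deliberately artificial scenario that is not from the lecture: every independent score is accidentally entered into the data set several times, but the Z test is run as if every entry were independent. Under that assumption the true Type I error rate can be worked out exactly; the function name `type1_rate_with_duplicates` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def type1_rate_with_duplicates(copies, alpha=0.05):
    """True two-tailed Type I error rate when each independent score appears
    `copies` times but the Z test treats all entries as independent.

    Illustrative assumption only: duplication is one simple form of dependence.
    """
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # The mean of the duplicated data equals the mean of the independent scores,
    # so its true standard error is sqrt(copies) times the one the test assumes.
    # Under H0 the computed Z therefore has standard deviation sqrt(copies).
    return 2 * NormalDist().cdf(-z_crit / sqrt(copies))


for copies in (1, 2, 4):
    print(copies, round(type1_rate_with_duplicates(copies), 3))
```

With no duplication the rate equals the nominal alpha; as soon as scores are duplicated, the test rejects a true null hypothesis far more often than alpha promises.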
Effect sizes • It is important to remember that statistical significance is not the same thing as practical significance. • For example, with a large enough sample size, we might be able to demonstrate that an educational intervention can produce a 1/10 point improvement on a California STAR test. • We probably would not find that very interesting.
Effect sizes (cont.) • Hypothesis testing addresses the question of whether there is an effect. • We must still address the question of how large the effect is.
Intrinsically meaningful metrics • When the metric of a variable is intrinsically meaningful, the effect is expressed in that metric. • Examples: • Number of cigarettes smoked. • Pounds of body weight lost. • Miles per gallon.
Arbitrary metrics • When the metric lacks intrinsic meaning, we standardize. • In our example, d = (79.7 – 82.3) / 7.9 = -0.33. • This tells us that our current mean is about a third of a standard deviation below the historical value.
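The standardized effect size from the exam example is a one-line calculation. In the sketch below the function name `standardized_effect` is made up for illustration. Note that, unlike the Z statistic, the result does not depend on the sample size.

```python
def standardized_effect(sample_mean, mu0, sigma):
    """Mean difference expressed in population standard deviation units."""
    return (sample_mean - mu0) / sigma


# Exam example: M = 79.7, historical mean 82.3, historical sd 7.9
d = standardized_effect(79.7, 82.3, 7.9)
print(f"d = {d:.2f}")
```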
Power (cont.) • When we set our alpha level, we are directly controlling the probability that we will commit a Type I error (assuming the null hypothesis is true). • Unfortunately, we cannot set our beta level. • Power is defined as 1 – β. • Power is the probability that we will avoid a Type II error, given that the null hypothesis is false in a particular way.
Power calculations • Suppose we are planning a test of our IQ-improving school (see PowerPoint from last time). • Suppose, further, that we believe the school causes about a 5 point increase in IQ. • We are planning an investigation with N = 25, and want to know what our power will be.
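Power calculations (cont.) • What is the power? • Assuming the usual IQ standard deviation of 15 and a two-tailed alpha of .05, the standard error is 15 / √25 = 3, so we reject in the upper tail when M exceeds 100 + 1.96 × 3 = 105.88. • What is the probability that a single draw from a normal distribution with mean 105 and sd 3 will be greater than 105.88? • Z = (105.88 – 105) / 3 = .29333. • The area above .29333 in a standard normal distribution is .3846.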
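The power calculation can be reproduced step by step. The sketch below assumes a null mean of 100, a population standard deviation of 15, and a two-tailed alpha of .05; those values are not stated on these slides but are the ones implied by the sd of 3 and the cutoff of 105.88.

```python
from math import sqrt
from statistics import NormalDist

mu0, mu1 = 100, 105    # null mean and believed true mean (5 point increase)
sigma, n = 15, 25      # assumed IQ sd of 15; planned sample size

se = sigma / sqrt(n)               # standard error of the mean: 3.0
critical_mean = mu0 + 1.96 * se    # smallest sample mean that leads to rejection

# Power: probability that a sample mean drawn from the true sampling
# distribution (mean 105, sd 3) lands above the critical mean.
power = 1 - NormalDist(mu1, se).cdf(critical_mean)

print(f"critical mean = {critical_mean:.2f}, power = {power:.4f}")
```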
Power calculations (cont.) • That’s not very good. • That means that about 62% of the time, we would fail to find an effect. • How can we change things?
Things that affect power • The magnitude of the true effect affects power. • The alpha level affects power. • The choice of a one- or two-tailed test affects power. • The standard deviation affects power. • The sample size affects power.
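Each factor in this list appears as an argument in the rough sketch below. The function name `power_z` and its signature are hypothetical. As in the hand calculation, only rejections in the direction of the true effect are counted, which ignores the negligible chance of rejecting in the wrong tail of a two-tailed test.

```python
from math import sqrt
from statistics import NormalDist


def power_z(effect, sigma, n, alpha=0.05, tails=2):
    """Approximate power of a one-sample Z test (illustrative sketch).

    effect: true mean minus null mean, assumed positive
    sigma:  population standard deviation
    n:      sample size
    tails:  1 for a one-tailed test, 2 for a two-tailed test
    """
    se = sigma / sqrt(n)
    z_crit = NormalDist().inv_cdf(1 - alpha / tails)
    # The test statistic is centered at effect / se when the effect is real,
    # so power is the area of that distribution above the critical value.
    return 1 - NormalDist().cdf(z_crit - effect / se)


# Baseline from the lecture: 5 point effect, sd 15, N = 25, two-tailed alpha .05
print(round(power_z(5, 15, 25), 4))
```

Changing one argument at a time while holding the others at the baseline shows the direction in which each factor moves power.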
In-class exercises • What would happen to power if we used a one-tailed test? • What would happen to power if we used an alpha level of .01 (still one-tailed)? • What would happen to power if our standard deviation were only 10? • What would happen to power if our sample size increased to 100?