Psychology 10 Analysis of Psychological Data March 12, 2014
The plan for today • What do we do if we don’t know the population standard deviation? • Introducing the t test.
An example of the problem • We have scored four homework assignments so far in this class. • I am interested in whether the population mean of scores is greater than 7.00 (the traditional cut for scores in the C range). • We are viewing this class as a sample from a population of possible Psych 10 students.
Hypothesis testing • What is my research hypothesis? • H1: μ ≠ 7.0. • What is my null hypothesis? • H0: μ = 7.0. • Am I ready to examine the data? • α = .05. • What else do we need? • A test statistic with a known sampling distribution.
The t test • The t test is calculated exactly like the Z test… • …except that the sample standard deviation replaces sigma. • Under the null hypothesis, the t statistic has a t distribution. • The t distribution differs for different degrees of freedom.
The t distribution and df • When df is small, the t distribution has much heavier tails than the normal distribution. • As df increases, the tails grow closer to those of the normal distribution. • When df = ∞, the t distribution is the same as the normal distribution.
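To see the heavier tails numerically, here is a minimal sketch that compares the two-tailed area beyond ±2 under the t distribution with the same area under the normal distribution. The helper names (`t_pdf`, `t_two_tailed_area`), the cutoff of 2, and the use of Simpson's rule are illustrative choices, not part of the lecture.

```python
import math
from statistics import NormalDist

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_two_tailed_area(cutoff, df, steps=2000):
    """P(|T| > cutoff): 1 minus the central area, integrated by Simpson's rule.

    steps must be even for Simpson's rule.
    """
    h = cutoff / steps
    total = t_pdf(0.0, df) + t_pdf(cutoff, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_half = total * h / 3      # area between 0 and cutoff
    return 1.0 - 2.0 * central_half   # both tails together

normal_area = 2 * (1 - NormalDist().cdf(2.0))
for df in (3, 10, 30, 118):
    print(f"df = {df:3d}: area beyond ±2 = {t_two_tailed_area(2.0, df):.4f}")
print(f"normal  : area beyond ±2 = {normal_area:.4f}")
```

The tail area should shrink toward the normal value as df grows, which is the pattern described above.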
What is df ? • For the t distribution, df will be the amount of information we had when we estimated the variance or standard deviation. • For this one-sample t test, df is N – 1. • In order to estimate the standard deviation, we first needed to estimate the mean. • That used up a piece of information.
Understanding df • Why does estimating the mean use information? • If I have a data set consisting of 5 numbers, and I tell you that the mean is 3 and that the first 4 numbers are 1, 2, 3, 4… • …you can tell me what the fifth number must be. It is no longer free to vary.
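The same arithmetic as a short sketch (variable names are illustrative): once the mean and four of the five values are fixed, the fifth value is determined.

```python
known = [1, 2, 3, 4]   # the four values we were told
n = 5                  # size of the data set
mean = 3               # the stated mean

# The total must equal n * mean, so the last value is whatever is left over.
fifth = n * mean - sum(known)
print(fifth)  # 5
```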
Identifying critical values of t • Table B.2 lists critical values of the t distribution for various degrees of freedom. • If the df you need is not tabled, use the next lower value.
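The "next lower value" rule can be sketched as a table lookup. The dictionary below is only a small excerpt of standard two-tailed α = .05 critical values, assumed to match what Table B.2 reports; the function name `critical_t` is hypothetical.

```python
# Excerpt of standard two-tailed critical values for alpha = .05
# (assumed to agree with the textbook table; not the full table).
T_CRIT_05_TWO_TAILED = {
    1: 12.706, 2: 4.303, 5: 2.571, 10: 2.228, 20: 2.086,
    30: 2.042, 40: 2.021, 60: 2.000, 120: 1.980,
}

def critical_t(df, table=T_CRIT_05_TWO_TAILED):
    """Return the critical value for the largest tabled df not exceeding df."""
    usable = [d for d in table if d <= df]
    if not usable:
        raise ValueError("df is smaller than every tabled value")
    return table[max(usable)]

print(critical_t(118))  # df = 118 is not tabled, so the df = 60 row is used
```

Dropping to the next lower df is the conservative choice, because critical values get larger as df gets smaller.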
Calculating t • Now we are ready to examine our data. • The sum of the homework scores is 892.665, and there are 119 students in the sample. • M = ΣX / N = 892.665 / 119 = 7.501387.
Calculating t (cont.) • The sum of the homework scores is 892.665, and the sum of the squared scores is 6986.157. • SS = ΣX² – (ΣX)² / N = 6986.157 – 892.665² / 119 = 289.9318.
Calculating t (cont.) • s² = SS / (N – 1) = 289.9318 / 118 = 2.457049. • s = √s² = √2.457049 = 1.567498.
Calculating t (cont.) • M = 7.501387; s = 1.567498; N = 119. • μ0 = 7.0. • t = (7.501387 – 7.0) / (1.567498 / √119) = 3.48931 ≈ 3.49.
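The whole calculation can be reproduced from the two sums and the sample size. This is a sketch using the numbers from the slides; the variable names are illustrative.

```python
import math

sum_x = 892.665    # sum of the homework scores
sum_x2 = 6986.157  # sum of the squared scores
n = 119            # number of students
mu0 = 7.0          # mean under the null hypothesis

mean = sum_x / n                 # M
ss = sum_x2 - sum_x ** 2 / n     # SS, computational formula
variance = ss / (n - 1)          # s squared, with df = N - 1
sd = math.sqrt(variance)         # s
se = sd / math.sqrt(n)           # estimated standard error of the mean
t = (mean - mu0) / se

print(f"M = {mean:.6f}, SS = {ss:.4f}, s = {sd:.6f}, t = {t:.2f}")
```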
Making a decision • N = 119, so df = 118. • In the table, the next lower df is 60. • tcrit = 2.00. • Our t statistic had the value 3.49, so we reject the null hypothesis. • We have found evidence that the population represented by the class performs above the C level on the homework.
Assumptions of the t test • The observations must be independent. • The population from which the sample is drawn must be normally distributed. • Note that the normality assumption does not go away for large samples, as it did for the Z test. • However, the test is robust to violations of normality when N is large.
Another example • How many hours do people typically sleep each night? • H0: μ = 8 hours. • H1: μ ≠ 8 hours. • Let’s use a two-tailed alpha of .01. • Class sample: N = 109, so df = 108. • What is the critical value? • From the table (using the next lower tabled df, 60), tcrit = 2.66.
Results • M = 7.105505, s = 1.263947. • sM = 1.263947 / √109 = 0.12106417. • t = (7.105505 – 8) / 0.12106417 = -7.39. • We can reject the null hypothesis: the mean amount of sleep per night for undergraduate students in psychology is less than 8 hours.
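When the mean and standard deviation are already in hand, the test reduces to a few lines. A minimal sketch with the sleep numbers above; the function name `one_sample_t` is hypothetical.

```python
import math

def one_sample_t(mean, sd, n, mu0):
    """t statistic for a one-sample test, from summary statistics."""
    se = sd / math.sqrt(n)   # estimated standard error of the mean
    return (mean - mu0) / se

t = one_sample_t(mean=7.105505, sd=1.263947, n=109, mu0=8.0)
t_crit = 2.66                # two-tailed, alpha = .01, tabled df = 60

# Two-tailed test: compare the absolute value of t with the critical value.
reject = abs(t) > t_crit
print(f"t = {t:.2f}, reject H0: {reject}")
```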
Checking normality
Hours of sleep:
 4 | 5
 5 | 00000055555
 6 | 0000000000000000000055555555
 7 | 0000000000000000000000000555555555
 8 | 000000000000000000000555555
 9 | 0005
10 | 005
11 |
12 | 0
Another example • UC requires a high school GPA of 3.0 in the crucial courses. • H0: μ = 3.0. • H1: μ ≠ 3.0. • Let’s use a two-tailed alpha of .01 again. • N = 111.
Another example • M = 404.469 / 111 = 3.643865. • SS = 1490.705 – 404.469² / 111 = 16.87462. • s = √(16.87462 / 110) = 0.3916701. • sM = 0.3916701 / √111 = 0.03717568. • t = (3.643865 – 3.0) / 0.03717568 = 17.31952 ≈ 17.32.
Evaluating normality
GPA Distribution:
24 | 000
26 |
28 | 0
30 | 000000003
32 | 0000000000034
34 | 0000000000000016688
36 | 00000057700005
38 | 0000000000000005577990000045568
40 | 000000000077
42 | 000010000
Power and the t statistic • Recall: power is the probability that we will reject a null hypothesis that is actually false. • In our last example, that is the probability that we will conclude that the mean GPA is different from 3.0 when it really is different from 3.0. • (A Type II error would be failing to reject the null when the true mean GPA is different from 3.0.)
Power and the t statistic (cont.) • Power analysis is much more complex for the t statistic than for the Z statistic. • However, exactly the same things increase power: • Bigger effect. • Lower variability. • Larger alpha. • One-tailed tests. • Larger sample size.
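One item on this list, sample size, can be illustrated by simulation. The sketch below draws many samples from a normal population whose true mean differs from the null value and counts how often the two-tailed t test rejects. Every number in it (true mean 7.5, standard deviation 1.5, the two sample sizes, 2000 replications, the seed) is a hypothetical choice for illustration; the critical values are the standard two-tailed α = .05 values for df = 9 and df = 60.

```python
import math
import random
import statistics

def estimated_power(n, t_crit, true_mean=7.5, sd=1.5, mu0=7.0,
                    reps=2000, seed=1):
    """Monte Carlo estimate of the power of a two-tailed one-sample t test."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        sample = [rng.gauss(true_mean, sd) for _ in range(n)]
        se = statistics.stdev(sample) / math.sqrt(n)
        t = (statistics.mean(sample) - mu0) / se
        if abs(t) > t_crit:
            rejections += 1
    return rejections / reps

power_small = estimated_power(n=10, t_crit=2.262)  # df = 9
power_large = estimated_power(n=61, t_crit=2.000)  # df = 60
print(f"N = 10: power ≈ {power_small:.2f}")
print(f"N = 61: power ≈ {power_large:.2f}")
```

With the same effect and variability, the larger sample should reject the false null hypothesis far more often.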
Class exercise • I am going to show you a slide with a fairly large number of dots on it. • My hypothesis is that people are biased in their estimates of numbers in such circumstances; I won’t say yet in which direction. • When the image goes away, write down your best guess of the number of dots you saw.
Class exercise (cont.) • Are we ready to observe our data yet? • Alpha = .05.