320 likes | 493 Views
T-tests Part 1 PS1006 Lecture 2. Sam Cromie. What has this got to do with statistics?. Overview. Review: hypothesis testing The need for the t-test The logic of the t-test Application to: Single sample designs Two group designs Within group designs (Related samples)
E N D
T-testsPart 1PS1006 Lecture 2 Sam Cromie
Overview • Review: hypothesis testing • The need for the t-test • The logic of the t-test • Application to: • Single sample designs • Two group designs • Within group designs (Related samples) • Between group designs (Independent samples) • Assumptions • Advantages and disadvantages
Generic Hypothesis testing steps • State hypotheses – Null and alternative • State alpha value • Calculate the statistic (z score, t score, etc.) • Look the statistic up on the appropriate probability table • Accept or reject null hypothesis
Generic form of a statistic Data – Hypothesis Error What you got – what you expected (null) The unreliability of your data Z = Individual score – Population mean Population standard deviation
State null hypothesis State α value Convert the score to a z score Look z score up on z score tables Hypothesis testing with an individual data point…
Hypothesis testing with one sample (n>1) … • 100 participants saw video containing violence • Then they free associated to 26 homonyms with aggressive & non-aggressive forms - e.g., pound, mug, • Mean number of aggressive free associates = 7.10 • Suppose we know that without an aggressive video the mean ()=5.65 and the standard deviation () = 4.5 • Is 7.10 significantly larger than 5.65?
becomes where Hypothesis testing with one sample (n>1) … • Use the sample mean instead of x in the z score formula • Use standard error of sample instead of the population standard deviation n = the number of scores in the sample
Standard error: • If we know then can be calculated using the formula
Sampling distribution • http://www.ruf.rice.edu/~lane/stat_sim/sampling_dist/index.html • Will always be narrower than the parent population • The more samples that are taken the more normal the distribution • As sample size increases standard error decreases
= = = Back to video violence • H0: = 5.65 • H1: 5.65(two-tailed) • Calculate p for sample mean of 7.10 assuming =5.65 • Use z from normal distribution as sampling distribution can be assumed to be normal • Calculate z • If z> + 1.96, reject H0 • 3.22 > 1.96 the difference is significant
But mostly we do not know σ • E.g. do penalty-takers show a preference for right or left? • 16 penalty takers; 60 penalties each; null hypothesis = 50% or 30 each way • Result mean of 39 penalties to the left; is this significantly different? • µ = 30, but how do we calculate the standard error without the σ?
Using s to estimate σ • Can’t substitute s for in a z score because s likely to be too small • So we need: • a different type of score – a t-score • a different type of distribution – Student’s t distribution
First published in 1908 by William Sealy Gosset, Worked at a Guinness Brewery in Dublin on best yielding varieties of barley Prohibited from publishing under his own name so the paper was written under the pseudonym Student. T distribution
T-test in a nut-shell… Allows us to calculate precisely what small samples tell us Uses three critical bits of information – mean, standard deviation and sample size
= = = t test for one mean • Calculated the same way as z except is replaced by s. • For the video example we gave before, s = 4.40
Degrees of freedom • t distribution is dependent on the sample size and this must be taken into account when calculating p • Skewness of sampling distribution decreases as n increases • t will differ from z less as sample size increases • t based on df where df = n - 1
Statistical inference made • With n = 100, t.02599 = 1.98 • Because t = 3.30 > 1.98, reject H0 • Conclude that viewing violent video leads to more aggressive free associates than normal
Factors affecting t • Difference between sample & population means • As value increases so t increases • Magnitude of sample variance • As sample variance decreases t increases • Sample size - as it increases • The value of t required to be significant decreases • The distribution becomes more like a normal distribution
t for repeated measures scores • Same participants give data on two measures • Someone high on one measure probably high on other • Calculate difference between first and second score • Base subsequent analysis on these difference scores. Before and after data are ignored
Example - Therapy for PTSD • Therapy for victims of psychological trauma-Foa et al (1991) • 9 Individuals received Supportive Counselling • Measured post-traumatic stress disorder symptoms before and after therapy
Results • The Supportive Counselling group decreased number of symptoms - was difference significant? • If no change, mean of differences should be zero • So, test the obtained mean of difference scores against = 0. • We don’t know , so use s and solve for t
Repeated measures t test • and = mean and standard deviation of differences respectively df = n - 1 = 9 - 1 = 8
Inference made • With 8 df, t.025 = +2.306 • We calculated t = 6.85 • Since 6.85 > 2.306, reject H0 • Conclude that the mean number of symptoms after therapy was less than mean number before therapy. • Infer that supportive counselling seems to work
+ & - of Repeated measures design • Advantages • Eliminate subject-to-subject variability • Control for extraneous variables • Need fewer subjects • Disadvantages • Order effects • Carry-over effects • Subjects no longer naïve • Change may just be a function of time
t test is robust • Test assumes that variances are the same • Even if the variances are not the same, the test still works pretty well • Test assumes data are drawn from a normally distributed population • Even if the population is not normally distributed, the test still works pretty well