160 likes | 284 Views
Review. Nine men and nine women are tested for their memory of a list of abstract nouns. The mean scores are M male = 15 and M female = 17. The mean square based on both samples is MS = 25. What is your best estimate of the (non-standardized) effect size, µ female – µ male ? 0.40
E N D
Review Nine men and nine women are tested for their memory of a list of abstract nouns. The mean scores are Mmale = 15 and Mfemale = 17. The mean square based on both samples is MS = 25. What is your best estimate of the (non-standardized) effect size, µfemale– µmale? • 0.40 • 1.33 • 2.00 • 18.0
Review Nine men and nine women are tested for their memory of a list of abstract nouns. The mean scores are Mmale = 15 and Mfemale = 17. The mean square based on both samples is MS = 25. Calculate the standardized effect size. • 0.08 • 0.10 • 0.40 • 0.49
Review A sample of 16 subjects has a mean of 53 and a sample variance of 9. Calculate a 95% confidence interval for the population mean. The critical value for t (at 97.5% with 15 df) is 2.13. • [51.40, 54.60] • [48.21, 57.79] • [51.80, 54.20] • [50.16, 55.84]
What are we doing? • In science, we run lots of experiments • In some cases, there's an "effect”; in others there's not • Some manipulations have an impact; some don't • Some drugs work; some don't • Goal: Tell these situations apart, using the sample data • If there is an effect, reject the null hypothesis • If there’s no effect, retain the null hypothesis • Challenge • Sample is imperfect reflection of population, so we will make mistakes • How to minimize those mistakes?
Analogy: Prosecution • Think of cases where there is an effect as "guilty” • No effect: "innocent" experiments • Scientists are prosecutors • We want to prove guilt • Hope we can reject null hypothesis • Can't be overzealous • Minimize convicting innocent people • Can only reject null if we have enough evidence • How much evidence to require to convict someone? • Tradeoff • Low standard: Too many innocent people get convicted (Type I Error) • High standard: Too many guilty people get off (Type II Error)
Binomial Test • Example: Halloween candy • Sack holds boxes of raisins and boxes of jellybeans • Each kid blindly grabs 10 boxes • Some kids cheat by peeking • Want to make cheaters forfeit candy • How many jellybeans before we call kid a cheater?
Binomial Test • How many jellybeans before we call kid a cheater? • Work out probabilities for honest kids • Binomial distribution • Don’t know distribution of cheaters • Make a rule so only 5% of honest kids lose their candy • For each individual above cutoff • Don’t know for sure they cheated • Only know that if they were honest, chance would have been less than 5% that they’d have so many jellybeans • Therefore, our rule catches as many cheaters as possiblewhile limiting Type I errors (false convictions) to 5% Honest Cheaters ? 0 1 2 3 4 5 6 7 8 9 10 Jellybeans
Testing the Mean • Example: IQs at various universities • Some schools have average IQs (mean = 100) • Some schools are above average • Want to decide based on n students at each school • Need a cutoff for the sample mean • Find distribution of sample means for Average schools • Set cutoff so that only 5% of Average schools will be mistaken as Smart Average Smart ? p(M) 100
Unknown Variance: t-test • Want to know whether population mean equals m0 • H0: m = m0 H1: m ≠ m0 • Don’t know population variance • Don’t know distribution of M under H0 • Don’t know Type I error rate for any cutoff • Use sample variance to estimate standard error of M • Divide M – m0 by standard error to get t • We know distribution of t exactly Distribution of t when H0 is True Criticalvalue p(M) p(t) a m0 0 Critical Region
Two-tailed Tests • Often we want to catch effects on either side • Split Type I Errors into two critical regions • Each must have probability a/2 H0 H1 ? H1 ? Criticalvalue Test statistic Criticalvalue a/2 a/2
An Alternative View: p-values • p-value • Probability of a value equal to or more extreme than what you actually got • Measure of how consistent data are with H0 • p > • t is within tcrit • Retain null hypothesis • p < • t is beyond tcrit • Reject null hypothesis; accept alternative hypothesis tcrit for a = .05 p = .03 t = 2.57 t t
Three Views of Inferential Statistics Confidence interval • Effect size & confidence interval • Values of m0 that don’t lead to rejecting H0 • Test statistic & critical value • Measure of consistency with H0 • p-value & a • Type I error rate • Can answer hypothesis testat any level • Result predicted by H0vs. confidence interval • Test statistic vs. critical value • p vs. m0 m0 m0 m0 M Criticalvalue Test statistic Criticalvalue 0 1 p a
Review Which of the following would lead you to reject the null hypothesis? • t < tcrit • p < a • M outside the confidence interval • t> a • p > tcrit
Review In a paired-samples experiment, you get a confidence interval of [-2.4, 3.6] for µdiff. What is your conclusion? • t < tcrit, p > a, and you keep the null hypothesis • t < tcrit, p< a, and you reject the null hypothesis • t> tcrit, p > a, and you reject the null hypothesis • t> tcrit, p< a, and you keep the null hypothesis
Review In a one-tailed t-test, you get a result of t = 2.8. The critical value is tcrit = 2.02. What region(s) correspond to alpha? • Blue • Blue + red • Blue + green • Blue + red + orange + green tcrit t (from data)