Now What • Last class introduced statistical concepts. • How do we actually implement these ideas and get some results? • Goal: introduce you to what’s out there and the things you need to be conscious of • Familiarity with terms (so you can understand papers)
Experiment: IV = smartness drug; DV = IQ • Experimental group 1 scores: mean = 115 • Experimental group 2 scores: mean = 102 • Control group scores: mean = 100 • It looks different, but how different IS different?
T tests Inferential test to decide if there is a real (significant) difference between the means of 2 data sets. In other words, do our 2 groups of people (experimental and control) represent 2 different POPULATIONS of people, or not?
Hypothesis Testing (review) • The steps of hypothesis testing are: 1. Restate the question as a research hypothesis and a null hypothesis about the populations. Ex: Does drug A make people super smart? Null hypothesis (assumed to be true) = Drug A has no effect on people’s intelligence. H0: µ1 = µ2
2. Figure out what the comparison distribution looks like (your control group, usually)
3. Decide Type 1 error (alpha level) • Determine the cutoff sample score on the comparison distribution at which point the null should be rejected • Typically 5% or 1%
4. Determine where your sample falls on the comparison distribution • 5. Reject or retain the null hypothesis
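The five steps above can be sketched in code. This is a minimal illustration and not part of the original slides: it assumes the comparison distribution is normal with a known population mean of 100 and standard deviation of 15 (a common IQ scaling, assumed here), and a hypothetical sample size of 25 per group. The function name `z_test` is made up for this example.

```python
import math
from statistics import NormalDist

def z_test(sample_mean, n, pop_mean, pop_sd, alpha=0.05):
    """Two-tailed z test of a sample mean against a known population."""
    # Step 2: comparison distribution of sample means under the null
    standard_error = pop_sd / math.sqrt(n)
    # Step 3: cutoff score for the chosen alpha level
    cutoff = NormalDist().inv_cdf(1 - alpha / 2)
    # Step 4: where the sample falls on the comparison distribution
    z = (sample_mean - pop_mean) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    # Step 5: reject the null only if the sample falls beyond the cutoff
    return z, cutoff, p_value, abs(z) > cutoff

# Group means from the slides; n = 25 and sd = 15 are assumptions
for label, group_mean in [("group 1", 115), ("group 2", 102)]:
    z, cutoff, p, reject = z_test(group_mean, n=25, pop_mean=100, pop_sd=15)
    print(label, round(z, 2), round(cutoff, 2), round(p, 4), reject)
```

Under these assumed numbers, a mean of 115 falls far beyond the cutoff while a mean of 102 does not, which is the sense in which "how different IS different" gets answered.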
Backbone of inferential stats • Idea of assuming the null and rejecting it if your results would happen less than 5% of the time if the null were true • Underlies any “p” value or significance test you’ll ever see
Stats packages • MATLAB, Stata, SAS, R, JMP, SPSS • Different ones are more popular in different fields • I will reference SPSS (popular in psychology)
Descriptive statistics • Start off getting familiar with your data • Analyze descriptive statistics • Means, quartiles, outliers, plots, frequencies
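As a rough sketch of what "getting familiar with your data" can look like outside SPSS, the snippet below uses Python's standard `statistics` module. The scores are invented for illustration, and the 1.5 × IQR outlier rule is one common rule of thumb, not something the slides prescribe.

```python
import statistics

# Hypothetical IQ scores, invented for illustration
scores = [98, 102, 95, 110, 101, 99, 104, 160]

mean = statistics.mean(scores)
median = statistics.median(scores)
q1, q2, q3 = statistics.quantiles(scores, n=4)  # the three quartile cut points
iqr = q3 - q1

# Rule of thumb (an assumption here): flag points more than 1.5 IQRs
# below the first quartile or above the third quartile
low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = [x for x in scores if x < low_fence or x > high_fence]

print("mean:", mean, "median:", median)
print("quartiles:", (q1, q2, q3))
print("outliers:", outliers)
```

The single extreme score pulls the mean well above the median, which is the kind of thing worth spotting before running any inferential test.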
T tests: NOT interchangeable • 1-sample T test: Use this if you know the population mean (or you have hypothesized a population mean) and you want to know if your sample belongs to this population • Ex: the IQ population mean is 100; is my sample different? Or, I have a theory that everyone is 6 feet tall. I can take a sample of people and see if this is true.
1 sample t test • Analyze → Compare Means → One-Sample T Test • Test Variable = the variable you want to compare against the hypothesized population mean • Test Value = hypothesized population mean (default is 0)
Reading output • Is my variable ‘caldif’ significantly different from the null population mean of zero?
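For readers without SPSS, the statistic behind the one-sample test can be sketched by hand. The sample below is invented, the function name `one_sample_t` is hypothetical, and the cutoff 2.262 is the standard two-tailed t-table value for alpha = .05 with 9 degrees of freedom.

```python
import math
import statistics

def one_sample_t(sample, test_value=0.0):
    """Return (t, df) for a one-sample t test against test_value."""
    n = len(sample)
    # Standard error of the mean, using the sample standard deviation
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    t = (statistics.mean(sample) - test_value) / standard_error
    return t, n - 1

# Hypothetical IQ sample; the test value 100 is the population mean from the slides
iq_sample = [112, 95, 108, 120, 101, 99, 115, 107, 110, 103]
t, df = one_sample_t(iq_sample, test_value=100)

T_CRIT_DF9 = 2.262  # two-tailed cutoff, alpha = .05, df = 9 (from a t table)
print("t =", round(t, 2), "df =", df, "significant:", abs(t) > T_CRIT_DF9)
```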
Independent Samples T Test • Compares the mean scores of two groups on a given variable • Ex: is the IQ score for my control and experimental groups different? Is the mean height different for men and women?
Independent Samples T Test • Analyze → Compare Means → Independent-Samples T Test • Test variable = the dependent variable (iq, height) • Grouping variable = the variable that differentiates the 2 groups you’re comparing. For example, you have a variable called sex and the values can be 1 or 2, corresponding to male and female.
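A minimal sketch of the independent-samples statistic, assuming equal variances in the two groups (the classic pooled-variance form). The scores and the name `independent_t` are invented for illustration; 2.306 is the standard two-tailed t-table cutoff for alpha = .05 with 8 degrees of freedom.

```python
import math
import statistics

def independent_t(group_a, group_b):
    """Pooled-variance t test for two independent groups. Returns (t, df)."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pool the two sample variances, weighting each by its degrees of freedom
    pooled_var = ((n_a - 1) * statistics.variance(group_a)
                  + (n_b - 1) * statistics.variance(group_b)) / df
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / standard_error
    return t, df

# Hypothetical IQ scores for two separate groups of people
experimental = [106, 114, 110, 102, 118]
control = [96, 104, 100, 92, 108]
t, df = independent_t(experimental, control)

T_CRIT_DF8 = 2.306  # two-tailed cutoff, alpha = .05, df = 8 (from a t table)
print("t =", round(t, 2), "df =", df, "significant:", abs(t) > T_CRIT_DF8)
```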
Paired samples t test • Compares the means of 2 variables; tests if the average difference is significantly different from zero • Use when the scores are not independent of each other (ex: scores from the same subjects before and after some intervention) • Ask yourself: are all the data points randomly selected, or is the second sample paired to the first? • Ex: before using your device the subjects’ mean happiness score was 100, afterwards it’s 102. Is this average difference of 2 significantly different from no difference at all?
Paired (dependent) samples T test • Analyze → Compare Means → Paired-Samples T Test • Paired variables: HappinessScoreBefore, HappinessScoreAfter
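The paired test is a one-sample t test on the per-subject differences, which the sketch below makes explicit. The before/after scores are invented so that the means are 100 and 102 as in the happiness example; `paired_t` is a hypothetical name, and 2.776 is the standard two-tailed t-table cutoff for alpha = .05 with 4 degrees of freedom.

```python
import math
import statistics

def paired_t(before, after):
    """Paired-samples t test: tests whether the mean difference is zero."""
    # Each subject contributes one difference score
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    t = statistics.mean(diffs) / standard_error
    return t, n - 1

# Hypothetical scores for the same 5 subjects before and after
happiness_before = [98, 101, 100, 102, 99]   # mean 100
happiness_after = [101, 102, 103, 103, 101]  # mean 102
t, df = paired_t(happiness_before, happiness_after)

T_CRIT_DF4 = 2.776  # two-tailed cutoff, alpha = .05, df = 4 (from a t table)
print("t =", round(t, 3), "df =", df, "significant:", abs(t) > T_CRIT_DF4)
```

With these invented numbers an average difference of only 2 is significant because the differences are very consistent across subjects; treating the same scores as two independent groups would ignore that consistency.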
Which T test? • Is there a change in children’s understanding of algebra before and after using a learning program? • The average IQ in America is 100. Are MIT students different from this? • Which deodorant is better? Each subject gets each brand, one on each arm. • Which shoes lead to faster running? One sibling gets type A, the other type B. • Which remote control do people learn to use faster? We randomly select subjects from the population.
You will have more power with a repeated measures design, but sometimes (often) there are reasons you can’t design your study that way. • Order effects (learning, long-lasting intervention) • ‘Demand’ effects
Important assumptions for inferential statistics • 1. Homogeneity of variance: check and correct if necessary (ex: Levene test; Welch procedure) • 2. Normal distribution: check and correct if necessary (ex: transform data with a log or square) • 3. Random sample of the population: vital! Or else be clear on what population you’re really learning about • 4. Independence of samples: vital! Knowing the score for one subject gives you no specific hints on how another will score
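As one illustration of "correcting" for unequal variances, here is a minimal sketch of Welch's t test, which drops the pooled-variance assumption and adjusts the degrees of freedom (the Welch-Satterthwaite approximation). The data and the name `welch_t` are invented for this example.

```python
import math
import statistics

def welch_t(group_a, group_b):
    """Welch's t test for two independent groups with unequal variances."""
    n_a, n_b = len(group_a), len(group_b)
    # Variance of each group mean, kept separate rather than pooled
    var_mean_a = statistics.variance(group_a) / n_a
    var_mean_b = statistics.variance(group_b) / n_b
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / math.sqrt(
        var_mean_a + var_mean_b)
    # Welch-Satterthwaite approximation; usually not a whole number
    df = (var_mean_a + var_mean_b) ** 2 / (
        var_mean_a ** 2 / (n_a - 1) + var_mean_b ** 2 / (n_b - 1))
    return t, df

# Hypothetical groups: similar sizes, very different spreads
tight = [100, 101, 99, 100, 102, 98]
spread = [90, 120, 105, 85, 125, 111]
t, df = welch_t(tight, spread)
print("t =", round(t, 2), "df =", round(df, 2))
```

When the two groups have equal sizes and equal variances, this gives the same t and df as the pooled version, so the correction only costs you something when the assumption is actually violated.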
Anova • T tests are for when you have only 1 or 2 groups. For more, use the anova model. • Basic method: compares the variance between groups to the variance within groups • Is this ratio (‘F ratio’) significantly > 1?
1 way anova • Compare means from multiple groups • Ex: What is the effect of three different online learning environments on students’ ‘interest’ score?
Three different groups (N = 12) • Treatment means: 8.25, 1.75, 7.75 • Overall mean (‘grand mean’): 5.92
1 way anova • The basic model is that an individual score = overall mean + effect of treatment (group mean - overall mean) + error • Total variance = variance between groups + variance within groups (as error term)
SS total = (9-5.92)^2 + (7-5.92)^2 + (8-5.92)^2… + (7-5.92)^2 = 112.92 • SS between = (8.25-5.92)^2 + (1.75-5.92)^2 + (7.75-5.92)^2 = 26.17 • SS within = (9-8.25)^2 + (7-8.25)^2… + (7-7.75)^2 = 7.42
Mean squares You get the average sum of squares, or mean squares, by dividing sum of squares by degrees of freedom (measure of independent pieces of information)
Df between = J-1 (groups-1) • Df within = N-J (total people-groups) • So, MS between = 26.17/2 = 13.1 • MS within = 7.42/ (12-3=9) = .83
F ratio • MS between/MS within • Signal/ Noise ratio • 13.1/.83 = 15.78
If no effect, you’d expect a ratio of 1 • Ratio of 15 seems strong. Check with F table (same principle as with T test earlier!)
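The sums of squares, mean squares, and F ratio can be computed from raw scores as sketched below. The scores are invented (they are not the data behind the slides), and `one_way_anova` is a hypothetical name. Note one assumption that differs from the slide arithmetic: the usual SS between formula weights each group's squared deviation by its group size, which is what makes SS total = SS between + SS within hold. The cutoff 4.26 is the standard F-table value for alpha = .05 with (2, 9) degrees of freedom.

```python
import statistics

def one_way_anova(groups):
    """One-way ANOVA from raw scores. Returns a dict of the pieces."""
    all_scores = [x for g in groups for x in g]
    grand_mean = statistics.mean(all_scores)
    # Between: each group mean vs the grand mean, weighted by group size
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2
                     for g in groups)
    # Within: each score vs its own group mean
    ss_within = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)
    ss_total = sum((x - grand_mean) ** 2 for x in all_scores)
    df_between = len(groups) - 1               # J - 1
    df_within = len(all_scores) - len(groups)  # N - J
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {"ss_total": ss_total, "ss_between": ss_between,
            "ss_within": ss_within, "df": (df_between, df_within),
            "F": ms_between / ms_within}

# Hypothetical 'interest' scores for three learning environments, 4 people each
result = one_way_anova([[8, 9, 7, 8], [2, 1, 3, 2], [7, 8, 8, 9]])

F_CRIT_2_9 = 4.26  # alpha = .05 cutoff for df = (2, 9), from an F table
print(result, "significant:", result["F"] > F_CRIT_2_9)
```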
SPSS 1 way anova • Analyze → General Linear Model → Univariate
Fixed vs random factors • Fixed factor: the levels under study are the only levels of interest; you can’t generalize to anything else • Random factor: the levels were drawn randomly from a population of levels, so you can generalize • Ex: do people from different countries like my new phone differently? Give the phone to people from Japan, India, America.
2+ way anova • Main effects • Interaction effects • Ex: testing gender and age (undergrad vs senior citizen); DV = engagedness with robot • You get 3 overall effects: • Effect of gender on engagedness • Effect of age on engagedness • Interaction of gender and age on engagedness (does the effect of gender depend on age? / does the effect of age depend on gender?)
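To make the three effects concrete, the sketch below computes them from the four cell means of a 2 × 2 design. The cell means are invented, and this only shows the size of each effect, not its significance test (which would need the within-cell variance as well).

```python
# Hypothetical mean engagedness per cell of the gender x age design
cell_means = {
    ("female", "undergrad"): 7.0,
    ("female", "senior"): 5.0,
    ("male", "undergrad"): 6.0,
    ("male", "senior"): 2.0,
}

def marginal_mean(factor_index, level):
    """Average the cell means over the other factor."""
    values = [m for key, m in cell_means.items() if key[factor_index] == level]
    return sum(values) / len(values)

# Main effects: difference between marginal means
gender_effect = marginal_mean(0, "female") - marginal_mean(0, "male")
age_effect = marginal_mean(1, "undergrad") - marginal_mean(1, "senior")

# Interaction: does the gender difference change with age?
gender_gap_undergrad = (cell_means[("female", "undergrad")]
                        - cell_means[("male", "undergrad")])
gender_gap_senior = (cell_means[("female", "senior")]
                     - cell_means[("male", "senior")])
interaction = gender_gap_undergrad - gender_gap_senior

print(gender_effect, age_effect, interaction)
```

A nonzero interaction here means the gender gap is larger among senior citizens than among undergrads, which is the "does the effect of gender depend on age?" question in numbers.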
Contrasts • The main effect (‘omnibus test’) tells you that something is going on here, that there is some difference somewhere, but it doesn’t tell you where. • Is group 1 different from group 3? Are groups 1 and 3 together different from group 2?
SPSS contrasts • Contrast coefficients (add to zero): • 1, 0, -1 • .5, -1, .5 • “Name brand” contrasts: polynomial, etc.
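A sketch of how contrast coefficients are used: each contrast is a weighted sum of the group means, tested against an error term built from MS within. The group means (8.25, 1.75, 7.75) and MS within (.83) are the values from the anova slides; equal group sizes of 4 are an assumption (N = 12 over three groups), and `contrast_t` is a hypothetical name.

```python
import math

def contrast_t(coefficients, group_means, ms_within, n_per_group):
    """Return (estimate, t) for a contrast with equal group sizes."""
    if abs(sum(coefficients)) > 1e-9:
        raise ValueError("contrast coefficients must add to zero")
    # Weighted sum of the group means
    estimate = sum(c * m for c, m in zip(coefficients, group_means))
    # Standard error uses MS within as the noise estimate
    standard_error = math.sqrt(
        ms_within * sum(c ** 2 for c in coefficients) / n_per_group)
    return estimate, estimate / standard_error

group_means = [8.25, 1.75, 7.75]  # from the slides
MS_WITHIN = 0.83                  # from the slides
N_PER_GROUP = 4                   # assumed: 12 people split evenly

# Is group 1 different from group 3?
est_1v3, t_1v3 = contrast_t([1, 0, -1], group_means, MS_WITHIN, N_PER_GROUP)
# Are groups 1 and 3 together different from group 2?
est_13v2, t_13v2 = contrast_t([0.5, -1, 0.5], group_means, MS_WITHIN, N_PER_GROUP)

print(round(est_1v3, 2), round(t_1v3, 2))
print(round(est_13v2, 2), round(t_13v2, 2))
```

Each t would be compared against a t table with N - J = 9 degrees of freedom.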
Omnibus v contrasts • Significant omnibus means there will be at least 1 significant contrast, but • Nonsignificant omnibus DOESN’T necessarily mean there are no significant contrasts
A Priori vs Post-hoc • A priori (planned) = theory driven, you planned to test this before you saw your data • Post-hoc = exploratory, data-driven • When doing post-hoc contrasts you must be especially careful of type 1 error.
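Family wise error: take this SERIOUSLY • With an alpha (type 1 error) of .05, if the null is true and the tests are independent, you expect: • 1 test: 5% chance you’re wrong • 5 tests: about a 23% chance at least one of them is wrong • 20 tests: on average 1 of them is wrong (about a 64% chance of at least one)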
To keep overall error at 5%, if you are doing multiple contrasts you can do a “Bonferroni” correction, which just means you divide .05 by the number of contrasts • Ex: 10 contrasts. I want overall error to be 5%. So each contrast must meet a stricter cutoff: .05/10 = .005 (0.5%)
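The arithmetic behind family-wise error and the Bonferroni correction is short enough to check directly. This sketch assumes the tests are independent and every null hypothesis is true; the function names are made up for the example.

```python
def familywise_error(alpha, n_tests):
    """Chance of at least one false positive across independent tests."""
    return 1 - (1 - alpha) ** n_tests

def bonferroni_alpha(overall_alpha, n_tests):
    """Per-test cutoff that keeps the overall error at overall_alpha."""
    return overall_alpha / n_tests

for k in (1, 5, 20):
    print(k, "tests:", round(familywise_error(0.05, k), 3))

per_test = bonferroni_alpha(0.05, 10)
print("per-contrast cutoff for 10 contrasts:", per_test)
# With the corrected cutoff, the overall error stays just under 5%
print("resulting overall error:", round(familywise_error(per_test, 10), 4))
```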
Correlation and Regression • Correlation: linear relationship between X and Y (no assumptions about IV/DV) • Regression: what is the best guess about Y given a certain value of X (X is IV) • Similar to anova model
SPSS • Analyze → Correlate → Bivariate • The 2 variables, date and meandif, have a Pearson correlation of .013, and the significance is .908 (i.e. not significant).
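The distinction between correlation and regression can be sketched with the same sums of cross-products. The x and y values are invented and `correlation_and_regression` is a hypothetical name; this computes the Pearson r and the least-squares line by hand rather than reproducing the SPSS output above.

```python
import math

def correlation_and_regression(x, y):
    """Return (pearson_r, slope, intercept) for paired x and y values."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    # Correlation treats X and Y symmetrically
    r = s_xy / math.sqrt(s_xx * s_yy)
    # Regression predicts Y from X, so only the X spread is in the denominator
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return r, slope, intercept

# Hypothetical paired observations
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r, slope, intercept = correlation_and_regression(x, y)

print("r =", round(r, 3))
print("best guess for Y at X = 6:", round(intercept + slope * 6, 2))
```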
Effect size • Review: can be significant but tiny effect • Knowing significance doesn’t give indication of effect strength • Amount by which the 2 populations don’t overlap • Amount of total variance explained by your variable (sometimes)
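Two widely used effect size measures line up with the last two bullets: Cohen's d (how far apart two group means are, in standard deviation units) and eta squared (share of total variance explained). The slides do not name specific measures, so treating these two as the intended ones is an assumption, and the data and function names are invented.

```python
import math
import statistics

def cohens_d(group_a, group_b):
    """Difference between two means in pooled standard deviation units."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * statistics.variance(group_a)
                  + (n_b - 1) * statistics.variance(group_b)) / (n_a + n_b - 2)
    return (statistics.mean(group_a) - statistics.mean(group_b)) / math.sqrt(
        pooled_var)

def eta_squared(ss_between, ss_total):
    """Proportion of total variance explained by group membership."""
    return ss_between / ss_total

# Hypothetical IQ scores for two groups
experimental = [106, 114, 110, 102, 118]
control = [96, 104, 100, 92, 108]
d = cohens_d(experimental, control)

# Hypothetical sums of squares from a one-way anova
eta2 = eta_squared(ss_between=96, ss_total=102)

print("Cohen's d =", round(d, 2), "eta squared =", round(eta2, 2))
```

Unlike a p value, neither number shrinks or grows just because the sample size changes, which is why they are reported alongside significance.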