Statistics (cont.) Psych 231: Research Methods in Psychology
Announcements
• Change in due date(s) for the final draft: instead of Wednesday in class, you will turn it in during labs that week (Nov. 29 & 30).
• Have a great Fall break.
Statistics
• 2 general kinds of statistics
  • Descriptive statistics
    • Used to describe, simplify, & organize data sets
    • Describing distributions of scores
  • Inferential statistics
    • Used to test claims about the population, based on data gathered from samples
    • Takes sampling error into account. Are the results above and beyond what you’d expect by random chance?
[Figure: a sample drawn from the population; inferential statistics are used to generalize back from the sample to the population]
Testing Hypotheses
• Step 1: State your hypotheses
• Step 2: Set your decision criteria
• Step 3: Collect your data from your sample(s)
• Step 4: Compute your test statistics
  • Descriptive statistics (means, standard deviations, etc.)
  • Inferential statistics (t-tests, ANOVAs, etc.)
• Step 5: Make a decision about your null hypothesis
  • Reject H0: “statistically significant differences”
  • Fail to reject H0: “not statistically significant differences”
  • Make this decision by comparing your test’s “p-value” against the alpha level that you picked in Step 2.
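The decision in Step 5 can be sketched in a few lines of Python. This is only an illustration: the function name `decide` and the example p-values are made up for the sketch, and the rule follows the slides in rejecting H0 when the p-value is less than alpha.

```python
def decide(p_value: float, alpha: float = 0.05) -> str:
    """Step 5: compare the test's p-value against the alpha level set in Step 2."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must be between 0 and 1")
    if p_value < alpha:
        return "reject H0 (statistically significant difference)"
    return "fail to reject H0 (not a statistically significant difference)"


# Hypothetical p-values, not taken from any real data set
print(decide(0.03))
print(decide(0.20))
```

Note that alpha is fixed in Step 2, before the data are collected, so it is passed in as a setting and is not computed from the data.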
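Step 4: “Generic” statistical test
• Tests the question: Are there differences between groups due to a treatment?
• Two possibilities in the real world (‘truth’)
  • H0 is true (no treatment effect): one population. This is the one that the test examines.
  • H0 is false (there is a treatment effect): two populations
• Experimenter’s conclusions compared against the real world
  • Reject H0 when H0 is correct: Type I error
  • Fail to reject H0 when H0 is wrong: Type II error
  • The other two combinations (reject H0 when it is wrong, fail to reject H0 when it is correct) are correct decisions
[Figure: distributions for groups A and B with sample means XA and XB of 76% and 80%, drawn once as one population and once as two populations]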
Step 4: “Generic” statistical test
• The generic test statistic is a ratio of sources of variability
  • ER: Random sampling error
  • ID: Individual differences (if between-subjects factor)
  • TR: The effect of a treatment
• Computed test statistic = Observed difference / Difference from chance = (TR + ID + ER) / (ID + ER)
[Figure: distributions for groups A and B around their sample means XA and XB]
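As a rough numeric illustration of the ratio, the sketch below plugs made-up amounts of variability into (TR + ID + ER) / (ID + ER). The function name and the numbers are hypothetical. In practice these components are not observed separately, so this only shows how the ratio behaves.

```python
def generic_test_statistic(tr: float, ind_diff: float, err: float) -> float:
    """Ratio of (treatment + individual differences + error) to (individual differences + error)."""
    chance = ind_diff + err
    if chance <= 0:
        raise ValueError("the chance variability (ID + ER) must be positive")
    return (tr + ind_diff + err) / chance


# Hypothetical amounts of variability, for illustration only
print(generic_test_statistic(tr=0, ind_diff=2, err=3))   # no treatment effect: ratio is 1.0
print(generic_test_statistic(tr=10, ind_diff=2, err=3))  # treatment effect present: ratio is 3.0
```

With no treatment effect the numerator and denominator hold the same sources of variability, so the ratio sits near 1. The larger the treatment effect, the further the ratio moves above 1.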
Step 4: “Generic” statistical test
• The generic test statistic distribution
  • To reject the H0, you want a computed test statistic that is large
    • reflecting a large Treatment Effect (TR)
  • What’s large enough? The alpha level gives us the decision criterion
• Test statistic = (TR + ID + ER) / (ID + ER)
• Distribution of the test statistic, depending on the design: T-scores, F-ratios, r’s transforms
• The α-level (from Step 2) determines where the boundaries between “Reject H0” and “Fail to reject H0” go
[Figure: distribution of sample means and distribution of the test statistic, shown next to the table of real world (‘truth’: H0 is correct / H0 is wrong) against the experimenter’s conclusions (Reject H0 / Fail to reject H0)]
Step 4: “Generic” statistical test
• The generic test statistic distribution
  • To reject the H0, you want a computed test statistic that is large
    • reflecting a large Treatment Effect (TR)
  • What’s large enough? The alpha level gives us the decision criterion
• “Two-tailed test”: looking for “an effect”; you don’t have a theoretical expectation for the direction of the effect
  • “Two-tailed” with α = 0.05: reject H0 in either tail (2.5% in each), fail to reject H0 in between
[Figure: distribution of the test statistic with a 2.5% “Reject H0” region in each tail and “Fail to reject H0” in the middle]
Step 4: “Generic” statistical test
• The generic test statistic distribution
  • To reject the H0, you want a computed test statistic that is large
    • reflecting a large Treatment Effect (TR)
  • What’s large enough? The alpha level gives us the decision criterion
• “One-tailed test”: sometimes you know to expect a particular difference (e.g., “improve memory performance”)
  • “One-tailed” with α = 0.05: reject H0 in one tail only (5.0%), fail to reject H0 everywhere else
[Figure: distribution of the test statistic with a single 5.0% “Reject H0” region in one tail]
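To see where the tail boundaries fall, here is a minimal sketch that assumes the test statistic follows a standard normal distribution (a z-test). That is a simplifying assumption made so that only the Python standard library is needed. For t-tests the boundaries come from the t distribution and depend on the degrees of freedom. The function name `critical_values` is made up for this sketch.

```python
from statistics import NormalDist


def critical_values(alpha: float = 0.05, tails: int = 2) -> tuple:
    """Boundaries of the 'reject H0' region(s) for a standard normal test statistic."""
    z = NormalDist()  # mean 0, standard deviation 1
    if tails == 2:
        # Split alpha evenly: alpha/2 in the lower tail and alpha/2 in the upper tail
        cut = z.inv_cdf(1 - alpha / 2)
        return (-cut, cut)
    if tails == 1:
        # Put all of alpha in the upper tail (an expected increase)
        return (z.inv_cdf(1 - alpha),)
    raise ValueError("tails must be 1 or 2")


print(critical_values(0.05, tails=2))
print(critical_values(0.05, tails=1))
```

The one-tailed boundary sits closer to the center than the two-tailed ones, because the full 5% is placed in a single tail.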
Step 4: “Generic” statistical test
• Things that affect the computed test statistic
  • Size of the treatment effect (effect size)
    • The bigger the effect, the bigger the computed test statistic
  • Difference expected by chance (standard error)
    • Variability in the population
    • Sample size
[Figure: pairs of distributions for groups A and B (means XA and XB), each labeled with the ratio (TR + ID + ER) / (ID + ER)]
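The two influences on the difference expected by chance can be illustrated with the standard error of a single mean, s / sqrt(n). This is an assumed, simplified case (the exact formula depends on the design), and the standard deviations and sample sizes below are made up.

```python
import math


def standard_error(sd: float, n: int) -> float:
    """Standard error of a mean: variability divided by the square root of the sample size."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return sd / math.sqrt(n)


# More variability in the population -> larger standard error
for sd in (5, 10, 20):
    print(f"sd={sd:>2}, n=25  -> SE={standard_error(sd, 25):.2f}")

# Larger samples -> smaller standard error
for n in (10, 40, 160):
    print(f"sd=10, n={n:>3} -> SE={standard_error(10, n):.2f}")
```

Because the standard error is the denominator of the test statistic, less variability or a bigger sample gives a larger computed test statistic for the same observed difference.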
Some inferential statistical tests
• Statistical tests follow the research design used
  • 1 factor with two groups
    • T-tests
      • Between groups: 2 independent samples
      • Within groups: Repeated measures samples (matched, related)
  • 1 factor with more than two groups
    • Analysis of Variance (ANOVA) (either between groups or repeated measures)
  • Multi-factorial
    • Factorial ANOVA
• Khan Academy: Hypothesis testing and p-values (~11 mins)
• Goldstein: Choosing a Statistical Test (~12 mins)
T-test
• Design
  • 2 separate experimental conditions
• Degrees of freedom
  • Based on the size of the sample and the kind of t-test
• Formulae: computation differs for between and within t-tests
  • T = Observed difference / Diff by chance = (X1 - X2) / (Diff by chance), where the diff by chance is based on sampling error
  • CI: μ = (X1 - X2) ± (tcrit)(Diff by chance)
• StatsLectures: One Sample z-test (~6 mins)
• Repeated Measures t Tests Part I Introduction (~14 mins)
• Independent vs. Paired samples t-tests (~4 mins)
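To show how the computation differs for between and within t-tests, here is a minimal hand-rolled sketch using only the Python standard library. It assumes the pooled-variance form of the independent samples test and the difference-score form of the repeated measures test. The function names and scores are made up, and a stats package such as SPSS would normally do this for you.

```python
import math
from statistics import mean, stdev, variance


def independent_t(group1, group2):
    """Between-groups t: difference in means over a pooled standard error."""
    n1, n2 = len(group1), len(group2)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / df
    diff_by_chance = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(group1) - mean(group2)) / diff_by_chance, df


def repeated_measures_t(first, second):
    """Within-groups t: mean of the paired difference scores over their standard error."""
    if len(first) != len(second):
        raise ValueError("paired scores must have the same length")
    diffs = [b - a for a, b in zip(first, second)]
    diff_by_chance = stdev(diffs) / math.sqrt(len(diffs))
    return mean(diffs) / diff_by_chance, len(diffs) - 1


# Hypothetical scores, for illustration only
t_between, df_between = independent_t([1, 2, 3, 4, 5], [2, 3, 4, 5, 6])
t_within, df_within = repeated_measures_t([1, 2, 3, 4], [2, 4, 5, 8])
print(f"independent samples: t({df_between}) = {t_between:.2f}")
print(f"repeated measures:   t({df_within}) = {t_within:.2f}")
```

Both functions follow the same pattern of observed difference over diff by chance. Only the way the diff by chance and the degrees of freedom are computed changes with the design.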
SPSS output for t-tests
• In SPSS, compare observed p-values with your alpha-level.
[Figure: SPSS output for t-tests, with three p-values each annotated “Less than α = .05?”]
T-test
• Reporting your results
  • The observed difference between conditions
  • Kind of t-test
  • Computed T-statistic
  • Degrees of freedom for the test
  • The “p-value” of the test
• “The mean of the treatment group was 12 points higher than the control group. An independent samples t-test yielded a significant difference, t(24) = 5.67, p < 0.05, 95% CI [7.62, 16.38].”
• “The mean score of the post-test was 12 points higher than the pre-test. A repeated measures t-test demonstrated that this difference was significant, t(12) = 7.50, p < 0.05, 95% CI [8.51, 15.49].”
[Figure: bar graph of DepVar; error bars are 95% CIs]
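The reported pieces are tied together by the t and CI formulas: the diff by chance is the observed difference divided by t, and the interval is the difference ± tcrit times that value. The sketch below recovers the interval for the repeated measures example. The critical value 2.179 (two-tailed, α = 0.05, df = 12) is not given in the slides and is assumed here from a standard t table. The function name is made up.

```python
def ci_from_report(diff: float, t_value: float, t_crit: float) -> tuple:
    """Rebuild a confidence interval from a reported difference and t-statistic."""
    diff_by_chance = diff / t_value   # since t = diff / diff_by_chance
    margin = t_crit * diff_by_chance  # half-width of the interval
    return diff - margin, diff + margin


# Repeated measures example: difference = 12, t(12) = 7.50
# t_crit = 2.179 is an assumed table value for df = 12, two-tailed, alpha = 0.05
low, high = ci_from_report(diff=12, t_value=7.50, t_crit=2.179)
print(f"95% CI [{low:.2f}, {high:.2f}]")
```

For this example the rebuilt interval agrees with the reported 95% CI to two decimal places, which is a quick way to check that a results sentence is internally consistent.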
Error bars
• Two types typically
  • Standard Error (SE)
    • diff by chance
  • Confidence Intervals (CI)
    • A range of plausible estimates of the population mean
    • CI: μ = (X) ± (tcrit) (diff by chance)
[Figure: bar graph of DepVar; error bars are 95% CIs]
Analysis of Variance (ANOVA)
• Designs
  • More than two groups
  • 1 Factor ANOVA, Factorial ANOVA
  • Both Within and Between Groups Factors
• Test statistic is an F-ratio
• Degrees of freedom
  • Several to keep track of
  • The number of them depends on the design
[Figure: distributions for three groups with means XA, XB, and XC]
Analysis of Variance (ANOVA)
• More than two groups
  • Now we can’t just compute a simple difference score, since there is more than one difference
  • So we use variance instead of simply the difference
    • Variance is essentially an average difference
• F-ratio = Observed variance / Variance from chance
[Figure: distributions for three groups with means XA, XB, and XC]
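For a between-groups 1 factor design, the F-ratio can be sketched by hand as the variance between the group means divided by the variance within the groups. This assumes the usual one-way between-groups ANOVA formulas. The function name and scores are made up for illustration.

```python
from statistics import mean


def one_way_f(groups):
    """F-ratio for a one-way between-groups ANOVA, with its two degrees of freedom."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean([x for g in groups for x in g])

    # Observed variance: how far the group means sit from the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variance from chance: how far scores sit from their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between, df_within = k - 1, n_total - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical scores for groups A, B, and C
f_ratio, df1, df2 = one_way_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(f"F({df1},{df2}) = {f_ratio:.2f}")
```

The two degrees of freedom returned here are the pair that appears in a report such as F(2,25).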
1 factor ANOVA
• 1 Factor, with more than two levels
  • Now we can’t just compute a simple difference score, since there is more than one difference
    • A - B, B - C, & A - C
[Figure: distributions for three groups with means XA, XB, and XC]
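A small sketch of why a single difference score no longer works: with three groups there are already three pairwise differences, and the count grows quickly with more groups. The function name and the group means are illustrative values only.

```python
from itertools import combinations


def pairwise_differences(means: dict) -> dict:
    """Every pairwise difference between group means, keyed by the pair of labels."""
    return {f"{a} - {b}": means[a] - means[b] for a, b in combinations(means, 2)}


# Illustrative group means
group_means = {"A": 12, "B": 25, "C": 27}
for pair, diff in pairwise_differences(group_means).items():
    print(pair, "=", diff)
```

With k groups there are k(k - 1)/2 such pairs, so four groups give six differences and five groups give ten.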
1 factor ANOVA
• Null hypothesis: H0: all the groups are equal
  • XA = XB = XC (the ANOVA tests this one!!)
• Alternative hypotheses: HA: not all the groups are equal (do further tests to pick between these)
  • XA ≠ XB ≠ XC
  • XA ≠ XB = XC
  • XA = XB ≠ XC
  • XA = XC ≠ XB
[Figure: distributions for three groups with means XA, XB, and XC]
1 factor ANOVA
• Planned contrasts and post-hoc tests: further tests used to rule out the different alternative hypotheses
  • XA ≠ XB ≠ XC
  • XA ≠ XB = XC
  • XA = XB ≠ XC
  • XA = XC ≠ XB
• Test 1: A ≠ B
• Test 2: A ≠ C
• Test 3: B = C
1 factor ANOVA
• Reporting your results
  • The observed differences
  • Kind of test
  • Computed F-ratio
  • Degrees of freedom for the test
  • The “p-value” of the test
  • Any post-hoc or planned comparison results
• “The mean score of Group A was 12, Group B was 25, and Group C was 27. A 1-way ANOVA was conducted and the results yielded a significant difference, F(2,25) = 5.67, p < 0.05. Post hoc tests revealed that the differences between groups A and B and A and C were statistically reliable (respectively t(1) = 5.67, p < 0.05 & t(1) = 6.02, p < 0.05). Groups B and C did not differ significantly from one another.”
Factorial ANOVAs
• We covered much of this in our experimental design lecture
• More than one factor
  • Factors may be within or between
  • Overall design may be entirely within, entirely between, or mixed
• Many F-ratios may be computed
  • An F-ratio is computed to test the main effect of each factor
  • An F-ratio is computed to test each of the potential interactions between the factors
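To get a feel for how many F-ratios a factorial design produces, the sketch below lists one effect per factor (main effects) plus one per combination of two or more factors (interactions). The function name and factor labels are hypothetical.

```python
from itertools import combinations


def effects_to_test(factors):
    """List every main effect and interaction that gets its own F-ratio."""
    effects = []
    for size in range(1, len(factors) + 1):
        for combo in combinations(factors, size):
            # size 1 = main effect; size 2 or more = interaction
            effects.append(" x ".join(combo))
    return effects


print(effects_to_test(["A", "B"]))
print(effects_to_test(["A", "B", "C"]))
```

A two-factor design yields three F-ratios (two main effects and one interaction), while a three-factor design already yields seven.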
Factorial designs
• Consider the results of our class experiment
  • ✓ Main effect of cell phone
  • ✓ Main effect of site type
  • X An interaction between cell phone and site type
[Figure: results of the class experiment; the values 0.73 and 1.19 appear in the graph]
• Resource: Dr. Kahn's reporting stats page
Factorial ANOVAs
• Reporting your results
  • The observed differences
    • Because there may be a lot of these, may present them in a table instead of directly in the text
  • Kind of design
    • e.g. “2 x 2 completely between factorial design”
  • Computed F-ratios
    • May see separate paragraphs for each factor, and for interactions
  • Degrees of freedom for the test
    • Each F-ratio will have its own set of df’s
  • The “p-value” of the test
    • May want to just say “all tests were tested with an alpha level of 0.05”
  • Any post-hoc or planned comparison results
    • Typically only the theoretically interesting comparisons are presented
Inferential Statistics
• Inferential statistics used to generalize back from the samples to the population
  • Sample A: Treatment, X = 80%
  • Sample B: No Treatment, X = 76%
• Two approaches
  • Hypothesis Testing
    • “There is a statistically significant difference between the two groups”
  • Confidence Intervals
    • “The mean difference between the two groups is 4% ± 2%”
Using Confidence intervals
• What DOES “confident” mean?
  • “90% confidence” means that 90% of the interval estimates of this sample size will include the actual population mean
  • CI: μ = (X) ± (tcrit) (diff by chance)
[Figure: ten interval estimates plotted against the actual population mean μ; 9 out of 10 intervals contain μ]
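The “9 out of 10 intervals” idea can be checked with a small simulation. The sketch makes several simplifying assumptions that are not in the slides: a normal population with a known standard deviation, so a z critical value from the standard library stands in for tcrit, plus made-up values for μ, σ, the sample size, and the random seed.

```python
import random
from statistics import NormalDist, mean


def coverage(confidence=0.90, mu=100.0, sigma=15.0, n=25, reps=2000, seed=1):
    """Fraction of simulated interval estimates that contain the true population mean."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(0.5 + confidence / 2)  # two-sided critical value
    diff_by_chance = sigma / n ** 0.5                    # standard error of the mean
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        center = mean(sample)
        if center - z_crit * diff_by_chance <= mu <= center + z_crit * diff_by_chance:
            hits += 1
    return hits / reps


print(f"Proportion of 90% intervals containing mu: {coverage():.3f}")
```

The proportion should land close to 0.90. Any single interval either contains μ or it does not, so the “90%” describes the long-run behavior of the procedure.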
Using Confidence intervals
• CI: μ = (X) ± (tcrit) (diff by chance)
• The confidence interval uses the tcrit values that identify the top and bottom tails (the upper and lower 2.5%)
• A 95% CI is like using a “two-tailed” t-test with α = 0.05
[Figure: distribution of the test statistic; the middle holds 95% of the sample means, with 2.5% in each tail]
Using Confidence intervals
• CI: μ = (X) ± (tcrit) (diff by chance)
• Note: How you compute your standard error will depend on your design
Error bars
• Two types typically
  • Standard Error (SE)
    • diff by chance
  • Confidence Intervals (CI)
    • A range of plausible estimates of the population mean
    • CI: μ = (X) ± (tcrit) (diff by chance)
• Note: Make sure that you label your graphs, and let the reader know what your error bars are