Using Statistics in Research Psych 231: Research Methods in Psychology
Announcements • I will be helping with statistical analyses of group project data during this week’s labs. • Enter data into SPSS datafile and e-mail it to me • Bring raw data in organized fashion for easy entry into SPSS • Think about what the appropriate statistical test should be IN ADVANCE of seeing me
Statistics • Why do we use them? • Descriptive statistics • Used to describe, simplify, & organize data sets • Inferential statistics • Used to test claims about the population, based on data gathered from samples • Take sampling error into account: are the results above and beyond what you’d expect by random chance?
Distributions • Recall that a variable is a characteristic that can take different values. • The distribution of a variable is a summary of all the different values of a variable • both type (each value) and token (each instance)
Distribution • Example: Distribution of scores on an exam • A frequency histogram [Figure: histogram of exam scores; vertical axis: Frequency]
Distribution • Properties of a distribution • Shape • Symmetric v. asymmetric (skew) • Unimodal v. multimodal • Center • Where most of the data in the distribution are • Spread (variability) • How similar/dissimilar are the scores in the distribution?
Distributions • A picture of the distribution is usually helpful • Gives a good sense of the properties of the distribution • Many different ways to display distribution • Graphs • Continuous variable: • histogram, line graph (frequency polygons) • Categorical variable: • pie chart, bar chart • Table • Frequency distribution table • Stem and leaf plot
Graphs for continuous variables • Histogram • Line graph
Graphs for categorical variables • Bar chart • Pie chart
Frequency distribution table • Columns: Values (types), Counts, Percentages
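A frequency distribution table can be sketched in a few lines of Python. The grades below are hypothetical, invented only for illustration; the table lists each value (type), its count, and its percentage of all scores.

```python
from collections import Counter

# Hypothetical exam grades, invented for illustration only
grades = ["A", "B", "B", "C", "B", "A", "C", "C", "B", "D"]

counts = Counter(grades)   # value (type) -> number of instances (tokens)
total = len(grades)

# One row per value: (value, count, percentage of all scores)
table = [(value, counts[value], 100 * counts[value] / total)
         for value in sorted(counts)]

print(f"{'Value':<6}{'Count':>6}{'Percent':>9}")
for value, count, percent in table:
    print(f"{value:<6}{count:>6}{percent:>8.1f}%")
```

The same counts could feed a bar chart or pie chart, since grades are a categorical variable.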
Descriptive statistics • In addition to pictures of the distribution, numerical summaries are also presented. • Numeric Descriptive Statistics • Shape: • Skew (symmetry) & Kurtosis (flatness) • Measures of Center: • Mean • Median • Mode • Measures of Variability (Spread) • Standard deviation (variance) • Range
Shape • Symmetric • Asymmetric • Positive skew (tail on the right) • Negative skew (tail on the left)
Shape • Unimodal (one mode) • Multimodal • Bimodal examples
Center • There are three main measures of center • Mean (M): the arithmetic average • Add up all of the scores and divide by the total number • Most used measure of center • Median (Mdn): the middle score in terms of location • The score that cuts off the top 50% of the scores from the bottom 50% • Good for skewed distributions (e.g. net worth) • Mode: the most frequent score • Good for nominal scales (e.g. eye color) • A must for multi-modal distributions
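As a minimal sketch of the three measures of center, the snippet below uses Python's standard `statistics` module. The net-worth figures are hypothetical and were chosen so that one extreme value skews the distribution.

```python
import statistics

# Hypothetical net worths in $1000s (invented for illustration).
# One extreme value gives the distribution a positive skew.
net_worth = [20, 25, 25, 30, 35, 40, 500]

mean = statistics.mean(net_worth)      # add up all scores, divide by how many
median = statistics.median(net_worth)  # middle score once the data are sorted
mode = statistics.mode(net_worth)      # most frequent score

print(f"mean = {mean:.1f}, median = {median}, mode = {mode}")
```

In this made-up data set the single extreme score pulls the mean well above the median, which is why the median is usually preferred for skewed distributions.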
Spread (Variability) • How similar are the scores? • Range: the maximum value - minimum value • Only takes two scores from the distribution into account • Influenced by extreme values (outliers) • Standard deviation (SD): (essentially) the average amount that the scores in the distribution deviate from the mean • Takes all of the scores into account • Also influenced by extreme values (but not as much as the range) • Variance: standard deviation squared
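The measures of spread can be sketched the same way. The scores are hypothetical, and `statistics.pstdev` is used here on the assumption that the data are treated as a whole population (dividing by N); `statistics.stdev` is the sample version, which divides by N - 1.

```python
import statistics

# Hypothetical scores (invented for illustration)
scores = [2, 4, 4, 4, 5, 5, 7, 9]

value_range = max(scores) - min(scores)   # uses only two scores
sd = statistics.pstdev(scores)            # uses every score
variance = statistics.pvariance(scores)   # standard deviation squared

# Adding one extreme value (outlier) affects both measures
with_outlier = scores + [30]
range_with_outlier = max(with_outlier) - min(with_outlier)
sd_with_outlier = statistics.pstdev(with_outlier)

print(f"range = {value_range}, SD = {sd:.2f}, variance = {variance:.2f}")
print(f"with outlier: range = {range_with_outlier}, SD = {sd_with_outlier:.2f}")
```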
Variability • Low variability • The scores are fairly similar (close to the mean) • High variability • The scores are fairly dissimilar (spread out around the mean)
Relationships between variables • Suppose that you notice that the more you study for an exam, the better your score typically is. This suggests that there is a relationship between study time and test performance. • Computation of the Correlation Coefficient (and regression) - a numerical description of the relationship between two variables • May be used for • Prediction • Validity • Reliability • Theory verification
Correlation • For the relationship between two continuous variables we use Pearson’s r (Pearson product-moment correlation) • It basically tells us how much our two variables vary together • As X goes up, what does Y typically do?
Correlation • Properties of a correlation • Form • Linear • Non-linear • Direction • Negative • Positive • Strength • Ranges from -1 to +1, 0 means no relationship
Scatterplot • Plots one variable against the other • Useful for “seeing” the relationship • Form, Direction, and Strength • Each point corresponds to a different individual • Imagine a line through the data points
Scatterplot [Figure: scatterplot of Y against X, both axes running from 1 to 6]
Form • Linear • Non-linear
Direction • Positive • As X goes up, Y goes up • X & Y vary in the same direction • positive r • Negative • As X goes up, Y goes down • X & Y vary in opposite directions • negative r
Strength • Zero means “no relationship” • The farther r is from zero, the stronger the relationship • The strength of the relationship • Spread around the line (note the axis scales) • r² is sometimes reported instead • % of variance in Y accounted for by X
Strength • r ranges from -1.0 to +1.0; the farther from zero, the stronger the relationship • r = 1.0: “perfect positive corr.”, r² = 100% • r = -1.0: “perfect negative corr.”, r² = 100% • r = 0.0: “no relationship”, r² = 0%
Strength • Rel A: r = -0.8, r² = 64% • Rel B: r = 0.5, r² = 25% • Which relationship is stronger? • Rel A, -0.8 is stronger than +0.5
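To make direction and strength concrete, here is a small sketch that computes Pearson's r by hand from its definition. The function name `pearson_r` and all three data sets are hypothetical, made up for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How much X and Y vary together (sum of products of deviations)
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # How much X and Y each vary on their own (sums of squared deviations)
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)

# Hypothetical data for six students (invented for illustration)
study_hours = [1, 2, 3, 4, 5, 6]
tv_hours = [6, 5, 5, 3, 2, 1]
exam_score = [55, 60, 70, 72, 80, 90]

r_study = pearson_r(study_hours, exam_score)  # expected to be positive
r_tv = pearson_r(tv_hours, exam_score)        # expected to be negative

for label, r in (("study vs. exam", r_study), ("TV vs. exam", r_tv)):
    print(f"{label}: r = {r:+.2f}, r squared = {r * r:.0%}")
```

Strength is judged by the absolute value of r, so comparing `abs(r_study)` with `abs(r_tv)` answers "which relationship is stronger" regardless of sign.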
Regression • Compute the equation for the line that best fits the data points • Y = (X)(slope) + (intercept) • slope = (change in Y) / (change in X) • In the plotted example: slope = 0.5, intercept = 2.0
Regression • Can make specific predictions about Y based on X • X = 5, Y = ? • Y = (X)(.5) + (2.0) • Y = (5)(.5) + (2.0) • Y = 2.5 + 2 = 4.5
Regression • Also need a measure of error • Y = X(.5) + (2.0) + error • Same line, but different relationships (strength difference) [Figure: two scatterplots with the same best-fit line but different amounts of spread around it]
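The regression slides can be sketched as an ordinary least-squares fit. The helper `fit_line` and the data points are hypothetical; the points were deliberately chosen so that the best-fitting line comes out at the slope of 0.5 and intercept of 2.0 used in the slides.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for Y = (X)(slope) + (intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    slope = sp / ss_x
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data points (invented so the fit matches the slide's line)
xs = [1, 2, 3, 4, 5, 6]
ys = [3.0, 2.0, 4.0, 4.0, 4.5, 5.0]

slope, intercept = fit_line(xs, ys)

# Prediction for X = 5, as on the slide
predicted_at_5 = slope * 5 + intercept

# Error: how far each observed Y falls from the line's prediction
errors = [y - (slope * x + intercept) for x, y in zip(xs, ys)]

print(f"Y = (X)({slope:.1f}) + ({intercept:.1f})")
print(f"prediction at X = 5: {predicted_at_5:.1f}")
print("errors:", [round(e, 2) for e in errors])
```

Two data sets can share this same line while having very different errors, which is the strength difference the last regression slide points to.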
Multiple regression • You want to look at how more than one variable may be related to Y • The regression equation gets more complex • X, Z, & W variables are used to predict Y • e.g., Y = b1X + b2Z + b3W + b0 + error
Cautions with correlation and regression • Don’t make causal claims • Don’t extrapolate • Extreme scores can strongly influence the calculated relationship
Inferential Statistics • Why? • Purpose: To make claims about populations based on data collected from samples • What’s the big deal? • Example Experiment: • Group A - gets treatment to improve memory • Group B - control, gets no treatment • After treatment period test both groups for memory • Results: Group A’s average memory score is 80%, while group B’s is 76% • Is the 4% difference a “real” difference or is it just sampling error?
Testing Hypotheses • Step 1: State your hypotheses • Null hypothesis (H0) • There are no differences (effects) • This is the hypothesis that you are testing • Alternative hypothesis(ses) • Generally, not all groups are equal • You aren’t out to prove the alternative hypothesis (although it feels like this is what you want to do) • If you reject the null hypothesis, then you’re left with support for the alternative(s) (NOT proof!)
Hypotheses • In our memory example experiment • H0: mean of Group A = mean of Group B • HA: mean of Group A ≠ mean of Group B • (Or, as a directional hypothesis: mean of Group A > mean of Group B) • Our theory is that the treatment should improve memory • That’s the alternative hypothesis. That’s NOT the one that we’ll test with inferential statistics. • Instead, we test the H0
Testing Hypotheses • Step 2: Set your decision criteria • Your alpha level will be your guide for when to reject or fail to reject the null hypothesis • Step 3: Collect your data from your sample(s) • Step 4: Compute your test statistics • Descriptive statistics (means, standard deviations, etc.) • Inferential statistics (t-tests, ANOVAs, etc.) • Step 5: Make a decision about your null hypothesis • Reject H0 • Fail to reject H0
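Steps 2 through 5 can be sketched with an independent-samples t-test computed by hand. Everything specific here is an assumption for illustration: the scores are invented so that the group means match the 80% and 76% of the memory example, the pooled-variance form of the t-test is assumed, and the critical value 2.101 is the standard two-tailed table value for alpha = .05 with 18 degrees of freedom.

```python
import math
import statistics

# Step 2: decision criterion (two-tailed alpha = .05, df = 10 + 10 - 2 = 18)
ALPHA = 0.05
T_CRITICAL = 2.101  # from a standard t table

# Step 3: hypothetical memory scores (invented; means are 80 and 76)
group_a = [82, 75, 88, 79, 84, 77, 81, 74, 85, 75]  # treatment
group_b = [78, 70, 80, 74, 79, 72, 77, 71, 83, 76]  # control

def pooled_t(a, b):
    """Independent-samples t statistic using the pooled variance."""
    n_a, n_b = len(a), len(b)
    pooled_var = ((n_a - 1) * statistics.variance(a)
                  + (n_b - 1) * statistics.variance(b)) / (n_a + n_b - 2)
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (statistics.mean(a) - statistics.mean(b)) / standard_error

# Step 4: descriptive and inferential statistics
mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
t = pooled_t(group_a, group_b)

# Step 5: decision about H0
decision = "reject H0" if abs(t) > T_CRITICAL else "fail to reject H0"
print(f"M_A = {mean_a}, M_B = {mean_b}, t(18) = {t:.2f}: {decision}")
```

With these made-up scores the 4-point difference falls just short of the critical value, so the sketch ends in "fail to reject H0"; different data with the same means could lead to the opposite decision.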
Statistical significance • “Statistically significant difference” • When you reject your null hypothesis • Essentially this means that the observed difference is above what you’d expect by chance • “Chance” is determined by estimating how much sampling error there is • Factors affecting “chance” • Sample size • Population variability
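One way to see what "above what you'd expect by chance" means is a permutation test, sketched below. This is a swapped-in technique rather than the t-test or ANOVA named in these slides: group labels are shuffled many times to estimate how often chance alone produces a difference at least as large as the observed one. The function name, the scores, and the number of shuffles are all hypothetical.

```python
import random

def permutation_p_value(group_a, group_b, n_shuffles=10_000, seed=0):
    """Share of random relabelings whose mean difference is at least as
    large (in absolute value) as the observed difference."""
    rng = random.Random(seed)
    n_a = len(group_a)
    observed = abs(sum(group_a) / n_a - sum(group_b) / len(group_b))
    pooled = list(group_a) + list(group_b)
    at_least_as_large = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)  # pretend group membership was assigned by chance
        fake_a, fake_b = pooled[:n_a], pooled[n_a:]
        diff = abs(sum(fake_a) / len(fake_a) - sum(fake_b) / len(fake_b))
        if diff >= observed:
            at_least_as_large += 1
    return at_least_as_large / n_shuffles

# Hypothetical memory scores (invented; means are 80 and 76)
group_a = [82, 75, 88, 79, 84, 77, 81, 74, 85, 75]
group_b = [78, 70, 80, 74, 79, 72, 77, 71, 83, 76]

p = permutation_p_value(group_a, group_b)
print(f"estimated p = {p:.3f}")
print("significant at alpha = .05" if p < 0.05 else "not significant at alpha = .05")
```

Smaller samples or more variable scores make large chance differences more common, which raises the estimated p for the same observed difference.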
Sampling error • Sampling error = population mean - sample mean • [Figure: population distribution with one sampled score, N = 1]
Sampling error • [Figure: population distribution with two sampled scores and their sample mean, N = 2]
Sampling error • [Figure: population distribution with ten sampled scores and their sample mean, N = 10] • Generally, as the sample size increases, the sampling error decreases
Sampling error • Small population variability vs. large population variability • Typically the narrower the population distribution, the narrower the range of possible samples, and the smaller the “chance”
Sampling distribution • The sampling distribution is a distribution of all possible sample means of a particular sample size (n) that can be drawn from the population • The average sampling error across the sample means (e.g., XA, XB, XC, XD) is our estimate of “chance”
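A short simulation can illustrate how sampling error shrinks as sample size grows. The population below is hypothetical (10,000 normally distributed scores with mean 100 and SD 15, chosen arbitrarily), and `average_sampling_error` is an illustrative helper, not a standard function.

```python
import random
import statistics

rng = random.Random(42)

# Hypothetical population (invented): 10,000 scores, mean near 100, SD near 15
population = [rng.gauss(100, 15) for _ in range(10_000)]
pop_mean = statistics.fmean(population)

def average_sampling_error(n, n_samples=2_000):
    """Average |population mean - sample mean| over many samples of size n."""
    errors = []
    for _ in range(n_samples):
        sample = rng.sample(population, n)
        errors.append(abs(pop_mean - statistics.fmean(sample)))
    return statistics.fmean(errors)

for n in (1, 2, 10, 100):
    print(f"N = {n:>3}: average sampling error = {average_sampling_error(n):.2f}")
```

Replacing the SD of 15 with a smaller value narrows the population distribution and shrinks the errors at every sample size, matching the population-variability slide.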
Error types • Based on the outcomes of the statistical tests researchers will either: • Reject the null hypothesis • Fail to reject the null hypothesis • This could be the correct conclusion or the incorrect conclusion • Two ways to go wrong • Type I error: saying that there is a difference when there really isn’t one • Type II error: saying that there is not a difference when there really is one
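Error types • Real world (‘truth’): H0 is correct or H0 is wrong • Experimenter’s conclusions: Reject H0 or Fail to Reject H0 • Reject H0 when H0 is correct: Type I error • Fail to Reject H0 when H0 is wrong: Type II error • The other two combinations are correct conclusions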
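Error types: Courtroom analogy • Real world (‘truth’): Defendant is innocent or Defendant is guilty • Jury’s decision: Find guilty or Find not guilty • Find guilty when the defendant is innocent: Type I error • Find not guilty when the defendant is guilty: Type II error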
Error types • Type I error: concluding that there is an effect (a difference between groups) when there really isn’t • The probability of making this error is alpha, sometimes called the “significance level” • We try to minimize this (keep it low) • Pick a low level of alpha • Psychology: 0.05 and 0.01 most common • Type II error: concluding that there isn’t an effect, when there really is • Related to the Statistical Power of a test • How likely are you to detect a difference if it is there?
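Both error rates can be estimated by simulating many experiments. The sketch below assumes a two-sample z-test with a known population SD (a simplification, not the t-test or ANOVA used in the course), and every number in it (means, SD, group size, effect size) is invented for illustration.

```python
import random
from statistics import NormalDist, fmean

def rejection_rate(true_diff, alpha=0.05, n=25, sigma=15,
                   n_experiments=4_000, seed=1):
    """Share of simulated experiments in which H0 is rejected.

    true_diff = 0 means H0 is really true, so every rejection is a Type I
    error. true_diff > 0 means H0 is really false, so the rejection rate
    is the power of the test.
    """
    rng = random.Random(seed)
    z_critical = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed cutoff
    standard_error = sigma * (2 / n) ** 0.5           # SE of the mean difference
    rejections = 0
    for _ in range(n_experiments):
        group_a = [rng.gauss(100 + true_diff, sigma) for _ in range(n)]
        group_b = [rng.gauss(100, sigma) for _ in range(n)]
        z = (fmean(group_a) - fmean(group_b)) / standard_error
        if abs(z) > z_critical:
            rejections += 1
    return rejections / n_experiments

type_i_rate = rejection_rate(true_diff=0)   # should land near alpha
power = rejection_rate(true_diff=10)        # chance of detecting a real effect
type_ii_rate = 1 - power                    # chance of missing a real effect

print(f"Type I error rate: {type_i_rate:.3f}")
print(f"Power: {power:.3f}, Type II error rate: {type_ii_rate:.3f}")
```

Lowering `alpha` reduces the Type I rate but also lowers power; raising `n` or `true_diff` raises power, which is the trade-off behind the advice to check sample size and effect size after a non-significant result.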
Significance • “A statistically significant difference” means: • the researcher is concluding that there is a difference above and beyond chance • with the probability of making a type I error at 5% (assuming an alpha level = 0.05) • Note “statistical significance” is not the same thing as theoretical significance. • Only means that there is a statistical difference • Doesn’t mean that it is an important difference
Non-Significance • Failing to reject the null hypothesis • Generally, we are not interested in “accepting the null hypothesis” (remember, we can’t prove things, only disprove them) • Usually check to see if you made a Type II error (failed to detect a difference that is really there) • Check the statistical power of your test • Is the sample size too small? • Are the effects that you’re looking for really small? • Check your controls; maybe there is too much variability
Inferential Statistical Tests • Different statistical tests • “Generic test” • T-test • Analysis of Variance (ANOVA)
“Generic” statistical test • Tests the question: • Are there differences between groups (sample means XA and XB) due to a treatment? • H0 is true: no treatment effect • H0 is false: there is a treatment effect
“Generic” statistical test • Why might the samples (XA and XB) be different? (What is the source of the variability between groups?) • ER: Random sampling error • ID: Individual differences (if between-subjects factor) • TR: The effect of a treatment