E N D
1. Psychology 242, Dr. McKirnan Statistics: an introduction
2. Psychology 242, Dr. McKirnan Research questions, hypotheses & designs
3. Psychology 242, Dr. McKirnan Statistics introduction 1 Why Use Numbers in Science? Comparability of research outcomes
across groups or conditions, within study
across studies
Statistical operations
effect sizes
statistical comparisons & probability judgments
4. Psychology 242, Dr. McKirnan Statistics introduction 1 Costs and & benefits of numbers Virtues of numerical data:
Representing data by numbers helps clarify and simplify communication
Can show general trend or “common denominator” in the data
Danger of numerical data:
Can over-simplify / ignore individual differences
Liable to misrepresentation or manipulation, as per Broder’s article…
5. Psychology 242, Dr. McKirnan Statistics introduction 1 The danger of reducing complex social processes to simple numbers:
6. Psychology 242, Dr. McKirnan Statistics introduction 1 Political (mis)uses of scientific data
7. Psychology 242, Dr. McKirnan Statistics introduction 1 Thus… Numerical representations of nature are important to scientific progress
Key for operational definitions
Provide powerful analytic & communication tools
Understanding ANY statistical representation requires understanding how it was derived
Understanding must involve the context and origin of statistical information ? see examples below.
Statistical statements can be literally / technically “true” but misleading.
Political (mis)use of numerical / statistical information is common and problematic.
8. Psychology 242, Dr. McKirnan Statistics introduction 1 Some basic terms:
9. Psychology 242, Dr. McKirnan Week 3; Experimental designs Research questions, hypotheses & designs
10. Psychology 242, Dr. McKirnan Statistics introduction 1 Interval
Arbitrary or relative ? scales
Common in behavioral research, e.g., attitude or rating scales.
Ordinal
Rank order with non-equal intervals
Simple finish place, rank in organization, most, 2nd most, 3rd most...
Categorical
Categories only
Typical of inherent categories: ethnic group, gender, zip code Types of numerical scales
11. Psychology 242, Dr. McKirnan Statistics introduction 1 Scale details Ratio Scales, e.g., batting average
Each scale value describes a physical reality
% hits / at bats
The zero point is grounded in a physical property
0 hits is meaningful
Each scale point is absolute
32o has an absolute meaning, not
just relative to other temperatures.
Each interval is continuous & exactly equal:
.100 ? .150 = .200 ? .250
Used for correlational designs, or as dependent variable in experiments.
12. Interval scales Psychology 242, Dr. McKirnan Statistics introduction 1
13. Interval scales Psychology 242, Dr. McKirnan Statistics introduction 1
14. Interval scales Psychology 242, Dr. McKirnan Statistics introduction 1
15. Psychology 242, Dr. McKirnan Statistics introduction 1 Two central ways of using numbers. Descriptive Statistics:
Simple quantitative description or summary.
Batting average in baseball
Grade-point average
Univariate Analysis: examines cases in terms of a single variable.
Frequency distributions group data into categories.
E.g., drug use by Age categories, Ethnic groups…
Inferential Statistics:
Conduct analyses on samples
Compare groups (experimental v. control…)
Characterize a sample in epidemiological research
Use statistical operations to generalize the results to a population.
16. What is a Frequency Distribution? Psychology 242, Dr. McKirnan Statistics introduction 1
17. Psychology 242, Dr. McKirnan Statistics introduction 1 How angry are you at the crooks who crashed the economy?
18. Descriptive statistics example Psychology 242, Dr. McKirnan Statistics introduction 1
19. We can “block” frequencies by, e.g., gender Psychology 242, Dr. McKirnan Statistics introduction 1
20. Psychology 242, Dr. McKirnan Research questions, hypotheses & designs
21. Central tendency Psychology 242, Dr. McKirnan Statistics introduction 1
22. What is Central Tendency (Mean or average…) Psychology 242, Dr. McKirnan Statistics introduction 1
23. Psychology 242, Dr. McKirnan Statistics introduction 1 Describing data 1. Central tendency or general “drift” of the scores.
Mode ? most common score
Median ? middle of the distribution
Mean ? average score
2. Variance: how diverse the scores are (how much vary from each other).
Range ? …from the highest to lowest score
Standard ? “average” amount the scores vary
deviation from the Mean score
24. Psychology 242, Dr. McKirnan Statistics introduction 1 The Mode Example: scores = 15, 20, 21, 20, 36,15, 25,15,12
Show scores as a frequency distribution
25. Psychology 242, Dr. McKirnan Statistics introduction 1 Median Mid-point of a distribution of scores: half are above, half are below.
List scores in numerical order (interval or ratio scale)
Locate the score in the center of the sample.
First line up the scores: 12,15,15,15,20,20,21,25,36
The middle (5th out of 9) score = 20.
If there are an even number of scores, Median = mean of the two middle scores
26. Psychology 242, Dr. McKirnan Statistics introduction 1 Mean (M) Most common measure of central tendency
Total all scores: 12+15+20+21+20+36+15+25+15 = 179
Divide by “n” of scores: 179 / 9 = 19.9 (round to 20).
Characteristics:
Good for Ratio or interval scales
Sensitive to all observed values
Highly stable; with larger n is insensitive to subtle changes in values
Can be highly sensitive to extreme values (particularly in smaller samples).
27. The Mean: Statistical notation Psychology 242, Dr. McKirnan Statistics introduction 1
28. Psychology 242, Dr. McKirnan Statistics introduction 1 Central tendency: Normal Distributions For a normal distribution the mean, mode, and median are all same -- the center of the distribution
29. Psychology 242, Dr. McKirnan Statistics introduction 1 The distribution of student ages Age is a good example of a variable that is normally distributed
30. Psychology 242, Dr. McKirnan Statistics introduction 1 Central tendency: Bimodal Distributions A bimodal distribution has two modes, at the outsides of the distribution.
31. Psychology 242, Dr. McKirnan Statistics introduction 1 Central tendency: Skewed Distributions A skewed distribution has extreme scores in one direction.
32. Psychology 242, Dr. McKirnan Statistics introduction 1 Central tendency; normal distribution
33. Local examples of distributions, 1 Psychology 242, Dr. McKirnan Statistics introduction 1
34. Psychology 242, Dr. McKirnan Statistics introduction 1
35. Local examples of distributions, 2b Psychology 242, Dr. McKirnan Statistics introduction 1
36. Psychology 242, Dr. McKirnan Statistics introduction 1 Positive skew example
37. Example of a strong positive skew: American incomes. Psychology 242, Dr. McKirnan Statistics introduction 1
38. American income distribution (2000) Psychology 242, Dr. McKirnan Statistics introduction 1
39. Psychology 242, Dr. McKirnan Statistics introduction 1 Deceptiveness of the M in skewed data:Example from Bush tax cut data Did all Americans benefit equally from the Bush era tax cuts?
The M annual tax cut for all tax payers is $1629 per year from 2001 to 2010... a pretty good number.
40. Psychology 242, Dr. McKirnan Statistics introduction 1 Deceptive Means in skewed data The M for tax cuts is pulled upward by a small number of very high values
41. Psychology 242, Dr. McKirnan Statistics introduction 1 Tax cuts by income group
42. Psychology 242, Dr. McKirnan Statistics introduction 1 Mean v. Median in skewed data The difference between the Mode, Median & Mean in skewed data can be crucial: Example from men’s v. women’s number of sex partners.
43. Psychology 242, Dr. McKirnan Statistics introduction 1 Mean v. Median in skewed data Expressing the data as Ms greatly exaggerates differences between men & women.
44. Psychology 242, Dr. McKirnan Statistics introduction 1 Bottom line: Most natural processes or variables show a (more or less) normal distribution…
Age, Height, IQ, most attitude scales…
When a distribution is normal the Mode = Mean = Median, all at the center of the distribution.
When a distribution is bimodal or skewed the M can be deceptive.
Typically:
Mode
Median
Mean
45. Psychology 242, Dr. McKirnan Research questions, hypotheses & designs
46. Psychology 242, Dr. McKirnan Statistics introduction 1 Range 1. The Range of the highest to the lowest score.
Ages of males: 18, 25, 20, 21, 20, 23, 24, 26,18, 25, 20, 19, 19.
Ages of women: 26, 27, 27, 31, 32, 28, 31, 29, 30, 27, 26, 37, 28
47. Psychology 242, Dr. McKirnan Statistics introduction 1 Standard deviation Similar to the “average” amount each score deviates from the M of the sample.
“Standardizes” scores to a normal curve, allowing basic statistics to be used.
More accurate & detailed than range:
A few extremely high or low scores (“outliers”) may make the range inaccurate
S assesses the deviation of all scores in the sample from the mean
48. Standard deviation Psychology 242, Dr. McKirnan Statistics introduction 1
49. Psychology 242, Dr. McKirnan Statistics introduction 1 Standard Deviations; Deviations of scores from the M
50. Psychology 242, Dr. McKirnan Statistics introduction 1 Standard Deviation & Formulas
51. Psychology 242, Dr. McKirnan Statistics introduction 1 Degrees of freedom
52. Psychology 242, Dr. McKirnan Statistics introduction 1 Standard Deviation & Formulas
53. Psychology 242, Dr. McKirnan Statistics introduction 1 Standard Deviation & Formulas
54. Studying Psychology 242 Psychology 242, Dr. McKirnan Statistics introduction 1
55. Psychology 242, Dr. McKirnan Statistics introduction 1 Using Standard Deviations
56. Psychology 242, Dr. McKirnan Statistics introduction 1 Calculating the standard deviation; high variance
57. Psychology 242, Dr. McKirnan Statistics introduction 1 Scores with less variance
58. Psychology 242, Dr. McKirnan Statistics introduction 1 Calculating the standard deviation; lower variance
59. Differing variances Psychology 242, Dr. McKirnan Statistics introduction 1
60. Summary Psychology 242, Dr. McKirnan Statistics introduction 1
61. Psychology 242, Dr. McKirnan Research questions, hypotheses & designs
62. Psychology 242, Dr. McKirnan Statistics introduction 1 Z scores: How do we characterize one participant’s score on a scale? How do we combine these into a single metric (mathematical description) to characterize a score?
Z score:
63. Psychology 242, Dr. McKirnan Statistics introduction 1 Standard Deviation & Formulas
64. Psychology 242, Dr. McKirnan Statistics introduction 1 Introduction to normal distribution The normal distribution is a hypothetical distribution of cases in a sample
It is segmented into standard deviation units.
Each standard deviation unit (Z) has a fixed % of cases
We use Z scores & associated % of the normal distribution to make statistical decisions about whether a score might occur by chance.
65. Standard deviations & distributions, 1 Psychology 242, Dr. McKirnan Statistics introduction 1
66. Standard deviations & distributions, 2 Psychology 242, Dr. McKirnan Statistics introduction 1
67. Standard deviations & distributions, 3 Psychology 242, Dr. McKirnan Statistics introduction 1
68. Standard deviations & distributions, 4 Psychology 242, Dr. McKirnan Statistics introduction 1
69. Standard deviations & distributions, 4 Psychology 242, Dr. McKirnan Statistics introduction 1
70. Psychology 242, Dr. McKirnan Statistics introduction 1 Z scores Z describes how far a score is above or below the M in standard deviation units rather than raw scores.
“Adjusts” the score to be independent of the original scale.
Rather than, e.g., inches, IQ points, attitude scores the scale is in terms of universal standard deviation units.
Z allows us to use the general properties of the normal distribution to determine how much of the curve a score is above / below.
71. (Hypothetical) Sampling Distribution Psychology 242, Dr. McKirnan Statistics introduction 1
72. Psychology 242, Dr. McKirnan Statistics introduction 1 The Normal Distribution
73. Psychology 242, Dr. McKirnan Statistics introduction 1
74. Psychology 242, Dr. McKirnan Statistics introduction 1 Standard deviations and distributions
75. Psychology 242, Dr. McKirnan Statistics introduction 1 Standard deviations and distributions
76. Raw scores ? Z scores ? % cases Psychology 242, Dr. McKirnan Statistics introduction 1
77. Psychology 242, Dr. McKirnan Statistics introduction 1
78. Psychology 242, Dr. McKirnan Research questions, hypotheses & designs
79. Psychology 242, Dr. McKirnan Statistics introduction 1 Normal distribution; Z scores To use the normal distribution to calculate how ‘good’ a score is…
Calculate how far the score (X) is from the mean (M); X–M.
“Adjust” X–M by how much variance there is in the sample; use the standard deviation (S).
Z = X–M / S
80. Psychology 242, Dr. McKirnan Statistics introduction 1 Z scores: areas under the normal curve, 2
81. Psychology 242, Dr. McKirnan Statistics introduction 1 Z scores: areas under the normal curve, 2
82. Psychology 242, Dr. McKirnan Statistics introduction 1 Z scores: areas under the normal curve, 2
83. Psychology 242, Dr. McKirnan Statistics introduction 1 Z scores: areas under the normal curve, 2
84. Psychology 242, Dr. McKirnan Statistics introduction 1 Z scores: areas under the normal curve, 2
85. Psychology 242, Dr. McKirnan Statistics introduction 1 Using the Normal Distribution How good is a score of ‘6' in the group described in
Table 1?
Table 2?
86. Psychology 242, Dr. McKirnan Statistics introduction 1 Using the normal distribution, 2
87. Psychology 242, Dr. McKirnan Statistics introduction 1 Comparing Scores: deviation x Variance
88. Psychology 242, Dr. McKirnan Statistics introduction 1 Normal distribution; high variance
89. Psychology 242, Dr. McKirnan Statistics introduction 1 Normal distribution; low variance
90. Psychology 242, Dr. McKirnan Statistics introduction 1 Evaluating scores using Z
91. Psychology 242, Dr. McKirnan Statistics introduction 1 Summary: evaluating individual scores A. The distance of the score from the M.
In both groups ‘6’ is two units > the M (X = 6, M = 4).
B. The variance in the rest of the sample
One group has low variance and one has higher.
C. Criterion for “significantly good” score
What % of the sample must the score be higher than…
92. Psychology 242, Dr. McKirnan Statistics introduction 1 Z / “standard” scores Z scores (or standard deviation units) standardize scores by putting them on a common scale.
In our example the target score and M scores are the same, but come from samples with different variances.
We compare the target scores by translating them into Zs, which take into account variance.
Any scores can be translated into Z scores for comparison…
93. Psychology 242, Dr. McKirnan Statistics introduction 1 Cannot directly compare scores because they are on different scales.
Change scale to common metric: Z scores (% of larger distribution of scores on that scale).
Z scores can be compared, since they are standardized by being relative to the larger population of scores.
94. Psychology 242, Dr. McKirnan Statistics introduction 1 Comparing Zs
95. Psychology 242, Dr. McKirnan Research questions, hypotheses & designs
96. Psychology 242, Dr. McKirnan Statistics introduction 1 Using statistics to test hypotheses:
97. Psychology 242, Dr. McKirnan Statistics introduction 1 Probabilities & Statistical Hypothesis Testing Using the Normal Distribution:
More extreme scores have a lower probability of occurring by chance alone
Z (# of standard deviation units from M) = the % of cases above or below the observed score
A high Z score may be “extreme” enough for us to reject the null hypothesis
98. Psychology 242, Dr. McKirnan Statistics introduction 1 “Statistical significance” We assume a score with less than 5% probability of occurring has not occurred by chance alone (i.e., higher or lower than 95% of the other scores… p < .05)
Z > +1.98 occurs < 95% of the time (p <.05).
If Z > 1.98 we consider the score to be “significantly” different from the mean
To test if an effect is “statistically significant”
Compute a Z score for the effect
Compare it to the critical value for p<.05; + 1.98
99. Psychology 242, Dr. McKirnan Statistics introduction 1
100. Psychology 242, Dr. McKirnan Statistics introduction 1 Evaluating Research Questions Data Statistical Question
101. Psychology 242, Dr. McKirnan Statistics introduction 1 Critical ratio Critical ratio =
102. Psychology 242, Dr. McKirnan Statistics introduction 1 Critical Ratios and distributions We use Mean(s) and Standard Deviation(s) to compute a critical ratio for a given research result.
103. Psychology 242, Dr. McKirnan Statistics introduction 1 Examples of Critical Ratios
104. Psychology 242, Dr. McKirnan Statistics introduction 1 Numbers are important for representing “reality” in science (and other fields).
Different measures of central tendency are useful & accurate for different data;
Mean is the most common.
Median useful for skewed data
Mode useful for simple categorical data
Variance (around the mean) is key to characterizing a set of numbers.
We understand a set of scores in terms of the:
Central tendency – the average or Mean score
The amount of variance in the scores, typically the Standard Deviation.
105. Psychology 242, Dr. McKirnan Statistics introduction 1 Z is the prototype critical ratio: