Introduction to Statistics for the Social Sciences: SBS200, COMM200, GEOG200, PA200, POL200, or SOC200. Lecture Section 001, Spring 2014. Room 120, Integrated Learning Center (ILC). 10:00 - 10:50, Mondays, Wednesdays & Fridays. Welcome. http://www.youtube.com/watch?v=oSQJP40PcGI
Please click in. My last name starts with a letter somewhere between:
A. A – D
B. E – L
C. M – R
D. S – Z
Use this as your study guide. By the end of lecture today (2/10/14):
• Characteristics of a distribution: Central Tendency, Dispersion, Shape
• What are the three primary types of “measures of central tendency”? Mean, Median, Mode
• Measures of variability: Range, Standard deviation and Variance
• Memorizing the four definitional formulae
Schedule of readings
Exam Review: Tuesday, 7:00 – 9:00pm, Room TBA. Study Guide is online.
Before next exam (February 14th):
• Please read chapters 1 - 4 in Ha & Ha textbook
• Please read Appendix D, E & F online (on the syllabus these are referred to as online readings 1, 2 & 3)
• Please read Chapters 1, 5, 6 and 13 in Plous. Chapter 1: Selective Perception. Chapter 5: Plasticity. Chapter 6: Effects of Question Wording and Framing. Chapter 13: Anchoring and Adjustment.
Lab sessions: Labs continue this week. Exam Review: this week.
Homework due Wednesday (February 12th). On class website: please print and complete homework worksheets #6 & #7.
Overview: Frequency distributions; the normal curve. Challenge yourself as we work through characteristics of distributions to try to categorize each concept as a measure of 1) central tendency, 2) dispersion or 3) shape:
• Mean, Median, Mode, Trimmed Mean
• Standard deviation, Variance, Range, Mean Absolute Deviation
• Skewed right, skewed left, unimodal, bimodal, symmetric
Another example: How many kids in your family? Number of kids in family: 1, 4, 3, 2, 1, 8, 4, 2, 2, 14
Measures of Central Tendency (Measures of location): The mean, median and mode
Measures of “location”: where on the number line the scores tend to cluster.
Mean: The balance point of a distribution. Found by adding up all observations and then dividing by the number of observations.
Mean for a sample: x̄ = Σx / n
Mean for a population: µ (mu) = ΣX / N
Note: Σ = add up; x or X = scores; n or N = number of scores
Measures of Central Tendency: the mean, worked example
Number of kids in family: 1, 4, 3, 2, 1, 8, 4, 2, 2, 14
Mean for a sample: x̄ = Σx / n = 41 / 10 = 4.1
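As a quick check of the arithmetic on this slide, the sample mean can be computed directly. This is only a minimal Python sketch; the variable names (kids, total, n, sample_mean) are illustrative and are not part of the lecture.

```python
# Number of kids in family, from the slide (10 observations)
kids = [1, 4, 3, 2, 1, 8, 4, 2, 2, 14]

total = sum(kids)        # Σx: add up all observations
n = len(kids)            # n: number of observations
sample_mean = total / n  # x̄ = Σx / n

print(total, n, sample_mean)  # 41 10 4.1
```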
How many kids are in your family? What is the most common family size?
Number of kids in family: 1, 4, 3, 2, 1, 8, 4, 2, 2, 14
Median: The middle value when observations are ordered from least to most (or most to least).
Ordered: 1, 1, 2, 2, 2, 3, 4, 4, 8, 14
If there appear to be two medians, take the mean of the two: (2 + 3) / 2 = 2.5
The median always has a percentile rank of 50%, regardless of the shape of the distribution.
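The median rule on this slide (order the observations, take the middle one, and average the two middle values when there are two) might be sketched in Python as follows. The function name median is an illustrative choice, not something from the lecture.

```python
def median(values):
    """Middle value of the ordered observations.

    With an even number of observations there are two middle values,
    so the mean of the two is returned.
    """
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


kids = [1, 4, 3, 2, 1, 8, 4, 2, 2, 14]
print(sorted(kids))  # [1, 1, 2, 2, 2, 3, 4, 4, 8, 14]
print(median(kids))  # 2.5
```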
Mode: The value of the most frequent observation
Number of kids in family (ordered): 1, 1, 2, 2, 2, 3, 4, 4, 8, 14
Score   f
1       2
2       3
3       1
4       2
5       0
6       0
7       0
8       1
9       0
10      0
11      0
12      0
13      0
14      1
Please note: The mode is “2” because it is the most frequently occurring score. It occurs “3” times. “3” is not the mode; it is just the frequency for the value that is the mode.
Bimodal distribution: If there are two most frequent observations
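The difference between the mode (a score) and its frequency (a count) can be made concrete with a small frequency count. This is a sketch using Python's standard collections.Counter; the names freq, highest_f and modes are illustrative.

```python
from collections import Counter

kids = [1, 4, 3, 2, 1, 8, 4, 2, 2, 14]

freq = Counter(kids)            # maps each score to its frequency f
highest_f = max(freq.values())  # the largest frequency in the table

# Every score that reaches the largest frequency is a mode;
# two such scores would indicate a bimodal distribution.
modes = sorted(score for score, f in freq.items() if f == highest_f)

print(modes)      # [2]  -> the mode is the score 2
print(highest_f)  # 3    -> 3 is only how often the mode occurs
```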
What about central tendency for qualitative data? Mode is good for nominal or ordinal data. Median can be used with ordinal data. Mean can be used with interval or ratio data.
A little more about frequency distributions: an example of a normal distribution
Measure of central tendency: describes how scores tend to cluster toward the center of the distribution. Normal distribution. In all distributions: mode = tallest point, median = middle score, mean = balance point. In a normal distribution: mode = mean = median.
Measure of central tendency: describes how scores tend to cluster toward the center of the distribution. Positively skewed distribution. In all distributions: mode = tallest point, median = middle score, mean = balance point. In a positively skewed distribution: mode < median < mean. Note: the mean is most affected by outliers or skewed distributions.
Measure of central tendency: describes how scores tend to cluster toward the center of the distribution. Negatively skewed distribution. In all distributions: mode = tallest point, median = middle score, mean = balance point. In a negatively skewed distribution: mean < median < mode. Note: the mean is most affected by outliers or skewed distributions.
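The earlier "number of kids in family" data can illustrate the note about outliers. The sketch below uses Python's standard statistics module; treating the family of 14 as the outlier and dropping it is an assumption made here for illustration, not a step from the lecture.

```python
from statistics import mean, median, mode

kids = [1, 4, 3, 2, 1, 8, 4, 2, 2, 14]

# The ordering expected for a positively skewed distribution
print(mode(kids) < median(kids) < mean(kids))  # True

# Drop the largest family (14 kids) and compare again
without_outlier = [k for k in kids if k != 14]
print(mean(kids), mean(without_outlier))      # the mean moves the most
print(median(kids), median(without_outlier))  # the median moves less
```

With the family of 14 included the mean is 4.1 and the median is 2.5. Without it the mean falls to 3 and the median to 2, so the single extreme score shifts the mean by 1.1 but the median by only 0.5.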
Mode: The value of the most frequent observation. Bimodal distribution: Distribution with two most frequent observations (2 peaks). Example: Ian coaches two boys’ baseball teams. One team is made up of 10-year-olds and the other is made up of 16-year-olds. When he measured the height of all of his players, he found a bimodal distribution.
Dispersion: Variability. Some distributions are more variable than others. The larger the variability, the wider the curve tends to be. The smaller the variability, the narrower the curve tends to be.
[Figure: three distributions, A, B and C, each drawn on a height axis running from 5’ to 7’]
Range: The difference between the largest and smallest observations. Range for distribution A? Range for distribution B? Range for distribution C?
Wildcats Basketball team: Tallest player = 84” (same as 7’0”) (Kaleb Tarczewski). Shortest player = 71” (same as 5’11”) (Jacob Hazzard). Fun fact: Mean is 78”.
Range: The difference between the largest and smallest scores. xmax - xmin = Range. 84” – 71” = 13”. Range is 13”.
Wildcats Baseball team: Tallest player = 81” (same as 6’9”) (Augey Bill). Shortest player = 70” (same as 5’10”) (Johnny Field). Fun fact: Mean is 72”.
Range: The difference between the largest and smallest scores. xmax - xmin = Range. 81” – 70” = 11”. Range is 11”.
Please note: No reference is made to numbers between the min and max.
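The two range calculations, and the point that values between the minimum and maximum play no part, can be checked with a short Python sketch. Only the tallest and shortest heights are given on the slides, so the lists below hold just those two values; the function name value_range and the extra 75-inch height are hypothetical.

```python
def value_range(values):
    """Range = x_max - x_min; everything between the extremes is ignored."""
    return max(values) - min(values)


# Tallest and shortest players, in inches, as stated on the slides
basketball = [84, 71]
baseball = [81, 70]

print(value_range(basketball))  # 13
print(value_range(baseball))    # 11

# A hypothetical 75-inch player between the extremes leaves the range unchanged
print(value_range(baseball + [75]))  # 11
```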
Variability. Some distributions are more variable than others. Let’s say this is our distribution of heights of men on the U of A baseball team. Mean is 6 feet tall.
[Figure: three curves on a height axis running from 5’ to 7’; two of them are labeled “What might this be?”]
Variability. The larger the variability, the wider the curve and the larger the deviation scores tend to be. The smaller the variability, the narrower the curve and the smaller the deviation scores tend to be.
[Figure: three curves on a height axis running from 5’ to 7’]
Variability. Standard deviation: The average amount by which observations deviate on either side of their mean. Generally (on average), how far away is each score from the mean? Mean is 6’.
Let’s build it up again… U of A Baseball team, deviation scores (mean is 6’0”; height axis: 5’8”, 5’10”, 6’0”, 6’2”, 6’4”)
Diallo is 6’0”. Diallo’s deviation score is 0”: 6’0” – 6’0” = 0”.
Preston is 6’2”. Preston’s deviation score is 2”: 6’2” – 6’0” = 2”.
Mike is 5’8”. Mike’s deviation score is -4”: 5’8” – 6’0” = -4”.
Hunter is 5’10”. Hunter’s deviation score is -2”: 5’10” – 6’0” = -2”.
Shea is 6’4”. Shea’s deviation score is 4”: 6’4” – 6’0” = 4”.
David is 6’0”. David’s deviation score is 0”: 6’0” – 6’0” = 0”.
Deviation scores so far: Diallo is 0”, Preston is 2”, Mike is -4”, Hunter is -2”, Shea is 4”, David is 0”.
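The deviation scores built up on these slides can be reproduced by subtracting the mean from each height. This is a minimal Python sketch with heights converted to inches (6’0” = 72”); the dictionary layout and variable names are illustrative choices.

```python
# Heights in inches for the six players named on the slides
heights = {
    "Diallo": 72,   # 6'0"
    "Preston": 74,  # 6'2"
    "Mike": 68,     # 5'8"
    "Hunter": 70,   # 5'10"
    "Shea": 76,     # 6'4"
    "David": 72,    # 6'0"
}

mean_height = 72  # the team mean of 6'0" stated in the lecture

# Deviation score = observation minus the mean
deviations = {name: h - mean_height for name, h in heights.items()}

print(deviations)
# {'Diallo': 0, 'Preston': 2, 'Mike': -4, 'Hunter': -2, 'Shea': 4, 'David': 0}
```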
Deviation scores. Standard deviation: The average amount by which observations deviate on either side of their mean.
Diallo is 0”, Preston is 2”, Mike is -4”, Hunter is -2”, Shea is 4”, David is 0”.
5’8” - 6’0” = -4”
5’9” - 6’0” = -3”
5’10” - 6’0” = -2”
5’11” - 6’0” = -1”
6’0” - 6’0” = 0”
6’1” - 6’0” = +1”
6’2” - 6’0” = +2”
6’3” - 6’0” = +3”
6’4” - 6’0” = +4”
How do we find the average height? Σx / N = average height
How do we find the average spread? Σ(x - µ) / N = average deviation
Σ(x - µ) = ? The negative and positive deviations cancel: Σ(x - µ) = 0 for a population, and Σ(x - x̄) = 0 for a sample.
Deviation scores. Standard deviation: The average amount by which observations deviate on either side of their mean.
Big problem: Σ(x - µ) = 0 and Σ(x - x̄) = 0, so Σ(x - µ) / N always comes out to 0.
Square the deviations: Σ(x - x̄)² = ?
Σ(x - µ)², and then Σ(x - µ)² / N
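The "big problem" and its fix can be seen numerically using the nine heights listed on the slide (5’8” through 6’4”). The sketch below treats those nine values as a complete population, which is an assumption made here for illustration; variable names are likewise illustrative.

```python
# The nine heights from the slide, in inches: 5'8" (68) through 6'4" (76)
heights = [68, 69, 70, 71, 72, 73, 74, 75, 76]

N = len(heights)
mu = sum(heights) / N  # 72.0, i.e. 6'0"

# Deviation scores: negatives and positives cancel out
deviations = [x - mu for x in heights]
print(sum(deviations))  # 0.0 -> the "average deviation" is useless

# Squaring removes the signs, so the sum no longer cancels
squared = [d ** 2 for d in deviations]
print(sum(squared))      # 60.0 -> Σ(x - µ)²
print(sum(squared) / N)  # Σ(x - µ)² / N, the average squared deviation
```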
Standard deviation: The average amount scores deviate on either side of their mean. Mean: The average value in the data. Mean is a measure of typical “value” (where the typical scores are positioned on the number line). Standard deviation is typical “spread” (typical size of deviations or distance from mean); it can never be negative.
Standard deviation: The average amount by which observations deviate on either side of their mean. These would be helpful to know by heart. Please memorize these formulae.
Standard deviation: The average amount by which observations deviate on either side of their mean. What do these two formulae have in common? n - 1 is “Degrees of Freedom”. More next lecture.
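The formulae themselves did not survive in this transcript. Assuming they are the standard definitional formulae for variance and standard deviation (dividing by N for a population and by n - 1 for a sample), a Python sketch might look like the following. The function names are illustrative, and the nine heights from the deviation-score slides are reused as example data.

```python
import math


def population_variance(xs):
    """σ² = Σ(x - µ)² / N"""
    mu = sum(xs) / len(xs)
    return sum((x - mu) ** 2 for x in xs) / len(xs)


def population_sd(xs):
    """σ = square root of the population variance"""
    return math.sqrt(population_variance(xs))


def sample_variance(xs):
    """s² = Σ(x - x̄)² / (n - 1), where n - 1 is the degrees of freedom"""
    x_bar = sum(xs) / len(xs)
    return sum((x - x_bar) ** 2 for x in xs) / (len(xs) - 1)


def sample_sd(xs):
    """s = square root of the sample variance"""
    return math.sqrt(sample_variance(xs))


heights = [68, 69, 70, 71, 72, 73, 74, 75, 76]  # inches
print(population_variance(heights), population_sd(heights))
print(sample_variance(heights), sample_sd(heights))
```

For these heights the sum of squared deviations is 60, so the population variance is 60 / 9 (about 6.67) and the sample variance is 60 / 8 = 7.5. Each standard deviation is the square root of the corresponding variance, which is why it can never be negative.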
Thank you! See you next time!!