Statistics for the Social Sciences. Psychology 340 Fall 2006. Introductions. Outline (for week). Variables: IV, DV, scales of measurement. Discuss each variable and its scale of measurement. Characteristics of Distributions. Using graphs. Using numbers (center and variability).
Statistics for the Social Sciences Psychology 340 Fall 2006 Introductions
Outline (for week) • Variables: IV, DV, scales of measurement • Discuss each variable and its scale of measurement • Characteristics of Distributions • Using graphs • Using numbers (center and variability) • Descriptive statistics decision tree • Locating scores: z-scores and other transformations
Describing distributions • Distributions are typically described with three properties: • Shape: unimodal, symmetric, skewed, etc. • Center: mean, median, mode • Spread (variability): standard deviation, variance
Which center when? • Depends on a number of factors, like scale of measurement and shape. • The mean is generally the preferred measure, and it is closely related to measures of variability. • However, there are times when the mean isn’t the appropriate measure.
Which center when? • Use the median if: • The distribution is skewed • The distribution is ‘open-ended’ • (e.g. your top answer on your questionnaire is ‘5 or more’) • Data are on an ordinal scale (rankings) • Use the mode if the data are on a nominal scale
The Mean
• The most commonly used measure of center
• The arithmetic average
• Computing the mean: add up all of the X’s, then divide by the number of scores
• The formula for the population mean (a parameter): $\mu = \frac{\sum X}{N}$ (divide by the total number in the population)
• The formula for the sample mean (a statistic): $\bar{X} = \frac{\sum X}{n}$ (divide by the total number in the sample)
• Note: your book uses ‘M’ to denote the mean in formulas
The Mean • Number of shoes: • 5, 7, 5, 5, 5 • 30, 11, 12, 20, 14, 12, 15, 8, 6, 8, 10, 15, 25, 6, 35, 20, 20, 20, 25, 15 • Suppose we want the mean of the entire group? • Can we simply add the two means together and divide by 2? • NO. Why not?
The Weighted Mean • Number of shoes: • 5, 7, 5, 5, 5 • 30, 11, 12, 20, 14, 12, 15, 8, 6, 8, 10, 15, 25, 6, 35, 20, 20, 20, 25, 15 • Suppose we want the mean of the entire group? Can we simply add the two means together and divide by 2? • NO. Why not? We need to take into account the number of scores that went into each mean.
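The Weighted Mean
• Number of shoes:
• First group (n = 5): 5, 7, 5, 5, 5 (mean = 27/5 = 5.4)
• Second group (n = 20): 30, 11, 12, 20, 14, 12, 15, 8, 6, 8, 10, 15, 25, 6, 35, 20, 20, 20, 25, 15 (mean = 327/20 = 16.35)
• Let’s check:
• Weighted mean: (5 × 5.4 + 20 × 16.35) / (5 + 20) = 354/25 = 14.16
• Mean of all 25 scores: 354/25 = 14.16
• Both ways give the same answer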
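The check above can be sketched in Python. This is only an illustration using the shoe counts from the slide; the variable names (`group_a`, `group_b`, `naive`, `weighted`, `pooled`) are made up for this example.

```python
from statistics import mean

# Shoe counts from the slide, split into the two groups
group_a = [5, 7, 5, 5, 5]
group_b = [30, 11, 12, 20, 14, 12, 15, 8, 6, 8,
           10, 15, 25, 6, 35, 20, 20, 20, 25, 15]

mean_a, mean_b = mean(group_a), mean(group_b)
n_a, n_b = len(group_a), len(group_b)

# Simply averaging the two means ignores the group sizes
naive = (mean_a + mean_b) / 2

# Weighted mean: each group mean counts in proportion to its n
weighted = (n_a * mean_a + n_b * mean_b) / (n_a + n_b)

# Mean of all 25 scores pooled together
pooled = mean(group_a + group_b)

print(naive, weighted, pooled)
```

The naive average comes out well below the pooled mean because it gives the five-person group as much say as the twenty-person group.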
The median
• The median is the score that divides a distribution exactly in half. Exactly 50% of the individuals in a distribution have scores at or below the median.
• Case 1: Odd number of scores in the distribution
  Step 1: put the scores in order
  Step 2: find the middle score
• Case 2: Even number of scores in the distribution
  Step 1: put the scores in order
  Step 2: find the middle two scores
  Step 3: find the arithmetic average of the two middle scores
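The two cases can be sketched as a small Python function. The function name `median_of` and the example scores are hypothetical, chosen only to exercise the odd and even cases.

```python
def median_of(scores):
    """Return the median by following the two cases on the slide."""
    if not scores:
        raise ValueError("median of an empty list is undefined")
    ordered = sorted(scores)           # Step 1: put the scores in order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                     # Case 1: odd number of scores
        return ordered[mid]            # Step 2: the middle score
    # Case 2: even number of scores, average the two middle scores
    return (ordered[mid - 1] + ordered[mid]) / 2

odd_example = median_of([3, 1, 9, 5, 7])   # ordered: 1, 3, 5, 7, 9
even_example = median_of([2, 8, 4, 6])     # ordered: 2, 4, 6, 8
print(odd_example, even_example)
```

Python's standard library also offers `statistics.median`, which handles both cases the same way.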
The mode
• The mode is the score or category that has the greatest frequency.
• So look at your frequency table or graph and pick the value that has the highest frequency.
• In the first example on the slide, one value is most frequent, so the mode is 5.
• In the second example, two values tie for the highest frequency, so the modes are 2 and 8.
• Note: if one peak were bigger than the other, it would be called the major mode and the other would be the minor mode.
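A minimal sketch of finding the mode(s) in Python follows. The slide's frequency graphs did not survive, so the scores below are hypothetical data picked to reproduce the stated answers (one mode at 5; two modes at 2 and 8). It assumes Python 3.8 or later for `statistics.multimode`.

```python
from statistics import multimode

# Hypothetical scores with a single most frequent value
unimodal_scores = [5, 3, 5, 7, 5, 6]
# Hypothetical scores where two values tie for the highest frequency
bimodal_scores = [2, 2, 2, 5, 8, 8, 8]
# The mode also works for nominal (category) data
colors = ["red", "blue", "red", "green"]

one_mode = multimode(unimodal_scores)
two_modes = multimode(bimodal_scores)
color_mode = multimode(colors)
print(one_mode, two_modes, color_mode)
```

`multimode` always returns a list, so a tie shows up as more than one entry instead of raising an error.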
Describing distributions • Distributions are typically described with three properties: • Shape: unimodal, symmetric, skewed, etc. • Center: mean, median, mode • Spread (variability): standard deviation, variance
Variability of a distribution • Variability provides a quantitative measure of the degree to which scores in a distribution are spread out or clustered together. • In other words, variability refers to the degree of “differentness” of the scores in the distribution. • High variability means that the scores differ by a lot • Low variability means that the scores are all similar
Standard deviation • The standard deviation is the most commonly used measure of variability. • The standard deviation measures how far the scores in the distribution are from the mean of the distribution ($\mu$). • Essentially, it is the average distance of the scores from the mean.
Computing standard deviation (population)
• Step 1: Compute the deviation scores: subtract the population mean from every score in the distribution.
• Our population: 2, 4, 6, 8 (so $\mu = 5$)
• $X - \mu$ = deviation scores:
  2 - 5 = -3
  4 - 5 = -1
  6 - 5 = +1
  8 - 5 = +3
• Notice that if you add up all of the deviations they must equal 0.
Computing standard deviation (population)
• Step 2: Get rid of the negative signs. Square the deviations and add them together to compute the sum of the squared deviations (SS).
• $X - \mu$ = deviation scores: 2 - 5 = -3, 4 - 5 = -1, 6 - 5 = +1, 8 - 5 = +3
• $SS = \sum (X - \mu)^2 = (-3)^2 + (-1)^2 + (+1)^2 + (+3)^2 = 9 + 1 + 1 + 9 = 20$
Computing standard deviation (population) • Step 3: Compute the variance (the average of the squared deviations) • Divide by the number of individuals in the population. • variance = $\sigma^2 = SS/N$
Computing standard deviation (population) • Step 4: Compute the standard deviation. Take the square root of the population variance. • standard deviation = $\sigma = \sqrt{\sigma^2} = \sqrt{SS/N}$
Computing standard deviation (population)
• To review:
• Step 1: compute deviation scores
• Step 2: compute the SS
  • $SS = \sum (X - \mu)^2$
• Step 3: determine the variance
  • take the average of the squared deviations
  • divide the SS by N
• Step 4: determine the standard deviation
  • take the square root of the variance
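The four steps can be traced in Python with the slide's population of 2, 4, 6, 8. This is a sketch for checking the arithmetic by hand; the variable names are illustrative.

```python
import math
from statistics import pstdev

population = [2, 4, 6, 8]
N = len(population)
mu = sum(population) / N                  # population mean

# Step 1: deviation scores (X - mu)
deviations = [x - mu for x in population]

# Step 2: sum of squared deviations (SS)
ss = sum(d ** 2 for d in deviations)

# Step 3: population variance = SS / N
variance = ss / N

# Step 4: population standard deviation = square root of the variance
sigma = math.sqrt(variance)

print(deviations, ss, variance, round(sigma, 3))
# Cross-check against the standard library's population formula
print(round(pstdev(population), 3))
```

The deviations sum to zero, which is why Step 2 squares them before adding.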
Computing standard deviation (sample) • The basic procedure is the same. • Step 1: compute deviation scores • Step 2: compute the SS • Step 3: determine the variance • This step is different • Step 4: determine the standard deviation
Computing standard deviation (sample)
• Step 1: Compute the deviation scores: subtract the sample mean from every score in the distribution.
• Our sample: 2, 4, 6, 8 (so $\bar{X} = 5$)
• $X - \bar{X}$ = deviation scores:
  2 - 5 = -3
  4 - 5 = -1
  6 - 5 = +1
  8 - 5 = +3
Computing standard deviation (sample)
• Step 2: Determine the sum of the squared deviations (SS).
• $X - \bar{X}$ = deviation scores: 2 - 5 = -3, 4 - 5 = -1, 6 - 5 = +1, 8 - 5 = +3
• $SS = \sum (X - \bar{X})^2 = (-3)^2 + (-1)^2 + (+1)^2 + (+3)^2 = 9 + 1 + 1 + 9 = 20$
• Apart from notational differences, the procedure is the same as before.
Computing standard deviation (sample)
• Step 3: Determine the variance
• Recall: Population variance = $\sigma^2 = SS/N$
• The variability of the samples is typically smaller than the population’s variability
• To correct for this we divide by (n - 1) instead of just n
• Sample variance = $s^2 = \frac{SS}{n - 1}$
Computing standard deviation (sample) • Step 4: Determine the standard deviation • standard deviation = $s = \sqrt{s^2} = \sqrt{\frac{SS}{n - 1}}$
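The sample version differs from the population version only in the divisor. A short Python sketch with the slide's sample of 2, 4, 6, 8 (variable names are illustrative):

```python
import math
from statistics import stdev

sample = [2, 4, 6, 8]
n = len(sample)
x_bar = sum(sample) / n                       # sample mean

# Steps 1 and 2: deviation scores and their sum of squares
ss = sum((x - x_bar) ** 2 for x in sample)

# Step 3: sample variance divides by (n - 1), not n
s_squared = ss / (n - 1)

# Step 4: sample standard deviation
s = math.sqrt(s_squared)

print(ss, round(s_squared, 3), round(s, 3))
# statistics.stdev uses the same (n - 1) formula
print(round(stdev(sample), 3))
```

With the same four scores, dividing by n - 1 = 3 instead of N = 4 gives a larger value than the population formula, which is the intended correction.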
Properties of means and standard deviations
• Change/add/delete a given score: the mean changes; the standard deviation changes
  • This changes the total and/or the number of scores, which changes both the mean and the standard deviation.
• Add/subtract a constant to each score: the mean changes; the standard deviation does not change
  • All of the scores change by the same constant, but so does the mean ($\bar{X}_{old}$ moves to $\bar{X}_{new}$).
  • It is as if you just pick up the distribution and move it over, but the spread (variability) stays the same.
Properties of means and standard deviations
• Multiply/divide each score by a constant: the mean changes; the standard deviation changes
• Original scores: 21, 23 (so $\bar{X} = 22$)
  21 - 22 = -1, squared: $(-1)^2$
  23 - 22 = +1, squared: $(+1)^2$
  $s_{old} = \sqrt{\frac{(-1)^2 + (+1)^2}{2 - 1}} = 1.41$
• Multiply scores by 2: 42, 46 (so $\bar{X} = 44$)
  42 - 44 = -2, squared: $(-2)^2$
  46 - 44 = +2, squared: $(+2)^2$
  $s_{new} = \sqrt{\frac{(-2)^2 + (+2)^2}{2 - 1}} = 2.83$
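These properties can be checked numerically. The sketch below uses the scores 21 and 23 and the multiplier 2 from the slide; the added constant of 10 is an arbitrary value chosen for illustration.

```python
from statistics import mean, stdev

scores = [21, 23]

# Add a constant to each score (10 is an arbitrary illustrative choice)
shifted = [x + 10 for x in scores]

# Multiply each score by a constant (2, as on the slide)
doubled = [x * 2 for x in scores]

for label, data in [("original", scores),
                    ("plus 10", shifted),
                    ("times 2", doubled)]:
    print(label, mean(data), round(stdev(data), 2))
```

Adding a constant moves the mean and leaves the standard deviation alone; multiplying by a constant scales both the mean and the standard deviation by that constant.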
Locating a score • Where is our raw score within the distribution? • The natural choice of reference is the mean (since it is usually easy to find). • So we’ll subtract the mean from the score (find the deviation score). • The direction will be given to us by the negative or positive sign on the deviation score • The distance is the value of the deviation score
Locating a score
• Reference point: the mean, $\mu = 100$
• Direction: the sign of the deviation score (positive = above the mean, negative = below the mean)
• $X_1 = 162$: $X_1 - 100 = +62$ (above the mean)
• $X_2 = 57$: $X_2 - 100 = -43$ (below the mean)
Transforming a score
• The distance is the value of the deviation score
• However, this distance is measured with the units of measurement of the score.
• Convert the score to a standard (neutral) score. In this case a z-score.
• $z = \frac{X - \mu}{\sigma}$, where $X$ is the raw score, $\mu$ is the population mean, and $\sigma$ is the population standard deviation
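A minimal sketch of the z-score transformation in Python follows. The slide's example with a mean of 100 does not state a standard deviation, so this reuses the earlier population of 2, 4, 6, 8 instead; the helper name `z_score` is hypothetical.

```python
import math

def z_score(x, mu, sigma):
    """Convert a raw score to a z-score: (X - mu) / sigma."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma

population = [2, 4, 6, 8]
mu = sum(population) / len(population)
sigma = math.sqrt(sum((x - mu) ** 2 for x in population) / len(population))

z_scores = [z_score(x, mu, sigma) for x in population]
print([round(z, 3) for z in z_scores])
```

The sign of each z-score gives the direction from the mean, and its size gives the distance in standard deviation units instead of the original units.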