Measures of Dispersion. 9/26/2013. Readings. Chapter 2 Measuring and Describing Variables (Pollock) (pp.37-44) Chapter 6. Foundations of Statistical Inference (128-133) (Pollock) Chapter 3 Transforming Variables (Pollock Workbook). Opportunities to discuss course content.
Measures of Dispersion 9/26/2013
Readings • Chapter 2 Measuring and Describing Variables (Pollock) (pp.37-44) • Chapter 6. Foundations of Statistical Inference (128-133) (Pollock) • Chapter 3 Transforming Variables (Pollock Workbook)
Office Hours For the Week • When • Friday 10-12 • Monday 10-12 • And by appointment
Homework • Chapter 2 • Question 1: A, B, C, D, E • Question 2: B, D, E (this requires a printout) • Question 3: A, B, D • Question 5: A, B, C, D • Question 7: A, B, C, D • Question 8: A, B, C
Course Learning Objectives • Students will learn the basics of research design and be able to critically analyze the advantages and disadvantages of different types of design. • Students will be able to interpret and explain empirical data.
What are They? • These measure the uniformity of the data • They measure how closely or widely cases are separated on a variable.
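The Standard Deviation • A more accurate and precise measure than simply describing a distribution as dispersed or clustered • Is, roughly, the average distance of values in a distribution from the mean (formally, the square root of the average squared distance from the mean)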
What it tells us • When the value of the standard deviation is small, values are clustered around the mean. • When the value of the standard deviation is high, values are spread far away from the mean.
About the Standard Deviation • It is based on the mean • The larger the standard deviation, the more spread out the values are and the more different they are • If the standard deviation = 0, there is no variability in the scores. They are all identical.
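A minimal sketch of the calculation behind these points, using made-up scores (the three lists and the helper name `population_sd` are illustrative assumptions, not data from the course). It divides by N, the population form; statistical packages often divide by N - 1 for samples.

```python
import math
import statistics

def population_sd(values):
    """Square root of the average squared distance from the mean."""
    mean = sum(values) / len(values)
    squared_deviations = [(v - mean) ** 2 for v in values]
    return math.sqrt(sum(squared_deviations) / len(values))

# Hypothetical scores, all with a mean of 8.
clustered = [7, 8, 8, 8, 9]     # values close to the mean
spread = [2, 5, 8, 11, 14]      # values far from the mean
identical = [8, 8, 8, 8, 8]     # no variability at all

for name, scores in [("clustered", clustered), ("spread", spread), ("identical", identical)]:
    print(name, round(population_sd(scores), 3))
```

The clustered list gives a small standard deviation, the spread list a larger one, and the identical list gives exactly 0.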
From 2008 Who was more divisive?
The Standard Deviation • It is a standardized measure... So what? • This means it has ratio properties (the actual value) and ordinal properties (the number of standard deviations, 0, 1, 2, 3..., from the mean) • This means we can compare values across different distributions (e.g. scores on different tests)
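One way to see why standardization allows comparison is to express each score as a number of standard deviations from its own mean. The sketch below uses hypothetical test results (the scores, means, and standard deviations are assumptions chosen for illustration).

```python
def z_score(value, mean, sd):
    """Number of standard deviations a value lies from the mean."""
    return (value - mean) / sd

# Hypothetical student: 80 on a math test (class mean 70, s.d. 10)
# and 85 on a reading test (class mean 80, s.d. 10).
math_z = z_score(80, mean=70, sd=10)
reading_z = z_score(85, mean=80, sd=10)

print(math_z)     # 1.0
print(reading_z)  # 0.5
```

Although the raw reading score is higher, the math score is further above its class mean once both are measured in standard deviations.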
The Standard Deviation and Outliers • An outlier is any case that is more than 2 standard deviations away from the mean • These cases often provide valuable insights about our distribution
How to determine the value of a standard deviation • The value of +/- 1 s.d. = mean +/- value of s.d. • e.g. if the mean is 8 and the s.d. is 2, the value of -1 s.d. is 6, and +1 s.d. is 10 • The value of +/- 2 s.d. = mean +/- (value of s.d. * 2) • e.g. if the mean is 8 and the s.d. is 2, the value of -2 s.d. is 4, and +2 s.d. is 12 • Any value in the distribution lower than 4 or higher than 12 is an outlier
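The arithmetic above can be sketched as a short Python example. The mean of 8 and s.d. of 2 come from the slide; the list of values and the function names `sd_band` and `find_outliers` are hypothetical, added only for illustration.

```python
def sd_band(mean, sd, k):
    """Return the values that lie k standard deviations below and above the mean."""
    return (mean - k * sd, mean + k * sd)

def find_outliers(values, mean, sd, k=2):
    """Return the values that fall outside mean +/- k standard deviations."""
    low, high = sd_band(mean, sd, k)
    return [v for v in values if v < low or v > high]

print(sd_band(8, 2, 1))  # (6, 10)
print(sd_band(8, 2, 2))  # (4, 12)

# Hypothetical values checked against the slide's mean (8) and s.d. (2).
print(find_outliers([3, 4, 6, 8, 10, 12, 13], mean=8, sd=2))  # [3, 13]
```

Values exactly at 4 or 12 are not flagged, since the rule is "more than 2 standard deviations away".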
An Example from 2008 • States Database • What is the value of +/- 1 s.d.? (mean +/- 1 s.d.) • What is the value of +/- 2 s.d.? (mean +/- 2 s.d.)
Unwrapping The Results • Which are outliers? • How did that shape the 2012 campaign?
Camel Humps • Dromedary (one hump) • Bactrian (bi-modal)
The Normal/Bell Shaped curve • Symmetrical around the mean • It has 1 hump, it is located in the middle, so the mean, median, and mode are all the same!
Why we use the normal curve • To determine skewness • The Normal Distribution curve is the basis for hypothesis/significance testing
What is skewness? • An asymmetrical distribution. • Skewness is also a measure of symmetry, or more precisely of the lack of it. • Most often, the median is used as a measure of central tendency when data sets are skewed.
The Mean or the Median? • In a normal distribution, the mean is the preferred measure • In a skewed distribution, you go with the median
Deviate from the norm? • Divide the skewness value • By the std. error of skewness
A distribution is said to be skewed if the magnitude of (Skewness value/ St. Error of Skew) is greater than 2 (in absolute value)
If the Value Is 2 or More (in absolute value) • Use the median
If the Value Is Less Than 2 (in absolute value) • Use the mean
Baseball Salaries again • Divide the Skewness by its standard error • .800/.427 = 1.87 • This value is less than 2 so we use the mean (92 million) • What does the positive skew value mean???
Let's Try Another One (Per Capita Income in the States) • Divide the skewness by its standard error • .817/.337 = 2.42 • The value is greater than 2, and the skewness value is positive • What is the better measure and what might cause this distribution shape?
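The two worked examples can be checked with a small sketch of the decision rule. The function names are hypothetical. The standard-error formula is the usual one for sample skewness, and the sample sizes of 30 (teams) and 50 (states) are assumptions that happen to reproduce the .427 and .337 shown on the slides.

```python
import math

def skewness_se(n):
    """Standard error of sample skewness for a sample of size n."""
    return math.sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))

def preferred_center(skewness, se):
    """Apply the rule of thumb: |skewness / s.e.| of 2 or more means use the median."""
    ratio = skewness / se
    measure = "median" if abs(ratio) >= 2 else "mean"
    return measure, round(ratio, 2)

print(preferred_center(0.800, 0.427))  # ('mean', 1.87)    baseball salaries
print(preferred_center(0.817, 0.337))  # ('median', 2.42)  per capita income

# Assumed sample sizes: 30 teams and 50 states.
print(round(skewness_se(30), 3))  # 0.427
print(round(skewness_se(50), 3))  # 0.337
```

Taking the absolute value means a strongly negative skew triggers the median just as a strongly positive one does.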
Testing • Causality • Statistical Significance • Practical Significance
Statistical Significance • A result is called statistically significant if it is unlikely to have occurred by chance • We use sample statistics to estimate population parameters, so that we can state the probability that a parameter falls within a specified range, called the confidence interval. • Practical significance asks whether a result is important or useful in the real world. Practical significance is putting statistics into words that people can use and understand.
What this Tells us • In a normal distribution, roughly 68% of the scores fall within one standard deviation of the mean • Roughly 95% of the scores fall within 2 standard deviations of the mean (the exact # for 95% is 1.96 s.d.) • Roughly 99.7% of the scores fall within three standard deviations of the mean
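These percentages follow from the normal curve itself and can be checked with the error function in Python's standard library. This is a sketch; the helper name `share_within` is an assumption for illustration.

```python
import math

def share_within(k):
    """Share of a normal distribution lying within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 1.96, 2, 3):
    print(f"within {k} s.d.: {share_within(k):.4f}")
# within 1 s.d.: 0.6827
# within 1.96 s.d.: 0.9500
# within 2 s.d.: 0.9545
# within 3 s.d.: 0.9973
```

The output shows why 1.96, not 2, is the exact cutoff for 95%: two full standard deviations cover slightly more than 95% of cases.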
A Practice Example • Assuming a normal curve, compute the age (value) • For someone who is +1 s.d. from the mean • What number is -1 s.d. from the mean? • With this assumption of normality, roughly what % of cases should fall within this range (+/- 1 s.d.)? • What about 2 standard deviations: what percent should fall in this range?
Life Expectancy in Latin America and Caribbean • Compute the estimated values for average life expectancy for +/- 2 standard deviations from the mean. • With this assumption of normality, what % of cases should fall within this range (+/- 2 s.d.)?
For Ratio Variables • Step 1 • Step 2 • Step 3 • Step 4