
Chapter 4


avram-lang

Presentation Transcript


  1. Chapter 4 The Description of Data: Measures of Variation and Dispersion

  2. Measures of Variation • We have looked at measures of the center, or location, of data. • We also need a measure of the dispersion of data.

  3. Range • The range is the distance spanned by the data. • The range is calculated by subtracting the smallest data value from the largest. • The range is sensitive to outliers. • The range does not provide any information regarding the data between the minimum and maximum.

  4. Interquartile Range • The interquartile range is the distance spanned by the middle 50% of the data. • The interquartile range is calculated by subtracting Q1 from Q3. • The interquartile range is not sensitive to outliers, but still gives insight into the dispersion of the data.
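
As a rough illustration of the two slides above, the sketch below computes the range and the interquartile range for a small hypothetical data set (the numbers are invented for this example, not taken from the chapter), then adds one outlier to show how each measure reacts. Quartile conventions differ between textbooks, calculators, and software, so the Q1 and Q3 values from `statistics.quantiles` may not match a hand calculation exactly.

```python
import statistics

# Hypothetical data set, chosen only for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]


def data_range(values):
    """Range: largest value minus smallest value."""
    return max(values) - min(values)


def interquartile_range(values):
    """IQR: Q3 minus Q1, using Python's default quartile convention."""
    q1, _q2, q3 = statistics.quantiles(values, n=4)
    return q3 - q1


# Same data with one extreme value appended.
data_with_outlier = data + [100]

range_before = data_range(data)
range_after = data_range(data_with_outlier)
iqr_before = interquartile_range(data)
iqr_after = interquartile_range(data_with_outlier)

print("range:", range_before, "->", range_after)
print("IQR:  ", iqr_before, "->", iqr_after)
```

The range jumps as soon as the outlier is added, while the interquartile range moves far less, because it only looks at the middle 50% of the data.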

  5. Mean Absolute Deviation • The mean absolute deviation is the mean distance to the mean. In other words, it’s the average distance from the data to µ.
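
A minimal sketch of the calculation described above, using a hypothetical data set invented for this example: find the mean, take each value's absolute distance from it, and average those distances.

```python
# Hypothetical data set, chosen only for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]

# Step 1: the mean of the data.
mean = sum(data) / len(data)

# Step 2: the absolute distance of each value from the mean.
distances = [abs(x - mean) for x in data]

# Step 3: the mean of those distances is the mean absolute deviation.
mean_absolute_deviation = sum(distances) / len(distances)

print("mean:", mean)
print("mean absolute deviation:", mean_absolute_deviation)
```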

  6. Variance and Standard Deviation • The variance is the average squared distance to the mean. • The standard deviation is the square root of the variance.

  7. Variance and Standard Deviation • For samples, we divide by n-1 rather than n, so that the sample variance is not a biased estimate of the population variance. • The standard deviations of populations and samples are available from your calculator. Variance can be calculated as the square of the standard deviation.
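
The difference between the population and sample formulas can be sketched as follows. The data set is hypothetical (invented for this example); the only change between the two versions is the divisor, n for a population and n-1 for a sample. The last lines cross-check the hand calculation against Python's `statistics` module.

```python
import math
import statistics

# Hypothetical data set, chosen only for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]

n = len(data)
mean = sum(data) / n

# Sum of squared distances from the mean.
sum_of_squares = sum((x - mean) ** 2 for x in data)

# Population formulas divide by n; sample formulas divide by n - 1.
population_variance = sum_of_squares / n
sample_variance = sum_of_squares / (n - 1)

# The standard deviation is the square root of the variance.
population_sd = math.sqrt(population_variance)
sample_sd = math.sqrt(sample_variance)

print("population variance, sd:", population_variance, population_sd)
print("sample variance, sd:    ", sample_variance, sample_sd)

# Cross-check against the standard library.
assert math.isclose(population_variance, statistics.pvariance(data))
assert math.isclose(sample_variance, statistics.variance(data))
```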

  8. Chebyshev’s Theorem • The minimum proportion of data that can be found within k standard deviations from the mean is 1 - 1/k², for any k greater than 1.

  9. Chebyshev’s Theorem • Chebyshev’s Theorem holds for any distribution, but the bound it gives is conservative. • The theorem gives the minimum proportion of data that will be found in a given interval; in practice, the actual proportion is usually much higher than that minimum.
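
To see how conservative the bound can be, the sketch below compares the standard form of Chebyshev's bound, 1 - 1/k² for k > 1, with the proportion actually observed in a small hypothetical data set (invented for this example). The function names are illustrative only.

```python
import statistics

# Hypothetical data set, chosen only for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]


def chebyshev_min_proportion(k):
    """Chebyshev's lower bound on the proportion within k standard deviations."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return 1 - 1 / k**2


def observed_proportion_within(values, k):
    """Proportion of values within k population standard deviations of the mean."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    inside = [x for x in values if abs(x - mean) <= k * sd]
    return len(inside) / len(values)


for k in (1.5, 2, 3):
    bound = chebyshev_min_proportion(k)
    observed = observed_proportion_within(data, k)
    print(f"k={k}: Chebyshev guarantees at least {bound:.3f}, observed {observed:.3f}")
```

For this data the observed proportions sit well above the guaranteed minimum, which is the typical situation the slide describes.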

  10. The Empirical Rule • If the distribution of data is normal (bell shaped), then: • About 68% of the data will be found within one standard deviation of the mean. • About 95% of the data will be found within two standard deviations of the mean. • About 99.7% of the data will be found within three standard deviations of the mean.

  11. The Empirical Rule • The empirical rule applies only to distributions that are normal (bell shaped). • For those distributions, the empirical rule is much more accurate than Chebyshev’s Theorem.
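
The empirical-rule percentages can be checked against the normal distribution itself. The sketch below uses `statistics.NormalDist` from the Python standard library to compute the proportion of a normal distribution within k standard deviations of the mean, and prints Chebyshev's minimum (1 - 1/k²) alongside for comparison. The helper names are illustrative only.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0.0, sigma=1.0)


def normal_proportion_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


def chebyshev_min_proportion(k):
    """Chebyshev's lower bound; it equals 0 at k = 1, so it says nothing there."""
    return 1 - 1 / k**2


for k in (1, 2, 3):
    normal = normal_proportion_within(k)
    bound = chebyshev_min_proportion(k)
    print(f"k={k}: normal {normal:.4f}, Chebyshev minimum {bound:.4f}")
```

The normal proportions round to the 68%, 95%, and 99.7% figures quoted in the rule, and each is well above the corresponding Chebyshev minimum.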

  12. Coefficient of Variation • The coefficient of variation measures the relative variation of a distribution: the standard deviation expressed as a fraction of the mean. • Since this is a relative measure, there are no units, making it easier to compare the variation of two different populations.
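
A short sketch of why a unit-free measure is convenient, assuming the usual definition of the coefficient of variation as the standard deviation divided by the mean (often reported as a percentage). The height and weight figures are hypothetical, invented for this example.

```python
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (unit-free)."""
    return statistics.stdev(values) / statistics.fmean(values)


# Hypothetical measurements, chosen only for illustration.
heights_m = [1.60, 1.70, 1.75, 1.80, 1.90]
heights_cm = [100 * h for h in heights_m]   # same heights, different unit
weights_kg = [55, 62, 70, 81, 95]

cv_heights_m = coefficient_of_variation(heights_m)
cv_heights_cm = coefficient_of_variation(heights_cm)
cv_weights = coefficient_of_variation(weights_kg)

# Changing the unit rescales the mean and the standard deviation together,
# so the coefficient of variation does not change.
print(f"CV of heights (m):  {cv_heights_m:.2%}")
print(f"CV of heights (cm): {cv_heights_cm:.2%}")
print(f"CV of weights (kg): {cv_weights:.2%}")
```

Because the units cancel, the height and weight figures can be compared directly even though one is measured in meters and the other in kilograms.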

  13. Skewness • Distributions with a long right tail are positively skewed. • Distributions with a long left tail are negatively skewed. • Distributions that are not skewed are symmetric.

  14. Pearson’s Coefficient of Skewness • Pearson’s coefficient of skewness gives a numeric measurement of the skewness of a distribution. • Distributions with an SK of 0 are symmetric. • Distributions with a positive SK are positively skewed, while distributions with a negative SK are negatively skewed.
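
The slide does not show the formula, so the sketch below assumes the common textbook form, SK = 3(mean - median) / s, where s is the sample standard deviation. If the chapter uses a different form, the signs should still agree with the rules above. All three data sets are hypothetical.

```python
import statistics


def pearson_skewness(values):
    """Pearson's coefficient of skewness, assumed form: 3 * (mean - median) / s."""
    mean = statistics.fmean(values)
    median = statistics.median(values)
    sd = statistics.stdev(values)
    return 3 * (mean - median) / sd


# Hypothetical data sets, chosen only for illustration.
long_right_tail = [2, 4, 4, 4, 5, 5, 7, 9]
long_left_tail = [1, 3, 5, 5, 6, 6, 6, 8]   # mirror image of the first set
symmetric = [1, 2, 3, 4, 5]

sk_right = pearson_skewness(long_right_tail)
sk_left = pearson_skewness(long_left_tail)
sk_symmetric = pearson_skewness(symmetric)

print("long right tail:", round(sk_right, 4))
print("long left tail: ", round(sk_left, 4))
print("symmetric:      ", sk_symmetric)
```

The mean is pulled toward the long tail while the median is not, which is why the sign of mean minus median indicates the direction of the skew.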

  15. Try it! • The median price of a home selling in San Diego during 1991 was $195,000. The first and third quartile prices were $170,500 and $232,000 respectively. What was the semi-interquartile range for the cost of a home in San Diego in 1991? • $30,750
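
The answer above can be checked with a few lines of arithmetic, taking the semi-interquartile range to be half the distance between the first and third quartiles. The variable names are illustrative.

```python
# Quartiles given in the exercise (home prices in dollars).
q1 = 170_500
q3 = 232_000

# Interquartile range: Q3 minus Q1.
iqr = q3 - q1

# Semi-interquartile range: half of the interquartile range.
semi_iqr = iqr / 2

print("IQR:", iqr)            # 61500
print("semi-IQR:", semi_iqr)  # 30750.0
```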

  16. Try it! • A sample of 6 prices quoted for a particular television set are $326, $299, $345, $295, $310, and $345. • Find the range of this sample. • $50

  17. Try it! • A sample of 6 prices quoted for a particular television set are $326, $299, $345, $295, $310, and $345. • Find the variance for the quoted price of the TV. • 490.40 (in squared dollars)

  18. Try it! • A sample of 6 prices quoted for a particular television set are $326, $299, $345, $295, $310, and $345. • Find the standard deviation for the quoted price of the TV. • $22.14
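
The three television-price answers can be reproduced with Python's `statistics` module. Because the six prices are a sample, the sketch uses the sample functions `variance` and `stdev`, which divide by n-1.

```python
import statistics

# The six quoted prices from the exercise, in dollars.
prices = [326, 299, 345, 295, 310, 345]

# Range: largest price minus smallest price.
price_range = max(prices) - min(prices)

# Sample variance and sample standard deviation (divisor n - 1).
sample_variance = statistics.variance(prices)
sample_sd = statistics.stdev(prices)

print("range:", price_range)                          # 50
print("sample variance:", round(sample_variance, 2))  # 490.4
print("sample sd:", round(sample_sd, 2))              # 22.14
```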

  19. Try it! • Given a set of data with a mean of 220.8 and a standard deviation of 17.0, find the k, or z, value of: • 200 • k = -1.2235

  20. Try it! • Given a set of data with a mean of 220.8 and a standard deviation of 17.0, find the k, or z, value of: • 238.4 • k = 1.0353

  21. Try it! • Given a set of data with a mean of 220.8 and a standard deviation of 17.0, find the k, or z, value of: • 229 • k = 0.4824

  22. Try it! • Given a set of data with a mean of 220.8 and a standard deviation of 17.0, find the k, or z, value of: • 198.1 • k = -1.3353
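
The four k (or z) values above all come from the same calculation: subtract the mean from the data value and divide by the standard deviation. A minimal sketch, with an illustrative function name:

```python
MEAN = 220.8
STANDARD_DEVIATION = 17.0


def z_value(x, mean=MEAN, sd=STANDARD_DEVIATION):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd


z_values = {x: round(z_value(x), 4) for x in (200, 238.4, 229, 198.1)}

for x, z in z_values.items():
    print(f"x = {x}: k = {z}")
```

A negative result means the value lies below the mean, and a positive result means it lies above.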

  23. Try it! • Exercise 4.12 • SK = -0.5430
