
Chapter 5 Understanding and Comparing Distributions


caudillc

Presentation Transcript


  1. Chapter 5 Understanding and Comparing Distributions

  2. Example: The Hopkins Memorial Forest • A 2500-acre reserve in Massachusetts, New York, and Vermont • Managed by the Williams College Center for Environmental Studies (CES) • http://www.williams.edu/CES/hopkins.htm • Average wind speed for every day in 1989 • Important for monitoring storms

  3. Five-number summary
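A five-number summary lists the minimum, lower quartile (Q1), median, upper quartile (Q3), and maximum. The sketch below is one way to compute it, assuming the "median of each half" rule for quartiles with the overall median left out of both halves when the count is odd. Other quartile conventions give slightly different values. The function name and the wind speeds are made up for illustration and are not the Hopkins Forest data.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule.

    Assumes at least two values; with fewer, the halves are empty.
    """
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]            # values below the median position
    upper = data[half + n % 2:]    # values above the median position
    return (data[0], median(lower), median(data), median(upper), data[-1])

# Hypothetical daily average wind speeds (mph), for illustration only.
speeds = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 4.0, 6.5]
print(five_number_summary(speeds))
```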

  4. Boxplot • Invented by John W. Tukey

  5. Constructing Boxplots • Draw a single vertical axis spanning the range of the data. • Draw short horizontal lines at the lower and upper quartiles and at the median. • Then connect them with vertical lines to form a box.

  6. Constructing Boxplots (cont.) • Erect “fences” around the main part of the data. • The upper fence is 1.5 IQRs above the upper quartile. • The lower fence is 1.5 IQRs below the lower quartile. • Note: the fences only help with constructing the boxplot and should not appear in the final display.

  7. Constructing Boxplots (cont.) • Use the fences to grow “whiskers.” • Draw lines from the ends of the box up and down to the most extreme data values found within the fences. • If a data value falls outside one of the fences, we do not connect it with a whisker.

  8. Constructing Boxplots (cont.) • Add the outliers by displaying any data values beyond the fences with special symbols. • We often use a different symbol for “far outliers” that are farther than 3 IQRs from the quartiles.

  9. How to make a boxplot? • Draw a single vertical axis spanning the extent of the data • Draw short horizontal lines at Q1, the median, and Q3. Then connect them to make a box. • Draw ‘fences’ • Upper fence = Q3 + 1.5 * IQR • Lower fence = Q1 - 1.5 * IQR • Grow ‘whiskers’ to the most extreme data values inside the fences • Add outliers (any data values beyond the fences) • A TI-83 calculator can make boxplots
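The construction steps can be sketched as a small calculation of the numbers a boxplot needs: the fences, the whisker ends, and the outliers. This is a minimal sketch, not a plotting routine. It assumes the median-of-halves rule for quartiles (with the median left out of both halves for an odd count); the function name and the returned dictionary keys are hypothetical.

```python
from statistics import median

def boxplot_stats(values):
    """Compute the quantities needed to draw a boxplot by hand."""
    data = sorted(values)
    n = len(data)
    q1 = median(data[: n // 2])             # lower quartile
    q3 = median(data[n // 2 + n % 2 :])     # upper quartile
    iqr = q3 - q1

    # Fences guide construction only; they are not drawn.
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr

    # Whiskers reach the most extreme values inside the fences.
    inside = [x for x in data if lower_fence <= x <= upper_fence]
    outliers = [x for x in data if x < lower_fence or x > upper_fence]

    # "Far outliers" lie more than 3 IQRs beyond the quartiles.
    far = [x for x in outliers if x < q1 - 3 * iqr or x > q3 + 3 * iqr]

    return {
        "q1": q1, "median": median(data), "q3": q3, "iqr": iqr,
        "fences": (lower_fence, upper_fence),
        "whiskers": (inside[0], inside[-1]),
        "outliers": outliers,
        "far_outliers": far,
    }

# Made-up values with one unusually large observation.
print(boxplot_stats([1, 2, 3, 4, 5, 6, 7, 8, 30]))
```

In this made-up data set the value 30 lies above the upper fence, so the upper whisker stops at 8 and 30 is plotted separately.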

  10. Comparing groups • Relationship between a quantitative variable and a categorical variable • Is it windier in the winter or summer?

  11. Comparison

  12. Are some months windier than others?
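Comparing groups amounts to computing the same summaries separately for each level of the categorical variable and then setting them side by side. The sketch below does this for (group, value) pairs, assuming the median-of-halves quartile rule and at least two values per group. The function name, the group labels, and the numbers are illustrative, not the 1989 wind data.

```python
from collections import defaultdict
from statistics import median

def summarize_by_group(records):
    """Map each group label to its median and IQR."""
    groups = defaultdict(list)
    for label, value in records:
        groups[label].append(value)

    summary = {}
    for label, vals in groups.items():
        data = sorted(vals)
        n = len(data)
        q1 = median(data[: n // 2])
        q3 = median(data[n // 2 + n % 2 :])
        summary[label] = {"median": median(data), "iqr": q3 - q1}
    return summary

# Hypothetical wind speeds labeled by season.
records = [("winter", 2), ("winter", 4), ("winter", 6), ("winter", 8),
           ("summer", 1), ("summer", 2), ("summer", 3), ("summer", 4)]
for season, stats in summarize_by_group(records).items():
    print(season, stats)
```

A higher median suggests a windier group, and a larger IQR suggests more day-to-day variability. Side-by-side boxplots show the same comparison graphically.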

  13. Summary

  14. Outliers • Some outliers are obviously errors • What to do with outliers?

  15. Timeplots • For some data sets, we are interested in how the data behave over time. In these cases, we construct timeplots of the data.

  16. Timeplots

  17. Re-expressing Skewed Data to Improve Symmetry • When data are skewed, it is hard to summarize them simply with a center and spread. • Can we transform the data to be more symmetric? • Figure: histogram of the annual compensation of CEOs of the Fortune 500 companies in 2005

  18. Re-expressing Skewed Data to Improve Symmetry (cont.) • One way to make a skewed distribution more symmetric is to re-express or transform the data by applying a simple function (e.g., logarithmic function or square root).
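As a rough illustration of re-expression, the sketch below applies a base-10 logarithm to a right-skewed set of made-up values and compares the mean with the median before and after. A mean far above the median is one sign of right skew. The values are hypothetical and chosen so the effect is easy to see; they are not the CEO compensation data, and the logarithm requires strictly positive values.

```python
import math
from statistics import mean, median

def reexpress_log10(values):
    """Re-express positive values on a base-10 logarithmic scale."""
    return [math.log10(v) for v in values]

# Hypothetical right-skewed values (each ten times the previous one).
raw = [10, 100, 1000, 10000, 100000]
logged = reexpress_log10(raw)

# On the raw scale the mean is pulled far above the median.
print("raw:    mean =", mean(raw), " median =", median(raw))
# On the log scale the two measures of center nearly coincide.
print("logged: mean =", mean(logged), " median =", median(logged))
```

Real data will rarely become perfectly symmetric; the aim is only a distribution that a center and spread can summarize reasonably well.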

  19. Re-expressing to equalize spread across groups

  20. After log transformation

  21. What Can Go Wrong?
