Boxplots

Presentation Transcript


  1. Boxplots

  2. Boxplots & the 5 # Summary
     To construct a boxplot, first obtain the 5-number summary { Min, Q1, M, Q3, Max }:
     Q1: 1st quartile = 25th percentile
     M = Q2: median = 2nd quartile = 50th percentile
     Q3: 3rd quartile = 75th percentile

  3. Example: Simple Boxplot
     Failure times of industrial machines (in hours):
     189.1 193.2 200.4 202.6 202.7 206.7 211.4
     211.6 212.1 215.7 216.6 218.7 219.3 228.5
     236.1 236.5 237.0 239.7 244.3 251.5 253.3
     257.0 259.9 265.2 267.4 287.7 308.0 329.0
     5 # summary: { 189.1, 211.55, 232.3, 254.20, 329.0 }
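The 5-number summary for these failure times can be checked with a short sketch. The slide does not say which quartile convention it uses, so the choice below is an assumption: Python's `statistics.quantiles` with `method="inclusive"` (linear interpolation between order statistics). Under that assumption the sketch gives Q1 = 211.55 and M = 232.3, matching the slide, and Q3 of about 254.225, close to the slide's 254.20. Other conventions (for example, medians of the lower and upper halves) give slightly different quartiles. The function name `five_number_summary` is hypothetical.

```python
import statistics

# Failure times of industrial machines (in hours), as listed on the slide.
failure_times = [
    189.1, 193.2, 200.4, 202.6, 202.7, 206.7, 211.4,
    211.6, 212.1, 215.7, 216.6, 218.7, 219.3, 228.5,
    236.1, 236.5, 237.0, 239.7, 244.3, 251.5, 253.3,
    257.0, 259.9, 265.2, 267.4, 287.7, 308.0, 329.0,
]


def five_number_summary(values):
    """Return (Min, Q1, M, Q3, Max).

    Assumption: quartiles use the 'inclusive' linear-interpolation
    convention; the slide does not name its method.
    """
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)


summary = five_number_summary(failure_times)
print([round(x, 3) for x in summary])
```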

  4. Example: Simple Boxplot
     [Figure: boxplot with Min, Q1, M, Q3, and Max labeled]

  5. Example: Simple Boxplot

  6. Example: Modified Boxplot
     A mathematical rule designates “outliers.” These are plotted using special symbols. (Note: The maximum is still 329.0.)
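The slides do not state which mathematical rule is used. A widely used convention for modified boxplots is the 1.5 × IQR rule: a value is flagged if it lies more than 1.5 interquartile ranges below Q1 or above Q3. The sketch below applies that rule to the failure-time data as an illustration only. Both the rule and the quartile convention (`method="inclusive"`) are assumptions, and `iqr_outliers` is a hypothetical helper name.

```python
import statistics

# Failure times of industrial machines (in hours), as listed on slide 3.
failure_times = [
    189.1, 193.2, 200.4, 202.6, 202.7, 206.7, 211.4,
    211.6, 212.1, 215.7, 216.6, 218.7, 219.3, 228.5,
    236.1, 236.5, 237.0, 239.7, 244.3, 251.5, 253.3,
    257.0, 259.9, 265.2, 267.4, 287.7, 308.0, 329.0,
]


def iqr_outliers(values, k=1.5):
    """Return (lower_fence, upper_fence, outliers) under the k * IQR rule.

    Assumption: k = 1.5 and 'inclusive' quartiles; the slides name neither.
    """
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    outliers = [x for x in values if x < lower_fence or x > upper_fence]
    return lower_fence, upper_fence, outliers


low, high, flagged = iqr_outliers(failure_times)
print(f"fences: ({low:.2f}, {high:.2f}); flagged: {flagged}")
```

Under these assumptions the only flagged value is 329.0. It remains the maximum of the data; a modified boxplot simply draws it with a special symbol and ends the upper whisker at the largest unflagged value.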

  7. Modified Boxplot: Outliers
     A mathematical rule designates “outliers.” Outliers are part of the data: do not assume they are “bad data” or can be deleted. In fact, they may constitute the most meaningful information in the data. If we discover why the one machine took so long to fail, perhaps we can adjust others to increase the likelihood of a large failure time for future machines. Then future failure times should be plotted in a new boxplot.

  8. Modified Boxplot: Outliers
     A mathematical rule designates “outliers” in a boxplot. Sometimes* an outlier is ripe for…
     …change: learn why it’s wrong and correct it.
     …deletion, on the basis of something that distinguishes the unit (but not the value alone): determine why it’s different and doesn’t belong.
     * Not always. In fact, in this class, you may assume that outliers aren’t “wrong” and “do belong.”

  9. Modified Boxplot: Outliers
     A mathematical rule designates “outliers.”
     For a large set of good data drawn from a Normal population, about 2% of the data will be marked “outlier” in a modified boxplot.
     For small but good data sets, generally more than 2% will be so marked.
     For good data from skewed distributions, generally much more than 2% will be so marked.

  10. Modified Boxplot: Outliers
      Learn to match shape with boxplot orientation.
