
STAT131 Week 2 Lecture 2 Making Comparisons

Anne Porter


Presentation Transcript


  1. STAT131 Week 2 Lecture 2: Making Comparisons • Anne Porter

  2. Lecture Outline • Histograms and Distributions • Comparing two batches of data • Comparing many batches of data • Good graphics

  3. Histograms and Distributions • Video clip: Decisions through Data, Tape 1 Unit 3 • Comparing mean and mode • Symmetric and asymmetric distributions • Changing the bin width
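The effect of changing the bin width can be sketched in a few lines of Python. This is only an illustration: the `sizes` list is made up, not the lecture's data, and the helper names `histogram_counts` and `print_histogram` are hypothetical.

```python
import math
from collections import Counter

def histogram_counts(values, bin_width, start=0.0):
    """Count values per bin [left, left + bin_width); empty bins are omitted."""
    counts = Counter(math.floor((v - start) / bin_width) for v in values)
    return {start + k * bin_width: counts[k] for k in sorted(counts)}

def print_histogram(values, bin_width, start=0.0):
    """Print one row of stars per non-empty bin."""
    for left, n in histogram_counts(values, bin_width, start).items():
        print(f"{left:5.1f} to <{left + bin_width:5.1f} | {'*' * n}")

# Hypothetical sample, invented for illustration only
sizes = [7, 8, 8.5, 9, 9, 9.5, 10, 10, 10, 10.5, 11, 11, 12, 13]

print_histogram(sizes, bin_width=1)   # narrow bins: more detail, more noise
print()
print_histogram(sizes, bin_width=2)   # wide bins: smoother, less detail
```

The same values give a differently shaped picture at each width, which is why the bin width is worth varying before describing a distribution.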

  4. Comparing data for men and women • What might we do?

  5. Dot plot (with error) • Note how easy it is to see the outlier • What do you see?

  6. Dot plot (error removed) • Note how easy it is to see: • Spread • Centre of data • Absence of outliers • Comparison of male and female shoe size • We cannot see the shape of the distribution or the density of dots (i.e. many people with the one shoe size)
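A minimal sketch of a one-row dot plot for two groups on a shared scale follows. The shoe sizes are hypothetical and the function name `dot_plot` is an assumption for illustration. Repeated values overprint a single `o`, which reproduces the limitation noted above: spread and centre are visible, but the density of dots is not.

```python
def dot_plot(groups, step=0.5):
    """Return (lo, hi, rows): one text row per group, all on the same scale."""
    all_values = [v for vals in groups.values() for v in vals]
    lo, hi = min(all_values), max(all_values)
    n_cols = round((hi - lo) / step) + 1          # one column per scale position
    rows = {}
    for name, vals in groups.items():
        occupied = {round((v - lo) / step) for v in vals}
        rows[name] = "".join("o" if i in occupied else "." for i in range(n_cols))
    return lo, hi, rows

# Hypothetical shoe sizes, invented for illustration only
groups = {
    "female": [6, 7, 7.5, 8, 8, 8.5, 9],
    "male": [8, 9, 10, 10, 10.5, 11, 12],
}

lo, hi, rows = dot_plot(groups)
for name, row in rows.items():
    print(f"{name:<7} {row}")
print(f"scale   {lo} to {hi} in steps of 0.5")
```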

  7. Comparing (two) batches of data • SPSS does not do this • Reverse the stems for one side in a table in Word • What does the data reveal?

  8. Comparing (two) batches of data • SPSS does not do this • Reverse the stems for one side in a table in Word • What does the data reveal? • Shape of male and female distributions • Centre (mode) for males (10) and females (8) • Spread for males (7-15) and females (4-11.5) • No outliers (?) • Pattern within distribution • Are there male half sizes above 13?
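Since SPSS does not produce this display, a back-to-back stem-and-leaf plot can also be built by hand or with a short script. The sketch below is one possible approach, not the lecture's method: it assumes positive shoe sizes with whole sizes as stems and the first decimal digit as the leaf, uses hypothetical data, and the function names are invented. Leaves on the left side are written in reverse so that they read outward from the stems.

```python
from collections import defaultdict

def stem_leaf(values):
    """Map each whole size (stem) to its sorted first-decimal digits (leaves)."""
    leaves = defaultdict(list)
    for v in sorted(values):
        stem = int(v)                      # assumes positive values
        leaves[stem].append(round((v - stem) * 10))
    return leaves

def back_to_back(left, right):
    """Return text lines: left leaves (reversed) | stem | right leaves."""
    l, r = stem_leaf(left), stem_leaf(right)
    stems = range(min(min(l), min(r)), max(max(l), max(r)) + 1)
    width = max(len(v) for v in l.values())
    lines = []
    for s in stems:
        left_leaves = "".join(str(d) for d in reversed(l.get(s, [])))
        right_leaves = "".join(str(d) for d in r.get(s, []))
        lines.append(f"{left_leaves:>{width}} | {s:2d} | {right_leaves}".rstrip())
    return lines

# Hypothetical shoe sizes, invented for illustration only
female = [6, 7, 7.5, 8, 8, 8.5, 9]
male = [8, 9, 10, 10, 10.5, 11, 12]

lines = back_to_back(female, male)
print("\n".join(lines))
```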

  9. Box-and-Whisker plots (with error) • A boxplot plots the five-number summary • What does it reveal?

  10. Box-and-Whisker plots (with error) • What does it reveal?

  11. Box-and-Whisker plots (with error) • What does it reveal? • Based on samples of 31 females and 118 males we can see: • Centre (median shoe size) for males (10) is higher than for females (8) • Spread (range and interquartile range) for males (9-11) and females (7-8.5) is roughly the same • Outliers: females have two (sizes 4 and 11.5) and males two (sizes 14 and 15) • Female data is more asymmetric than male data, with a relatively shorter tail of upper values
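The five-number summary behind a boxplot can be sketched with Python's standard `statistics` module. Two assumptions are worth flagging: quartile conventions differ between packages, so the "inclusive" method used here may not match SPSS exactly, and the 1.5 × IQR rule for flagging outliers is the common boxplot convention rather than something stated on the slides. The data and function names are hypothetical.

```python
import statistics

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the 'inclusive' quartile method."""
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return data[0], q1, median, q3, data[-1]

def outliers(values, k=1.5):
    """Values more than k * IQR below Q1 or above Q3 (assumed 1.5 * IQR rule)."""
    _, q1, _, q3, _ = five_number_summary(values)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in sorted(values) if v < lower or v > upper]

# Hypothetical sample, invented for illustration only
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]

print("five-number summary:", five_number_summary(sample))
print("flagged as outliers:", outliers(sample))
```

Reporting the median and interquartile range alongside the plot makes statements such as "the centre for males is higher" checkable against numbers.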

  12. When comparing batches of data • The plots must be on the same scale to allow comparison of: centre, distribution (shape), spread, outliers, patterns • Comparison using two separate plots is not appropriate • Different plots will reveal different aspects of the data

  13. Utility: Box plots vs Stem and Leaf plots • Box plots are especially useful for comparing ≥ 2 samples. They show the key points of a sample, but not the individual values. • Stem and leaf plots show individual values, and give a better picture of the shape of the spread, but their detail makes them unsuitable for comparing more than two groups (side by side or back to back).

  14. Comparing more than two groups • Same axes/scale • Compare: • Centre (medians) • Spread (interquartile range, middle 50%) • Outliers/extreme points • Shape (symmetry) • Be detailed (e.g. Superlooper median approx. 180 cm, compared with cone dart 590 cm and glider 400 cm) • Include units of measurement

  15. Graphical Excellence • DO: • Convey the message about the data • Include axes, units, variable names, figure labels • DO NOT: • Distort the data • Use pie charts (there is always a better chart) • Use more dimensions than necessary, 3D instead of 2D • Add unnecessary pattern, fill, ink, decoration

  16. Never use a pie chart • Percentage of students in each lab group

  17. Never use a pie chart! • There is always a better chart than the pie chart. • Better: easier to read, minimum and maximum percentages easily found, comparison between groups easier

  18. Use the fewest dimensions possible • If the bars have the same width, the height is read • In general it is the area that matters in histograms and bar charts • 3D has volume, which can distract the reader • Note the use of a bar chart for discrete data and a histogram for continuous data
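One alternative to a pie chart for the lab-group percentages is a plain bar chart. The text-only sketch below uses hypothetical lab names and counts, not the lecture's data, and the `bar_chart` helper is invented for illustration. Sorting the bars from largest to smallest is a design choice that puts the maximum and minimum percentages at the top and bottom.

```python
def bar_chart(counts, width=40):
    """Return text bars of group percentages, largest group first."""
    total = sum(counts.values())
    lines = []
    for name, n in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
        pct = 100 * n / total
        bar = "#" * round(pct / 100 * width)   # bar length proportional to percentage
        lines.append(f"{name:<8} {pct:5.1f}% {bar}")
    return lines

# Hypothetical lab-group counts, invented for illustration only
lab_counts = {"Lab A": 30, "Lab B": 20, "Lab C": 40, "Lab D": 10}

chart = bar_chart(lab_counts)
print("\n".join(chart))
```

Because all bars share one baseline and one width, only their lengths need to be compared, which is harder to do with the angles of a pie chart.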

  19. Take care with colours

  20. To reveal • Centre • Spread • Outliers • Distribution • Patterns • Anything unusual • Comparisons • And more

  21. What do you want to see in data? • Information • Meaning • We must turn data into information in order to have meaning

  22. What can we see in data? • Location (centre) • Spread • Shape • Outliers • Unusual patterns • Gaps, clusters • How do batches differ?

  23. Tools for making meaning from data • Ordering data • Dot plots & jittered dot plots • Stem-and-leaf plots • Histograms • Boxplots • Bar charts • Pie charts • Frequency tables • Numerical summaries

  24. Selecting the tool depends on • The question asked • How the variable is measured • The structure of the data • Utility of the tool

  25. Homework • Textbook reading: Utts & Heckard (2004), Chapter 2, or • Textbook reading: Moore and McCabe, pp. 38-55, or • Textbook reading: Griffiths, Stirling and Weldon (1998), Chapters 1, 2, 6 (pp. ) • Complete lab and preparation for next week's lab.
