Chapter 4 Displaying Quantitative Data
Dealing With a Lot of Numbers... • When looking at large sets of quantitative data, it can be difficult to get a sense of what the numbers are telling us without summarizing the numbers in some way. • In this chapter, we will concentrate on graphical displays of quantitative data.
Percent of Population over 65 per state (1996) 13.0 14.3 12.5 13.9 13.8 12.5 15.8 12.1 5.2 12.8 12.6 11.4 11.4 14.5 12.1 11.2 13.2 18.5 15.2 14.1 12.0 13.4 14.4 11.6 14.4 9.9 13.7 12.4 13.8 13.5 12.5 15.2 10.5 12.9 12.6 12.4 11.0 13.4 10.2 13.3 11.0 11.4 11.4 12.3 13.4 15.9 8.8 11.2 13.8 13.2
What do these data tell us? • Make a picture • Histogram • Stem-and-Leaf Display • Dot plot • First three things to do with data • Make a picture • Make a picture • Make a picture
Displaying Quantitative Data • Histogram • Give each graph a title • Label each axis • Make the graph as neat as possible • Use a computer • Or use grid paper
Displaying Quantitative Data • Histogram • Divide data values into equal-width piles (called bins) • Count number of values in each bin • Plot the bins on x-axis • Plot the bin counts on y-axis
Example – Population Over 65 • Decide on bin values • Low value is 5.2 and high value is 18.5 • Bins are 5.0 up to 6.0, 6.0 up to 7.0, etc. • Written as 5.0 ≤ X < 6.0, 6.0 ≤ X < 7.0 • Count number of values in each bin • Bin 5.0 ≤ X < 6.0 has 1 value • Bin 6.0 ≤ X < 7.0 has 0 values • Bin 7.0 ≤ X < 8.0 has 0 values • Bin 8.0 ≤ X < 9.0 has 1 value • Continue counting values in each bin
Example – Population Over 65 • Plot bins on x-axis • 14 bins from 5.0 ≤ X < 6.0 to 18.0 ≤ X < 19.0 • Plot bin counts on y-axis • Bin counts are: 1, 0, 0, 1, 1, 2, 9, 13, 13, 5, 4, 0, 0, 1
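The binning and counting steps above can be sketched in Python. This is only an illustration, not part of the original slides: the helper name `bin_counts` and the text-based display are assumptions made for this example, and the helper assumes every value lies inside the chosen bins.

```python
import math

# Percent of population over 65 per state (1996), copied from the data slide
percent_over_65 = [
    13.0, 14.3, 12.5, 13.9, 13.8, 12.5, 15.8, 12.1, 5.2, 12.8,
    12.6, 11.4, 11.4, 14.5, 12.1, 11.2, 13.2, 18.5, 15.2, 14.1,
    12.0, 13.4, 14.4, 11.6, 14.4, 9.9, 13.7, 12.4, 13.8, 13.5,
    12.5, 15.2, 10.5, 12.9, 12.6, 12.4, 11.0, 13.4, 10.2, 13.3,
    11.0, 11.4, 11.4, 12.3, 13.4, 15.9, 8.8, 11.2, 13.8, 13.2,
]


def bin_counts(values, start, n_bins, width=1.0):
    """Count the values in each bin start + i*width <= X < start + (i+1)*width.

    Hypothetical helper for illustration; assumes no value falls outside the bins.
    """
    counts = [0] * n_bins
    for x in values:
        counts[math.floor((x - start) / width)] += 1
    return counts


# 14 bins, from 5.0 <= X < 6.0 up to 18.0 <= X < 19.0
counts = bin_counts(percent_over_65, start=5.0, n_bins=14)

# Text histogram: one row per bin, one star per state
for i, count in enumerate(counts):
    low = 5.0 + i
    print(f"{low:4.1f} <= X < {low + 1:4.1f} | {'*' * count}")
```

The resulting `counts` list reproduces the bin counts listed on the slide, and the rows of stars give a sideways view of the histogram.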
Displaying Quantitative Data • Stem and Leaf Display • Picture of Distribution • Generally used for smaller data sets • Group data like histograms • Still have original values (unlike histograms) • Two columns • Left column: Stem • Right column: Leaf
Displaying Quantitative Data • Stem and Leaf Display • Leaf • Contains the last digit of each value • Arranged in increasing order away from stem • Stem • Contains the remaining (leading) digits of each value • Arranged in increasing order from top to bottom
Example – Population Over 65 • Leaf = tenths digit • Stem = tens and ones digits • Ex. 5 | 2 represents 5.2 • Ex. 10 | 2 5 represents 10.2 and 10.5 • Ex. 14 | 1 3 4 4 5 represents 14.1, 14.3, 14.4, 14.4, and 14.5
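A stem-and-leaf display for these data could be built along the following lines. This is a minimal sketch, assuming the stem and leaf definitions from the slide (stem = tens and ones digits, leaf = tenths digit); the function name `stem_and_leaf` is made up for this example.

```python
from collections import defaultdict

# Percent of population over 65 per state (1996), copied from the data slide
percent_over_65 = [
    13.0, 14.3, 12.5, 13.9, 13.8, 12.5, 15.8, 12.1, 5.2, 12.8,
    12.6, 11.4, 11.4, 14.5, 12.1, 11.2, 13.2, 18.5, 15.2, 14.1,
    12.0, 13.4, 14.4, 11.6, 14.4, 9.9, 13.7, 12.4, 13.8, 13.5,
    12.5, 15.2, 10.5, 12.9, 12.6, 12.4, 11.0, 13.4, 10.2, 13.3,
    11.0, 11.4, 11.4, 12.3, 13.4, 15.9, 8.8, 11.2, 13.8, 13.2,
]


def stem_and_leaf(values):
    """Map each stem to its leaves in increasing order (hypothetical helper)."""
    display = defaultdict(list)
    # Work in whole tenths (13.4 -> 134) to avoid floating-point digit errors
    for tenths in sorted(round(x * 10) for x in values):
        stem, leaf = divmod(tenths, 10)
        display[stem].append(leaf)
    return display


display = stem_and_leaf(percent_over_65)

# Print every stem from lowest to highest, including stems with no leaves
for stem in range(min(display), max(display) + 1):
    leaves = " ".join(str(leaf) for leaf in display.get(stem, []))
    print(f"{stem:2d} | {leaves}")
```

Empty stems (6, 7, 16, 17) are printed on purpose: leaving them out would hide the gaps that make 5.2 and 18.5 stand apart from the rest of the data.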
Example – Frank Thomas • Career Home Runs (1990-2004) 4 7 15 18 24 28 29 32 35 38 40 40 41 42 43
Displaying Quantitative Data • Back-to-back Stem-and-Leaf Display • Used to compare two variables • Stems in center column • Leaves for one variable – right side • Leaves for other variable – left side • Arrange leaves in increasing order, AWAY FROM STEM!
Example – Compare Frank Thomas to Ryne Sandberg • Career Home Runs for Ryne Sandberg (1981-1997) 0 5 7 8 9 12 14 16 19 19 25 26 26 26 30 40
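A back-to-back display of the two players' home runs can be sketched as follows. The slides do not say which player goes on which side, so placing Sandberg on the left and Thomas on the right is an assumption, as are the helper names `leaves_by_stem` and `back_to_back`. Here the stem is the tens digit and the leaf is the ones digit.

```python
thomas = [4, 7, 15, 18, 24, 28, 29, 32, 35, 38, 40, 40, 41, 42, 43]
sandberg = [0, 5, 7, 8, 9, 12, 14, 16, 19, 19, 25, 26, 26, 26, 30, 40]


def leaves_by_stem(values):
    """Map each stem (tens digit) to its leaves (ones digits) in increasing order."""
    groups = {}
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        groups.setdefault(stem, []).append(leaf)
    return groups


def back_to_back(left_values, right_values):
    """Build the rows of a back-to-back display with the stems in the center."""
    left = leaves_by_stem(left_values)
    right = leaves_by_stem(right_values)
    low = min(min(left), min(right))
    high = max(max(left), max(right))
    rows = []
    for stem in range(low, high + 1):
        # Reverse the left leaves so they increase AWAY from the stem
        left_text = " ".join(str(d) for d in reversed(left.get(stem, [])))
        right_text = " ".join(str(d) for d in right.get(stem, []))
        rows.append(f"{left_text:>12} | {stem} | {right_text}")
    return rows


rows = back_to_back(sandberg, thomas)
print(f"{'Sandberg':>12} |   | Thomas")
for row in rows:
    print(row)
```

Reading across a row compares the two players within the same range of values; for example, the 40s row has one Sandberg season against five Thomas seasons.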
Displaying Quantitative Data • If a large number of observations fall in only a few stems, we can split stems. • Split each stem into two stems • First stem holds leaves 0 – 4. • Second stem holds leaves 5 – 9. • If you choose to split one stem you MUST split them all!
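Splitting stems can be sketched in the same way. The slides do not tie this rule to a particular data set, so using the population-over-65 values here is an assumption chosen because most of those values pile up in stems 11, 12, and 13; the helper name `split_stem_rows` is also made up for this example.

```python
# Percent of population over 65 per state (1996), copied from the data slide
percent_over_65 = [
    13.0, 14.3, 12.5, 13.9, 13.8, 12.5, 15.8, 12.1, 5.2, 12.8,
    12.6, 11.4, 11.4, 14.5, 12.1, 11.2, 13.2, 18.5, 15.2, 14.1,
    12.0, 13.4, 14.4, 11.6, 14.4, 9.9, 13.7, 12.4, 13.8, 13.5,
    12.5, 15.2, 10.5, 12.9, 12.6, 12.4, 11.0, 13.4, 10.2, 13.3,
    11.0, 11.4, 11.4, 12.3, 13.4, 15.9, 8.8, 11.2, 13.8, 13.2,
]


def split_stem_rows(values):
    """Group leaves by (stem, half): half 0 holds leaves 0-4, half 1 holds leaves 5-9."""
    rows = {}
    # Work in whole tenths (12.5 -> 125) so the digits are exact
    for tenths in sorted(round(x * 10) for x in values):
        stem, leaf = divmod(tenths, 10)
        half = 0 if leaf <= 4 else 1
        rows.setdefault((stem, half), []).append(leaf)
    return rows


rows = split_stem_rows(percent_over_65)
stems = [stem for stem, _ in rows]

for stem in range(min(stems), max(stems) + 1):
    for half in (0, 1):  # every stem is split, even when one half is empty
        leaves = " ".join(str(leaf) for leaf in rows.get((stem, half), []))
        print(f"{stem:2d} | {leaves}")
```

Each stem now appears twice, so the 13 values that shared stem 12 are spread over two shorter rows.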
Looking at Distributions • Always report 3 things when describing a distribution: • Shape • Center • Spread
Looking at Distributions • Shape • How many humps (called modes)? • None = uniform • One = unimodal • Two = bimodal • Three or more = multimodal
Looking at Distributions • Shape • Is it symmetric? • Symmetric = roughly equal on both sides • Skewed = more values on one side • Right = Tail stretches to large values • Left = Tail stretches to small values • Are there any outliers? • Interesting observations in data • Can impact statistical methods
Looking at Distributions • Center • A single number to describe the data • Can calculate different numbers for center
Looking at Distributions • Spread • Variation in the data values • Smallest observation to the largest observation • May take into account any outliers • Later, spread will be a single number
Example – Population Over 65 • Shape • Unimodal • Symmetric • Two outliers (5.2% and 18.5%) • Center - about 12% to 13% • Spread - Almost all observations are between 8% and 16%
Example – Frank Thomas • Shape • Unimodal • Skewed left • No outliers • Center - 28 • Spread – between 4 and 43
Example – Compare Frank Thomas to Ryne Sandberg • Ryne Sandberg • Shape • Unimodal • Skewed right • No outliers • Center – about 18 (the 16 listed values have a median of 17.5) • Spread – between 0 and 40 • Comparison • Both players have about the same spread • Thomas has more high values
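As a rough numeric check on the comparison, center and spread can be computed directly from the two lists of home runs. This sketch uses the median and mean for center and the range (largest minus smallest) for spread; these are choices made for this example, since the slides only say that different numbers can be calculated for center and that spread will later become a single number. The helper name `summarize` is hypothetical.

```python
from statistics import mean, median

thomas = [4, 7, 15, 18, 24, 28, 29, 32, 35, 38, 40, 40, 41, 42, 43]
sandberg = [0, 5, 7, 8, 9, 12, 14, 16, 19, 19, 25, 26, 26, 26, 30, 40]


def summarize(values):
    """Return simple center and spread figures for one data set."""
    return {
        "median": median(values),
        "mean": round(mean(values), 1),
        "min": min(values),
        "max": max(values),
        "range": max(values) - min(values),
    }


thomas_summary = summarize(thomas)
sandberg_summary = summarize(sandberg)

for name, summary in [("Thomas", thomas_summary), ("Sandberg", sandberg_summary)]:
    print(name, summary)
```

The ranges come out as 39 for Thomas and 40 for Sandberg, which matches the observation that both players have about the same spread, while the medians (32 and 17.5) show how much higher Thomas's typical season is. A center read by eye from a display may differ somewhat from these computed values.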
What Do We Know? • Histograms, Stem-and-Leaf Displays, Back-to-Back Stem-and-Leaf Displays • When describing a display, always mention: • Shape: number of modes, symmetric or skewed • Spread • Center • Outliers (mention them if they exist; otherwise, say there are no outliers)
What Do We Know? (cont.) • A graph is either symmetric or skewed, not both! • If a graph is skewed, be sure to specify the direction: • Skewed left or skewed right