Chapter 4

Chapter 4 Displaying Quantitative Data

Dealing With a Lot of Numbers... • When looking at large sets of quantitative data, it can be difficult to get a sense of what the numbers are telling us without summarizing the numbers in some way. • In this chapter, we will concentrate on graphical displays of quantitative data.

Percent of Population over 65 per state (1996) 13.0 14.3 12.5 13.9 13.8 12.5 15.8 12.1 5.2 12.8 12.6 11.4 11.4 14.5 12.1 11.2 13.2 18.5 15.2 14.1 12.0 13.4 14.4 11.6 14.4 9.9 13.7 12.4 13.8 13.5 12.5 15.2 10.5 12.9 12.6 12.4 11.0 13.4 10.2 13.3 11.0 11.4 11.4 12.3 13.4 15.9 8.8 11.2 13.8 13.2

What do these data tell us? • Make a picture • Histogram • Stem-and-Leaf Display • Dot plot • First three things to do with data • Make a picture • Make a picture • Make a picture

Displaying Quantitative Data • Histogram • Give each graph a title • Give each one of the axes a label • Make as neat as possible • Computer • Grid paper

Displaying Quantitative Data • Histogram • Divide data values into equal-width piles (called bins) • Count number of values in each bin • Plot the bins on x-axis • Plot the bin counts on y-axis

Example – Population Over 65 • Decide on bin values • Low value is 5.2 and high value is 18.5 • Bins are 5.0 up to 6.0, 6.0 up to 7.0, etc. • Written as 5.0 ≤ X < 6.0, 6.0 ≤ X < 7.0 • Count number of values in each bin • Bin 5.0 ≤ X < 6.0 has 1 value • Bin 6.0 ≤ X < 7.0 has 0 values • Bin 7.0 ≤ X < 8.0 has 0 values • Bin 8.0 ≤ X < 9.0 has 1 value • Continue counting values in each bin

Example – Population Over 65 • Plot bins on x-axis • 14 bins from 5.0 ≤ X < 6.0 to 18.0 ≤ X < 19.0 • Plot bin counts on y-axis • Bin counts are: 1, 0, 0, 1, 1, 2, 9, 13, 13, 5, 4, 0, 0, 1
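The binning steps above can be sketched in a few lines of Python. This is a minimal illustration rather than a prescribed method: the function name `bin_counts` and the text-based plot are assumptions made for this example, and the data are the 50 state values listed earlier.

```python
# Percent of population over 65, by state (1996), as listed in the slides.
PCT_OVER_65 = [
    13.0, 14.3, 12.5, 13.9, 13.8, 12.5, 15.8, 12.1, 5.2, 12.8,
    12.6, 11.4, 11.4, 14.5, 12.1, 11.2, 13.2, 18.5, 15.2, 14.1,
    12.0, 13.4, 14.4, 11.6, 14.4, 9.9, 13.7, 12.4, 13.8, 13.5,
    12.5, 15.2, 10.5, 12.9, 12.6, 12.4, 11.0, 13.4, 10.2, 13.3,
    11.0, 11.4, 11.4, 12.3, 13.4, 15.9, 8.8, 11.2, 13.8, 13.2,
]


def bin_counts(values, low, high, width=1):
    """Count values in equal-width bins low <= X < low + width, and so on.

    Hypothetical helper for illustration; assumes every value lies in
    the interval [low, high).
    """
    n_bins = int((high - low) / width)
    counts = [0] * n_bins
    for v in values:
        # Floor division picks the bin whose left edge is at or below v.
        counts[int((v - low) // width)] += 1
    return counts


counts = bin_counts(PCT_OVER_65, low=5, high=19)

# Crude text histogram: bins listed down the page, one star per value.
for i, c in enumerate(counts):
    left = 5 + i
    print(f"{left:4.1f} <= X < {left + 1:4.1f} | {'*' * c}")
```

The resulting `counts` list can be compared against the bin counts on the slide. A plotting library would normally draw the bars; the text version only keeps the sketch dependency-free.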

Displaying Quantitative Data • Stem and Leaf Display • Picture of Distribution • Generally used for smaller data sets • Group data like histograms • Still have original values (unlike histograms) • Two columns • Left column: Stem • Right column: Leaf

Displaying Quantitative Data • Stem and Leaf Display • Leaf • Contains the last digit of each value • Arranged in increasing order away from stem • Stem • Contains the remaining digits of each value • Arranged in increasing order from top to bottom

Example – Population Over 65 • Leaf = tenths digit • Stem = tens and ones digits • Ex. 5 | 2 (represents 5.2) • Ex. 10 | 2 5 (represents 10.2 and 10.5) • Ex. 14 | 1 3 4 4 5
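As a rough sketch of how the stem and leaf columns can be built from the data, the Python below splits each value into a stem (tens and ones digits) and a leaf (tenths digit). The function name `stem_and_leaf` is an assumption for this example, and it assumes the values are recorded to one decimal place.

```python
# Percent of population over 65, by state (1996), as listed in the slides.
PCT_OVER_65 = [
    13.0, 14.3, 12.5, 13.9, 13.8, 12.5, 15.8, 12.1, 5.2, 12.8,
    12.6, 11.4, 11.4, 14.5, 12.1, 11.2, 13.2, 18.5, 15.2, 14.1,
    12.0, 13.4, 14.4, 11.6, 14.4, 9.9, 13.7, 12.4, 13.8, 13.5,
    12.5, 15.2, 10.5, 12.9, 12.6, 12.4, 11.0, 13.4, 10.2, 13.3,
    11.0, 11.4, 11.4, 12.3, 13.4, 15.9, 8.8, 11.2, 13.8, 13.2,
]


def stem_and_leaf(values):
    """Return {stem: [leaves]} with stems and leaves in increasing order.

    Hypothetical helper; assumes values have exactly one decimal place.
    """
    # Work in whole tenths so the digit split is exact integer arithmetic.
    tenths = sorted(round(v * 10) for v in values)
    rows = {}
    for t in tenths:
        stem, leaf = divmod(t, 10)
        rows.setdefault(stem, []).append(leaf)
    # Keep empty stems so gaps in the data stay visible in the display.
    return {s: rows.get(s, []) for s in range(min(rows), max(rows) + 1)}


display = stem_and_leaf(PCT_OVER_65)
for stem, leaves in display.items():
    print(f"{stem:2d} | {' '.join(str(d) for d in leaves)}")
```

Empty stems (6, 7, 16, 17) are printed on purpose: dropping them would hide how far the lowest and highest values sit from the rest.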

Percent of Population over Age 65 (by state) in 1996

Example – Frank Thomas • Career Home Runs (1990-2004) 4 7 15 18 24 28 29 32 35 38 40 40 41 42 43

Displaying Quantitative Data • Back-to-back Stem-and-Leaf Display • Used to compare two variables • Stems in center column • Leaves for one variable – right side • Leaves for other variable – left side • Arrange leaves in increasing order, AWAY FROM STEM!

Example – Compare Frank Thomas to Ryne Sandberg • Career Home Runs for Ryne Sandberg (1981-1997) 0 5 7 8 9 12 14 16 19 19 25 26 26 26 30 40
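A back-to-back display for the two players can be sketched as follows. The choice of Sandberg on the left and Thomas on the right is an assumption made for this example (the slides do not say which side each player takes), and the helper names are hypothetical. Stems are the tens digit and leaves the ones digit.

```python
THOMAS = [4, 7, 15, 18, 24, 28, 29, 32, 35, 38, 40, 40, 41, 42, 43]
SANDBERG = [0, 5, 7, 8, 9, 12, 14, 16, 19, 19, 25, 26, 26, 26, 30, 40]


def leaves_by_stem(values):
    """Map each tens digit (stem) to its sorted ones digits (leaves)."""
    rows = {}
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        rows.setdefault(stem, []).append(leaf)
    return rows


def back_to_back(left, right):
    """Build display lines with shared stems in the center column."""
    l_rows, r_rows = leaves_by_stem(left), leaves_by_stem(right)
    first = min(min(l_rows), min(r_rows))
    last = max(max(l_rows), max(r_rows))
    lines = []
    for s in range(first, last + 1):
        # Left leaves are reversed so they increase AWAY from the stem.
        left_txt = " ".join(str(d) for d in reversed(l_rows.get(s, [])))
        right_txt = " ".join(str(d) for d in r_rows.get(s, []))
        lines.append(f"{left_txt:>12} | {s} | {right_txt}")
    return lines


lines = back_to_back(SANDBERG, THOMAS)
print(f"{'Sandberg':>12} |   | Thomas")
for line in lines:
    print(line)
```

Reading across a row compares the two players within the same range of home runs, which is the point of sharing one column of stems.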

Displaying Quantitative Data • If there are a large number of observations in only a few stems, we can split stems. • Split each stem into two stems • First stem holds leaves 0 – 4 • Second stem holds leaves 5 – 9 • If you choose to split one stem you MUST split them all!
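Splitting stems can be sketched as a small change to the stem-and-leaf grouping: each stem gets a low row for leaves 0 to 4 and a high row for leaves 5 to 9. The function name `split_stem_and_leaf` and the `(stem, half)` keys are assumptions for this illustration, applied to the population data from earlier in the chapter.

```python
# Percent of population over 65, by state (1996), as listed in the slides.
PCT_OVER_65 = [
    13.0, 14.3, 12.5, 13.9, 13.8, 12.5, 15.8, 12.1, 5.2, 12.8,
    12.6, 11.4, 11.4, 14.5, 12.1, 11.2, 13.2, 18.5, 15.2, 14.1,
    12.0, 13.4, 14.4, 11.6, 14.4, 9.9, 13.7, 12.4, 13.8, 13.5,
    12.5, 15.2, 10.5, 12.9, 12.6, 12.4, 11.0, 13.4, 10.2, 13.3,
    11.0, 11.4, 11.4, 12.3, 13.4, 15.9, 8.8, 11.2, 13.8, 13.2,
]


def split_stem_and_leaf(values):
    """Return {(stem, half): [leaves]}; half 0 is leaves 0-4, half 1 is 5-9.

    Hypothetical helper; assumes values have exactly one decimal place.
    """
    rows = {}
    for t in sorted(round(v * 10) for v in values):
        stem, leaf = divmod(t, 10)
        half = 0 if leaf <= 4 else 1
        rows.setdefault((stem, half), []).append(leaf)
    lo_stem = min(s for s, _ in rows)
    hi_stem = max(s for s, _ in rows)
    # Every stem is split, including empty ones: split one, split them all.
    keys = [(s, h) for s in range(lo_stem, hi_stem + 1) for h in (0, 1)]
    return {k: rows.get(k, []) for k in keys}


split_display = split_stem_and_leaf(PCT_OVER_65)
for (stem, _half), leaves in split_display.items():
    print(f"{stem:2d} | {' '.join(str(d) for d in leaves)}")
```

Each stem now appears twice in the printed display, so the crowded stems 11 through 13 are spread over six rows instead of three.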

Example – Population Over 65

Looking at Distributions • Always report 3 things when describing a distribution: • Shape • Center • Spread

Looking at Distributions • Shape • How many humps (called modes)? • None = uniform • One = unimodal • Two = bimodal • Three or more = multimodal

Unimodal vs Bimodal

Looking at Distributions • Shape • Is it symmetric? • Symmetric = roughly equal on both sides • Skewed = more values on one side • Right = Tail stretches to large values • Left = Tail stretches to small values • Are there any outliers? • Interesting observations in data • Can impact statistical methods

Examples of Skewness

Looking at Distributions • Center • A single number to describe the data • Can calculate different numbers for center

Looking at Distributions • Spread • Variation in the data values • Smallest observation to the largest observation • May take into account any outliers • Later, spread will be a single number

Example – Population Over 65 • Shape • Unimodal • Symmetric • Two Outliers (5% and 18%) • Center - 12% • Spread - Almost all observations are between 8% and 16%
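The spread statement above can be checked directly against the data. The sketch below reports the smallest and largest observations and counts how many values fall between 8% and 16%. It also computes the median as one possible single-number center; that choice is an assumption for this example, since the slides say only that different numbers can be calculated for center.

```python
import statistics

# Percent of population over 65, by state (1996), as listed in the slides.
PCT_OVER_65 = [
    13.0, 14.3, 12.5, 13.9, 13.8, 12.5, 15.8, 12.1, 5.2, 12.8,
    12.6, 11.4, 11.4, 14.5, 12.1, 11.2, 13.2, 18.5, 15.2, 14.1,
    12.0, 13.4, 14.4, 11.6, 14.4, 9.9, 13.7, 12.4, 13.8, 13.5,
    12.5, 15.2, 10.5, 12.9, 12.6, 12.4, 11.0, 13.4, 10.2, 13.3,
    11.0, 11.4, 11.4, 12.3, 13.4, 15.9, 8.8, 11.2, 13.8, 13.2,
]

# Spread: smallest observation to largest observation.
smallest, largest = min(PCT_OVER_65), max(PCT_OVER_65)

# How many observations lie in the 8% to 16% band named on the slide?
in_band = [v for v in PCT_OVER_65 if 8 <= v <= 16]
outside = [v for v in PCT_OVER_65 if not 8 <= v <= 16]

# One candidate for center (an assumption; other choices are possible).
center = statistics.median(PCT_OVER_65)

print(f"smallest = {smallest}, largest = {largest}")
print(f"{len(in_band)} of {len(PCT_OVER_65)} values lie between 8 and 16")
print(f"values outside that band: {sorted(outside)}")
print(f"median = {center:.1f}")
```

The values that fall outside the band are the same two observations the slide flags as outliers.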

Example – Frank Thomas • Shape • Unimodal • Skewed left • No outliers • Center – 28 • Spread – between 4 and 43

Example – Compare Frank Thomas to Ryne Sandberg • Sandberg: Shape • Unimodal • Skewed right • No outliers • Center – 26 • Spread – between 0 and 40 • Both players have about the same spread • Thomas has more high values

What Do We Know? • Histograms, Stem-and-Leaf Displays, Back-to-Back Stem-and-Leaf Displays • When describing a display, always mention: • Shape: number of modes, symmetric or skewed • Spread • Center • Outliers (mention them if they exist; otherwise, say there are no outliers)

What Do We Know? (cont.) • A graph is either symmetric or skewed, not both! • If a graph is skewed, be sure to specify the direction: • Skewed left or skewed right
