Chapter 2 Summarizing and Graphing Data 2-1 Review and Preview 2-2 Frequency Distributions 2-3 Histograms 2-4 Graphs that Enlighten and Graphs that Deceive
Definition A histogram is a graph of a frequency distribution.
Understanding Histograms A histogram is a graph consisting of bars of equal width drawn adjacent to each other (unless there are gaps in the data). The horizontal scale represents the classes of quantitative data values, and the vertical scale represents the frequencies.
Drawing Histograms • The heights (y values) of the bars correspond to the frequency values. • The values on the horizontal scale may be class boundaries, class midpoints, or lower class limits. We will use midpoints.
Drawing Histograms • The values on the horizontal scale are boundaries in this picture. • The values on the horizontal scale are midpoints in this picture.
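The construction described in the two slides above can be sketched in a few lines of Python. This is only an illustration: the data values, the first lower class limit, and the class width are hypothetical, and the boundaries assume whole-number data (so each boundary sits 0.5 outside the class limits).

```python
# Hypothetical sample data and class settings, chosen only for illustration.
data = [52, 55, 61, 64, 67, 70, 71, 73, 78, 79, 82, 85, 88, 91, 97]
lower_limit = 50   # lower class limit of the first class (assumed)
class_width = 10   # every class has the same width (assumed)
num_classes = 5


def frequency_distribution(values, first_lower, width, k):
    """Return one dict per class: limits, midpoint, boundaries, frequency."""
    classes = []
    for i in range(k):
        lower = first_lower + i * width
        upper = lower + width - 1  # upper class limit for whole-number data
        freq = sum(1 for v in values if lower <= v <= upper)
        classes.append({
            "limits": (lower, upper),
            "midpoint": (lower + upper) / 2,
            "boundaries": (lower - 0.5, upper + 0.5),
            "frequency": freq,
        })
    return classes


table = frequency_distribution(data, lower_limit, class_width, num_classes)
for row in table:
    # The bar height is the class frequency; asterisks stand in for the bar.
    print(f"{row['midpoint']:>5}  {'*' * row['frequency']}")
```

Each printed row is one bar, labeled with its class midpoint. Labeling with `row["boundaries"]` instead would give the other horizontal scale mentioned above.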
Example • What is a histogram? • What do the values on the vertical axis represent? • What do the values on the horizontal axis represent? • What does it mean that the bars are of equal width and adjacent? • How do we find the numbers on the vertical axis? The horizontal?
Purpose We use a histogram to analyze the shape of the distribution of the data.
Four Possible shapes • Normal • Uniform • Skewed Left • Skewed Right
Normal Distribution • In a normal distribution the frequencies increase to a maximum and then decrease, and the graph is roughly symmetric. • Normal distributions have a “bell” shape. • Many statistical methods require that a distribution (set of values) is normal. • Some really important methods presented later require sample data from a normal population.
Skewness A distribution of data is skewed if it is not symmetric and extends more to one side than to the other. Data skewed to the right (positively skewed) have a longer right tail. Data skewed to the left (negatively skewed) have a longer left tail.
Uniform Distribution In a uniform distribution, the frequencies are all roughly equal.
Example IQ scores from children with low levels of lead. • What is the shape of this distribution? • What is the center?
Example – IQ Scores • What is the shape of this distribution? Normal. • What is the center? About 90.
Relative Frequency Histogram A relative frequency histogram has the same shape and horizontal scale as a histogram, but the vertical scale is marked with relative frequencies instead of actual frequencies.
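As a small illustration of why the shape does not change, the sketch below divides each class frequency by the total. The frequencies are hypothetical, not taken from the slides.

```python
# Hypothetical class frequencies, for illustration only.
frequencies = [2, 3, 5, 3, 2]


def relative_frequencies(freqs):
    """Divide each class frequency by the total number of data values."""
    total = sum(freqs)
    return [f / total for f in freqs]


relative = relative_frequencies(frequencies)
for f, r in zip(frequencies, relative):
    # Every bar is rescaled by the same factor, so the shape is unchanged.
    print(f"{f:>3}  {r:.1%}")
```

Because every frequency is divided by the same total, the tallest bar stays the tallest and the relative frequencies add up to 1 (or 100%).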
Example – A consumer visits multiple stores and finds that most of the CD prices are about the same. Which histogram has a shape that represents this?
In summary A histogram is a graph of a frequency distribution. The shapes discussed here are normal, uniform, skewed left, and skewed right. The heights of the bars are the class frequencies, and each bar spans one class on the horizontal scale.
Assessing Normality with a Normal Quantile Plot • Many methods we will use later in the text require that the sample data come from a population with a normal distribution. • A normal quantile plot can be interpreted using the following criteria: • Normal Distribution: The points are reasonably close to a straight line. • Not a Normal Distribution: The points are not reasonably close to a straight line, or the points show some systematic pattern that is not a straight line.
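One common way to build the points of a normal quantile plot is sketched below. It is an assumption, not something stated on the slide, that the sorted values are paired with the z scores for the cumulative probabilities 1/(2n), 3/(2n), 5/(2n), and so on. The sample values are hypothetical, and the `straightness` helper (a correlation between x and z) is only a rough numerical stand-in for judging the plot by eye.

```python
from math import sqrt
from statistics import NormalDist


def normal_quantile_points(sample):
    """Pair each sorted value x with the z score expected at its position."""
    xs = sorted(sample)
    n = len(xs)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    zs = [std_normal.inv_cdf((2 * i - 1) / (2 * n)) for i in range(1, n + 1)]
    return list(zip(xs, zs))


def straightness(points):
    """Correlation between x and z; near 1 means the points are nearly
    collinear. Requires at least two distinct x values."""
    xs = [p[0] for p in points]
    zs = [p[1] for p in points]
    mx = sum(xs) / len(xs)
    mz = sum(zs) / len(zs)
    sxz = sum((x - mx) * (z - mz) for x, z in points)
    sxx = sum((x - mx) ** 2 for x in xs)
    szz = sum((z - mz) ** 2 for z in zs)
    return sxz / sqrt(sxx * szz)


# Hypothetical sample values, for illustration only.
sample = [72, 85, 88, 90, 91, 93, 96, 99, 104, 118]
points = normal_quantile_points(sample)
for x, z in points:
    print(f"{x:>5}  {z:+.2f}")
print("straightness:", round(straightness(points), 3))
```

The slides give no numerical cutoff for "reasonably close to a straight line", so the plot itself should still be inspected for curves or other systematic patterns.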
Key Concept This section discusses other types of statistical graphs. Our objective is to identify a suitable graph for representing the data set. The graph should be effective in revealing the important characteristics of the data.
Key Concept • Two things can make a graph “bad”. • The graph is not accurate. • The graph is accurate but misleading. • In this section we develop the ability to recognize bad graphs of type 2 and identify exactly how they are misleading.
Graphs that Enlighten and Graphs that Deceive • Graphs that enlighten give a better understanding of the data. • Graphs that deceive are misleading.
Graphs that Enlighten Scatterplot (or Scatter Diagram) A plot of ordered pairs (x, y) of quantitative data with a horizontal x-axis and a vertical y-axis. Used to determine whether there is a correlation (relationship) between the two variables. Randomly selected males – the pattern suggests there is a relationship.
In the last example we used a subjective method to determine correlation. (Do the points come close to lying on a line?) In Chapter 10 we develop more rigorous and reliable methods. If there is a strong linear correlation, the correlation coefficient r is close to -1 or 1. If there is no linear correlation, r is close to 0. CFU Randomly selected males – the pattern suggests there is a relationship.
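For reference, the linear correlation coefficient r mentioned above can be computed directly from paired data. The sketch below uses the standard Pearson formula; the paired values are hypothetical and are not the data from the slide.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Linear correlation coefficient r for paired data (x, y).
    Requires equal-length lists in which neither variable is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical paired measurements, for illustration only.
x_values = [1, 2, 3, 4, 5]
y_values = [2.1, 3.9, 6.2, 7.8, 10.1]
print(round(pearson_r(x_values, y_values), 3))
```

Points that rise along a line give r near 1, points that fall along a line give r near -1, and points with no linear pattern give r near 0.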
Graphs that Enlighten Time-Series Graph Data that have been collected at different points in time are called time-series data. These graphs allow us to observe how the data change over time. Unlike in a scatterplot, the points are connected with line segments. Yearly high values of the Dow Jones Industrial Average
Graphs that Enlighten Dotplot A graph in which each data value is plotted as a point (or dot) along a scale of values. Dots representing equal values are stacked. Dotplots are good for comparing different frequencies.
Graphs that Enlighten Stemplot (or Stem-and-Leaf Plot) Represents quantitative data. Separates each data value into two parts: the stem (on the left) and the leaf (on the right). This is good for comparing frequencies. Notice that if you rotate a stemplot 90 degrees counterclockwise, it resembles a histogram. CFU
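The stem and leaf split can be sketched as follows. The choice of the last digit as the leaf and the sample values are assumptions made for illustration; the sketch handles only non-negative whole numbers.

```python
from collections import defaultdict


def stemplot(values):
    """Return stemplot rows for non-negative whole numbers.
    Stem = all digits except the last; leaf = the last digit."""
    leaves = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        leaves[stem].append(leaf)
    rows = []
    # Include stems with no leaves so that gaps in the data stay visible.
    for stem in range(min(leaves), max(leaves) + 1):
        leaf_text = "".join(str(leaf) for leaf in leaves.get(stem, []))
        rows.append(f"{stem} | {leaf_text}")
    return rows


# Hypothetical data values, for illustration only.
for row in stemplot([12, 15, 21, 34, 36, 36]):
    print(row)
```

Longer rows mean more data values in that range, which is why the rotated stemplot looks like a histogram.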
Graphs that Enlighten Bar Graph Uses bars of equal width to show frequencies of categorical, or qualitative, data. The vertical scale represents frequencies or relative frequencies. The horizontal scale identifies the different categories of qualitative data. A multiple bar graph has two or more sets of bars (see page 64) and is used to compare two or more data sets.
Graphs that Enlighten Multiple Bar Graph Like a bar graph, it represents categorical data, but it uses two or more sets of data with the purpose of comparing their frequencies. CFU
Graphs that Enlighten Pareto Chart A bar graph with the bars arranged in descending order according to frequencies.
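The ordering step that turns a bar graph into a Pareto chart can be sketched like this. The categories and counts are hypothetical.

```python
def pareto_order(counts):
    """Return (category, frequency) pairs sorted from highest to lowest."""
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)


# Hypothetical complaint categories and counts, for illustration only.
complaints = {
    "Slow service": 12,
    "Wrong order": 30,
    "Cold food": 7,
    "Billing error": 18,
}

for category, freq in pareto_order(complaints):
    # Asterisks stand in for the bar; the tallest bar comes first.
    print(f"{category:<14} {'*' * freq}")
```

Putting the largest category first draws attention to the categories that matter most.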
Graphs that Enlighten Pie Chart A graph depicting categorical data as slices of a circle, in which the size of each slice is proportional to the frequency count.
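"Proportional to the frequency count" means each slice gets the same share of the 360 degrees that its category has of the total. A minimal sketch, with hypothetical categories and counts:

```python
def slice_angles(counts):
    """Return the central angle in degrees for each category's slice."""
    total = sum(counts.values())
    return {category: freq / total * 360 for category, freq in counts.items()}


# Hypothetical categories and counts, for illustration only.
survey = {"Yes": 45, "No": 30, "Undecided": 15}
for category, angle in slice_angles(survey).items():
    print(f"{category:<10} {angle:.0f} degrees")
```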
Graphs that Enlighten Frequency Polygon Uses line segments connected to points located directly above class midpoint values.
Graphs that Enlighten Relative Frequency Polygon Uses relative frequencies (proportions or percentages) for the vertical scale.
Graphs that Enlighten Ogive A line graph that depicts cumulative frequencies
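Cumulative frequencies are running totals of the class frequencies. The sketch below computes them and pairs each total with an upper class boundary, which is one common convention for placing ogive points. That convention, the frequencies, and the boundaries are all assumptions for illustration.

```python
from itertools import accumulate

# Hypothetical class frequencies and upper class boundaries.
frequencies = [2, 3, 5, 3, 2]
upper_boundaries = [59.5, 69.5, 79.5, 89.5, 99.5]

# Running total: each entry counts all values up to that boundary.
cumulative = list(accumulate(frequencies))

# Points to connect with line segments when drawing the ogive.
ogive_points = list(zip(upper_boundaries, cumulative))
for boundary, total in ogive_points:
    print(f"below {boundary}: {total}")
```

The last cumulative frequency always equals the total number of data values, so an ogive never decreases from left to right.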
Graphs That Deceive Nonzero Axis: Graphs can be misleading because one or both of the axes begin at some value other than zero, so that differences are exaggerated.
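To see how much a nonzero axis can exaggerate a difference, compare the ratio of the two values with the ratio of the bar heights actually drawn. The numbers below are hypothetical, and `apparent_ratio` is an illustrative helper rather than a standard function.

```python
def apparent_ratio(a, b, axis_start=0.0):
    """Ratio of the drawn height of bar b to bar a when the vertical
    axis starts at axis_start (which must be below a)."""
    return (b - axis_start) / (a - axis_start)


# Hypothetical values: 50 and 55 differ by only 10%.
print(apparent_ratio(50, 55))                 # axis starts at 0
print(apparent_ratio(50, 55, axis_start=45))  # axis starts at 45
```

With the axis starting at 45, the second bar is drawn twice as tall as the first even though its value is only 10% larger.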
Pictographs Drawings of objects. Three-dimensional objects - money bags, stacks of coins, army tanks (for army expenditures), people (for population sizes), barrels (for oil production), and houses (for home construction) are commonly used to depict data. These drawings can create false impressions that distort the data. Pictographs using areas and volumes can be very misleading.
Example – Income and Education Misleading. Depicts one-dimensional data with three-dimensional boxes. The last box has 64 times the volume of the first box, but the income is only 4 times as large.
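The factor of 64 is consistent with every side of the box being scaled by the income ratio of 4, since 4 × 4 × 4 = 64. That scaling is an assumption about how the picture was drawn. A short sketch of how length, area, and volume grow under the same scale factor:

```python
def scale_factors(k):
    """How much a drawing grows when every linear dimension is scaled by k."""
    return {"length": k, "area": k ** 2, "volume": k ** 3}


# Income is 4 times as large, so a fair (one-dimensional) bar is 4 times as tall.
factors = scale_factors(4)
print(factors)
```

Only the length entry matches the data. A two-dimensional picture overstates the difference by a factor of 16, and a three-dimensional one by a factor of 64.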
Example – Income and Education The bars have the same width, but the graph is too busy and too difficult to understand.
Better example – Income and Education Fair, objective, unencumbered by distracting features.
Important Principles Suggested by Edward Tufte • For small data sets of 20 values or fewer, use a table instead of a graph. • A graph of data should make the viewer focus on the true nature of the data, not on other elements, such as eye-catching but distracting design features. • Do not distort data. Construct a graph to reveal the true nature of the data. • Almost all of the ink in a graph should be used for the data, not for the other design elements.
Quiz 1. Explain why a nonzero axis is deceiving. 2. a) Draw a histogram for the table. b) Describe the shape of the graph.