Chapter 4: Describing Distributions • 4.1 Graphs: good and bad • 4.2 Displaying distributions with graphs • 4.3 Describing distributions with numbers
Definitions • Types of variables • Categorical • E.g., gender, type of degree • Quantitative • E.g., time, mass, force, dollars • The distribution of a variable tells us what values it takes and how often it takes these values.
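As a rough illustration of the definition above, the distribution of a categorical variable can be tabulated by counting how often each value occurs. The degree types below are made-up values, not data from the book.

```python
from collections import Counter

# Hypothetical observations of a categorical variable (type of degree).
degrees = ["BA", "BS", "BS", "MA", "BS", "BA"]

# The distribution: each value the variable takes and how often it takes it.
counts = Counter(degrees)
distribution = {value: count / len(degrees) for value, count in counts.items()}

for value in sorted(counts):
    print(f"{value}: {counts[value]} ({distribution[value]:.0%})")
```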
Exercises, pp. 207-208 • 4.1 • 4.5
Misleading Pictogram (p. 209) • Worker salary: $2000/mo • Manager salary: $4000/mo
Making good graphs (p. 213) • Graphs must have labels, legends, and titles. • Make the data stand out. • Pay attention to what the eye sees. • 3-D is really not necessary!
Exercises, pp. 214-216 • 4.6 through 4.8
Homework • Problems, pp. 219-221, to be done in Excel: • 4.11, 4.15 • Email Excel file by class time on Monday • Section 4.2 Reading, pp. 221-242
Displaying distributions graphically • The distribution of a variable tells us what values it takes and how often it takes these values. • Ways to display distributions for quantitative variables: • dotplots • histograms • stemplots • See example on pp. 221-222.
Histograms • Most common graph of the distribution of a quantitative variable. • How to make a histogram: Example 4.9, p. 224 • Range: 5.7 to 17.6 • Shoot for 6-15 classes (bars) • Read paragraph on p. 226
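A minimal sketch of the class-counting step behind a histogram. The twelve values are invented to span the 5.7 to 17.6 range mentioned above (they are not the Example 4.9 data), and the starting point of 5.0 and class width of 2.0 are assumptions chosen to give 7 classes, inside the suggested 6-15.

```python
import math

def class_counts(values, start, width):
    """Count how many values fall in each class [low, low + width)."""
    n_classes = math.floor((max(values) - start) / width) + 1
    counts = [0] * n_classes
    for x in values:
        counts[int((x - start) // width)] += 1
    return [(start + i * width, start + (i + 1) * width, c)
            for i, c in enumerate(counts)]

# Hypothetical data covering the range 5.7 to 17.6.
values = [5.7, 6.3, 8.1, 8.9, 9.4, 10.2, 10.8, 11.5, 12.0, 13.3, 14.7, 17.6]

classes = class_counts(values, start=5.0, width=2.0)

# Text histogram: one row per class, one star per observation.
for low, high, count in classes:
    print(f"{low:4.1f} to <{high:4.1f} | {'*' * count}")
```

Each value lands in exactly one class because the left endpoint is included and the right endpoint is not.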
Exercise 4.18 • Histogram • By hand • Using calculator • Stemplot • By hand
Interpreting the graphical displays • Concentrate on the main features. • Overall pattern (p. 230) • Shape, center, spread • Outliers • Individual observations outside the overall pattern of the graph
Shape • Symmetric or skewed (p. 231)? • Is it unimodal (one hump) or bimodal (two humps)?
Homework • Reading: pp. 221-242
Stemplots • Usually reserved for smaller data sets. • Advantage: • Actual (or rounded) data are provided. • Possible drawback: • Many people are not used to this type of plot, so the presenter/writer has to describe it.
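A small sketch of how a stemplot is built, assuming non-negative whole numbers with the tens digit as the stem and the ones digit as the leaf. The data are the 15 values listed for Mike in Problem 4.52 later in these slides; the function name `stemplot` is just for illustration.

```python
from collections import defaultdict

def stemplot(values):
    """Return stemplot rows: stem = tens digit(s), leaf = ones digit."""
    leaves = defaultdict(list)
    for x in sorted(values):
        leaves[x // 10].append(x % 10)
    rows = []
    # Include every stem between the smallest and largest, even if empty.
    for stem in range(min(leaves), max(leaves) + 1):
        row = f"{stem} | {' '.join(str(leaf) for leaf in leaves[stem])}"
        rows.append(row.rstrip())
    return rows

mike = [59, 69, 71, 52, 65, 55, 72, 50, 75, 67, 51, 69, 68, 62, 69]

rows = stemplot(mike)
for row in rows:
    print(row)
```

Because the leaves are the actual digits, every original value can be read back off the plot.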
More problems • Exercises: • 4.24 and 4.25, p. 233 • 4.26, p. 233
Practice • Exercises 4.30, p. 239 and 4.32, p. 240 • 4.28, p. 238
Wrapping up Section 4.2 … • 4.28, p. 238 • 4.33, p. 242 • 4.36 • 4.37
4.3 Describing Distributions with Numbers • Until now, we’ve been satisfied with using words to describe the center and spread of distributions. • Now, we will use numbers to describe these characteristics of a distribution. • The 5-number summary (Min, Q1, Median, Q3, Max): • Center: Median (p. 248) • Spread: Quartiles, Q1 and Q3 (p. 250) • Spread: Min and Max
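A minimal sketch of the 5-number summary. It assumes the common textbook convention that Q1 and Q3 are the medians of the lower and upper halves of the ordered data, leaving out the overall median when the number of observations is odd; calculators and software may use slightly different quartile rules. The data are made up.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    # Skip the middle observation when n is odd.
    upper = data[half + 1:] if n % 2 else data[half:]
    return (data[0], median(lower), median(data), median(upper), data[-1])

# Hypothetical data set with 8 observations.
values = [13, 5, 8, 21, 9, 16, 11, 30]

summary = five_number_summary(values)
print("Min, Q1, Median, Q3, Max:", summary)
```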
Boxplots • We can use this information to construct a boxplot:
Practice • 4.46, p. 254 • Enter data in the Stat Edit menu in your calculator, and order them.
Boxplot vs. Modified Boxplot • The modified boxplot shows outliers … they are marked with a *. The lines extending from the quartiles go to the most extreme value on each side that is not an outlier. • If there are no outliers, the modified boxplot and the regular boxplot are identical. • Below are a boxplot (on the left) and modified boxplot (on the right) for Problem 4.39, p. 245.
Practice • Exercises: • 4.50, p. 256 • 4.49, p. 256
Testing for Outliers • Find the interquartile range: • IQR = Q3 - Q1 • Multiply: 1.5 * IQR • Low-side cutoff: • Q1 - 1.5 * IQR • High-side cutoff: • Q3 + 1.5 * IQR • Are there any observations below the low cutoff or above the high cutoff? • If so, they are outliers, and are marked on modified boxplots with an asterisk. • Each tail is drawn to the lowest (or highest) value which is not an outlier.
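The 1.5 * IQR test above can be sketched as follows. The quartiles again use the median-of-halves convention, which is an assumption, and the data set is invented so that one value falls above the high cutoff.

```python
from statistics import median

def quartiles(values):
    """Q1 and Q3 as medians of the lower and upper halves of the data."""
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]
    upper = data[half + 1:] if len(data) % 2 else data[half:]
    return median(lower), median(upper)

def find_outliers(values):
    q1, q3 = quartiles(values)
    iqr = q3 - q1
    low_cutoff = q1 - 1.5 * iqr
    high_cutoff = q3 + 1.5 * iqr
    outliers = [x for x in values if x < low_cutoff or x > high_cutoff]
    inside = [x for x in values if low_cutoff <= x <= high_cutoff]
    return {
        "cutoffs": (low_cutoff, high_cutoff),
        "outliers": outliers,
        # Ends of the tails on a modified boxplot.
        "tails": (min(inside), max(inside)),
    }

# Hypothetical data with one unusually large value.
values = [5, 8, 9, 11, 13, 16, 21, 30, 60]

result = find_outliers(values)
print(result)
```

Here Q1 = 8.5 and Q3 = 25.5, so IQR = 17 and the cutoffs are -17 and 51. Only 60 is flagged, and the upper tail stops at 30.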
Measures of Center and Spread • Median and IQR • Mean and Standard Deviation • Mean is the arithmetic average • Standard deviation measures roughly the typical distance of the observations from their mean. • Variance is the square of the standard deviation. • All of these statistics can be calculated by hand, but today we use technology to compute them … • We use 1-sample stats on our calculators, or a stats program.
Properties of standard deviation (p. 259) • Use s as a measure of spread when you use the mean. • If s=0, there is no spread. • The larger the value for s, the larger the spread of the distribution.
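A short sketch of the calculation behind s, assuming the usual sample formula that divides the sum of squared deviations by n - 1. The hand-rolled version is compared with `statistics.stdev` from the Python standard library, and all data sets are made up to illustrate the properties listed above.

```python
import math
from statistics import mean, stdev

def sample_std_dev(values):
    """Standard deviation s: square root of the sample variance."""
    m = mean(values)
    squared_devs = [(x - m) ** 2 for x in values]
    variance = sum(squared_devs) / (len(values) - 1)
    return math.sqrt(variance)

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical observations
no_spread = [5, 5, 5, 5]          # every value equals the mean
narrow = [4, 5, 5, 6]             # same mean, small spread
wide = [1, 3, 7, 9]               # same mean, large spread

print("s by hand:     ", sample_std_dev(data))
print("s from library:", stdev(data))
print("no spread:     ", sample_std_dev(no_spread))
print("narrow vs wide:", sample_std_dev(narrow), sample_std_dev(wide))
```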
Practice Problem • 4.52, p. 263 • Mike: 59,69,71,52,65,55,72,50,75,67,51,69,68,62,69
Practice Problem • 4.55, p. 263
Choosing a summary • The book has a section on which summary to use (mean and std. dev., or median with the quartiles). • I like to report all of them. • However, when writing about a distribution, or comparing distributions, we should think about which summary works best. See p. 266. • Skewed, or with outliers … median and quartiles • Symmetrical, no (or few) outliers … mean and std. dev. • Mean and standard deviation are most common. One reason is that they allow for more sophisticated calculations in more advanced statistics.
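One way to see why the median is preferred when there are outliers: a single extreme value pulls the mean a long way but barely moves the median. The salaries below (in thousands of dollars) are invented for illustration.

```python
from statistics import mean, median

# Hypothetical salaries, in thousands of dollars.
salaries = [30, 32, 35, 36, 38, 40, 41]

# Same group with one very large salary added.
with_outlier = salaries + [400]

print("Without outlier: mean =", mean(salaries), " median =", median(salaries))
print("With outlier:    mean =", mean(with_outlier), " median =", median(with_outlier))
```

The mean jumps from 36 to 81.5, while the median only moves from 36 to 37.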
More Practice … • p. 271: • 4.57, 4.58, 4.60