Chapter 4: Describing Distributions

bowen

  1. Chapter 4: Describing Distributions • 4.1 Graphs: good and bad • 4.2 Displaying distributions with graphs • 4.3 Describing distributions with numbers

  2. Dow Jones Industrial Average

  3. Pie Graph

  4. Definitions • Types of variables • Categorical • E.g., gender, type of degree • Quantitative • E.g., time, mass, force, dollars • The distribution of a variable tells us what values it takes and how often it takes these values.
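A minimal Python sketch of what "distribution" means for a categorical variable: each value paired with how often it occurs. The `degrees` list and the `distribution` helper are made up for illustration and do not come from the textbook.

```python
from collections import Counter

# Hypothetical categorical data (not from the slides): type of degree.
degrees = ["BA", "BS", "BA", "MS", "BS", "BA", "PhD", "BS", "BA", "MS"]

def distribution(values):
    """Return {value: (count, percent)} for each distinct value."""
    counts = Counter(values)
    total = len(values)
    return {value: (count, 100 * count / total) for value, count in counts.items()}

dist = distribution(degrees)
for value, (count, percent) in sorted(dist.items()):
    # One row per value: what the variable takes, and how often.
    print(f"{value:4s} {count:2d} {percent:5.1f}%")
```

These counts (or percents) are what a bar graph or pie chart of the variable would display.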

  5. Bar graph showing a distribution

  6. Exercises, pp. 207-208 • 4.1 • 4.5

  7. Bar graph for 4.1

  8. Pie Chart for 4.1

  9. Misleading Pictogram (p. 209): Worker Salary $2000/mo and Manager Salary $4000/mo

  10. Dow Jones Industrial Average: This is a line graph (p. 210)

  11. Misleading Graphs?

  12. Making good graphs (p. 213) • Graphs must have labels, legends, and titles. • Make the data stand out. • Pay attention to what the eye sees. • 3-D is really not necessary!

  13. Exercises, pp. 214-216 • 4.6 through 4.8

  14. Homework • Problems, pp. 219-221, to be done in Excel: • 4.11, 4.15 • Email Excel file by class time on Monday • Section 4.2 Reading, pp. 221-242

  15. 4.2 Displaying Distributions with Graphs

  16. Displaying distributions graphically • The distribution of a variable tells us what values it takes and how often it takes these values. • Ways to display distributions for quantitative variables: • dotplots • histograms • stemplots • See example on pp. 221-222.

  17. Figure 4.15: A histogram

  18. Figure 4.16: A stemplot

  19. Histograms • Most common graph of the distribution of a quantitative variable. • How to make a histogram: Example 4.9, p. 224 • Range: 5.7 to 17.6 • Shoot for 6-15 classes (bars) • Read paragraph on p. 226
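One way to lay out classes for a histogram, sketched in Python. Only the range 5.7 to 17.6 and the 6-15 class guideline come from the slide. The helper names, the class widths tried, and the `sample` observations are assumptions, and Example 4.9 may choose its classes differently.

```python
import math

def class_edges(lo, hi, width):
    """Equal-width class boundaries that cover every value in [lo, hi]."""
    start = math.floor(lo / width) * width
    n_classes = math.floor((hi - start) / width) + 1
    return [start + i * width for i in range(n_classes + 1)]

def class_counts(data, edges):
    """Count the observations in each class [edges[i], edges[i+1])."""
    width = edges[1] - edges[0]
    counts = [0] * (len(edges) - 1)
    for x in data:
        counts[int((x - edges[0]) // width)] += 1
    return counts

# Try two candidate widths for the range quoted on the slide.
for width in (1, 2):
    edges = class_edges(5.7, 17.6, width)
    print(f"width {width}: {len(edges) - 1} classes from {edges[0]} to {edges[-1]}")

# Hypothetical observations spanning the same range (not the book's data).
sample = [5.7, 8.2, 9.1, 9.9, 10.4, 11.0, 11.3, 12.5, 13.8, 17.6]
edges = class_edges(min(sample), max(sample), 2)
counts = class_counts(sample, edges)
for left, count in zip(edges, counts):
    # Each class includes its left endpoint and excludes its right endpoint.
    print(f"{left:2d}-{left + 2:2d} | " + "*" * count)
```

Both candidate widths give a class count inside the 6-15 guideline; which one shows the shape best is a judgment call.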

  20. Example 4.9, pp. 224-226

  21. Practice Problem: 4.18, p. 226

  22. Exercise 4.18 • Histogram • By hand • Using calculator • Stemplot • By hand

  23. Interpreting the graphical displays • Concentrate on the main features. • Overall pattern (p. 230) • Shape, center, spread • Outliers • Individual observations outside the overall pattern of the graph

  24. Example 4.10, p. 230

  25. Shape • Symmetric or skewed (p. 231)? • Is it unimodal (one hump) or bimodal (two humps)?

  26. Homework • Reading: pp. 221-242

  27. Stemplots • Usually reserved for smaller data sets. • Advantage: • Actual (or rounded) data are provided. • Possible drawback: • Many people are not used to this type of plot, so the presenter/writer has to describe it.

  28. How to make a stemplot, p. 236
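A rough Python sketch of the usual stemplot recipe: split each observation into a stem (here the tens digit) and a leaf (the ones digit), list the stems in order, and write the leaves in increasing order beside each stem. The data are the values listed for Mike in Exercise 4.52; the `stemplot` helper is hypothetical and only handles non-negative whole numbers.

```python
from collections import defaultdict

def stemplot(data):
    """Return stemplot lines for non-negative integers (stem = tens, leaf = ones)."""
    leaves = defaultdict(list)
    for x in sorted(data):
        leaves[x // 10].append(x % 10)
    lines = []
    # Walk every stem from smallest to largest so that empty stems still appear.
    for stem in range(min(leaves), max(leaves) + 1):
        lines.append(f"{stem} | " + "".join(str(leaf) for leaf in leaves[stem]))
    return lines

# Values listed for Mike in Exercise 4.52.
mike = [59, 69, 71, 52, 65, 55, 72, 50, 75, 67, 51, 69, 68, 62, 69]
for line in stemplot(mike):
    print(line)
```

Because every leaf is an actual digit from the data, the original values can be read back off the plot.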

  29. More problems • Exercises: • 4.24 and 4.25, p. 233 • 4.26, p. 233

  30. Practice • Exercises 4.30, p. 239 and 4.32, p. 240 • 4.28, p. 238

  31. Wrapping up Section 4.2 … • 4.28, p. 238 • 4.33, p. 242 • 4.36 • 4.37

  32. 4.3 Describing Distributions with Numbers • Until now, we’ve been satisfied with using words to describe the center and spread of distributions. • Now, we will use numbers to describe these characteristics of a distribution. • The 5-number summary: • Center: Median (p. 248) • Spread: Find the Quartiles, Q1 and Q3. (p. 250) • Spread: Min and Max
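A small Python sketch of the 5-number summary, using the values listed for Mike in Exercise 4.52. It assumes the common textbook convention that Q1 and Q3 are the medians of the lower and upper halves of the ordered data, leaving out the overall median when the count is odd. Calculators and software may use slightly different quartile rules, so treat the helper below as illustrative.

```python
from statistics import median

def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) using the median-of-halves convention."""
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    lower = xs[:half]
    # With an odd count, the middle observation belongs to neither half.
    upper = xs[half + 1:] if n % 2 else xs[half:]
    return xs[0], median(lower), median(xs), median(upper), xs[-1]

# Values listed for Mike in Exercise 4.52.
mike = [59, 69, 71, 52, 65, 55, 72, 50, 75, 67, 51, 69, 68, 62, 69]
summary = five_number_summary(mike)
print("min, Q1, median, Q3, max =", summary)
```

These five numbers are exactly what a regular boxplot draws: the box spans Q1 to Q3, a line marks the median, and the tails reach the minimum and maximum.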

  33. Boxplots • We can use this information to construct a boxplot:

  34. Practice • 4.46, p. 254 • Enter data in the Stat Edit menu in your calculator, and order them.

  35. Boxplot vs. Modified Boxplot • The modified boxplot shows outliers … they are marked with a *. The lines extending from the quartiles go to the last number which is not an outlier. • If there are no outliers, the modified boxplot and the regular boxplot are identical. • Below are a boxplot (on the left) and modified boxplot (on the right) for Problem 4.39, p. 245.

  36. Side-by-side boxplots (p. 252)

  37. Practice • Exercises: • 4.50, p. 256 • 4.49, p. 256

  38. Testing for Outliers • Find the Inter-Quartile Range: • IQR=Q3-Q1 • Multiply: 1.5*IQR • Outliers on low side: • Values below Q1-1.5*IQR • Outliers on high side: • Values above Q3+1.5*IQR • Are there any numbers outside of these values? • If so, they are outliers, and are marked on boxplots with an asterisk. • The tail is drawn to the highest (or lowest) value which is not an outlier.
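The outlier test above can be sketched in Python as follows. The data set is hypothetical, and the quartiles are computed with the median-of-halves convention, which is an assumption; a calculator may report slightly different quartiles.

```python
from statistics import median

def quartiles(data):
    """Return (Q1, Q3) as the medians of the lower and upper halves."""
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    lower = xs[:half]
    upper = xs[half + 1:] if n % 2 else xs[half:]
    return median(lower), median(upper)

def outlier_fences(q1, q3):
    """Return the cutoffs Q1 - 1.5*IQR and Q3 + 1.5*IQR."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Hypothetical data with one unusually large value (not from the textbook).
data = [12, 15, 16, 18, 19, 20, 21, 22, 24, 45]
q1, q3 = quartiles(data)
low, high = outlier_fences(q1, q3)

outliers = [x for x in data if x < low or x > high]
inside = [x for x in data if low <= x <= high]
# In a modified boxplot the tails stop at the most extreme non-outliers.
whiskers = (min(inside), max(inside))

print("Q1, Q3:", q1, q3)
print("fences:", low, high)
print("outliers:", outliers)
print("tails end at:", whiskers)
```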

  39. Measures of Center and Spread • Median and IQR • Mean and Standard Deviation • Mean is the arithmetic average. • Standard deviation measures roughly the average distance of the observations from their mean. • Variance is simply the squared standard deviation. • All of these statistics can be calculated by hand, but today we use technology to compute them … • We use 1-sample stats on our calculators, or a stats program.

  40. Properties of standard deviation (p. 259) • Use s as a measure of spread when you use the mean. • If s=0, there is no spread. • The larger the value for s, the larger the spread of the distribution.

  41. Practice Problem • 4.52, p. 263 • Mike: 59,69,71,52,65,55,72,50,75,67,51,69,68,62,69
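For checking calculator output, here is a hedged Python sketch of the mean, variance, and standard deviation for the values listed for Mike. It assumes the sample formulas with divisor n - 1, which is what the standard library's `statistics.stdev` uses; the hand-rolled helper exists only to show the arithmetic.

```python
import math
import statistics

# Values listed for Mike in Exercise 4.52.
mike = [59, 69, 71, 52, 65, 55, 72, 50, 75, 67, 51, 69, 68, 62, 69]

def sample_variance(data):
    """Sum of squared deviations from the mean, divided by n - 1."""
    xbar = sum(data) / len(data)
    return sum((x - xbar) ** 2 for x in data) / (len(data) - 1)

mean = sum(mike) / len(mike)
variance = sample_variance(mike)
s = math.sqrt(variance)  # standard deviation is the square root of the variance

print(f"mean = {mean:.1f}")
print(f"variance = {variance:.2f}")
print(f"s = {s:.2f}")

# Cross-check against the standard library's sample standard deviation.
print(f"statistics.stdev = {statistics.stdev(mike):.2f}")
```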

  42. Practice Problem • 4.55, p. 263

  43. Example 4.21, p. 265

  44. Choosing a summary • The book has a section on which summary to use (mean and std. dev., or median with the quartiles). • I like to report all of them. • However, when writing about a distribution, or comparing distributions, we should think about which summary works best. See p. 266. • Skewed, outliers … median and quartiles • Symmetrical, no (or few) outliers … mean and std. dev. • Mean and standard deviation are most common. One reason is that they allow for more sophisticated calculations to be used in higher statistics.
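A short Python illustration of why the median and quartiles are suggested when there are outliers: a single extreme value pulls the mean a long way but barely moves the median. The numbers are invented for this sketch.

```python
from statistics import mean, median

# Hypothetical, roughly symmetric data (not from the textbook).
values = [30, 32, 35, 36, 38, 40, 41]
# The same data with one extreme observation added.
with_outlier = values + [400]

mean_before, median_before = mean(values), median(values)
mean_after, median_after = mean(with_outlier), median(with_outlier)

print("without outlier: mean =", mean_before, "median =", median_before)
print("with outlier:    mean =", mean_after, "median =", median_after)
```

Without the extreme value the mean and median agree; with it, the mean more than doubles while the median shifts by one unit.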

  45. More Practice … • p. 271: • 4.57, 4.58, 4.60
