
Chapter 2 Exploring Data with Graphs and Numerical Summaries

crescent

Presentation Transcript


  1. Chapter 2: Exploring Data with Graphs and Numerical Summaries. Section 2.3: Measuring the Center of Quantitative Data

  2. Mean. The mean is the sum of the observations divided by the number of observations.

  3. Example: Center of the Cereal Sodium Data. The 20 observations are: 0, 340, 70, 140, 200, 180, 210, 150, 100, 130, 140, 180, 190, 160, 290, 50, 220, 180, 200, 210. We find the mean by adding all the observations and then dividing this sum by the number of observations, which is 20: Mean = (0 + 340 + 70 + ... + 210)/20 = 3340/20 = 167.
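The arithmetic on this slide can be checked with a few lines of Python. This is only a minimal sketch: the variable names (`sodium`, `total`, `n`, `mean`) are chosen for illustration and do not come from the slides.

```python
# Sodium values for the 20 cereals, copied from the slide.
sodium = [0, 340, 70, 140, 200, 180, 210, 150, 100, 130,
          140, 180, 190, 160, 290, 50, 220, 180, 200, 210]

total = sum(sodium)   # sum of the observations
n = len(sodium)       # number of observations
mean = total / n      # mean = sum / count

print(total, n, mean)  # 3340 20 167.0
```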

  4. Median. The median is the midpoint of the observations when they are ordered from smallest to largest (or from largest to smallest). How to determine the median: put the n observations in order of size. If n is odd, the median is the middle observation. If n is even, the median is the average of the two middle observations.
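The odd/even rule above can be sketched as a small Python function. The function name `median` and the error raised for empty input are assumptions made for this illustration; the slides only describe the rule itself.

```python
def median(values):
    """Return the median using the sort-then-pick-the-middle rule."""
    ordered = sorted(values)          # put the n observations in order
    n = len(ordered)
    if n == 0:
        # Assumption for this sketch: empty data has no median.
        raise ValueError("median of empty data is undefined")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]           # n odd: the single middle observation
    # n even: average of the two middle observations
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([3, 1, 2]))      # 2
print(median([4, 1, 3, 2]))   # 2.5
```

Python's standard library offers `statistics.median`, which follows the same rule, so a hand-written version like this is mainly useful for seeing the steps.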

  5. Example: CO2 Pollution. CO2 pollution levels in the 8 largest nations, measured in metric tons per person: China 4.9, Brazil 1.9, India 1.4, Pakistan 0.9, United States 18.9, Russia 10.8, Indonesia 1.8, Bangladesh 0.3. The CO2 values have n = 8 observations. The ordered values are: 0.3, 0.9, 1.4, 1.8, 1.9, 4.9, 10.8, 18.9.

  6. Example: CO2 Pollution (continued). Since n is even, two observations are in the middle: the fourth and fifth ones in the ordered sample. These are 1.8 and 1.9. The median is their average, 1.85. The relatively high value of 18.9 falls well above the rest of the data. It is an outlier. The size of the outlier affects the calculation of the mean but not the median.
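As a rough check of the CO2 example, the sketch below finds the two middle observations, the median, and the mean using Python's standard `statistics` module. The dictionary layout and variable names are assumptions for illustration only.

```python
import statistics

# CO2 emissions in metric tons per person, copied from the slide.
co2 = {"China": 4.9, "Brazil": 1.9, "India": 1.4, "Pakistan": 0.9,
       "United States": 18.9, "Russia": 10.8, "Indonesia": 1.8,
       "Bangladesh": 0.3}

ordered = sorted(co2.values())
n = len(ordered)

# n is even, so the fourth and fifth ordered values are the middle pair.
middle_pair = (ordered[n // 2 - 1], ordered[n // 2])
co2_median = statistics.median(ordered)   # average of the middle pair
co2_mean = statistics.mean(ordered)       # pulled upward by 18.9

print(f"middle pair: {middle_pair}")
print(f"median: {co2_median:.2f}")
print(f"mean: {co2_mean:.2f}")
```

The mean comes out near 5.11, well above the median of 1.85, which is the gap the slide attributes to the outlier.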

  7. Comparing the Mean and Median. The shape of a distribution influences whether the mean is larger or smaller than the median. If the distribution is perfectly symmetric, the mean equals the median. If it is skewed to the right, the mean is larger than the median. If it is skewed to the left, the mean is smaller than the median.

  8. Comparing the Mean and Median (continued). In a skewed distribution, the mean lies farther out in the long tail than the median does. For skewed distributions, the median is preferred because it better represents a typical observation. Figure 2.9: Relationship Between the Mean and Median. Question: For skewed distributions, what causes the mean and median to differ?
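The relationship between shape and the mean/median ordering can be illustrated with three tiny datasets. These numbers are invented for this sketch and do not appear in the slides; they are simply a symmetric list, a list with a long right tail, and its mirror image with a long left tail.

```python
import statistics

# Hypothetical datasets, made up for illustration only.
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 2, 2, 3, 3, 3, 4, 20]         # long tail to the right
left_skewed = [21 - x for x in right_skewed]     # mirror image: long tail to the left

for name, data in [("symmetric", symmetric),
                   ("right-skewed", right_skewed),
                   ("left-skewed", left_skewed)]:
    mean = statistics.mean(data)
    median = statistics.median(data)
    print(f"{name}: mean = {mean}, median = {median}")
```

In each skewed case the single extreme value drags the mean toward the tail, while the median stays with the bulk of the data.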

  9. Resistant Measures. A numerical summary measure is resistant if extreme observations (outliers) have little, if any, influence on its value. The median is resistant to outliers. The mean is not resistant to outliers.
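One way to see resistance in action is to make an outlier more extreme and watch which summary moves. The sketch below reuses the CO2 values from the earlier example and replaces the largest value with a hypothetical 189.0; that replacement is an assumption for illustration, not a figure from the slides.

```python
import statistics

# Ordered CO2 values from the earlier example.
co2 = [0.3, 0.9, 1.4, 1.8, 1.9, 4.9, 10.8, 18.9]

# Hypothetical variant: the outlier is made ten times larger.
inflated = co2[:-1] + [189.0]

for label, data in [("original", co2), ("inflated outlier", inflated)]:
    print(f"{label}: median = {statistics.median(data):.2f}, "
          f"mean = {statistics.mean(data):.2f}")
```

The median is the same in both runs because the two middle observations are unchanged, while the mean rises from about 5.11 to about 26.38.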

  10. The Mode. The mode is the value that occurs most often. In a histogram, it corresponds to the highest bar. The mode is most often used with categorical data.
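A short sketch of finding the mode, first for the cereal sodium values from the earlier example and then for a categorical list. The grain categories are hypothetical, invented for this illustration.

```python
import statistics
from collections import Counter

# Quantitative data: cereal sodium values from the earlier example.
sodium = [0, 340, 70, 140, 200, 180, 210, 150, 100, 130,
          140, 180, 190, 160, 290, 50, 220, 180, 200, 210]
sodium_mode = statistics.mode(sodium)   # value that occurs most often
print(sodium_mode)  # 180

# Categorical data (hypothetical main grain of seven cereals).
grains = ["oats", "corn", "wheat", "oats", "rice", "oats", "corn"]
top_category, count = Counter(grains).most_common(1)[0]
print(top_category, count)  # oats 3
```

`Counter.most_common` also reports how many times the modal category occurs, which is the height of the tallest bar in a bar graph of the categories.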
