1 / 16

STA 291 Fall 2009

STA 291 Fall 2009. Lecture 6 Dustin Lueker. Measures of Variation. Statistics that describe variability Two distributions may have the same mean and/or median but different variability Mean and Median only describe a typical value, but not the spread of the data Range Variance

zody
Download Presentation

STA 291 Fall 2009

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. STA 291Fall 2009 Lecture 6 Dustin Lueker

  2. Measures of Variation • Statistics that describe variability • Two distributions may have the same mean and/or median but different variability • Mean and Median only describe a typical value, but not the spread of the data • Range • Variance • Standard Deviation • Interquartile Range • All of these can be computed for the sample or population STA 291 Fall 2009 Lecture 6

  3. Range • Difference between the largest and smallest observation • Very much affected by outliers • A misrecorded observation may lead to an outlier, and affect the range • The range does not always reveal different variation about the mean STA 291 Fall 2009 Lecture 6

  4. Example • Sample 1 • Smallest Observation: 112 • Largest Observation: 797 • Range = • Sample 2 • Smallest Observation: 15033 • Largest Observation: 16125 • Range = STA 291 Fall 2009 Lecture 6

  5. Deviations • The deviation of the ith observation xi from the sample mean is the difference between them, • Sum of all deviations is zero • Therefore, we use either the sum of the absolute deviations or the sum of the squared deviations as a measure of variation STA 291 Fall 2009 Lecture 6

  6. Sample Variance • Variance of nobservations is the sum of the squared deviations, divided by n-1 STA 291 Fall 2009 Lecture 6

  7. Example STA 291 Fall 2009 Lecture 6

  8. Interpreting Variance • About the average of the squared deviations • “average squared distance from the mean” • Unit • Square of the unit for the original data • Difficult to interpret • Solution • Take the square root of the variance, and the unit is the same as for the original data • Standard Deviation STA 291 Fall 2009 Lecture 6

  9. Properties of Standard Deviation • s ≥ 0 • s = 0 only when all observations are the same • If data is collected for the whole population instead of a sample, then n-1 is replaced by n • s is sensitive to outliers STA 291 Fall 2009 Lecture 6

  10. Variance and Standard Deviation • Sample • Variance • Standard Deviation • Population • Variance • Standard Deviation STA 291 Fall 2009 Lecture 6

  11. Population Parameters and Sample Statistics • Population mean and population standard deviation are denoted by the Greek letters μ (mu) and σ (sigma) • They are unknown constants that we would like to estimate • Sample mean and sample standard deviation are denoted by and s • They are random variables, because their values vary according to the random sample that has been selected STA 291 Fall 2009 Lecture 6

  12. Empirical Rule • If the data is approximately symmetric and bell-shaped then • About 68% of the observations are within one standard deviation from the mean • About 95% of the observations are within two standard deviations from the mean • About 99.7% of the observations are within three standard deviations from the mean STA 291 Fall 2009 Lecture 6

  13. Example • SAT scores are scaled so that they have an approximate bell-shaped distribution with a mean of 500 and standard deviation of 100 • About 68% of the scores are between • About 95% of the scores are between • If you have a score above 700, you are in the top % • What percentile would this be? STA 291 Fall 2009 Lecture 6

  14. Example • According to the National Association of Home Builders, the U.S. nationwide median selling price of homes sold in 1995 was $118,000 • Would you expect the mean to be larger, smaller, or equal to $118,000? • Which of the following is the most plausible value for the standard deviation? (a) –15,000, (b) 1,000, (c) 45,000, (d) 1,000,000 STA 291 Fall 2009 Lecture 6

  15. Coefficient of Variation • Standardized measure of variation • Idea • A standard deviation of 10 may indicate great variability or small variability, depending on the magnitude of the observations in the data set • CV = Ratio of standard deviation divided by mean • Population and sample version STA 291 Fall 2009 Lecture 6

  16. Example • Which sample has higher relative variability? (a higher coefficient of variation) • Sample A • mean = 62 • standard deviation = 12 • CV = • Sample B • mean = 31 • standard deviation = 7 • CV = STA 291 Fall 2009 Lecture 6

More Related