
NUMERICAL DESCRIPTIVE STATISTICS Measures of Variability


Presentation Transcript


  1. NUMERICAL DESCRIPTIVE STATISTICS Measures of Variability

  2. Another Description of the Data -- Variability • For Data Set A below, the mean of the 10 observations is 2.60. SET A: 4,2,3,3,2,2,1,4,3,2 • But each of the following two data sets with 10 observations also has a mean of 2.60. SET B: 2,2,2,2,3,3,3,3,3,3 SET C: 0,0,1,1,4,4,4,4,4,4 • Although sets A, B, and C all have the same mean, the “spread” of the data differs from set to set.
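
The claim on this slide can be checked directly. The following is a minimal sketch using Python's standard library; the variable names are illustrative and not part of the original presentation.

```python
from statistics import mean

# Data sets copied from the slide
set_a = [4, 2, 3, 3, 2, 2, 1, 4, 3, 2]
set_b = [2, 2, 2, 2, 3, 3, 3, 3, 3, 3]
set_c = [0, 0, 1, 1, 4, 4, 4, 4, 4, 4]

# Same mean for all three sets, but different smallest and largest values
summary = {
    name: (mean(data), min(data), max(data))
    for name, data in (("A", set_a), ("B", set_b), ("C", set_c))
}
for name, (m, lo, hi) in summary.items():
    print(f"Set {name}: mean={m:.2f}, min={lo}, max={hi}")
```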

  3. The “Spread” of the Data • [Figure: plots of Data Sets A, B, and C] • Data Set B: least “spread” • Data Set C: most “spread”

  4. Measures of Variability • Population • Variance σ² • Standard Deviation σ • Sample • Range • Variance s² • Standard Deviation s

  5. The Range • When we are talking about a sample, the range is the difference between the highest and lowest observation • In the sample there were some A’s (4’s), and the lowest value in the sample was a D (1) • Sample range = 4 - 1 = 3
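
As a small illustration, the sample range can be computed as below. The function name `sample_range` is a hypothetical helper, and the data are Data Set A from slide 2, which matches the highest (4) and lowest (1) values quoted here.

```python
def sample_range(values):
    """Difference between the highest and lowest observation."""
    return max(values) - min(values)

sample = [4, 2, 3, 3, 2, 2, 1, 4, 3, 2]  # Data Set A
print(sample_range(sample))  # 4 - 1 = 3
```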

  6. Another Approach to Variability • The range only takes into account the two most extreme values • A better approach • Look at the variability of all the data • In some sense find the “average” deviation from the mean • The value of an observation minus the mean can be positive or negative • The plusses and minuses cancel each other out giving an average value of 0 • Need another measure
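
The cancellation described on this slide can be seen numerically. This sketch uses exact fractions (an implementation choice, not something from the slides) so that floating-point rounding does not blur the zero.

```python
from fractions import Fraction

sample = [4, 2, 3, 3, 2, 2, 1, 4, 3, 2]  # Data Set A
x_bar = Fraction(sum(sample), len(sample))  # 13/5, i.e. 2.6

# Each deviation is positive or negative; together they cancel out
deviations = [x - x_bar for x in sample]
total = sum(deviations)
print([float(d) for d in deviations])
print(total)  # 0
```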

  7. How to Average Only Positive Deviations • MEAN ABSOLUTE DEVIATION (MAD) • Averages the absolute values of these differences • Used in quality control/inventory analyses • But this quantity is hard to work with algebraically and analytically • POPULATION VARIANCE (σ²) • Averages the squares of the differences from the mean
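
A rough side-by-side of the two averages is sketched below. As an assumption made only for illustration, Data Set A is treated as if it were a complete population, so both quantities divide by the number of values.

```python
from fractions import Fraction

# Assumption: Data Set A is treated as a whole population for this sketch
values = [4, 2, 3, 3, 2, 2, 1, 4, 3, 2]
n = len(values)
mu = Fraction(sum(values), n)

# Mean absolute deviation: average of |x - mu|
mad = sum(abs(x - mu) for x in values) / n

# Population variance: average of (x - mu)^2
pop_variance = sum((x - mu) ** 2 for x in values) / n

print(float(mad), float(pop_variance))
```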

  8. Population Variance Formulas
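
The formulas on this slide were not captured in the transcript. The standard definitional and shortcut forms of the population variance, which are presumably what the slide showed, are given below; the symbols N (population size), x_i (the individual values), and μ (population mean) are the usual conventions rather than notation confirmed by the source.

```latex
\mu = \frac{1}{N}\sum_{i=1}^{N} x_i,
\qquad
\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}
         = \frac{\sum_{i=1}^{N} x_i^2 - N\mu^2}{N}
```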

  9. EXAMPLE: Calculation of σ² • Using the numbers from the population of 2000 GPA’s: 4,2,1,3,3,3,2,… 2

  10. Standard Deviation • But the unit of measurement for σ² is: • Square Grade Points (???) • What is a square grade point? • To get back to the original units (grade points), take the square root of σ² • STANDARD DEVIATION (σ): the square root of the variance, σ²

  11. Calculation of the Standard Deviation (σ) • For the grade point data:

  12. Estimating σ² • SAMPLE VARIANCE (s²) • The best estimate for σ² is s² • s² is found by using the sample data and the formula for σ², except:

  13. Sample Variance Formulas
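
The formulas themselves are missing from the transcript. The standard forms of the sample variance are shown below as a likely reconstruction: compared with the population formula, the sample mean replaces the population mean and the divisor is n − 1 rather than the number of observations. The symbols n and x̄ are conventional choices, not taken from the source.

```latex
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i,
\qquad
s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}
    = \frac{\sum_{i=1}^{n} x_i^2 - n\bar{x}^2}{n - 1}
```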

  14. Calculations for s² • The data from the sample are: 4,2,3,3,2,2,1,4,3,2
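
The slide's worked calculation is not in the transcript. The sketch below carries it out step by step for the sample shown, assuming the usual n − 1 divisor, and cross-checks the result against `statistics.variance` from the Python standard library.

```python
from statistics import variance

sample = [4, 2, 3, 3, 2, 2, 1, 4, 3, 2]
n = len(sample)
x_bar = sum(sample) / n  # sample mean, 2.6

# Sum of squared deviations from the sample mean
sum_sq_dev = sum((x - x_bar) ** 2 for x in sample)

# Sample variance divides by n - 1, not n
s2 = sum_sq_dev / (n - 1)

print(round(sum_sq_dev, 4), round(s2, 4))
print(round(variance(sample), 4))  # library value for comparison
```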

  15. Sample Standard Deviation, s • The best estimate for σ is denoted: s • It is called the sample standard deviation • s is found by taking the square root of s²

  16. s² for Grouped Data • For the grade point example • 4 occurs 2 times • 3 occurs 3 times • 2 occurs 4 times • 1 occurs 1 time • To calculate the sample variance, s², rather than write the term down each time: • Multiply the squared deviations by their class frequencies

  17. Calculation of s² for Grouped Data
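
The grouped calculation is not preserved in the transcript. One way to sketch it, using the class frequencies listed on the previous slide, is shown below; the dictionary layout and variable names are illustrative assumptions.

```python
# value -> frequency, from the grade point example
freq = {4: 2, 3: 3, 2: 4, 1: 1}

n = sum(freq.values())                           # total observations
x_bar = sum(x * f for x, f in freq.items()) / n  # frequency-weighted mean

# Multiply each squared deviation by its class frequency, then divide by n - 1
s2_grouped = sum(f * (x - x_bar) ** 2 for x, f in freq.items()) / (n - 1)

print(n, x_bar, round(s2_grouped, 4))
```

This gives the same value as computing the sample variance from the ten individual observations.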

  18. Empirical Rule: Interpreting s (Mound-Shaped Distribution) • If the data form a mound-shaped distribution • Within ±1s of the mean • Approximately 68% of the measurements • Within ±2s of the mean • Approximately 95% of the measurements • Within ±3s of the mean • Approximately all of the measurements
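
To compare a data set against these percentages, one can count the observations inside each interval. The helper `fraction_within` below is a hypothetical name. With only ten observations, as in Data Set A, the agreement with 68% and 95% can only be rough.

```python
from statistics import mean, stdev

def fraction_within(data, k):
    """Fraction of observations within k sample standard deviations of the mean."""
    x_bar, s = mean(data), stdev(data)
    lower, upper = x_bar - k * s, x_bar + k * s
    return sum(lower <= x <= upper for x in data) / len(data)

sample = [4, 2, 3, 3, 2, 2, 1, 4, 3, 2]  # Data Set A
for k in (1, 2, 3):
    print(f"within ±{k}s: {fraction_within(sample, k):.0%}")
```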

  19. Chebychev’s Inequality: Interpreting s (Any Distribution) • If the data are not mound-shaped (or the shape is unknown) • Within ±2s of the mean • At least 75% of the measurements • Within ±3s of the mean • At least 88.9% of the measurements • Within ±ks of the mean (k > 1) • At least 1 − 1/k² of the measurements
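
The bound 1 − 1/k² is easy to tabulate and to check against data that are not mound-shaped. The sketch below uses Data Set C from slide 2; `chebyshev_bound` is a hypothetical helper name.

```python
from statistics import mean, stdev

def chebyshev_bound(k):
    """Minimum fraction of measurements within k standard deviations (k > 1)."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return 1 - 1 / k**2

set_c = [0, 0, 1, 1, 4, 4, 4, 4, 4, 4]  # Data Set C, not mound-shaped
x_bar, s = mean(set_c), stdev(set_c)

checks = []
for k in (1.5, 2, 3):
    observed = sum(abs(x - x_bar) <= k * s for x in set_c) / len(set_c)
    checks.append((k, chebyshev_bound(k), observed))
    print(f"k={k}: at least {chebyshev_bound(k):.1%}, observed {observed:.1%}")
```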

  20. Coefficient of Variation • Another measure of variability that is frequently used to compare different data sets (even if measured in different units) is the: Coefficient of Variation (CV) CV = (Standard Deviation/Mean) x 100%
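
A minimal sketch of the CV formula follows, applied to Data Set A from slide 2. Using the sample standard deviation here is an assumption, since the slide does not say whether s or σ is intended.

```python
from statistics import mean, stdev

def coefficient_of_variation(data):
    """CV = (standard deviation / mean) x 100%, using the sample standard deviation."""
    return stdev(data) / mean(data) * 100

sample = [4, 2, 3, 3, 2, 2, 1, 4, 3, 2]  # Data Set A
cv = coefficient_of_variation(sample)
print(f"CV = {cv:.1f}%")
```

Because the units cancel in the ratio, the result is a pure percentage, which is what allows comparison across data sets measured in different units.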

  21. Range Approximation for σ • If the data are relatively mound-shaped, a “good” approximation for σ is: σ ≈ (range)/4 • Sometimes, when one is more certain that the sample range captures the entire population of data, statisticians use σ ≈ (range)/6
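
For a sense of how rough these approximations are, the sketch below compares both with the sample standard deviation of Data Set A from slide 2. The data set is small and only loosely mound-shaped, so a close match should not be expected.

```python
from statistics import stdev

sample = [4, 2, 3, 3, 2, 2, 1, 4, 3, 2]  # Data Set A
data_range = max(sample) - min(sample)

approx_range_4 = data_range / 4  # (range)/4 approximation
approx_range_6 = data_range / 6  # (range)/6 approximation
s = stdev(sample)                # sample standard deviation for comparison

print(approx_range_4, approx_range_6, round(s, 3))
```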

  22. Using Excel • Suppose population data is in cells A2 to A2001 • Population variance (σ²) =VARP(A2:A2001) • Population standard dev. (σ) =STDEVP(A2:A2001) • Suppose sample data is in cells A2 to A11 • Sample variance (s²) =VAR(A2:A11) • Sample standard dev. (s) =STDEV(A2:A11) • Data Analysis
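
For readers working outside Excel, Python's standard `statistics` module offers functions that play the same four roles. The sketch below applies all of them to Data Set A purely to show the correspondence; in practice the population functions would be applied only to a complete population.

```python
from statistics import pvariance, pstdev, variance, stdev

data = [4, 2, 3, 3, 2, 2, 1, 4, 3, 2]  # Data Set A

results = {
    "VARP": pvariance(data),   # population variance, divides by N
    "STDEVP": pstdev(data),    # population standard deviation
    "VAR": variance(data),     # sample variance, divides by n - 1
    "STDEV": stdev(data),      # sample standard deviation
}
for excel_name, value in results.items():
    print(f"{excel_name}: {value:.4f}")
```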

  23. [Screenshot: Data Analysis dialog] • Specify where data values are stored • Check Labels • Check both: Summary Statistics, Confidence Level • Enter name of output worksheet

  24. [Screenshot: Data Analysis output] • Drag to make Column A wider • Output includes Sample Standard Deviation and Sample Variance

  25. Review • Measures of variability for Populations and Samples • Range • Variance • Standard Deviation • Interpretation of standard deviation • Empirical Rule for “mound-shaped” data • Chebychev’s Inequality for “other” data • Excel • Functions • Data Analysis
