
Why statisticians were created

Why statisticians were created. Measure of dispersion. FETP India. Competency to be gained from this lecture: calculate a measure of variation that is adapted to the sample studied. Key issues: range, inter-quartile range, standard deviation.

constantine

Presentation Transcript


  1. Why statisticians were created • Measure of dispersion • FETP India

  2. Competency to be gained from this lecture • Calculate a measure of variation that is adapted to the sample studied

  3. Key issues • Range • Inter-quartile range • Standard deviation

  4. Measures of spread, dispersion or variability • A measure of central tendency provides important information about the distribution • However, it does not say where the other data points in the sample lie relative to it • Measures of spread, dispersion or variability are therefore needed

  5. Why one needs to measure variability

  6. Every concept comes from a failure of the previous concept • The mean is distorted by outliers • The median takes care of the outliers

  7. The range: A simple measure of dispersion • Take the difference between the lowest value and the highest value • Limitations: • The range says nothing about the values between the extremes • The range is not stable: as the sample size increases, the range can change dramatically • Statistical methods cannot readily be applied to the range

  8. Example of a range • Take a sample of 10 heights: 70, 95, 100, 103, 105, 107, 110, 112, 115 and 140 cm • Lowest (minimum) value: 70 cm • Highest (maximum) value: 140 cm • Range: 140 - 70 = 70 cm
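
The range calculation above can be sketched in a few lines of Python. The function name `value_range` is an illustrative choice, not something from the lecture; the heights are the ten values from the slide.

```python
# Heights in cm, taken from the slide example.
heights_cm = [70, 95, 100, 103, 105, 107, 110, 112, 115, 140]


def value_range(values):
    """Return the difference between the highest and the lowest value."""
    return max(values) - min(values)


height_range = value_range(heights_cm)
print(f"Minimum: {min(heights_cm)} cm")
print(f"Maximum: {max(heights_cm)} cm")
print(f"Range:   {height_range} cm")  # 140 - 70 = 70
```

Only the two extreme values enter the result, which is why the range says nothing about the eight values in between.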

  9. Three different distributions with the same range (35 kg) • [Figure: three dot plots on a common axis marked from 30 to 70, labelled Even, Uneven and Clumped]

  10. The range increases with the sample size • Two ranges based on different sample sizes are not comparable
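
A small simulation can illustrate why ranges from different sample sizes are not comparable. This is a sketch under stated assumptions: the heights are hypothetical values drawn from a normal distribution (mean 105 cm, standard deviation 15 cm), not data from the lecture, and the seed is fixed only to make the run repeatable. Because each larger sample contains the smaller one, its range can stay the same or grow, but never shrink.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical heights in cm (an assumption for illustration, not lecture data).
sample = [rng.gauss(105, 15) for _ in range(1000)]


def value_range(values):
    """Return the difference between the highest and the lowest value."""
    return max(values) - min(values)


# Nested samples: the first 10, the first 100, then all 1000 observations.
sizes = [10, 100, 1000]
ranges_by_size = {n: value_range(sample[:n]) for n in sizes}

for n, r in ranges_by_size.items():
    print(f"n = {n:4d}  range = {r:.1f} cm")
```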

  11. Percentiles and quartiles • Percentiles • Those values in a series of observations, arranged in ascending order of magnitude, which divide the distribution into 100 equal parts • The median is the 50th percentile • Quartiles • The values which divide a series of observations, arranged in ascending order, into 4 equal parts • The median is the 2nd quartile

  12. Sorting the data in increasing order • Median • Middle value (if n is odd) • Average of the two middle values (if n is even) • A measure of the "centre" of the data • Quartiles divide the set of ordered values into 4 equal parts • [Figure: ordered values split by Q1, Q2 (median) and Q3 into the first, 2nd, 3rd and 4th 25%]
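
The odd and even cases of the median can be sketched as follows. The function `median_of` is a hypothetical helper written for illustration; the data are the ten heights from the range example, with the last value dropped to obtain an odd-sized sample. The standard library's `statistics.median` is used only as a cross-check.

```python
import statistics


def median_of(values):
    """Middle value if n is odd, average of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


heights_cm = [70, 95, 100, 103, 105, 107, 110, 112, 115, 140]

median_even = median_of(heights_cm)      # n = 10: average of 105 and 107
median_odd = median_of(heights_cm[:9])   # n = 9: the 5th ordered value

print(median_even, median_odd)
# Cross-check against the standard library.
print(statistics.median(heights_cm), statistics.median(heights_cm[:9]))
```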

  13. The inter-quartile range • The central portion of the distribution • Calculated as the difference between the third quartile and the first quartile • Includes about one-half of the observations • Leaves out one quarter of the observations at each end • Limitations: • Only takes into account two values • Not a mathematical concept upon which theories can be developed

  14. The inter-quartile range: Example • Values: 29, 31, 24, 29, 30, 25 • Arranged in increasing order: 24, 25, 29, 29, 30, 31 • Q1: position (n+1)/4 = 7/4 = 1.75, so Q1 = 24 + 0.75 x (25 - 24) = 24.75 • Q3: position (n+1) x 3/4 = 21/4 = 5.25, so Q3 = 30 + 0.25 x (31 - 30) = 30.25 • Inter-quartile range = Q3 - Q1 = 30.25 - 24.75 = 5.5
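
The (n + 1) position rule used in this example can be sketched in Python. The helper `quantile_n_plus_1` is a hypothetical name introduced for illustration, and linear interpolation between neighbouring ordered values is assumed. Other quartile conventions exist and give slightly different numbers. The standard library's `statistics.quantiles` (Python 3.8+), whose default "exclusive" method uses the same positions, serves as a cross-check.

```python
import statistics


def quantile_n_plus_1(values, p):
    """Value at 1-based position (n + 1) * p in the sorted data,
    interpolating linearly between the two neighbouring values."""
    ordered = sorted(values)
    n = len(ordered)
    position = (n + 1) * p
    lower = int(position)            # whole part of the position
    fraction = position - lower      # fractional part of the position
    if lower < 1:                    # position before the first value
        return ordered[0]
    if lower >= n:                   # position at or beyond the last value
        return ordered[-1]
    return ordered[lower - 1] + fraction * (ordered[lower] - ordered[lower - 1])


values = [29, 31, 24, 29, 30, 25]

q1 = quantile_n_plus_1(values, 0.25)   # position 1.75
q3 = quantile_n_plus_1(values, 0.75)   # position 5.25
iqr = q3 - q1

print(f"Q1 = {q1}, Q3 = {q3}, IQR = {iqr}")
print(statistics.quantiles(values, n=4))  # [Q1, median, Q3]
```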

  15. Graphic representation of the inter-quartile range

  16. The mean deviation from the mean • Calculate the mean of all values • Calculate the difference between each value and the mean • Calculate the average of these differences • Limitations: • Negative and positive deviations cancel each other out, so the average is 0 even when there is substantial variation

  17. The mean deviation from the mean: Example • Data: 10, 20, 30, 40, 50, 60, 70 • Mean = 280/7 = 40 • Deviations from the mean (10 - 40, 20 - 40, ...): -30, -20, -10, 0, 10, 20, 30 • Sum of deviations = 0
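
The cancellation can be checked directly on the seven values from the slide. The variable names are illustrative choices.

```python
data = [10, 20, 30, 40, 50, 60, 70]

mean = sum(data) / len(data)                  # 280 / 7 = 40
deviations = [x - mean for x in data]         # signed distance from the mean
mean_deviation = sum(deviations) / len(deviations)

print("Mean:", mean)
print("Deviations:", deviations)
print("Mean of signed deviations:", mean_deviation)  # 0, despite the spread
```

The signed deviations always sum to zero around the mean, so this quantity cannot distinguish a tightly clustered sample from a widely scattered one.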

  18. Absolute mean deviation from the mean • Calculate the mean of all values • Calculate the difference between each value and the mean and take the absolute value • Calculate the average of these absolute differences • Limitations: • The absolute value is awkward to work with mathematically

  19. Absolute mean deviation from the mean: Example • Data: 10, 20, 30, 40, 50, 60, 70 • Mean = 280/7 = 40 • Deviations from the mean (10 - 40, 20 - 40, ...): -30, -20, -10, 0, 10, 20, 30 • Absolute values: 30, 20, 10, 0, 10, 20, 30 • Absolute mean deviation from the mean = 120/7 = 17.1
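
The same data can be run through a short sketch of the absolute mean deviation. Variable names are illustrative.

```python
data = [10, 20, 30, 40, 50, 60, 70]

mean = sum(data) / len(data)                          # 40
absolute_deviations = [abs(x - mean) for x in data]   # sign removed
absolute_mean_deviation = sum(absolute_deviations) / len(data)

print("Absolute deviations:", absolute_deviations)
print("Sum:", sum(absolute_deviations))                          # 120
print(f"Absolute mean deviation: {absolute_mean_deviation:.1f}")  # 17.1
```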

  20. Calculating the variance (1/2) • Calculate the mean as a measure of central location (MEAN) • Calculate the difference between each observation and the mean (DEVIATION) • Square the differences (SQUARED DEVIATION) • Negative and positive deviations will not cancel each other out • Values further from the mean have a bigger impact

  21. Calculating the variance (2/2) • Sum up these squared deviations (SUM OF THE SQUARED DEVIATIONS) • Divide this SUM OF THE SQUARED DEVIATIONS by the total number of observations minus 1 (n - 1) to give the VARIANCE • Why divide by n - 1? • Adjustment for the fact that the sample mean is only an estimate of the true population mean • Dividing by n - 1 rather than n makes the variance slightly larger
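
Written in symbols, the steps above correspond to the usual sample variance formula. The notation is an assumption introduced here, since the slides give the steps in words only: x_i is the i-th observation, x-bar the sample mean, n the number of observations and s squared the variance.

```latex
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i ,
\qquad
s^{2} = \frac{1}{n-1}\sum_{i=1}^{n} \left(x_i - \bar{x}\right)^{2}
```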

  22. The standard deviation • Take the square root of the variance • Limitations: • Sensitive to outliers
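
The variance and standard deviation steps can be sketched on the seven values used in the mean deviation examples (10 to 70). Reusing that data set here is a choice made for illustration, since the slides do not work this case. The standard library's `statistics.stdev`, which also divides by n - 1, is used as a cross-check.

```python
import math
import statistics

data = [10, 20, 30, 40, 50, 60, 70]
n = len(data)

mean = sum(data) / n                                    # MEAN
squared_deviations = [(x - mean) ** 2 for x in data]    # SQUARED DEVIATION
sum_of_squares = sum(squared_deviations)                # SUM OF THE SQUARED DEVIATIONS
variance = sum_of_squares / (n - 1)                     # divide by n - 1
std_dev = math.sqrt(variance)                           # square root of the variance

print("Sum of squared deviations:", sum_of_squares)     # 2800
print(f"Variance: {variance:.1f}")                      # 2800 / 6, about 466.7
print(f"Standard deviation: {std_dev:.1f}")             # about 21.6
print(f"statistics.stdev: {statistics.stdev(data):.1f}")
```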

  23. Example • Mean = 45/5 = 9 x-rays • Mean deviation = 8/5 = 1.6 x-rays • Variance = 20/(5-1) = 20/4 = 5 (x-rays squared) • Standard deviation = √5 ≈ 2.2 x-rays

  24. Properties of the standard deviation • Unaffected if the same constant is added to (or subtracted from) every observation • If each value is multiplied (or divided) by a constant, the standard deviation is also multiplied (or divided) by the same constant
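
Both properties can be checked numerically. This is a sketch with arbitrary illustrative constants (add 5, multiply by 3) applied to the 10 to 70 data set from the earlier examples. For a negative multiplier, the standard deviation is multiplied by its absolute value.

```python
import math
import statistics

data = [10, 20, 30, 40, 50, 60, 70]

sd = statistics.stdev(data)
sd_shifted = statistics.stdev([x + 5 for x in data])   # same constant added
sd_scaled = statistics.stdev([x * 3 for x in data])    # every value multiplied by 3

print(f"Original:      {sd:.2f}")
print(f"After adding 5: {sd_shifted:.2f}")   # unchanged
print(f"After times 3:  {sd_scaled:.2f}")    # three times the original
```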

  25. Need for a measure of variation that is independent of the measurement unit • The standard deviation is expressed in the same unit as the mean: • e.g., 3 cm for height, 1.4 kg for weight • Sometimes, it is useful to express variability as a percentage of the mean • e.g., in the case of laboratory tests, the experimental variation is ± 5% of the mean

  26. The coefficient of variation • Calculate the standard deviation • Divide by the mean • The standard deviation becomes "unit free" • Coefficient of variation (%) = [SD / Mean] x 100 (a pure number)
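
The "unit free" property can be illustrated by computing the coefficient of variation of the same heights in centimetres and in metres. The function `coefficient_of_variation` is a hypothetical helper, and the heights are the ten values from the range example. The sketch assumes a non-zero mean, since the ratio is undefined otherwise.

```python
import statistics


def coefficient_of_variation(values):
    """Standard deviation expressed as a percentage of the mean."""
    return statistics.stdev(values) / statistics.mean(values) * 100


heights_cm = [70, 95, 100, 103, 105, 107, 110, 112, 115, 140]
heights_m = [h / 100 for h in heights_cm]   # same heights, different unit

cv_cm = coefficient_of_variation(heights_cm)
cv_m = coefficient_of_variation(heights_m)

print(f"CV in cm: {cv_cm:.2f}%")
print(f"CV in m:  {cv_m:.2f}%")   # same percentage: the unit cancels out
```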

  27. Uses of the coefficient of variation • Compare the variability in two variables studied which are measured in different units • Height (cm) and weight (kg) • Compare the variability in two groups with widely different mean values • Incomes of persons in different socio-economic groups

  28. A summary of measures of dispersion

  29. Choosing a measure of central tendency and a measure of dispersion

  30. Key messages • Report the range but be aware of its limitations • Report the inter-quartile range when you use the median • Report the standard deviation when you use a mean
