
Descriptive Statistics-IV (Measures of Variation)

QSCI 381 – Lecture 6 (Larson and Farber, Sect 2.4). Topics: deviation, variance and standard deviation.

holden

Presentation Transcript


  1. Descriptive Statistics-IV (Measures of Variation) QSCI 381 – Lecture 6 (Larson and Farber, Sect 2.4)

  2. Deviation, Variance and Standard Deviation-I • The deviation of a data entry $x_i$ in a population data set is the difference between $x_i$ and the population mean $\mu$, i.e. $x_i - \mu$. • The sum of the deviations over all entries is zero: $\sum_{i=1}^{N} (x_i - \mu) = 0$. • The population variance is the sum of the squared deviations over all entries, divided by the number of entries $N$: $\sigma^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2$. • $\sigma$ is the Greek letter sigma.

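  3. Deviation, Variance and Standard Deviation-II • The population standard deviation is the square root of the population variance, i.e. $\sigma = \sqrt{\sigma^2} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2}$. • Note: these quantities relate to the population and not a sample from the population. • Note: the standard deviation is sometimes loosely referred to as the standard error; strictly, the standard error is the standard deviation of an estimate such as the sample mean.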
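The population definitions on slides 2 and 3 can be checked numerically. The following is a minimal sketch; the data set and the function names are hypothetical and chosen only so that the arithmetic is easy to follow by hand.

```python
import math


def population_variance(values):
    """Sum of squared deviations from the population mean, divided by N."""
    n = len(values)
    mu = sum(values) / n
    return sum((x - mu) ** 2 for x in values) / n


def population_sd(values):
    """Square root of the population variance."""
    return math.sqrt(population_variance(values))


# Hypothetical population (not from the lecture): mean is 5.
population = [2, 4, 4, 4, 5, 5, 7, 9]
mu = sum(population) / len(population)
deviations = [x - mu for x in population]

print(sum(deviations))                  # the deviations sum to zero
print(population_variance(population))  # 32 / 8 = 4.0
print(population_sd(population))        # 2.0
```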

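  4. The Sample Variance and Standard Deviation • The sample variance and the sample standard deviation of a data set with $n$ entries are given by: $s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$ and $s = \sqrt{s^2}$, where $\bar{x}$ is the sample mean. • Note the division by $n - 1$ rather than $N$ or $n$.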

  5. Calculating Standard Deviations

  6. Example • Find the standard deviation of the following bowhead lengths (in m): (8.5, 8.4, 13.8, 9.3, 9.7) • Key question (before doing anything) – is this a sample or a population?
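The example can be worked with Python's standard `statistics` module, which offers both versions of the standard deviation. This sketch computes both so that the effect of the sample-or-population question is visible; the variable names are illustrative only.

```python
import statistics

# Bowhead lengths (in m) from the example slide.
lengths_m = [8.5, 8.4, 13.8, 9.3, 9.7]

mean_length = statistics.mean(lengths_m)
sample_sd = statistics.stdev(lengths_m)   # divides by n - 1
pop_sd = statistics.pstdev(lengths_m)     # divides by N

print(f"mean          = {mean_length:.2f} m")  # 9.94 m
print(f"sample SD     = {sample_sd:.2f} m")    # 2.23 m
print(f"population SD = {pop_sd:.2f} m")       # 1.99 m
```

If the five whales are a sample from a larger population, which seems the more likely reading, the sample value of about 2.23 m is the one to report.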

  7. Formulae in EXCEL • Calculating means: =AVERAGE(A1:A10) • Calculating standard deviations: =STDEV(A1:A10) (this calculates the sample and not the population standard deviation; STDEVP gives the population version)

  8. Standard Deviations-I [Figure: three example data sets with SD = 0, SD = 2.1 and SD = 5.3]

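  9. Standard Deviations-II (Symmetric Bell-shaped distributions) • For symmetric bell-shaped distributions, about 68% of the data lie within 1 standard deviation of the mean (34% on each side), about 95% within 2 standard deviations (a further 13.5% on each side), and about 99.7% within 3 standard deviations. • Chebychev’s Theorem: the proportion of the data lying within $k$ standard deviations ($k > 1$) of the mean is at least $1 - 1/k^2$, whatever the shape of the distribution. • $k = 2$: proportion $\geq 75\%$ • $k = 3$: proportion $\geq 88.9\%$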
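The Chebychev bound is a one-line calculation. A small sketch, with a hypothetical function name, reproduces the two values quoted on the slide:

```python
def chebychev_lower_bound(k):
    """Minimum proportion of data within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebychev's theorem requires k > 1")
    return 1 - 1 / k**2


for k in (2, 3):
    # k = 2 gives 0.75 and k = 3 gives 0.889 (to three decimals).
    print(k, round(chebychev_lower_bound(k), 3))
```

These are guaranteed minimums for any distribution, which is why they are much lower than the 95% and 99.7% figures for bell-shaped data.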

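  10. Standard Deviations-III (Grouped data) • The standard deviation of a frequency distribution is: $s = \sqrt{\frac{\sum_i f_i (x_i - \bar{x})^2}{n - 1}}$, where $f_i$ is the frequency of bin $i$, $n = \sum_i f_i$ and $\bar{x} = \frac{1}{n} \sum_i f_i x_i$. • Note: where the frequency distribution consists of bins that are ranges, $x_i$ should be the midpoint of bin $i$ (be careful of the first and last bins).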
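A minimal sketch of the grouped-data calculation, assuming the usual sample form that divides by n - 1, where n is the total frequency. The bin midpoints and frequencies below are hypothetical.

```python
import math


def grouped_sd(midpoints, frequencies):
    """Sample standard deviation of a frequency distribution."""
    n = sum(frequencies)
    mean = sum(x * f for x, f in zip(midpoints, frequencies)) / n
    sum_sq = sum(f * (x - mean) ** 2 for x, f in zip(midpoints, frequencies))
    return math.sqrt(sum_sq / (n - 1))


# Hypothetical distribution: equivalent to the raw data 1, 2, 2, 3.
midpoints = [1.0, 2.0, 3.0]
frequencies = [1, 2, 1]

print(grouped_sd(midpoints, frequencies))  # sqrt(2/3)
```

When the bins are ranges, the result is only an approximation to the standard deviation of the raw data, because every entry in a bin is treated as if it sat at the midpoint.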

  11. Standard Deviations-IV (The shortcut formula)
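The equation for this slide is not preserved in the transcript. A commonly used shortcut form of the sample standard deviation, offered here as an assumption about what the slide showed, follows from expanding the sum of squared deviations:

```latex
\sum_{i=1}^{n} (x_i - \bar{x})^2
  = \sum_{i=1}^{n} x_i^2 - 2\bar{x} \sum_{i=1}^{n} x_i + n\bar{x}^2
  = \sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n},
\qquad
s = \sqrt{\frac{\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2 / n}{n - 1}}
```

This form needs only the running totals of $x_i$ and $x_i^2$, so the mean does not have to be computed first.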

  12. The Coefficient of Variation • The coefficient of variation is the standard deviation divided by the mean, often expressed as a percentage: $CV = \frac{s}{\bar{x}} \times 100\%$. • The coefficient of variation is dimensionless and can be used to compare among data sets based on different units.
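The unit-independence of the coefficient of variation can be illustrated with the bowhead lengths from slide 6, expressed in metres and in centimetres. This sketch assumes the sample standard deviation is the one intended; the function name is hypothetical.

```python
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean, as a percentage."""
    return 100 * statistics.stdev(values) / statistics.mean(values)


lengths_m = [8.5, 8.4, 13.8, 9.3, 9.7]      # bowhead lengths in m
lengths_cm = [100 * x for x in lengths_m]   # the same lengths in cm

cv_m = coefficient_of_variation(lengths_m)
cv_cm = coefficient_of_variation(lengths_cm)

print(f"CV (m)  = {cv_m:.1f}%")   # 22.4%
print(f"CV (cm) = {cv_cm:.1f}%")  # 22.4%
```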

  13. Z-Scores • The standard (or Z) score is calculated using the equation: $z = \frac{x - \mu}{\sigma}$
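A z-score expresses how many standard deviations a value lies above or below the mean. The sketch below uses hypothetical numbers (mean 10, standard deviation 2) rather than values from the lecture.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - mean) / sd


print(z_score(14.0, 10.0, 2.0))  # 2.0: two SDs above the mean
print(z_score(7.0, 10.0, 2.0))   # -1.5: one and a half SDs below the mean
```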

  14. Outliers-I • Outliers can lead to misinterpretation of results. They can arise because of data errors (e.g. typing measurements in cm rather than in m) or because of unusual events. • There are several rules for identifying outliers; one set is: • Outliers: values $< Q_2 - 6(Q_2 - Q_1)$ or $> Q_2 + 6(Q_3 - Q_2)$ • Strays: values $< Q_2 - 3(Q_2 - Q_1)$ or $> Q_2 + 3(Q_3 - Q_2)$, where $Q_1$, $Q_2$ and $Q_3$ are the first, second (median) and third quartiles.

  15. Outliers-II • Strays and outliers should be indicated on box and whisker plots. • Consider the data set of bowhead lengths, except that a length of 1 is added. [Figure: box and whisker plot; horizontal axis: Length (m), with ticks at 5, 10 and 15]
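The stray and outlier rules from slide 14 can be applied to the bowhead lengths with the extra length of 1 added. This is a sketch under one assumption that matters: the quartiles come from `statistics.quantiles` with its default method, and other quartile conventions can give different cutoffs. The `classify` helper is hypothetical.

```python
import statistics


def classify(x, q1, q2, q3):
    """Label a value using the quartile-based rules from slide 14."""
    if x < q2 - 6 * (q2 - q1) or x > q2 + 6 * (q3 - q2):
        return "outlier"
    if x < q2 - 3 * (q2 - q1) or x > q2 + 3 * (q3 - q2):
        return "stray"
    return "regular"


# Bowhead lengths (m) from slide 6, plus the added length of 1.
lengths_m = [8.5, 8.4, 13.8, 9.3, 9.7, 1.0]

# Assumption: Python's default ("exclusive") quartile method.
q1, q2, q3 = statistics.quantiles(lengths_m, n=4)

labels = {x: classify(x, q1, q2, q3) for x in lengths_m}
for x, label in labels.items():
    print(x, label)
```

With these quartiles the added length of 1 falls below the stray cutoff but not the outlier cutoff, and the five original lengths are unflagged.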

  16. Review of Symbols in this Lecture

  17. Summary • We use descriptive statistics to “get a feel for the data” (also called “exploratory data analysis”). In general, we are using statistics from the sample to learn something about the population.
