Describing Distributions with Numbers: Center
The Mean (average): add up the observations and divide by the number of observations.
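Median: The median is the midpoint of a distribution. Half of the numbers are larger and half of the numbers are smaller. With the n observations arranged in order, the median is at position (n + 1)/2; when n is even, that position falls between two values and the median is their average.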
Calculate the mean and median for each data set.
2, 2, 2, 2, 2, 2, 2, 2, 2
1, 2, 3, 4, 5, 6, 7, 8, 9, 10
1, 2, 3, 4, 4, 4, 4, 4, 4, 5, 6
1, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5, 5
Calculate Q1 and Q3 for each data set.
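As a check on the hand calculations, here is a minimal Python sketch of the mean and of the median found at position (n + 1)/2 in the sorted list. The function names and the `data_sets` list are illustrative choices, not something taken from the slides.

```python
def mean(values):
    """Sum of the observations divided by how many there are."""
    return sum(values) / len(values)


def median(values):
    """Middle value of the sorted data, at 1-based position (n + 1)/2."""
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2  # 0-based index of the middle (odd n) or upper-middle (even n) value
    if n % 2 == 1:
        return ordered[middle]
    # Even n: position (n + 1)/2 falls between two values, so average them.
    return (ordered[middle - 1] + ordered[middle]) / 2


# The four data sets from the slide.
data_sets = [
    [2, 2, 2, 2, 2, 2, 2, 2, 2],
    [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    [1, 2, 3, 4, 4, 4, 4, 4, 4, 5, 6],
    [1, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5, 5],
]

for data in data_sets:
    print(f"n={len(data):2d}  mean={mean(data):.3f}  median={median(data)}")
```

For the second data set, n = 10, so the median position is 5.5 and the median is the average of the 5th and 6th values.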
Mean vs Median: In an exactly symmetric distribution, the mean and the median are the same. In a skewed distribution, the mean is pulled farther out than the median in the direction of the skewness (toward the long tail). The median is resistant to extreme values, whereas the mean is affected by them.
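A small illustration of resistance, using Python's standard `statistics` module. The two lists are made-up numbers chosen only to show the effect: they differ in a single value.

```python
import statistics

balanced = [1, 2, 3, 4, 5]
with_extreme = [1, 2, 3, 4, 50]  # same data, but the largest value is now extreme

for label, data in [("balanced", balanced), ("with extreme", with_extreme)]:
    print(label, "mean =", statistics.mean(data), "median =", statistics.median(data))
```

Changing one value moves the mean from 3 to 12, while the median stays at 3.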
Examples of Distributions:
Variable                    Mean     Median
Serum Cholesterol Levels    230.00   230.00

Variable    Mean     Median
C3          7.249    6.558

Variable     N    N*   Mean     Median
Sleep (H)    53   0    6.415    6.500

Variable    N     N*   Mean      Median
Binomial    100   0    17.350    18.000
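The Quartiles
• Read like percentiles: Q1 is the 25th percentile, the median is the 50th, and Q3 is the 75th
• Range of all numbers: maximum minus minimum
• The interquartile range, IQR = Q3 - Q1, shows whether the middle half of the data is spread out (bigger range of values) or bunched up (smaller range of values)
• Outliers: values more than 1.5 times the IQR below Q1 or above Q3 are suspected outliers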
Calculate the IQR for each data set.
2, 2, 2, 2, 2, 2, 2, 2, 2
1, 2, 3, 4, 5, 6, 7, 8, 9, 10
1, 2, 3, 4, 4, 4, 4, 4, 4, 5, 6
1, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5, 5
Where would suspected outliers fall?
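The sketch below works through the quartiles, the IQR, and the 1.5 × IQR rule for these data sets. It assumes the common textbook convention that Q1 and Q3 are the medians of the lower and upper halves of the sorted data, with the overall median left out of both halves when n is odd. Calculators and software may use other quartile rules and give slightly different values. All function names are illustrative.

```python
def median_of_sorted(ordered):
    """Median of a list that is already sorted."""
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2


def quartiles(values):
    """Q1 and Q3 as medians of the lower and upper halves (assumed convention)."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # For odd n, skip the middle value so it belongs to neither half.
    upper = ordered[half + 1:] if n % 2 == 1 else ordered[half:]
    return median_of_sorted(lower), median_of_sorted(upper)


def outlier_fences(values):
    """Cutoffs 1.5 * IQR below Q1 and above Q3."""
    q1, q3 = quartiles(values)
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def suspected_outliers(values):
    """Values falling outside the two fences."""
    low, high = outlier_fences(values)
    return [x for x in values if x < low or x > high]


data_sets = [
    [2, 2, 2, 2, 2, 2, 2, 2, 2],
    [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    [1, 2, 3, 4, 4, 4, 4, 4, 4, 5, 6],
    [1, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5, 5],
]

for data in data_sets:
    q1, q3 = quartiles(data)
    print(f"Q1={q1}  Q3={q3}  IQR={q3 - q1}  fences={outlier_fences(data)}  "
          f"suspected={suspected_outliers(data)}")
```

Under this convention the third data set has Q1 = 3 and Q3 = 4, so the fences sit at 1.5 and 5.5, and the values 1 and 6 are flagged as suspected outliers.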
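Variance and Standard Deviation: The variance is the average of the squared deviations from the mean, where the sum of squared deviations is divided by n - 1 rather than n. The standard deviation is the square root of the variance. Both are measures of spread.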
Calculate by hand the standard deviation for each data set.
2, 2, 2, 2, 2, 2, 2, 2, 2
1, 2, 3, 4, 5, 6, 7
1, 2, 3, 4, 4, 4, 4, 4, 4, 5, 6, 7
1, 2, 2, 3, 3, 3, 4, 4, 4, 4
Calculate descriptive numbers with the calculator.
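To check the hand calculations, here is a minimal Python sketch that follows the same steps: find the mean, square each deviation from it, add the squares, divide by n - 1, and take the square root. The function names are illustrative.

```python
import math


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    x_bar = sum(values) / n
    squared_deviations = [(x - x_bar) ** 2 for x in values]
    return sum(squared_deviations) / (n - 1)


def sample_std_dev(values):
    """Square root of the variance, in the same units as the data."""
    return math.sqrt(sample_variance(values))


# The four data sets from the slide.
data_sets = [
    [2, 2, 2, 2, 2, 2, 2, 2, 2],
    [1, 2, 3, 4, 5, 6, 7],
    [1, 2, 3, 4, 4, 4, 4, 4, 4, 5, 6, 7],
    [1, 2, 2, 3, 3, 3, 4, 4, 4, 4],
]

for data in data_sets:
    print(f"variance={sample_variance(data):.4f}  std dev={sample_std_dev(data):.4f}")
```

For 1, 2, 3, 4, 5, 6, 7 the mean is 4 and the squared deviations are 9, 4, 1, 0, 1, 4, 9, which sum to 28. The variance is 28/6 and the standard deviation is its square root, about 2.16.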
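Standard Deviation
• Uses the mean to measure the spread
• Why square the deviations? The deviations from the mean always sum to zero if they are not squared; there are also less obvious reasons.
• For symmetrical (almost normal) distributions, the standard deviation has some interesting properties. If the data are not symmetrical, we should not use the standard deviation (it is not resistant to extreme values).
• The variance does not have the proper units: it is in squared units of the data, while the standard deviation is in the original units
• (n - 1): degrees of freedom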
Why have both: five-number summaries and mean with standard deviation?
• Each has its place
• Know the proper place by looking at the shape of the data. ALWAYS GRAPH THE DATA!
• Always use Shape, Center, and Spread when describing a distribution!
• HW: Page 100 problem 1.53