DESCRIBING DISTRIBUTION NUMERICALLY
MEASURES OF CENTER:
• MIDRANGE = (MAX + MIN) / 2
• MEDIAN IS THE MIDDLE VALUE WITH HALF OF THE DATA ABOVE AND HALF BELOW IT.
• MEAN = (SUM OF DATA) / (NUMBER OF OBSERVATIONS n)
EXAMPLE:
DATA: 45, 46, 49, 35, 76, 80, 89, 94, 37, 61, 62, 64, 68, 56, 57, 57, 59, 71, 72.
SORTED DATA: 35, 37, 45, 46, 49, 56, 57, 57, 59, 61, 62, 64, 68, 71, 72, 76, 80, 89, 94.
MIDRANGE = (94 + 35) / 2 = 64.5
MEDIAN = 61 (THE 10TH OF THE 19 SORTED VALUES)
MEAN = (35 + 37 + … + 94) / 19 = 1178 / 19 = 62
NOTE: FOR SKEWED DISTRIBUTIONS THE MEDIAN IS A BETTER MEASURE OF THE CENTER THAN THE MEAN.
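The three measures of center can be checked with a few lines of Python. This is only a sketch using the standard `statistics` module; the variable names are illustrative and not part of the slides.

```python
from statistics import mean, median

# The 19 observations from the example above.
data = [45, 46, 49, 35, 76, 80, 89, 94, 37, 61,
        62, 64, 68, 56, 57, 57, 59, 71, 72]

midrange = (max(data) + min(data)) / 2   # (MAX + MIN) / 2
center_median = median(data)             # middle value of the sorted data
center_mean = mean(data)                 # (sum of data) / n

print("midrange:", midrange)
print("median:", center_median)
print("mean:", center_mean)
```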
MEASURES OF THE SPREAD
• RANGE = MAX – MIN
• INTERQUARTILE RANGE (IQR) = Q3 – Q1
Q3 = UPPER QUARTILE = MEDIAN OF UPPER HALF OF DATA (INCLUDE MEDIAN IF n IS ODD)
Q1 = LOWER QUARTILE = MEDIAN OF LOWER HALF OF DATA (INCLUDE MEDIAN IF n IS ODD)
• VARIANCE (later)
• STANDARD DEVIATION (later)
Quartiles EXAMPLE: (odd number of observations, 19)
Median = 61
UPPER HALF: 35 37 45 46 49 56 57 57 59 [61 62 64 68 71 72 76 80 89 94]
Q3 = (71 + 72) / 2 = 71.5
LOWER HALF: [35 37 45 46 49 56 57 57 59 61] 62 64 68 71 72 76 80 89 94
Q1 = (49 + 56) / 2 = 52.5
IQR = 71.5 – 52.5 = 19
Note: Include the median in the calculation of both quartiles.
Quartiles EXAMPLE: (even number of observations, 18; the value 94 is dropped)
35 37 45 46 49 56 57 57 59 | 61 62 64 68 71 72 76 80 89
Median = (59 + 61) / 2 = 60 (Average of the middle two numbers)
UPPER HALF: 35 37 45 46 49 56 57 57 59 [61 62 64 68 71 72 76 80 89]
Q3 = 71
LOWER HALF: [35 37 45 46 49 56 57 57 59] 61 62 64 68 71 72 76 80 89
Q1 = 49
IQR = 71 – 49 = 22
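The quartile rule used in these two examples (split the sorted data into halves, and put the median in both halves when n is odd) can be sketched in Python as follows. The function name `quartiles` is hypothetical. Note that library routines such as `statistics.quantiles` use other conventions and may return slightly different values.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q3) as medians of the lower and upper halves.

    When n is odd, the overall median is included in both halves,
    following the rule stated on the slides.
    """
    s = sorted(values)
    n = len(s)
    half = (n + 1) // 2          # size of each half; odd n shares the median
    lower = s[:half]
    upper = s[n - half:]
    return median(lower), median(upper)

odd_data = [35, 37, 45, 46, 49, 56, 57, 57, 59, 61,
            62, 64, 68, 71, 72, 76, 80, 89, 94]      # n = 19
even_data = odd_data[:-1]                            # n = 18, without 94

for label, d in (("odd", odd_data), ("even", even_data)):
    q1, q3 = quartiles(d)
    print(label, "Q1 =", q1, "Q3 =", q3, "IQR =", q3 - q1)
```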
5-NUMBER SUMMARY:
• THE 5-NUMBER SUMMARY OF A DISTRIBUTION REPORTS ITS MEDIAN, QUARTILES, AND EXTREMES (MINIMUM AND MAXIMUM)
• MAX = 94
• Q3 = 71.5
• MEDIAN = 61
• Q1 = 52.5
• MIN = 35
OUTLIERS: DATA VALUES WHICH ARE BEYOND THE FENCES
IQR = Q3 – Q1 = 19
UPPER FENCE = Q3 + 1.5 × IQR = 71.5 + 1.5 × 19 = 100
LOWER FENCE = Q1 – 1.5 × IQR = 52.5 – 1.5 × 19 = 24
IN THE EXAMPLE CONSIDERED ABOVE, THERE ARE NO OUTLIERS.
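A minimal sketch of the 5-number summary and the fence rule for outliers, assuming the same "median of each half" quartile rule as in the quartile examples. The helper names `five_number_summary` and `outliers` are illustrative only.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max); halves include the median if n is odd."""
    s = sorted(values)
    n = len(s)
    half = (n + 1) // 2
    return s[0], median(s[:half]), median(s), median(s[n - half:]), s[-1]

def outliers(values):
    """Return the values lying beyond the fences Q1 - 1.5*IQR and Q3 + 1.5*IQR."""
    _, q1, _, q3, _ = five_number_summary(values)
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    return [x for x in values if x < lower_fence or x > upper_fence]

data = [45, 46, 49, 35, 76, 80, 89, 94, 37, 61,
        62, 64, 68, 56, 57, 57, 59, 71, 72]

print(five_number_summary(data))
print(outliers(data))
```

An empty list from `outliers` means that no value falls outside the fences.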
BOXPLOTS
WHENEVER WE HAVE A 5-NUMBER SUMMARY OF A (QUANTITATIVE) VARIABLE, WE CAN DISPLAY THE INFORMATION IN A BOXPLOT.
• THE CENTER OF A BOXPLOT IS A BOX THAT SHOWS THE MIDDLE HALF OF THE DATA, BETWEEN THE QUARTILES.
• THE HEIGHT OF THE BOX IS EQUAL TO THE IQR.
• IF THE MEDIAN IS ROUGHLY CENTERED BETWEEN THE QUARTILES, THEN THE MIDDLE HALF OF THE DATA IS ROUGHLY SYMMETRIC. IF IT IS NOT CENTERED, THE DISTRIBUTION IS SKEWED.
• THE MAIN USE FOR BOXPLOTS IS TO COMPARE GROUPS.
Examples:
1. Here are the costs of 10 electric smoothtop ranges rated very good or excellent by Consumer Reports in August 2002.
850 900 1400 1200 1050
1000 750 1250 1050 565
Find the following statistics by hand:
• a) mean
• b) median and quartiles
• c) range and IQR
VARIANCE = “AVERAGE” SQUARED DEVIATION FROM THE MEAN
• DEVIATION = (each data value) – mean
• FOR THE 19 OBSERVATIONS (MEAN = 62): VARIANCE = 4658 / (19 – 1) = 258.8
• STANDARD DEVIATION = SQUARE ROOT (VARIANCE) = 16.1
VARIANCE = “AVERAGE” SQUARE DEVIATION FROM THE MEAN
• Step 1: Sort Data: 565 750 850 900 1000 1050 1050 1200 1250 1400
Mean = 1001.5
Median = 1025
Q1 = 850
Q3 = 1200
Range = 835
IQR = 350
VARIANCE = “AVERAGE” SQUARED DEVIATION FROM THE MEAN
Computing the Variance
• DEVIATION = (each data value) – mean
• Squared Deviation = ((each data value) – mean)^2
• Sum all squared deviations
• Variance = (sum of all squared deviations) / (n – 1), where n is the number of observations
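The steps above can be followed one by one in code. This is a sketch only; `sample_variance` is a hypothetical helper name, and the data are the 19 observations from the earlier example.

```python
from math import sqrt

def sample_variance(values):
    """Variance with the n - 1 divisor, computed step by step."""
    n = len(values)
    mean = sum(values) / n
    deviations = [x - mean for x in values]      # (each data value) - mean
    squared = [d ** 2 for d in deviations]       # squared deviations
    return sum(squared) / (n - 1)                # sum, then divide by n - 1

data = [45, 46, 49, 35, 76, 80, 89, 94, 37, 61,
        62, 64, 68, 56, 57, 57, 59, 71, 72]

var = sample_variance(data)
std = sqrt(var)                                  # standard deviation
print(round(var, 1), round(std, 1))
```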
Variance Example:
Data    Squared Deviations
35      54.76
37      29.16
45      6.76
46      12.96
49      43.56
Mean = 42.4
• Sum of squared deviations = 147.2
• Variance = 147.2 / 4 = 36.8
• Std Deviation = square root of variance
• Std dev = 6.07
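As a cross-check of this small example, the hand computation can be compared with the standard library. `statistics.variance` and `statistics.stdev` both use the n – 1 divisor; the variable names here are illustrative.

```python
from statistics import mean, stdev, variance

sample = [35, 37, 45, 46, 49]

m = mean(sample)
squared_deviations = [(x - m) ** 2 for x in sample]
var_by_hand = sum(squared_deviations) / (len(sample) - 1)

print("mean:", m)
print("squared deviations:", [round(d, 2) for d in squared_deviations])
print("variance by hand:", round(var_by_hand, 2))
print("variance (library):", round(variance(sample), 2))
print("std dev (library):", round(stdev(sample), 2))
```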
Some Remarks
• If the shape is skewed, report the median and IQR. In that case the mean and median can be very different.
• You may want to include the mean and std deviation, but you should point out why the mean and the median differ.
• If the histogram is symmetric, report the mean and the std deviation and possibly the median and IQR.