
DESCRIBING DISTRIBUTION NUMERICALLY

Presentation Transcript


  1. DESCRIBING DISTRIBUTION NUMERICALLY
     MEASURES OF CENTER:
     • MIDRANGE = (MAX + MIN) / 2
     • MEDIAN IS THE MIDDLE VALUE, WITH HALF OF THE DATA ABOVE AND HALF BELOW IT.
     • MEAN = (SUM OF DATA) / (NUMBER OF OBSERVATIONS n)
     EXAMPLE:
     DATA: 45, 46, 49, 35, 76, 80, 89, 94, 37, 61, 62, 64, 68, 56, 57, 57, 59, 71, 72.
     SORTED DATA: 35, 37, 45, 46, 49, 56, 57, 57, 59, 61, 62, 64, 68, 71, 72, 76, 80, 89, 94.
     MIDRANGE = (94 + 35) / 2 = 64.5
     MEDIAN = 61
     MEAN = (35 + 37 + … + 94) / 19 = 1178 / 19 = 62
     NOTE: FOR SKEWED DISTRIBUTIONS THE MEDIAN IS A BETTER MEASURE OF THE CENTER THAN THE MEAN.
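
The three measures of center can be checked with a short Python sketch. The helper name `midrange` is an illustrative assumption, not part of the slides; `median` and `mean` come from Python's standard `statistics` module.

```python
from statistics import mean, median

# The 19 observations from the example slide.
data = [45, 46, 49, 35, 76, 80, 89, 94, 37, 61,
        62, 64, 68, 56, 57, 57, 59, 71, 72]


def midrange(values):
    """Midrange = (max + min) / 2 (hypothetical helper name)."""
    return (max(values) + min(values)) / 2


print(midrange(data))  # 64.5
print(median(data))    # 61
print(mean(data))      # 62
```

The midrange depends only on the two extremes, so a single unusual value moves it the most of the three.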

  2. MEASURES OF THE SPREAD
     • RANGE = MAX – MIN
     • INTERQUARTILE RANGE (IQR) = Q3 – Q1
       Q3 = UPPER QUARTILE = MEDIAN OF UPPER HALF OF DATA (INCLUDE MEDIAN IF n IS ODD)
       Q1 = LOWER QUARTILE = MEDIAN OF LOWER HALF OF DATA (INCLUDE MEDIAN IF n IS ODD)
     • VARIANCE (later)
     • STANDARD DEVIATION (later)

  3. Quartiles
     EXAMPLE: (odd number of observations, 19)
     Median = 61
     UPPER HALF: 35 37 45 46 49 56 57 57 59 [61 62 64 68 71 72 76 80 89 94]
     Q3 = (71 + 72) / 2 = 71.5
     LOWER HALF: [35 37 45 46 49 56 57 57 59 61] 62 64 68 71 72 76 80 89 94
     Q1 = (49 + 56) / 2 = 52.5
     IQR = 71.5 – 52.5 = 19
     Note: When n is odd, include the median in the calculation of both quartiles.

  4. Quartiles
     EXAMPLE: (even number of observations, 18)
     35 37 45 46 49 56 57 57 59 [60] 61 62 64 68 71 72 76 80 89
     Median = 60 = (59 + 61) / 2 (average of the middle two numbers)
     UPPER HALF: 35 37 45 46 49 56 57 57 59 [60] [61 62 64 68 71 72 76 80 89]
     Q3 = 71
     LOWER HALF: [35 37 45 46 49 56 57 57 59] [60] 61 62 64 68 71 72 76 80 89
     Q1 = 49
     IQR = 71 – 49 = 22
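
The quartile rule used in the two examples (median of each half, with the overall median included in both halves when n is odd) can be sketched in Python as follows. The function name `quartiles` is an assumption made for illustration. Statistical software often uses other quartile conventions, so its results may differ slightly from these hand calculations.

```python
from statistics import median


def quartiles(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    When n is odd, the overall median belongs to both halves,
    following the rule stated on the slides.
    """
    s = sorted(values)
    n = len(s)
    half = (n + 1) // 2      # size of each half; counts the median if n is odd
    lower = s[:half]
    upper = s[n - half:]
    return median(lower), median(upper)


data19 = [35, 37, 45, 46, 49, 56, 57, 57, 59, 61,
          62, 64, 68, 71, 72, 76, 80, 89, 94]
data18 = data19[:-1]         # the even-n example drops the largest value, 94

q1, q3 = quartiles(data19)
print(q1, q3, q3 - q1)       # 52.5 71.5 19.0

q1, q3 = quartiles(data18)
print(q1, q3, q3 - q1)       # 49 71 22
```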

  5. 5-NUMBER SUMMARY:
     • THE 5-NUMBER SUMMARY OF A DISTRIBUTION REPORTS ITS MEDIAN, QUARTILES, AND EXTREMES (MINIMUM AND MAXIMUM)
     • MAX = 94
     • Q3 = 71.5
     • MEDIAN = 61
     • Q1 = 52.5
     • MIN = 35
     OUTLIERS: DATA VALUES THAT LIE BEYOND THE FENCES
     IQR = Q3 – Q1 = 19
     UPPER FENCE = Q3 + 1.5 × IQR = 71.5 + 1.5 × 19 = 100
     LOWER FENCE = Q1 – 1.5 × IQR = 52.5 – 1.5 × 19 = 24
     IN THE EXAMPLE CONSIDERED ABOVE, THERE ARE NO OUTLIERS.
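
As a rough check of the summary and the fences, here is a minimal Python sketch. The helper names (`quartiles`, `five_number_summary`, `fences`) are assumptions for illustration, and the quartiles follow the slides' rule of including the median in both halves when n is odd.

```python
from statistics import median

data = [35, 37, 45, 46, 49, 56, 57, 57, 59, 61,
        62, 64, 68, 71, 72, 76, 80, 89, 94]


def quartiles(values):
    """(Q1, Q3) as medians of the halves; median included when n is odd."""
    s = sorted(values)
    half = (len(s) + 1) // 2
    return median(s[:half]), median(s[len(s) - half:])


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max)."""
    s = sorted(values)
    q1, q3 = quartiles(s)
    return s[0], q1, median(s), q3, s[-1]


def fences(values, k=1.5):
    """Return (lower fence, upper fence) = (Q1 - k*IQR, Q3 + k*IQR)."""
    q1, q3 = quartiles(values)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


lower, upper = fences(data)
# Outliers are the values lying beyond either fence.
outliers = [x for x in data if x < lower or x > upper]

print(five_number_summary(data))  # (35, 52.5, 61, 71.5, 94)
print(lower, upper)               # 24.0 100.0
print(outliers)                   # []
```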

  6. BOXPLOTS
     WHENEVER WE HAVE A 5-NUMBER SUMMARY OF A (QUANTITATIVE) VARIABLE, WE CAN DISPLAY THE INFORMATION IN A BOXPLOT.
     • THE CENTER OF A BOXPLOT IS A BOX THAT SHOWS THE MIDDLE HALF OF THE DATA, BETWEEN THE QUARTILES.
     • THE HEIGHT OF THE BOX IS EQUAL TO THE IQR.
     • IF THE MEDIAN IS ROUGHLY CENTERED BETWEEN THE QUARTILES, THEN THE MIDDLE HALF OF THE DATA IS ROUGHLY SYMMETRIC. IF IT IS NOT CENTERED, THE DISTRIBUTION IS SKEWED.
     • THE MAIN USE FOR BOXPLOTS IS TO COMPARE GROUPS.

  7. BOXPLOTS

  8. Examples:
     • 1. Here are the costs of 10 electric smoothtop ranges rated very good or excellent by Consumer Reports in August 2002.
       850 900 1400 1200 1050
       1000 750 1250 1050 565
     • Find the following statistics by hand:
       a) mean
       b) median and quartiles
       c) range and IQR

  9. VARIANCE = “AVERAGE” SQUARED DEVIATION FROM THE MEAN
     • DEVIATION = (each data value) – mean
     • VARIANCE = 4658 / (19 – 1) = 258.8
     • STANDARD DEVIATION = SQUARE ROOT (VARIANCE) = 16.1

  10. VARIANCE = “AVERAGE” SQUARE DEVIATION FROM THE MEAN
     • Step 1: Sort Data: 565 750 850 900 1000 1050 1050 1200 1250 1400
     • Mean = 1001.5
     • Median = 1025
     • Q1 = 850
     • Q3 = 1200
     • Range = 835
     • IQR = 350

  11. VARIANCE = “AVERAGE” SQUARED DEVIATION FROM THE MEAN
     Computing the Variance
     • DEVIATION = (each data value) – mean
     • Squared Deviation = ((each data value) – mean)^2
     • Sum all squared deviations
     • Variance = (sum of all squared deviations) / (n – 1), where n is the number of observations

  12. Variance Example:
     Data    Squared Deviations
     35      54.76
     37      29.16
     45      6.76
     46      12.96
     49      43.56
     • Mean = 42.4
     • Variance = 147.2 / 4 = 36.8
     • Std Deviation = square root of variance
     • Std dev = 6.07
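
The variance steps (deviation, squared deviation, sum, divide by n – 1) can be sketched in Python as below. The function name `sample_variance` is an illustrative assumption. Results are rounded for display because floating-point arithmetic may leave tiny errors in the last digits.

```python
from math import sqrt


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = [(x - mean) ** 2 for x in values]
    return sum(squared_deviations) / (n - 1)


# Five-value example: mean 42.4, sum of squared deviations 147.2.
small = [35, 37, 45, 46, 49]
var_small = sample_variance(small)
print(round(var_small, 1))        # 36.8
print(round(sqrt(var_small), 2))  # 6.07

# The 19 observations used for the measures of center (mean 62).
data = [35, 37, 45, 46, 49, 56, 57, 57, 59, 61,
        62, 64, 68, 71, 72, 76, 80, 89, 94]
var_data = sample_variance(data)
print(round(var_data, 1))         # 258.8
print(round(sqrt(var_data), 1))   # 16.1
```

Dividing by n – 1 rather than n is why the slides put “average” in quotation marks.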

  13. Some Remarks
     • If the shape is skewed, report the median and IQR.
     • For a skewed shape, the mean and median can be very different.
     • You may want to include the mean and std deviation, but you should point out why the mean and the median differ.
     • If the histogram is symmetric, report the mean and the std deviation and possibly the median and IQR.
