
Chapter 3, Part 1 Descriptive Statistics II: Numerical Methods


aletha

Presentation Transcript


  1. Chapter 3, Part 1 Descriptive Statistics II: Numerical Methods • Measures of Location • Measures of Variability

  2. Measures of Location • Mean • Median • Mode • Percentiles

  3. Example: Apartment Rents Given below are the monthly rent values ($) for a sample of 70 one-bedroom apartments in a particular city. The data are presented in ascending order. 425 430 430 435 435 435 435 435 440 440 440 440 440 445 445 445 445 445 450 450 450 450 450 450 450 460 460 460 465 465 465 470 470 472 475 475 475 480 480 480 480 485 490 490 490 500 500 500 500 510 510 515 525 525 525 535 549 550 570 570 575 575 580 590 600 600 600 600 615 615

  4. Mean • The mean of a data set is the average of all the data values. • If the data are from a sample, the mean is denoted by \(\bar{x}\) (x-bar): \(\bar{x} = \frac{\sum x_i}{n}\) • If the data are from a population, the mean is denoted by \(\mu\) (mu): \(\mu = \frac{\sum x_i}{N}\)

  5. Example: Apartment Rents • Mean: \(\bar{x} = \frac{\sum x_i}{n} = \frac{34{,}356}{70} = 490.80\)
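
The arithmetic on this slide can be checked with a short script. This is a minimal sketch in Python; the names `RENT_COUNTS`, `rents`, and `sample_mean` are illustrative and not from the slides, and the sample is stored as (value, count) pairs only to keep the listing short.

```python
# Apartment rent sample from the example, stored as (value, count) pairs.
RENT_COUNTS = [
    (425, 1), (430, 2), (435, 5), (440, 5), (445, 5), (450, 7), (460, 3),
    (465, 3), (470, 2), (472, 1), (475, 3), (480, 4), (485, 1), (490, 3),
    (500, 4), (510, 2), (515, 1), (525, 3), (535, 1), (549, 1), (550, 1),
    (570, 2), (575, 2), (580, 1), (590, 1), (600, 4), (615, 2),
]
# Expand the pairs back into the 70 individual observations (ascending).
rents = [value for value, count in RENT_COUNTS for _ in range(count)]

total = sum(rents)        # sum of the x_i
n = len(rents)            # sample size
sample_mean = total / n   # x-bar

print(f"sum = {total}, n = {n}, mean = {sample_mean:.2f}")
```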

  6. Example: Apartment Rents • Trimmed Mean: With n = 70, a 5% trimmed mean removes 0.05(70) = 3.5, rounded up to 4, values from each end of the data set. 5% trimmed mean = \(\frac{30{,}206}{62} = 487.19\)
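
A trimmed mean along these lines might be computed as follows. This is a sketch under stated assumptions: the helper `trimmed_mean` is a hypothetical name, it takes the trimming percentage as an integer, and it rounds the number of trimmed values up, as the slide does.

```python
# Apartment rent sample from the example, stored as (value, count) pairs.
RENT_COUNTS = [
    (425, 1), (430, 2), (435, 5), (440, 5), (445, 5), (450, 7), (460, 3),
    (465, 3), (470, 2), (472, 1), (475, 3), (480, 4), (485, 1), (490, 3),
    (500, 4), (510, 2), (515, 1), (525, 3), (535, 1), (549, 1), (550, 1),
    (570, 2), (575, 2), (580, 1), (590, 1), (600, 4), (615, 2),
]
rents = [value for value, count in RENT_COUNTS for _ in range(count)]


def trimmed_mean(sorted_data, percent):
    """Mean after dropping ceil(percent% of n) values from each end.

    Assumes sorted_data is ascending and percent is a non-negative
    integer, so the ceiling is taken with exact integer arithmetic.
    """
    n = len(sorted_data)
    k = (percent * n + 99) // 100   # ceil(percent * n / 100)
    kept = sorted_data[k:n - k]
    if not kept:
        raise ValueError("trimming removed every value")
    return sum(kept) / len(kept)


print(f"5% trimmed mean = {trimmed_mean(rents, 5):.2f}")
```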

  7. Median • The median of a data set is the value in the middle when the data items are arranged in ascending order. • If there is an odd number of items, the median is the value of the middle item. • If there is an even number of items, the median is the average of the values for the middle two items.

  8. Example: Apartment Rents • Median: Median = 50th percentile. i = (p/100)n = (50/100)70 = 35. Because i is an integer, average the 35th and 36th data values: Median = (475 + 475)/2 = 475
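
As a quick check, the standard library's `statistics.median` applies the same even-n rule (average of the two middle items). The variable names below are illustrative only.

```python
import statistics

# Apartment rent sample from the example, stored as (value, count) pairs.
RENT_COUNTS = [
    (425, 1), (430, 2), (435, 5), (440, 5), (445, 5), (450, 7), (460, 3),
    (465, 3), (470, 2), (472, 1), (475, 3), (480, 4), (485, 1), (490, 3),
    (500, 4), (510, 2), (515, 1), (525, 3), (535, 1), (549, 1), (550, 1),
    (570, 2), (575, 2), (580, 1), (590, 1), (600, 4), (615, 2),
]
rents = [value for value, count in RENT_COUNTS for _ in range(count)]

# With n = 70 (even), the median averages the 35th and 36th values,
# which are at 0-based positions 34 and 35.
middle_pair = (rents[34], rents[35])
median_rent = statistics.median(rents)

print(f"middle pair = {middle_pair}, median = {median_rent}")
```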

  9. Mode • The mode of a data set is the value that occurs with greatest frequency.

  10. Example: Apartment Rents • Mode 450 occurred most frequently (7 times) Mode = 450
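
One way to find the mode is to tally the values with `collections.Counter`. This is a small sketch with illustrative names; note that `most_common(1)` reports only one value even if several tie for the highest frequency, which does not happen in this sample.

```python
from collections import Counter

# Apartment rent sample from the example, stored as (value, count) pairs.
RENT_COUNTS = [
    (425, 1), (430, 2), (435, 5), (440, 5), (445, 5), (450, 7), (460, 3),
    (465, 3), (470, 2), (472, 1), (475, 3), (480, 4), (485, 1), (490, 3),
    (500, 4), (510, 2), (515, 1), (525, 3), (535, 1), (549, 1), (550, 1),
    (570, 2), (575, 2), (580, 1), (590, 1), (600, 4), (615, 2),
]
rents = [value for value, count in RENT_COUNTS for _ in range(count)]

# Tally how often each rent occurs and take the most frequent one.
mode_value, mode_count = Counter(rents).most_common(1)[0]

print(f"mode = {mode_value} (occurs {mode_count} times)")
```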

  11. Percentiles • The pth percentile of a data set is a value such that at least p percent of the items take on this value or less and at least (100 - p) percent of the items take on this value or more. • Arrange the data in ascending order. • Compute index i, the position of the pth percentile: i = (p/100)n • If i is not an integer, round up. The pth percentile is the value in the ith position. • If i is an integer, the pth percentile is the average of the values in positions i and i + 1.
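
The index rule above can be sketched as a small function. The name `percentile` and its signature are hypothetical; the sketch assumes the data are already sorted and that p is an integer strictly between 0 and 100, which lets the index test use exact integer arithmetic. Library routines such as `numpy.percentile` interpolate by default and may give different answers.

```python
# Apartment rent sample from the example, stored as (value, count) pairs.
RENT_COUNTS = [
    (425, 1), (430, 2), (435, 5), (440, 5), (445, 5), (450, 7), (460, 3),
    (465, 3), (470, 2), (472, 1), (475, 3), (480, 4), (485, 1), (490, 3),
    (500, 4), (510, 2), (515, 1), (525, 3), (535, 1), (549, 1), (550, 1),
    (570, 2), (575, 2), (580, 1), (590, 1), (600, 4), (615, 2),
]
rents = [value for value, count in RENT_COUNTS for _ in range(count)]


def percentile(sorted_data, p):
    """Return the p-th percentile using the slide's index rule.

    Assumes sorted_data is ascending and p is an integer, 0 < p < 100.
    """
    n = len(sorted_data)
    if n == 0 or not 0 < p < 100:
        raise ValueError("need non-empty data and 0 < p < 100")
    # i = (p/100)n, split into its whole part and remainder (in hundredths).
    whole, remainder = divmod(p * n, 100)
    if remainder:
        # i is not an integer: round up to 1-based position whole + 1.
        return sorted_data[whole]
    # i is an integer: average the values in 1-based positions i and i + 1.
    return (sorted_data[whole - 1] + sorted_data[whole]) / 2


for p in (50, 75, 90):
    print(f"{p}th percentile = {percentile(rents, p)}")
```

Applied to the rent sample, the same function gives the median (p = 50) and the quartiles (p = 25 and p = 75).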

  12. Example: Apartment Rents • 90th Percentile: i = (p/100)n = (90/100)70 = 63. Because i is an integer, average the 63rd and 64th data values: 90th Percentile = (580 + 590)/2 = 585

  13. Quartiles • Quartiles are specific percentiles. • First Quartile = 25th Percentile • Second Quartile = 50th Percentile = Median • Third Quartile = 75th Percentile

  14. Example: Apartment Rents • Third Quartile: Third quartile = 75th percentile. i = (p/100)n = (75/100)70 = 52.5, rounded up to 53. Third quartile = value in the 53rd position = 525

  15. Measures of Variability • Range • Variance • Standard Deviation • Coefficient of Variation

  16. Range • The range of a data set is the difference between the largest and smallest data values. • It is the simplest measure of dispersion. • It is very sensitive to the smallest and largest data values.

  17. Example: Apartment Rents • Range Range = largest value - smallest value Range = 615 - 425 = 190

  18. Variance • The variance is the average of the squared differences between each data value and the mean. • If the data set is a sample, the variance is denoted by \(s^2\): \(s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}\) • If the data set is a population, the variance is denoted by \(\sigma^2\): \(\sigma^2 = \frac{\sum (x_i - \mu)^2}{N}\)

  19. Standard Deviation • The standard deviation of a data set is the positive square root of the variance. • It is measured in the same units as the data, making it more easily comparable to the mean. • If the data set is a sample, the standard deviation is denoted \(s\): \(s = \sqrt{s^2}\) • If the data set is a population, the standard deviation is denoted \(\sigma\) (sigma): \(\sigma = \sqrt{\sigma^2}\)

  20. Coefficient of Variation • The coefficient of variation indicates how large the standard deviation is in relation to the mean. • If the data set is a sample, the coefficient of variation is computed as follows: \(\left(\frac{s}{\bar{x}}\right)(100)\) • If the data set is a population, the coefficient of variation is computed as follows: \(\left(\frac{\sigma}{\mu}\right)(100)\)

  21. Example: Apartment Rents • Variance: \(s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} = 2{,}996.16\) • Standard Deviation: \(s = \sqrt{s^2} = \sqrt{2{,}996.16} = 54.74\) • Coefficient of Variation: \(\frac{s}{\bar{x}} \times 100 = \frac{54.74}{490.80} \times 100 = 11.15\)
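
The sample measures of variability for the rent data can be reproduced with Python's `statistics` module, whose `variance` and `stdev` functions use the n - 1 divisor. This is a minimal sketch with illustrative variable names; the range from the earlier slide is included for completeness.

```python
import statistics

# Apartment rent sample from the example, stored as (value, count) pairs.
RENT_COUNTS = [
    (425, 1), (430, 2), (435, 5), (440, 5), (445, 5), (450, 7), (460, 3),
    (465, 3), (470, 2), (472, 1), (475, 3), (480, 4), (485, 1), (490, 3),
    (500, 4), (510, 2), (515, 1), (525, 3), (535, 1), (549, 1), (550, 1),
    (570, 2), (575, 2), (580, 1), (590, 1), (600, 4), (615, 2),
]
rents = [value for value, count in RENT_COUNTS for _ in range(count)]

data_range = max(rents) - min(rents)            # largest minus smallest value
sample_variance = statistics.variance(rents)    # sum of squared deviations / (n - 1)
sample_sd = statistics.stdev(rents)             # positive square root of the variance
cv = sample_sd / statistics.mean(rents) * 100   # standard deviation as a % of the mean

print(f"range = {data_range}")
print(f"s^2 = {sample_variance:.2f}, s = {sample_sd:.2f}, CV = {cv:.2f}")
```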

  22. Chapter 3, Part 2 Descriptive Statistics II: Numerical Methods • Measures of Relative Location and Locating Outliers • z-Scores • Chebyshev’s Theorem • The Empirical Rule • Detecting Outliers

  23. z-Scores • The z-score is often called the standardized value. • It denotes the number of standard deviations a data value \(x_i\) is from the mean: \(z_i = \frac{x_i - \bar{x}}{s}\) • A data value less than the sample mean will have a z-score less than zero. • A data value greater than the sample mean will have a z-score greater than zero. • A data value equal to the sample mean will have a z-score of zero.

  24. Example: Apartment Rents • z-Score of Smallest Value (425): \(z_i = \frac{x_i - \bar{x}}{s} = \frac{425 - 490.80}{54.74} = -1.20\) • Standardized Values for Apartment Rents: -1.20 -1.11 -1.11 -1.02 -1.02 -1.02 -1.02 -1.02 -0.93 -0.93 -0.93 -0.93 -0.93 -0.84 -0.84 -0.84 -0.84 -0.84 -0.75 -0.75 -0.75 -0.75 -0.75 -0.75 -0.75 -0.56 -0.56 -0.56 -0.47 -0.47 -0.47 -0.38 -0.38 -0.34 -0.29 -0.29 -0.29 -0.20 -0.20 -0.20 -0.20 -0.11 -0.01 -0.01 -0.01 0.17 0.17 0.17 0.17 0.35 0.35 0.44 0.62 0.62 0.62 0.81 1.06 1.08 1.45 1.45 1.54 1.54 1.63 1.81 1.99 1.99 1.99 1.99 2.27 2.27
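
The standardized values can be regenerated from the raw rents. This is a sketch with illustrative names; it uses the unrounded mean and standard deviation, so an individual value could in principle differ in the last digit from one computed with the rounded 490.80 and 54.74.

```python
import statistics

# Apartment rent sample from the example, stored as (value, count) pairs.
RENT_COUNTS = [
    (425, 1), (430, 2), (435, 5), (440, 5), (445, 5), (450, 7), (460, 3),
    (465, 3), (470, 2), (472, 1), (475, 3), (480, 4), (485, 1), (490, 3),
    (500, 4), (510, 2), (515, 1), (525, 3), (535, 1), (549, 1), (550, 1),
    (570, 2), (575, 2), (580, 1), (590, 1), (600, 4), (615, 2),
]
rents = [value for value, count in RENT_COUNTS for _ in range(count)]

mean = statistics.mean(rents)
s = statistics.stdev(rents)   # sample standard deviation (n - 1 divisor)

# z_i = (x_i - x-bar) / s for every observation.
z_scores = [(x - mean) / s for x in rents]

print(f"z for smallest value ({rents[0]}) = {z_scores[0]:.2f}")
print(f"z for largest value ({rents[-1]}) = {z_scores[-1]:.2f}")
```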

  25. Chebyshev’s Theorem At least \((1 - 1/k^2)\) of the items in any data set will be within k standard deviations of the mean, where k is any value greater than 1. • At least 75% of the items must be within k = 2 standard deviations of the mean. • At least 89% of the items must be within k = 3 standard deviations of the mean. • At least 94% of the items must be within k = 4 standard deviations of the mean.

  26. Example: Apartment Rents • Chebyshev’s Theorem: Let k = 1.5 with \(\bar{x}\) = 490.80 and s = 54.74. At least \((1 - 1/(1.5)^2)\) = 1 - 0.44 = 0.56 or 56% of the rent values must be between \(\bar{x} - k(s)\) = 490.80 - 1.5(54.74) = 409 and \(\bar{x} + k(s)\) = 490.80 + 1.5(54.74) = 573

  27. Example: Apartment Rents • Chebyshev’s Theorem (continued) Actually, 86% of the rent values are between 409 and 573.
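
The guaranteed lower bound and the actual share can be compared directly. This is a minimal sketch; `chebyshev_bound` is a hypothetical helper name, and the interval is built from the unrounded mean and standard deviation.

```python
import statistics

# Apartment rent sample from the example, stored as (value, count) pairs.
RENT_COUNTS = [
    (425, 1), (430, 2), (435, 5), (440, 5), (445, 5), (450, 7), (460, 3),
    (465, 3), (470, 2), (472, 1), (475, 3), (480, 4), (485, 1), (490, 3),
    (500, 4), (510, 2), (515, 1), (525, 3), (535, 1), (549, 1), (550, 1),
    (570, 2), (575, 2), (580, 1), (590, 1), (600, 4), (615, 2),
]
rents = [value for value, count in RENT_COUNTS for _ in range(count)]


def chebyshev_bound(k):
    """Minimum fraction of items within k standard deviations (k > 1)."""
    if k <= 1:
        raise ValueError("Chebyshev's theorem requires k > 1")
    return 1 - 1 / k**2


k = 1.5
mean = statistics.mean(rents)
s = statistics.stdev(rents)
lower, upper = mean - k * s, mean + k * s

# Count how many rents actually fall inside the interval.
inside = sum(lower <= x <= upper for x in rents)
actual_share = inside / len(rents)

print(f"guaranteed: at least {chebyshev_bound(k):.0%}")
print(f"observed: {inside}/{len(rents)} = {actual_share:.0%} "
      f"between {lower:.0f} and {upper:.0f}")
```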

  28. Empirical Rule For data having a bell-shaped distribution: • Approximately 68% of the data values will be within one standard deviation of the mean. • Approximately 95% of the data values will be within two standard deviations of the mean. • Almost all of the items (about 99.7%) will be within three standard deviations of the mean.

  29. Example: Apartment Rents • Empirical Rule
                       Interval            % in Interval
      Within +/- 1s    436.06 to 545.54    48/70 = 69%
      Within +/- 2s    381.32 to 600.28    68/70 = 97%
      Within +/- 3s    326.58 to 655.02    70/70 = 100%
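
The counts in this table can be reproduced by checking each interval against the raw data. This is a sketch with illustrative names, using the unrounded mean and standard deviation for the interval endpoints.

```python
import statistics

# Apartment rent sample from the example, stored as (value, count) pairs.
RENT_COUNTS = [
    (425, 1), (430, 2), (435, 5), (440, 5), (445, 5), (450, 7), (460, 3),
    (465, 3), (470, 2), (472, 1), (475, 3), (480, 4), (485, 1), (490, 3),
    (500, 4), (510, 2), (515, 1), (525, 3), (535, 1), (549, 1), (550, 1),
    (570, 2), (575, 2), (580, 1), (590, 1), (600, 4), (615, 2),
]
rents = [value for value, count in RENT_COUNTS for _ in range(count)]

mean = statistics.mean(rents)
s = statistics.stdev(rents)
n = len(rents)

counts = []
for k in (1, 2, 3):
    lower, upper = mean - k * s, mean + k * s
    # Number of observations within k standard deviations of the mean.
    count = sum(lower <= x <= upper for x in rents)
    counts.append(count)
    print(f"within +/- {k}s: {lower:.2f} to {upper:.2f}  "
          f"{count}/{n} = {count / n:.0%}")
```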

  30. Detecting Outliers • An outlier is an unusually small or unusually large value in a data set. • A data value with a z-score less than -3 or greater than +3 might be considered an outlier. • It might be an incorrectly recorded data value. • It might be a data value that was incorrectly included in the data set. • It might be a correctly recorded data value that belongs in the data set!

  31. Example: Apartment Rents • Detecting Outliers: The most extreme z-scores are -1.20 and 2.27. Using |z| > 3 as the criterion for an outlier, there are no outliers in this data set.
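
The |z| > 3 screen can be sketched as a small filter. The function name `find_outliers` and its `threshold` parameter are hypothetical; a value it flags is only a candidate for review, in line with the cautions on the previous slide.

```python
import statistics

# Apartment rent sample from the example, stored as (value, count) pairs.
RENT_COUNTS = [
    (425, 1), (430, 2), (435, 5), (440, 5), (445, 5), (450, 7), (460, 3),
    (465, 3), (470, 2), (472, 1), (475, 3), (480, 4), (485, 1), (490, 3),
    (500, 4), (510, 2), (515, 1), (525, 3), (535, 1), (549, 1), (550, 1),
    (570, 2), (575, 2), (580, 1), (590, 1), (600, 4), (615, 2),
]
rents = [value for value, count in RENT_COUNTS for _ in range(count)]


def find_outliers(data, threshold=3.0):
    """Return the values whose z-score exceeds the threshold in magnitude."""
    mean = statistics.mean(data)
    s = statistics.stdev(data)
    return [x for x in data if abs((x - mean) / s) > threshold]


outliers = find_outliers(rents)
print(f"outliers with |z| > 3: {outliers}")
```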

  32. Chapter 3, Part 3 Descriptive Statistics II: Numerical Methods • Measures of Association Between Two Variables • Working with Grouped Data

  33. Measures of Association Between Two Variables • Covariance • Correlation Coefficient

  34. Covariance • Positive values indicate a positive relationship. • Negative values indicate a negative relationship. • If the data sets are samples, the covariance is denoted by \(s_{xy}\). • If the data sets are populations, the covariance is denoted by \(\sigma_{xy}\).

  35. Correlation Coefficient • The coefficient can take on values between -1 and +1. • Values near -1 indicate a strong negative linear relationship. • Values near +1 indicate a strong positive linear relationship. • If the data sets are samples, the coefficient is denoted by \(r_{xy}\). • If the data sets are populations, the coefficient is denoted by \(\rho_{xy}\). • For samples, \(r_{xy} = \frac{s_{xy}}{s_x s_y}\), where \(s_x\) and \(s_y\) are the sample standard deviations of each variable.
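
The slides give no two-variable data, so the sketch below uses a small made-up data set (`xs`, `ys`) purely for illustration. It assumes the standard sample definitions: the covariance is the sum of the products of paired deviations from the means divided by n - 1, and the correlation coefficient is the covariance divided by the product of the two sample standard deviations. The function names are hypothetical.

```python
import statistics


def sample_covariance(xs, ys):
    """Sum of products of paired deviations from the means, over n - 1."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with n >= 2")
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    cross_products = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return cross_products / (len(xs) - 1)


def sample_correlation(xs, ys):
    """Sample covariance scaled by both sample standard deviations."""
    return sample_covariance(xs, ys) / (statistics.stdev(xs) * statistics.stdev(ys))


# Hypothetical paired observations, not taken from the slides.
xs = [1, 2, 3, 4, 5]
ys = [1, 3, 2, 5, 4]

s_xy = sample_covariance(xs, ys)
r_xy = sample_correlation(xs, ys)
print(f"s_xy = {s_xy:.2f}, r_xy = {r_xy:.2f}")
```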

  36. Mean for Grouped Data • Sample Data: \(\bar{x} = \frac{\sum f_i M_i}{n}\) • Population Data: \(\mu = \frac{\sum f_i M_i}{N}\) where \(f_i\) = frequency of class i and \(M_i\) = midpoint of class i

  37. Example: Apartment Rents • Given below is the previous sample of monthly rents for one-bedroom apartments presented as grouped data in the form of a frequency distribution.

  38. Example: Apartment Rents • Mean for Grouped Data This approximation differs by $2.41 from the actual sample mean of $490.80.
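
The frequency distribution itself is not preserved in this transcript. The sketch below therefore assumes ten classes of width 20 starting at 420 (420-439, 440-459, ..., 600-619) and tallies the frequencies from the raw sample; the class limits and all names are assumptions. Under that assumption the grouped mean differs from the actual sample mean by the $2.41 stated on the slide.

```python
# Apartment rent sample from the example, stored as (value, count) pairs.
RENT_COUNTS = [
    (425, 1), (430, 2), (435, 5), (440, 5), (445, 5), (450, 7), (460, 3),
    (465, 3), (470, 2), (472, 1), (475, 3), (480, 4), (485, 1), (490, 3),
    (500, 4), (510, 2), (515, 1), (525, 3), (535, 1), (549, 1), (550, 1),
    (570, 2), (575, 2), (580, 1), (590, 1), (600, 4), (615, 2),
]
rents = [value for value, count in RENT_COUNTS for _ in range(count)]
n = len(rents)

# ASSUMED class limits: 420-439, 440-459, ..., 600-619 (not in the transcript).
class_limits = [(lower, lower + 19) for lower in range(420, 620, 20)]

# f_i: number of rents in each class; M_i: midpoint of each class.
frequencies = [sum(lo <= x <= hi for x in rents) for lo, hi in class_limits]
midpoints = [(lo + hi) / 2 for lo, hi in class_limits]

grouped_mean = sum(f * m for f, m in zip(frequencies, midpoints)) / n
actual_mean = sum(rents) / n

print(f"frequencies = {frequencies}")
print(f"grouped mean = {grouped_mean:.2f}, actual mean = {actual_mean:.2f}, "
      f"difference = {grouped_mean - actual_mean:.2f}")
```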

  39. Variance for Grouped Data • Sample Data: \(s^2 = \frac{\sum f_i (M_i - \bar{x})^2}{n - 1}\) • Population Data: \(\sigma^2 = \frac{\sum f_i (M_i - \mu)^2}{N}\)

  40. Example: Apartment Rents • Sample Variance for Grouped Data: \(s^2 = 3{,}017.89\) • Sample Standard Deviation for Grouped Data: \(s = \sqrt{3{,}017.89} = 54.94\) This approximation differs by only $0.20 from the actual standard deviation of $54.74.
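
The grouped-data variance can be sketched the same way. The (midpoint, frequency) pairs below are an assumption: they come from tallying the raw sample into classes of width 20 starting at 420, since the frequency table is not preserved in this transcript. Here the mean in the formula is the grouped-data mean.

```python
import math

# ASSUMED (midpoint M_i, frequency f_i) pairs for classes
# 420-439, 440-459, ..., 600-619, tallied from the raw rent sample.
GROUPED = [
    (429.5, 8), (449.5, 17), (469.5, 12), (489.5, 8), (509.5, 7),
    (529.5, 4), (549.5, 2), (569.5, 4), (589.5, 2), (609.5, 6),
]

n = sum(f for _, f in GROUPED)
grouped_mean = sum(f * m for m, f in GROUPED) / n

# s^2 = sum of f_i * (M_i - x-bar)^2, divided by n - 1.
grouped_variance = sum(f * (m - grouped_mean) ** 2 for m, f in GROUPED) / (n - 1)
grouped_sd = math.sqrt(grouped_variance)

print(f"n = {n}, grouped mean = {grouped_mean:.2f}")
print(f"s^2 = {grouped_variance:.2f}, s = {grouped_sd:.2f}")
```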
