
Understanding Central Tendency in Statistics

Learn about measures of central tendency, dispersion, skewness, and kurtosis in statistical data analysis. Explore central values, modes, medians, and their importance in data representation.

galvant

Presentation Transcript


  1. Fundamental statistical characteristics I: Measures of central tendency Chapter 3

  2. Fundamental statistical characteristics Group indexes • Central tendency • Variability (Dispersion) • Skewness (Asymmetry) • Kurtosis (Peakedness) Individual indexes • Position • Centiles (Ci) • Percentiles (Pi) • Quartiles (Qi) • Raw scores (Xi) • Deviation scores (xi) • Standard scores (Zi)

  3. Which value represents the whole? Around which value are the majority of the data?

  4. How are the data arranged with respect to the center of the distribution? How far apart or close together are the data?

  5. How are the data arranged with respect to the rest? Are data piled at one end?

  6. What shape does the distribution have? Is it flat or peaked?

  7. Central tendency indexes

  8. To describe a data distribution we need at least two statistics: • 1. One that reflects the central tendency: the value that represents the whole, around which most of the data lie. • 2. Another that reflects the dispersion around this value: whether the data are far apart or close together with respect to the central value.

  9. Central tendency measure • It is a brief description of a mass of data, usually obtained from a sample. • It serves to describe, indirectly, the population from which the sample was drawn. • If the sample is representative, the average of its values says a lot about the average we would obtain in the population it represents.

  10. 1. Which is the value most often repeated? Mo

  11. Mode (Mo) • “The most often repeated value”. • “The value most frequently observed in a sample or population”. • “The value of the variable with the highest absolute frequency”. • It is symbolized by Mo (Fechner and Pearson).

  12. Type I distributions: Small data set a) Unimodal distribution: • Data: [8 – 8 – 11 – 11 – 15 – 15 – 15 – 15 – 15 – 17 – 17 – 17 – 19 – 19] Mo = 15

  13. b) Amodal distribution: • Data: [8 – 8 – 8 – 11 – 11 – 11 – 15 – 15 – 15 – 17 – 17 – 17 – 19 – 19 – 19] No mode

  14. c) Bimodal distribution: • Data: [8 – 9 – 9 – 10 – 10 – 10 – 10 – 11 – 11 – 13 – 13 – 13 – 13 – 15] Mo1 = 10, Mo2 = 13

  15. d) Multimodal distribution: • Data: [8 – 8 – 9 – 9 – 9 – 10 – 11 – 11 – 11 – 12 – 12 – 13 – 13 – 13 – 14 – 15 – 15] Mo1 = 9, Mo2 = 11, Mo3 = 13
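The four small-data-set cases above can be checked with a short Python sketch. The function name `modes` and the rule "every distinct value equally frequent means no mode" are assumptions made here to mirror the amodal slide; the data lists are copied from the slides.

```python
from collections import Counter

def modes(data):
    """Return the sorted list of modes; an empty list means no mode (amodal)."""
    if not data:
        return []
    counts = Counter(data)
    highest = max(counts.values())
    # Assumption: if every distinct value is equally frequent, none stands out.
    if len(counts) > 1 and all(f == highest for f in counts.values()):
        return []
    return sorted(value for value, f in counts.items() if f == highest)

unimodal = [8, 8, 11, 11, 15, 15, 15, 15, 15, 17, 17, 17, 19, 19]
amodal = [8, 8, 8, 11, 11, 11, 15, 15, 15, 17, 17, 17, 19, 19, 19]
bimodal = [8, 9, 9, 10, 10, 10, 10, 11, 11, 13, 13, 13, 13, 15]
multimodal = [8, 8, 9, 9, 9, 10, 11, 11, 11, 12, 12, 13, 13, 13, 14, 15, 15]

print(modes(unimodal))    # [15]
print(modes(amodal))      # []
print(modes(bimodal))     # [10, 13]
print(modes(multimodal))  # [9, 11, 13]
```

Note that the standard library's `statistics.multimode` would return every value for the amodal list, which is why the sketch handles that case explicitly.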

  16. Type II distributions: Big data set Frequency table a) Unimodal distribution: Mo = 14 (the most often repeated value)

  17. b) Bimodal distribution: Mo1 = 2 and Mo2 = 6 (the two most often repeated values)
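When the data come as a frequency table, the mode is simply the value (or values) with the highest absolute frequency. The frequency tables from the slides are not part of this transcript, so the sketch below uses hypothetical tables, built only so that their modes coincide with the ones quoted above; `modes_from_table` is an illustrative name.

```python
def modes_from_table(table):
    """table maps each value to its absolute frequency; returns the sorted modes."""
    highest = max(table.values())
    return sorted(x for x, f in table.items() if f == highest)

# Hypothetical frequency tables (NOT the ones shown on the slides).
unimodal_table = {12: 3, 13: 7, 14: 11, 15: 6, 16: 2}
bimodal_table = {1: 4, 2: 9, 3: 5, 4: 3, 5: 6, 6: 9, 7: 2}

print(modes_from_table(unimodal_table))  # [14]
print(modes_from_table(bimodal_table))   # [2, 6]
```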

  18. Complete the table if you know that the modes are -2, -1 and 5 and that f3 = f4

  19. 2. What is the average score in motivation?

  20. Arithmetic mean • It is the most commonly used index of central tendency. • Definition: “It is the sum of all observed values divided by the total number of them”.

  21. Type I distributions: Small data set • Example: The following are 10 numbers remembered by 10 children in an immediate memory task • 6 – 5 – 4 – 7 – 5 – 7 – 8 – 6 – 7 – 8

  22. 6 – 5 – 4 – 7 – 5 – 7 – 8 – 6 – 7 – 8 • Mean = 63 / 10 = 6.3 [number line from 4 to 8 with the mean marked at 6.3]

  23. In the following series, the “center of gravity” is: 3 – 10 – 8 – 4 – 7 – 6 – 9 – 12 – 2 – 4

  24. [Number line from 2 to 12 with the mean, 6.5, marked as the center of gravity]
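A minimal Python sketch of the definition (sum of all observed values divided by their number), applied to the two series from the slides. The last lines illustrate why the mean can be read as a "center of gravity": the deviations from it add up to zero. The helper name `mean` is chosen here for illustration.

```python
import math

def mean(data):
    """Arithmetic mean: sum of the values divided by how many there are."""
    return sum(data) / len(data)

memory_task = [6, 5, 4, 7, 5, 7, 8, 6, 7, 8]
series = [3, 10, 8, 4, 7, 6, 9, 12, 2, 4]

print(mean(memory_task))  # 6.3
print(mean(series))       # 6.5

# The deviations from the mean balance out, like weights around a fulcrum.
deviations = [x - mean(series) for x in series]
print(math.isclose(sum(deviations), 0.0, abs_tol=1e-9))  # True
```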

  25. Type II distributions: Big data set Possibility 1: mean from a frequency table

  26. Frequency tables

  27. [Number line from 0 to 4 with the mean marked at 1.65]
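For data summarized in a frequency table, the same definition becomes: multiply each value by its absolute frequency, add the products, and divide by n. The table behind the slides' 1.65 is not included in this transcript, so the sketch below uses a hypothetical table (its mean is therefore different); `mean_from_table` is an illustrative name.

```python
import math

def mean_from_table(table):
    """table maps each value x to its absolute frequency f; returns sum(x*f) / n."""
    n = sum(table.values())
    total = sum(x * f for x, f in table.items())
    return total / n

# Hypothetical frequency table (NOT the one from the slides): n = 20, sum = 37.
table = {0: 2, 1: 5, 2: 8, 3: 4, 4: 1}
print(mean_from_table(table))  # 1.85

# Same result as averaging the expanded list of raw scores.
raw = [x for x, f in table.items() for _ in range(f)]
print(math.isclose(sum(raw) / len(raw), mean_from_table(table)))  # True
```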

  28. Clearances

  29. Possibility 2 (derived from possibility 1):

  30. 3. Which is the value exceeded by half of the subjects? Mdn

  31. Median (Mdn) • Definitions: • It is the point that divides the distribution into 2 equal parts. • It is the value with the property that the number of observations smaller than it equals the number of observations larger than it. • It is the value that occupies the central position of an ordered series of data. • 50% of the values are above the central value and the other 50% are below it.

  32. Graphic representation • It is defined as a point (a value), not as a particular datum or measurement. • A point whose value does not necessarily have to match any observed value.

  33. Type I distributions: Small data set ODD data set: [7 – 11 – 6 – 5 – 7 – 12 – 9 – 8 – 10 – 6 – 9] 1) The data are sorted from lowest to highest: [5 – 6 – 6 – 7 – 7 – 8 – 9 – 9 – 10 – 11 – 12] 2) The central value is obtained:

  34. [5 – 6 – 6 – 7 – 7 – 8 – 9 – 9 – 10 – 11 – 12] Mdn = 8

  35. EVEN data set: [23 – 35 – 43 – 29 – 34 – 41 – 33 – 38 – 38 – 32] 1) The data are sorted from lowest to highest: [23 – 29 – 32 – 33 – 34 – 35 – 38 – 38 – 41 – 43] 2) The mean of the two central values is obtained:

  36. [23 – 29 – 32 – 33 – 34 – 35 – 38 – 38 – 41 – 43] Mdn = (34 + 35) / 2 = 34.5

  37. Type II distributions: Big data set Frequency tables Example: • n = 36 • Because n is even, there are 2 central data points • 36/2 = 18, so the central point lies between positions 18 and 19 (position 18.5) • x18 = x19 = 10, so x18.5 = 10 and Mdn = 10

  38. Comparison between measures of central tendency • Unless there are reasons against it, we prefer the mean: • Other statistics are based on the mean. • It is the best estimator of its parameter. • We prefer the median: • When the variable is ordinal. • When there are very extreme values. • When there are open-ended intervals. • We prefer the mode: • When the variable is qualitative or nominal. • When the open-ended interval contains the median.
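With a frequency table, the central positions can be located through cumulative frequencies instead of listing every score. The slide's table is not in this transcript; the table below is hypothetical, constructed only so that n = 36 and positions 18 and 19 both hold the value 10, as in the example. Function names are illustrative.

```python
def value_at_position(table, position):
    """Value found at a 1-based position of the sorted data described by the table."""
    cumulative = 0
    for x in sorted(table):
        cumulative += table[x]          # cumulative absolute frequency up to x
        if position <= cumulative:
            return x
    raise ValueError("position is larger than n")

def median_from_table(table):
    n = sum(table.values())
    if n % 2 == 1:
        return value_at_position(table, (n + 1) // 2)
    lower = value_at_position(table, n // 2)       # e.g. position 18 when n = 36
    upper = value_at_position(table, n // 2 + 1)   # e.g. position 19 when n = 36
    return (lower + upper) / 2

# Hypothetical frequency table with n = 36 (NOT the one from the slides).
table = {8: 5, 9: 9, 10: 12, 11: 7, 12: 3}

print(value_at_position(table, 18), value_at_position(table, 19))  # 10 10
print(median_from_table(table))                                    # 10.0
```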
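The advice to prefer the median when there are very extreme values can be illustrated with a small sketch. The scores are hypothetical, invented here only to show the effect of one extreme observation on each statistic.

```python
import statistics

# Hypothetical scores, invented for illustration.
scores = [4, 5, 5, 6, 6, 7, 9]
with_extreme = scores + [58]    # the same scores plus one very extreme value

mean_before = statistics.mean(scores)
median_before = statistics.median(scores)
mean_after = statistics.mean(with_extreme)
median_after = statistics.median(with_extreme)

print(mean_before, median_before)  # mean 6, median 6
print(mean_after, median_after)    # mean 12.5, median still 6
```

A single extreme score roughly doubles the mean, while the median does not move.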

  39. Degree of agreement to consider "shouting" as a sign of aggression

  40. Number of rituals that students do before an exam

  41. Position measures

  42. • Central tendency measures: used to indicate the particular value around which a data set is located. • Position measures: used to provide information about the relative position of a case within the data set it belongs to. They are used to interpret specific data points.

  43. Quantiles • They divide the distribution into k parts with the same amount of data. • Mdn: divides the distribution into 2 parts: i/k = 1/2 • Quartiles (Qk): divide the distribution into 4 parts: Q1, Q2, Q3: i/k = 1/4 • Deciles (Dk): divide the distribution into 10 parts: D1, D2, ... , D9: i/k = 1/10 • Percentiles (Pk): divide the distribution into 100 parts: P1, P2, ... , P99: i/k = 1/100
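As a quick check of the counts listed above (3 quartiles, 9 deciles, 99 percentiles), here is a sketch using Python's standard `statistics.quantiles`. Its default interpolation method is an assumption that may not match the convention used in the slides; the data are the odd data set from the median example.

```python
import statistics

data = [7, 11, 6, 5, 7, 12, 9, 8, 10, 6, 9]  # odd data set from the median example

quartiles = statistics.quantiles(data, n=4)      # cut points Q1, Q2, Q3
deciles = statistics.quantiles(data, n=10)       # cut points D1 ... D9
percentiles = statistics.quantiles(data, n=100)  # cut points P1 ... P99

print(len(quartiles), len(deciles), len(percentiles))  # 3 9 99

# Q2, D5 and P50 all coincide with the median (8 for this data set).
print(quartiles[1] == deciles[4] == percentiles[49] == statistics.median(data))  # True
```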

  44. Quantiles graphic representation

  45. Calculating the value that corresponds to a particular quantile • 1. Translate the position measure into an absolute position. • 2. Find the value of the datum that occupies that absolute position. • The question is: which value occupies position ...?

  46. E.g., the 7th decile corresponds to position 20. • Which value does the datum in absolute position 20 have?
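The two steps can be sketched in Python. The transcript does not state the rule used to translate a quantile into an absolute position, so the sketch assumes the nearest-rank rule (position = i·n/k, rounded up), which is only one of several conventions. The data are hypothetical: 28 values, chosen so that the 7th decile lands on position 20 as in the example. Function names are illustrative.

```python
def absolute_position(i, k, n):
    """Step 1: 1-based position of the i-th of k quantiles among n sorted data.

    Assumption: nearest-rank rule, i.e. the ceiling of i * n / k.
    """
    return max(1, -(-i * n // k))  # integer ceiling, avoids float rounding

def quantile_value(data, i, k):
    """Step 2: value of the datum that occupies that absolute position."""
    ordered = sorted(data)
    return ordered[absolute_position(i, k, len(ordered)) - 1]

# Hypothetical data: the 28 values 101, 102, ..., 128.
data = list(range(101, 129))

print(absolute_position(7, 10, len(data)))  # 20  (7th decile -> position 20)
print(quantile_value(data, 7, 10))          # 120
print(quantile_value(data, 3, 4))           # 121 (Q3 -> position 21)
```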

  47. Example: Q3
