Central Tendency • A statistical measure that serves as a descriptive statistic • Mean, median, and mode • A single value that summarizes or condenses a large set of data • accurately describes the center of the distribution • represents the entire distribution of scores • Used to compare two (or more) sets of data • comparing the average score for one set versus the average score for another set
Figure 3-1 Frequency distribution for ratings of attractiveness of a female face shown in a photograph for two groups of male participants: those who had consumed no alcohol and those who had consumed moderate amounts of alcohol.
Figure 3-2 Three distributions demonstrating the difficulty of defining central tendency. In each case, try to locate the “center” of the distribution.
The Mean, the Median, and the Mode • No single procedure always produces a good, representative value. • Therefore, researchers have developed three commonly used techniques for measuring central tendency: • the mean • the median • the mode
The Mean • Most commonly used measure of central tendency • Requires scores on an interval or ratio scale • Computation • find the sum, or total, for the entire set of scores • then divide this sum by the number of scores • Population formula: µ = ΣX/N • Sum of scores divided by population size (N) • for example, scores of 3, 7, 4, 6 • ΣX = 20, N = 4, µ = 20/4 = 5 • Sample formula: M = ΣX/n (written X̄ in some books) • Sum of scores divided by sample size (n)
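The computation on this slide can be sketched in a few lines of Python. This is only an illustration using the slide's four scores; the variable names are assumptions of this sketch.

```python
# Scores from the slide's example
scores = [3, 7, 4, 6]

total = sum(scores)   # ΣX
n = len(scores)       # number of scores (N for a population, n for a sample)
mean = total / n      # ΣX / N

print(total, n, mean)  # 20 4 5.0
```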
Figure 3-3 The frequency distribution shown as a seesaw balanced at the mean. • Conceptually, the mean can also be defined as: • the amount that each individual receives when the total is divided equally among all individuals • n = 6 boys with 180 baseball cards divided equally • M = ΣX/n = 180/6 = 30 • the balance point of the distribution
Weighted Mean • When combining two sets of scores with different sample sizes • First sample: n = 12, ΣX = 72, M = 6 • Second sample: n = 8, ΣX = 56, M = 7 • Overall mean • WRONG: (6 + 7)/2 = 6.5 (averaging the two means works only if the samples are the same size) • Combined sum divided by combined n • Weighted mean: M = (ΣX1 + ΣX2) / (n1 + n2) • CORRECT: (72 + 56) / (12 + 8) = 128/20 = 6.4 • Skip Box 3.1
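A minimal sketch of the weighted mean, using the two samples from this slide. The function name `weighted_mean` is an assumption of this sketch, not something from the slides.

```python
def weighted_mean(sums, sizes):
    """Combined sum of scores divided by combined number of scores."""
    return sum(sums) / sum(sizes)

# First sample: ΣX = 72, n = 12; second sample: ΣX = 56, n = 8
overall = weighted_mean([72, 56], [12, 8])

# Averaging the two sample means ignores the different sample sizes
naive = (6 + 7) / 2

print(overall)  # 6.4
print(naive)    # 6.5
```

The two results agree only when both samples have the same n.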
Table 3.1 Calculating the mean from a frequency distribution table. Statistics quiz scores for a section of n = 8 students. ΣfX = ΣX = 66, Σf = n = 8, M = 66/8 = 8.25
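The table's rows are not reproduced in these notes, so the sketch below uses assumed score and frequency values chosen only so that they reproduce the totals given above (ΣfX = 66, Σf = 8). It shows how the mean follows from a frequency table.

```python
# Hypothetical frequency table (score X -> frequency f); the individual rows
# are an assumption, picked to match the totals stated for Table 3.1.
freq_table = {10: 1, 9: 2, 8: 4, 6: 1}

n = sum(freq_table.values())                         # Σf
total = sum(x * f for x, f in freq_table.items())    # ΣfX
mean = total / n

print(n, total, mean)  # 8 66 8.25
```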
Changing the Mean • Calculation of the mean involves every score in the distribution, so: • modifying a set of scores • by discarding scores • by adding new scores • will usually change the value of the mean • To determine how the mean will be changed • determine how the number of scores (n) changes • determine how the sum of the scores (ΣX) changes
Figure 3-4 Adding a Score. A distribution of N = 5 scores that is balanced with a mean of µ = 7. What if a new score X = 10 is added to the distribution? • Original distribution: N = 5, ΣX = 35, µ = 35/5 = 7 • After adding X = 10: N = 6, ΣX = 45, µ = 45/6 = 7.5 • After also adding X = 4: N = 7, ΣX = 49, µ = 49/7 = 7
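The figure's arithmetic only needs the number of scores and their sum, which the following sketch tracks. The helper name `add_score` is an assumption of this sketch.

```python
def add_score(n, total, new_score):
    """Return the updated count, sum, and mean after one score is added."""
    n += 1
    total += new_score
    return n, total, total / n

# Original distribution: N = 5, ΣX = 35, mean = 7
n, total, mean = add_score(5, 35, 10)
print(n, total, mean)  # 6 45 7.5

# Adding a score of 4 brings the mean back to 7
n, total, mean = add_score(n, total, 4)
print(n, total, mean)  # 7 49 7.0
```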
Changing the Mean • Changing the value of any score will change the value of the mean • If a constant value is added to every score in a distribution, • then the same constant value is added to the mean • If every score is multiplied by a constant value, • then the mean is also multiplied by the same constant value
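Both rules can be checked numerically. The sketch below reuses the scores 3, 7, 4, 6 from the earlier mean example; the constants 2 and 2.54 are chosen here for illustration.

```python
import math

def mean(values):
    return sum(values) / len(values)

scores = [3, 7, 4, 6]                      # mean = 5

shifted = [x + 2 for x in scores]          # add a constant to every score
scaled = [x * 2.54 for x in scores]        # multiply every score by a constant

print(mean(scores), mean(shifted))         # 5.0 7.0
# Floating-point products are compared with a tolerance
print(math.isclose(mean(scaled), 2.54 * mean(scores)))  # True
```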
Table 3.2 Adding or Subtracting a Constant. Number of sentences recalled for humorous and nonhumorous sentences (n = 6 for each set of scores; constant shown: +2).
Table 3.3 Multiplying or Dividing by a Constant. Measurements of five pieces of wood in inches, transformed to centimeters (n = 5 for each set of measurements). The mean in centimeters is 2.54 times the mean in inches.
The Median • The midpoint of scores listed in order from smallest to largest • Same as the 50th percentile • 50% of the scores are below the median • Computation • requires scores measured on an ordinal, interval, or ratio scale • simple counting procedure • With an odd number of scores • list the values in order • the median is the middle score in the list • With an even number of scores • list the values in order • the median is halfway between the middle two scores
Example 3.7 The median divides the area in the graph exactly in half. Scores of 3, 5, 8, 10, 11 organized by value. An odd number of scores, so the middle score is 8.
Example 3.8 The median divides the area in the graph exactly in half. Scores of 3, 3, 4, 5, 7, 8 organized by value. An even number of scores, so the middle is between 4 and 5; the median is (4 + 5)/2 = 9/2 = 4.5.
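The counting procedure for odd and even numbers of scores can be sketched as follows, using the scores from Examples 3.7 and 3.8. The function is an illustrative sketch; Python's standard `statistics.median` gives the same results.

```python
def median(scores):
    """Middle score for odd n; halfway between the two middle scores for even n."""
    ordered = sorted(scores)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median([3, 5, 8, 10, 11]))    # 8
print(median([3, 3, 4, 5, 7, 8]))   # 4.5
```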
The Median • If the scores are measurements of a continuous variable • Can calculate the 50% position on the number line (50th percentile) • Useful when scores are piled up around the median • Technique using a histogram • place the scores in a frequency distribution histogram • with each score represented by a box on the graph • draw a vertical line through the distribution so that exactly half the boxes are on each side of the line • The median is defined by the location of the line • Technique using a frequency distribution table • Organize the scores in a frequency distribution table • Calculate the 50th percentile • See below and Box 3.2
Figure 3-5 (page 84) A distribution with several scores clustered at the median. The median for this distribution is positioned so that each of the four boxes above X = 4 is divided into two sections, with 1/4 of each box below the median (to the left) and 3/4 of each box above the median (to the right). As a result, there are exactly four boxes, 50% of the distribution, on each side of the median. Median = 3.5 + 0.25 = 4.5 - 0.75 = 3.75
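Using a frequency distribution table to calculate the 50th percentile, which is the median, from the Example 3.7 numbers • 50% falls between the cumulative percentages 37.5% and 87.5%, a distance of 50 percentage points • 50% is 37.5 points below 87.5%, so 37.5/50 = 0.75 of the interval • The distance between the real limits 3.5 and 4.5 is 1, so 1(0.75) = 0.75 • Median = 4.5 - 0.75 = 3.75 • See Box 3.2 on page 81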
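The interpolation can be sketched in code. The individual scores are not listed in these notes, so the data below are assumed values chosen to match the description (eight scores: three below 3.5, four at X = 4, one above 4.5). The function name and the `width` parameter are also assumptions of this sketch.

```python
def interpolated_median(scores, width=1.0):
    """Median of a continuous variable, interpolating within the interval
    (real limits) of the score that contains the 50% point."""
    ordered = sorted(scores)
    n = len(ordered)
    middle = ordered[n // 2]                 # score whose interval holds the 50% point
    lower_limit = middle - width / 2         # lower real limit, e.g. 3.5 for X = 4
    below = sum(1 for x in ordered if x < middle)   # scores below the interval
    at = ordered.count(middle)               # scores inside the interval
    return lower_limit + width * (n / 2 - below) / at

# Hypothetical scores consistent with the figure: 3 below, 4 at X = 4, 1 above
scores = [1, 2, 3, 4, 4, 4, 4, 6]
print(interpolated_median(scores))  # 3.75
```

Here 3 of 8 scores (37.5%) lie below 3.5, so one more of the four boxes at X = 4 is needed to reach 50%, which places the median one quarter of the way into the interval.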
Figure 3-6 A population of N = 6 scores with a mean of µ = 4. Notice that the mean does not necessarily divide the scores into two equal groups. In this example, 5 out of the 6 scores have values less than the mean. For these six scores (2, 2, 2, 3, 3, 12), the median is the middle point of the scores, in this case 2.5.
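The figure's six scores make a quick check of how the mean and median can differ. This sketch uses Python's standard `statistics` module.

```python
import statistics

scores = [2, 2, 2, 3, 3, 12]

mean = statistics.mean(scores)
median = statistics.median(scores)
below_mean = sum(1 for x in scores if x < mean)

print(mean)        # 4
print(median)      # 2.5
print(below_mean)  # 5
```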
The Mode • The most frequently occurring category or score in the distribution • Peak in a frequency distribution graph • For data measured on any scale of measurement: • nominal • ordinal • interval • ratio
Table 3.4 Favorite restaurants named by a sample of n = 100 students. Caution: The mode is a score or category, not a frequency. For this example, the mode is Luigi’s, not f = 42.
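The caution above can be illustrated with a small sketch. Only Luigi's and its frequency of 42 come from the table; the other category names and counts are hypothetical placeholders that bring the total to n = 100.

```python
from collections import Counter

# "Restaurant B" and "Restaurant C" and their counts are hypothetical
votes = ["Luigi's"] * 42 + ["Restaurant B"] * 30 + ["Restaurant C"] * 28

counts = Counter(votes)
mode_category, mode_frequency = counts.most_common(1)[0]

print(mode_category)   # Luigi's  (the mode is the category)
print(mode_frequency)  # 42       (this is only its frequency)
```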
Bimodal Distributions • It is possible for a distribution to have more than one mode. A distribution with two modes is called bimodal (one with more than two is multimodal). • In addition, the term "mode" is often used to describe a peak in a distribution that is not really the highest point. Thus, a distribution may have a major mode at the highest peak and a minor mode at a secondary peak in a different location.
Figure 3-7 Bimodal distribution. A frequency distribution for tone identification scores. An example of a bimodal distribution.
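A distribution with two equally high peaks can be detected with `statistics.multimode` (Python 3.8+). The scores below are hypothetical and are not the tone identification data from the figure.

```python
import statistics

# Hypothetical scores with two equally frequent values
scores = [2, 3, 3, 3, 4, 5, 6, 7, 7, 7, 8]

modes = statistics.multimode(scores)
print(modes)  # [3, 7]
```

Note that `multimode` reports only values tied for the highest frequency. A minor mode at a lower secondary peak would have to be identified from the frequency distribution graph.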
Selecting a Measure of Central Tendency • Mean is preferred • uses every score in the distribution • commonly used in inferential statistics • Situations where you cannot or should not compute a mean at all • nominal data • ordinal data (usually inappropriate) • Situations where the mean does not provide a good, representative value • Extreme scores (see fig 3.8)
Figure 3-8 Frequency distribution of errors committed before reaching learning criterion. An example of the effect of an extreme score on the mean, producing a skewed distribution. This is an obvious example, but what if the scores were only a little skewed? Statistics to the rescue: there are tests for skewness. • Mean: M = ΣX/n = 203/10 = 20.3 • Median is 11.5 • Mode is 11.0
Selecting a Measure of Central Tendency • Situations where the mean does not provide a good, representative value • Missing values • Undetermined values (see Table 3.5) • When using cutoff times for measuring task performance • How long does it take to solve a puzzle? • Open-ended distributions • For example: a score category of 5 or more pizzas • Cannot calculate the mean • Plan ahead; try to get quantitative values
Table 3.5 Amount of time to complete puzzle. There is an undetermined value in the data set because Person 6 did not complete the puzzle: after 60 minutes the researcher stopped the test. There is no value for the 6th person, so the mean cannot be calculated. The median is 12.5, which falls between the 3rd and 4th scores. • Note 1: Some researchers record the maximum time, referring to it as “timed out” (in this case 60), which produces an extreme score instead of a missing value. This is generally a bad idea, even for experienced researchers, because the value is really unknown. • Note 2: When it is one or two scores out of a set of one hundred scores, some researchers treat these as random missing values and remove the person (i.e., remove #6). However, person #6 really did work on the puzzle, and this person is part of the sample.
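The sketch below shows why the median survives an undetermined value while the mean does not. The five completion times are hypothetical (the table's values are not reproduced in these notes) and were picked so that the median matches the 12.5 stated above. Representing the unfinished attempt as `math.inf` is an assumption of this sketch, not a method from the slides.

```python
import math
import statistics

# Hypothetical times in minutes; Person 6 never finished, so the true
# value is unknown and is stood in for by infinity (an assumption)
times = [8, 11, 12, 13, 17, math.inf]

median_time = statistics.median(times)   # halfway between the 3rd and 4th scores
mean_time = sum(times) / len(times)      # not a usable number

print(median_time)  # 12.5
print(mean_time)    # inf
```

The median depends only on the order of the scores, so it is the same whether Person 6 would have taken 61 minutes or several hours.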
The Median • One advantage of the median is that it is relatively unaffected by extreme scores. • The median tends to stay in the "center" of the distribution even when • the distribution is very skewed by a few extreme scores • there are undetermined values (see Table 3.5) • the distribution is open-ended • Use the median for an ordinal measurement scale • In these situations, the median serves as a good alternative to the mean. • Used as a supplemental measure of central tendency that is reported along with the mean.
The Mode • The only measure of central tendency that can be used for data measured on a nominal scale. • Discrete variables are whole numbers • such as the number of children in a family • Calculating the mean can produce fractions • families have 2.33 children • The mode is more sensible but less precise • the typical family has 2 children • Used as a supplemental measure of central tendency that is reported along with the mean or the median. • Helps to describe the shape of the distribution
Central Tendency and the Shape of the Distribution • Mean, the median, and the mode are systematically related to each other. • In a unimodal symmetrical distribution, the mode, mean, and median will all have the same value. (see fig 3.11) • In a skewed distribution (see fig 3.12) • mode will be located at the peak on one side • the mean usually will be displaced toward the tail on the other side. • The median is usually located between the mean and the mode.
Figure 3-11 (p. 96) Measures of central tendency for three symmetrical distributions: normal, bimodal, and rectangular.
Figure 3-12 (p. 96) Measures of central tendency for skewed distributions.