Measures of Central Tendency. Prepared by: Josefina V. Almeda Professor and College Secretary School of Statistics University of the Philippines, Diliman August 2009. Measures of Central Tendency. OUTLINE Mean Median Mode. Describing Data with Summary Measures. Summary Measures. Variation.
Measures of Central Tendency. Prepared by: Josefina V. Almeda, Professor and College Secretary, School of Statistics, University of the Philippines, Diliman. August 2009
Measures of Central Tendency OUTLINE • Mean • Median • Mode
Describing Data with Summary Measures. Summary Measures: • Central Tendency: Mean, Median, Mode • Other Locations: Quartiles • Variation: Range, Variance, Standard Deviation, Coefficient of Variation
Measures of Central Tendency • A measure of central tendency is an index of the central location of a distribution. It is a single value that identifies the “center” of the data, or the typical value. • Precise yet simple • Most representative value of the data
The Arithmetic Mean The arithmetic mean is the sum of all observed values divided by the total number of observations. The population mean for a finite population with N elements, denoted by the lowercase Greek letter mu ($\mu$), is
$$\mu = \frac{\sum_{i=1}^{N} X_i}{N}$$
where $X_i$ is the i-th observed value.
The sample mean for a finite sample with n elements, denoted by $\bar{X}$, is
$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$$
The population mean is a parameter while the sample mean is a statistic.
Examples of Arithmetic Mean • Given the number of children of a sample of 10 currently married women: 3, 4, 2, 5, 1, 3, 4, 2, 3, 3, find the mean number of children of the currently married women. • Solution: We compute the sample mean:
$$\bar{X} = \frac{3 + 4 + 2 + 5 + 1 + 3 + 4 + 2 + 3 + 3}{10} = \frac{30}{10} = 3$$
The mean number of children of the currently married women is 3.
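The computation in this example can be checked with a few lines of Python. This is only a minimal sketch; the names `children` and `sample_mean` are illustrative and do not come from the slides.

```python
# Number of children of the 10 currently married women (data from the example).
children = [3, 4, 2, 5, 1, 3, 4, 2, 3, 3]


def sample_mean(values):
    """Return the arithmetic mean: the sum of the values divided by their count."""
    if not values:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(values) / len(values)


mean_children = sample_mean(children)
print(mean_children)  # 3.0
```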
Given the incidence of alleged human rights violations by region for the year 2004, find the mean incidence of alleged human rights violations.
Solution: We compute the population mean incidence of alleged human rights violations. The mean incidence of alleged human rights violations per region is 59.3.
Properties of the Mean 1. The mean is the most common measure of central tendency since it employs every observed value in the calculation. 2. It may or may not be an actual observed value in the data set. 3. We may compute the mean for both ungrouped and grouped data sets. 4. Extreme observations affect the value of the mean, especially if the number of observations is small.
5. The value of the mean always exists and is unique. 6. It is a widely understood measure of central tendency. 7. We use the mean when the distribution is not highly asymmetrical, when we give equal importance to the effect of all observed values, and when we will compute other statistics later on.
The Weighted Mean * If the individual values do not have equal importance, we compute the weighted mean. * We assign weights to the observed values of the data set before we can get the weighted mean.
Formula of Weighted Mean If we assign a weight $W_i$ to each observation $X_i$, where i = 1, 2, …, n and n is the number of observations in the sample, then the weighted sample mean is given by
$$\bar{X}_w = \frac{\sum_{i=1}^{n} W_i X_i}{\sum_{i=1}^{n} W_i}$$
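The weighted mean formula can be sketched in Python as follows. The function name `weighted_mean` and the grades and units used here are hypothetical, chosen only to illustrate the arithmetic; they are not the data of the scholarship example.

```python
def weighted_mean(values, weights):
    """Return sum(W_i * X_i) / sum(W_i) for paired values and weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("the weights must not sum to zero")
    return sum(w * x for x, w in zip(values, weights)) / total_weight


# Hypothetical grades (X_i) and course units (W_i), for illustration only.
grades = [1.0, 1.5, 2.0]
units = [3, 5, 2]

average = weighted_mean(grades, units)
print(average)  # 1.45, i.e. (3*1.0 + 5*1.5 + 2*2.0) / 10
```

When all weights are equal, the weighted mean reduces to the ordinary arithmetic mean.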
Example of Weighted Mean Suppose a government agency gives scholarship grants to employees taking graduate studies. Courses in graduate studies earn credits of 1, 2, 3, 4, or 5 units. They can get a partial scholarship for the next semester if they get a weighted average of 1.5 to 1.75 and a full scholarship if the average is better than 1.5, which means an average of 1.0 to 1.49. What kind of scholarship will the 2 employees get given their grades for the previous semester?
Consider the grades of the two employees in the previous semester:
Solution: We let the units be the weights $W_i$ and the grades be the values $X_i$, and compute the weighted average of each employee. Thus, employee A will get a partial scholarship while employee B will get a full scholarship.
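The Combined Population Mean We can obtain the mean of several data sets given the means and number of observations of each data set. This is what we call the combined mean. Suppose that k finite populations having $N_1, N_2, \ldots, N_k$ measurements, respectively, have means $\mu_1, \mu_2, \ldots, \mu_k$. The combined population mean, $\mu_c$, of all the populations is
$$\mu_c = \frac{\sum_{i=1}^{k} N_i \mu_i}{\sum_{i=1}^{k} N_i}$$
If random samples of size $n_1, n_2, \ldots, n_k$, selected from these k populations, have the means $\bar{X}_1, \bar{X}_2, \ldots, \bar{X}_k$, respectively, the combined sample mean, $\bar{X}_c$, of all the sample data is
$$\bar{X}_c = \frac{\sum_{i=1}^{k} n_i \bar{X}_i}{\sum_{i=1}^{k} n_i}$$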
Example of the Combined Mean The Philippines had 6028 male child deaths and 4948 female child deaths for the age group 1-4 in 2002. The average number of deaths for male and female children is 376.8 and 309.2, respectively. What is the combined population mean for both sexes? Solution: We let $N_1 = 6028$, $\mu_1 = 376.8$, $N_2 = 4948$, and $\mu_2 = 309.2$. Thus,
$$\mu_c = \frac{N_1 \mu_1 + N_2 \mu_2}{N_1 + N_2} = \frac{(6028)(376.8) + (4948)(309.2)}{6028 + 4948} = \frac{3801272}{10976} \approx 346$$
The average number of deaths for children 1-4 years old for both sexes is 346.
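The combined mean is a weighted mean in which each group mean is weighted by its group size. A minimal Python sketch of the example, with the illustrative (not original) function name `combined_mean`:

```python
def combined_mean(means, sizes):
    """Return sum(N_i * mu_i) / sum(N_i) for k groups."""
    if len(means) != len(sizes):
        raise ValueError("means and sizes must have the same length")
    return sum(n * m for m, n in zip(means, sizes)) / sum(sizes)


# Values from the example: group means and group sizes for males and females.
group_means = [376.8, 309.2]
group_sizes = [6028, 4948]

mu_c = combined_mean(group_means, group_sizes)
print(round(mu_c))  # 346
```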
The Median * The median divides an ordered set of observations into two equal parts so that half of the observations are below its value and the other half are above its value. * It is the positional middle of the array. Example: If the median annual family income of 500 families is P185,000, then half of the 500 families (250 families) have an annual family income lower than P185,000 and the other half (250 families) have an annual family income higher than P185,000.
Computation of the Median * The first step in finding the median, denoted by Md, is to arrange the observations in an array. Case 1: If the number of observations n is odd, the median is the middle observed value in the array. Case 2: If the number of observations n is even, the median is the average of the two middle observed values in the array.
Examples of the Median • The annual per capita poverty thresholds, in pesos, of the different regions of the Philippines are as follows: 15,693, 13,066, 12,685, 11,128, 13,760, 13,657, 11,995, 11,372, 11,313, 9,656, 9,518, 9,116, 10,503, 10,264, 10,466, 10,896, 12,192. Find the median. • Solution: We arrange the annual per capita poverty thresholds of the 17 regions of the Philippines from lowest to highest.
Array: 9116, 9518, 9656, 10264, 10466, 10503, 10896, 11128, 11313, 11372, 11995, 12192, 12685, 13066, 13657, 13760, 15693 Since n = 17 is odd, the median is the middle observed value in the array, which is the 9th value. That is, the median is P11,313.00. Interpretation: Half of the 17 regions have an annual per capita poverty threshold lower than P11,313 and the other half have an annual per capita poverty threshold higher than P11,313.
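The two cases for computing the median can be sketched in Python. The function name `median_of` is illustrative; the odd case uses the poverty threshold data from the example, while the four-value list for the even case is made up for illustration.

```python
import statistics


def median_of(values):
    """Sort the values into an array and return the positional middle."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Case 1: n is odd, so the median is the middle observed value.
        return ordered[mid]
    # Case 2: n is even, so the median is the average of the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2


# Annual per capita poverty thresholds of the 17 regions (from the example).
thresholds = [15693, 13066, 12685, 11128, 13760, 13657, 11995, 11372, 11313,
              9656, 9518, 9116, 10503, 10264, 10466, 10896, 12192]

print(median_of(thresholds))    # 11313
print(median_of([4, 1, 3, 2]))  # 2.5 (hypothetical even-sized data set)

# The standard library follows the same rule.
assert median_of(thresholds) == statistics.median(thresholds)
```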
• The following are the numbers of telephone lines of 16 regions for the year 2004: 2799079, 94079, 190335, 42860, 410841, 1049413, 125157, 427497, 470299, 151650, 35945, 147513, 295334, 82616, 117116, 33315. Find the median. Array: 33315, 35945, 42860, 82616, 94079, 117116, 125157, 147513, 151650, 190335, 295334, 410841, 427497, 470299, 1049413, 2799079 Since n = 16 is even, the median is the average of the two middle observed values (the 8th and 9th) in the array:
$$Md = \frac{147513 + 151650}{2} = 149581.5$$
Interpretation: 50% of the 16 regions have fewer than 149581.5 telephone lines and the upper 50% have more than 149581.5 telephone lines.
Characteristics of the Median • The median is a positional measure. This implies that extreme values affect the median less than the mean. • We use the median as a measure of central tendency if we wish to find the exact middle value of the distribution, when there are extreme observed values, and when the frequency distribution table has open-ended class intervals.
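The claim that extreme values affect the median less than the mean can be illustrated with a small Python sketch. The data below are hypothetical and chosen only to make the effect easy to see.

```python
import statistics

# Hypothetical data set, for illustration only.
typical = [10, 12, 13, 15]
with_extreme = typical + [100]  # add one extreme observation

mean_before = statistics.mean(typical)
median_before = statistics.median(typical)
mean_after = statistics.mean(with_extreme)
median_after = statistics.median(with_extreme)

print(mean_before, median_before)  # 12.5 12.5
print(mean_after, median_after)    # 30 13
```

A single extreme observation moves the mean from 12.5 to 30, while the median moves only from 12.5 to 13.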
The Mode * The mode is the observed value that occurs with the greatest frequency in a data set. * We determine the mode by counting the frequency of each observed value and finding the observed value with the highest frequency of occurrence. * Generally, the mode is a less popular measure of central tendency than the mean and the median.
Examples of Mode • Given the data on number of children of 12 currently married women: 2, 2, 1, 1, 1, 3, 3, 4, 4, 2, 2, 2. Find the mode. By inspection, the mode is 2. Interpretation: The most frequent number of children among the 12 currently married women is 2. • Given the data on number of cases resolved by 10 lawyers: 5, 4, 1, 1, 3, 3, 2, 1, 3, 0. Find the mode. The modes are 1 and 3, since each occurs three times.
• Given the data on number of cases handled by 14 PAO lawyers: 629, 645, 356, 656, 231, 455, 412, 289, 444, 452, 642, 225, 335, 411. Find the mode. Since no value occurs more than once, there is no mode.
Characteristics of the Mode • The mode gives the most typical value of a set of observations. • A few low or high values do not easily affect the mode. • The mode is sometimes not unique, and sometimes it does not exist. • We can have several modes for one data set. If there is one mode, the data set is unimodal. If there are two modes, we call it bimodal. If there are more than two modes, we call it multimodal. • The value of the mode is always one of the observed values in the data set. • We can get the mode for both quantitative and qualitative types of data.
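The counting procedure for the mode can be sketched in Python using the three quantitative examples given earlier. The function name `modes` is illustrative, and the convention of reporting no mode when no value repeats is an assumption made to match those examples.

```python
from collections import Counter


def modes(values):
    """Return all values tied for the highest frequency, in sorted order.

    Assumed convention: if no value occurs more than once, there is no mode
    and an empty list is returned.
    """
    if not values:
        return []
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []
    return sorted(v for v, c in counts.items() if c == highest)


children = [2, 2, 1, 1, 1, 3, 3, 4, 4, 2, 2, 2]
cases_resolved = [5, 4, 1, 1, 3, 3, 2, 1, 3, 0]
cases_handled = [629, 645, 356, 656, 231, 455, 412, 289, 444, 452,
                 642, 225, 335, 411]

print(modes(children))        # [2]      unimodal
print(modes(cases_resolved))  # [1, 3]   bimodal
print(modes(cases_handled))   # []       no mode
```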
Example of Mode for Qualitative Data Given the number of cellular mobile telephone subscribers for the year 2001, what is the mode?
Telephone Operator      Number of Subscribers
EXTELCOM                194,452
GLOBE TELECOM           5,405,415
ISLACOM                 181,614
PILTEL                  1,483,838
SMART                   4,893,844
The mode is GLOBE TELECOM, the operator with the highest number of subscribers.
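For qualitative data, the mode is the category with the highest frequency. A minimal Python sketch using the subscriber counts from this example (the variable names are illustrative):

```python
# Subscribers per telephone operator in 2001 (from the example).
subscribers = {
    "EXTELCOM": 194452,
    "GLOBE TELECOM": 5405415,
    "ISLACOM": 181614,
    "PILTEL": 1483838,
    "SMART": 4893844,
}

# The modal category is the key with the largest frequency.
modal_operator = max(subscribers, key=subscribers.get)
print(modal_operator)  # GLOBE TELECOM
```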
Round-Off Rule * In performing calculations, we round off only the final answer and not the intermediate values. * The final answer should have one more decimal place than the original observations. Example: The mean of the data set 3, 4, and 6 is 4.3333… Since the original observed values are whole numbers, we round this figure to the nearest tenth. Thus, the mean becomes 4.3. Example: If the original observed values have one decimal place, like 4.5, 6.3, 7.7, 8.9, then we round the final answer to two decimal places. Thus, if we get the mean, the final answer is 6.85.
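The round-off rule can be sketched in Python as follows. The function name `rounded_mean` and its `data_decimals` parameter (the number of decimal places in the original observations) are illustrative assumptions, not part of the slides.

```python
def rounded_mean(values, data_decimals):
    """Return the mean rounded to one more decimal place than the data.

    The intermediate sum and quotient are left unrounded; only the final
    answer is rounded.
    """
    mean = sum(values) / len(values)
    return round(mean, data_decimals + 1)


print(rounded_mean([3, 4, 6], 0))             # 4.3
print(rounded_mean([4.5, 6.3, 7.7, 8.9], 1))  # 6.85
```

Python's `round` works on binary floating-point numbers, so values that fall exactly halfway between two candidates may not round the way hand computation suggests; the `decimal` module is an option when exact decimal rounding matters.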