Central Tendency Statistics 2126
Introduction • As useful as histograms and such are, it would be nice to describe data in terms of central tendency • A single number to describe a sample • BTW, the sample is a subset of the population • We are almost always dealing with samples
Back when I was in first year… • 77 80 83 70 90 • Would be nice to describe how I did in first year with a number • Well the one we are all pretty used to is the mean or arithmetic average • The sum of all of the data points, divided by the number of data points
The Mean • Sort of a balancing point in the data • Simply adding up the numbers and dividing by the number of observations (n) • x̄ (x bar) is the symbol for the sample mean • We might want to consider my first year marks as a population
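As a quick sketch of that calculation, here are the five first-year marks run through Python. The variable names are just illustrative.

```python
# The five first-year marks
marks = [77, 80, 83, 70, 90]

# Mean = sum of the data points divided by the number of data points (n)
n = len(marks)
x_bar = sum(marks) / n

print(x_bar)  # 80.0
```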
For a population • The formula does not change, but the symbol does • We use statistics for samples • We use parameters for populations • μ = ΣX / N • The formula is the same really
The mean is not mean • In the population, the mean does not change • The sample, yeah it changes, sample to sample • Parameters do NOT CHANGE
However, the lecture is getting meaner • If you sample from a population you will get different values for x bar each time • We don’t care about samples in the long run, we care about populations • Calculating the population mean (μ) directly is pretty hard, umm it takes forever • It is done sometimes though: elections, the census
Samples vs. populations • A good sample will give you a killer estimate of the population mean • The census could be done via sampling actually • This is because x bar is an unbiased estimator of the population mean (μ) • It overestimates as often as it underestimates
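One way to see this is a small simulation. This is only a sketch: the population below is made up (10,000 roughly IQ-like scores), and the sample size of 25 and the 2,000 repetitions are arbitrary choices, not anything from the course.

```python
import random
import statistics

random.seed(2126)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 made-up scores (an assumption for illustration)
population = [random.gauss(100, 15) for _ in range(10_000)]
mu = statistics.mean(population)  # the parameter: fixed for this population

# Draw many samples of n = 25 and record x bar each time
sample_means = [
    statistics.mean(random.sample(population, 25))
    for _ in range(2_000)
]

# Each x bar differs from sample to sample, but on average they land near mu
mean_of_x_bars = statistics.mean(sample_means)
print(round(mu, 2), round(mean_of_x_bars, 2))

# Share of samples whose x bar overestimates mu: roughly half
share_over = sum(m > mu for m in sample_means) / len(sample_means)
print(round(share_over, 2))
```

Any single x bar can miss by a fair bit. It is the long-run average of the x bars that sits on the population mean.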
Weighted averages sometimes • Some assignments worth more than others for example • There are other measures of central tendency though
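A minimal sketch of a weighted average, assuming a hypothetical course with three pieces of work. The marks and weights are invented for illustration.

```python
# Hypothetical marks and weights (percent of the final grade), made up for this example
marks = [80, 70, 90]      # assignment, midterm, final exam
weights = [20, 30, 50]    # the final counts the most

# Weighted mean: multiply each mark by its weight, add up, divide by the total weight
weighted_mean = sum(m * w for m, w in zip(marks, weights)) / sum(weights)

# Plain mean for comparison: every mark counts equally
plain_mean = sum(marks) / len(marks)

print(weighted_mean)  # 82.0
print(plain_mean)     # 80.0
```

The weighted mean comes out higher here because the best mark carries the biggest weight.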
The median • No need for a formula here • 50th percentile • Midpoint • Half below, half above
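No formula, but there is a procedure: sort the values and take the middle one, or average the two middle ones when n is even. A short sketch using the first-year marks from earlier (the function name `median_of` is just illustrative):

```python
import statistics

def median_of(values):
    """Sort the values and return the midpoint."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                        # odd n: the middle value
    return (ordered[mid - 1] + ordered[mid]) / 2   # even n: average the two middle values

marks = [77, 80, 83, 70, 90]
print(median_of(marks))          # 80
print(statistics.median(marks))  # 80, the standard library agrees
```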
The mode • The most common observation • Virtually useless • Example 25 25 37 42 25 • The mode is 25 • Tough eh…
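For completeness, a sketch of finding the mode by counting, using the example data from the slide:

```python
from collections import Counter

data = [25, 25, 37, 42, 25]

# Count how often each value occurs, then take the most frequent one
counts = Counter(data)
mode, frequency = counts.most_common(1)[0]

print(mode, frequency)  # 25 3
```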
If…. • If we have a unimodal, symmetrical distribution, then the median = mean = mode • Say IQ in the population, all measures of central tendency = 100
Normal distribution • You don’t have to get a normal distribution when you have a unimodal, symmetrical distribution • It is probably the most common one though
Why? • Why do we need all of these measures of central tendency? • They all have different properties • The mode is useless… • So let’s move on
Median vs. the Mean • Say you have five numbers • 1 2 3 4 5 • The mean is 3, as is the median • (BTW, the mode is umm well there are 5 of them) • Add another value • 750
Mean vs median in a final all out battle to the death • Now the mean is 127.5 • So adding an extreme value really affects the mean • Median is now umm let’s see • 1 2 3 4 5 750 • 3.5 • cool
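The same battle, sketched in Python with the numbers from the slides:

```python
import statistics

values = [1, 2, 3, 4, 5]
mean_before = statistics.mean(values)      # 3
median_before = statistics.median(values)  # 3

# Add the extreme value
values.append(750)
mean_after = statistics.mean(values)       # 765 / 6 = 127.5
median_after = statistics.median(values)   # (3 + 4) / 2 = 3.5

print(mean_before, median_before)
print(mean_after, median_after)
```

One extreme value drags the mean from 3 to 127.5, while the median only moves from 3 to 3.5.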
Median for the win • So sometimes it is good • Think about say union negotiations • Both sides can talk about "average" salary, one quoting the mean and the other the median • Both are right! • In this case the median is more useful
So the median is useful • Especially when there are outliers • And you want to leave them in anyway • When you want to take all of the scores into account though you have to use the mean really • All of our techniques are about means • The median is, pretty much, a dead end statistically
Running out of pithy titles • The mean is most useful for symmetrical distributions • Most distributions we deal with will be like this • Most are pretty much symmetrical, more or less