INF 397C • Introduction to Research in Library and Information Science • Fall, 2009 • Day 3
Standard Deviation • σ = SQRT(Σ(X - µ)²/N)
New formula for σ • σ = SQRT(Σ(X - µ)²/N) • HARD to calculate when you have a LOT of scores. Gotta do that subtraction with every one! • New, “computational” equation • σ = SQRT((Σ(X²) – (ΣX)²/N)/N) • We’ll convince ourselves it gives us the same answer in just a minute.
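One way to convince ourselves is to run both formulas on the same scores. The sketch below is only an illustration: the function names and the list of scores are made up for this example, not taken from the course materials.

```python
import math

def sd_definitional(scores):
    """Population SD from the definition: SQRT(sum((X - mu)^2) / N)."""
    n = len(scores)
    mu = sum(scores) / n
    return math.sqrt(sum((x - mu) ** 2 for x in scores) / n)

def sd_computational(scores):
    """Population SD from the computational form: SQRT((sum(X^2) - (sum X)^2 / N) / N)."""
    n = len(scores)
    sum_x = sum(scores)
    sum_x_squared = sum(x * x for x in scores)
    return math.sqrt((sum_x_squared - sum_x ** 2 / n) / n)

# Hypothetical scores, chosen only so the arithmetic is easy to follow by hand.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

sd_a = sd_definitional(scores)
sd_b = sd_computational(scores)
print(sd_a, sd_b)  # both print 2.0 for these scores
```

The computational version needs only two running totals (ΣX and ΣX²), which is why it is easier by hand when there are a lot of scores.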
So far . . . • . . . we’ve talked of summarizing ONE distribution of scores. • By ordering the scores. • By organizing them in graphs/tables/charts. • By calculating a measure of central tendency and a measure of dispersion. • What happens when we want to compare TWO distributions of scores?
“Now, why would I want to do that”? • Is your child taller or heavier? • Is this month’s SAT test any easier or harder than last month’s? • Is my 91 in my Research Methods class better than my 95 in my Digital Libraries class? • Is the new library lay-out better than the old one? • Can more employees sign up, more quickly, for benefits with our new intranet site than with our old one? • Did my class perform better on the TAKS test than they did on the TAAS test?
Well? • COULD it be the case that your 91 in your Research Methods class is better than your 95 in your Digital Libraries class? • How?
What if . . . • The mean in Research Methods was 50, and the mean in Digital Libraries was 99? • (What, besides the fact that everyone else is trying to drop the Research class!) • So:
The Point • As I said last week, you need to know BOTH a measure of central tendency AND a measure of spread to understand a distribution. • BUT STILL, this can be convoluted . . . • “Well, daughter, how are you doing in grad school this semester”?
“Well, Mom . . . • “. . . I have a 91 in Research Methods but the mean is 50 and the standard deviation is 12. But I only have a 95 in Digital Libraries, whereas the mean in that class is 99 with a standard deviation of 1.” • Of course, your mom’s reaction will be, “Just call home more often, dear.”
Wouldn’t it be nice . . . • . . . if there could be one score we could use for BOTH classes, for BOTH the TAKS test and the TAAS test, for BOTH your child’s height and weight? • There is – and it’s called the “standard score,” or “z score.” (Get ready for another headache.)
Standard Score • z = (X - µ)/σ • “Hunh”? • Each score can be expressed as the number of standard deviations it is from the mean of its own distribution. • “Hunh”? • (X - µ) – This is how far the score is from the mean. (Note: Could be negative! No squaring, this time.) • Then divide by the SD to figure out how many SDs you are from the mean.
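As a quick sketch of the formula, here it is applied to the two grades from the "Well, Mom" slide (91 with µ = 50 and σ = 12; 95 with µ = 99 and σ = 1). The function name `z_score` is just a label chosen for this example.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that the score x lies from the mean mu."""
    return (x - mu) / sigma

# Values quoted on the "Well, Mom" slide.
z_research = z_score(91, mu=50, sigma=12)
z_digital = z_score(95, mu=99, sigma=1)

print(f"Research Methods:  z = {z_research:+.2f}")   # +3.42
print(f"Digital Libraries: z = {z_digital:+.2f}")    # -4.00
```

On the z scale the 91 sits far above its class mean while the 95 sits far below its own, which is the comparison the raw scores hide.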
Z scores (cont’d.) • z = (X - µ)/σ • Notice, if your score (X) equals the mean, then z is, what? • If your score equals the mean PLUS one standard deviation, then z is, what? • If your score equals the mean MINUS one standard deviation, then z is, what?
So . . . z = (X - µ)/σ • Kris had a 76 on both tests. • Test 1 - µ = 61, σ = 9 • So her z score was (76-61)/9 or 15/9 or 1.67. So we say that Kris’s score was 1.67 standard deviations above the mean. • Test 2 - µ = 83, σ = 5.4 • So her z score was (76-83)/5.4 or -7/5.4 or –1.3. So we say that Kris’s score was 1.3 standard deviations BELOW the mean. • Given what I said earlier about two-thirds of the scores being within one standard deviation of the mean . . . . • Wouldn’t it be nice if we knew exactly how many . . . ?
z = (X - µ)/σ • If I tell you that the average IQ score is 100, and that the SD of IQ scores is 16, and that Bob’s IQ score is 2 SD above the mean, what’s Bob’s IQ? • If I tell you that your 75 was 1.5 standard deviations below the mean of a test that had a mean score of 90, what was the SD of that test?
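Both questions are the same formula solved for a different unknown: X = µ + z·σ in the first case, and σ = (X - µ)/z in the second. A minimal sketch, with helper names invented for this example:

```python
def raw_score(z, mu, sigma):
    """Solve z = (X - mu) / sigma for X."""
    return mu + z * sigma

def sd_from_score(x, mu, z):
    """Solve z = (X - mu) / sigma for sigma (z must not be zero)."""
    return (x - mu) / z

bob_iq = raw_score(z=2, mu=100, sigma=16)
test_sd = sd_from_score(x=75, mu=90, z=-1.5)  # "below the mean" means z is negative

print(bob_iq)   # 132
print(test_sd)  # 10.0
```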
Notice . . . • The mean of all z scores (for a particular distribution) will be zero, as will be their sum. • With z scores, we transform raw scores into standard scores. • These standard scores are RELATIVE distances from their (respective) means. • All are expressed in units of σ.
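These properties are easy to check numerically. The scores below are hypothetical; any list with some spread should behave the same way, up to floating-point rounding.

```python
import statistics

# Hypothetical raw scores, used only to illustrate the transformation.
scores = [61, 70, 76, 83, 90]

mu = statistics.mean(scores)
sigma = statistics.pstdev(scores)  # population SD (divides by N)

z_scores = [(x - mu) / sigma for x in scores]

z_sum = sum(z_scores)
z_mean = statistics.mean(z_scores)
z_sd = statistics.pstdev(z_scores)

print(z_sum, z_mean)  # both zero, up to rounding error
print(z_sd)           # 1, up to rounding error: z scores are in units of sigma
```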
z scores – table values • z = (X - µ)/σ • It is often the case that we want to know “What percentage of the scores are above (or below) a certain other score”? • Asked another way, “What is the area under the curve, beyond a certain point”? • THIS is why we calculate a z score, and the way we do it is with the z table, on p. 362 of Hinton.
z table practice • What percentage of scores fall above a z score of 1.0? • What percentage of scores fall between the mean and one standard deviation above the mean? • What percentage of scores fall within two standard deviations of the mean? • 200 people took a test. My z score is .1. How many scores did I “beat”? • My z score is .01. How many scores did I “beat”? • My score was higher than only 3% of the class. (I suck.) What was my z score? • Oooh, get this. My score was higher than only 3% of the class. The mean was 50 and the standard deviation was 10. What was my raw score?
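If no z table is handy, the same areas can be looked up in software. The sketch below uses `NormalDist` from Python's standard `statistics` module as a stand-in for the table, and it assumes the scores are normally distributed and that the ".01" question refers to the same 200 test takers. Table answers may differ slightly because printed tables are rounded.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# Area above z = 1.0
above_1 = 1 - std_normal.cdf(1.0)                         # about 0.1587
# Area between the mean and z = 1.0
mean_to_1 = std_normal.cdf(1.0) - std_normal.cdf(0.0)     # about 0.3413
# Area within two standard deviations of the mean
within_2 = std_normal.cdf(2.0) - std_normal.cdf(-2.0)     # about 0.9545

# Number of the 200 scores that fall below a given z score
beat_z_10 = 200 * std_normal.cdf(0.1)    # about 108
beat_z_01 = 200 * std_normal.cdf(0.01)   # about 101

# z score that sits above only 3% of scores, and the matching raw score
z_3pct = std_normal.inv_cdf(0.03)        # about -1.88
raw_3pct = 50 + z_3pct * 10              # about 31.2

for name, value in [("above z=1", above_1), ("mean to z=1", mean_to_1),
                    ("within 2 SD", within_2), ("beat (z=.1)", beat_z_10),
                    ("beat (z=.01)", beat_z_01), ("z at 3%", z_3pct),
                    ("raw at 3%", raw_3pct)]:
    print(f"{name:>12}: {value:.4f}")
```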
Probability • Remember all those decisions we talked about, last week. • VERY little of life is certain. • It is PROBABILISTIC. (That is, something might happen, or it might not.)
Prob. (cont’d.) • Life’s a gamble! • Just about every decision is based on probable outcomes. • None of you raised your hands in Week 1 when I asked for “statistical wizards.” Yet every one of you does a pretty good job of navigating an uncertain world. • None of you touched a hot stove (on purpose). • All of you made it to class.
Probabilities • Always between zero and one, inclusive. • Something with a probability of “one” will happen. (e.g., Death, Taxes). • Something with a probability of “zero” will not happen. (e.g., My becoming a Major League Baseball player). • Something that’s unlikely has a small, but still positive, probability. (e.g., probability of one particular other person having the same birthday as you is 1/365 = .0027, or .27%.)
Just because . . . • . . . There are two possible outcomes, doesn’t mean there’s a “50/50 chance” of each happening. • When driving to school today, I could have arrived alive, or been killed in a fiery car crash. (Two possible outcomes, as I’ve defined them.) Not equally likely. • But the odds of a flipped coin being “heads,” . . . .
Prob (cont’d.) • Probability of something happening is • # of “successes” / # of all (equally likely) events • P(one flip of a coin landing heads) = ½ = .5 • P(one die landing as a “2”) = 1/6 = .167 • P(some score in a distribution of scores is greater than the median) = ½ = .5 • P(some score in a normal distribution of scores is greater than the mean but has a z score of 1 or less) = ? • P(drawing a diamond from a complete deck of cards) = ?
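The "successes over all events" recipe can be sketched with exact fractions. The helper name `probability` is made up for this example, the recipe assumes every outcome is equally likely, and the normal-curve question is answered with `NormalDist` from the standard library rather than a printed z table.

```python
from fractions import Fraction
from statistics import NormalDist

def probability(successes, all_events):
    """P = (# of successes) / (# of equally likely events), as an exact fraction."""
    return Fraction(successes, all_events)

p_heads = probability(1, 2)       # one flip of a fair coin
p_die_two = probability(1, 6)     # one fair die showing a 2
p_diamond = probability(13, 52)   # 13 diamonds in a complete 52-card deck

# Area under the normal curve between the mean (z = 0) and z = 1.
std_normal = NormalDist()
p_mean_to_z1 = std_normal.cdf(1.0) - std_normal.cdf(0.0)

print(p_heads, p_die_two, p_diamond)   # 1/2 1/6 1/4
print(round(p_mean_to_z1, 4))          # 0.3413
```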
Probabilities – and & or • From Runyon: • Addition Rule: The probability of selecting a sample that contains one or more elements is the sum of the individual probabilities for each element less the joint probability. When A and B are mutually exclusive, • p(A and B) = 0. • p(A or B) = p(A) + p(B) – p(A and B) • Multiplication Rule: The probability of obtaining a specific sequence of independent events is the product of the probability of each event. • p(A and B and . . .) = p(A) x p(B) x . . .
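Both rules can be checked by brute-force counting. The sketch below uses a fair six-sided die with two events invented for illustration (A = "roll is even", B = "roll is greater than 3"), which are deliberately NOT mutually exclusive so the joint term matters.

```python
from fractions import Fraction
from itertools import product

def prob(event, outcomes):
    """Fraction of equally likely outcomes for which event(outcome) is true."""
    outcomes = list(outcomes)
    hits = sum(1 for o in outcomes if event(o))
    return Fraction(hits, len(outcomes))

die = range(1, 7)

def is_even(n):
    return n % 2 == 0

def is_high(n):
    return n > 3

# Addition rule: p(A or B) = p(A) + p(B) - p(A and B)
p_a = prob(is_even, die)                                       # 1/2
p_b = prob(is_high, die)                                       # 1/2
p_a_and_b = prob(lambda n: is_even(n) and is_high(n), die)     # 1/3 (rolls 4 and 6)
p_a_or_b = prob(lambda n: is_even(n) or is_high(n), die)       # 2/3 (rolls 2, 4, 5, 6)
addition_rule_holds = (p_a_or_b == p_a + p_b - p_a_and_b)

# Multiplication rule for independent events: two separate rolls both showing 6.
two_rolls = list(product(die, repeat=2))
p_double_six = prob(lambda rolls: rolls == (6, 6), two_rolls)  # 1/36
p_six = prob(lambda n: n == 6, die)
multiplication_rule_holds = (p_double_six == p_six * p_six)

print(p_a_or_b, addition_rule_holds)
print(p_double_six, multiplication_rule_holds)
```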
More prob. • From Slavin: • Addition Rule: If X and Y are mutually exclusive events, the probability of obtaining either of them is equal to the probability of X plus the probability of Y. • Multiplication Rule: The probability of the simultaneous or successive occurrence of two events is the product of the separate probabilities of each event.
Yet more prob. • http://www.midcoast.com.au/~turfacts/maths.html • The product or multiplication rule. "If two chances are mutually exclusive the chances of getting both together, or one immediately after the other, is the product of their respective probabilities." (Careful: this rule actually requires INDEPENDENT events; mutually exclusive events can never occur together.) • The addition rule. "If two or more chances are mutually exclusive, the probability of making ONE OR OTHER of them is the sum of their separate probabilities."
What’s the probability . . . • That the next card is a king? • That the next card is a heart? • That the next card is a spade? • That the next card is a club and a king? • That the next card is a spade OR a heart? • That the next two cards are kings?
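These questions can be answered by counting cards. The sketch below assumes a complete, well-shuffled 52-card deck with no cards already showing, and it assumes the two kings are drawn without putting the first card back; none of that is stated on the slide.

```python
from fractions import Fraction
from itertools import product

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["clubs", "diamonds", "hearts", "spades"]
DECK = list(product(RANKS, SUITS))  # 52 (rank, suit) pairs

def p(event):
    """Probability that one card drawn from the full deck satisfies event."""
    return Fraction(sum(1 for card in DECK if event(card)), len(DECK))

p_king = p(lambda c: c[0] == "K")                               # 1/13
p_heart = p(lambda c: c[1] == "hearts")                         # 1/4
p_spade = p(lambda c: c[1] == "spades")                         # 1/4
p_club_and_king = p(lambda c: c == ("K", "clubs"))              # 1/52
p_spade_or_heart = p(lambda c: c[1] in ("spades", "hearts"))    # 1/2

# Two kings in a row, without replacement: after one king is drawn,
# 3 kings remain among 51 cards, so the second draw depends on the first.
p_two_kings = Fraction(4, 52) * Fraction(3, 51)                 # 1/221

print(p_king, p_heart, p_spade, p_club_and_king, p_spade_or_heart, p_two_kings)
```

Spade and heart are mutually exclusive, so the addition rule needs no joint term there; the two-kings case multiplies a probability by a conditional probability because the draws are not independent.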
Think this through. • What are the odds (“what are the chances”) (“what is the probability”) of getting two “heads” in a row? • Three heads in a row? • Three flips the same (heads or tails) in a row?
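One way to think it through is to list every possible sequence of flips and count. This sketch assumes a fair coin and independent flips; the helper name `p_flips` is made up for the example.

```python
from fractions import Fraction
from itertools import product

def p_flips(n, event):
    """Probability of event over all equally likely sequences of n fair coin flips."""
    sequences = list(product("HT", repeat=n))
    hits = sum(1 for seq in sequences if event(seq))
    return Fraction(hits, len(sequences))

def all_heads(seq):
    return all(flip == "H" for flip in seq)

def all_same(seq):
    return len(set(seq)) == 1

p_two_heads = p_flips(2, all_heads)     # 1/4  (HH out of HH, HT, TH, TT)
p_three_heads = p_flips(3, all_heads)   # 1/8
p_three_same = p_flips(3, all_same)     # 1/4  (HHH or TTT out of 8 sequences)

print(p_two_heads, p_three_heads, p_three_same)
```

The last answer is the addition rule at work: HHH and TTT are mutually exclusive, so 1/8 + 1/8 = 1/4.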
So then . . . • WHY were the odds in favor of having two people in our class with the same birthday? • Think about the problem! • What if there were 367 people in the class? • P(at least 2 people with same b’day) = 1.00
Happy B’day to Us • But we had 50. • Probability that the first person has a birthday: 1.00. • Prob of the second person having the same b’day: 1/365 • Prob of the third person having the same b’day as Person 1 or Person 2 is 1/365 + 1/365 – the chance that all three of them have the same birthday.
Sooooo . . . • http://en.wikipedia.org/wiki/Birthday_paradox
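Adding up all the ways people could match gets messy fast, so the usual shortcut is to compute the probability that NOBODY shares a birthday and subtract it from 1. The sketch below makes the standard simplifying assumptions (365 equally likely birthdays, no leap day, birthdays independent of one another); the function name is made up for the example.

```python
def p_shared_birthday(n, days=365):
    """Probability that at least two of n people share a birthday.

    Assumes `days` equally likely birthdays and independent people.
    """
    p_all_different = 1.0
    for k in range(n):
        # Person k+1 must avoid the k birthdays already taken.
        p_all_different *= (days - k) / days
    return 1.0 - p_all_different

p_class_of_50 = p_shared_birthday(50)
p_group_of_23 = p_shared_birthday(23)

print(f"{p_class_of_50:.3f}")   # about 0.970
print(f"{p_group_of_23:.3f}")   # about 0.507
```

With 50 people a shared birthday is very likely, and once there are more people than possible birthdays the product hits zero and the probability is exactly 1.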
Homework • Keep reading. • Practice problems. • Date for midterm. • Next week – Dr. Mary Lynn Rice Lively on qualitative research methods.