INF397C Introduction to Research in Information Studies Spring, 2005 Day 12
Context • Where we’ve been: • Descriptive statistics • Frequency distributions, graphs • Types of scales • Probability • Measures of central tendency and spread • z scores • Experimental design • The scientific method • Operational definitions • IV, DV, controls, counterbalancing, confounds • Validity, reliability • Within- and between-subject designs • Qualitative research • Gracy, Rice Lively • Inferential statistics • Dillon – standard error of the mean, t-tests • Doty – Chi square
Context (cont’d.) • Where we’re going: • More descriptive statistics • Correlation • Inferential statistics • Confidence intervals • Hypothesis testing, Type I and II errors, significance level • t-tests • ANOVA • Which method when? • Cumulative final
Standard Error of the Mean • So far, we’ve computed a sample mean (M, X bar), and used it to estimate the population mean (µ). • One thing we’ve gotten convinced of (I hope) is . . . larger sample sizes are better. • Think about it – what if I asked ONE of you, what School are you a student in? Versus asking 10 of you?
Standard Error (cont’d.) • Well, instead of picking ONE sample, and using that mean to estimate the population mean, what if we sampled a BUNCH of samples? • If we sampled ALL possible samples, the mean of the means ($\mu_M$) would equal the population mean. • Here are some other things we know: • As we get more samples, the mean of the sample means gets closer to the population mean. • Distribution of sample means tends to be normal. • We can use the z table to find the probability of a mean of a certain value. • And most importantly . . .
Standard Error (cont’d.) • We can easily work out the standard deviation of the distribution of sample means: • $SE = S_M = S/\sqrt{N}$ • So, the standard error of the mean is the standard distance that a sample mean is from the population mean. • Thus, the SE tells us how good an estimate our sample mean is of the population mean. • Note, as N gets larger, the SE gets smaller, and the better the sample mean estimates the population mean. • Hold on, we’ll use SE later.
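The two claims above (the mean of all possible sample means equals the population mean, and the standard deviation of those means is the standard error) can be checked by brute force on a toy case. This is only a sketch: the four-score population and the sample size of 2 are made up for illustration, samples are drawn with replacement, and the population σ is used where in practice we would have to estimate it with S.

```python
from itertools import product
from math import sqrt
from statistics import mean, pstdev

population = [2, 4, 6, 8]   # hypothetical tiny population
n = 2                       # size of each sample

# Every possible sample of size n, drawn with replacement (16 samples here).
all_samples = list(product(population, repeat=n))
sample_means = [mean(sample) for sample in all_samples]

mu = mean(population)                # population mean
sigma = pstdev(population)           # population standard deviation
mean_of_means = mean(sample_means)   # mean of the distribution of sample means
sd_of_means = pstdev(sample_means)   # SD of the distribution of sample means
standard_error = sigma / sqrt(n)     # the SE formula from the slide

print("population mean:", mu, "| mean of sample means:", mean_of_means)
print("SD of sample means:", round(sd_of_means, 4),
      "| sigma / sqrt(n):", round(standard_error, 4))
```

Raising `n` in this sketch shrinks `standard_error`, which is the "larger samples are better" point in numeric form.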
Two methods of making statistical inferences • Null hypothesis testing • Assume IV has no effect on DV; differences we obtain are just by chance (error variance) • If the difference is unlikely enough to happen by chance (and “enough” tends to be p < .05), then we say there’s a true difference. • Confidence intervals • We compute a confidence interval for the “true” population mean, from sample data. (95% level, usually.) • If two groups’ confidence intervals don’t overlap, we say (we INFER) there’s a true difference.
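The confidence-interval route can be sketched in a few lines. Everything specific here is an assumption for illustration: the two sets of scores are invented, the function name `confidence_interval` is hypothetical, and the critical value 2.262 is the two-tailed .05 entry for df = 9 read from a t table.

```python
from math import sqrt
from statistics import mean, stdev

def confidence_interval(scores, t_crit):
    """Return (low, high) = M +/- t_crit * SE, with SE = s / sqrt(n)."""
    m = mean(scores)
    se = stdev(scores) / sqrt(len(scores))   # stdev divides by n - 1
    return m - t_crit * se, m + t_crit * se

# Hypothetical scores for two groups of 10 participants each.
group_a = [12, 14, 11, 15, 13, 12, 14, 13, 15, 11]
group_b = [18, 17, 19, 16, 20, 18, 17, 19, 18, 18]

T_CRIT_DF9 = 2.262   # two-tailed .05 critical t for df = 9, from a t table

ci_a = confidence_interval(group_a, T_CRIT_DF9)
ci_b = confidence_interval(group_b, T_CRIT_DF9)

# Two intervals overlap when each one starts before the other ends.
overlap = ci_a[0] <= ci_b[1] and ci_b[0] <= ci_a[1]
print("95% CI, group A:", ci_a)
print("95% CI, group B:", ci_b)
print("overlap" if overlap else "no overlap, so we infer a true difference")
```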
Remember . . . • Earlier I said that there are two ways for us to be confident that something is true: • Statistical inference • Replicability • Now I’m saying there are two avenues of statistical inference: • Hypothesis testing • Confidence intervals
Effect Size • How big of an effect does the IV have on the DV? • Remember, two things that make it hard to find a difference are: • There’s a small actual difference. • There’s a lot of within-group variability (error variance). • (Demonstrate with moving distributions.)
Effect Size (cont’d.) • From S, Z, & Z: “To be able to observe the effect of the IV, given large within-group variability, the difference between the two group means must be large.” • Cohen’s $d = (\mu_1 - \mu_2)/\sigma$ • “Because effect sizes are presented in standard deviation units, they can be used to make meaningful comparisons of effect sizes across experiments using different DVs.”
Effect Size (cont’d.) • When σ isn’t known, it’s obtained by pooling the within-group variability across groups and dividing by the total number of scores in both groups. • $\sigma = \sqrt{[(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2]/N}$, where $N = n_1 + n_2$ • And, by convention: • d of .20 is considered small • d of .50 is considered medium • d of .80 is considered large
Effect Size example • Let’s look at the heights of men and women. • Just for grins, intuitively, what would you say: small, medium, or large difference? • $\mu_{women}$ = 64.6 in., $\mu_{men}$ = 69.8 in., σ = 2.8 in. • $d = (\mu_1 - \mu_2)/\sigma = (69.8 - 64.6)/2.8 = 1.86$ • So, very large difference. Indeed, one that everyone is aware of.
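The effect-size arithmetic can be sketched as follows. The function names are hypothetical, `pooled_sigma` follows the pooling formula as given on the earlier slide (dividing by N, the total number of scores), and the group sizes and standard deviations passed to it at the end are invented purely to show the call.

```python
from math import sqrt

def pooled_sigma(n1, s1, n2, s2):
    """Pooled SD: sqrt([(n1-1)*S1^2 + (n2-1)*S2^2] / N), with N = n1 + n2."""
    return sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2))

def cohens_d(mean1, mean2, sigma):
    """Difference between two means in standard deviation units."""
    return (mean1 - mean2) / sigma

def label(d):
    """Rough label using the conventional .20 / .50 / .80 benchmarks."""
    size = abs(d)
    if size >= 0.80:
        return "large"
    if size >= 0.50:
        return "medium"
    if size >= 0.20:
        return "small"
    return "below small"

# The heights example from the slide: men vs. women, sigma = 2.8 in.
d_heights = cohens_d(69.8, 64.6, 2.8)
print(round(d_heights, 2), label(d_heights))

# Hypothetical case where sigma is unknown and must be pooled from two samples.
sigma_hat = pooled_sigma(n1=10, s1=3.0, n2=10, s2=3.0)
print(round(sigma_hat, 3))
```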
t-tests • Remember the z scores: • z = (X - µ)/σ • It is often the case that we want to know “What percentage of the scores are above (or below) a certain other score”? • Asked another way, “What is the area under the curve, beyond a certain point”? • THIS is why we calculate a z score, and the way we do it is with the z table, on p. 306 of Hinton. • Problem: We RARELY truly know µ or σ.
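As a sketch of what the z table lookup does, Python's standard-library `NormalDist` can return the area under the standard normal curve directly. The numbers here (µ = 100, σ = 15, a score of 130) are hypothetical and assume we somehow know µ and σ.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """z = (X - mu) / sigma"""
    return (x - mu) / sigma

# Hypothetical scale with mu = 100 and sigma = 15: what share of scores exceed 130?
z = z_score(130, 100, 15)
area_below = NormalDist().cdf(z)   # area under the curve to the left of z
area_above = 1 - area_below        # area beyond z

print("z =", z)
print("proportion above:", round(area_above, 4))
```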
t-tests (cont’d.) • So, typically what we do is use M to estimate µ and s to estimate σ. (Duh.) (Note: When we estimate σ with s, we divide by N-1, which is the degrees of freedom.) • Then, instead of z, we calculate t. • Hinton’s example on p. 64 is for a t-test when you have a null hypothesis population mean (µ0). (That is, you want to test if your observed sample mean is different from some value.) • Hinton then offers examples in Chapter 8 of related (dependent, within-subjects) and independent (unrelated, between-subjects) t-tests. • S, Z, & Z’s example on p. 409 is for a t-test to compare independent means.
Formulae • For a single mean (compared with µ0): • $t = (M - \mu_0)/(s/\sqrt{n})$ • For related (within-subjects) groups: • $t = (M_1 - M_2)/s_{M_1 - M_2}$ • Where $s_{M_1 - M_2} = s_{X_1 - X_2}/\sqrt{n}$ (the standard deviation of the difference scores, divided by the square root of n) • See Hinton, p. 83 • For independent groups: • From S, Z, & Z, p. 409, and Hinton, p. 87 • $t = (M_1 - M_2)/s_{M_1 - M_2}$ • Where $s_{M_1 - M_2} = \sqrt{(S_1^2/n_1) + (S_2^2/n_2)}$ • See Hinton, p. 87 • Will’s correction: The minus sign in the numerator of the formula at the top of page 87 should be a plus sign. Also two formulas down. Hinton has it right by the bottom of the page.
Steps • For a t test for a single sample • Restate the question as a research hypothesis and a null hypothesis about the populations. • Determine the characteristics of the comparison distribution. • The mean is the known population mean. • Compute the standard deviation by: • Calculate the estimated population variance ($S^2 = SS/df$) • Calculate the variance of the distribution of means ($S^2/n$) • Take the square root, to get SE. • Note, we’re calculating t with N-1 df. • Determine the cutoff sample score on the comparison distribution at which the null hypothesis should be rejected. • Decide on an alpha and one-tailed vs. two-tailed • Look up the critical value in the table • Determine your sample’s t score: $t = (M - \mu)/SE$ • Decide whether to reject or not reject the null hypothesis. (If the observed value of t exceeds the table value, reject.)
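The single-sample steps above can be sketched as code. The scores, the comparison mean of 4, and the function name `one_sample_t` are all hypothetical; the critical value 2.776 is the two-tailed .05 entry for df = 4 read from a t table.

```python
from math import sqrt

def one_sample_t(scores, mu0):
    """Follow the steps: SS -> S^2 = SS/df -> S^2/n -> SE -> t."""
    n = len(scores)
    m = sum(scores) / n
    ss = sum((x - m) ** 2 for x in scores)   # sum of squared deviations
    df = n - 1
    s2 = ss / df                             # estimated population variance
    se = sqrt(s2 / n)                        # standard error of the mean
    t = (m - mu0) / se
    return t, df

scores = [5, 7, 8, 6, 9]   # hypothetical sample
mu0 = 4                    # hypothetical null-hypothesis population mean

t, df = one_sample_t(scores, mu0)
T_CRIT = 2.776             # two-tailed, alpha = .05, df = 4, from a t table

reject = abs(t) > T_CRIT
print("t =", round(t, 3), "with df =", df)
print("reject the null" if reject else "do not reject the null")
```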
Steps • For a t test for dependent means • Restate the question as a research hypothesis and a null hypothesis about the populations. • Determine the characteristics of the comparison distribution. • Make each person’s score into a difference score. From here on out, use difference scores. • Compute the mean of the difference scores. • Assume a population mean of 0: µ = 0. • Compute the standard deviation of the difference scores: • Calculate the estimated population variance ($S^2 = SS/df$) • Calculate the variance of the distribution of means ($S^2/n$) • Take the square root, to get SE. • Note, we’re calculating t with N-1 df. • Determine the cutoff sample score on the comparison distribution at which the null hypothesis should be rejected. • Decide on an alpha, and one-tailed vs. two-tailed • Look up the critical value in the table • Determine your sample’s t score: $t = (M - \mu)/SE$ • Decide whether to reject or not reject the null hypothesis. (If the observed value of t exceeds the table value, reject.)
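The dependent-means steps differ from the single-sample case only in that everything is done on difference scores. A minimal sketch, assuming invented before/after scores for six people and a hypothetical function name `dependent_t`; 2.571 is the two-tailed .05 critical value for df = 5 from a t table.

```python
from math import sqrt

def dependent_t(first, second):
    """t test on difference scores, with a null-hypothesis mean difference of 0."""
    diffs = [b - a for a, b in zip(first, second)]   # one difference per person
    n = len(diffs)
    m = sum(diffs) / n                               # mean of the difference scores
    ss = sum((d - m) ** 2 for d in diffs)
    df = n - 1
    s2 = ss / df                                     # estimated population variance
    se = sqrt(s2 / n)                                # SE of the mean difference
    t = (m - 0) / se                                 # mu = 0 under the null
    return t, df

# Hypothetical scores for the same six people measured twice.
before = [10, 12, 9, 11, 13, 10]
after = [12, 14, 10, 14, 15, 11]

t_dep, df_dep = dependent_t(before, after)
T_CRIT_DF5 = 2.571   # two-tailed, alpha = .05, df = 5, from a t table

reject_dep = abs(t_dep) > T_CRIT_DF5
print("t =", round(t_dep, 3), "with df =", df_dep)
print("reject the null" if reject_dep else "do not reject the null")
```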
Steps • For a t test for independent means • Same as for dependent means, except the value for SE is that squirrely formula on Hinton, p. 87. • Basically, here’s the point. When you’re comparing DEPENDENT (within-subject, related) means, you can assume both sets of scores come from the same distribution, thus have the same standard deviation. • But when you’re comparing independent (between-subject, unrelated) means, you gotta basically average the variability of each of the two distributions.
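For independent means, the only new piece is the standard error of the difference, which combines the variability of both groups. This sketch uses the formula from the Formulae slide, SQRT[(S1²/n1) + (S2²/n2)], on invented data. The df of n1 + n2 - 2 and the critical value 2.306 (two-tailed .05, df = 8) are assumptions of the usual equal-variance version of the test, not something stated on the slide.

```python
from math import sqrt
from statistics import mean, variance

def independent_t(group1, group2):
    """t = (M1 - M2) / sqrt(S1^2/n1 + S2^2/n2)"""
    n1, n2 = len(group1), len(group2)
    s1_sq = variance(group1)   # sample variance, divides by n1 - 1
    s2_sq = variance(group2)   # sample variance, divides by n2 - 1
    se_diff = sqrt(s1_sq / n1 + s2_sq / n2)
    t = (mean(group1) - mean(group2)) / se_diff
    df = n1 + n2 - 2           # assumption: usual df for the independent t test
    return t, df

# Hypothetical scores from two separate groups of five people.
group1 = [12, 14, 11, 15, 13]
group2 = [9, 10, 8, 11, 7]

t_ind, df_ind = independent_t(group1, group2)
T_CRIT_DF8 = 2.306   # two-tailed, alpha = .05, df = 8, from a t table

reject_ind = abs(t_ind) > T_CRIT_DF8
print("t =", round(t_ind, 3), "with df =", df_ind)
print("reject the null" if reject_ind else "do not reject the null")
```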
Three points • df • Four people, take your choice of candy. • One df used up calculating the mean. • One or two tails • Must be VERY careful, choosing to do a one-tailed test. • Comparing the z and t tables • Check out the .05 t table values for infinity df (1.96 for two-tailed test, 1.645 for one-tailed). • Now find the commensurate values in the z table.
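The z-table side of that comparison can be reproduced with the standard library; the t-table values for infinite df should match these, since the t distribution approaches the normal as df grows. A small sketch:

```python
from statistics import NormalDist

standard_normal = NormalDist()   # mean 0, standard deviation 1
alpha = 0.05

# Two-tailed: alpha is split across both tails, so look up 1 - alpha/2.
z_two_tailed = standard_normal.inv_cdf(1 - alpha / 2)
# One-tailed: all of alpha sits in one tail, so look up 1 - alpha.
z_one_tailed = standard_normal.inv_cdf(1 - alpha)

print("two-tailed cutoff:", round(z_two_tailed, 2))
print("one-tailed cutoff:", round(z_one_tailed, 3))
```

The one-tailed cutoff is smaller, which is why a one-tailed test rejects more easily and why choosing it calls for care.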
Significance Level • Remember, two ways to test statistical significance – hypothesis tests and confidence intervals. • With confidence intervals, if two groups yield data and their confidence intervals don’t overlap, then we conclude a significant difference. • In hypothesis testing, if the probability of finding our differences by chance is smaller than our chosen alpha, then we say we have a significant difference. • We select alpha (α), by tradition. • Statistical significance isn’t the same thing as practical significance.
Power of the Test • The power of a statistical test refers to its ability to find a difference in distributions when there really is one there. • Things that influence the power of a test: • Size of the effect. • Sample size. • Variability. • Alpha level.
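How those four factors move power can be sketched with the simplest case that has a closed form: a one-tailed, one-sample z test with known σ, where power = Φ(d·√n − z_crit) and d = effect/σ. That choice of test, the function name, and all the numbers below are assumptions made for illustration; power for a t test would differ somewhat.

```python
from math import sqrt
from statistics import NormalDist

def power_one_tailed_z(effect, sigma, n, alpha):
    """Power of a one-tailed, one-sample z test with known sigma."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha)      # cutoff under the null hypothesis
    d = effect / sigma                 # effect size in SD units
    return z.cdf(d * sqrt(n) - z_crit)

# Hypothetical baseline: true difference of 5, sigma of 10, n of 25, alpha of .05.
baseline = power_one_tailed_z(effect=5, sigma=10, n=25, alpha=0.05)
bigger_effect = power_one_tailed_z(effect=8, sigma=10, n=25, alpha=0.05)
bigger_sample = power_one_tailed_z(effect=5, sigma=10, n=50, alpha=0.05)
more_variability = power_one_tailed_z(effect=5, sigma=20, n=25, alpha=0.05)
stricter_alpha = power_one_tailed_z(effect=5, sigma=10, n=25, alpha=0.01)

for name, value in [("baseline", baseline),
                    ("bigger effect", bigger_effect),
                    ("bigger sample", bigger_sample),
                    ("more variability", more_variability),
                    ("stricter alpha", stricter_alpha)]:
    print(f"{name}: {value:.3f}")
```

In this sketch a larger effect or a larger sample raises power, while more variability or a stricter alpha lowers it.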