Hypothesis Testing

What is a hypothesis?
A hypothesis is a tentative answer to a research question. It is used to test a theory. Hypothesis testing is used when you want to answer a research question about a population parameter using sample data.
What does this mean? You compare empirically observed sample findings with theoretically expected findings.
Suppose you theorize that children model behaviors they observe. How would you test this theory?

Step 2: Design your experiment
What is the independent variable? What is the dependent variable? This is also where you would determine your sampling strategy, assignment procedures, etc.

Bandura example
• Children who observe an adult hitting Bobo the clown are more likely to hit Bobo the clown than children who observe an adult just entering the room.

The logic of hypothesis testing: The null and alternative hypotheses
Step 1. Generate a null hypothesis and an alternative hypothesis (about the population of interest).
Step 2. Collect sample data relevant to your hypotheses.
Step 3. Determine the probability of your sample data if the null hypothesis about the population is true.

Step 1: The Null Hypothesis
Generate two major hypotheses, the null hypothesis and the alternative hypothesis, that capture all the possibilities concerning the population parameter. These are statistical hypotheses, the kind relevant for inferential statistics.
What is the null hypothesis?
The null hypothesis: The hypothesis to be nullified or rejected. Usually the hypothesis that contradicts the researcher's theory (i.e., the researcher is hoping that the null hypothesis is false). It is usually not very meaningful in itself.
H0: There is no difference between children who observe an adult hitting Bobo the clown and children who observe an adult just entering the room.
The null hypothesis is symbolized by H0 ("H sub oh" or "H sub zero").

Step 1: The Alternative Hypothesis
Generate two hypotheses, the null hypothesis and the alternative hypothesis, that capture all the possibilities concerning the population parameter.
What is the alternative (or research) hypothesis?
The alternative (or research) hypothesis: The logical opposite of the null hypothesis. Usually the hypothesis that supports the researcher's theory.
The alternative hypothesis is symbolized by H1 ("H sub one").
Come up with the null and alternative hypotheses that you would use to test the theory of racial prejudice and ignorance.

Alternative Hypotheses
• Directional or nondirectional
• Directional hypotheses predict the direction of the difference (one-tailed test)
• Nondirectional hypotheses predict a difference but not its direction (two-tailed test)
• At the same alpha, the critical value for a one-tailed test is smaller in absolute value than the critical value for a two-tailed test

Step 3: Determine the probability of your sample data if the null hypothesis about the population is true
If the null hypothesis is true, what do you think the outcome of your experiment would look like? If the null hypothesis is false, what do you think the outcome of the experiment would look like?
The two possible outcomes of hypothesis testing:
1: If the outcome of the experiment is similar to what you would expect given the null hypothesis, then you would retain (fail to reject) the null hypothesis.
2: If the outcome is not what you would expect given the null hypothesis, then you would reject the null hypothesis and favor the alternative hypothesis.

α (alpha)
Before you calculate any inferential statistics you must set the alpha level.
What is alpha? Alpha is the probability level at or below which you will reject the null hypothesis.
What does an alpha of .05 mean in English? If the probability of obtaining a result at least as extreme as your sample mean, assuming the null hypothesis is true, is less than or equal to .05, then you reject the null hypothesis and say that the result is statistically significant. If that probability is greater than .05, then you retain the null hypothesis.

Type I and Type II errors
In hypothesis testing, there are two ways you could make a mistake. What are they?
                             The true situation
Your statistical decision    Null is true    Null is false
Retain null                  correct         Type II
Reject null                  Type I          correct
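One way to see what a Type I error rate means is to simulate many experiments in which the null hypothesis really is true and count how often a two-tailed z-test rejects it anyway. The sketch below is only an illustration: the population values (mean 100, standard deviation 15), the sample size, the number of trials, and the function name are all assumptions made for the example, not values from these slides.

```python
import random
from statistics import NormalDist

def type_i_error_rate(alpha=0.05, n=25, trials=4000,
                      mu0=100.0, sigma=15.0, seed=1):
    """Share of simulated samples that reject a TRUE null (two-tailed z-test)."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96 for alpha = .05
    std_error = sigma / n ** 0.5
    rejections = 0
    for _ in range(trials):
        # Draw a sample from a population in which the null is true.
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z = (sum(sample) / n - mu0) / std_error
        if abs(z) > z_crit:
            rejections += 1        # rejecting a true null is a Type I error
    return rejections / trials

rate = type_i_error_rate()
print(f"Share of true nulls rejected: {rate:.3f}")
```

The printed share should land close to the chosen alpha, which is the sense in which alpha is the probability of a Type I error.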

A (somewhat trivial) example
Philips claims that their new "eco-friendly" lightbulbs last 10,000 hours (almost 10 times as long as a regular lightbulb). Do you think that Philips “Earthlight” lightbulbs really last 10,000 hours? How would you test this assumption?
What is the null hypothesis? What is the alternative hypothesis?

Example: From a sample of 20 lightbulbs:
Mean = 9467.54
S-hat = 927.45
The sample mean is 532.46 hours less than the claim of 10,000. Do you think this evidence is strong enough to refute the Earthlight claim?
What is the sample estimate of the standard deviation? How does this value factor into your thoughts about the null hypothesis?
The correct question: What is the probability of obtaining a sample mean at least as extreme as 9,467 hours, if the population mean is 10,000 hours?
What would your conclusion about the 10,000 hour claim be if this probability is large? What if it is small?

Finding the probability of the sample mean given the population mean
The probability of obtaining a sample mean at least as extreme as 9,467.54 hours if the population mean is 10,000 is .02.
Now what do you have to say about the Earthlight claim?
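The probability for the lightbulb sample can be reproduced from the numbers given (N = 20, mean 9467.54, S-hat 927.45, claimed mean 10,000). The sketch below assumes a single-sample t-test with df = N - 1 and a two-tailed probability. The helper names are made up for this example, and the tail area is approximated numerically with Simpson's rule so that only the Python standard library is needed.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p(t, df, steps=2000):
    """P(|T| >= |t|), integrating the density from 0 to |t| with Simpson's rule."""
    upper = abs(t)
    h = upper / steps                       # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_0_to_t = total * h / 3             # area between 0 and |t|
    return 2 * (0.5 - area_0_to_t)          # both tails beyond |t|

n, sample_mean, s_hat, claimed_mean = 20, 9467.54, 927.45, 10000
std_error = s_hat / math.sqrt(n)            # estimated standard error of the mean
t_obs = (sample_mean - claimed_mean) / std_error
p = two_tailed_p(t_obs, n - 1)
print(f"t = {t_obs:.2f}, df = {n - 1}, two-tailed p = {p:.3f}")
```

Under these assumptions the probability comes out at roughly .02, which is below an alpha of .05.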

Testing Hypotheses
• When the population standard deviation is known, compute a z-test (chapter 10)
• For a two-tailed test (alpha = .05): z-score critical value of +/-1.96
• For a one-tailed test (alpha = .05): z-score critical value of 1.645
• Typically we don’t know the population standard deviation, so we use an estimate of it; the resulting statistic is denoted t, and we use t tables
• Its sampling distribution is referred to as Student’s t-distribution
• Can compute a single-sample t-test (df = N - 1)
• Independent t-test: df = (n1 - 1) + (n2 - 1)
• Matched-group or related t-test: df = N - 1, where N is the number of pairs
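The z critical values for alpha = .05 can be checked directly against the standard normal distribution. This is a small sketch using Python's standard library; the variable names are chosen for the example.

```python
from statistics import NormalDist

alpha = 0.05
standard_normal = NormalDist()          # mean 0, standard deviation 1

# Two-tailed: alpha is split between both tails.
z_two_tailed = standard_normal.inv_cdf(1 - alpha / 2)
# One-tailed: all of alpha sits in a single tail.
z_one_tailed = standard_normal.inv_cdf(1 - alpha)

print(f"two-tailed critical z: +/-{z_two_tailed:.3f}")   # +/-1.960
print(f"one-tailed critical z: {z_one_tailed:.3f}")      # 1.645
```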

The larger the degrees of freedom (i.e., the sample size), the more closely the t-distribution approximates the z-distribution (the standard normal distribution)
• a “sufficiently large” sample would be n = 25

Steps
• State the null hypothesis
• State the alternative hypothesis
• Determine the level of significance
• Determine whether the test is one- or two-tailed
• Check assumptions:
  • Random sample of interval or ratio scores
  • Raw score population forms a normal distribution
  • Standard deviation of the raw score population is estimated (standard error of the mean)
• Compute the mean and the estimate of the standard deviation
• Plug into formula (p. 281) and compute
• Look up critical value

Calculating t
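For reference, a standard form of the single-sample t statistic is given here, written with the symbols used elsewhere in these slides: M for the sample mean, S-hat for the estimate of the population standard deviation, N for the sample size, and mu for the population mean stated by the null hypothesis. The notation of the textbook formula on p. 281 may differ.

```latex
t = \frac{M - \mu}{\hat{s} / \sqrt{N}}, \qquad df = N - 1
```

The denominator is the estimated standard error of the mean, so t expresses the distance between the sample mean and the hypothesized mean in standard-error units.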

Example
• Low birth weight (LBW) experimental group and LBW control group
• Bayley Scales of Infant Development (population M = 100)
• M = 104.13
• S.D. = 12.584
• N = 56
• What is the observed t?
• What is df?
• What is the critical t?
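The observed t and the degrees of freedom for this example can be worked out as a single-sample t-test. The sketch assumes that 100 is the population mean under the null hypothesis and that the reported S.D. is the estimate of the population standard deviation; neither assumption is stated outright on the slide.

```python
import math

population_mean = 100      # assumed: the Bayley mean stated by the null hypothesis
sample_mean = 104.13
sd_estimate = 12.584       # assumed: estimate of the population standard deviation
n = 56

std_error = sd_estimate / math.sqrt(n)               # standard error of the mean
t_observed = (sample_mean - population_mean) / std_error
df = n - 1

print(f"standard error = {std_error:.3f}")
print(f"observed t = {t_observed:.2f}, df = {df}")
```

The critical t is then looked up in a t table for this df and the chosen alpha.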

Independent t-test
• Same assumptions, with the additional one of homogeneity of variance
• Need to compute the pooled variance (p. 313)
• Formulae on p. 316
• Example of homophobic versus nonhomophobic arousal to homosexual videos (Adams, Wright, & Lohr, 1996)
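A minimal sketch of the pooled-variance calculation for an independent t-test follows. It uses the usual textbook form, which may be laid out differently from the formulae on pp. 313 and 316. The function name is made up for this example, and the two groups of scores are hypothetical numbers, not data from Adams, Wright, and Lohr (1996).

```python
import math
from statistics import mean, variance

def pooled_t(group_a, group_b):
    """Independent-samples t with pooled variance; returns (t, df)."""
    n1, n2 = len(group_a), len(group_b)
    df = (n1 - 1) + (n2 - 1)
    # Weight each group's variance estimate by its degrees of freedom.
    pooled_var = ((n1 - 1) * variance(group_a) + (n2 - 1) * variance(group_b)) / df
    # Standard error of the difference between the two sample means.
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(group_a) - mean(group_b)) / std_error, df

# Hypothetical scores, for illustration only.
t, df = pooled_t([1, 2, 3, 4, 5], [3, 4, 5, 6, 7])
print(f"t = {t:.2f}, df = {df}")
```

Pooling is only appropriate when the homogeneity of variance assumption is reasonable, since it treats both groups as estimating one common variance.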

The rejection region
What is the rejection region? The area in the tail(s) of the sampling distribution whose total probability equals alpha is known as the rejection region. Why is it called the rejection region?

Using t to make a decision about the null hypothesis
Some computer programs give the actual probability of the observed t value, p(t). If you know the probability of t, then you reject the null hypothesis if the probability of t is less than or equal to alpha.
If you are calculating t by hand, then you must use a t table to make a statistical decision. A t table presents critical values of t for a given distribution (i.e., number of degrees of freedom) and a given alpha level.
Find the critical value of t for our example (alpha = .05, and N = 20). What does the critical value tell you? If our observed t was 2.57 and the critical value is _________, do you reject or retain the null hypothesis?
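When no t table is at hand, a critical value can be approximated numerically. The sketch below assumes a two-tailed test with df = N - 1. The helper names are hypothetical: the tail area comes from Simpson's rule applied to the t density, and the critical value is found by bisection, i.e. narrowing an interval until the tail area matches the target.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def upper_tail(t, df, steps=2000):
    """P(T >= t) for t >= 0, via Simpson's rule on the interval [0, t]."""
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

def critical_t(alpha, df, two_tailed=True):
    """Smallest t whose tail area does not exceed the target, by bisection."""
    target = alpha / 2 if two_tailed else alpha
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if upper_tail(mid, df) > target:
            lo = mid          # tail still too large: move right
        else:
            hi = mid          # tail small enough: move left
    return (lo + hi) / 2

t_observed, n, alpha = 2.57, 20, 0.05
t_crit = critical_t(alpha, n - 1)
decision = "reject" if abs(t_observed) >= t_crit else "retain"
print(f"critical t = {t_crit:.3f}; decision: {decision} the null hypothesis")
```

The decision rule is the same one used with a table: reject when the observed t is at least as extreme as the critical value.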

Characteristics of the t distribution for differences between the population mean and sample means
What will the mean of this distribution be? Zero.
What happens to t as the size of the difference between the sample mean and the population mean increases? t increases (in absolute value).
As t (and therefore the difference) increases, what happens to the probability of t? It decreases. Small probabilities are represented by small areas at the tails of the curve.
