COURSE: JUST 3900 INTRODUCTORY STATISTICS FOR CRIMINAL JUSTICE Instructor: Dr. John J. Kerbs, Associate Professor Joint Ph.D. in Social Work and Sociology. Chapter 8 : Introduction to Hypothesis Testing. Hypothesis Testing.
Hypothesis Testing • A hypothesis test is a statistical method that uses sample data to evaluate a hypothesis about a population. • The general goal of a hypothesis test is to rule out chance (sampling error) as a plausible explanation for the results from a research study.
Hypothesis Test - Steps • State hypothesis about the population. • Use hypothesis to predict the characteristics the sample should have. • Obtain a sample from the population. • Compare data with the hypothesis prediction. • If the sample mean is consistent with the prediction, then we conclude that the hypothesis is reasonable.
Basic Experimental Situation for Hypothesis Testing • Basic Assumption of Hypothesis Testing • If the treatment has any effect, it is simply to add a constant amount to, or subtract a constant amount from, each individual’s score. • Remember that adding or subtracting a constant changes the mean, but not the shape or the standard deviation of the population distribution. • Thus, the population after treatment has the same shape and standard deviation as the population prior to treatment.
Hypothesis Testing (cont'd.) • If the individuals in the sample are noticeably different from the individuals in the original population, we have evidence that the treatment has an effect. • However, it is also possible that the difference between the sample and the population is simply sampling error • The question that this chapter addresses is as follows: • How much sampling error are you willing to tolerate?
Hypothesis Testing (cont'd.) • The purpose of the hypothesis test is to decide between two explanations: • The difference between the sample and the population can be explained by sampling error (there does not appear to be a treatment effect) • The difference between the sample and the population is too large to be explained by sampling error (there does appear to be a treatment effect).
The Hypothesis Test, Step 1: Clearly State the Hypothesis • State the hypothesis about the unknown population. • The null hypothesis, H0, states that there is no change in the general population before and after an intervention. In the context of an experiment, H0 predicts that the independent variable had no effect on the dependent variable. • The alternative hypothesis, H1, states that there is a change in the general population following an intervention. In the context of an experiment, H1 predicts that the independent variable did have an effect on the dependent variable.
The Hypothesis Test, Step 2: Set Criteria for Decision • The α level establishes a criterion, or "cut-off", for making a decision about the null hypothesis. The alpha level also determines the risk of a Type I error (false positive). • Common alpha levels are α = .05 (most used), α = .01, and α = .001. Find the corresponding critical z-scores in the unit normal table. • The critical region consists of outcomes that are very unlikely to occur if the null hypothesis is true. That is, the critical region is defined by sample means that are very unlikely to be obtained if the treatment has no effect.
Hypothesis Testing and the Critical Region for z-Scores • Remember that with α = .05 in a two-tailed test, each tail contains 2.5% of the distribution.
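The unit normal table lookup described above can also be sketched in code. This is a minimal illustration, assuming Python 3.8+ and its standard-library `statistics.NormalDist`; the helper name `two_tailed_critical_z` is hypothetical and not part of the course materials.

```python
from statistics import NormalDist


def two_tailed_critical_z(alpha):
    """Return the positive critical z-score for a two-tailed test.

    Half of alpha goes in each tail, so we look up the z-score that
    leaves alpha / 2 in the upper tail of the standard normal curve.
    """
    return NormalDist().inv_cdf(1 - alpha / 2)


# The three alpha levels mentioned in the slides.
for alpha in (0.05, 0.01, 0.001):
    z = two_tailed_critical_z(alpha)
    print(f"alpha = {alpha}: critical region is z <= -{z:.2f} or z >= +{z:.2f}")
```

For α = .05 this reproduces the familiar boundaries of about ±1.96, with 2.5% of the distribution in each tail.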
The Hypothesis Test, Step 3: Collect Data & Compute Sample Statistics • Compare the sample mean (data) with the null hypothesis. • Compute the test statistic. The test statistic (z-score) forms a ratio comparing the obtained difference between the sample mean and the hypothesized population mean with the amount of difference we would expect without any treatment effect (the standard error).
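The ratio described in Step 3 can be written compactly. The slide's own formula did not survive in this transcript, so the block below is a reconstruction that assumes the symbols used elsewhere in the deck (M for the sample mean, μ for the hypothesized population mean, σ for the population standard deviation, n for the sample size, and σ_M for the standard error).

```latex
\[
  z = \frac{M - \mu}{\sigma_M},
  \qquad
  \sigma_M = \frac{\sigma}{\sqrt{n}}
\]
```

The numerator is the obtained difference; the denominator is the difference expected from sampling error alone.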
The Hypothesis Test, Step 4: Make a Decision • If the test statistic falls in the critical region, we conclude that the difference is significant or that the treatment has a significant effect. • In this case we reject the null hypothesis. • If the test statistic does not fall in the critical region, we conclude that the evidence from the sample is not sufficient to show a treatment effect. • In this case we fail to reject the null hypothesis.
Hypothesis Testing Example: Re-Arrests for Heroin Addicts • Let us assume that the population of all heroin addicts in the US commits an average of μ = 80 property crimes per year, with a standard deviation of σ = 20, when they live in the free world without treatment. • The National Institute of Justice wants to determine whether treatment with an opiate antagonist (Naltrexone) significantly alters the average number of property crimes committed per year. • A small random sample (n = 16) of heroin-addicted felons who are completely detoxed and placed on daily doses of Naltrexone commits an average of M = 70 property crimes per year.
Hypothesis Testing Example: Re-Arrests for Heroin Addicts • Step 1: State the Hypothesis • H0: μ (with Naltrexone) = 80 property crimes/year • The null hypothesis suggests no treatment effect: even with Naltrexone, the mean number of property crimes per year will be 80. • H1: μ (with Naltrexone) ≠ 80 property crimes/year • The alternative hypothesis suggests the presence of a treatment effect: with Naltrexone, the mean number of property crimes per year will be different from 80.
Hypothesis Testing Example: Re-Arrests for Heroin Addicts • Step 2: Set Criteria for Decision • Select an alpha level and determine the boundaries for the critical regions. • Most studies use an alpha of .05, which corresponds to a z-score of ±1.96 (two-tailed test). • If the z-score for the treated sample does not fall into the critical region, fail to reject H0. • If the z-score for the treated sample falls into the critical region (z ≤ -1.96 or z ≥ +1.96), reject H0.
Hypothesis Testing Example: Re-Arrests for Heroin Addicts • Step 3: Compute Sample Statistic • Compute the z-statistic from the difference between the sample mean (M = 70, n = 16) and the population mean (μ = 80), using the standard error of the sample mean in the denominator. • Standard error: σM = σ/√n = 20/√16 = 5 • z-statistic: z = (M - μ)/σM = (70 - 80)/5 = -2.00 • Remember to calculate the standard error of the sample mean first. Do not use the population standard deviation (σ = 20) as the denominator of the z-statistic for a sample mean.
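The Step 3 arithmetic can be checked with a few lines of Python. This is only a sketch using the numbers given in the example (μ = 80, σ = 20, n = 16, M = 70); the variable names are illustrative.

```python
import math

mu = 80.0     # population mean without treatment (property crimes/year)
sigma = 20.0  # population standard deviation
n = 16        # sample size
M = 70.0      # sample mean for the Naltrexone group

# Standard error of the sample mean: sigma divided by the square root of n.
standard_error = sigma / math.sqrt(n)

# z-statistic: obtained difference divided by the standard error,
# NOT by the population standard deviation.
z_sample = (M - mu) / standard_error

print(f"standard error = {standard_error:.2f}")  # 5.00
print(f"z = {z_sample:.2f}")                     # -2.00
```

Dividing by σ = 20 instead of the standard error would give -0.50, which is the mistake the slide warns against.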
Hypothesis Testing Example: Re-Arrests for Heroin Addicts • Step 4: Make a Decision • To make a decision, you must compare the z-statistic for the sample (z sample = -2.00) against the z-statistic that defines the boundaries of your critical region. As discussed earlier, we set the alpha level at .05 (z critical = ±1.96). • Because -2.00 falls beyond -1.96, the sample z-statistic is in the critical region. • Thus, we reject the null hypothesis (H0) and note that there does appear to be a treatment effect on the average number of property crimes committed per year for felons taking daily doses of Naltrexone.
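The Step 4 comparison can be sketched in code as well. The example below assumes Python 3.8+ (`statistics.NormalDist`) and takes z = -2.00 and α = .05 from the slides. The two-tailed p-value it prints is an addition for illustration and is not reported in the original example.

```python
from statistics import NormalDist

z_sample = -2.00  # z-statistic from Step 3
alpha = 0.05      # alpha level chosen in Step 2

# Boundary of the two-tailed critical region (about 1.96 for alpha = .05).
z_critical = NormalDist().inv_cdf(1 - alpha / 2)

# Reject H0 when the sample z-score lands in either tail of the critical region.
reject_h0 = abs(z_sample) >= z_critical

# Two-tailed p-value: probability of a z-score at least this extreme if H0 is true.
p_value = 2 * NormalDist().cdf(-abs(z_sample))

print(f"critical z = +/-{z_critical:.2f}")
print(f"p = {p_value:.4f}")
print("Decision:", "reject H0" if reject_h0 else "fail to reject H0")
```

The sample lands just inside the critical region, so the decision matches the slide: reject H0.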
Errors in Hypothesis Tests Just because the sample mean (following treatment) is different from the original population mean does not necessarily indicate that the treatment has caused a change. You should recall that there usually is some discrepancy between a sample mean and the population mean simply as a result of sampling error.
Errors in Hypothesis Tests (cont'd.) Because the hypothesis test relies on sample data, and because sample data are not completely reliable, there is always the risk that misleading data will cause the hypothesis test to reach a wrong conclusion. Two types of errors are possible.
Type I Errors • A Type I error occurs when the sample data appear to show a treatment effect when, in fact, there is none. • In this case the researcher will reject the null hypothesis and falsely conclude that the treatment has an effect. • Type I errors are caused by unusual, unrepresentative samples that fall in the critical region even though the treatment has no effect. • The hypothesis test is structured so that Type I errors are very unlikely; specifically, the probability of a Type I error is equal to the alpha level.
Type I Errors • The α level • Also known as the level of significance • Equals the probability of a Type I error • Also determines the risk of a false positive finding • The probability that a significant result would be produced by chance (sampling error or random error) alone when the null hypothesis is true • Commonly used levels of significance (α) • α = .05 (most used) • When H0 is true, 5% (5 out of every 100) of samples would produce a significant result by chance • α = .01 • When H0 is true, 1% (1 out of every 100) of samples would produce a significant result by chance • α = .001 • When H0 is true, 0.1% (1 out of every 1000) of samples would produce a significant result by chance
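The claim that the probability of a Type I error equals the alpha level can be illustrated with a small simulation. This is a sketch under stated assumptions: scores are drawn from a normal population with the example's μ = 80 and σ = 20, there is no treatment effect, and the trial count, seed, and function name are arbitrary choices made here, not part of the course materials.

```python
import math
import random
from statistics import NormalDist, fmean


def type_i_error_rate(trials=10_000, mu=80.0, sigma=20.0, n=16,
                      alpha=0.05, seed=1):
    """Estimate how often a two-tailed z-test rejects a TRUE null hypothesis."""
    rng = random.Random(seed)
    z_critical = NormalDist().inv_cdf(1 - alpha / 2)
    standard_error = sigma / math.sqrt(n)
    rejections = 0
    for _ in range(trials):
        # Sample from the untreated population, so H0 is true by construction.
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        z = (fmean(sample) - mu) / standard_error
        if abs(z) >= z_critical:
            rejections += 1  # a false positive (Type I error)
    return rejections / trials


rate = type_i_error_rate()
print(f"Proportion of false positives: {rate:.3f} (alpha = 0.05)")
```

The estimated proportion should land close to .05, which is the sense in which alpha sets the Type I error risk.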
Type I Errors: Alpha Levels and z-Scores • Select α level for two-tailed tests • Two-tailed tests hypothesize the presence of a difference, but not a particular direction for the difference between a sample mean (M) and a population mean (μ). • H0: M = μ • H1: M ≠ μ
Type II Errors • A Type II error occurs when the sample does not appear to have been affected by the treatment when, in fact, the treatment does have an effect. • In this case, the researcher will fail to reject the null hypothesis and falsely conclude that the treatment does not have an effect. • Type II errors are commonly the result of a very small treatment effect. Although the treatment does have an effect, it is not large enough to show up in the research study.
Type II Errors • Type II Errors • Also known as beta error (β) • Defined by the probability of false negatives • An error made by accepting or retaining a false null hypothesis (H0) • Stated simply, you fail to reject a false null hypothesis (H0) and claim that a relationship does not exist when (in fact) it does exist
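Type I versus Type II Error • Reject H0 when H0 is actually true: FALSE + (Type I error) • Reject H0 when H0 is actually false: TRUE + (correct decision) • Fail to reject H0 when H0 is actually true: TRUE - (correct decision) • Fail to reject H0 when H0 is actually false: FALSE - (Type II error)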
Directional Tests • When a research study predicts a specific direction for the treatment effect (increase or decrease), it is possible to incorporate the directional prediction into the hypothesis test. • The result is called a directional test or a one-tailed test. • A directional test includes the directional prediction in the statement of the hypotheses and in the location of the critical region.
Type I Errors: Set Criteria for Decision • Select α level for one-tailed tests • These tests hypothesize the presence of a difference between a sample mean (M) and a population mean (μ) that falls in a particular direction. • M > μ or M < μ
Directional Tests (cont'd.) • For the prior example with Naltrexone treatment, if the original population has a mean number of property crimes per year of μ = 80 and the treatment is predicted to decrease the mean number of property crimes per year, then the null and alternative hypotheses would state that after treatment: • H0: μ ≥ 80 (there is no decrease) • H1: μ < 80 (there is a decrease) • In this case, the entire critical region would be located in the left-hand tail of the distribution, because smaller values for M would demonstrate a decrease in property crimes per year for Naltrexone recipients. • We would reject the null hypothesis if the z-score for the sample were lower than the critical cutoff for the chosen level of significance (for example, z crit = -1.65 for α = .05).
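The one-tailed version of the Naltrexone test can be sketched as follows. It reuses the example's numbers (μ = 80, σ = 20, n = 16, M = 70) and assumes Python 3.8+ for `statistics.NormalDist`. Note that the exact cutoff is about -1.645, which the unit normal table rounds to -1.65.

```python
import math
from statistics import NormalDist

mu_null = 80.0  # population mean under H0 (no decrease)
sigma = 20.0    # population standard deviation
n = 16          # sample size
M = 70.0        # sample mean with Naltrexone
alpha = 0.05

z_sample = (M - mu_null) / (sigma / math.sqrt(n))

# One-tailed test predicting a DECREASE: all of alpha sits in the left tail.
z_critical = NormalDist().inv_cdf(alpha)

reject_h0 = z_sample < z_critical

print(f"z sample = {z_sample:.2f}, z critical = {z_critical:.3f}")
print("Decision:", "reject H0" if reject_h0 else "fail to reject H0")
```

Because the whole 5% is placed in one tail, the cutoff is less extreme than the two-tailed boundary of -1.96.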
Measuring Effect Size • A hypothesis test evaluates the statistical significance of the results from a research study. • That is, the test determines whether or not it is likely that the obtained sample mean occurred without any contribution from a treatment effect. • The hypothesis test is influenced not only by the size of the treatment effect but also by the size of the sample. • Thus, even a very small effect can be significant if it is observed in a very large sample.
Measuring Effect Size Because a significant effect does not necessarily mean a large effect, it is recommended that the hypothesis test be accompanied by a measure of the effect size. We use Cohen’s d as a standardized measure of effect size. Much like a z-score, Cohen’s d measures the size of the mean difference in terms of the standard deviation.
Cohen’s d and Estimated Cohen’s d • Calculations for Cohen’s d are fairly simple: Cohen’s d = mean difference / standard deviation. • Note: Sample size does not affect Cohen’s d. • Evaluating Effect Sizes for d
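As a rough illustration of the point that sample size does not affect Cohen's d, the sketch below applies the definition (mean difference divided by standard deviation) to the Naltrexone example. The function name and the comparison sample size of n = 64 are hypothetical choices made for this example. The slide's table for evaluating d did not survive in this transcript; Cohen's commonly cited benchmarks (about 0.2 small, 0.5 medium, 0.8 large) are assumed in the comment below.

```python
import math


def cohens_d(mean_treated, mean_untreated, standard_deviation):
    """Mean difference expressed in standard deviation units."""
    return (mean_treated - mean_untreated) / standard_deviation


mu, sigma, M = 80.0, 20.0, 70.0

d = cohens_d(M, mu, sigma)
print(f"Cohen's d = {d:.2f}")  # -0.50: half a standard deviation (often called medium)

# The z-statistic grows with sample size, but Cohen's d does not change.
for n in (16, 64):
    z = (M - mu) / (sigma / math.sqrt(n))
    print(f"n = {n}: z = {z:.2f}, d = {cohens_d(M, mu, sigma):.2f}")
```

The same 10-point difference gives z = -2.00 with n = 16 and z = -4.00 with n = 64, while d stays at -0.50 in both cases.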
The Effect of Standard Deviation on Calculations for Cohen’s d
Power of a Hypothesis Test • The power of a hypothesis test is defined as the probability that the test will reject the null hypothesis when the treatment does have an effect. • Probability of Type II Error (False Negative) = β • Power of Hypothesis Test = 1 - β • The power of a test depends on a variety of factors, including the size of the treatment effect and the size of the sample.
Factors that Affect Power • You can decrease power when • 1. sample size is decreased • 2. Alpha is decreased (e.g., from .05 to .01) • 3. You go from a 1- to 2-tail test • You can increase power when • 1. sample size is increased • 2. Alpha is increased (e.g., from .01 to .05) • 3. You go from a 2- to 1-tail test
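The three factors listed above can be explored numerically. The sketch below is a hypothetical helper (`z_test_power` is not from the course materials) that assumes a z-test on a normal population with known σ, and, for the one-tailed case, assumes the effect lies in the predicted direction. The baseline values (μ = 80, treated mean 88, σ = 10, n = 25) come from the power example in these slides; the smaller sample size of n = 4 is chosen here only for contrast.

```python
import math
from statistics import NormalDist


def z_test_power(mu_null, mu_treated, sigma, n, alpha=0.05, two_tailed=True):
    """Approximate power of a z-test for a given true treated mean."""
    std_normal = NormalDist()
    standard_error = sigma / math.sqrt(n)
    # Size of the treatment effect measured in standard-error units.
    shift = abs(mu_treated - mu_null) / standard_error
    if two_tailed:
        z_critical = std_normal.inv_cdf(1 - alpha / 2)
        # Probability of landing in either tail of the critical region.
        return (std_normal.cdf(shift - z_critical)
                + std_normal.cdf(-shift - z_critical))
    z_critical = std_normal.inv_cdf(1 - alpha)
    return std_normal.cdf(shift - z_critical)


baseline = z_test_power(80, 88, 10, n=25)
smaller_n = z_test_power(80, 88, 10, n=4)
stricter_alpha = z_test_power(80, 88, 10, n=25, alpha=0.01)
one_tailed = z_test_power(80, 88, 10, n=25, two_tailed=False)

print(f"baseline (n=25, alpha=.05, two-tailed): {baseline:.4f}")
print(f"smaller sample (n=4):                   {smaller_n:.4f}")
print(f"stricter alpha (.01):                   {stricter_alpha:.4f}")
print(f"one-tailed test:                        {one_tailed:.4f}")
```

Shrinking the sample or tightening alpha lowers power relative to the baseline, and switching to a one-tailed test raises it, in line with the list above.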
How to Calculate the Power of a Hypothesis Test • The previous slide was based upon a study from your book with μ = 80 and σ = 10, in which a sample (n = 25) is drawn from a treated population with an 8-point treatment effect (μ = 88 after treatment). • What is the power of the related statistical test for detecting the difference between the original population mean and the sample mean?
How to Calculate the Power of a Hypothesis Test • Step #1: Calculate the standard error for the sample mean • In this step, we work from the population’s standard deviation (σ) and the sample size (n) • σM = σ/√n = 10/√25 = 2
How to Calculate the Power of a Hypothesis Test • Step #2: Locate the Boundary of the Critical Region • In this step, we find the exact boundary of the critical region • Pick a critical z-score based upon alpha (α = .05, so z = 1.96) • Boundary: M = μ + 1.96(σM) = 80 + 1.96(2) = 83.92
How to Calculate the Power of a Hypothesis Test • Step #3: Calculate the z-score for the critical region boundary (M = 83.92) relative to the mean of the treated population with an 8-point treatment effect (μ = 88) • z = (83.92 - 88)/2 = -2.04
How to Calculate the Power of a Hypothesis Test • Interpret Power of the Hypothesis Test • Find the probability associated with a z-score greater than -2.04 • Look this probability up as the proportion in the body of the normal distribution (column B in your textbook) • p = .9793 • Thus, with a sample of 25 people and an 8-point treatment effect, the hypothesis test will conclude that there is a significant effect 97.93% of the time.
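The three power-calculation steps can be traced in code. This is a minimal sketch using the slide values (μ = 80, σ = 10, n = 25, treated mean 88) and the table value z = 1.96 for α = .05; it assumes Python 3.8+ for `statistics.NormalDist`, and the variable names are illustrative. Like the slides, it considers only the upper tail of the critical region, since the lower tail contributes a negligible amount here.

```python
import math
from statistics import NormalDist

mu_null = 80.0     # population mean without treatment
mu_treated = 88.0  # population mean with an 8-point treatment effect
sigma = 10.0       # population standard deviation
n = 25             # sample size
z_critical = 1.96  # unit normal table value for alpha = .05, two-tailed

# Step 1: standard error of the sample mean.
standard_error = sigma / math.sqrt(n)

# Step 2: sample mean that marks the upper boundary of the critical region.
boundary = mu_null + z_critical * standard_error

# Step 3: location of that boundary within the TREATED distribution.
z_boundary = (boundary - mu_treated) / standard_error

# Power: proportion of treated sample means that fall beyond the boundary.
power = 1 - NormalDist().cdf(z_boundary)

print(f"standard error = {standard_error:.2f}")
print(f"boundary M     = {boundary:.2f}")
print(f"z at boundary  = {z_boundary:.2f}")
print(f"power          = {power:.4f}")
```

The result agrees with the table lookup on the slide: a power of about .9793.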