Inferential Statistics: Chapter 13
Inferential Statistics
• Inferential statistics are used to determine whether the results found in the present experiment reflect a true difference in the entire population of interest, and not just in the sample used in the experiment.
• Therefore, inferential statistics allow us to make predictions about the entire population based on the findings from sample groups.
• Inferential statistics give a probability that the difference between the two sample means in the experiment represents a true difference produced by the manipulation of the IV, and not random error.
Null and Research Hypotheses
• The null hypothesis states that the population means are equal and that any observed difference is due to random error.
• The alternative (research) hypothesis states that the population means are not equal and therefore the treatment or independent variable had an effect.
• Statistical significance indicates that there is a low probability that the difference between the obtained sample means was due to random error.
• Alpha level: the pre-determined probability level used to make a decision about statistical significance.
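As a minimal sketch of how the alpha level is used to make that decision (the p-value below is hypothetical), a result is declared statistically significant only when its probability under the null hypothesis falls below alpha:

```python
# Minimal sketch of the significance decision rule (hypothetical p-value).
alpha = 0.05     # pre-determined alpha level
p_value = 0.032  # probability of obtaining this difference if the null hypothesis is true

if p_value < alpha:
    print("Statistically significant: reject the null hypothesis.")
else:
    print("Not significant: the difference may be due to random error.")
```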
Probability and Sampling Distributions
• Probability: the likelihood of the occurrence of some event or outcome.
• Statistical significance is a matter of probability.
• Sampling distribution: a probability distribution built from many separate random samples taken over and over; it shows the frequency of the different sample outcomes.
• The sampling distribution is based on the assumption that the null hypothesis is true.
• Critical values are obtained from sampling distributions; they are probability cutoffs based on sample size and degrees of freedom.
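As a sketch of how critical values depend on degrees of freedom, and therefore on sample size, the snippet below looks up two-tailed cutoffs from the t sampling distribution with SciPy; the group sizes are hypothetical:

```python
# Critical t values for alpha = .05 at several hypothetical sample sizes.
# For an independent-groups design, df = n1 + n2 - 2.
from scipy import stats

alpha = 0.05
for n_per_group in (5, 15, 30, 100):
    df = 2 * n_per_group - 2
    t_crit = stats.t.ppf(1 - alpha / 2, df)   # two-tailed cutoff in each tail
    print(f"n per group = {n_per_group:3d}, df = {df:3d}, critical t = {t_crit:.3f}")
```

Note how the critical value shrinks toward the normal-distribution cutoff (about 1.96) as the sample size grows.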
Sample Size
• The sample size also has an effect on determining statistical significance.
• The larger the sample, the more likely you are to obtain an accurate estimate of the true population value.
• Thus, as your sample size increases, you can be more confident that your outcome is actually different from the expectation of the null hypothesis.
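A small simulation can illustrate this point; the population mean and standard deviation below are hypothetical:

```python
# Sketch: larger samples give more accurate estimates of the population mean.
# Population parameters are hypothetical (mean = 100, SD = 15).
import numpy as np

rng = np.random.default_rng(0)
pop_mean, pop_sd = 100, 15

for n in (10, 100, 1000, 10000):
    # Draw many samples of size n and record each sample mean.
    sample_means = [rng.normal(pop_mean, pop_sd, n).mean() for _ in range(2000)]
    spread = np.std(sample_means)   # how much sample means vary around the true value
    print(f"n = {n:5d}: SD of sample means = {spread:.2f} "
          f"(theory sigma/sqrt(n) = {pop_sd / np.sqrt(n):.2f})")
```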
Differential Statistics
• T-tests and F-tests are differential statistics because they detect differences between groups.
• The sampling distribution of all possible t values has a mean of 0 and a standard deviation close to 1 (it approaches 1 as sample size grows).
• It reflects all the possible outcomes we could expect if we compared the means of two groups and the null hypothesis were correct.
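This sampling distribution can be approximated by simulation, drawing both groups from the same hypothetical population so that the null hypothesis is true:

```python
# Sketch: the sampling distribution of t when the null hypothesis is true.
# Both groups come from the same hypothetical population, so any obtained t
# reflects random error only.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n_per_group = 20
t_values = []
for _ in range(10000):
    g1 = rng.normal(50, 10, n_per_group)
    g2 = rng.normal(50, 10, n_per_group)      # same population: H0 is true
    res = stats.ttest_ind(g1, g2)
    t_values.append(res.statistic)

print(f"mean of t values = {np.mean(t_values):.3f}")   # close to 0
print(f"SD of t values   = {np.std(t_values):.3f}")    # close to (slightly above) 1
```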
T-test
• The calculated t value is a ratio of two aspects of the data: the difference between the group means and the variability within groups.
• Group difference: the difference between your obtained sample means. Under the null hypothesis, you expect this difference to be 0.
• The value of t increases as the difference between your obtained sample means increases.
• Within-group variability: the amount of variability of scores about their group mean.
T-test Formula
• t = group difference / within-group variability
• The numerator of the formula is the difference between the means of the two groups.
• The denominator is found by dividing the variance (s²) of each group by the number of subjects in that group, adding these two values, and taking the square root; this is the standard error of the difference: t = (M1 - M2) / sqrt(s1²/n1 + s2²/n2).
• The obtained t value is therefore the mean difference divided by this standard error.
• You would then compare your obtained t to the critical values in a t table to determine whether it is significant.
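A short sketch of this calculation on hypothetical data, checked against SciPy's independent-groups t test (with equal group sizes the two approaches yield the same t):

```python
# Sketch of the t formula above: t = (M1 - M2) / sqrt(s1^2/n1 + s2^2/n2).
# The scores are hypothetical.
import numpy as np
from scipy import stats

group1 = np.array([5, 7, 6, 8, 9, 7, 6, 8])
group2 = np.array([4, 5, 5, 6, 7, 5, 4, 6])

n1, n2 = len(group1), len(group2)
mean_diff = group1.mean() - group2.mean()                        # group difference
se = np.sqrt(group1.var(ddof=1) / n1 + group2.var(ddof=1) / n2)  # within-group variability

t_by_hand = mean_diff / se
res = stats.ttest_ind(group1, group2)

print(f"t by hand  = {t_by_hand:.3f}")
print(f"t by scipy = {res.statistic:.3f}, p = {res.pvalue:.4f}")
# Compare the obtained t to the critical value at df = n1 + n2 - 2 to judge significance.
```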
One-Tailed vs. Two-Tailed Tests
• A one-tailed test is conducted if you are interested only in whether the obtained value of the statistic falls in one tail of the sampling distribution for that statistic.
• This is usually the case when your research hypothesis is directional, e.g., group one will score higher than group two.
• The critical region in a one-tailed test contains 5% of the total area under the curve (alpha = .05).
Two-Tailed Test
• A two-tailed test is used if you want to know whether the new therapy is either better or worse than the standard method.
• You need to check whether your obtained statistic falls into either tail of the distribution.
• There are two critical regions in a two-tailed test.
• To keep the overall probability at .05, the total percentage of cases found in the two tails of the distribution must equal 5%; thus each critical region contains 2.5% of the cases.
• So the score required to reach statistical significance must be more extreme than was necessary for the one-tailed test.
When to Use a One- vs. Two-Tailed Test?
• Major implication: for a given alpha level, you must obtain a greater difference between the means of your two treatment groups to reach statistical significance with a two-tailed test than with a one-tailed test.
• The one-tailed test is more likely to detect a real difference if one is present (that is, it is more powerful).
• However, using a one-tailed test means giving up any information about the reliability of a difference in the other, untested direction.
• The general rule of thumb: always use a two-tailed test unless there are compelling reasons to expect a difference in only one direction.
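As an illustration of this implication, the sketch below evaluates the same hypothetical obtained t under both tests; with these values it reaches significance one-tailed but not two-tailed:

```python
# Sketch: one-tailed vs. two-tailed p-values for the same obtained t.
# The obtained t and df are hypothetical.
from scipy import stats

t_obtained, df, alpha = 1.90, 28, 0.05

p_one_tailed = stats.t.sf(t_obtained, df)           # area in one tail beyond t
p_two_tailed = 2 * stats.t.sf(abs(t_obtained), df)  # area in both tails combined

print(f"one-tailed p = {p_one_tailed:.3f} -> significant: {p_one_tailed < alpha}")
print(f"two-tailed p = {p_two_tailed:.3f} -> significant: {p_two_tailed < alpha}")
```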
F-test
• The analysis of variance, or F test, is an extension of the t test.
• When a study has only one IV with two levels, F and t are virtually identical: F = t².
• ANOVA is used when there are more than two levels of an independent variable.
• The F statistic is a ratio of two types of variance:
• Systematic variance: the deviation of the group means from the grand mean (the mean score of all individuals in all groups).
• Error variance: the deviation of the individual scores in each group from their respective group means.
• The larger the F value, the more likely it is that the result is significant.
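A sketch of a one-way ANOVA on hypothetical data with SciPy, including a check that F = t² when only two groups are compared:

```python
# Sketch: one-way ANOVA on three hypothetical groups, plus the F = t^2 check.
from scipy import stats

group1 = [5, 7, 6, 8, 9, 7]
group2 = [4, 5, 5, 6, 7, 5]
group3 = [8, 9, 7, 9, 10, 8]

anova = stats.f_oneway(group1, group2, group3)
print(f"three groups: F = {anova.statistic:.3f}, p = {anova.pvalue:.4f}")

# With only two groups, ANOVA and the t test give the same answer.
f2 = stats.f_oneway(group1, group2)
t2 = stats.ttest_ind(group1, group2)
print(f"two groups: F = {f2.statistic:.3f}, t^2 = {t2.statistic**2:.3f}")
```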
Effect Size
• Effect size quantifies the size of the difference between groups.
• If we have two groups, the effect size can be expressed as the difference between the group means in standard deviation units.
• The effect size indicates the strength of the relationship. Expressed as a correlation between the IV and the DV, it falls between 0 and 1, and the closer to 1, the stronger the relationship; expressed in standard deviation units it can exceed 1.
• The advantage of the effect size is that it is not a function of the sample size.
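A sketch of the standard-deviation-units version (Cohen's d) on hypothetical data; dividing by the pooled standard deviation is one common choice of denominator:

```python
# Sketch of effect size as Cohen's d: the mean difference expressed in
# pooled standard deviation units (data are hypothetical).
import numpy as np

group1 = np.array([5, 7, 6, 8, 9, 7, 6, 8])
group2 = np.array([4, 5, 5, 6, 7, 5, 4, 6])

n1, n2 = len(group1), len(group2)
pooled_sd = np.sqrt(((n1 - 1) * group1.var(ddof=1) +
                     (n2 - 1) * group2.var(ddof=1)) / (n1 + n2 - 2))

d = (group1.mean() - group2.mean()) / pooled_sd
print(f"Cohen's d = {d:.2f}")   # unlike a correlation, d can exceed 1
```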
Type I and Type II Errors
• A Type I error occurs when the researcher says that a relationship exists when in fact it does not: you have falsely rejected the null hypothesis.
• A Type II error occurs when the researcher says that a relationship does not exist when in fact it does: you have falsely accepted the null hypothesis.
True State of Affairs

Decision       Null is true                   Null is false
Reject Null    Type I error (alpha)           Correct decision (1 - beta)
Accept Null    Correct decision (1 - alpha)   Type II error (beta)
Probability of a Type II Error
• If we set a low alpha level (e.g., p < .01) to decrease the chances of a Type I error (rejecting a null hypothesis that is actually true), we increase the chances of a Type II error.
• True differences are more likely to be detected if the sample size is large.
• If the effect size is large, a Type II error is unlikely.
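The effect of sample size can be checked with a small simulation; the population effect size (0.5 SD) and the sample sizes below are hypothetical:

```python
# Sketch: estimating the Type II error rate by simulation when a real
# difference of 0.5 SD exists (hypothetical effect size and sample sizes).
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
effect_size, alpha, n_sims = 0.5, 0.05, 2000

for n in (10, 30, 100):
    misses = 0
    for _ in range(n_sims):
        g1 = rng.normal(0, 1, n)
        g2 = rng.normal(effect_size, 1, n)   # a real difference exists
        res = stats.ttest_ind(g1, g2)
        if res.pvalue >= alpha:              # failed to detect the real difference
            misses += 1
    print(f"n per group = {n:3d}: estimated Type II error rate = {misses / n_sims:.2f}")
```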
Interpreting Nonsignificant Results
• Negative or nonsignificant results are difficult to interpret.
• There are several possible causes of nonsignificant results:
• The instructions could be hard to understand.
• A weak manipulation of the independent variable.
• An unreliable or insensitive dependent measure.
• A sample size that is too small.
Choosing a Sample Size: Power Analysis
• Sample size can be based on what is typical in a particular area of research.
• Sample size can also be based on a desired probability of correctly rejecting the null hypothesis.
• This probability is called the power of the statistical test: the sensitivity of the statistical procedure to detect differences in your data.
• Power = 1 - p(Type II error) = 1 - beta.
• Power analyses are usually computed with software.
• Higher desired power demands a greater sample size.
• Researchers usually use a power between .70 and .90 to determine sample size.
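As a sketch of such a calculation, the snippet below uses statsmodels to find the sample size needed per group for a hypothetical effect size of 0.5 and a desired power of .80:

```python
# Sketch: power analysis for an independent-groups t test with statsmodels.
# The effect size (0.5) and desired power (.80) are hypothetical choices.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
n_per_group = analysis.solve_power(effect_size=0.5, alpha=0.05,
                                   power=0.80, alternative='two-sided')
print(f"required sample size per group = {n_per_group:.0f}")
```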