Educational Research: Data analysis and interpretation – 2 Inferential statistics

Educational Research: Data analysis and interpretation – 2 Inferential statistics. EDU 8603 Educational Research. Richard M. Jacobs, OSA, Ph.D.

Statistics... • A set of mathematical procedures for describing, synthesizing, analyzing, and interpreting quantitative data …the selection of an appropriate statistical technique is determined by the research design, hypothesis, and the data collected

inferential statistics... …mathematical tools that permit the researcher to generalize to a population of individuals based upon information obtained from a limited number of research participants

sampling error... …the differences in samples due to random fluctuations within the population

…sampling errors vary in size …but are normally distributed around the population mean (M) …and take the shape of a bell curve

standard error... …the standard deviation of the sample means (SEx)

…tells the researcher by how much the researcher would expect the sample means to differ if the researcher used other samples from the same population

but... …the researcher does not have to select a large number of samples from a population to estimate the standard error

a mathematical formula can be used to estimate the standard error...

$$SE_{\bar{x}} = \frac{SD}{\sqrt{N - 1}}$$

…a smaller standard error indicates less sampling error

…the major factor affecting the size of the standard error of the mean is sample size …but, the size of the population standard deviation also affects the standard error of the mean
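The standard error formula can be illustrated with a short Python sketch. The scores are hypothetical, and the sketch assumes that SD is the standard deviation computed with N in the denominator, which is the reading under which dividing by √(N − 1) matches the more familiar form (the N − 1 standard deviation divided by √N).

```python
import math
import statistics


def standard_error(scores):
    """Estimate the standard error of the mean as SD / sqrt(N - 1).

    Assumption: SD is computed with N in the denominator
    (statistics.pstdev), not N - 1.
    """
    n = len(scores)
    sd = statistics.pstdev(scores)
    return sd / math.sqrt(n - 1)


# Hypothetical test scores from one sample of eight participants
scores = [72, 85, 78, 90, 66, 81, 77, 88]
se = standard_error(scores)

# Equivalent form: the N - 1 standard deviation divided by sqrt(N)
se_alt = statistics.stdev(scores) / math.sqrt(len(scores))

print(round(se, 3), round(se_alt, 3))
```

Both expressions give the same value, and either one shrinks as N grows, which is why sample size is the major factor affecting the standard error.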

The null hypothesis (H0)... • the statement that the difference between two sample means is due to random, chance, sampling error …indicates that there is no true difference or relationship between parameters in the populations

the null hypothesis differs in most instances from the research hypothesis (H1) …which states that one method is expected to be more effective than another

rejecting the null hypothesis provides evidence (but not proof) that the treatment had an effect …in other words, that the difference between the groups on the dependent variable is due to something other than random, chance, sampling error

The research question, then, is: …whether to reject the null hypothesis or fail to reject it

There are four possibilities: 1. The null hypothesis is true and the researcher concludes that it is true A = B…a correct decision

2. The null hypothesis is false and the researcher concludes that it is false A ≠ B…a correct decision

3. The null hypothesis is true but the researcher concludes that it is false A = B…an incorrect decision

4. The null hypothesis is false but the researcher concludes that it is true A ≠ B…an incorrect decision

Decisions concerning rejecting the null hypothesis…

| The researcher’s decision about the null hypothesis… | The true status of the null hypothesis: True | The true status of the null hypothesis: False |
| True | Correct | Incorrect |
| False | Incorrect | Correct |

Decisions concerning rejecting the null hypothesis…

| The researcher’s decision about the null hypothesis… | The true status of the null hypothesis: True | The true status of the null hypothesis: False |
| True | Correct | Type II Error |
| False | Type I Error | Correct |
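The four outcomes can also be written as a small lookup, which may help in keeping the two error types apart. This is only an illustrative sketch; the function name `classify_decision` is hypothetical.

```python
def classify_decision(h0_is_true: bool, h0_rejected: bool) -> str:
    """Name the outcome of a decision about the null hypothesis."""
    if h0_is_true and h0_rejected:
        # A true null hypothesis was rejected
        return "Type I error"
    if not h0_is_true and not h0_rejected:
        # A false null hypothesis was not rejected
        return "Type II error"
    return "correct decision"


for h0_is_true in (True, False):
    for h0_rejected in (True, False):
        outcome = classify_decision(h0_is_true, h0_rejected)
        print(f"H0 true: {h0_is_true}, H0 rejected: {h0_rejected} -> {outcome}")
```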

researchers use a test of significance to determine whether to reject or fail to reject the null hypothesis …involves pre-selecting a level of probability, “α” (e.g., α = .05) that serves as the criterion to determine whether to reject or fail to reject the null hypothesis

Steps in using inferential statistics…
1. select the test of significance
2. determine whether the significance test will be two-tailed or one-tailed
3. select α (alpha), the probability level
4. compute the test of significance
5. consult a table to determine the significance of the results

Tests of significance... • statistical formulas that enable the researcher to determine if there was a real difference between the sample means

…different tests of significance account for different factors, including: the scale of measurement represented by the data, the method of participant selection, the number of groups being compared, and the number of independent variables

…the researcher must first decide whether a parametric or nonparametric test must be selected

parametric test... …assumes that the variable measured is normally distributed in the population …the data must represent an interval or ratio scale of measurement

…the selection of participants is independent …the variances of the population comparison groups are equal

…a “more powerful” test in that it is more likely to reject a null hypothesis that is false, that is, the researcher is less likely to commit a Type II error …used when the data represent an interval or ratio scale

nonparametric test... …makes no assumption about the distribution of the variable in the population, that is, the shape of the distribution

…used when the data represent a nominal or ordinal scale, when a parametric assumption has been greatly violated, or when the nature of the distribution is not known

…a “less powerful” test in that it is less likely to reject a null hypothesis at a given level of significance …usually requires a larger sample size to reach the same level of significance as a parametric test

The most common tests of significance… t-test, ANOVA, Chi Square

t-test... …used to determine whether two means are significantly different at a selected probability level …adjusts for the fact that the distribution of scores for small samples becomes increasingly different from the normal distribution as sample sizes become increasingly smaller

…the strategy of the t-test is to compare the actual mean difference observed to the difference expected by chance

…forms a ratio where the numerator is the difference between the sample means and the denominator is the chance difference that would be expected if the null hypothesis were true

…after the numerator is divided by the denominator, the resulting t value is compared to the appropriate t table value, depending on the probability level and the degrees of freedom

…if the t value is equal to or greater than the table value, then the null hypothesis is rejected because the difference is greater than would be expected due to chance
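A minimal sketch of this ratio for two independent groups follows. The scores are hypothetical, the pooled-variance form of the denominator is an assumption (the slides do not give the formula), and 2.306 is the standard two-tailed table value for α = .05 with 8 degrees of freedom.

```python
import math
import statistics


def independent_t(group_a, group_b):
    """Return (t, df) for two independent samples, pooled-variance form."""
    n_a, n_b = len(group_a), len(group_b)
    # Numerator: the observed difference between the sample means
    mean_diff = statistics.mean(group_a) - statistics.mean(group_b)
    # Denominator: the difference expected by chance (standard error
    # of the difference), built from the pooled within-group variance
    pooled_var = (
        (n_a - 1) * statistics.variance(group_a)
        + (n_b - 1) * statistics.variance(group_b)
    ) / (n_a + n_b - 2)
    se_diff = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return mean_diff / se_diff, n_a + n_b - 2


# Hypothetical achievement scores under two teaching methods
method_a = [85, 90, 78, 92, 88]
method_b = [75, 80, 72, 85, 78]

t_value, df = independent_t(method_a, method_b)
table_value = 2.306  # two-tailed, alpha = .05, df = 8
reject_h0 = abs(t_value) >= table_value
print(f"t = {t_value:.3f}, df = {df}, reject H0: {reject_h0}")
```

The absolute value is used because the test here is two-tailed; a one-tailed test would compare the signed t value against a one-tailed table value instead.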

…there are two types of t-tests: the t-test for independent samples (randomly formed) and the t-test for nonindependent samples (nonrandomly formed, e.g., matching, performance on a pre-/post- test, different treatments)
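For nonindependent samples, such as the same participants measured on a pretest and a posttest, the ratio is formed from the paired differences. The sketch below uses hypothetical scores; 2.571 is the standard two-tailed table value for α = .05 with 5 degrees of freedom.

```python
import math
import statistics


def paired_t(pre, post):
    """Return (t, df) for paired scores from the same participants."""
    diffs = [after - before for before, after in zip(pre, post)]
    n = len(diffs)
    # Mean difference divided by the standard error of the differences
    se_diff = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / se_diff, n - 1


# Hypothetical pretest and posttest scores for six participants
pre = [60, 72, 55, 80, 68, 75]
post = [66, 75, 60, 84, 70, 82]

t_paired, df_paired = paired_t(pre, post)
table_value_paired = 2.571  # two-tailed, alpha = .05, df = 5
reject_paired = abs(t_paired) >= table_value_paired
print(f"t = {t_paired:.3f}, df = {df_paired}, reject H0: {reject_paired}")
```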

ANOVA... …used to determine whether two or more means are significantly different at a selected probability level …avoids the need to compute duplicate t-tests to compare groups

…the strategy of ANOVA is that total variation, or variance, can be divided into two sources: a) treatment variance (“between groups” variance, caused by the treatment groups) and b) error variance (“within groups” variance)

…forms a ratio, the F ratio, with the treatment variance as the numerator (between group variance) and error variance as the denominator (within group variance)

…the assumption is that randomly formed groups of participants are chosen and are essentially the same at the beginning of a study on a measure of the dependent variable

…at the study’s end, the question is whether the variance between the groups differs from the error variance by more than what would be expected by chance

…if the treatment variance is sufficiently larger than the error variance, a significant F ratio results, that is, the null hypothesis is rejected and it is concluded that the treatment had a significant effect on the dependent variable

…if the treatment variance is not sufficiently larger than the error variance, a nonsignificant F ratio results, that is, the null hypothesis is not rejected and it is concluded that there is no evidence that the treatment had a significant effect on the dependent variable

…when the F ratio is significant and more than two means are involved, researchers use multiple comparison procedures (e.g., Scheffé test, Tukey’s HSD test, Duncan’s multiple range test)
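The F ratio for a one-way ANOVA can be sketched as follows. The three groups of scores are hypothetical, and 5.14 is the standard table value of F for α = .05 with 2 and 6 degrees of freedom.

```python
import statistics


def one_way_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_scores = [x for g in groups for x in g]
    grand_mean = statistics.mean(all_scores)
    # Treatment ("between groups") sum of squares
    ss_between = sum(
        len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups
    )
    # Error ("within groups") sum of squares
    ss_within = sum(
        (x - statistics.mean(g)) ** 2 for g in groups for x in g
    )
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    # Each variance estimate is a sum of squares divided by its df
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between / ms_within, df_between, df_within


# Hypothetical scores for three treatment groups
groups = [[4, 5, 6], [6, 7, 8], [9, 10, 11]]

f_ratio, df_b, df_w = one_way_f(groups)
table_f = 5.14  # alpha = .05, df = (2, 6)
reject_anova = f_ratio >= table_f
print(f"F({df_b}, {df_w}) = {f_ratio:.2f}, reject H0: {reject_anova}")
```

A significant F here only says that at least one mean differs; identifying which means differ is the job of the multiple comparison procedures named above.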

Factorial ANOVA (FANOVA)... …used when a research study uses a factorial design to investigate two or more independent variables and the interactions between them …provides a separate F ratio for each independent variable and each interaction

Multiple Regression... …a prediction equation that includes more than one predictor …the predictors are variables known to individually predict (correlate with) the criterion; combining them yields a more accurate prediction

Chi Square (χ²)... …a nonparametric test of significance appropriate for nominal or ordinal data that can be converted to frequencies …compares the proportions actually observed (O) to the proportions expected (E) to see if they are significantly different
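A minimal sketch of the comparison of observed (O) to expected (E) values follows, computed on frequencies as Σ (O − E)² / E. The counts are hypothetical, the expected frequencies assume equal preference across three options, and 5.991 is the standard table value for α = .05 with 2 degrees of freedom.

```python
def chi_square(observed, expected):
    """Return the chi-square statistic, the sum of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts of 100 students choosing among three options
observed = [30, 50, 20]
# Expected counts if all three options were equally preferred
expected = [sum(observed) / len(observed)] * len(observed)

chi2 = chi_square(observed, expected)
df_chi = len(observed) - 1
table_chi = 5.991  # alpha = .05, df = 2
reject_chi = chi2 >= table_chi
print(f"chi-square = {chi2:.2f}, df = {df_chi}, reject H0: {reject_chi}")
```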
