310 likes | 620 Views
Bivariate Statistical Analysis: Differences Between Two Variables. Recognize when a bivariate statistical test is appropriate Calculate and interpret a χ 2 test for a contingency table Calculate and interpret an independent samples t -test comparing two means
E N D
Bivariate Statistical Analysis:Differences Between Two Variables
Recognize when a bivariate statistical test is appropriate • Calculate and interpret a χ2 test for a contingency table • Calculate and interpret an independent samples t-test comparing two means • Understand the concept of analysis of variance (ANOVA) • Interpret an ANOVA table
What Is the Appropriate Test of Difference? • Test of Differences • An investigation of a hypothesis that two (or more) groups differ with respect to measures on a variable. • Behavior, characteristics, beliefs, opinions, emotions, or attitudes • Bivariate Tests of Differences • Involve only two variables: a variable that acts like a dependent variable and a variable that acts as a classification variable. • Differences in mean scores between groups or in comparing how two groups’ scores are distributed across possible response categories.
Cross-Tabulation Tables: The χ2 Test for Goodness-of-Fit • Cross-Tabulation (Contingency) Table • A joint frequency distribution of observations on two more variables. • χ2 Distribution • Provides a means for testing the statistical significance of a contingency table. • Involves comparing observed frequencies (Oi) with expected frequencies (Ei) in each cell of the table. • Captures the goodness- (or closeness-) of-fit of the observed distribution with the expected distribution.
Chi-Square Test χ² = chi-square statistic Oi= observed frequency in the ith cell Ei = expected frequency on the ith cell Ri = total observed frequency in the ith row Cj = total observed frequency in the jth column n = sample size
Degrees of Freedom (d.f.) d.f.=(R-1)(C-1)
Example: Papa John’s Restaurants Univariate Hypothesis:Papa John’s restaurants are more likely to be located in a stand-alone location or in a shopping center. Bivariate Hypothesis: Stand-alone locations are more likely to be profitable than are shopping center locations.
Example: Papa John’s Restaurants (cont’d) • In this example, χ2 = 22.16 with 1 d.f. • From Table A.4, the critical value at the 0.05 level with 1 d.f. is 3.84. • Thus, we are 95 percent confident that the observed values do not equal the expected values. • But are the deviations from the expected values in the hypothesized direction?
χ2 Test for Goodness-of-Fit Recap Testing the hypothesis involves two key steps: • Examine the statistical significance of the observed contingency table. • Examine whether the differences between the observed and expected values are consistent with the hypothesized prediction.
Accurate Information? How About a Chi-Square Test? When is a cross-tabulation with a χ2 test appropriate? Ask these questions: Are multiple variables expected to be related to one another? Is the independent variable nominal or ordinal? Is the dependent variable nominal or ordinal? If answers are “Yes,” then it is appropriate.
The t-Test for Comparing Two Means • Independent Samples t-Test • A test for hypotheses stating that the mean scores for some interval- or ratio-scaled variable grouped based on some less-than-interval classificatory variable are not the same.
The t-Test for Comparing Two Means (cont’d) • Pooled Estimate of the Standard Error • An estimate of the standard error for a t-test of independent means that assumes the variances of both groups are equal.
Expert “T-eeze” When is an independent samples t-test appropriate? Is the dependent variable interval or ratio? Can the dependent variable scores be grouped based upon some categorical variable? Does the grouping result in scores drawn from independent samples? Are two groups involved in the research question? If answer to all questions is “yes,” then it is appropriate.
Comparing Two Means (cont’d) • Paired-Samples t-Test • Compares the scores of two interval variables drawn from related populations. • Used when means need to be compared that are not from independent samples.
The Z-Test for Comparing Two Proportions • Z-Test for Differences of Proportions • Tests the hypothesis that proportions are significantly different for two independent samples or groups. • Requires a sample size greater than thirty. • The hypothesis is: Ho: π1 = π2may be restated as: Ho: π1 - π2 = 0
The Z-Test for Comparing Two Proportions • Z-Test statistic for differences in large random samples: p1 = sample portion of successes in Group 1 p2 = sample portion of successes in Group 2 (p1 - p1) = hypothesized population proportion 1 minus hypothesized population proportion 2 Sp1-p2= pooled estimate of the standard errors of differences of proportions
The Z-Test for Comparing Two Proportions • To calculate the standard error of the differences in proportions:
One-Way Analysis of Variance (ANOVA) • Analysis of Variance (ANOVA) • An analysis involving the investigation of the effects of one treatment variable on an interval-scaled dependent variable. • A hypothesis-testing technique to determine whether statistically significant differences in means occur between two or more groups. • A method of comparing variances to make inferences about the means. • The substantive hypothesis tested is: • At least one group mean is not equal to another group mean.
Partitioning Variance in ANOVA • Total Variability • Grand Mean • The mean of a variable over all observations. • SST = Total of (observed value-grand mean)2
Partitioning Variance in ANOVA • Between-Groups Variance • The sum of differences between the group mean and the grand mean summed over all groups for a given set of observations. • SSB = Total of ngroup(Group Mean − Grand Mean)2 • Within-Group Error or Variance • The sum of the differences between observed values and the group mean for a given set of observations • Also known as total error variance. • SSE = Total of (Observed Mean − Group Mean)2
The F-Test • F-Test • Used to determine whether there is more variability in the scores of one sample than in the scores of another sample. • Variance components are used to compute F-ratios