Intro to Statistics for the Behavioral Sciences PSYC 1900. Lecture 17: Chi-Square
Chi-Square Analyses • Chi-Square tests are used to analyze categorical (as opposed to continuous or ranked) data. • Both independent and dependent variables are on nominal scales. • Data in cells represent frequencies (counts of observations) as opposed to measured scores on variables.
One Classification Variable: Chi-Square Goodness-of-Fit Test • Sometimes, we may be interested in determining if a specific category for a nominal variable occurs more frequently than would be expected by chance alone. • For example, are people more likely to be right-handed than left-handed? Is there a significant preference for salty as opposed to sweet or spicy snacks? • We can answer such questions by comparing observed frequencies with theoretically predicted ones.
Example • We have a sample of 99 participants and ask them to choose one of 3 snacks (salty, sweet, spicy). • The null hypothesis would be that no divergent preferences exist: each option is equally likely to be selected. • Expected frequencies are the number of observations expected if the null is true. • With 99 participants and 3 equally likely options, the expected frequency is 99 / 3 = 33 for each type of snack.
Example • We can then compare the actual versus predicted preferences. • Observed: • 45 (Salty), 26 (Sweet), 28 (Spicy) • Expected: • 33 (Salty), 33 (Sweet), 33 (Spicy) • Our task now is to determine if the deviation from expected frequencies is unlikely to represent sampling error.
Chi-Square Test • The logic of the Chi-Square test is straightforward. • We sum the squared deviations between observed (O) and expected (E) frequencies, each scaled by its own expected frequency: χ² = Σ (O − E)² / E. • For example, if we had expected only 10 observations and found 20, that is a large discrepancy. If we had expected 100 and found 110, the same difference of 10 is much less consequential.
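The calculation described on this slide can be sketched in a few lines of Python using the snack data from the example. The function name `chi_square_statistic` is an illustrative choice, not something from the lecture; it sums (observed − expected)² / expected over the categories.

```python
def chi_square_statistic(observed, expected):
    """Sum of squared deviations, each divided by its expected frequency."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Snack example from the lecture: 99 participants, 3 equally likely options.
observed = [45, 26, 28]                          # salty, sweet, spicy
n = sum(observed)                                # 99
expected = [n / len(observed)] * len(observed)   # 33 each under the null

chi_sq = chi_square_statistic(observed, expected)
print(round(chi_sq, 2))  # 6.61

# The slide's scaling example: the same raw difference of 10 counts for
# much more when few observations were expected.
small_expected = chi_square_statistic([20], [10])    # 10.0
large_expected = chi_square_statistic([110], [100])  # 1.0
```

Dividing by the expected frequency is what makes a difference of 10 weigh ten times more when 10 observations were expected than when 100 were.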
Is it Significant? • Of course, we now have to determine the likelihood of this value. We do so by referring to the Chi-Square distribution. • df = number of categories − 1 • Like t and F, the Chi-Square distribution is a family of distributions whose shape changes as a function of df. • It is positively skewed, especially for small df.
Is it Significant? • We can see that the critical value for a df=2 test at alpha = .05 is 5.99. • For the snack data, χ² = 218/33 ≈ 6.61, which exceeds 5.99. • We can reject the null and state that there seems to be a significant preference for salty snacks.
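For the special case of df = 2, the upper-tail probability of the chi-square distribution has the closed form exp(−x/2), so the snack result can be checked without a table. The sketch below relies on that identity and is valid only for df = 2; other df values would need a table or a statistics library. The function names are illustrative, not from the lecture.

```python
import math


def chi_square_p_value_df2(x):
    """Upper-tail probability P(X >= x) for a chi-square variable, df = 2 only."""
    return math.exp(-x / 2)


def chi_square_critical_df2(alpha):
    """Critical value for df = 2 only: solves exp(-x / 2) = alpha for x."""
    return -2 * math.log(alpha)


# Snack data from the lecture.
observed = [45, 26, 28]
expected = [33, 33, 33]
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

critical = chi_square_critical_df2(0.05)   # about 5.99, matching the slide
p_value = chi_square_p_value_df2(chi_sq)   # below .05
reject_null = chi_sq > critical

print(round(critical, 2), round(chi_sq, 2), reject_null)
```

Comparing the statistic with the critical value and comparing the p value with alpha are two views of the same decision; both lead to rejecting the null here.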
Two Classification Variables • A more common use occurs with 2 variables (often an IV and a DV). • For example, does a political advertisement that makes you angry result in more votes for a candidate than a more neutral one? • Have participants watch one type of ad and then record their voting behavior.
Data • The same formula is used to calculate chi-square. • The expected frequency for each cell is the product of that cell's row total and column total (i.e., the marginal totals) divided by the total sample size N.
Results • df = (R-1)(C-1), where R is the number of rows and C is the number of columns in the table.
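The expected-frequency rule and the df formula for a two-way table can be sketched as follows. The lecture's actual data table is not reproduced in this text, so the counts below are hypothetical and chosen only for illustration; the function names are likewise assumptions.

```python
def expected_frequencies(table):
    """Expected count per cell: row total * column total / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_square_independence(table):
    """Return (chi-square statistic, df) for an R x C table of counts."""
    expected = expected_frequencies(table)
    chi_sq = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi_sq, df


# HYPOTHETICAL counts (not the lecture's data).
# Rows: ad type (angry, neutral); columns: voted for candidate (yes, no).
table = [
    [35, 15],
    [25, 25],
]

expected = expected_frequencies(table)      # [[30.0, 20.0], [30.0, 20.0]]
chi_sq, df = chi_square_independence(table)
print(expected, round(chi_sq, 2), df)
```

For these made-up counts the statistic is about 4.17 with df = 1; it would be compared with the critical value for df = 1 in the same way as in the goodness-of-fit test.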
Effect Size • Most common is Cramér's phi: φ_c = √(χ² / (N(k − 1))), where k is the smaller of R and C. • Cramér's phi squared gives an index of the amount of variance explained (similar to eta squared).
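As a rough sketch, Cramér's phi can be computed from the chi-square statistic with the standard formula √(χ² / (N(k − 1))), where k is the smaller of the number of rows and columns. The input values below are hypothetical (a 2 x 2 table with N = 100 and χ² = 25/6), not results from the lecture.

```python
import math


def cramers_phi(chi_sq, n, n_rows, n_cols):
    """Cramér's phi: sqrt(chi-square / (N * (k - 1))), k = min(rows, cols)."""
    k = min(n_rows, n_cols)
    if k < 2:
        raise ValueError("the table needs at least 2 rows and 2 columns")
    return math.sqrt(chi_sq / (n * (k - 1)))


# HYPOTHETICAL values for illustration only.
phi = cramers_phi(chi_sq=25 / 6, n=100, n_rows=2, n_cols=2)
phi_squared = phi ** 2

print(round(phi, 3), round(phi_squared, 3))
```

For a 2 x 2 table k − 1 equals 1, so the formula reduces to √(χ² / N).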
Chi-Square and Proportions • Chi-Square tests can be used to analyze proportions if you convert the proportions to actual frequencies.
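The conversion matters because the chi-square statistic depends on sample size: the same proportions give a larger value with more participants. A minimal sketch, assuming two groups with a yes/no outcome; the proportions, group sizes, and function name are hypothetical.

```python
def proportions_to_frequencies(proportions, group_sizes):
    """Turn per-group 'yes' proportions into rows of [yes, no] counts."""
    table = []
    for p, n in zip(proportions, group_sizes):
        yes = round(p * n)          # counts must be whole numbers
        table.append([yes, n - yes])
    return table


# HYPOTHETICAL example: 70% of 50 people in one group and 50% of 50 people
# in another chose "yes".
table = proportions_to_frequencies([0.70, 0.50], [50, 50])
print(table)  # [[35, 15], [25, 25]]
```

The resulting table of counts is what the chi-square formula should be applied to, never the proportions themselves.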
Chi-Square Assumptions • All data are independent. • No participant can be included more than once. • As a rule of thumb, the expected frequencies for all cells should be no smaller than 5.
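The rule of thumb about expected frequencies can be checked before running the test. The sketch below uses a hypothetical small table; the threshold of 5 comes from the slide, while the function names and data are assumptions.

```python
def smallest_expected_frequency(table):
    """Smallest expected cell count (row total * column total / N)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return min(r * c / n for r in row_totals for c in col_totals)


def meets_rule_of_thumb(table, minimum=5):
    """True if every expected frequency is at least `minimum`."""
    return smallest_expected_frequency(table) >= minimum


# HYPOTHETICAL small sample: expected counts of 2.5 fall below 5.
small_table = [[3, 7], [2, 8]]
larger_table = [[35, 15], [25, 25]]

print(smallest_expected_frequency(small_table), meets_rule_of_thumb(small_table))
print(smallest_expected_frequency(larger_table), meets_rule_of_thumb(larger_table))
```

Note that the check applies to expected frequencies, not observed ones: a cell may contain fewer than 5 observations as long as its expected count is at least 5.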