Chi-Square Analyses
Please refrain from typing, surfing or printing during our conversation! Outline of Today’s Discussion • The Chi-Square Test of Independence – Introduction • The Chi-Square Test of Independence – Excel • The Chi-Square Test of Independence – SPSS • The Chi-Square Test for Goodness of Fit - Introduction • The Chi-Square Test for Goodness of Fit – Excel
Part 1 Chi-Square Test of Independence (Introduction)
Chi-Square: Independence • The chi-square is a non-parametric test - It’s NOT based on a mean, and it does not require that the data are bell-shaped (i.e., Gaussian distributed). • We can use the Chi-square test for analyzing data from certain between-subjects designs. Will someone remind us about between-subjects versus within-subjects designs? • Chi-square tests are appropriate for the analysis of categorical data (i.e., on a nominal scale).
Chi-Square: Independence • Sometimes a behavior can be described only in an all-or-none manner. • Example: Maybe a particular behavior was either observed or not observed. • Example: Maybe a participant either completed an assigned task, or did not. • Example: Maybe a participant either solved a designated problem, or did not.
Chi-Square: Independence • We can use the Chi-square to test data sets that simply reflect how frequently a particular category of behavior is observed. • The Chi-square test of independence is also called the two-way chi-square. • The Chi-square test of independence requires that two variables are assessed for each participant.
Chi-Square: Independence • The Chi-square test is based on a comparison between values that are observed (O), and values that would be expected (E) if the null hypothesis were true. • The null hypothesis would state that there is no relationship between the two variables, i.e., that the two variables are independent of each other. • The chi-square test allows us to determine if we should reject or retain the null hypothesis…
Chi-Square: Independence • To calculate the chi-square statistic, we need to develop a so-called “contingency table”. • In the contingency table, the levels of one variable are displayed across rows, and the levels of the other variable are displayed across columns. • Let’s see a simple 2 x 2 design…
Chi-Square: Independence Contingency Table: 2 rows (City: Minneapolis, Atlanta) by 2 columns (Political Party: Democrat, Republican). The “marginal frequencies” are the row totals and column totals for each level of a particular variable.
Chi-Square: Independence • A “cell” in the table is defined as a unique combination of variables (e.g., city, political party). • For each cell in the contingency table, we need to calculate the expected frequency. • To get the expected frequency for a cell, we use the following formula…
Chi-Square: Independence The expected (E) frequency of a cell. Example
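For reference, the standard expected-frequency formula for the cell in row i and column j can be written as below. The symbols R_i (row total), C_j (column total), and N (grand total of all observations) are labels assumed here for illustration; they are built from the marginal frequencies defined earlier.

```latex
E_{ij} = \frac{R_i \times C_j}{N}
```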
Chi-Square: Independence Contingency table: City (Minneapolis, Atlanta) by Political Party (Democrat, Republican). Does everyone now understand where this 28 came from?
Chi-Square: Independence Here’s the Chi-square statistic. Let’s define the components…
Chi-Square: Independence Components of the Chi-Square Statistic
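In its standard form, the chi-square statistic sums one term per cell of the contingency table, where O is the observed frequency and E is the expected frequency of that cell. The notation below is the conventional one and is assumed to match the slide:

```latex
\chi^2 = \sum_{\text{cells}} \frac{(O - E)^2}{E}
```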
Chi-Square: Independence We’ll need one of these for each cell in our contingency table. Then, we’ll sum those up!
Chi-Square: Independence Check: Be sure to have one of these for each cell in your contingency table. We’ll reduce them, then sum them…
Chi-Square: Independence Finally, for each cell, reduce the parenthetical expression to a single number, and sum those up.
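The steps above (expected frequency per cell, one term per cell, then the sum) can be sketched in a few lines of Python. The counts below are hypothetical and are NOT the values from the class example; they only illustrate the mechanics for a 2 x 2 table.

```python
# Hypothetical observed counts (NOT the values from the class example).
observed = [
    [30, 20],  # Minneapolis: Democrat, Republican
    [15, 35],  # Atlanta:     Democrat, Republican
]

# Marginal frequencies: row totals, column totals, and the grand total N.
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected frequency for each cell: (row total * column total) / N.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# One (O - E)^2 / E term per cell, then sum them all.
terms = [
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
]
chi_square = sum(terms)

# Degrees of freedom: (rows - 1) * (columns - 1).
df = (len(observed) - 1) * (len(observed[0]) - 1)

print(expected)
print(round(chi_square, 2), df)
```

A 2 x 2 table always produces four terms, one per cell, which is a quick check that no cell was skipped.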
Chi-Square: Independence • After calculating the Chi-square statistic, we need to compare it to a “critical value” to determine whether to reject or retain the null hypothesis. • The critical value depends on the alpha level. What does the alpha level indicate, again? • The critical value also depends on the “degrees of freedom”, which is directly related to the number of levels being tested…
Chi-Square: Independence Formula for the “degrees of freedom”: df = (R - 1)(C - 1), where R = # of rows and C = # of columns. In our example, we have 2 rows and 2 columns, so df = (2 - 1)(2 - 1) = 1.
Chi-Square: Independence • We will soon attempt to develop some intuitions about the “degrees of freedom” (df), and why they are important. • For now, we will simply compute the df so that we can determine the critical value. • For df = 1, and an alpha level of 0.05, what is the critical value? (see the hand-out showing the critical values table). • How does the critical value compare to the value of chi-square that we obtained (i.e., 6.43)? • So, what do we decide about the null hypothesis?
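As a cross-check on the critical values table, the comparison can be sketched in Python using only the standard library. This sketch relies on an identity that holds only for df = 1 (the upper-tail probability equals erfc(sqrt(x / 2))); the function names are hypothetical, chosen for illustration.

```python
import math

def chi_square_p_df1(x):
    """Upper-tail probability of a chi-square value x when df = 1."""
    return math.erfc(math.sqrt(x / 2.0))

def critical_value_df1(alpha, lo=0.0, hi=100.0):
    """Find the chi-square value whose upper-tail probability equals alpha."""
    for _ in range(100):  # bisection search
        mid = (lo + hi) / 2.0
        if chi_square_p_df1(mid) > alpha:
            lo = mid  # tail is still too large: the critical value lies further right
        else:
            hi = mid
    return (lo + hi) / 2.0

critical = critical_value_df1(0.05)
obtained = 6.43  # chi-square value obtained in the class example

print(round(critical, 3))
print("reject" if obtained > critical else "retain")
```

The decision rule is the same one used with the printed table: reject the null hypothesis only when the obtained value exceeds the critical value.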
Chi-Square: Independence • Congratulations! You’ve completed your first try at hypothesis testing! • In a way, the computations are somewhat similar to the various “r” statistics you’ve previously calculated. • However, we had not previously compared our “r” statistics to a critical value. So, we had not previously drawn any conclusions about statistical significance. • Questions so far?
Chi-Square: Independence • Before we move on, I’d like you to develop some intuitions about the computations… • Let’s look at a portion of the computation that you just completed, and really understand it…
Chi-Square: Independence Under what circumstances would the expression that’s circled produce a zero?
Chi-Square: Independence In general, when the observed and expected values are very similar to each other, the chi-square statistic will be small (and we’ll likely retain the null hypothesis).
Chi-Square: Independence By contrast, when the observed and expected values are very different from each other, the chi-square statistic will be large (and we’ll likely reject the null hypothesis).
Chi-Square: Independence • The decision to reject or retain the null hypothesis depends, of course, not only on the chi-square value that we obtain, but also on the critical value. • Look at the critical values on the Chi-square table that was handed out. What patterns do you see, and why do those patterns occur? • Questions or comments?
Part 2 Chi-Square Test of Independence In Excel
Part 3 Chi-Square Test of Independence In SPSS
Chi-Square in SPSS • Here’s the sequence of steps for Chi-Square in SPSS. • Analyze --> Descriptive Statistics --> Crosstabs (yeah, it’s weird). • Select the two variables of interest by moving one into the ROWS box, and the other into the COLUMNS box. • Statistics --> check off the chi-square • Cell display --> check off observed, expected, row & column • In the output, look for the value of Pearson Chi-Square. To reject the null hypothesis, we need “asymp sig (2 sided)” to be < 0.05, our alpha level.
Chi-Square in SPSS When the “asymp sig (2 sided)” value is < 0.05, reject the null hypothesis. In practice, there are 2 alpha levels: There’s the criterion alpha level (usually 0.05), and the observed alpha level (the p-value, shown in SPSS output).
Chi-Square in SPSS • For a given degree-of-freedom level, there is an inverse relationship between the observed chi-square statistic and observed alpha level. • The higher the observed chi-square value, the smaller the observed alpha level, i.e., “sig” value.
Chi-Square in SPSS There is a low probability of large χ² values. Large chi-square values are unlikely to occur just by chance, so large chi-square values correspond to low alpha levels. (Note: Alpha levels are called “sig” values in SPSS)
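The inverse relationship between the chi-square value and the “sig” value can be seen numerically. The sketch below assumes df = 1, where the chance probability of a chi-square at least as large as x equals erfc(sqrt(x / 2)); the list of values is arbitrary and chosen only for illustration.

```python
import math

# For df = 1 only: the probability of a chi-square at least as large as x
# arising by chance is erfc(sqrt(x / 2)).
values = [0.5, 1.0, 2.0, 4.0, 6.43, 10.0]
sig = [math.erfc(math.sqrt(x / 2.0)) for x in values]

for x, p in zip(values, sig):
    print(f"chi-square = {x:5.2f}  sig = {p:.4f}")
```

Each step up in chi-square produces a smaller “sig” value, which is the pattern the SPSS output reflects.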
Part 4 Chi-Square Test: Goodness-of-Fit
Chi-Square: Goodness-of-Fit • Good news! The test for the goodness-of-fit is much simpler than that for independence! :-) • In the test for goodness-of-fit, each participant is categorized on ONLY ONE VARIABLE. • In the test for independence, participants were categorized on two different variables (i.e., city and political party).
Chi-Square: Goodness-of-Fit • The null hypothesis states that the expected frequencies will provide a “good fit” to the observed frequencies. • The expected frequencies depend on what the null hypothesis specifies about the population… • For example, the null hypothesis might state that all levels of the variable under investigation are equally likely in the population. • Example: Let’s consider the factors that go into choosing a course for next semester…
Chi-Square: Goodness-of-Fit • Perhaps we’ve identified 4 factors affecting course selection: Time, Instructor, Interest, Ease. • The null hypothesis might indicate that, in the population of Denison students, these four factors are equally likely to affect course selection. • If we sample 80 Denison students, the expected value for each category would be 20 (80 students / 4 categories). • Questions so far?
Chi-Square: Goodness-of-Fit • Let’s further assume that, after asking students to decide which of the 4 factors most affects their course selection, we obtain the following observed frequencies. • Time = 30; Instructor=10; Interest=22; Ease=18. • We now have the observed and expected values for all levels being examined…
Chi-Square: Goodness-of-Fit The chi-square computation is simpler than before, since we only have one variable (i.e., only one row).
Chi-Square: Goodness-of-Fit Calculating the degrees of freedom is also simpler than before: df = C - 1 (C = # of columns, one for each level of the variable).
Chi-Square: Goodness-of-Fit • Let’s now evaluate the null hypothesis using the chi-square test for goodness-of-fit. Again the observed frequencies are… • Time = 30; Instructor=10; Interest=22; Ease=18. • The expected frequencies are 20 for each category (because the null hypothesis specifies that the four factors are equally likely to affect course selection in the population).
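The goodness-of-fit computation for these frequencies can be sketched as follows. The observed counts and the equal expected frequencies come from the example above; the variable names are illustrative only.

```python
# Observed frequencies from the course-selection example.
observed = {"Time": 30, "Instructor": 10, "Interest": 22, "Ease": 18}

n = sum(observed.values())        # 80 students in total
expected = n / len(observed)      # equal likelihood under the null: 20 per category

# One (O - E)^2 / E term per category, then sum them.
terms = {name: (o - expected) ** 2 / expected for name, o in observed.items()}
chi_square = sum(terms.values())

# Degrees of freedom for goodness-of-fit: number of categories minus 1.
df = len(observed) - 1

print(terms)
print(round(chi_square, 2), df)
```

With df = 3 and an alpha level of 0.05, the tabled critical value is 7.815, so the obtained value can be compared against that number in the usual way.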
Chi-Square: Goodness-of-Fit • Note: There are two assumptions that underlie both chi-square tests. • First, each participant can contribute ONLY ONE response to the observed frequencies. • Second, each expected frequency must be at least 10 in the 2x2 case or in the single-variable case, and at least 5 for designs that are 3x2 or larger.
Chi-Square: Goodness-of-Fit • Lastly, there are standards by which the chi-square statistics are to be reported, formally. • Statistics like this are to be reported in the Results section of an APA style report. • APA = American Psychological Association p = 0.033
Chi-Square: Goodness-of-Fit Memorize This! All APA-style manuscripts consist of the following sections, in this order: • Abstract • Introduction • Method (singular, not Methods) • Results • Discussion • References
Part 5 Chi-Square Test for Goodness-of-Fit In Excel
Goodness of Fit in Excel • We’ve already seen how we can use a Chi-square table to find the critical value (“the number to beat”). • We can also use the following Excel command: =CHIINV(probability, degrees of freedom), where probability = the criterion alpha level (i.e., 0.05 in most cases). • The output of “=CHIINV()” is the critical value (“the number to beat”).
Goodness of Fit in Excel • We can also use Excel to find the observed alpha level, given an observed χ² value. • Here’s the Excel command: =CHIDIST(χ², degrees of freedom) • The output of “=CHIDIST()” is the observed alpha level (“sig value in SPSS”). • The observed alpha level must be less than 0.05 (or the criterion alpha level) to reject the null hypothesis.
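For readers without Excel at hand, the two commands can be approximated in plain Python. This is a minimal sketch, not Excel’s actual implementation: the functions `chidist` and `chiinv` are hypothetical stand-ins named after the Excel commands, they assume a whole-number df, and `chiinv` simply searches for the value whose upper-tail probability equals the requested alpha.

```python
import math

def chidist(x, df):
    """Upper-tail chi-square probability, analogous to =CHIDIST(x, df)."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum of (x/2)^i / i! for i = 0 .. df/2 - 1
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) plus a finite correction series
    term, total = 1.0, 0.0
    for j in range(1, (df - 1) // 2 + 1):
        if j > 1:
            term *= x / (2 * j - 1)
        total += term
    correction = math.sqrt(2.0 * x / math.pi) * math.exp(-half) * total
    return math.erfc(math.sqrt(half)) + correction

def chiinv(probability, df):
    """Critical value ("the number to beat"), analogous to =CHIINV(p, df)."""
    lo, hi = 0.0, 1000.0
    for _ in range(200):  # bisection on the upper-tail probability
        mid = (lo + hi) / 2.0
        if chidist(mid, df) > probability:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

critical = chiinv(0.05, 3)       # criterion alpha = 0.05, df = 3
observed_alpha = chidist(10.4, 3)  # chi-square of 10.4 with df = 3

print(round(critical, 3))
print(observed_alpha < 0.05)
```

The example call uses a chi-square of 10.4 with df = 3, which is what the course-selection frequencies (30, 10, 22, 18 against an expected 20 each) work out to.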