Chi-square Basics

eoscar

Presentation Transcript


  1. Chi-square Basics

  2. The Chi-square distribution • Positively skewed but becomes more symmetrical with increasing degrees of freedom • Mean = k, where k = degrees of freedom • Variance = 2k • Assuming a normally distributed dataset and sampling a single $z^2$ value at a time: $\chi^2(1) = z^2$ • If more than one: $\chi^2(N) = \sum_{i=1}^{N} z_i^2$
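
The mean and variance claims on this slide can be checked by simulation. The following is a minimal sketch, not part of the original slides: it sums k squared standard-normal draws many times and compares the sample mean and variance with k and 2k. The function names, the number of draws, and the seed are all illustrative assumptions.

```python
import random


def simulate_chi_square(k, n_draws=20000, seed=0):
    """Return n_draws values, each the sum of k squared standard-normal z scores."""
    rng = random.Random(seed)  # fixed seed (an assumption) so runs are repeatable
    return [sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k)) for _ in range(n_draws)]


def mean_and_variance(values):
    """Sample mean and unbiased sample variance."""
    m = sum(values) / len(values)
    v = sum((x - m) ** 2 for x in values) / (len(values) - 1)
    return m, v


for k in (1, 4, 10):
    m, v = mean_and_variance(simulate_chi_square(k))
    print(f"k={k}: mean ~ {m:.2f} (theory {k}), variance ~ {v:.2f} (theory {2 * k})")
```

The simulated values should land close to the theoretical ones, with the gap shrinking as the number of draws grows.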

  3. Why used? • Chi-square analysis is primarily used to deal with categorical (frequency) data • We measure the “goodness of fit” between our observed outcome and the expected outcome for some variable • With two variables, we test in particular whether they are independent of one another using the same basic approach.

  4. One-dimensional • Suppose we want to know how people in a particular area will vote in general and go around asking them. • How will we go about seeing what’s really going on?

  5. Hypothesis: Dems should win district • Solution: chi-square analysis to determine if our outcome is different from what would be expected if there were no preference

  6. Plug in to formula: $\chi^2 = \sum (O - E)^2 / E$, where O = observed frequency and E = expected frequency
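
The slide's table did not survive in the transcript, so the sketch below uses hypothetical counts (60 Democratic, 40 Republican out of 100 respondents) purely to illustrate the arithmetic. It computes the usual goodness-of-fit statistic, the sum of (observed - expected)^2 / expected, against a no-preference expectation and compares it with the standard 0.05 critical value for 1 degree of freedom.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts, NOT the data from the original slide.
observed = [60, 40]                                # [Democratic, Republican]
n = sum(observed)
expected = [n / len(observed)] * len(observed)     # no preference: equal split

chi2 = chi_square_statistic(observed, expected)
df = len(observed) - 1                             # categories minus one
CRITICAL_05_DF1 = 3.841                            # chi-square critical value, alpha = .05, df = 1

print(f"chi-square = {chi2:.2f}, df = {df}, reject H0: {chi2 > CRITICAL_05_DF1}")
```

With these made-up counts each category contributes 100 / 50 = 2, so the statistic is 4.0, which exceeds 3.841.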

  7. Reject H0 • The district will probably vote Democratic • However…

  8. Conclusion • Note that all we really can conclude is that our data differ from the outcome expected under the null hypothesis • Although it would appear that the district will vote Democratic, really we can only conclude they were not responding by chance • Had the observed frequencies been reversed, we'd have come up with the same result • In other words, it is a non-directional test regardless of the prediction
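
The non-directional point can be seen directly in the arithmetic. In this small sketch (hypothetical counts again, not the slide's data), swapping the two observed frequencies leaves the statistic unchanged because each deviation is squared.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


expected = [50.0, 50.0]                 # no-preference expectation for 100 respondents
dem_ahead = chi_square_statistic([60, 40], expected)   # hypothetical: Democrats ahead
rep_ahead = chi_square_statistic([40, 60], expected)   # same counts, reversed

# Squaring the deviations discards their sign, so both directions match.
print(dem_ahead, rep_ahead, dem_ahead == rep_ahead)
```

Any directional claim therefore has to come from inspecting the observed frequencies, not from the test statistic itself.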

  9. More complex • What do stats kids do with their free time?

  10. Is there a relationship between gender and what the stats kids do with their free time? • Expected frequency for a cell = (Ri*Cj)/N, where Ri = total for row i, Cj = total for column j, and N = grand total • Example for males, TV: (100*50)/200 = 25
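
As a sketch of the expected-frequency rule, the code below builds every expected cell from the row totals, column totals, and grand total. The table itself is hypothetical: the cell counts and the activity labels other than TV are invented, chosen only so that the male row total is 100, the TV column total is 50, and N is 200, matching the slide's example.

```python
def expected_frequencies(table):
    """Expected count for each cell: (row total * column total) / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


# Hypothetical observed counts; columns are TV plus two made-up activities.
observed = [
    [30, 40, 30],   # males   (row total 100)
    [20, 30, 50],   # females (row total 100)
]

expected = expected_frequencies(observed)
print(expected[0][0])   # males, TV: (100 * 50) / 200
```

The males/TV cell reproduces the slide's value of 25.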

  11. df = (R-1)(C-1) • R = number of rows • C = number of columns

  12. Interpretation • Reject H0: there is some relationship between gender and how stats students spend their free time
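
Putting the pieces together, a test of independence can be sketched as follows. The observed table is hypothetical (the slide's actual counts are not in the transcript), so the resulting statistic illustrates the procedure rather than reproducing the original result. The critical value 5.991 is the standard 0.05 cutoff for 2 degrees of freedom.

```python
def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom) for an R x C table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)   # (R - 1)(C - 1)
    return chi2, df


# Hypothetical counts: rows are gender, columns are three free-time activities.
observed = [
    [30, 40, 30],
    [20, 30, 50],
]

chi2, df = chi_square_independence(observed)
CRITICAL_05_DF2 = 5.991   # chi-square critical value, alpha = .05, df = 2
print(f"chi-square = {chi2:.3f}, df = {df}, reject H0: {chi2 > CRITICAL_05_DF2}")
```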

  13. Other • Important point about the non-directional nature of the test: the chi-square test by itself cannot speak to specific hypotheses about the way the results would come out • Not well suited to ordinal data for this reason, since it ignores the ordering of the categories

  14. Assumptions • Normality: the rule of thumb is that every expected frequency should be at least 5 • Inclusion of non-occurrences: must include all responses, not just the positive ones • Independence: not that the variables are independent or related (that’s what the test can be used for), but rather, as with our t-tests, that the observations (data points) don’t have any bearing on one another • To help with the last two, make sure that your N equals the total number of people who responded
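
Two of these checks are easy to automate. The sketch below uses hypothetical helper names and made-up numbers: it flags expected frequencies under 5 and confirms that the table's N matches the number of respondents, which catches both omitted non-occurrences and people counted more than once.

```python
def cells_below_minimum(expected, minimum=5):
    """Return (row, column, value) for every expected frequency under the minimum."""
    return [
        (i, j, value)
        for i, row in enumerate(expected)
        for j, value in enumerate(row)
        if value < minimum
    ]


def n_matches_respondents(table, n_respondents):
    """True when the cell counts add up to the number of people who responded."""
    return sum(sum(row) for row in table) == n_respondents


# Hypothetical example: 50 respondents, 20 of whom answered "yes".
only_occurrences = [[20]]          # wrong: the 30 "no" answers are left out
with_non_occurrences = [[20, 30]]  # right: every respondent appears exactly once

print(n_matches_respondents(only_occurrences, 50))       # False
print(n_matches_respondents(with_non_occurrences, 50))   # True
print(cells_below_minimum([[2.5, 7.5], [2.5, 7.5]]))     # two cells under 5
```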

  15. Measures of Association • Contingency coefficient • Phi • Cramer’s Phi • Odds Ratios • Kappa • These were discussed in 5700
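
The slide only names these measures. As a rough sketch using their standard textbook definitions (which may differ in notation from the course materials), several can be computed directly from the chi-square statistic and the sample size. The function names and the input values are illustrative assumptions; kappa is omitted because it requires an agreement table rather than a chi-square value.

```python
import math


def phi_coefficient(chi2, n):
    """Phi = sqrt(chi-square / N); normally reported for 2 x 2 tables."""
    return math.sqrt(chi2 / n)


def contingency_coefficient(chi2, n):
    """C = sqrt(chi-square / (chi-square + N))."""
    return math.sqrt(chi2 / (chi2 + n))


def cramers_v(chi2, n, n_rows, n_cols):
    """Cramer's phi (V) = sqrt(chi-square / (N * min(R - 1, C - 1)))."""
    return math.sqrt(chi2 / (n * min(n_rows - 1, n_cols - 1)))


def odds_ratio(a, b, c, d):
    """Odds ratio for a 2 x 2 table laid out as [[a, b], [c, d]]."""
    return (a * d) / (b * c)


# Hypothetical inputs chosen for round numbers.
print(round(phi_coefficient(8.0, 200), 3))
print(round(contingency_coefficient(8.0, 200), 3))
print(round(cramers_v(8.0, 200, 3, 3), 3))
print(odds_ratio(30, 20, 10, 40))
```

Unlike the chi-square statistic, these values do not grow simply because N is larger, which is why they are used to describe the strength of a relationship.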
