
Chi Square & Correlation



Presentation Transcript


  1. Chi Square & Correlation

  2. Nonparametric Test of χ²
     • Used when too many assumptions of t-tests are violated:
       • The sample size is too small to reflect the population
       • The data are not continuous and thus not appropriate for parametric tests based on normal distributions
     • χ² is another way of showing that some pattern in the data was not created randomly by chance.
     • χ² can be one- or two-dimensional.
     • χ² deals with the question of whether what we observed is different from what is expected.

  3. Calculating χ²
     What would a contingency table look like if no relationship exists between gender and voting for Bush (i.e. statistical independence)?

                         Male   Female   Total
     Voted for Bush                        50
     Voted for Kerry                       50
     Total                50      50      100

     NOTE: INDEPENDENT VARIABLES ON COLUMNS AND DEPENDENT ON ROWS

  4. Calculating χ²
     What would a contingency table look like if a perfect relationship exists between gender and voting for Bush?

                         Male   Female
     Voted for Bush
     Voted for Kerry

  5. Calculating the expected value
     The expected frequency of the cell in the ith row and jth column is $E_{ij} = \frac{F_i F_j}{N}$, where:
     • $F_i$ = the total in the ith row marginal
     • $F_j$ = the total in the jth column marginal
     • $N$ = the grand total, or sample size for the entire table
     Expected Voted for Bush = 50 × 50 / 100 = 25
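
The expected-value rule on this slide can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `expected_frequencies` is hypothetical, and the marginals are the 50/50 row and column totals from the Bush/Kerry example.

```python
def expected_frequencies(row_totals, col_totals):
    """Return the table of expected counts E_ij = F_i * F_j / N."""
    n = sum(row_totals)
    if n != sum(col_totals):
        raise ValueError("row and column marginals must share the same grand total")
    return [[r * c / n for c in col_totals] for r in row_totals]


# Marginals from the slides: 50 Bush / 50 Kerry voters, 50 men / 50 women.
expected = expected_frequencies([50, 50], [50, 50])
print(expected)  # [[25.0, 25.0], [25.0, 25.0]]
```

Every cell comes out at 25, matching the hand calculation 50 × 50 / 100.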

  6. Nonparametric Test of χ²
     • Again, the basic question is whether what you are observing in some given data was created by chance or through some systematic process.
     $\chi^2 = \sum \frac{(O - E)^2}{E}$, where O = observed frequency and E = expected frequency.

  7. Nonparametric Test of χ²
     • The null hypothesis we are testing here is that the proportions of occurrences in each category are equal to each other (H0: B = K). Our research hypothesis is that they are not equal (Ha: B ≠ K).
     • Given the sample size, how many cases could we expect in each category (n / number of categories)?
     • Comparing the obtained value with the critical value gives a test statistic and a probability (p-value) of seeing a difference at least this large if the null hypothesis were true.
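
For the one-dimensional case described here, a minimal sketch might look like the following. The vote counts (60 for Bush, 40 for Kerry) are hypothetical and are not taken from the presentation, and the function name is an assumption made for illustration.

```python
def chi_square_equal_categories(observed):
    """One-dimensional chi-square against equal expected counts (n / #categories)."""
    n = sum(observed)
    expected = n / len(observed)  # same expected count in every category
    return sum((o - expected) ** 2 / expected for o in observed)


# Hypothetical sample of 100 voters: 60 for Bush, 40 for Kerry.
hypothetical_votes = [60, 40]
chi2_1d = chi_square_equal_categories(hypothetical_votes)
print(chi2_1d)  # 4.0
```

With 100 cases and two categories, 50 cases are expected in each, so each category contributes (10)² / 50 = 2.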

  8. Let's do a χ²
     Table: Male and Female (columns) by Voted for Bush and Voted for Kerry (rows)
     • (50 - 25)² / 25 = 25
     • (0 - 25)² / 25 = 25
     • (0 - 25)² / 25 = 25
     • (50 - 25)² / 25 = 25
     χ² = 100
     What would χ² be when there is statistical independence?
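
The cell-by-cell arithmetic above can be reproduced with a short sketch. The function name `chi_square` is hypothetical; the perfect-relationship table uses the observed counts 50, 0, 0, 50 from this slide, and the independence table assumes 25 cases in every cell, as computed on the expected-value slide. The sketch also assumes no row or column total is zero.

```python
def chi_square(observed):
    """Sum (O - E)^2 / E over all cells, with E taken from the table's marginals."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    total = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n  # expected count for cell (i, j)
            total += (o - e) ** 2 / e
    return total


perfect = [[50, 0], [0, 50]]        # perfect relationship
independent = [[25, 25], [25, 25]]  # statistical independence
print(chi_square(perfect))      # 100.0
print(chi_square(independent))  # 0.0
```

Under independence every observed count equals its expected count, so each cell contributes zero.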

  9. Let’s corroborate with SPSS

  10. Testing for significance
      How do we know if the relationship is statistically significant?
      • We need to know the df: df = (R - 1)(C - 1) = (2 - 1)(2 - 1) = 1
      • We go to the χ² distribution to look for the critical value (CV = 3.84 at the .05 level)
      • Table: Male and Female (columns) by Voted for Bush and Voted for Kerry (rows), χ² = 4
      • Because 4 > 3.84, we conclude that the relationship between gender and voting is statistically significant.
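
The decision on this slide can be checked with a small sketch that uses only the standard library. It relies on the fact that, for one degree of freedom, the upper-tail probability of χ² equals `erfc(sqrt(x / 2))`; the helper names are hypothetical, and the .05 significance level is an assumption implied by the critical value 3.84.

```python
import math


def degrees_of_freedom(n_rows, n_cols):
    """df = (R - 1)(C - 1) for an R x C contingency table."""
    return (n_rows - 1) * (n_cols - 1)


def p_value_df1(chi2):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))


CRITICAL_VALUE_DF1 = 3.84  # critical value for df = 1 at the .05 level

df = degrees_of_freedom(2, 2)
chi2 = 4.0  # value reported on the slide
significant = chi2 > CRITICAL_VALUE_DF1
p = p_value_df1(chi2)
print(df, significant, round(p, 4))  # 1 True 0.0455
```

The formula in `p_value_df1` holds only for df = 1; larger tables need a different tail calculation or a lookup table.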

  11. When is χ² appropriate to use?
      • χ² is perhaps the most widely used statistical technique to analyze nominal and ordinal data
      • Nominal × nominal (gender and voting preferences)
      • Nominal × ordinal (gender and opinion for W)

  12. χ² can also be used with larger tables
      • Row totals: 45, 30, 70
      • Column totals: 65, 80
      • Grand total: 145
      • Values in parentheses for the six cells: (19.4), (15.8); (.88), (.72); (8.6), (6.9)
      χ² = 52.3
      Do we reject the null hypothesis?
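
One way to work through the question on this slide is sketched below. It assumes the six parenthesized values are the per-cell contributions to χ² (they do sum to 52.3) and that the table has 3 rows and 2 columns, as the three row totals and two column totals suggest. For two degrees of freedom the upper-tail probability is exactly `exp(-x / 2)`, and 5.99 is the critical value at the .05 level.

```python
import math

# Parenthesized values from the slide, read as per-cell contributions (assumption).
contributions = [19.4, 15.8, 0.88, 0.72, 8.6, 6.9]
chi2_large = sum(contributions)

df_large = (3 - 1) * (2 - 1)  # 3 rows x 2 columns
CRITICAL_VALUE_DF2 = 5.99     # critical value for df = 2 at the .05 level

p_large = math.exp(-chi2_large / 2)  # exact upper-tail probability for df = 2
reject_null = chi2_large > CRITICAL_VALUE_DF2
print(round(chi2_large, 1), df_large, reject_null)  # 52.3 2 True
```

Under these assumptions χ² far exceeds the critical value, so the null hypothesis of independence would be rejected.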

  13. Correlation (does not mean causation)
      • We want to know how two variables are related to each other
        • Does eating doughnuts affect weight?
        • Does spending more hours studying increase test scores?
      • Correlation means how much two variables overlap with each other

  14. Types of Correlations

  15. Conceptualizing Correlation: Measuring Development
      [Diagram: strong vs. weak correlation, with the labels GDP, POP, WEIGHT, GDP, EDUCATION]
      Correlation will be associated with what type of validity?

  16. Correlation Coefficient

  17. Home Value & Square footage

  18. Correlation Coefficient
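
The transcript does not preserve the formula from this slide. Assuming the coefficient meant is Pearson's r, a minimal sketch is shown below. The square-footage and home-value figures are hypothetical and are not the data behind the presentation's chart.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation: covariance scaled by the two standard deviations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical homes: size in square feet, value in thousands of dollars.
square_feet = [1000, 1500, 2000, 2500, 3000]
home_value = [150, 190, 260, 300, 340]
r = pearson_r(square_feet, home_value)
print(round(r, 3))
```

The result always lies between -1 and 1; values near 1 indicate that larger homes tend to have higher values in this made-up sample.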

  19. Rules of Thumb

  20. Multiple Correlation Coefficients

  21. Limitations of correlation coefficients
      • They tell us how strongly two variables are related
      • However, r coefficients are limited because they cannot tell us anything about:
        • Causation between X and Y
        • Marginal impact of X on Y
        • What percentage of the variation of Y is explained by X
        • Forecasting
      Because of the above, Ordinary Least Squares (OLS) is most useful
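
To illustrate what OLS adds beyond r, here is a rough sketch of a one-predictor fit. The function name `ols_fit` and the home data are hypothetical. The slope stands for the marginal impact of X on Y, R² for the share of variation in Y explained by X, and the fitted line can be used for forecasting. Like r, it says nothing about causation.

```python
def ols_fit(xs, ys):
    """Fit y = intercept + slope * x by least squares; also return R-squared."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                      # marginal impact of x on y
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)     # share of variation in y explained by x
    return slope, intercept, r_squared


# Hypothetical homes: size in square feet, value in thousands of dollars.
square_feet = [1000, 1500, 2000, 2500, 3000]
home_value = [150, 190, 260, 300, 340]
slope, intercept, r_squared = ols_fit(square_feet, home_value)

# Forecast the value of a hypothetical 2,200 sq. ft. home.
forecast = intercept + slope * 2200
print(slope, intercept, r_squared, forecast)
```

In this made-up sample each additional square foot is associated with about 0.098 thousand dollars of value.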

  22. Do you have the BLUES?
      • B for Best (minimum error)
      • L for Linear (the form of the relationship)
      • U for Unbiased (does the parameter truly reflect the effect?)
      • E for Estimator

  23. Home value and sq. feet
      Does the above line meet the BLUE criteria?
