Applied Statistics Using SAS and SPSS Topic: Chi-square tests By Prof Kelly Fan, Cal. State Univ., East Bay
Outline • ALL variables must be categorical • Goal one: verify a distribution of Y • One-sample Chi-square test (SPSS lesson 40; SAS handout) • Goal two: test the independence between two categorical variables • Chi-square test for two-way contingency table (SPSS lesson 41; SAS section 3.G) • McNemar’s test for paired data (SPSS lesson 44; SAS section 3.L) • Measure the dependence (Phi and Kappa coefficients) (SPSS lesson 41, 44; SAS section 3.G, 3.M)
Example: Postpartum Depression Study • Are women equally likely to show an increase, no change, or a decrease in depression as a function of childbirth? • Are the proportions associated with a decrease, no change, and an increase in depression from before to after childbirth the same?
Example: Postpartum Depression Study From a random sample of 60 women
One-sample Chi-Square Test • Must be a random sample • The sample size must be large enough so that expected frequencies are greater than or equal to 5 for 80% or more of the categories
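One-sample Chi-Square Test • Test statistic: $\chi^2 = \sum_{i=1}^{k} \frac{(O_i - e_i)^2}{e_i}$, with $k - 1$ degrees of freedom for $k$ categories • $O_i$ = the observed frequency of the i-th category • $e_i$ = the expected frequency of the i-th category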
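A minimal Python sketch of the one-sample statistic, the sum over categories of (observed - expected)^2 / expected, may help make the calculation concrete. The counts below are hypothetical (the table for the 60 women is not reproduced in this text), the function name is illustrative, and the p-value shortcut exp(-x/2) holds only for 2 degrees of freedom.

```python
import math

def one_sample_chi_square(observed, proportions):
    """Return (statistic, df, expected) for a one-sample chi-square test."""
    n = sum(observed)
    # Expected frequency of each category under the hypothesized proportions.
    expected = [n * p for p in proportions]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return statistic, len(observed) - 1, expected

# Hypothetical counts for decrease / no change / increase (n = 60).
observed = [14, 33, 13]
statistic, df, expected = one_sample_chi_square(observed, [1 / 3, 1 / 3, 1 / 3])

# Large-sample rule: expected counts >= 5 in at least 80% of the categories.
share_ok = sum(e >= 5 for e in expected) / len(expected)

# For df = 2 only, the chi-square upper-tail probability is exp(-x / 2).
p_value = math.exp(-statistic / 2)
print(statistic, df, share_ok, p_value)
```

With these made-up counts every expected frequency is 20, so the large-sample condition is met and the small p-value would lead to rejecting equal proportions.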
SPSS Output • Weight your data by count first • Analyze >> Nonparametric Tests >> Legacy Dialogs >> Chi Square, count as test variable
Conclusion • Reject Ho • The proportions associated with a decrease, no change, and an increase in depression from before to after childbirth are significantly different from 1/3, 1/3, 1/3.
Example: Postpartum Depression Study • Are the proportions associated with a change and no change from before to after childbirth the same?
Example: Postpartum Depression Study From a random sample of 60 women
Two-way Contingency Tables • Report frequencies on two variables • Such tables are also called crosstabs.
Contingency Tables (Crosstabs) 1991 General Social Survey
Crosstabs Analysis (Two-way Chi-square test) • Chi-square test of independence between two variables. Under independence: • The distribution of frequencies over the rows is the same in every column • The distribution of frequencies over the columns is the same in every row
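The independence idea can be sketched in Python: under independence each expected cell count is (row total x column total) / n, and the statistic sums (observed - expected)^2 / expected over all cells. The 2x2 table below is hypothetical and the function name is illustrative; the p-value is left to SAS or SPSS.

```python
def chi_square_independence(table):
    """Return (statistic, df, expected) for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Under independence, expected count = row total * column total / n.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df, expected

# Hypothetical 2 x 2 table of counts (rows and columns are two categorical variables).
table = [[30, 10], [20, 40]]
statistic, df, expected = chi_square_independence(table)
print(statistic, df, expected)
```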
Measure of dependence for 2x2 tables • The phi coefficient measures the association between two categorical variables • -1 ≤ phi ≤ 1 • | phi | indicates the strength of the association • If the two variables are both ordinal, then the sign of phi indicates the direction of the association
SPSS Output • P. 332 – 333
SAS Output

Statistic                      DF      Value     Prob
Chi-Square                      2    79.4310   <.0001
Likelihood Ratio Chi-Square     2    90.3311   <.0001
Mantel-Haenszel Chi-Square      1    79.3336   <.0001
Phi Coefficient                       0.2847
Contingency Coefficient               0.2738
Cramer's V                            0.2847

Sample Size = 980
Measure of dependence for non-2x2 tables • Cramer's V • Ranges from 0 to 1 • V may be viewed as the association between two variables as a percentage of their maximum possible variation. • V = |phi| for 2x2, 2x3 and 3x2 tables
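As a check on the SAS output shown earlier, the association measures can be recomputed from the chi-square value and the sample size. This is a sketch using the standard formulas phi = sqrt(chi-square / n), contingency coefficient = sqrt(chi-square / (chi-square + n)) and V = sqrt(chi-square / (n x min(r - 1, c - 1))). The function name is illustrative, and the 2x3 shape is inferred from DF = 2 (a 3x2 table gives the same result). Computed this way phi is a magnitude only; the sign for a 2x2 table needs the cell counts.

```python
import math

def association_measures(chi_square, n, n_rows, n_cols):
    """Return (phi, contingency coefficient, Cramer's V) from a chi-square value."""
    phi = math.sqrt(chi_square / n)
    contingency = math.sqrt(chi_square / (chi_square + n))
    cramers_v = math.sqrt(chi_square / (n * min(n_rows - 1, n_cols - 1)))
    return phi, contingency, cramers_v

# Chi-Square = 79.4310 and Sample Size = 980 are taken from the SAS output.
# DF = 2 implies a 2x3 or 3x2 table, so min(r - 1, c - 1) = 1 and V equals phi.
phi, contingency, cramers_v = association_measures(79.4310, 980, 2, 3)
print(round(phi, 4), round(contingency, 4), round(cramers_v, 4))
# prints: 0.2847 0.2738 0.2847
```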
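Fisher’s Exact Test for Independence • The Chi-squared tests are ONLY for large samples: the sample size must be large enough so that expected frequencies are greater than or equal to 5 for 80% or more of the categories (cells) • When this condition fails, use Fisher’s exact test, which does not rely on a large-sample approximation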
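For intuition, Fisher's exact test on a 2x2 table can be sketched in a few lines of Python: with the margins fixed, the top-left cell follows a hypergeometric distribution, and the two-sided p-value adds up the probabilities of all tables that are no more likely than the observed one (the "Pr <= P" convention). The function name and the small table are hypothetical, and this sketch covers 2x2 tables only, whereas SAS and SPSS also handle larger tables.

```python
from math import comb

def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def table_prob(k):
        # Hypergeometric probability that the top-left cell equals k.
        return comb(row1, k) * comb(row2, col1 - k) / total

    p_observed = table_prob(a)
    low, high = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one
    # (a tiny relative tolerance guards against floating-point ties).
    return sum(
        p
        for p in (table_prob(k) for k in range(low, high + 1))
        if p <= p_observed * (1 + 1e-9)
    )

# Hypothetical small-sample table with all margins equal to 4.
p_value = fisher_exact_2x2(3, 1, 1, 3)
print(p_value)
```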
SAS/SPSS Output • SAS output: Fisher's Exact Test • Table Probability (P) 3.823E-22 • Pr <= P 2.787E-20 • SPSS output: in “crosstabs” window, click “exact”, then tick “exact”:
Matched-pair Data • Comparing categorical responses for two “paired” samples, when either • Each sample has the same subjects (i.e., subjects are measured twice) or • A natural pairing exists between each subject in one sample and a subject from the other sample (e.g., twins)
Marginal Homogeneity • The probabilities of “success” for both samples are identical • E.g., the probability of approval is the same at the first and second surveys
McNemar Test (for 2x2 Tables only) • SAS: Section 3.L; SPSS: Lesson 44 • Ho: marginal homogeneity • Ha: no marginal homogeneity • Exact p-value • Approximate (chi-square) p-value when n12 + n21 > 10, where n12 and n21 are the two discordant (off-diagonal) counts
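A hedged Python sketch of both p-values: McNemar's statistic uses only the discordant counts, S = (n12 - n21)^2 / (n12 + n21), referred to a chi-square distribution with 1 degree of freedom, while the exact version treats n12 as Binomial(n12 + n21, 1/2) under Ho. The counts 150 and 86 are an assumption, not taken from the slides; they were chosen only because they give S = 17.3559. The function name is illustrative.

```python
from math import comb, erfc, sqrt

def mcnemar(n12, n21):
    """Return (statistic, asymptotic p, exact p) from the discordant counts."""
    m = n12 + n21
    statistic = (n12 - n21) ** 2 / m
    # Chi-square upper tail with 1 df: P(X > s) = erfc(sqrt(s / 2)).
    p_asymptotic = erfc(sqrt(statistic / 2))
    # Exact test: under Ho, n12 ~ Binomial(m, 1/2); two-sided p-value.
    tail = sum(comb(m, k) for k in range(max(n12, n21), m + 1)) / 2 ** m
    p_exact = min(1.0, 2 * tail)
    return statistic, p_asymptotic, p_exact

# Assumed discordant counts (not from the slides), chosen to give S = 17.3559.
statistic, p_asymptotic, p_exact = mcnemar(150, 86)
print(round(statistic, 4), p_asymptotic, p_exact)
```

The concordant (diagonal) counts do not enter the test at all, which is why only n12 and n21 are passed in.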
SAS Output

McNemar's Test
Statistic (S)          17.3559
DF                           1
Asymptotic Pr > S       <.0001
Exact Pr >= S        3.716E-05

Simple Kappa Coefficient (level of agreement)
Kappa                     0.6996
ASE                       0.0180
95% Lower Conf Limit      0.6644
95% Upper Conf Limit      0.7348

Sample Size = 1600
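The kappa coefficient in the output measures agreement between the two paired responses beyond what chance alone would give: kappa = (po - pe) / (1 - pe), where po is the observed proportion on the diagonal and pe is the proportion expected from the margins. The sketch below uses an assumed 2x2 table, not one taken from the slides; the counts were chosen only because they reproduce Sample Size = 1600 and Kappa = 0.6996. The function name is illustrative, and the standard error (ASE) is not computed here.

```python
def cohen_kappa(table):
    """Simple kappa coefficient for a square table of paired classifications."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    # Observed agreement: share of pairs on the main diagonal.
    p_observed = sum(table[i][i] for i in range(len(table))) / n
    # Chance agreement: what the margins alone would produce.
    p_expected = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2
    return (p_observed - p_expected) / (1 - p_expected)

# Assumed counts (rows: first response, columns: second response), n = 1600.
table = [[794, 150], [86, 570]]
kappa = cohen_kappa(table)
print(round(kappa, 4))
```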
SPSS Output • SPSS: p. 361 and in “two-samples tests” window tick McNemar and click “exact”, then tick “exact”: