
Statistical Analysis

Statistical Analysis. Professor Lynne Stokes Department of Statistical Science Lecture #1 Chi-square Contingency Table Test. Independence. Employment Status is independent of Age. Note: One population, responses formed by two categorizations. Homogeneity.

Presentation Transcript


  1. Statistical Analysis. Professor Lynne Stokes, Department of Statistical Science. Lecture #1: Chi-square Contingency Table Test

  2. Independence: Employment Status is independent of Age. Note: One population, responses formed by two categorizations

  3. Homogeneity: If nondiscriminatory, promotions are binomially distributed with a common p for both gender categories. Note: Two populations, common distribution of responses

  4. Cognitive Learning in Rats -- Tolman, Ritchie, Kalish (1946). Prior Theory: Discrete Learning Steps -- Hull. Candidate Theory: Cognitive Learning -- Tolman. [Maze diagram: barrier, goal, and paths A, B, C, D]

  5. Goodness of Fit. Number of rats choosing each path: A = 4, B = 5, C = 8, D = 15 (Total = 32). Evidence of cognitive learning? If random selection, Multinomial with pj = 1/4
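The goodness-of-fit calculation for the rat counts on slide 5 can be sketched as follows. This is a minimal illustration, not the lecturer's own worked solution: the 0.05 significance level and the tabled critical value 7.815 (chi-square, df = 3) are assumptions added here, since the slide states neither.

```python
# Goodness-of-fit sketch for the path counts on slide 5.
# Null hypothesis: each of the 4 paths is chosen with probability 1/4.
observed = {"A": 4, "B": 5, "C": 8, "D": 15}

n = sum(observed.values())          # 32 rats in total
p_null = 1 / 4                      # multinomial probability under random selection
expected = n * p_null               # 8 rats expected on each path

# Pearson statistic: sum of (O - E)^2 / E over the categories
x2 = sum((o - expected) ** 2 / expected for o in observed.values())

df = len(observed) - 1              # no parameters estimated, so df = 4 - 1

# Assumption: test at the 0.05 level; 7.815 is the tabled
# chi-square critical value for df = 3.
CRITICAL_05_DF3 = 7.815
reject = x2 > CRITICAL_05_DF3

print(f"X2 = {x2:.2f}, df = {df}, reject Ho at 0.05: {reject}")
```

Under these assumptions the statistic is 9.25, which exceeds 7.815, so the counts are not consistent with random path selection.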

  6. Compare Incidence of Death Penalty. Are victim’s race and sentence independent? Is aggravation level an explanatory factor? Aggravation levels range from less serious (Drunk, Lover’s Quarrel, Argument, etc.) to more serious (Vicious, Cold-blooded, Unprovoked, Murder, etc.)

  7. Chi-Square Tests for Count Data • Independence: Distribution of responses across one categorization is identical for each category of a second categorization • Homogeneity: Distribution of responses is identical across several categories of one categorical variable or across several independent samples • Goodness of Fit: Responses are consistent with a stated probability distribution (parameters specified, or unknown parameter values)

  8. Sampling Schemes

  9. Chi-square Tests • Tests for independence in contingency tables

  10. Contingency Tables (Crosstabs) • Two categorizations (rows and columns) • Each with mutually exclusive categories • Sample of n independent observations. Are the two categorizations statistically independent? e.g., Is employment status statistically independent of age? Note: Equivalent to Homogeneity Test, Unspecified p, When Only 2 Rows

  11. Notation for Observed Frequencies: Oij is the observed frequency in row category i (i = 1, ..., r) and column category j (j = 1, ..., c); Ri is the total for row i, Cj is the total for column j, and n is the overall total

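  12. Chi-square Test for Independence. Ho: Row and column categories are independent. Ha: Row and column categories are not independent. If row and column categories are independent, the expected frequencies are estimated by Eij = Ri Cj / n, and X2 is the sum over all cells of (Oij - Eij)^2 / Eij. Reject Ho if X2 > Xa2, where Xa2 is the chi-square critical value with df = (r - 1)(c - 1)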

  13. Degrees of Freedom for Contingency Tables. Given row and column totals: Row 1: df = c - 1; Row 2: df = c - 1; ...; Row r-1: df = c - 1; Row r: no additional df, because the estimated expected frequencies in column j sum to Cj. Total: df = (r - 1)(c - 1)

  14. Chi-square Contingency Table Test Summary. Reject Ho if X2 > Xa2, where Xa2 is the chi-square critical value with df = (r - 1)(c - 1). Notational convention: the expected frequencies are written Eij even though they are estimated
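The contingency table test summarized above can be sketched in a few lines of plain Python. The function name `chi_square_independence` and the 2 x 2 example table are hypothetical, introduced only for illustration; the expected frequencies use the usual estimate (row total times column total, divided by n).

```python
def chi_square_independence(observed):
    """Return (X2, df, expected) for an r x c table of observed counts.

    `observed` is a list of rows, each a list of counts.
    Expected frequencies are estimated as R_i * C_j / n.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)

    # Estimated expected frequency for every cell under independence
    expected = [[r * c / n for c in col_totals] for r in row_totals]

    # Pearson statistic summed over all r * c cells
    x2 = sum(
        (o - e) ** 2 / e
        for o_row, e_row in zip(observed, expected)
        for o, e in zip(o_row, e_row)
    )

    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return x2, df, expected


# Hypothetical counts, not taken from the lecture
table = [[10, 20],
         [20, 10]]
x2, df, expected = chi_square_independence(table)
print(f"X2 = {x2:.3f}, df = {df}")
```

The resulting X2 is then compared with the tabled chi-square critical value for the computed df at the chosen significance level.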

  15. Employment Discrimination: Observed Frequencies, Expected Frequencies, Chi-square Calculation

  16. Employment Discrimination: Are age and employment status related? [Table: Age (yrs) by Employment Status]

  17. Employment Discrimination. Ho: Employment Status and Age are independent. Ha: Employment Status and Age are not independent. Reject Ho if X2 > 6.635 (a = 0.01, df = 1). X2 = 138.67. Conclusion: There is sufficient evidence (p < 0.001), using a significance level of 0.01, to conclude that employment status and age are not statistically independent. Reason: A greater number of older employees were terminated than expected under the hypothesis of independence.

  18. Drug Usage [Table: Group by Frequency of Drug Use]

  19. Drug Usage: Observed Frequencies, Expected Frequencies, Chi-Square Calculation

  20. Drug Usage. Ho: Drug Usage and Campus Group are Independent. Ha: Drug Usage and Campus Group are Not Independent. Reject Ho if X2 > 5.991 (a = 0.05, df = 2). X2 = 6.87. Conclusion: Using a significance level of 0.05, there is sufficient evidence (0.025 < p < 0.05) to conclude that drug usage and campus group are not statistically independent. Reason: A greater number of athletes and fewer members of campus organizations reported monthly usage of drugs than expected under the hypothesis of independence.

  21. Chi-square Tests • Tests for independence in contingency tables • Tests for homogeneity

  22. Binomial Samples (Product Binomial Sampling). Genetic Theory: Ho: pW = 0.5 vs. Ha: pW ≠ 0.5 • Hypothesis #1: Is pW = 0.5? • Binomial inference on p • Equivalently, overall goodness of fit (known p) • Hypothesis #2: Are all the pW equal? • Test for homogeneity (equal but unknown p) • Hypothesis #3: Is each pW = 0.5? • Goodness of fit (8 Samples, known p). Assumptions: 8 Samples, mutually independent counts

  23. Test of Homogeneity of k Binomial Samples, Specified p. Ho: p1 = p2 = … = p8 = 0.5 vs. Ha: pj ≠ 0.5 for some j. X2 = 22.96, df = 8, p = 0.003. Note: Does not assume homogeneity (see below)

  24. Test of Homogeneity of k Binomial Samples: Unspecified p. Ho: p1 = p2 = … = p8 vs. Ha: pj ≠ pk for some (j,k)

  25. Test of Homogeneity of k Binomial Samples: Unspecified p. Ho: p1 = p2 = … = p8 vs. Ha: pj ≠ pk for some (j,k). X2 = 20.43, df = 7, p = 0.005. Note: Only one of each pair of expected values is independently estimated (k = 8, not 16)
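The difference between the specified-p and unspecified-p tests for k binomial samples can be sketched as below. The function name `binomial_homogeneity_x2` and the example counts are hypothetical (the lecture's eight samples are not reproduced in the transcript); the sketch assumes the pooled proportion is used as the estimate of the common p when none is specified.

```python
def binomial_homogeneity_x2(successes, trials, p=None):
    """Pearson X2 and df for k independent binomial samples.

    If `p` is given, each sample is tested against that specified
    value and df = k. If `p` is None, the common p is estimated by
    the pooled proportion, which costs one degree of freedom (df = k - 1).
    """
    k = len(successes)
    if p is None:
        p_used = sum(successes) / sum(trials)   # pooled estimate of common p
        df = k - 1
    else:
        p_used = p
        df = k

    x2 = 0.0
    for s, n in zip(successes, trials):
        exp_success = n * p_used
        exp_failure = n * (1 - p_used)
        # Each sample contributes a success cell and a failure cell
        x2 += (s - exp_success) ** 2 / exp_success
        x2 += ((n - s) - exp_failure) ** 2 / exp_failure
    return x2, df


# Hypothetical data: two samples of 40 trials each
succ, n_trials = [10, 20], [40, 40]
print(binomial_homogeneity_x2(succ, n_trials, p=0.5))  # specified p, df = k
print(binomial_homogeneity_x2(succ, n_trials))         # unspecified p, df = k - 1
```

Each sample has two cells, but the two expected values in a pair sum to the sample size, which is why the degrees of freedom count samples rather than cells.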
