Statistical Analysis

Statistical Analysis. Professor Lynne Stokes, Department of Statistical Science. Lecture #2: Chi-square Tests for Homogeneity, Chi-square Goodness of Fit Test. Chi-square Tests: tests for independence in contingency tables, tests for homogeneity. Binomial Samples (Product Binomial Sampling).

Presentation Transcript


  1. Statistical Analysis • Professor Lynne Stokes, Department of Statistical Science • Lecture #2: Chi-square Tests for Homogeneity, Chi-square Goodness of Fit Test,

  2. Chi-square Tests • Tests for independence in contingency tables • Tests for homogeneity

  3. Binomial Samples (Product Binomial Sampling) Genetic Theory: Ho: pW = 0.5 vs. Ha: pW ≠ 0.5 • Hypothesis #1: Is pW = 0.5? • Binomial inference on p • Equivalently, overall goodness of fit (known p) • Hypothesis #2: Are all the pW equal? • Test for homogeneity (equal but unknown p) • Hypothesis #3: Is each pW = 0.5? • Goodness of fit (8 samples, known p) • Assumptions: 8 samples, mutually independent counts

  4. Test of Homogeneity of k Binomial Samples, Specified p • Ho: p1 = p2 = … = p8 = 0.5 vs. Ha: pj ≠ 0.5 for some j • X² = 22.96, df = 8, p = 0.003 • Does not assume homogeneity (see below)

  5. Test of Homogeneity of k Binomial Samples: Unspecified p • Ho: p1 = p2 = … = p8 vs. Ha: pj ≠ pk for some (j, k)

  6. Test of Homogeneity of k Binomial Samples: Unspecified p • Ho: p1 = p2 = … = p8 vs. Ha: pj ≠ pk for some (j, k) • X² = 20.43, df = 7, p = 0.005 • Note: Only one of each pair of expected values is independently estimated (k = 8, not 16)
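The two homogeneity tests differ only in where the success probability comes from and in the degrees of freedom. The following is a minimal sketch using only the Python standard library. The function names (`chi2_sf`, `homogeneity_test`) and the counts are hypothetical, since the lecture's data are not in the transcript. Only the last two lines use numbers from the slides, to check the reported p-values.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    # Closed form for df = 2 (even) or df = 1 (odd) as the starting point
    if df % 2 == 0:
        q, k = math.exp(-half), 2
    else:
        q, k = math.erfc(math.sqrt(half)), 1
    # Recurrence: Q(x; k+2) = Q(x; k) + (x/2)^(k/2) e^(-x/2) / Gamma(k/2 + 1)
    while k < df:
        q += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return q

def homogeneity_test(successes, sizes, p0=None):
    """Chi-square test for k binomial samples.

    p0 given: expected counts use p0 and df = k (specified p).
    p0 None:  the pooled proportion is estimated and df = k - 1 (unspecified p).
    """
    k = len(sizes)
    if p0 is None:
        p, df = sum(successes) / sum(sizes), k - 1
    else:
        p, df = p0, k
    x2 = 0.0
    for y, n in zip(successes, sizes):
        # Both cells (success and failure) of each sample contribute
        x2 += (y - n * p) ** 2 / (n * p)
        x2 += ((n - y) - n * (1 - p)) ** 2 / (n * (1 - p))
    return x2, df, chi2_sf(x2, df)

# Hypothetical counts (NOT the lecture's data): successes out of 20 in 8 samples
successes = [12, 9, 14, 7, 11, 15, 8, 10]
sizes = [20] * 8
x2_spec, df_spec, p_spec = homogeneity_test(successes, sizes, p0=0.5)
x2_unspec, df_unspec, p_unspec = homogeneity_test(successes, sizes)
print(x2_spec, df_spec, p_spec)
print(x2_unspec, df_unspec, p_unspec)

# p-values for the statistics reported on the slides
print(round(chi2_sf(22.96, 8), 3), round(chi2_sf(20.43, 7), 3))
```

Estimating the common p costs one degree of freedom, which is why the unspecified-p test uses df = 7 for eight samples.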

  7. Chi-square Tests • Tests for independence in contingency tables • Tests for homogeneity • Goodness of fit tests

  8. Chi-square Goodness of Fit Test: Specified Probabilities • Assumptions • n independent observations • k mutually exclusive possible outcomes • pj = Pr(outcome j) is the same on every trial • Sample size condition • All npj ≥ 1 • At least 80% of the npj ≥ 5

  9. Goodness of Fit Test: Specified Probabilities • Sample size: n • Observed count for outcome j: Oj • Expected count for outcome j: Ej = npj • Ho: Pr(outcome j) = pj for j = 1, ..., k • Ha: Pr(outcome j) ≠ pj for at least one j • Test statistic: X² = Σ (Oj – Ej)² / Ej • Reject Ho if X² > χ²α, the upper-α critical value of the chi-square distribution with df = k – 1

  10. Cognitive Learning: Sufficient Evidence of Cognitive Learning? (p = 0.026)

      Path chosen        A    B    C    D    Total
      Number of rats     4    5    8   15       32
      Expected number    8    8    8    8       32

      Using a significance level of α = 0.05, there is sufficient evidence (p = 0.026) to reject the hypothesis that rats choose the 4 doors with equal probability.
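The rat example can be checked directly from the counts on the slide. This is a minimal sketch using only the standard library. The helper names `chi2_sf` and `goodness_of_fit` are illustrative, not part of the lecture.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        q, k = math.exp(-half), 2
    else:
        q, k = math.erfc(math.sqrt(half)), 1
    # Recurrence: Q(x; k+2) = Q(x; k) + (x/2)^(k/2) e^(-x/2) / Gamma(k/2 + 1)
    while k < df:
        q += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return q

def goodness_of_fit(observed, probs):
    """Pearson X^2 for fully specified probabilities; df = k - 1."""
    n = sum(observed)
    expected = [n * p for p in probs]
    x2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return x2, len(observed) - 1, expected

observed = [4, 5, 8, 15]                 # rats choosing paths A, B, C, D (from the slide)
x2, df, expected = goodness_of_fit(observed, [0.25] * 4)
p_value = chi2_sf(x2, df)
print(x2, df, round(p_value, 3))         # 9.25 3 0.026
```

All four expected counts equal 8, so the sample size condition (all Ej ≥ 1, at least 80% ≥ 5) is met.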

  11. Mendelian Inheritance • Do the genotypes of a cross-breeding occur in the ratio 9:3:3:1? • Reject Ho if X² > 7.815 (α = 0.05, df = 3)

  12. Mendelian Inheritance • Contributions of the four genotype classes to X²: 0.25, 0.08, 1.33, 1.00 • X² = 0.25 + 0.08 + 1.33 + 1.00 = 2.66 • There is insufficient evidence (p > 0.10) at a significance level of 0.05 to conclude that the genotypes from this type of cross-breeding occur in proportions that differ from those predicted by Mendelian inheritance theory.
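For a hypothesized ratio such as 9:3:3:1, the class probabilities are the ratio terms divided by their sum. The sketch below converts the ratio to expected counts for a hypothetical sample size (`n = 160` is an assumption, since the transcript does not give the observed counts). It then evaluates the p-value for the X² reported on the slide, using the closed form of the chi-square upper tail for df = 3.

```python
import math

ratio = [9, 3, 3, 1]
probs = [r / sum(ratio) for r in ratio]      # 9/16, 3/16, 3/16, 1/16

n = 160                                      # hypothetical sample size, not from the lecture
expected = [n * p for p in probs]            # expected counts under Ho

def chi2_sf_df3(x):
    """Upper-tail probability of a chi-square variable with 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2.0)) + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0)

contributions = [0.25, 0.08, 1.33, 1.00]     # per-class terms shown on the slide
x2 = sum(contributions)
p_value = chi2_sf_df3(x2)
print(expected, round(x2, 2), round(p_value, 2))
```

The computed p-value is well above 0.10, in line with the slide's conclusion.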

  13. Chi-Square Goodness of Fit Test: Unknown Parameters • Estimate the parameters of the distribution • Divide range of data values into mutually exclusive and exhaustive classes • Discrete data: often use the values themselves • Continuous data: use k = n^(1/2) or k = log(n) classes • Estimate the probability of being in each class • Compare the observed (Oi) counts in each class with the estimated expected (Ei) counts

  14. Chi-Square Goodness of Fit Test for the Poisson Distribution • Number of senders (automated telephone equipment) in use at a given time • 23 – 1 = 22 categories • Ho: number ~ Poisson vs. Ha: number not Poisson • df: 22 – 1 (mutually exclusive & exhaustive) – 1 (estimated parameter) = 20 • Reject Ho if X² > χ²0.05(20) = 31.4
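The Poisson procedure (estimate the mean, compute class probabilities, compare observed with expected counts, subtract one degree of freedom for the estimated parameter) can be sketched as follows. The counts are hypothetical, since the telephone data are not in the transcript, and `poisson_gof` is an illustrative name. The sketch treats the last class as open-ended and estimates the mean as if every observation in that class sat at its lower limit.

```python
import math

def poisson_gof(counts):
    """counts[x] = number of observations equal to x; the last class means 'x or more'."""
    n = sum(counts)
    k = len(counts)
    # Estimated Poisson mean (assumption: tail observations counted at the lower limit)
    lam = sum(x * c for x, c in enumerate(counts)) / n
    probs = [math.exp(-lam) * lam ** x / math.factorial(x) for x in range(k - 1)]
    probs.append(1.0 - sum(probs))           # open-ended last class so probabilities sum to 1
    expected = [n * p for p in probs]
    x2 = sum((o - e) ** 2 / e for o, e in zip(counts, expected))
    df = k - 1 - 1                           # k classes, minus 1, minus 1 estimated parameter
    return lam, expected, x2, df

# Hypothetical counts (NOT the lecture's data) for the values 0, 1, 2, 3, 4, 5+
counts = [10, 25, 30, 20, 10, 5]
lam, expected, x2, df = poisson_gof(counts)
print(lam, [round(e, 2) for e in expected], round(x2, 3), df)
```

Classes with very small expected counts would normally be merged with a neighbor before computing X², which reduces k and therefore df.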

  15. Chi-Square Goodness of Fit Test for the Normal Distribution • Divide the data into mutually exclusive and exhaustive (contiguous) classes • First and last classes are open-ended • (–∞, U1), (L2, U2), (L3, U3), …, (Lk, ∞) with Lj = Uj–1 • Estimate the mean and standard deviation • Calculate z-scores for the limits of each class • Estimate the probability content for each class • pj = Pr(zLj < z < zUj) • Estimate the expected frequency for each class • Ej = npj
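The class probabilities for the normal case come from differences of the standard normal CDF at the z-scores of the class limits. The following is a minimal sketch with hypothetical class limits, mean, standard deviation, and sample size. The function names are illustrative only.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_class_expected(cuts, mean, sd, n):
    """Expected counts for classes (-inf, U1), (U1, U2), ..., (U_last, inf).

    cuts holds the interior class limits in increasing order.
    """
    z = [(c - mean) / sd for c in cuts]                  # z-scores of the class limits
    cdf = [0.0] + [normal_cdf(v) for v in z] + [1.0]     # open-ended first and last classes
    probs = [hi - lo for lo, hi in zip(cdf[:-1], cdf[1:])]
    return [n * p for p in probs]

# Hypothetical values (not from the lecture)
cuts = [40.0, 50.0, 60.0]
expected = normal_class_expected(cuts, mean=50.0, sd=10.0, n=200)
k = len(expected)
df = k - 2 - 1          # two estimated parameters (mean and standard deviation)
print([round(e, 2) for e in expected], df)
```

With only four classes and two estimated parameters, a single degree of freedom remains, which is one reason to use more classes when n allows.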

  16. Chi-Square Goodness of Fit Test • Can be applied to any discrete or continuous probability distribution; only probabilities need be specified: Ei = npi • Asymptotic chi-square distribution • All Ei > 1 and at least 80% of the Ei > 5 • Does not have the highest power for specific distributions, against specific alternatives • Degrees of freedom (k classes) • If each class represents an independent sample (i.e., k replicate samples) and all parameters are known (i.e., known probabilities), df = k • If the classes represent mutually exclusive and exhaustive categories (i.e., expected frequencies must sum to n) and the data are independent and from a single sample: • All parameters are known: df = k – 1 • r parameters are estimated: df = k – r – 1 • e.g., (n – 1)s²/σ² ~ χ²(n – 1)

  17. Goodness of Fit to the Binomial, Known p • Normal theory approximation • Chi-square tests

  18. Binomial Sample, Specified p: Normal Theory Approximation • Genetic Theory: Ho: pW = 0.5 vs. Ha: pW ≠ 0.5 • Greater power by combining samples (assuming homogeneity) • p = 0.110

  19. Alternative to the Binomial Test: Chi-square Goodness of Fit, Specified p • Genetic Theory: Ho: pW = 0.5 vs. Ha: pW ≠ 0.5 • p = 0.110
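Both slides report the same p-value because, for a single binomial sample, the two-cell chi-square statistic equals the square of the normal-theory z statistic. The sketch below illustrates this with hypothetical counts (58 successes in 100 trials). These counts are chosen only so that X² matches the 2.56 reported later in the lecture and are not the lecture's actual data.

```python
import math

def binomial_z_test(successes, n, p0):
    """Normal-theory approximation: two-sided test of Ho: p = p0."""
    z = (successes - n * p0) / math.sqrt(n * p0 * (1 - p0))
    p_value = math.erfc(abs(z) / math.sqrt(2.0))     # 2 * (1 - Phi(|z|))
    return z, p_value

def chi2_two_cells(successes, n, p0):
    """Pearson X^2 over the success and failure cells; df = 1."""
    e_succ, e_fail = n * p0, n * (1 - p0)
    return (successes - e_succ) ** 2 / e_succ + ((n - successes) - e_fail) ** 2 / e_fail

# Hypothetical counts (NOT the lecture's data)
z, p_value = binomial_z_test(58, 100, 0.5)
x2 = chi2_two_cells(58, 100, 0.5)
print(round(z, 2), round(x2, 2), round(p_value, 3))
```

Since a chi-square variable with 1 degree of freedom is the square of a standard normal, the p-value from X² with df = 1 is the same two-sided p-value.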

  20. Overall Binomial Test vs. Test of Homogeneity, Specified p • Ho: p1 = p2 = … = p8 = 0.5 vs. Ha: pj ≠ 0.5 for some j • Overall binomial test, greater power if homogeneous: X² = 2.56, df = 1, p = 0.110 • Test of homogeneity, greater power if not homogeneous: X² = 22.96, df = 8, p = 0.003

  21. Binomial Samples, pW Unspecified • Homogeneity with unspecified p is equivalent to independence

  22. Some Goodness of Fit Tests • Chi-square Goodness-of-fit test • Very general, can have little power • Kolmogorov-Smirnov goodness-of-fit test • Good general test, especially for continuous random variables • Wilk-Shapiro test for normality • Regarded as the best test for normality

  23. Comparing Odds Ratios Across Categories

  24. Race and Death Penalty Punishment • Are the results consistent across aggravation levels?

  25. Mantel-Haenszel Test • Several 2 x 2 tables • Assuming a common odds ratio, test that the odds ratio = 1

  26. Race and Death Penalty Punishment • Expected frequencies for chi-square test of independence • Note: None of the tables has a sufficient sample size for a test of independence

  27. Mantel-Haenszel Test • Select one cell; e.g., upper-left • Calculate the excess for each table • Excess = Observed – Expected • e.g., Excess = O11 – E11 • Calculate the variances of the excesses • Variance = R1R2C1C2 / [n²(n – 1)]
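The excess and variance calculations can be sketched as below for a list of 2 x 2 tables. The tables are hypothetical, since the death penalty counts are not in the transcript. The final step, dividing the summed excess by the square root of the summed variance and referring the result to the standard normal, is the usual form of the test without a continuity correction. It is an assumption here, because the transcript does not spell it out.

```python
import math

def mantel_haenszel(tables):
    """tables: list of 2x2 tables [[a, b], [c, d]]; the upper-left cell is the selected cell."""
    total_excess = 0.0
    total_var = 0.0
    for (a, b), (c, d) in tables:
        r1, r2 = a + b, c + d            # row totals
        c1, c2 = a + c, b + d            # column totals
        n = r1 + r2
        expected = r1 * c1 / n           # E11 under independence
        total_excess += a - expected     # Excess = O11 - E11
        total_var += r1 * r2 * c1 * c2 / (n ** 2 * (n - 1))
    z = total_excess / math.sqrt(total_var)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))     # two-sided normal p-value
    return total_excess, total_var, z, p_value

# Hypothetical tables (NOT the lecture's data), one per stratum
tables = [
    [[10, 5], [4, 11]],
    [[8, 7], [3, 12]],
]
excess, var, z, p_value = mantel_haenszel(tables)
print(round(excess, 2), round(var, 3), round(z, 3), round(p_value, 4))
```

Summing excesses across strata pools evidence from tables that are individually too small for a chi-square test of independence.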

  28. Race and Death Penalty Punishment Conclusion: Nearly 7 more white-victim murderers received the death penalty than would be expected if the odds were the same for white- and black-victim murderers

  29. Estimating the Common Odds Ratio Death Penalty and Race
