Statistical Inference for Frequency Data Chapter 16
Chapter Topics • Applications of Pearson’s Chi-Square • Goodness of Fit • Computations • Practical Significance • Independence • Computations • Practical Significance • Testing equality of 2 or more proportions • Computations • Test Extensions • JMP Procedures
Pearson’s Chi-Square Statistic • Statistic developed by Karl Pearson to test hypotheses about frequency data • Do not confuse with the chi-square distribution, which was derived by Helmert • Used for three purposes • Testing goodness of fit • Did our sample really come from that distribution? • Testing independence • Are gender and alcohol consumption really independent? • Testing equality of proportions • Is the proportion of residents who are college grads the same in every state?
Testing Goodness of Fit • Did our sample really come from that population? • Null Hypothesis • States that the sample did come from that population • Alternative Hypothesis • States that the sample did not come from that population • The test statistic is given by: $\chi^2 = \sum_{j=1}^{k} (O_j - E_j)^2 / E_j$, where $O_j$ is the observed frequency and $E_j$ is the expected frequency in category $j$ • Degrees of freedom: k-1, where k is the number of categories • Compares each observed frequency with the frequency that would be expected if the sample really did come from that population
Testing Goodness of Fit (continued) • The Chi-Square Distribution • We reject the null hypothesis if the test statistic computed on the previous page is greater than the critical value. • [Figure: chi-square density curves for df = 1, 2, 5, and 9; horizontal axis χ²]
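The rejection rule can be checked numerically. The following is a minimal sketch using only the Python standard library, not something taken from the slides: the function name `chi2_sf` is hypothetical, and it computes the upper-tail probability of a chi-square variable for integer degrees of freedom from the closed forms for df = 1 and df = 2 plus the usual recurrence that adds two degrees of freedom at a time. The critical values in the loop are standard α = .05 table values for the df shown in the figure.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    z = x / 2.0
    if df % 2 == 1:
        # df = 1: chi-square is a squared standard normal.
        a, q = 0.5, math.erfc(math.sqrt(z))
    else:
        # df = 2: chi-square is exponential with mean 2.
        a, q = 1.0, math.exp(-z)
    # Each pass adds two degrees of freedom to the tail probability.
    while a < df / 2.0:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q

# Standard alpha = .05 critical values (table values, quoted for illustration).
for df, critical in [(1, 3.841), (2, 5.991), (5, 11.070), (9, 16.919)]:
    print(df, round(chi2_sf(critical, df), 4))
```

Each printed tail probability should be close to .05, which is what "greater than the critical value" means: a statistic beyond that point would occur less than 5% of the time if the null hypothesis were true.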
Testing Goodness of Fit • Did the following data come from a binomial population with 25% graduate students? • If the null hypothesis were true, then we would expect to have 75% undergraduates (150) and 25% graduates (50). • The critical value is 3.841 (α=.05,ν=1); therefore, we reject the null hypothesis. There is evidence the sample did not come from that population. • Note: the degrees of freedom are k-1
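The computation in this example can be sketched in a few lines of Python. The expected counts (150 and 50), the critical value 3.841, and df = k-1 come from the slide; the observed counts below are hypothetical stand-ins, since the slide's data table is not reproduced in this transcript, and the function name is an assumption made for illustration.

```python
def pearson_chi_square(observed, expected):
    """Pearson's statistic: sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

n = 200                              # implied by expected counts of 150 and 50
expected = [0.75 * n, 0.25 * n]      # undergraduates, graduates (from the slide)
observed = [130, 70]                 # HYPOTHETICAL counts, for illustration only

chi_sq = pearson_chi_square(observed, expected)
df = len(observed) - 1               # k - 1
critical_value = 3.841               # alpha = .05, df = 1 (from the slide)
reject_null = chi_sq > critical_value

print(f"chi-square = {chi_sq:.3f}, df = {df}, reject H0: {reject_null}")
```

With these made-up counts the statistic is about 10.67, which exceeds 3.841, so the null hypothesis would be rejected, matching the direction of the slide's conclusion but not its actual numbers.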
Testing Goodness of Fit • Cohen developed the following measure of effect size: $w = \sqrt{\sum_{j=1}^{k} (p_{Oj} - p_{Ej})^2 / p_{Ej}}$ • $p_{Oj}$ is the observed proportion in category $j$ • $p_{Ej}$ is the expected proportion in category $j$ • Cohen's guidelines for interpreting w: • 0.1 – small effect • 0.3 – medium effect • 0.5 – large effect
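As a rough illustration, and assuming the measure meant here is Cohen's w (the square root of the summed squared differences between observed and expected proportions, each divided by the expected proportion), the effect size can be computed as below. The function names, the observed proportions, and the choice to treat 0.1, 0.3, and 0.5 as lower cutoffs are all assumptions made for this sketch.

```python
import math

def cohens_w(observed_props, expected_props):
    """Cohen's w from observed and expected category proportions."""
    return math.sqrt(
        sum((po - pe) ** 2 / pe for po, pe in zip(observed_props, expected_props))
    )

def effect_label(w):
    """Label an effect size, treating the guideline values as lower cutoffs (an assumption)."""
    if w >= 0.5:
        return "large"
    if w >= 0.3:
        return "medium"
    if w >= 0.1:
        return "small"
    return "below small"

expected_props = [0.75, 0.25]    # from the 25% graduate-student hypothesis
observed_props = [0.65, 0.35]    # HYPOTHETICAL sample proportions
w = cohens_w(observed_props, expected_props)
print(round(w, 3), effect_label(w))
```

For a goodness-of-fit test, w is also equal to the square root of the chi-square statistic divided by n, so it can be read directly off an existing test result.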
Testing Goodness of Fit • Assumptions • Every sample observation must fall in one and only one category • The observations must be independent • The sample size n must be large • Rules of thumb: • When we have one degree of freedom, the expected frequency of every cell should be at least ten • When we have two or more degrees of freedom, the expected frequency of every cell should be at least five • Yates’ Correction for Continuity • Apply this correction when we have one degree of freedom and the expected frequency of one or more cells is not appreciably greater than ten
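The rules of thumb and the continuity correction can be sketched as follows. The slide does not give the correction formula; this sketch assumes the standard form, which subtracts 0.5 from each absolute difference |O - E| before squaring. The function names and the counts are hypothetical.

```python
def meets_rule_of_thumb(expected, df):
    """Check expected frequencies against the slide's thresholds (10 for df = 1, else 5)."""
    minimum = 10 if df == 1 else 5
    return all(e >= minimum for e in expected)

def pearson_chi_square(observed, expected):
    """Uncorrected Pearson statistic."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def yates_chi_square(observed, expected):
    """Pearson statistic with Yates' continuity correction (intended for df = 1)."""
    return sum((abs(o - e) - 0.5) ** 2 / e for o, e in zip(observed, expected))

expected = [30.0, 10.0]   # HYPOTHETICAL: n = 40 with a 75% / 25% split
observed = [25, 15]       # HYPOTHETICAL counts

print(meets_rule_of_thumb(expected, df=1))
print(round(pearson_chi_square(observed, expected), 3))
print(round(yates_chi_square(observed, expected), 3))
```

The smallest expected frequency here is exactly ten, so the rule of thumb is only just met; the corrected statistic is smaller than the uncorrected one, which makes the test more conservative.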
Testing Independence • Are gender and classification independent? • Null Hypothesis • States that the variables are independent • Alternative Hypothesis • States that the variables are not independent • The test statistic is given by: $\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} (O_{ij} - E_{ij})^2 / E_{ij}$, where $O_{ij}$ and $E_{ij}$ are the observed and expected frequencies in row $i$, column $j$ • Degrees of freedom: (r-1)(c-1), where r and c are the numbers of rows and columns • Compares each observed cell frequency with the frequency that would be expected if the variables were independent
Testing Independence • Determining the Expected Value • If the variables were independent, the probability of being a male undergraduate would be the probability of being a male times the probability of being an undergraduate • This product gives a probability. To determine the expected count, multiply that probability by the sample size, which is the same as (row total × column total) / n • Computational Example
Testing Independence • Put into a contingency table & compute expected values • Expected Values: • The Test Statistic:
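The contingency-table computation can be sketched as below. The table values are hypothetical, since the slide's data are not reproduced in this transcript, and the function names are assumptions. Each expected count is the row total times the column total divided by n, which is the product of the two marginal probabilities multiplied by the sample size.

```python
def expected_counts(table):
    """Expected cell counts under independence: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    expected = expected_counts(table)
    stat = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

# HYPOTHETICAL counts: rows = gender (male, female),
# columns = classification (undergraduate, graduate).
table = [[60, 40],
         [50, 50]]

expected = expected_counts(table)
stat, df = chi_square_independence(table)
print(expected)
print(f"chi-square = {stat:.3f}, df = {df}")
```

With these made-up counts the statistic is about 2.02 on 1 degree of freedom, below the α = .05 critical value of 3.841, so independence would not be rejected.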
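Testing Independence • Cramér developed the following measure of association: $V = \sqrt{(\chi^2 / n) / (s - 1)}$, where n is the sample size and s is the smaller of the number of rows and the number of columns • When the variables are independent, the numerator $\chi^2 / n$ is 0 • When there is a perfect association, the numerator is (s-1) • Cohen’s measure of effect size, $w = \sqrt{\chi^2 / n}$, is the same as V when s = 2, and the values are interpreted the same: • 0.1 – small effect • 0.3 – medium effect • 0.5 – large effect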
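A minimal sketch of this measure follows, assuming it is Cramér's V in its usual form: the square root of (chi-square / n) divided by (s - 1), with s the smaller of the number of rows and columns. The function name and all three tables are hypothetical.

```python
import math

def cramers_v(table):
    """Cramér's V for an r x c table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi_sq = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi_sq += (observed - expected) ** 2 / expected
    s = min(len(table), len(table[0]))   # smaller table dimension
    return math.sqrt((chi_sq / n) / (s - 1))

print(round(cramers_v([[60, 40], [50, 50]]), 4))   # weak association
print(cramers_v([[20, 30], [40, 60]]))             # rows proportional: independent
print(cramers_v([[50, 0], [0, 50]]))               # perfect association
```

The second and third tables illustrate the two bounds described above: the numerator chi-square / n is 0 when the rows are proportional, and equals s - 1 when each row occupies a single column, giving V = 1.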
Testing Equality of c≥2 Proportions • Null Hypothesis • States that all proportions are equal • Alternative Hypothesis • States that at least two of the proportions are not equal • The test statistic is given by: $\chi^2 = \sum_{i=1}^{2} \sum_{j=1}^{c} (O_{ij} - E_{ij})^2 / E_{ij}$, computed on the 2 × c table of successes and failures • Degrees of freedom: c-1 • Compares each observed cell frequency with the frequency that would be expected if the proportions were equal
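One way to sketch this test, assuming the statistic is Pearson's chi-square on the 2 × c table of successes and failures with expected counts taken from the pooled proportion, is shown below. The function name and the counts are hypothetical; 5.991 is the standard α = .05 critical value for 2 degrees of freedom.

```python
def chi_square_equal_proportions(successes, sample_sizes):
    """Pearson chi-square for H0: all c population proportions are equal."""
    pooled = sum(successes) / sum(sample_sizes)   # common proportion under H0
    stat = 0.0
    for x, n in zip(successes, sample_sizes):
        exp_success = pooled * n
        exp_failure = (1 - pooled) * n
        stat += (x - exp_success) ** 2 / exp_success
        stat += ((n - x) - exp_failure) ** 2 / exp_failure
    df = len(successes) - 1                       # c - 1
    return stat, df

# HYPOTHETICAL data: number of college graduates in samples of 100 from three states.
successes = [30, 40, 50]
sample_sizes = [100, 100, 100]

stat, df = chi_square_equal_proportions(successes, sample_sizes)
critical_value = 5.991   # alpha = .05, df = 2 (standard table value)
print(f"chi-square = {stat:.3f}, df = {df}, reject H0: {stat > critical_value}")
```

With these made-up counts the pooled proportion is .40 and the statistic is about 8.33, which exceeds 5.991, so the hypothesis of equal proportions would be rejected.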
JMP Procedures • Independence • Three columns: Two nominal and one continuous • First column: Gender • Second column: Classification • Third column: Count • Analyze | Fit Y by X | Y – Gender | X – Classification | Freq – Count | OK |
Chapter Review • Applications of Pearson’s Chi-Square • Goodness of Fit • Computations • Practical Significance • Independence • Computations • Practical Significance • Testing equality of 2 or more proportions • Computations • Test Extensions • JMP Procedures