Inference for Categorical Variables
Probability & Statistics
L. Weinstein, May 2014
Testing a Claim with Categorical Data
Three tests:
• Goodness of Fit Test: Does the distribution of the categorical variable fit an expected model?
• Test for Homogeneity of Populations: Does each population have the same distribution for this variable?
• Test for Association / Independence: Are two categorical variables associated?
Goodness of Fit Test
State: Is the distribution of <your variable here> different from the expected distribution of <be specific here>?
H0: The distribution is the same as expected for all categories.
Ha: The distribution is different from expected for at least one category.
Test at significance level <choose a level>.
Goodness of Fit Test
Plan: Use a Goodness of Fit test.
Conditions:
• The sample is randomly selected from the population.
• All expected counts are at least 5.
• Sample observations are independent; that is, if sampling without replacement, the sample size is no more than 10% of the population size.
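The two numeric conditions above can be checked before running the test. The sketch below is only an illustration: the function name `check_gof_conditions`, its parameters, and the die-roll counts are hypothetical and do not come from the slides. Random selection cannot be verified from the counts and still has to be argued from how the data were collected.

```python
def check_gof_conditions(observed, proportions, population_size=None, min_expected=5):
    """Return the expected counts and whether the two numeric conditions hold."""
    n = sum(observed)
    # Expected count for a category = sample size * hypothesized proportion
    expected = [n * p for p in proportions]
    large_counts_ok = all(e >= min_expected for e in expected)
    # The 10% condition only matters when sampling without replacement
    # from a finite population of known size.
    ten_percent_ok = population_size is None or n <= 0.10 * population_size
    return expected, large_counts_ok, ten_percent_ok


# Hypothetical data: 60 rolls of a die that is claimed to be fair
observed = [8, 12, 9, 11, 10, 10]
proportions = [1 / 6] * 6
expected, large_ok, ten_ok = check_gof_conditions(observed, proportions)
print(expected, large_ok, ten_ok)
```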
Goodness of Fit Test
To conduct the test in Minitab, summarize the data by category and put the counts in one column. If equal counts are expected, this is enough. If something other than equal counts is expected, make a second column of expected counts. Then run Stat > Tables > Chi-Square Goodness of Fit Test in Minitab.
Goodness of Fit Test
Enter the column names for Observed Counts, Category names, and Proportions specified by historical counts (this is your expected counts list):
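For readers without Minitab, roughly the same test can be run in Python. This is a minimal sketch that assumes SciPy is installed and uses `scipy.stats.chisquare`; the counts and proportions are made up for illustration and are not from the slides.

```python
from scipy.stats import chisquare

# Hypothetical observed counts, one per category
observed = [8, 12, 9, 11, 10, 10]

# Case 1: equal counts expected, so no expected-count list is needed
result_equal = chisquare(observed)

# Case 2: unequal counts expected. Convert the hypothesized proportions
# to expected counts so they sum to the same total as the observed counts.
proportions = [0.10, 0.20, 0.15, 0.20, 0.15, 0.20]
n = sum(observed)
expected = [n * p for p in proportions]
result_custom = chisquare(observed, f_exp=expected)

print("equal expected: ", result_equal.statistic, result_equal.pvalue)
print("custom expected:", result_custom.statistic, result_custom.pvalue)
```

Unlike the Minitab dialog, `chisquare` takes expected counts rather than proportions, which is why the proportions are multiplied by the sample size first.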
Goodness of Fit Test
Do: <Include Minitab results of chi-square test here>
<Indicate the value of the test statistic, χ², and the P-value of the test.>
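For reference, the statistic reported for a goodness of fit test is the standard chi-square statistic. The symbols here are notation chosen for this note, not taken from the slides: O_i is the observed count in category i, E_i is the expected count, and k is the number of categories.

```latex
\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i},
\qquad \text{df} = k - 1
```

The P-value is the area to the right of the observed statistic under the chi-square distribution with k - 1 degrees of freedom.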
Goodness of Fit Test
Conclude: <Compare your P-value to your significance level. Based on this comparison, either reject or fail to reject the null hypothesis. Conclude, or do NOT conclude, the alternative hypothesis in words.>
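The comparison in the Conclude step is mechanical and can be sketched as a small helper. The function name `conclude` and the wording of the returned sentences are hypothetical. The sketch assumes the common convention of rejecting when the P-value is strictly less than the significance level; some courses use "less than or equal to" instead. The same rule applies to all three tests in this deck.

```python
def conclude(p_value, alpha):
    """Turn a P-value and a significance level into the test decision."""
    if not (0 <= p_value <= 1):
        raise ValueError("p_value must be between 0 and 1")
    if not (0 < alpha < 1):
        raise ValueError("alpha must be strictly between 0 and 1")
    if p_value < alpha:
        return "Reject the null hypothesis: the data support the alternative hypothesis."
    return "Fail to reject the null hypothesis: the data do not support the alternative hypothesis."


print(conclude(0.03, 0.05))
print(conclude(0.20, 0.05))
```

The returned sentence is only the decision; the written conclusion should still restate the alternative hypothesis in the context of the variable being studied.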
Test for Homogeneity
State: Is the distribution of <your variable here> different for the populations <be specific here>?
H0: The distribution is the same for all populations.
Ha: The distribution is different for at least one population.
Test at significance level <choose a level>.
Test for Homogeneity
Plan: Use a Test for Homogeneity.
Conditions:
• Samples are randomly selected from each population.
• All expected counts are at least 5.
• Sample observations are independent; that is, if sampling without replacement, each sample size is no more than 10% of that population's size.
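For a two-way table, the expected count in each cell is (row total × column total) / grand total, so the "at least 5" condition can be checked directly from the observed counts. The sketch below uses plain Python; the function name `expected_counts` and the table of counts are hypothetical.

```python
def expected_counts(table):
    """Expected count for each cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Hypothetical counts: each column is one population,
# each row is one category of the variable.
table = [
    [30, 20],
    [20, 30],
    [10, 40],
]
expected = expected_counts(table)
all_at_least_5 = all(e >= 5 for row in expected for e in row)
print(expected, all_at_least_5)
```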
Test for Homogeneity
To conduct the test in Minitab, make one column for each population containing the summarized distribution (category counts) of the variable. Then run Stat > Tables > Chi-Square Test (2-way table) in Minitab.
Test for Homogeneity
Enter the column names for each population:
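An equivalent run outside Minitab might look like the following. It is a sketch that assumes NumPy and SciPy are installed and uses `scipy.stats.chi2_contingency`; the counts are invented for illustration, with one column per population as in the Minitab layout.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical counts: columns are populations, rows are categories
observed = np.array([
    [30, 20],
    [20, 30],
    [10, 40],
])

# correction=False turns off Yates' continuity correction, which SciPy
# would otherwise apply to 2x2 tables (it has no effect on larger tables).
chi2, p_value, dof, expected = chi2_contingency(observed, correction=False)

print("chi-square statistic:", chi2)
print("degrees of freedom:  ", dof)
print("P-value:             ", p_value)
print("expected counts:\n", expected)
```

The same call works for the test of independence, since the two tests share the same arithmetic and differ only in how the data were collected and how the hypotheses are worded.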
Test for Homogeneity
Do: <Include Minitab results of chi-square test here>
<Indicate the value of the test statistic, χ², and the P-value of the test.>
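For a two-way table with r rows and c columns, the statistic sums over every cell. The notation is chosen for this note rather than taken from the slides: O_ij and E_ij are the observed and expected counts in row i and column j, R_i and C_j are the row and column totals, and n is the grand total.

```latex
E_{ij} = \frac{R_i \, C_j}{n},
\qquad
\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}},
\qquad
\text{df} = (r - 1)(c - 1)
```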
Test for Homogeneity
Conclude: <Compare your P-value to your significance level. Based on this comparison, either reject or fail to reject the null hypothesis. Conclude, or do NOT conclude, the alternative hypothesis in words.>
Test for Independence
State: Is there an association between <categorical variable one> and <categorical variable two>?
H0: There is no association between the variables (they are independent).
Ha: There is an association between the variables (they are NOT independent).
Test at significance level <choose a level>.
Test for Independence
Plan: Use a Test for Independence / Association.
Conditions:
• The sample is randomly selected from the population.
• All expected counts are at least 5.
• Sample observations are independent; that is, if sampling without replacement, the sample size is no more than 10% of the population size.
Test for Independence
To conduct the test in Minitab, make a two-way table summarizing the observed counts for each combination of categories of the two variables. Then run Stat > Tables > Chi-Square Test (2-way table) in Minitab.
Test for Independence
Enter the column names that contain the summarized data:
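When the data start as one record per individual rather than as a summary table, the two-way table has to be tallied first. The sketch below does the tally and then the whole test by hand using only the Python standard library. Everything named here (`chi_square_sf`, `independence_test`, the "junior"/"senior" and "yes"/"no" labels, and the counts) is hypothetical. The P-value helper relies on a standard recurrence for the upper incomplete gamma function; in practice `scipy.stats.chi2.sf` would be the more usual choice.

```python
import math
from collections import Counter


def chi_square_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable with df degrees of freedom."""
    z = x / 2
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)              # exact tail for df = 2
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))   # exact tail for df = 1
    # Recurrence: Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1)
    while a < df / 2:
        q += z ** a * math.exp(-z) / math.gamma(a + 1)
        a += 1
    return q


def independence_test(pairs):
    """Chi-square test of independence from a list of (variable one, variable two) pairs."""
    counts = Counter(pairs)
    rows = sorted({first for first, _ in pairs})
    cols = sorted({second for _, second in pairs})
    table = [[counts[(r, c)] for c in cols] for r in rows]

    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected

    df = (len(rows) - 1) * (len(cols) - 1)
    return table, statistic, df, chi_square_sf(statistic, df)


# Hypothetical raw data: one (class year, answer) pair per student
pairs = (
    [("junior", "yes")] * 30 + [("junior", "no")] * 20
    + [("senior", "yes")] * 20 + [("senior", "no")] * 30
)
table, statistic, df, p_value = independence_test(pairs)
print(table, statistic, df, p_value)
```

The tallied `table` is what would be typed into the Minitab worksheet columns; rows and columns are ordered alphabetically by label.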
Test for Independence
Do: <Include Minitab results of chi-square test here>
<Indicate the value of the test statistic, χ², and the P-value of the test.>
Test for Independence
Conclude: <Compare your P-value to your significance level. Based on this comparison, either reject or fail to reject the null hypothesis. Conclude, or do NOT conclude, the alternative hypothesis in words.>