Chris Morgan, MATH G160, csmorgan@purdue.edu, April 13, 2012. Lecture 30. Chapter 2.4: Chi-Squared (χ²) Test and Independence between Two Categorical Variables. Two-Way Tables.
Two-Way Tables • A two-way table cross-classifies observations by two categorical variables, which lets you find conditional, joint, and marginal probabilities • Expected count: the count we would expect to see in a cell of a two-way table if the null hypothesis were true • The null hypothesis is the default claim being tested; for a two-way table it is that the two variables are not associated
Example 1a: • Above is a sample of students in the College of Business. They were asked their chosen major and their sex. • 1. What is the probability that a student is a Finance major? • 2. What is the probability that a student is female?
Example 1b: • Above is a sample of students in the College of Business. They were asked their chosen major and their sex. • 3. What is the probability that a student is female given that the person is in Administration?
Example 1c: • Above is a sample of students in the College of Business. They were asked their chosen major and their sex. • 4. What is the probability that a student is an Administration major given that the student is female?
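The table for Example 1 is not reproduced in these notes, so the sketch below uses hypothetical counts (every number and the list of majors are assumptions made only for illustration). It shows how the marginal and conditional probabilities asked for in Examples 1a to 1c could be read off a two-way table.

```python
from fractions import Fraction

# HYPOTHETICAL counts: the real Example 1 table is not available here.
# Outer keys are sex, inner keys are major.
counts = {
    "Female": {"Accounting": 30, "Administration": 40, "Finance": 30},
    "Male": {"Accounting": 20, "Administration": 10, "Finance": 20},
}

grand_total = sum(sum(row.values()) for row in counts.values())


def major_total(major):
    """Column total: students of either sex in the given major."""
    return sum(row[major] for row in counts.values())


def marginal_major(major):
    """P(major): column total over grand total."""
    return Fraction(major_total(major), grand_total)


def marginal_sex(sex):
    """P(sex): row total over grand total."""
    return Fraction(sum(counts[sex].values()), grand_total)


def sex_given_major(sex, major):
    """P(sex | major): cell count over the column total."""
    return Fraction(counts[sex][major], major_total(major))


def major_given_sex(major, sex):
    """P(major | sex): cell count over the row total."""
    return Fraction(counts[sex][major], sum(counts[sex].values()))


print(marginal_major("Finance"))                    # question 1
print(marginal_sex("Female"))                       # question 2
print(sex_given_major("Female", "Administration"))  # question 3
print(major_given_sex("Administration", "Female"))  # question 4
```

Questions 3 and 4 use the same cell count but divide by different totals, which is why the two conditional probabilities generally differ.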
Hypothesis Testing • Our null hypothesis (Ho) is what we expect to see if there is no association between the variables • Our alternative hypothesis (Ha) is that there is some departure from the null hypothesis • Never say “accept the Ho” or “accept the Ha” • Always “reject the Ho” or “fail to reject the Ho” • Why? • For the chi-square test: • Ho: there is no association between the two categorical variables (they are independent) • Ha: there is an association between the two categorical variables (a relationship exists)
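Calculating a Chi-Squared Statistic • Denoted χ² • The observed count is whatever value we see in the table • The expected count for each cell in the table can be found by taking: expected count = (row total × column total) / n, where n is the total number of observations • The statistic sums over every cell in the table: χ² = Σ (observed − expected)² / expected • Note: We can safely use the χ² test under two important conditions: 1. when no more than 20% of the expected counts are less than five 2. when all individual expected counts are one or greater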
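As a minimal sketch of the calculation described above, the functions below compute expected counts as (row total × column total) / n, sum (observed − expected)² / expected over all cells, and check the two conditions on expected counts. The function names and the observed table are hypothetical and chosen only for illustration.

```python
def expected_counts(observed):
    """Expected count per cell: (row total * column total) / n."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_square_statistic(observed):
    """Sum of (observed - expected)^2 / expected over every cell."""
    expected = expected_counts(observed)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )


def conditions_met(observed):
    """True if at most 20% of expected counts are below 5 and all are >= 1."""
    cells = [e for row in expected_counts(observed) for e in row]
    share_below_five = sum(e < 5 for e in cells) / len(cells)
    return share_below_five <= 0.20 and min(cells) >= 1


# HYPOTHETICAL 2 x 3 table of observed counts (rows x columns).
observed = [[30, 40, 30], [20, 10, 20]]

print(expected_counts(observed))
print(chi_square_statistic(observed))
print(conditions_met(observed))
```

For this hypothetical table every column total is 50, so the expected counts are about 33.3 in the first row and 16.7 in the second, and the six cell contributions add up to a statistic of 6.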
Interpreting a Chi-Squared Test • I can compare the calculated chi-square test statistic to a critical value to see whether there is evidence that my variables are related • We will denote the test statistic as χ²* and the critical value as χ²α, (r-1)(c-1), where r is the number of rows, c is the number of columns, and the degrees of freedom are found by: df = (r-1)*(c-1) • Because χ²* is never negative and only large values count against Ho, the whole alpha level goes in the upper tail • I can then look up the critical value in the table (see next slide) using the alpha level and df • If χ²* > χ²α, (r-1)(c-1), then we reject the null hypothesis: the observed counts were sufficiently far from the expected counts, meaning the result is significant and there is evidence of a relationship between the two variables • If χ²* ≤ χ²α, (r-1)(c-1), then we fail to reject the null hypothesis: the data do not provide sufficient evidence of a relationship between the two variables
Chi-Square (χ²) Distribution Critical Values • The first row is the alpha level • The first column is the number of df
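The critical-value table itself is not reproduced in these notes. As a rough sketch of how the lookup and decision could work, the snippet below hard-codes the standard upper-tail critical values for alpha = 0.05 and df = 1 through 6; the dictionary and function names are hypothetical, and any other alpha level or df would need the full table.

```python
# Standard chi-square critical values for an upper-tail area of 0.05.
# Only df = 1..6 are included; this stands in for one column of the table.
CRITICAL_VALUES_05 = {
    1: 3.841,
    2: 5.991,
    3: 7.815,
    4: 9.488,
    5: 11.070,
    6: 12.592,
}


def degrees_of_freedom(n_rows, n_cols):
    """df = (r - 1) * (c - 1) for an r x c two-way table."""
    return (n_rows - 1) * (n_cols - 1)


def decide(chi_square, n_rows, n_cols):
    """Compare the test statistic to the alpha = 0.05 critical value."""
    critical = CRITICAL_VALUES_05[degrees_of_freedom(n_rows, n_cols)]
    if chi_square > critical:
        return "reject Ho"
    return "fail to reject Ho"


# A statistic of 6.0 from a 2 x 3 table has df = 2, critical value 5.991.
print(decide(6.0, 2, 3))
print(decide(5.0, 2, 3))
```

The function only ever returns “reject Ho” or “fail to reject Ho”, matching the wording the slides ask for.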
Example 2a: • Returning to example one, is there a relationship between sex and major? • Find expected counts • Compare expected counts to observed counts • Calculate χ² • Compare the chi-square test statistic (χ²*) to the chi-square critical value (χ²α, (r-1)(c-1))
Example 2b: Fill in expected counts • Recall the equation for expected counts: expected count = (row total × column total) / n
Example 2c: Calculate χ² • Recall the equation for chi-square: χ² = Σ (observed − expected)² / expected
Example 2d: Calculate χ² • Recall the equation for chi-square: χ² = Σ (observed − expected)² / expected • Now we just have to add all the cell contributions together and compare the chi-square value to the critical value…
Example 2e: Is χ² significant? • To compare the chi-square value to the critical value, I look up in the table the chi-squared critical value for alpha = 0.05 and df = 6. • Therefore, since the test statistic is less than or equal to the critical value, we (circle one): reject the Ho / fail to reject the Ho / accept the Ho / accept the Ha • And conclude… what?
Example 3a: • Is there a relationship between favorite soda and favorite ice cream? • Find expected counts • Compare expected counts to observed counts • Calculate χ² • Compare the chi-square test statistic (χ²*) to the chi-square critical value (χ²α, (r-1)(c-1))
Example 3b: Fill in expected counts • Recall the equation for expected counts: expected count = (row total × column total) / n
Example 3c: Calculate χ² • Recall the equation for chi-square: χ² = Σ (observed − expected)² / expected
Example 3d: Calculate χ² • Recall the equation for chi-square: χ² = Σ (observed − expected)² / expected • Now we just have to add all the cell contributions together and compare the chi-square value to the critical value…
Example 3e: Is χ² significant? • To compare the chi-square value to the critical value, I look up in the table the chi-squared critical value for alpha = 0.05 and df = ____. • Therefore, since the test statistic is less than or equal to the critical value, we (circle one): reject the Ho / fail to reject the Ho / accept the Ho / accept the Ha • And conclude… what?
To review: When calculating a chi-squared value: 1. Find expected counts 2. Compare expected counts to observed counts 3. Calculate a χ² test statistic 4. Compare the test statistic to the critical value using the table 5. Make a conclusion • If χ²* > χ²α, (r-1)(c-1): REJECT THE NULL, there is evidence that a relationship exists • If χ²* ≤ χ²α, (r-1)(c-1): FAIL TO REJECT THE NULL, there is not enough evidence of a relationship • NEVER SAY ACCEPT THE NULL!
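The five review steps can be strung together in one pass. The sketch below is one possible arrangement, not the course's own code: the `ChiSquareResult` type, the `chi_square_test` function, and the 2 × 2 observed table are all hypothetical, and the critical value is passed in by hand as if it had been read from the table.

```python
from typing import NamedTuple


class ChiSquareResult(NamedTuple):
    expected: list   # step 1: expected counts
    statistic: float  # step 3: chi-square test statistic
    df: int          # degrees of freedom, (r - 1) * (c - 1)
    decision: str    # step 5: conclusion about Ho


def chi_square_test(observed, critical_value):
    # Step 1: expected counts from row totals, column totals, and n.
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    expected = [[r * c / n for c in col_totals] for r in row_totals]

    # Steps 2 and 3: compare observed to expected cell by cell and sum.
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )

    # Step 4: compare to the critical value looked up for this df.
    df = (len(row_totals) - 1) * (len(col_totals) - 1)

    # Step 5: only ever "reject" or "fail to reject" the null.
    decision = "reject Ho" if statistic > critical_value else "fail to reject Ho"
    return ChiSquareResult(expected, statistic, df, decision)


# HYPOTHETICAL 2 x 2 table; 3.841 is the alpha = 0.05 critical value for df = 1.
result = chi_square_test([[20, 30], [30, 20]], critical_value=3.841)
print(result.df, result.statistic, result.decision)
```

In this hypothetical table every expected count is 25 and each cell contributes 1 to the statistic, so the total of 4 exceeds 3.841 and the null is rejected.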