Learn how to determine if there is a relationship between two categorical variables using the Chi-Square Test of Association/Independence. Example scenarios and exercises included.
AP Statistics Chi-Square Test of Association/Independence
Chi-Square Test of Association/Independence • When a two-way table arises from a single SRS and each individual is classified according to two categorical variables, we can test whether there is a relationship between the variables.
The nuts and bolts • The mechanics are exactly the same as for the chi-square test for homogeneity of proportions; only the hypotheses are different.
Hypotheses • H0: The variables are independent. • HA: The variables are dependent. Or, equivalently: • H0: There is no association between the variables. • HA: There is an association between the variables.
Example 1 A survey was taken to determine if there is a relationship between students having a computer in their home and their school division (elementary, middle, secondary). A random sample of size 250 produced the following results:
              Computer in Home
Division       Yes    No
Elementary      14    61
Middle          50    25
Secondary       86    14
Is there evidence of an association between school division and having a home computer? Use significance level 0.05.
Ex 1 cont. • Chi-square test for independence • Conditions: We are given a random sample with two categorical variables measured. • All expected counts are at least 5 (the smallest is 30). • Hypotheses: H0: Computer in home and school division are independent. HA: Computer in home and school division are dependent.
Ex 1 cont. • Mechanics: • Enter the observed counts in the calculator and have it generate the expected cell counts. • df = (rows – 1)(columns – 1) = (3 – 1)(2 – 1) = 2.
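The calculator step can also be checked by hand or in code. The following is a minimal Python sketch, not part of the original slides, that assumes the observed counts from the Example 1 table and computes each expected count as (row total × column total) / grand total. The variable names are illustrative only.

```python
# Observed counts from Example 1: rows are school divisions
# (elementary, middle, secondary); columns are computer in home (yes, no).
observed = [
    [14, 61],
    [50, 25],
    [86, 14],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count under independence:
# (row total * column total) / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Chi-square statistic: sum over all cells of
# (observed - expected)^2 / expected.
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# Degrees of freedom: (rows - 1)(columns - 1).
df = (len(observed) - 1) * (len(observed[0]) - 1)

print("expected counts:", expected)
print("chi-square:", round(chi_square, 2), "df:", df)
```

Run on the Example 1 data, this gives expected counts of 45 and 30 for the elementary and middle rows, 60 and 40 for the secondary row, and a statistic that rounds to 82.94 with 2 degrees of freedom.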
Ex 1 cont. • P-value = P(X2 > 82.94) ≈ 0 • Because the p-value is extremely small (essentially 0) and well below the significance level 0.05, we have sufficient evidence to reject H0 in favor of HA: the two variables “computer in home” and “school division” are NOT independent.
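To see how small "essentially 0" is, the p-value can be computed directly. This sketch is an addition to the slides and relies on one special-case fact: for a chi-square distribution with exactly 2 degrees of freedom, the upper-tail probability is P(X2 > x) = exp(-x/2). The shortcut does not apply to other degrees of freedom.

```python
import math

chi_square = 82.94  # test statistic from Example 1
alpha = 0.05        # significance level given in the problem

# Valid only for df = 2: the chi-square upper-tail probability
# reduces to exp(-x / 2).
p_value = math.exp(-chi_square / 2)

print("p-value:", p_value)
print("decision:", "reject H0" if p_value < alpha else "fail to reject H0")
```

The result is far below any conventional significance level, which is why a calculator displays it as 0.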
U-Try • A survey was conducted in the San Francisco Bay metropolitan area in which each participating individual was classified according to the type of vehicle used most often and city of residence. A subset of the resulting data is given in the accompanying table (The Relationship of Vehicle Type Choice to Personality, Lifestyle, Attitudinal and Demographic Variables, Technical Report UC_ITS_RR02-06, DaimlerChrysler Corp, 2002).
Data:
                          City
Vehicle Type   Concord   Pleasant Hills   N. SF
Small             68           83          221
Compact           63           68          106
Midsize           88          123          142
Large             24           18           11
U-Try • Do the data provide convincing evidence of an association between city of residence and vehicle type? Use a significance level of 0.05. You may assume that it is reasonable to regard the sample as a random sample of San Francisco Bay metropolitan area residents.
The results are in! • df = (4 – 1)(3 – 1) = 6 • X2 = 49.81 • P-value < 0.001 • Because the p-value < 0.001 < alpha = 0.05, we reject H0 in favor of HA. We have significant evidence of a relationship between city of residence and vehicle type among San Francisco Bay area residents.
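To check a U-Try answer without a calculator, the same steps can be sketched in Python. This is an illustrative addition, not the method used on the slides. The helper names `chi_square_test` and `upper_tail_even_df` are hypothetical, and the p-value uses a closed form that holds only when the degrees of freedom are even (here df = 6).

```python
import math

# Observed counts from the U-Try table: rows are vehicle types
# (small, compact, midsize, large); columns are cities
# (Concord, Pleasant Hills, N. SF).
observed = [
    [68, 83, 221],
    [63, 68, 106],
    [88, 123, 142],
    [24, 18, 11],
]


def chi_square_test(table):
    """Return (statistic, df, smallest expected count) for a two-way table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    smallest_expected = float("inf")
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / n
            smallest_expected = min(smallest_expected, exp)
            statistic += (obs - exp) ** 2 / exp
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df, smallest_expected


def upper_tail_even_df(x, df):
    """P(X2 > x) for a chi-square distribution with an EVEN df.

    Uses exp(-x/2) * sum_{k < df/2} (x/2)^k / k!, which is exact
    only for even degrees of freedom.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2
    terms = (half ** k / math.factorial(k) for k in range(df // 2))
    return math.exp(-half) * sum(terms)


statistic, df, smallest_expected = chi_square_test(observed)
p_value = upper_tail_even_df(statistic, df)

print("smallest expected count:", round(smallest_expected, 2))
print("chi-square:", round(statistic, 2), "df:", df)
print("p-value < 0.001:", p_value < 0.001)
```

The smallest expected count is above 5, so the expected-count condition is met, and the statistic agrees with the 49.81 reported above.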
Exercises • p. 766: 13.25, 13.26 • p. 771: 13.31, 13.39