8.4: Measures of Association: Difference of Proportions. The difference of proportions is the proportion scoring
1. Sociology 601 (Martin) Lecture 15: November 11-13, 2008
Measures of association for tables (8.4)
Difference of proportions
the odds ratio
Measures of association for ordinal data (8.5 – 8.6).
Kendall’s tau-b
Statistical inference for ordinal associations
10. 8.5. Stepping up to ordinal and interval data
The chi-squared test is an extremely simple test of relationships between categories.
In chi-squared tests, we ask “Does the distribution of one variable depend on the categories for the other variable?”
This sort of question requires only nominal-scaled data.
We are often interested in more informative tests of relationships between categories.
In such tests, we ask “As we increase the level of one variable, how do we change the level of another?”
As we shall see today, we can ask this question for both ordinal- and interval-scaled data.
11. A weakness of a chi-squared test.
The problem: chi-squared tests are designed for nominal associations. If we use a chi-squared test when the association is ordinal, we throw away the information carried by the ordering of the categories.
Assign + to cells where fo > fe (observed frequency exceeds expected frequency) and - where fo < fe.
Chi-squared tests cannot distinguish the following patterns:
12. Alternative to a chi-squared test for ordinal data
A solution: find concordant and discordant patterns.
1.) Identify every possible pair of observations. The number of possible pairs far exceeds the number of observations.
2.) A pair of observations is concordant if the subject who is higher on one variable is also higher on the other variable.
3.) A pair of observations is discordant if the subject who is higher on one variable is lower on the other variable.
4.) Many pairs of observations are neither concordant nor discordant. We ignore those pairs.
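The four steps above can be sketched in a few lines of Python. This is only an illustration: the five observations are hypothetical (they are not from the lecture), and the ordinal categories are assumed to be coded as integers 0 < 1 < 2.

```python
from itertools import combinations

def classify_pair(a, b):
    """Classify two (x, y) observations as concordant, discordant, or tied."""
    dx = a[0] - b[0]
    dy = a[1] - b[1]
    if dx * dy > 0:       # same subject is higher on both variables
        return "concordant"
    if dx * dy < 0:       # higher on one variable, lower on the other
        return "discordant"
    return "tied"         # equal on at least one variable: ignored

# Hypothetical data: (score on variable 1, score on variable 2).
observations = [(0, 0), (0, 1), (1, 1), (2, 2), (2, 0)]

counts = {"concordant": 0, "discordant": 0, "tied": 0}
for a, b in combinations(observations, 2):   # every possible pair
    counts[classify_pair(a, b)] += 1

n = len(observations)
print("observations:", n, "pairs:", n * (n - 1) // 2)
print(counts)
```

With 5 observations there are already 10 pairs, and 4 of them are tied on at least one variable and so count toward neither total.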
13. Finding concordant and discordant patterns.
For all but the smallest samples, the number of concordant and discordant patterns can be very difficult to count, so we usually leave that exercise to a computer program.
It is, however, important to understand what the computer is doing. For that reason, we will try an example.
Concordant pairs:
Discordant pairs:
14. Counting concordant pairs
(no like, low wages) x (maybe like, med wages) = 10 x 4 = 40
(no, low) x (maybe, high) = 10 x 7 = 70
(no, low) x (yes, med) = 10 x 5 = 50
(no, low) x (yes, high) = 10 x 2 = 20
(maybe, low) x (yes, med) = 1 x 5 = 5
(maybe, low) x (yes, high) = 1 x 2 = 2
(no, med) x (maybe, high) = 3 x 7 = 21
(no, med) x (yes, high) = 3 x 2 = 6
(maybe, med) x (yes, high) = 4 x 2 = 8
Total concordant pairs = 222
15. Counting discordant pairs
(no like, med wages) x (maybe like, low wages) = 3 x 1 = 3
(no, med) x (yes, low) = 3 x 1 = 3
(no, high) x (maybe, med) = 3 x 4 = 12
(no, high) x (yes, med) = 3 x 5 = 15
(no, high) x (maybe, low) = 3 x 1 = 3
(no, high) x (yes, low) = 3 x 1 = 3
(maybe, high) x (yes, low) = 7 x 1 = 7
(maybe, high) x (yes, med) = 7 x 5 = 35
(maybe, med) x (yes, low) = 4 x 1 = 4
Total discordant pairs = 85
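The hand counts above can be checked with a short script. This is a sketch under one assumption: the 3 x 3 table below is reconstructed from the cell counts that appear in the products on the two slides (rows are like: no, maybe, yes; columns are wages: low, med, high). The function name `count_pairs` is illustrative.

```python
# Rows: like (no, maybe, yes); columns: wages (low, med, high).
# Cell counts are reconstructed from the products listed on the slides.
table = [
    [10, 3, 3],
    [1, 4, 7],
    [1, 5, 2],
]

def count_pairs(table):
    """Return (concordant, discordant) pair counts for an ordered r x c table."""
    concordant = discordant = 0
    n_rows, n_cols = len(table), len(table[0])
    for i in range(n_rows):
        for j in range(n_cols):
            for k in range(i + 1, n_rows):       # only rows strictly below
                for m in range(n_cols):
                    if m > j:                    # below and to the right
                        concordant += table[i][j] * table[k][m]
                    elif m < j:                  # below and to the left
                        discordant += table[i][j] * table[k][m]
    return concordant, discordant

C, D = count_pairs(table)
print(C, D)  # 222 85
```

Each cell is multiplied by every cell that is higher on both variables (concordant) or higher on one and lower on the other (discordant); cells in the same row or column are ties and are skipped.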
16. Measuring ordinal associations with gamma
Gamma (γ): a measure for concordant and discordant patterns.
gamma = (C – D) / (C + D), where
C = number of concordant pairs.
D = number of discordant pairs.
For the previous example: γ = (222 – 85) / (222 + 85)
= 137 / 307
= +.45
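As a quick arithmetic check, the formula can be evaluated directly from the pair counts found earlier (C = 222, D = 85). The function name `gamma_from_counts` is just an illustrative choice.

```python
def gamma_from_counts(concordant, discordant):
    """Gamma = (C - D) / (C + D)."""
    return (concordant - discordant) / (concordant + discordant)

gamma = gamma_from_counts(222, 85)
print(f"{222 - 85} / {222 + 85} = {gamma:.3f}")  # 137 / 307 = 0.446
```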
17. Measuring ordinal associations with gamma
Interpreting gamma:
If gamma is between 0 and +1, the ordinal variables are positively associated.
If gamma is between 0 and –1, the ordinal variables are negatively associated.
The magnitude of gamma indicates the strength of the association.
If gamma = 0, the variables may still be statistically dependent, because chi-squared can still be large. In that case the dependence simply does not follow an ordinal (consistently increasing or decreasing) pattern.
18. The trouble with gamma
Because gamma varies from -1 to +1 and is a measure of association between two variables, naïve statisticians tend to interpret gamma as a correlation coefficient.
(more on correlation coefficients in the next chapter)
The problem is that gamma gives more extreme values than a correlation coefficient, especially if the number of categories is small.
Unscrupulous researchers can increase gamma by collapsing categories together!
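A small sketch of the collapsing effect, using the 3 x 3 like-by-wages table reconstructed from the earlier counting example. The particular grouping (merging "maybe" with "yes" and "med" with "high") is an arbitrary choice made for illustration, and the helper names are hypothetical. How much gamma moves will depend on the data and on which categories are merged.

```python
def count_pairs(table):
    """Return (concordant, discordant) pair counts for an ordered table."""
    c = d = 0
    for i, row in enumerate(table):
        for j, n_ij in enumerate(row):
            for lower in table[i + 1:]:
                c += n_ij * sum(lower[j + 1:])   # below and to the right
                d += n_ij * sum(lower[:j])       # below and to the left
    return c, d

def gamma(table):
    c, d = count_pairs(table)
    return (c - d) / (c + d)

def collapse(table, row_groups, col_groups):
    """Merge rows and columns; each group is a list of original indices."""
    return [
        [sum(table[i][j] for i in rows for j in cols) for cols in col_groups]
        for rows in row_groups
    ]

full = [[10, 3, 3], [1, 4, 7], [1, 5, 2]]
# Merge (maybe, yes) into one row and (med, high) into one column.
collapsed = collapse(full, [[0], [1, 2]], [[0], [1, 2]])

print(collapsed)                   # [[10, 6], [2, 18]]
print(f"{gamma(full):.3f}")        # 0.446
print(f"{gamma(collapsed):.3f}")   # 0.875
```

In this example the same 36 observations give gamma = .45 with three categories per variable and gamma = .88 with two.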
19. Kendall’s Tau-b
Kendall’s Tau-b is an alternative measure to Gamma.
Like Gamma, Kendall’s tau-b can take values from -1 to +1, and the farther from 0, the stronger the association.
STATA calculates a ‘sort-of’ standard error (Asymptotic Standard Error, or ASE) for tau-b, which you can use for statistical significance tests.
z = tau-b / (ASE of tau-b)
20. Using gamma and tau-b:
Use the same STATA command as for chi-squared tests (tabulate), adding the gamma and taub options. The ASEs it reports let you run significance tests for ordinal-level data.
If the gamma or tau-b test is statistically significant and the chi-squared is not, you have added power to the test by making the assumption of an ordinal relationship.
If the chi-squared test is statistically significant and the gamma and tau-b tests are not, you should see a clear departure from an ordinal relationship in the data.
(To test this relationship, calculate the conditional distributions of one variable for categories of the other.)
21. Gamma and tau-b: an example
Party identification and gender example:
We can calculate χ² = 7.010 (df = 2, p = .030)
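The chi-squared value can be reproduced from the gender-by-party counts used in this example (female: 279, 73, 225; male: 165, 47, 191). The sketch below is plain Python rather than STATA. It relies on the fact that, for df = 2 specifically, the upper-tail chi-squared probability is exactly exp(-x/2), so no statistics library is needed.

```python
import math

observed = [
    [279, 73, 225],   # female: democrat, independent, republican
    [165, 47, 191],   # male
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

chi2 = 0.0
for i, row in enumerate(observed):
    for j, f_o in enumerate(row):
        f_e = row_totals[i] * col_totals[j] / n   # expected frequency
        chi2 += (f_o - f_e) ** 2 / f_e

df = (len(observed) - 1) * (len(observed[0]) - 1)   # (r - 1)(c - 1)
# Valid for df = 2 only: P(chi-squared > x) = exp(-x / 2).
p_value = math.exp(-chi2 / 2)
print(f"chi2 = {chi2:.3f}, df = {df}, p = {p_value:.3f}")
```

For a 2 x 3 table, df = (2 - 1)(3 - 1) = 2, and p = .030 is the tail probability for 7.010 on 2 degrees of freedom.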
22. STATA example of gamma and tau-b
Next, use the TABULATE command with options:

. tabulate gender party [freq=number], gamma taub

           |              party
    gender |  democrat  independe  republica |     Total
-----------+---------------------------------+----------
    female |       279         73        225 |       577
      male |       165         47        191 |       403
-----------+---------------------------------+----------
     Total |       444        120        416 |       980

          gamma =   0.1470  ASE = 0.056
Kendall's tau-b =   0.0796  ASE = 0.031
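To see what the computer is doing, the two point estimates can be recomputed from the table by hand. The sketch below is plain Python, not STATA. The tau-b formula used here is the standard textbook definition, which the slides do not spell out: (C - D) divided by the square root of (pairs not tied on the row variable) x (pairs not tied on the column variable). The ASEs are not reproduced.

```python
import math

# gender (female, male) by party (democrat, independent, republican)
table = [[279, 73, 225], [165, 47, 191]]

def count_pairs(table):
    """Return (concordant, discordant) pair counts for an ordered table."""
    c = d = 0
    for i, row in enumerate(table):
        for j, n_ij in enumerate(row):
            for lower in table[i + 1:]:
                c += n_ij * sum(lower[j + 1:])   # below and to the right
                d += n_ij * sum(lower[:j])       # below and to the left
    return c, d

def tied_pairs(totals):
    """Number of pairs tied on a variable, from its marginal totals."""
    return sum(t * (t - 1) // 2 for t in totals)

c, d = count_pairs(table)
n = sum(map(sum, table))
all_pairs = n * (n - 1) // 2
tied_rows = tied_pairs([sum(row) for row in table])
tied_cols = tied_pairs([sum(col) for col in zip(*table)])

gamma_hat = (c - d) / (c + d)
tau_b = (c - d) / math.sqrt((all_pairs - tied_rows) * (all_pairs - tied_cols))

print(c, d)                             # 80345 59745
print(f"gamma = {gamma_hat:.4f}")       # gamma = 0.1470
print(f"tau-b = {tau_b:.4f}")           # tau-b = 0.0796
```

Both measures share the numerator C - D. Tau-b is closer to zero here because its denominator also counts pairs tied on only one of the two variables, which gamma ignores.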
23. Statistical inference with gamma and tau-b
A test for ordinal comparisons is similar to an independent-samples test for population proportions.
Assumptions:
random sample,
ordinal (or interval) categories,
the sampling distribution of differences between groups is normal because the sample size is large: n ≥ 5 for every cell.
Null hypothesis:
there is no ordinal association between the two variables (gamma = 0 in the population).
24. Statistical inference with gamma and tau-b
Test statistic: z = gamma / ASE of gamma.
gamma = 0.1470 ASE = 0.056
z = .1470/.056 = 2.625
(note: ASE stands for Asymptotic Standard Error)
P-value: look up in Table A
p = .0044 for a one-tailed test,
so p = .0088 for a two-tailed test.
Conclusion: p < .01, so reject the null hypothesis.
Instead, conclude that there is an ordered relationship between sex and political identification.
(If you checked, you would find that p for a gamma test is smaller than p for a Chi-squared test in this case.)
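The z statistic and its p-values can also be computed without Table A. This sketch uses only Python's standard library and takes gamma and its ASE from the STATA output as given. The normal tail probability is obtained from the complementary error function.

```python
import math

gamma_hat, ase = 0.1470, 0.056      # values reported by STATA

z = gamma_hat / ase
# Upper-tail area of the standard normal distribution beyond z.
p_one_tailed = 0.5 * math.erfc(z / math.sqrt(2))
p_two_tailed = 2 * p_one_tailed

print(f"z = {z:.3f}")
print(f"one-tailed p = {p_one_tailed:.4f}")
print(f"two-tailed p = {p_two_tailed:.4f}")
```

The results differ slightly in the last digit from the table lookup, because Table A requires rounding z to two decimal places (2.62) while the code uses z = 2.625. The conclusion (p < .01) is the same.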