
Volkert Siersma Research Unit for General Practice in Copenhagen gpract.ku.dk

Volkert Siersma, Research Unit for General Practice in Copenhagen, http://www.gpract.ku.dk, V.Siersma@gpract.ku.dk. Pγ measure for association between categorical variables with partial or tentative ordering of categories.


Presentation Transcript


  1. Volkert Siersma Research Unit for General Practice in Copenhagen http://www.gpract.ku.dk V.Siersma@gpract.ku.dk Pγ measure for association between categorical variables with partial or tentative ordering of categories

  2. 1 Categorical variables. Nominal variables: categories are unordered relative to each other. Ordinal variables: categories have an inherent ordering. Ordinal variables are categorical variables with additional information in the ordering of the categories. This can be used to devise stronger and more meaningful analyses between ordinal variables.

  3. 2 Categorical inference Nominal variables Ordinal variables Tests for conditional independence in multidimensional contingency tables

  4. 2 Categorical inference Nominal variables Ordinal variables Tests for conditional independence in multidimensional contingency tables • Inference based on χ²-measures. • LR test with saturated alternative • LR test with 2-factor alternative

  5. 2 Categorical inference Nominal variables Ordinal variables Tests for conditional independence in multidimensional contingency tables • Inference based on χ²-measures. • LR test with saturated alternative • LR test with 2-factor alternative Inference based on χ²-measures.

  6. 2 Categorical inference Nominal variables Ordinal variables Tests for conditional independence in multidimensional contingency tables • Inference based on χ²-measures. • LR test with saturated alternative • LR test with 2-factor alternative Inference based on χ²-measures. Inference based on rank correlation measures. • Goodman and Kruskal’s γ measure

  7. 3 An example Type 2 diabetes patients at diagnosis

  8. 3 An example …the corresponding empirical probability table

  9. 3 An example If the two variables are independent, then the joint probability function, i.e. the cell probabilities of the table, is just the product of the marginal probabilities of the categories of each variable: P(X=x and Y=y) = P(X=x) · P(Y=y)

  10. 3 An example …the marginal probability distributions

  11. 3 An example …observed versus expected

  12. 3 An example Compare the observed table and the table expected under independence. Pearson’s statistic, which sums the squared differences between the observed and the expected table entries, each divided by the expected entry, is approximately chi-squared distributed when the null hypothesis is true. Here: df = 8 and p = 0.94. Alternatively, we can perform an exact test.
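
The observed-versus-expected comparison can be sketched in a few lines of Python. This is a minimal illustration, not the author's code: the function names are hypothetical, and the small tables in the usage lines are made up, because the diabetes table itself is not reproduced in this transcript. The degrees of freedom are (rows − 1) × (columns − 1), so df = 8 would be consistent with, for example, a 3 × 5 table.

```python
def expected_under_independence(table):
    """Expected counts: row total * column total / grand total."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    return [[r * c / n for c in col_totals] for r in row_totals]


def pearson_chi2(table):
    """Return Pearson's statistic and its degrees of freedom."""
    expected = expected_under_independence(table)
    stat = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


# Made-up tables, for illustration only.
independent = [[10, 20], [20, 40]]   # cell counts proportional to the margins
dependent = [[10, 0], [0, 10]]       # all mass on the diagonal
print(pearson_chi2(independent))
print(pearson_chi2(dependent))
```

The sketch assumes every expected count is positive; a table with an empty row or column would need those categories removed first.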

  13. 3 An example Goodman and Kruskal’s γ. Two independent draws (X1,Y1) and (X2,Y2) from the joint (X,Y) distribution.

  14. 3 An example Two independent draws (X1,Y1) and (X2,Y2) from the joint (X,Y) distribution. Concordance: "If X goes up, Y goes up." [Figure: axes marked X1, X2 and Y1, Y2]

  15. 3 An example Concordance: "If X goes up, Y goes up; if X goes down, Y goes down" or "X and Y move in the same direction." The definition is symmetric. [Figure: axes marked X2, X1 and Y2, Y1]

  16. 3 An example Discordance: "If X goes up, Y goes down" or "X and Y move in opposite directions." [Figure: axes marked X2, X1 and Y1, Y2]

  17. 3 An example Goodman and Kruskal’s γ: the difference of the probabilities of concordance and discordance, scaled by the probability of not having ties. Here: γ = 0.02 and p = 0.60
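
Written out, the definition is γ = (P(concordant) − P(discordant)) / (P(concordant) + P(discordant)), and the sample version replaces the probabilities by counts of concordant and discordant pairs in the table. A minimal Python sketch follows; the function names are hypothetical and the 2 × 2 table is made up for illustration.

```python
def concordant_discordant(table):
    """Count concordant and discordant pairs in a two-way table.

    Rows are the ordered categories of X, columns those of Y.
    Pairs tied on X or on Y are ignored.
    """
    conc = disc = 0
    n_rows, n_cols = len(table), len(table[0])
    for i in range(n_rows):
        for j in range(n_cols):
            for k in range(i + 1, n_rows):      # second draw has larger X
                for l in range(n_cols):
                    pairs = table[i][j] * table[k][l]
                    if l > j:                   # Y larger too: concordant
                        conc += pairs
                    elif l < j:                 # Y smaller: discordant
                        disc += pairs
    return conc, disc


def gk_gamma(table):
    """Goodman and Kruskal's gamma; returns 0.0 if every pair is tied."""
    conc, disc = concordant_discordant(table)
    untied = conc + disc
    return (conc - disc) / untied if untied else 0.0


# Made-up table: 4 concordant pairs and 1 discordant pair.
print(gk_gamma([[2, 1], [1, 2]]))
```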

  18. 4 Partial order Nominal Ordinal • Only some of the categories are ordered. • Goals in a weight control programme: • No goal set • Keep current weight • Reduction < 2 kg • Reduction < 4 kg • Reduction < 6 kg • Reduction > 6 kg

  19. 4 Partial order Nominal Ordinal • Only some of the categories are ordered. • Goals in a weight control programme: • No goal set (extra-ordinal category) • Keep current weight • Reduction < 2 kg • Reduction < 4 kg • Reduction < 6 kg • Reduction > 6 kg

  20. 4 Partial order Nominal Ordinal • Only some of the categories are ordered. • Goals in a weight control programme: • No goal set • Keep current weight • Reduction < 2 kg • Reduction < 4 kg • Reduction < 6 kg • Reduction > 6 kg • The variable has to be treated as a nominal variable, and the information in the ordering is lost.

  21. 4 Partial order Nominal Ordinal • Only some of the categories are ordered. • Goals in a weight control programme: • No goal set • Keep current weight • Reduction < 2 kg • Reduction < 4 kg • Reduction < 6 kg • Reduction > 6 kg • There is no indication of the effect of the extra-ordinal category in relation to the others.

  22. 5 Tentative order Nominal Ordinal The ordering of the categories is of interest. Danish political parties: Ø, SF, A, B, Q, CD, Z, V, C, DF

  23. 5 Tentative order Nominal Ordinal The ordering of the categories is of interest. Danish political parties: Ø, SF, A, B, Q, CD, Z, V, C, DF. Ordering w.r.t. left-right affiliation. Methods for nominal variables do not give information on the nature of the relationship.

  24. 6 Ordinal information • Partially ordinal variables: • have to be treated as nominal variables in general • the information in the ordering of the categories, and with it statistical power, is lost. • Tentatively ordinal variables: • the form of the association has to be deduced by examining stratified tables or parameters of loglinear models • which in multivariate analysis can be most confusing.

  25. 7 An ordering of a categorical variable An ordering X(r) of X is an ordinal random variable with a specific permutation r of the categories of X. If X has a (partial) order, we regard only valid orderings of X, i.e. orderings based on permutations that do not violate this partial order. Nominal variable: all orderings are valid. Ordinal variable: only one ordering is valid.
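
The notion of a valid ordering can be made concrete with a brute-force enumeration. The sketch below is an illustration under stated assumptions, not the author's implementation: the partial order is passed as a list of (earlier, later) pairs, the function name is hypothetical, and the category labels are shorthand invented here for the weight-goal example.

```python
from itertools import permutations


def valid_orderings(categories, must_precede):
    """All permutations of `categories` that respect the partial order.

    `must_precede` is a list of (a, b) pairs meaning a comes before b.
    """
    result = []
    for perm in permutations(categories):
        position = {cat: i for i, cat in enumerate(perm)}
        if all(position[a] < position[b] for a, b in must_precede):
            result.append(perm)
    return result


# Hypothetical labels for the weight-goal categories.
goals = ["none", "keep", "<2", "<4", "<6", ">6"]
# The five goal categories form a chain; "none" is left unconstrained.
chain = [("keep", "<2"), ("<2", "<4"), ("<4", "<6"), ("<6", ">6")]

for ordering in valid_orderings(goals, chain):
    print(ordering)
```

With one unconstrained category and a chain of five, the only freedom is where to slot the unconstrained category in. A nominal variable corresponds to an empty `must_precede` list, and a fully ordinal one to a complete chain.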

  26. 8 A Pγ measure of association The Pγ measure of association between a partially ordered or nominal X and an ordinal Y: the maximum γ between a valid ordering of X and Y. The optimal monotone ordering of X w.r.t. Y: the valid ordering of X for which this maximum is obtained.
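
A direct way to compute this maximum is to evaluate γ for every valid ordering and keep the best one. The following Python sketch does exactly that by exhaustive search; whether the author's software uses exhaustive search is not stated in the slides, and the function names and the small table are assumptions made for illustration.

```python
from itertools import permutations


def gk_gamma(table):
    """Goodman and Kruskal's gamma for a table with ordered rows and columns."""
    conc = disc = 0
    n_rows, n_cols = len(table), len(table[0])
    for i in range(n_rows):
        for j in range(n_cols):
            for k in range(i + 1, n_rows):
                for l in range(n_cols):
                    pairs = table[i][j] * table[k][l]
                    if l > j:
                        conc += pairs
                    elif l < j:
                        disc += pairs
    return (conc - disc) / (conc + disc) if conc + disc else 0.0


def p_gamma(rows, must_precede=()):
    """Maximum gamma over valid orderings of X, and the ordering attaining it.

    `rows` maps each X category to its list of counts over the ordered
    Y categories; `must_precede` holds (a, b) pairs of the partial order.
    """
    best = None
    for perm in permutations(rows):
        position = {cat: i for i, cat in enumerate(perm)}
        if any(position[a] > position[b] for a, b in must_precede):
            continue                      # violates the partial order
        g = gk_gamma([rows[cat] for cat in perm])
        if best is None or g > best[0]:
            best = (g, perm)
    return best


# Made-up counts: category "c" belongs between "a" and "b".
rows = {"a": [10, 0, 0], "b": [0, 0, 10], "c": [0, 10, 0]}
print(p_gamma(rows))                                # X treated as nominal
print(p_gamma(rows, [("a", "b"), ("b", "c")]))      # X treated as ordinal
```

The number of orderings grows factorially with the number of unconstrained categories, so this brute-force form is only practical for a handful of them.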

  27. 9 A partial γ measure of association In multidimensional contingency tables one is often interested in the relationship of two variables, X and Y, conditional on (controlled for, stratified by) a third variable Z. Within each stratum of Z, a γ measure is calculated between X and Y. A partial γ measure of monotone association between X and Y is defined as a weighted summary γ measure across subtables spanned by the categories of Z.
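
The slide does not state which weights are used. A common choice, assumed in the sketch below, pools the concordant and discordant pair counts over the strata, which is the same as weighting each stratum's γ by its number of untied pairs. Function names and the two strata are made up for illustration.

```python
def concordant_discordant(table):
    """Count concordant and discordant pairs in one stratum's X-by-Y table."""
    conc = disc = 0
    n_rows, n_cols = len(table), len(table[0])
    for i in range(n_rows):
        for j in range(n_cols):
            for k in range(i + 1, n_rows):
                for l in range(n_cols):
                    pairs = table[i][j] * table[k][l]
                    if l > j:
                        conc += pairs
                    elif l < j:
                        disc += pairs
    return conc, disc


def partial_gamma(strata):
    """Partial gamma across the X-by-Y tables in `strata` (one per level of Z).

    Assumption: pairs are only compared within a stratum, and the strata
    are combined by pooling the pair counts.
    """
    total_conc = total_disc = 0
    for table in strata:
        conc, disc = concordant_discordant(table)
        total_conc += conc
        total_disc += disc
    untied = total_conc + total_disc
    return (total_conc - total_disc) / untied if untied else 0.0


# Two made-up strata of Z.
strata = [
    [[2, 1], [1, 2]],      # 4 concordant, 1 discordant pair
    [[10, 0], [0, 10]],    # 100 concordant, 0 discordant pairs
]
print(partial_gamma(strata))
```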

  28. 10 A partial Pγ measure of association The partial Pγ_XY|Z between a partially ordered or nominal X and an ordinal Y conditional on a nominal Z is defined as the maximum partial γ between a valid ordering of X and Y. The partial optimal monotone ordering of X w.r.t. Y, controlled for Z, is the ordering corresponding to the partial Pγ_XY|Z.

  29. 11 Inference Significance of the Pγ measure and its corresponding partial measure is assessed by comparing the obtained value with a simulated distribution under the null hypothesis that X and Y are independent. Resampling tests are standard in the analysis of multi-way contingency tables, as tests based on the asymptotic distribution have very low power.
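
One way to simulate the null distribution is a permutation test: shuffle the Y values across observations, recompute Pγ, and count how often the shuffled value reaches the observed one. The slides do not say which resampling scheme was used, so the shuffling scheme, the function names, the number of simulations and the data below are all assumptions for illustration. X is treated as nominal here; for the partial measure one would shuffle within each stratum of Z.

```python
import random
from itertools import permutations


def gk_gamma(table):
    """Goodman and Kruskal's gamma for a table with ordered rows and columns."""
    conc = disc = 0
    n_rows, n_cols = len(table), len(table[0])
    for i in range(n_rows):
        for j in range(n_cols):
            for k in range(i + 1, n_rows):
                for l in range(n_cols):
                    pairs = table[i][j] * table[k][l]
                    if l > j:
                        conc += pairs
                    elif l < j:
                        disc += pairs
    return (conc - disc) / (conc + disc) if conc + disc else 0.0


def p_gamma(x, y, x_levels, y_levels):
    """Maximum gamma over all orderings of the X categories (X nominal)."""
    column = {level: j for j, level in enumerate(y_levels)}
    counts = {level: [0] * len(y_levels) for level in x_levels}
    for xi, yi in zip(x, y):
        counts[xi][column[yi]] += 1
    return max(
        gk_gamma([counts[level] for level in perm])
        for perm in permutations(x_levels)
    )


def mc_p_value(x, y, x_levels, y_levels, n_sim=199, seed=0):
    """Observed P-gamma and its Monte Carlo permutation p-value."""
    rng = random.Random(seed)
    observed = p_gamma(x, y, x_levels, y_levels)
    shuffled = list(y)
    hits = 0
    for _ in range(n_sim):
        rng.shuffle(shuffled)             # breaks any X-Y association
        if p_gamma(x, shuffled, x_levels, y_levels) >= observed:
            hits += 1
    # Counting the observed table itself keeps the p-value above zero.
    return observed, (hits + 1) / (n_sim + 1)


# Made-up data with a perfect monotone association once "c" is placed
# between "a" and "b".
x = ["a"] * 20 + ["c"] * 20 + ["b"] * 20
y = [0] * 20 + [1] * 20 + [2] * 20
print(mc_p_value(x, y, ["a", "b", "c"], [0, 1, 2]))
```

Because Pγ is a maximum over orderings, its null distribution is shifted away from zero, which is why the observed value is compared with simulated Pγ values rather than with the null distribution of an ordinary γ.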

  30. 12 Simulation study • Relationship between X and Y conditional on Z • X and Y ordinal, Z nominal. • Dim(X) = 3 or 5 • Dim(Y) = 3 or 5 • Dim(Z) = 2 or 10 • Uniform marginal distributions • N = 200 • partial γ = 0 or 0.15 • Categories of Y are permuted to calculate Pγ

  31. 13 Simulation study – results γ=0 The attained level of significance, i.e. the power of the tests when the true γ is 0, has to be 5%. Our results show that this is not a problem. All MC estimates of the critical value are in the 95% confidence region: 0.05 ± 0.0135
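
The half-width 0.0135 is what a normal-approximation binomial interval gives for a nominal level of 0.05 estimated from 1000 simulated data sets. The number of replications is not stated in the slides, so the value 1000 in this short check is an inference, and the function name is hypothetical.

```python
import math


def mc_half_width(level=0.05, n_replications=1000, z=1.96):
    """Half-width of the 95% region for a Monte Carlo rejection rate."""
    return z * math.sqrt(level * (1 - level) / n_replications)


half_width = mc_half_width()
print(round(half_width, 4))   # 0.0135
print(round(0.05 - half_width, 4), round(0.05 + half_width, 4))
```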

  32. 14 Simulation study – results γ=0.15

  33. 14 Simulation study – results γ=0.15 Considerably higher power than the other tests. This was to be expected because the data was generated with a monotone relationship.

  34. 14 Simulation study – results γ=0.15 The test based on Pγ is not as powerful as the one based on γ. Its power is higher than that of both LR tests considered here.

  35. 14 Simulation study – results γ=0.15 The influence of the simulation parameters is intuitive. This becomes clear in more extensive simulations.

  36. 14 Simulation study – results γ=0.15 Insight is gained into the ordering of the categories. The identification of the correct ordering depends on the number of categories that are permuted. The estimated ordering will be close to the correct ordering, but is unlikely to coincide with it.

  37. 15 The relation between γ and Pγ Dim(X) = 5, Dim(Y) = 5, Dim(Z) = 10, γ = 0.15

  38. 15 The relation between γ and Pγ

  39. 15 The relation between γ and Pγ |γ| is closer to Pγ when the estimated values for these coefficients are higher

  40. 16 The distribution of Pγ γ = 0 Normal?!?

  41. 16 The distribution of Pγ γ = 0.15 Normal…

  42. 17 Danish political parties • European Values Studies • Denmark: • survey in 1981, 1990 and 1999 • preferred political party (10 parties) • political attitudes measured on a left-right discrete (10 point) VAS scale • 10 x 10 x 3 table • Significance of the association is obvious • Ordering of the parties is common knowledge (up to a certain level…)

  43. 18 Danish political parties - Pγ Pγ = 0.629: a very strong association. Ordering of the parties against the left-right categories, from far left to far right: The Red-Green Alliance (Ø), The Socialist People’s Party (SF), The Social Democratic Party (A), The Social Liberal Party (B), The Christian People’s Party (Q), The Centre Democrats (CD), The Progress Party (Z), The Liberal Party (V), The Conservative Party (C), The Danish People’s Party (DF).

  44. 18 Danish political parties - Pγ Common knowledge: Left (in this order), Center, Right. Ordering of the parties against the left-right categories, from far left to far right: Ø, SF, A, B, Q, CD, Z, V, C, DF.

  45. 18 Danish political parties - Pγ The position of DF on the far right is somewhat surprising. Ordering of the parties against the left-right categories, from far left to far right: Ø, SF, A, B, Q, CD, Z, V, C, DF. • New party in 1999 • Reflects the political attitudes of the persons preferring DF to other parties in 1999 • Since then, the party has with some success attempted to move towards the middle of the spectrum

  46. 19 A weight control program Weight goals against attained weight. A considerable number have no goal set.

  47. 19 A weight control program For convenience we code the categories of the weight goal variable with letters.

  48. 19 A weight control program We investigate the placement of the no goal set category with Pγ. The relationship is significant, but confounded.

  49. 19 A weight control program We investigate the no goal set category with the partial Pγ.

  50. 19 A weight control program We investigate the no goal set category with the partial Pγ conditional on many possible confounders.
