
Comparing Margins of Multivariate Binary Data

Bernhard Klingenberg, Assoc. Prof. of Statistics, Williams College, MA. www.williams.edu/~bklingen


Presentation Transcript


  1. Comparing Margins of Multivariate Binary Data. Bernhard Klingenberg, Assoc. Prof. of Statistics, Williams College, MA. www.williams.edu/~bklingen

  2. Outline
     • Setup: two independent groups; the response is a vector of k correlated binary variables (multivariate binary)
     • Goal: inference about the k margins, via marginal risk differences or marginal risk ratios
     • Challenges: associations of various degrees among the binary variables; simultaneous inference; sparse and/or unbalanced data; test statistics with discrete support; asymptotic theory questionable

  3. Outline • Motivating Examples • From drug safety or animal toxicity/carcinogenicity studies Source: http://us.gsk.com/products/assets/us_advair.pdf

  4. Source: http://www.pfizer.com/files/products/uspi_lipitor.pdf

  5. Outline • Example: AEs from a vaccine trial (flu shot):

     > head(Y1)  # ACTIVE Treatment n1=1971
     ID HEADACHE PAIN MYALGIA ARTHRALGIA MALAISE FATIGUE CHILLS
      2        1    1       1          1       1       1      1
      4        0    1       1          0       0       1      0
      5        1    0       0          0       0       0      0
      6        1    1       1          1       1       1      1
      7        0    0       0          0       0       1      0
      9        1    0       1          1       1       1      1

     > head(Y2)  # PLACEBO Treatment n2=1554
     ID HEADACHE PAIN MYALGIA ARTHRALGIA MALAISE FATIGUE CHILLS
      1        0    0       0          0       0       0      0
      3        0    0       0          0       0       0      0
      8        0    0       0          0       1       0      0
     10        0    0       0          0       0       0      0
     11        0    0       0          0       0       0      0
     15        0    0       1          0       0       1      0

  6. Notation and Setup • k-dimensional response vectors (Group 1, Group 2) • Random sample in each group (Group 1, Group 2) • Joint distribution in each group depends on 2^k − 1 parameters (Group 1, Group 2)

  7. Comparing Margins • Usually only interested in the k margins (Group 1, Group 2) • With just two (k=2) adverse events, Headache and Pain: one 2×2 table (Headache by Pain) for Group 1 and one for Group 2

  8. Comparing Margins • Differences in marginal incidence rates between Group 1 (Treatment) and Group 2 (Control)

                          Group1  Group2    Diff
     HEADACHE             0.2603  0.2407  0.0196
     INJECTION SITE PAIN  0.6088  0.1384  0.4705
     MYALGIA              0.2588  0.1088  0.1500
     ARTHRALGIA           0.0893  0.0579  0.0314
     MALAISE              0.2085  0.1332  0.0753
     FATIGUE              0.2476  0.2098  0.0378
     CHILLS               0.0928  0.0463  0.0465
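The marginal rates and their differences are just column means of each group's binary response matrix. A minimal Python sketch (the talk itself uses R), run only on the six subjects per group printed by `head(Y1)` and `head(Y2)`, so its numbers will not match the full-sample table above. The names `marginal_rates` and `risk_differences` are illustrative, not from the talk.

```python
AE_NAMES = ["HEADACHE", "PAIN", "MYALGIA", "ARTHRALGIA", "MALAISE", "FATIGUE", "CHILLS"]

# First six subjects per group as printed by head(Y1) / head(Y2); ID column dropped.
Y1_HEAD = [
    [1, 1, 1, 1, 1, 1, 1],
    [0, 1, 1, 0, 0, 1, 0],
    [1, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0, 1, 0],
    [1, 0, 1, 1, 1, 1, 1],
]
Y2_HEAD = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 1, 0],
]


def marginal_rates(rows):
    """Column means of a binary matrix: one incidence rate per adverse event."""
    n = len(rows)
    return [sum(column) / n for column in zip(*rows)]


def risk_differences(y1, y2):
    """Marginal risk differences, Group 1 minus Group 2, one per adverse event."""
    return [p1 - p2 for p1, p2 in zip(marginal_rates(y1), marginal_rates(y2))]


diffs = dict(zip(AE_NAMES, risk_differences(Y1_HEAD, Y2_HEAD)))
for name, diff in diffs.items():
    print(f"{name:<12s}{diff:8.4f}")
```

Applied to the full 1971 × 7 and 1554 × 7 matrices, the same two functions would give the Group1, Group2 and Diff columns.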

  9. Family of Tests • j-th null hypothesis: • Unrestricted and restricted MLEs:
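The formulas on this slide did not survive in the transcript. As a hedged illustration only, the Python sketch below assumes the j-th null hypothesis is a zero risk difference. In that case the restricted MLE of both margins is the pooled proportion, and the score statistic standardizes the observed difference with the pooled variance. It uses the rounded rates from the marginal incidence table and the group sizes n1=1971, n2=1554. The function name is hypothetical.

```python
import math

N1, N2 = 1971, 1554  # group sizes from the vaccine-trial example

# Rounded marginal incidence rates (Group 1, Group 2) from the table of differences.
RATES = {
    "HEADACHE": (0.2603, 0.2407),
    "PAIN": (0.6088, 0.1384),
    "MYALGIA": (0.2588, 0.1088),
    "ARTHRALGIA": (0.0893, 0.0579),
    "MALAISE": (0.2085, 0.1332),
    "FATIGUE": (0.2476, 0.2098),
    "CHILLS": (0.0928, 0.0463),
}


def score_z_zero_difference(p1, p2, n1=N1, n2=N2):
    """Score statistic for H0: p1 - p2 = 0 (an assumed null, see lead-in)."""
    # Restricted MLE under a zero difference: the pooled proportion.
    pooled = (n1 * p1 + n2 * p2) / (n1 + n2)
    se0 = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    return (p1 - p2) / se0


z = {ae: score_z_zero_difference(p1, p2) for ae, (p1, p2) in RATES.items()}
for ae, value in z.items():
    print(f"{ae:<12s}{value:8.3f}")
```

For a nonzero hypothesized difference the restricted MLE no longer has this simple pooled form and has to be obtained by maximizing the constrained likelihood.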

  10. Comparing Margins • Estimates of marginal incidence rates and test statistics comparing Group 1 (Treatment) and Group 2 (Control)

  11. Asymptotic Test • Note: • Asymptotically, multivariate normal with covariance matrix determined by

  12. Asymptotic Test • Correlation Matrix:

      > round(cov2cor(Sigma),2)
           d1   d2   d3   d4   d5   d6   d7
      d1 1.00 0.04 0.29 0.26 0.38 0.41 0.27
      d2      1.00 0.18 0.09 0.08 0.10 0.01
      d3           1.00 0.46 0.35 0.36 0.30
      d4                1.00 0.33 0.33 0.32
      d5                     1.00 0.51 0.44
      d6                          1.00 0.37
      d7                               1.00
      > qmvnorm(0.95, tail="both.tails", corr=cov2cor(Sigma))
      $quantile
      [1] 2.656222

  13. Asymptotic Test • Correlation Matrix:

      > round(cov2cor(Sigma),2)
           d1   d2   d3   d4   d5   d6   d7
      d1 1.00 0.06 0.33 0.28 0.41 0.41 0.29
      d2      1.00 0.28 0.11 0.15 0.12 0.09
      d3           1.00 0.46 0.41 0.36 0.35
      d4                1.00 0.32 0.34 0.28
      d5                     1.00 0.50 0.47
      d6                          1.00 0.37
      d7                               1.00
      > qmvnorm(0.95, tail="both.tails", corr=cov2cor(Sigma))
      $quantile
      [1] 2.653783
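The `qmvnorm(..., tail="both.tails")` call returns the equicoordinate critical value c with P(max_j |Z_j| ≤ c) = 0.95 for a multivariate normal vector with the given correlation matrix. The Python sketch below approximates that quantile by plain Monte Carlo with a hand-written Cholesky factor. It is a swapped-in technique for illustration, not what `qmvnorm` does internally, and it uses the two-decimal correlations printed on this slide, so it should only land near the reported quantile. The seed and number of draws are arbitrary choices.

```python
import math
import random

# Upper triangle of the rounded correlation matrix shown on this slide.
UPPER = [
    [1.00, 0.06, 0.33, 0.28, 0.41, 0.41, 0.29],
    [1.00, 0.28, 0.11, 0.15, 0.12, 0.09],
    [1.00, 0.46, 0.41, 0.36, 0.35],
    [1.00, 0.32, 0.34, 0.28],
    [1.00, 0.50, 0.47],
    [1.00, 0.37],
    [1.00],
]


def symmetric_from_upper(upper):
    """Expand an upper-triangular listing into a full symmetric matrix."""
    k = len(upper)
    full = [[0.0] * k for _ in range(k)]
    for i, row in enumerate(upper):
        for offset, value in enumerate(row):
            full[i][i + offset] = full[i + offset][i] = value
    return full


def cholesky(a):
    """Lower-triangular L with L L^T = a (a must be positive definite)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][m] * low[j][m] for m in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def max_abs_quantile(corr, prob=0.95, draws=50_000, seed=1):
    """Monte Carlo estimate of the prob-quantile of max_j |Z_j|, Z ~ N(0, corr)."""
    rng = random.Random(seed)
    low = cholesky(corr)
    k = len(corr)
    maxima = []
    for _ in range(draws):
        e = [rng.gauss(0.0, 1.0) for _ in range(k)]
        z = [sum(low[i][j] * e[j] for j in range(i + 1)) for i in range(k)]
        maxima.append(max(abs(v) for v in z))
    maxima.sort()
    return maxima[int(prob * draws)]


crit = max_abs_quantile(symmetric_from_upper(UPPER))
print(round(crit, 3))
```

With more draws the estimate tightens; a dedicated routine such as `qmvnorm` is the better tool when it is available.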

  14. Permutation Approach • For testing, can use a permutation approach • This assumes the two distributions are exchangeable (i.e. identical), a much stronger assumption than the null hypothesis itself • Need two extra conditions: • A sequence of all 0's is as or more likely to occur under Group 2 (Control) • A sequence of all 1's is as or more likely to occur under Group 1 (Treatment)
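A minimal Python sketch of how a permutation critical value for the maximum statistic can be obtained: pool the subjects, reshuffle the group labels, and recompute max_j |z_j| each time. Whole response vectors are permuted, so the correlation among the binary variables is preserved. The toy data, the pooled-variance z statistic and all function names are assumptions made for illustration; this is not the talk's data or code.

```python
import math
import random


def max_abs_z(y1, y2):
    """Largest absolute pooled-variance z statistic across the k margins."""
    n1, n2 = len(y1), len(y2)
    best = 0.0
    for j in range(len(y1[0])):
        s1 = sum(row[j] for row in y1)
        s2 = sum(row[j] for row in y2)
        pooled = (s1 + s2) / (n1 + n2)
        var = pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2)
        if var > 0.0:  # margin with no events (or only events) carries no information
            best = max(best, abs(s1 / n1 - s2 / n2) / math.sqrt(var))
    return best


def permutation_critical_value(y1, y2, alpha=0.05, n_perm=2000, seed=1):
    """(1 - alpha) quantile of max|z| over random relabelings of the subjects."""
    rng = random.Random(seed)
    pooled_rows = y1 + y2  # new list, so y1 and y2 are left untouched
    n1 = len(y1)
    stats = []
    for _ in range(n_perm):
        rng.shuffle(pooled_rows)  # permutes subjects, keeping each row intact
        stats.append(max_abs_z(pooled_rows[:n1], pooled_rows[n1:]))
    stats.sort()
    index = min(n_perm - 1, math.ceil((1.0 - alpha) * n_perm) - 1)
    return stats[index]


def simulate_group(rng, n, rates):
    """Hypothetical toy data: a shared subject effect makes the margins correlated."""
    rows = []
    for _ in range(n):
        bump = 0.3 if rng.random() < 0.3 else 0.0
        rows.append([1 if rng.random() < min(1.0, r + bump) else 0 for r in rates])
    return rows


data_rng = random.Random(0)
y1 = simulate_group(data_rng, 80, [0.3, 0.2, 0.1])
y2 = simulate_group(data_rng, 60, [0.2, 0.2, 0.1])
c_perm = permutation_critical_value(y1, y2)
print(round(c_perm, 3))
```

Any margin whose observed |z_j| exceeds `c_perm` would be declared significant at the familywise level alpha under the exchangeability assumption above.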

  15. Permutation vs. Asymptotic • Permutation vs. asymptotic distribution of the test statistic (figure; legend: Permut. Distr., Asympt. Distr.) • Critical values (α = 0.05): c_perm = 2.655, c_asympt = 2.654, c_Bonf = 2.690
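The Bonferroni value on this slide can be checked directly: with k = 7 two-sided tests at familywise level 0.05, each test uses the standard normal quantile at 1 − 0.05/(2·7). A short Python sketch using only the standard library:

```python
from statistics import NormalDist


def bonferroni_critical_value(k, alpha=0.05):
    """Two-sided Bonferroni critical value for k tests on the standard normal scale."""
    return NormalDist().inv_cdf(1.0 - alpha / (2.0 * k))


c_bonf = bonferroni_critical_value(7)
print(round(c_bonf, 3))
```

Because Bonferroni ignores the positive correlation among the seven statistics, its critical value is somewhat larger than the permutation and asymptotic ones.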

  16. Family of Tests • Results: Raw and Adjusted P-values

  17. Simultaneous Confidence Intervals • Invert family of tests: • Confidence Region: • Simplifies to simultaneous confidence intervals if

  18. Simultaneous Confidence Intervals • Results: Inverting Score test

                   diff       LB      UB
      HEADACHE   0.0196  -0.0196  0.0583
      PAIN       0.4705   0.4323  0.5069
      MYALGIA    0.1500   0.1162  0.1835
      ARTHRALGIA 0.0314   0.0078  0.0547
      MALAISE    0.0753   0.0416  0.1086
      FATIGUE    0.0378  -0.0002  0.0752
      CHILLS     0.0465   0.0239  0.0692
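Inverting the score test means collecting every hypothesized difference δ whose score statistic stays within the multiplicity-adjusted critical value. The Python sketch below shows one way this could be done for a single margin. It finds the restricted MLE under p1 − p2 = δ by bisection on the derivative of the log-likelihood, then finds the two interval endpoints by bisection on δ. It is a numerical illustration under assumptions, not the talk's implementation: it works from the rounded HEADACHE rates rather than raw counts, takes the critical value 2.654 quoted earlier in the talk, and uses hypothetical function names.

```python
import math


def restricted_mle(p1_hat, n1, p2_hat, n2, delta):
    """Maximize the two-binomial likelihood subject to p1 - p2 = delta."""
    x1, x2 = p1_hat * n1, p2_hat * n2  # (possibly non-integer) event counts
    lo = max(0.0, -delta) + 1e-12
    hi = min(1.0, 1.0 - delta) - 1e-12

    def dloglik(p2):
        p1 = p2 + delta
        return x1 / p1 - (n1 - x1) / (1.0 - p1) + x2 / p2 - (n2 - x2) / (1.0 - p2)

    # The log-likelihood is concave in p2, so its derivative has a single root.
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if dloglik(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    p2 = (lo + hi) / 2.0
    return p2 + delta, p2


def score_z(p1_hat, n1, p2_hat, n2, delta):
    """Score statistic for H0: p1 - p2 = delta, variance at the restricted MLE."""
    p1_t, p2_t = restricted_mle(p1_hat, n1, p2_hat, n2, delta)
    se = math.sqrt(p1_t * (1.0 - p1_t) / n1 + p2_t * (1.0 - p2_t) / n2)
    return (p1_hat - p2_hat - delta) / se


def score_interval(p1_hat, n1, p2_hat, n2, crit):
    """All delta with |score_z(delta)| <= crit, found by bisection on delta."""

    def solve(lo, hi, target):
        # score_z decreases in delta; locate the delta where it equals target.
        for _ in range(100):
            mid = (lo + hi) / 2.0
            if score_z(p1_hat, n1, p2_hat, n2, mid) > target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2.0

    d_hat = p1_hat - p2_hat
    return solve(-0.999, d_hat, crit), solve(d_hat, 0.999, -crit)


# HEADACHE rates and group sizes from the talk; 2.654 is the asymptotic critical value.
lower, upper = score_interval(0.2603, 1971, 0.2407, 1554, crit=2.654)
print(round(lower, 4), round(upper, 4))
```

Running the same function over all seven margins with the common critical value would give a set of simultaneous intervals; small discrepancies from the table are expected because the inputs here are rounded.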

  19. Simultaneous Confidence Intervals • We used (and recommend) score statistic • Could use Wald statistic instead • This is equivalent to fitting marginal model via GEE: • asympt. multiv. normal, with (sandwich) covariance matrix (same as before) • Use distribution of for multiplicity adjustment

  20. Simultaneous Confidence Intervals • Results: GEE approach (= inverting Wald test)

                   diff       LB      UB
      HEADACHE   0.0196  -0.0194  0.0586
      PAIN       0.4705   0.4331  0.5078
      MYALGIA    0.1500   0.1164  0.1836
      ARTHRALGIA 0.0314   0.0082  0.0546
      MALAISE    0.0753   0.0419  0.1087
      FATIGUE    0.0378   0.0001  0.0755
      CHILLS     0.0465   0.0241  0.0689
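A Wald-type simultaneous interval has the closed form diff ± c · SE, with the standard error computed from the unrestricted estimates. The Python sketch below recomputes such intervals from the rounded marginal rates in the talk. Using c = 2.656 (the first `qmvnorm` quantile shown earlier) is an assumption about which critical value produced the table; with it the results come out close to the tabulated bounds, up to rounding.

```python
import math

N1, N2 = 1971, 1554  # group sizes from the vaccine-trial example
CRIT = 2.656         # assumed multiplicity-adjusted critical value (see lead-in)

# Rounded marginal incidence rates (Group 1, Group 2) from the talk.
RATES = {
    "HEADACHE": (0.2603, 0.2407),
    "PAIN": (0.6088, 0.1384),
    "MYALGIA": (0.2588, 0.1088),
    "ARTHRALGIA": (0.0893, 0.0579),
    "MALAISE": (0.2085, 0.1332),
    "FATIGUE": (0.2476, 0.2098),
    "CHILLS": (0.0928, 0.0463),
}


def wald_interval(p1, p2, n1=N1, n2=N2, crit=CRIT):
    """Wald interval for p1 - p2 using the unrestricted variance estimate."""
    diff = p1 - p2
    se = math.sqrt(p1 * (1.0 - p1) / n1 + p2 * (1.0 - p2) / n2)
    return diff - crit * se, diff + crit * se


intervals = {ae: wald_interval(p1, p2) for ae, (p1, p2) in RATES.items()}
for ae, (lb, ub) in intervals.items():
    print(f"{ae:<12s}{lb:9.4f}{ub:9.4f}")
```

Intervals that exclude zero correspond to margins where the Wald test rejects a zero difference at the familywise 5% level.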
