
Connections between Bayesian and Conditional Inference in Matched Studies


Presentation Transcript


  1. Connections between Bayesian and Conditional Inference in Matched Studies
  Ken Rice, University of Washington, Dept of Biostatistics
  November 2006

  2. Bayesian and Conditional Inference in Matched Studies
  Outline
  • Matched case-control studies (simple but very highly stratified data)
  • Conditional and Bayesian approaches
  • Conditional and Bayesian dogma
  • A resolution
  • Nice properties of this resolution

  3. Bayesian and Conditional Inference in Matched Studies
  Formal description
  • First 'match' subjects: same age, sex, etc.; one case to one control
  • Then assess exposures: control exposure Z1k and case exposure Z2k, for pair k
  • e.g. for a binary exposure, Z1k ~ Bern(p1k), Z2k ~ Bern(p2k) (simulated in the sketch below)
  • Assume logit(p2k) = logit(p1k) + log(ψ), so ψ is the odds ratio of interest
  • Generates one nuisance parameter for each pair (Neyman-Scott)
  • The likelihood factorizes: Pr(Z1k, Z2k) = Lcond(ψ; Z1k, Z2k) × Lmarg(ψ, p1k; Z1k + Z2k)
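
A minimal simulation sketch of this data-generating model (mine, not the talk's): the value ψ = 2 and the Beta(2, 2) distribution for the pair-specific p1k are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(2006)

def simulate_pairs(n_pairs, psi, rng):
    """Simulate 1:1 matched pairs: Z1k ~ Bern(p1k), Z2k ~ Bern(p2k),
    with logit(p2k) = logit(p1k) + log(psi)."""
    p1 = rng.beta(2, 2, size=n_pairs)      # pair-specific nuisance parameters
    p2 = psi * p1 / (1 - p1 + psi * p1)    # solves logit(p2) = logit(p1) + log(psi)
    z1 = rng.binomial(1, p1)               # control exposures
    z2 = rng.binomial(1, p2)               # case exposures
    return z1, z2

z1, z2 = simulate_pairs(10_000, psi=2.0, rng=rng)
```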

  4. Bayesian and Conditional Inference in Matched Studies
  Analysis via conditioning
  • You can maximize w.r.t. all parameters, but the MLE for ψ goes to ψ² (awful!)
  • The 'conditional likelihood' ignores the difficult, unhelpful term Lmarg(ψ, p1k; Z1k + Z2k) and uses only Lcond(ψ; Z1k, Z2k)
  • Conditional likelihood contributions for a binary exposure: concordant pairs (Z1k = Z2k) contribute nothing; a discordant pair contributes ψ/(1 + ψ) if the case is exposed, or 1/(1 + ψ) if the control is exposed
  • The ratio of the two kinds of discordant pairs gives the cMLE for ψ (computed in the sketch below)
  • Well behaved: standard likelihood asymptotics work. The general form is conditional logistic regression
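
A sketch of the ratio-of-discordant-pairs estimate, reusing the simulated z1, z2 from the sketch above; the formula is the slide's, the code is illustrative.

```python
import numpy as np

def conditional_mle(z1, z2):
    """cMLE of the odds ratio psi in a 1:1 matched case-control study:
    the ratio of the two kinds of discordant pairs. Concordant pairs
    drop out of the conditional likelihood entirely."""
    n10 = np.sum((z2 == 1) & (z1 == 0))    # case exposed, control unexposed
    n01 = np.sum((z2 == 0) & (z1 == 1))    # control exposed, case unexposed
    return n10 / n01

psi_hat = conditional_mle(z1, z2)          # close to the true psi = 2
```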

  5. Bayesian and Conditional Inference in Matched Studies
  Random-effects analysis
  • Suppose p1k ~ G, a random-effects/mixing/prior distribution
  • Marginal likelihood contributions (dropping the subscript k): Pr(Z1, Z2) = ∫ Pr(Z1, Z2 | p1) dG(p1)
  • Inference comes from the marginal likelihood (often fully Bayesian)
  • Define mt = EG[Pr(Z1k + Z2k = t)], t = 0, 1, 2: the marginal probabilities (estimated numerically in the sketch below)
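
A Monte Carlo sketch of the marginal probabilities mt for a user-supplied G. The Beta(2, 2) choice is again purely illustrative; note how m changes with ψ for this (non-invariant) G.

```python
import numpy as np

def marginal_probs(psi, g_sampler, n_draws=1_000_000, seed=0):
    """Monte Carlo estimate of mt = EG[Pr(Z1 + Z2 = t)] for t = 0, 1, 2."""
    rng = np.random.default_rng(seed)
    p1 = g_sampler(rng, n_draws)                   # draw p1 ~ G
    p2 = psi * p1 / (1 - p1 + psi * p1)            # logit(p2) = logit(p1) + log(psi)
    m0 = np.mean((1 - p1) * (1 - p2))              # neither member exposed
    m1 = np.mean(p1 * (1 - p2) + (1 - p1) * p2)    # exactly one exposed
    m2 = np.mean(p1 * p2)                          # both exposed
    return m0, m1, m2

print(marginal_probs(1.0, lambda rng, n: rng.beta(2, 2, n)))
print(marginal_probs(4.0, lambda rng, n: rng.beta(2, 2, n)))  # m depends on psi here
```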

  6. Bayesian and Conditional Inference in Matched Studies
  Beating up on random-effects
  • An innocuous-looking G can do even worse than the (awful) naïve MLE! (Seaman and Richardson 2004)
  • Where did G come from? A big assumption, and hard to check, with this design
  • Consistency? Efficiency? Software for non-experts?
  • Seems quite subjective (and so must be garbage!)

  7. Bayesian and Conditional Inference in Matched Studies
  Beating up on conditioning!
  • Why throw away information? There can be some information about ψ in the marginal probabilities
  • The cMLE is a bit biased towards the null. You can (sometimes) do better with e.g. a normal distribution for G
  • The conditioning argument completely falls to pieces outside 'pretty' models; it's not a general prescription for getting rid of nuisance parameters
  • Doesn't use a full model (... and so must be garbage!)

  8. Bayesian and Conditional Inference in Matched Studies
  A pragmatic common ground
  • Philosophy aside, everything would be fine if m were free of ψ
  • This is actually possible. For pair-matched studies, with any number of categorical covariates, there are mixing distributions with exactly this property [under review; see also Rice 2004, JASA]
  • Call these invariant distributions. They exist in closed form, are 'proper priors', and by definition have nice conjugacy properties
  • An example, for 1:1 matched case-control: with probability 1/2, p1k = 1/2 (so p2k = ψ/(1 + ψ)); with probability 1/2, p2k = 1/2 (so p1k = 1/(1 + ψ))
  • This gives m = {0.25, 0.5, 0.25} whatever ψ is, and so is invariant (checked in the sketch below). More generally, transformations of multivariate Normals can be used
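
A short check (mine, not the talk's) that the two-point example really is invariant: m comes out as {0.25, 0.5, 0.25} for every ψ.

```python
import numpy as np

def invariant_m(psi):
    """Exact mt under the slide's two-point mixing distribution:
    (p1, p2) = (1/2, psi/(1+psi)) or (1/(1+psi), 1/2), each with probability 1/2."""
    points = [(0.5, psi / (1 + psi)), (1 / (1 + psi), 0.5)]
    m = np.zeros(3)
    for p1, p2 in points:
        m[0] += 0.5 * (1 - p1) * (1 - p2)              # t = 0: neither exposed
        m[1] += 0.5 * (p1 * (1 - p2) + (1 - p1) * p2)  # t = 1: exactly one exposed
        m[2] += 0.5 * p1 * p2                          # t = 2: both exposed
    return m

for psi in (0.5, 1.0, 4.0, 10.0):
    print(psi, invariant_m(psi))   # [0.25, 0.5, 0.25] every time
```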

  9. Bayesian and Conditional Inference in Matched Studies
  About invariant distributions #1
  • Consider the previous example: using the marginal likelihood, all you need to know is that m = {0.25, 0.5, 0.25}; other details about G don't affect the analysis
  • There are infinitely large classes of distributions which lead to identical marginal probabilities m. We get equivalence classes of mixing distributions; hence we need only make non-parametric assumptions about G
  • This is (?) entirely novel in Bayesian analysis. It is Bayesian and non-parametric, but nothing like 'Bayesian non-parametrics'!
  • In a weaker sense, absolutely all invariant G are equivalent: the marginal probabilities m and the odds ratio(s) ψ are orthogonal, and for any m free of ψ the marginal and conditional likelihoods are proportional (illustrated in the sketch below)
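
A numerical illustration of that proportionality, under the invariant two-point G from the previous sketch: for a discordant pair, the full integrated likelihood equals the conditional likelihood contribution times m1 = 0.5, whatever ψ is.

```python
def integrated_lik(z1, z2, psi):
    """Likelihood of one pair, integrating p1 over the invariant two-point G."""
    points = [(0.5, psi / (1 + psi)), (1 / (1 + psi), 0.5)]
    return sum(0.5 * p1**z1 * (1 - p1)**(1 - z1) * p2**z2 * (1 - p2)**(1 - z2)
               for p1, p2 in points)

def conditional_lik(z1, z2, psi):
    """Conditional likelihood contribution, given the pair total z1 + z2."""
    if z1 + z2 != 1:
        return 1.0                                 # concordant pairs are uninformative
    return psi / (1 + psi) if z2 == 1 else 1 / (1 + psi)

for psi in (0.5, 1.0, 4.0):
    print(psi, integrated_lik(0, 1, psi) / conditional_lik(0, 1, psi))  # always 0.5
```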

  10. Bayesian and Conditional Inference in Matched Studies
  About invariant distributions #2
  • Return to the example: if the case/control labels are switched, nothing happens. Other examples show the same behavior
  • This is quite a general property. It has to hold for the marginal probabilities, and we saw that nothing else in G actually matters
  • So it's natural (but not necessary) that examples have this property. Conditional logistic regression also behaves in this way: re-labelling everyone gives the same (effective) answer (see the sketch below)
  • Can be interpreted as centering the variables (so G depends on ψ)
  • Assuming symmetry is surprisingly controversial among Bayesians! But who can really quantify some a priori difference between cases and controls? Why would they do the study?
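
A tiny demonstration of the re-labelling symmetry, reusing conditional_mle and the simulated data from the earlier sketches: swapping the case and control roles maps the estimate ψ̂ to 1/ψ̂, i.e. the same effect with its sign flipped on the log scale.

```python
import numpy as np

# swapping roles swaps the two discordant counts n10 and n01, so the
# estimated log odds ratio simply changes sign: the same effective answer
print(np.log(conditional_mle(z1, z2)))    # log(psi_hat)
print(np.log(conditional_mle(z2, z1)))    # -log(psi_hat)
```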

  11. Bayesian and Conditional Inference in Matched Studies
  Beating up on fundamentalists
  • Recommend using invariant mixing distributions (call them whatever you like in order to sleep at night)
  • A direct consequence: without a really good justification of some non-invariant G, random-effects fans should use conditional logistic regression
  • Conditioning fans have absolutely nothing to boast about; random-effects models can have all the nice properties too
  • For extensions, the random-effects paradigm allows much more flexibility: measurement error, missing data, hierarchical models, and prior information are all quite easy extensions

  12. Bayesian and Conditional Inference in Matched Studies
  Conclusions
  • Highly stratified data leads naturally to considerations of exchangeability
  • Bayesians in particular have been thinking hard about exchangeability for a long time; they have a lot to offer in this field
  • Puritanical adherence to your favorite mode of inference is unhelpful
  • My thanks to many colleagues (and referees) who have helped the development of this work immensely
  References, talks, and very silly posters at http://faculty.washington.edu/kenrice

  13. Bayesian and Conditional Inference in Matched Studies
  Ken's philosophy of statistical research (in pictures)

  Boring simple models       | Exciting (!) complex models
  ---------------------------|------------------------------------
  Fast                       | Slow
  Generic, reliable          | Problem-specific, can behave oddly
  No power gains to be made  | Can get you extra power
  Easy to use/misuse         | Requires training to get anywhere

  Fundamentally, most types of regression are interpretable as exercises in model-fitting.

  14. Bayesian and Conditional Inference in Matched Studies
  Ken's philosophy of statistical research (in pictures)
  As you know, not all analyses fit this description:
  • Cox Regression
  • Conditional Logistic Regression
  • Robust covariance estimation (sandwich estimates)
  • Robustness to outliers through bounded influence
  In vehicular form: (guess!)
  It is typically very hard to adapt submarines for monster-truck jobs. But measurement error, missing data, prior information, and multiple sources of data are much more straightforward when you start with a model.

  15. Bayesian and Conditional Inference in Matched Studies
  Ken's philosophy of statistical research (in pictures)
  An all-purpose (& rather cool) solution: it's possible to find full-likelihood interpretations of non-standard analyses, although not easy; CLR and bounded influence have been tackled in this way, others remain
  • Helps understand why the non-standard methods work, & potential problems
  • Much easier to allow for measurement error etc.; Sub → Truck conversions become Car → Truck
  • Just really cool
