Effect of the Reference Set on Frequency Inference
Donald A. Pierce, Radiation Effects Research Foundation, Japan
Ruggero Bellio, Udine University, Italy
Paper, this talk, other things at http://home.att.ne.jp/apple/pierce/
Frequency inferences depend to first order only on the likelihood function, and to higher order on other aspects of the probability model, or reference set.
That is, on "other aspects" that do not affect the likelihood function, e.g. censoring models and stopping rules.
Various reasons for interest in this, e.g.:
  Foundational: to what extent do frequency inferences violate the Likelihood Principle?
  Unattractiveness of specifying censoring models
  Practical effects of stopping rules
Here we study to what extent, and in what manner, second-order inferences depend on the reference set.
Example: Sequential Clinical Trials
Patients arrive and are randomized to treatments, and outcomes are observed.
Stop at n patients based on the outcomes to that point.
Then the data have a probability model, and the likelihood function is this model viewed as a function of the parameter, defined only up to proportionality.
The likelihood function does not depend on the stopping rule, including that with fixed n.
First-order inference, based only on the likelihood function, does not depend on the stopping rule, but higher-order inference does.
How does inference allowing for the stopping rule differ from that for fixed sample size?
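A small numerical illustration of the point that the stopping rule leaves the likelihood unchanged up to proportionality. This is only a sketch under simplifying assumptions that are not from the talk: a single Bernoulli outcome with success probability p, and a comparison of fixed-n (binomial) sampling with sampling that stops at a prescribed number of successes (negative binomial). The function names and the numbers 3 and 12 are hypothetical.

```python
import math

def binomial_loglik(p, successes, n):
    """Log-likelihood when the sample size n is fixed in advance."""
    return (math.log(math.comb(n, successes))
            + successes * math.log(p)
            + (n - successes) * math.log(1 - p))

def negbinomial_loglik(p, successes, n):
    """Log-likelihood when sampling stops at the last success, observed on trial n."""
    return (math.log(math.comb(n - 1, successes - 1))
            + successes * math.log(p)
            + (n - successes) * math.log(1 - p))

def loglik_difference(p, successes=3, n=12):
    """Difference of the two log-likelihoods at p for the same observed data."""
    return binomial_loglik(p, successes, n) - negbinomial_loglik(p, successes, n)

# The difference is the same at every p, so the two likelihood
# functions are proportional and carry the same first-order inference.
grid = [0.1, 0.25, 0.5, 0.75]
differences = [loglik_difference(p) for p in grid]
print(differences)
```

The constant difference is the log of the ratio of the two combinatorial factors, which do not involve p. The sampling distributions of the data still differ between the two designs, which is where higher-order frequency inference can pick up the stopping rule.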
Example: Censored Survival Data
Patients arrive and are given treatments. The outcome is response time, and when it is time for analysis some patients have yet to respond.
The likelihood function is built from the observed data alone, but the full probability model involves matters such as the probability distribution of patient arrival times.
This involves what is called the censoring model.
First-order inferences depend only on the likelihood function and not on the censoring model.
In what way, and how much, do higher-order inferences depend on the censoring model?
It is unattractive that they should depend on it at all.
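For reference, the usual form of the likelihood for right-censored response times, written in generic notation that is an assumption here rather than the slide's own: t_i is the observed time for patient i, delta_i is 1 if the response was observed and 0 if censored, and f and S are the density and survivor function of the response time. This form presumes censoring that is independent of the response times and carries no information about theta.

```latex
L(\theta) \;\propto\; \prod_{i=1}^{n} f(t_i;\theta)^{\delta_i}\, S(t_i;\theta)^{1-\delta_i},
\qquad
\ell(\theta) \;=\; \sum_{i=1}^{n} \bigl\{ \delta_i \log f(t_i;\theta) + (1-\delta_i) \log S(t_i;\theta) \bigr\}.
```

No feature of the censoring distribution appears in this expression, which is why first-order inference does not require a censoring model.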
Typical second-order effects
Binomial regression: test for trend with 15 observations, estimate towards the boundary, P-values that should be 5%.
  First-order: 7.3%, 0.8%
  Second-order: 5.3%, 4.7%
Sequential experiment: underlying data [...], stop when [...] or [...]. Testing [...], and when stopping at following n: [...]
Generally, in settings with substantial numbers of nuisance parameters, and even for large samples, adjustments may be much larger than this, or they may not be.
Some general notation and concepts
Notation: model for the data, parametric function of interest, MLE, constrained MLE, profile likelihood.
Starting point: the signed LR statistic, which is N(0,1) to first order, so that to first order P-values follow from the standard normal distribution.
To second order, modern likelihood asymptotics yield an adjusted version of this statistic that is N(0,1).
Only the adjustment depends on the reference set, so this is what we aim to study.
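The quantities named on this slide can be written out in the notation that is standard in the higher-order likelihood literature. The symbols below are an assumption and may differ from those on the original slide: theta is the full parameter, psi = psi(theta) the interest parameter, theta-hat the MLE, theta-tilde the constrained MLE for given psi, and u the adjustment term from Barndorff-Nielsen's theory.

```latex
\ell_P(\psi) = \ell(\tilde\theta_\psi), \qquad
r(\psi) = \operatorname{sign}(\hat\psi - \psi)\,
          \bigl[\, 2\{\ell(\hat\theta) - \ell(\tilde\theta_\psi)\} \,\bigr]^{1/2},

\Pr\{ r \le r_{\mathrm{obs}};\, \psi \} \approx \Phi(r_{\mathrm{obs}})
\quad \text{(first order)},

r^{*} = r + \frac{1}{r}\,\log\!\left(\frac{u}{r}\right), \qquad
\Pr\{ r \le r_{\mathrm{obs}};\, \psi \} \approx \Phi(r^{*}_{\mathrm{obs}})
\quad \text{(higher order)}.
```

In this form the likelihood function alone determines r, so any dependence on the reference set enters through u.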
Even to higher order, ideal inference should be based on the distribution of the signed LR statistic. Modifications of this statistic pertain to its distribution, not to its inferential relevance.
Computing a P-value for testing a hypothesis on the interest parameter requires only an ordering of datasets for evidence against the hypothesis.
Consider integrated likelihoods, formed by integrating the likelihood against any smooth prior on the nuisance parameter.
Then, regardless of the prior, the signed LR statistic based on the integrated likelihood provides to second order the same ordering of datasets, for evidence against a hypothesis, as does [...]
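Now return to the main point of higher-order likelihood asymptotics, namely the second-order adjustment to the signed LR statistic.
The theory for this is due to Barndorff-Nielsen (Biometrika, 1986), who refers to the adjusted statistic as r*.
Thinking of the data as the MLE together with an ancillary statistic, the adjustment depends on notorious sample-space derivatives.
These are very difficult to compute, but Skovgaard (Bernoulli, 1996) showed that they can be approximated to second order by covariances of likelihood quantities.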
An expansion of one of these sample-space derivatives shows that the quantity entering Skovgaard's approximation is needed to only first order to obtain second-order final results.
It turns out that each of these approximations has a leading term depending only on the likelihood function, with a next term, one order smaller, depending on the reference set.
A similar expansion gives the same result for the other sample-space derivative.
This provides our first main result:
If within some class of reference sets (models) we can write the loglikelihood, without regard to the reference set, as a sum of contributions that are stochastically independent, then second-order inference is the same for all of the reference sets.
The reason is that when the contributions are independent, the value of each required covariance must agree to first order with the corresponding empirical mean over the contributions, and this mean does not depend on the reference set.
Thus, in this "independence" case, second-order inference, although not determined by the likelihood function, is determined by the contributions to it.
A main application of this pertains to censoring models, if censoring and response times for individuals are stochastically independent.
Then the usual contributions to the likelihood do not depend on the censoring model, and are stochastically independent.
So to second order, frequency inference is the same for any censoring model, even though some higher-order adjustment should be made.
In practice one should probably either assume some convenient censoring model, or approximate the covariances by the empirical covariances of contributions to the loglikelihood.
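The second option, empirical covariances of loglikelihood contributions, can be sketched for a deliberately simple case. Everything specific below is an assumption for illustration and not from the talk: exponential response times with a scalar rate, simulated uniform censoring, the hypothesized rate 0.4, and the function names. The empirical quantity is the sum over subjects of products of per-subject score contributions evaluated at two parameter values, and it uses only the observed times and censoring indicators.

```python
import numpy as np

def score_contributions(rate, times, events):
    """Per-subject derivative in rate of events*log(rate) - rate*times."""
    return events / rate - times

def empirical_score_covariance(rate_a, rate_b, times, events):
    """Sum over subjects of products of score contributions at two rates."""
    return float(np.sum(score_contributions(rate_a, times, events)
                        * score_contributions(rate_b, times, events)))

# Simulated data (hypothetical): the censoring model is used only to
# generate the data and never enters the calculations that follow.
rng = np.random.default_rng(0)
n = 200
response = rng.exponential(scale=1 / 0.5, size=n)  # true rate 0.5
censor = rng.uniform(0.0, 6.0, size=n)             # convenient censoring model
times = np.minimum(response, censor)
events = (response <= censor).astype(float)

rate_hat = events.sum() / times.sum()              # MLE of the rate
rate_null = 0.4                                    # hypothesized rate
cov_hat = empirical_score_covariance(rate_hat, rate_null, times, events)
print(rate_hat, cov_hat)
```

Because the contributions are independent across subjects, sums of this kind estimate the model-based covariances without any statement about how censoring arose, which is the practical content of the result above.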
Things are quite different for comparing sequential and fixed sample size experiments: usually we cannot have "contributions" that are independent in both reference sets.
But first we need to consider under what conditions second-order likelihood asymptotics applies to sequential settings.
We argue in our paper that it does whenever the usual first-order asymptotics applies.
These conditions are given by Anscombe's Theorem: a statistic that is asymptotically standard normal for fixed n remains so when
  (a) the CV of n approaches zero, and
  (b) the statistic is asymptotically suitably continuous.
Discrete n in itself does not invalidate (b).
In the key relation for the adjusted statistic we need to consider, following Pierce & Peters (JRSSB 1992), the decomposition of the adjustment into two parts, NP and INF.
NP pertains to the effect of fitting nuisance parameters, and is related to Barndorff-Nielsen's modified profile likelihood.
INF pertains to moving from likelihood to frequency inference. INF is small when the adjusted information is large.
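In symbols, the decomposition of Pierce & Peters can be written as follows. The notation for r, r* and u is the standard one from the literature and is assumed here rather than taken from the slide.

```latex
r^{*} \;=\; r + \frac{1}{r}\,\log\!\left(\frac{u}{r}\right) \;=\; r + \mathrm{NP} + \mathrm{INF}.
```

Splitting the adjustment in this way lets the effect of the reference set be examined separately for each part.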
When the interest and nuisance parameters are chosen as orthogonal, the NP adjustment depends, to second order, only on the likelihood function.
Parameters orthogonal for fixed-size experiments remain orthogonal for any stopping rule, since (for underlying i.i.d. observations) the Wald Identity gives an expected information that is proportional to that of a single observation.
Except for Gaussian experiments with a regression parameter, there is an INF adjustment both for fixed n and for sequential experiments, but the two are different.
Thus, in sequential experiments the NP adjustment and the MPL do not depend on the stopping rule, but the INF adjustment does.
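The orthogonality argument can be written out as a short derivation. The notation is an assumption: N is the random sample size under the stopping rule, with finite expectation, ell_k is the loglikelihood contribution of the k-th i.i.d. observation, and i_1 is the expected information from a single observation. Applying the Wald Identity to the sum of second-derivative contributions gives

```latex
i_N(\theta)
  \;=\; E_\theta\!\left\{ -\sum_{k=1}^{N} \frac{\partial^{2} \ell_k(\theta)}{\partial\theta\,\partial\theta^{\top}} \right\}
  \;=\; E_\theta(N)\; i_1(\theta).
```

The sequential information matrix is a scalar multiple of the single-observation one, so any off-diagonal block that is zero for fixed n is zero under every such stopping rule.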
SUMMARY
When there are contributions to the likelihood that are independent under each of two reference sets, then second-order ideal frequency inference is the same for these.
In sequential settings we need to consider the nuisance parameter and information adjustments. To second order, the former and the modified profile likelihood do not depend on the stopping rule, but the latter does.
This is all as one might hope, or expect. Inference should not, for example, depend on the censoring model, but it should depend on the stopping rule.
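Appendix: Basis for higher-order likelihood asymptotics
Transform variables in the density of the data, then integrate out the components not of interest.
This provides a second-order approximation to the distribution of the signed LR statistic.
The Jacobian and the resultant from the integration are what comprise the adjustment.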