
A Skeptical Bayesian at Durham

A Skeptical Bayesian at Durham. Jim Linnemann MSU AMST Workshop, Fermilab June 1, 2002. Skeptical Bayesian remarks Conference highlights Some things to work on. What do I want? Statistics is not a science; nature won’t tell me the right procedure.


Presentation Transcript


  1. A Skeptical Bayesian at Durham. Jim Linnemann, MSU. AMST Workshop, Fermilab, June 1, 2002

  2. Skeptical Bayesian remarks • Conference highlights • Some things to work on

  3. What do I want? Statistics is not a science; nature won’t tell me the right procedure • Correction of upper limits for systematics • understand limitations of method • Common practice: recommended in RPP “statistics” section • method sensible enough to sell; let’s try to agree? • Thanks to PDG, Glen Cowan: at last, some advice on systematics! • Default: convolve the likelihood • Mixes Bayes and Frequentist (like Cousins+Highland) • Comparison of experiments • Limit gets worse with worse resolution, background • And same answer if inputs the same! • Connection with known limits • Continuity to simple frequentist cases (b=0, σ=0) • coverage, if possible • Continuity of limit to error bar? (‘unified method’) • Not obvious: “5 sigma limit” to “1 sigma central” • Is under-coverage of Bayes lower limit crucial?

  4. To Use Bayes or not? • Professional Statisticians are much more Bayes-oriented in last 20 years • Computationally possible • Philosophically coherent • (solipsistic?? Subjective Bayes…) • In HEP: want to publish result, not prior • We want to talk about P(theory|data) • But this requires prior: P(theory) • Likelihoods we can agree on! • Conclusions should be insensitive to a range of priors • Probably true, with enough data • Search limits DO depend on priors! • Hard to convince anyone of a single objective prior!!! • Unpleasant properties of naïve frequentist limits, too • Feldman-Cousins is current consensus • Systematic errors hard in frequentist framework • PDG currently recommends Bayes “smearing of likelihood” • close in spirit to Cousins-Highland mixed Frequentist-Bayesian

  5. Why Bayesian? • Nuisance parameters are not strictly beyond frequentist methods’ reach but • Neyman construction in n-dimensions rarely used • Bayes: Natural treatment of systematics! • Unify treatment with statistical errors consistently • “degree of belief” needed for many systematics • Coherent point of view

  6. Bayes Theorem • P(θ|x) = P(x|θ) P(θ) / ∫ P(x|θ) P(θ) dθ • θ is the unknown parameter • P(x|θ) is the likelihood function (a function of θ): p(data|model), varying the model; NOT A PDF • P(θ|x) = pdf of θ after we observe x; describes posterior knowledge of θ • P(θ) = pdf of θ before we observe x; describes prior knowledge of θ • and what might that be????
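
To make the theorem on this slide concrete, here is a minimal sketch for a Poisson counting experiment: n_obs events observed, a known background b, and an unknown signal mean s. The function names, the grid range, and the choice of a flat prior are illustrative assumptions, not something taken from the talk. The integral in the denominator is approximated by a midpoint sum.

```python
import math

def poisson_likelihood(n_obs, mean):
    """P(n_obs | mean); viewed as a function of mean it is a likelihood, not a pdf."""
    return math.exp(-mean) * mean ** n_obs / math.factorial(n_obs)

def posterior_on_grid(n_obs, b, prior, s_max=50.0, steps=5000):
    """Return (grid, posterior, ds): posterior density of s on a uniform midpoint grid."""
    ds = s_max / steps
    grid = [(i + 0.5) * ds for i in range(steps)]
    unnorm = [poisson_likelihood(n_obs, s + b) * prior(s) for s in grid]
    norm = sum(unnorm) * ds  # numerical stand-in for the integral in the denominator
    return grid, [u / norm for u in unnorm], ds

def upper_limit(n_obs, b, prior, cl=0.90):
    """Smallest grid point at which the cumulative posterior reaches cl."""
    grid, post, ds = posterior_on_grid(n_obs, b, prior)
    cum = 0.0
    for s, p in zip(grid, post):
        cum += p * ds
        if cum >= cl:
            return s
    return grid[-1]

def flat_prior(s):
    """Hypothetical prior choice: flat in the signal mean s."""
    return 1.0

ul_n0 = upper_limit(0, 0.0, flat_prior)
ul_n3 = upper_limit(3, 0.0, flat_prior)
print(ul_n0, ul_n3)
```

With zero observed events, no background, and a flat prior, the posterior is exp(-s), so the 90% limit should land near ln 10 ≈ 2.30, up to the grid spacing.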

  7. Bayesians don’t own the theorem • Theorem in probability • any interpretation of probability following axioms can use it • If prior knowledge in frequencies: • use it to update knowledge of a particular measurement • Entirely within frequentist framework

  8. (Hill/DeYoung)

  9. (Hill/DeYoung)

  10. Why Skeptical? • Faustian bargain • Bayes Theorem: Updates your prior beliefs on signal, not just systematics (nuisance parameters) • Inserts your beliefs where you would prefer only data (publication) • Not so bad if you have enough data • Must consider alternative priors to avoid solipsism • Reasonable priors lead to same conclusions • Not great in the case of upper limits • Not really independent of signal prior assumptions! • No universally accepted “objective” priors • Even Jeffreys’ metric-independent prior! Bernardo in n-dim? • “flat” is not special, nor metric-independent • Flat in what variable? Cross section, mass, tan β, ln(tan β)? • flat in mass gives a much tighter limit than flat in cross section • cross section prior rapidly falls in mass: pulls towards 0!
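
The "flat in what variable?" point can be illustrated with a small sketch. It assumes a background-free Poisson counting experiment with zero observed events and compares a prior flat in the signal mean s with a prior flat in a hypothetical variable t = √s. The √s choice is only a stand-in for the mass versus cross section example on the slide, and the function name and grid ranges are assumptions.

```python
import math

def upper_limit_flat_in(to_signal, t_max, n_obs=0, b=0.0, cl=0.90, steps=20000):
    """Upper limit on s for a prior flat in t, where s = to_signal(t) is increasing in t."""
    dt = t_max / steps
    ts = [(i + 0.5) * dt for i in range(steps)]
    weights = []
    for t in ts:
        mean = to_signal(t) + b
        weights.append(math.exp(-mean) * mean ** n_obs)  # Poisson likelihood up to a constant
    total = sum(weights)
    cum = 0.0
    for t, w in zip(ts, weights):
        cum += w
        if cum >= cl * total:
            return to_signal(t)  # quote the limit back in terms of s
    return to_signal(ts[-1])

ul_flat_in_s = upper_limit_flat_in(lambda t: t, t_max=40.0)
ul_flat_in_sqrt_s = upper_limit_flat_in(lambda t: t * t, t_max=7.0)
print(ul_flat_in_s, ul_flat_in_sqrt_s)
```

The same data and the same likelihood give noticeably different 90% limits (roughly 2.30 versus 1.35 in s), purely because "flat" was declared in a different variable.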

  11. Some Facts: Nuisance • Corrections for background uncertainty are small: <15% for even extreme db/b=1 • Efficiency/luminosity: <20% if <30% resolution • At least quadratic • Bayes corrections larger than Cousins-Highland • Probably larger than necessary • Especially as you approach discovery • Lognormal, beta, gamma agree to 5% (P(0)=0) • With flat prior, don’t use truncated Gaussian (P(0)≠0)
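
A rough sketch of what "smearing the likelihood" over a background uncertainty looks like in practice. It assumes a Poisson counting experiment, a flat prior on the signal, and a gamma prior on the background with mean b0 and relative width rel_unc (one of the P(0)=0 shapes named on the slide). All names and grid settings are illustrative assumptions; this is not the code behind the numbers quoted above.

```python
import math

def gamma_weights(b0, rel_unc, steps=300):
    """Midpoint grid of (b, prior weight) for a gamma prior with mean b0 and relative width rel_unc."""
    shape = 1.0 / rel_unc ** 2
    scale = b0 / shape
    b_max = b0 * (1.0 + 8.0 * rel_unc)
    db = b_max / steps
    points = []
    for i in range(steps):
        b = (i + 0.5) * db
        log_pdf = ((shape - 1.0) * math.log(b) - b / scale
                   - math.lgamma(shape) - shape * math.log(scale))
        points.append((b, math.exp(log_pdf) * db))
    return points

def bayes_upper_limit(n_obs, b0, rel_unc, cl=0.90, s_max=30.0, steps=1500):
    """Flat-prior upper limit on s, with the likelihood averaged over the background prior."""
    points = [(b0, 1.0)] if rel_unc == 0.0 else gamma_weights(b0, rel_unc)
    ds = s_max / steps
    grid = [(i + 0.5) * ds for i in range(steps)]
    like = []
    for s in grid:
        # Marginal (smeared) likelihood: sum over background values of Poisson x prior weight
        like.append(sum(w * math.exp(-(s + b)) * (s + b) ** n_obs for b, w in points))
    total = sum(like)
    cum = 0.0
    for s, value in zip(grid, like):
        cum += value
        if cum >= cl * total:
            return s
    return grid[-1]

ul_fixed = bayes_upper_limit(3, 3.0, 0.0)
ul_smeared = bayes_upper_limit(3, 3.0, 0.3)
print(ul_fixed, ul_smeared, ul_smeared / ul_fixed)
```

Comparing the two limits shows the size of the correction for one hypothetical case (3 events on an expected background of 3 with 30% uncertainty). For zero observed events the smeared likelihood factorizes, so the limit does not depend on the background at all.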

  12. Some Facts: Signal • Bayes flat signal prior over-covers upper Poisson limits • But undercovers lower • Nobody’s real prior (but probably doesn’t matter!) • 1/√(s+b) (Jeffreys’ form) • Average coverage OK, but can undercover • Really nobody’s real prior for signal (function of b!) • More complex than flat (s=sigma*L*eff) • Cheat and insert estimates, do dL, deff separately? • PROBLEM: • Differences between signal priors are of order of efficiency corrections that motivated going Bayesian! • Because we don’t have much data
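
The over-coverage claim for flat-prior upper limits can be checked directly. The sketch below assumes a Poisson counting experiment with known background b and uses the closed-form posterior tail for a flat signal prior: the ratio of Poisson cumulative sums at s+b and at b. Function names and the bisection bracket are assumptions made for illustration.

```python
import math

def poisson_cdf(n, mean):
    """P(N <= n) for a Poisson mean."""
    term = math.exp(-mean)
    total = term
    for k in range(1, n + 1):
        term *= mean / k
        total += term
    return total

def poisson_pmf(n, mean):
    """P(N = n) for a Poisson mean > 0, computed in log space."""
    return math.exp(n * math.log(mean) - mean - math.lgamma(n + 1))

def flat_prior_upper_limit(n_obs, b, cl=0.90):
    """Solve posterior tail(s) = 1 - cl by bisection (flat prior on s >= 0, known b)."""
    def tail(s):
        return poisson_cdf(n_obs, s + b) / poisson_cdf(n_obs, b)
    lo, hi = 0.0, 2.0 * n_obs + 20.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if tail(mid) > 1.0 - cl:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def coverage(s_true, b, cl=0.90, n_max=80):
    """Probability, under true signal s_true, that the quoted upper limit is >= s_true."""
    total = 0.0
    for n in range(n_max):
        if flat_prior_upper_limit(n, b, cl) >= s_true:
            total += poisson_pmf(n, s_true + b)
    return total

print(coverage(2.5, 0.0), coverage(2.5, 3.0))
```

Scanning s_true shows the familiar sawtooth: coverage never drops below the nominal 90% for these upper limits, and is often well above it.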

  13. flat

  14. UL Sizes (Narsky)

  15. Other, smaller, worries • HPD (Highest Posterior Density) limits not independent of parameterization (metric): • Stand-in for central limits, for example • P(θ|x) and P(θ²|x) don’t have equal height at equivalent points • Ideology… • “if your experiment is inconclusive, ask more experts to sharpen the prior” (!) • “the creativity is in formulating the prior” • But result had better be independent of the prior • unless it is expressing a constraint that you’re sure of! • A pain to waste time on such debates
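
The metric dependence of HPD intervals follows from how a density transforms under a change of variable. As a worked illustration, take φ = θ² with θ > 0 (an assumed reparameterization chosen for simplicity):

```latex
p_\phi(\phi \mid x)
  = p_\theta(\theta \mid x)\,\left|\frac{d\theta}{d\phi}\right|
  = \frac{p_\theta(\theta \mid x)}{2\theta},
  \qquad \phi = \theta^{2},\ \theta > 0 .

% Two HPD endpoints \theta_1 < \theta_2 have equal height in \theta:
p_\theta(\theta_1 \mid x) = p_\theta(\theta_2 \mid x)
  \;\Longrightarrow\;
\frac{p_\phi(\theta_1^{2} \mid x)}{p_\phi(\theta_2^{2} \mid x)}
  = \frac{\theta_2}{\theta_1} \neq 1 .
```

The endpoints of an HPD interval in θ therefore do not map onto the endpoints of an HPD interval in φ, whereas central (equal-tail) intervals do transform consistently.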

  16. Bayes at Durham (Michael Goldstein) • Vigorously subjective Bayesian • But not abusive, thank goodness! • “Sensitivity Analysis is at the heart of scientific Bayesianism” • How skeptical would the community as a whole have to be in order not to be convinced? • What prior gives P(hypothesis) > 0.5 • What prior gives P(hypothesis) > 0.99, etc • A modest proposal: • Many big groups have phenomenologists (+ videographers!) • get a statistician as a collaborator • as is common in clinical trials

  17. What to work on now? • Education • Look at tutorials from Durham • E.g. Barlow on systematics… • Absorbing experience from LEP • Combining data (Parke) • CLs? (Read): should at least understand CLs = PV(s+b)/PV(b) (Cowan, PDG Stats); PV = P Value = prob(obs), posterior, like P(χ²) • interpolating MC samples (Kjaer) • Understand Unfolding (Cowan, Blobel) • Quite important for combining data • And pdf fitting! (many talks at Durham): make it easier, not harder • Research • Blind Analyses • Goodness of Fit Tests • χ² is seldom the best test in the frequentist context • Not much available in Bayesian context; prefer comparison of models • Look at Yabsley’s talk from Belle on problems in B oscillations! • And at Smith and Tovey’s, on dark matter searches: other problems
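
For readers meeting CLs for the first time, here is a minimal sketch for a single-bin counting experiment. It assumes the p-values in the ratio are the lower tails P(n ≤ n_obs) under the signal-plus-background and background-only hypotheses. The function names and the numbers are hypothetical.

```python
import math

def poisson_cdf(n, mean):
    """P(N <= n) for a Poisson mean."""
    term = math.exp(-mean)
    total = term
    for k in range(1, n + 1):
        term *= mean / k
        total += term
    return total

def cls(n_obs, s, b):
    """Return (CL_s+b, CL_b, CLs) for a counting experiment."""
    cl_sb = poisson_cdf(n_obs, s + b)  # p-value under signal + background
    cl_b = poisson_cdf(n_obs, b)       # p-value under background only
    return cl_sb, cl_b, cl_sb / cl_b

# Hypothetical case: background fluctuates down to zero observed events.
cl_sb, cl_b, cl_s = cls(n_obs=0, s=1.0, b=3.0)
print(cl_sb, cl_b, cl_s)
```

In this example the signal-plus-background p-value alone (about 0.018) would exclude s = 1 at 95%, even though the experiment has little sensitivity to it. Dividing by the background-only p-value gives CLs ≈ 0.37, so the signal is not excluded.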

  18. Belle questions • Unified limits (for rare decays)? • Feldman-Cousins argues for unified limits • Same probability content whether search or measurement • How important to go smoothly from limit to error limits? • the concern is undercoverage of your stated limits • Doesn’t make sense to me! >> .999 for discovery, .68 for measurement! • How to combine statistical and systematic errors • Add linearly or in quadrature • How to deal with  analysis? • Interesting hypotheses of different dimensions: • Circle (physical region?), line, point • And data point outside any of them!

  19. Multidimensional Methods • Aspire to full extraction of information (“Bayes discriminant”) • Equivalent to trying to fit (little or nothing Bayesian): P(s|x)/P(b|x) = (P(s)/P(b)) × [P(x|s)/P(x|b)] [= Neyman-Pearson test], unless you know something of P(s)/P(b) vs. x that MC doesn’t! • Practical: multiple backgrounds, may want to fit separately • Less complexity to fit individual shapes than a sum • Issues in • choice of dimensionality (no one tells you how many!) • Almost always dimensionality < {kinematics + all ID variables} • No easy way to tell when you have “all important variables!” (“the limit”) • methods of approximation • control of bias/variance tradeoff • complexity of fit method • Number of free parameters of method itself • Amount of training data needed • “ease” of interpretation • We are following the field; hope for theory to help • An excellent book: Elements of Statistical Learning, Hastie, Tibshirani, Friedman
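
The posterior-odds identity on this slide can be written out in a few lines. The sketch assumes a single variable x with unit-width Gaussian shapes for signal (mean +1) and background (mean -1). Those shapes, the function names, and the prior odds are hypothetical; in a real analysis the densities would come from fits to Monte Carlo.

```python
import math

def gaussian_pdf(x, mean, sigma):
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def posterior_odds(x, prior_odds=1.0):
    """P(s|x)/P(b|x) = prior odds times the likelihood ratio P(x|s)/P(x|b)."""
    likelihood_ratio = gaussian_pdf(x, 1.0, 1.0) / gaussian_pdf(x, -1.0, 1.0)
    return prior_odds * likelihood_ratio

def discriminant(x, prior_odds=1.0):
    """Map the posterior odds onto [0, 1]: D = odds / (1 + odds) = P(s|x)."""
    odds = posterior_odds(x, prior_odds)
    return odds / (1.0 + odds)

for x in (-1.0, 0.0, 1.0):
    print(x, posterior_odds(x), discriminant(x))
```

The prior odds only rescale the likelihood ratio by a constant. Cutting on the discriminant is therefore equivalent to cutting on the Neyman-Pearson likelihood ratio, which is the sense in which there is little that is Bayesian about it.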

  20. R.K. Bock Interesting idea: Expansion in dimensionality of correlation

  21. Roger Barlow Calculated the  to use for comparison checks…

  22. Roger Barlow

  23. Aslan/Zech Goodness of Fit “Energy Test” (electrostatics motivated)
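
The slide gives no formula, so the following is only a sketch of the general idea. It assumes a two-sample statistic of the electrostatic-energy form, with a logarithmic distance weight and a small-distance cutoff, applied to one-dimensional samples. The normalization, the cutoff, and the permutation p-value are assumptions made for illustration and may differ from the version presented at Durham.

```python
import math
import random

def weight(r, cutoff=1e-6):
    """Logarithmic distance weight, with a cutoff to avoid log(0)."""
    return -math.log(max(r, cutoff))

def energy_statistic(xs, ys):
    """Like-charge repulsion within each sample minus attraction between the samples."""
    n, m = len(xs), len(ys)
    same_x = sum(weight(abs(xs[i] - xs[j])) for i in range(n) for j in range(i + 1, n))
    same_y = sum(weight(abs(ys[i] - ys[j])) for i in range(m) for j in range(i + 1, m))
    cross = sum(weight(abs(x - y)) for x in xs for y in ys)
    return same_x / n ** 2 + same_y / m ** 2 - cross / (n * m)

def permutation_p_value(xs, ys, n_perm=200, seed=0):
    """Fraction of random relabelings with a statistic at least as large as the observed one."""
    rng = random.Random(seed)
    observed = energy_statistic(xs, ys)
    pooled = list(xs) + list(ys)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if energy_statistic(pooled[:len(xs)], pooled[len(xs):]) >= observed:
            count += 1
    return (count + 1) / (n_perm + 1)

data = [0.1, 0.4, 0.5, 0.9, 1.3]
simulation = [0.2, 0.3, 0.7, 1.0, 1.1]
print(energy_statistic(data, simulation), permutation_p_value(data, simulation))
```

The statistic needs no binning and extends to several dimensions by replacing the absolute difference with a Euclidean distance, which is part of its appeal as a goodness-of-fit test.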

  24. Aslan/Zech

  25. Paul Harrison: Blind Analysis • “liberating” • Cousins: it takes longer, especially the first time • By the way, no one can read light green print…

  26. Blind Analysis • A called shot • Step towards making 3σ mean 3σ • Many ways to blind • 10% of data; background-only, obscure fit result • Creates a mindset • Avoiding biases and subjectivity

  27. Glen Cowan: Unfolding (unsmearing) • Inherently unstable • Measured smoother than true • Un-smooth: enhances noise! • Nice discussion of regularization, biases, uncertainties • See talk and his statistics book • One program: must balance between oscillations and over-smoothed result = bias-variance tradeoff • Same issues in multidimensional methods

  28. V. Blobel: Unfolding: Insight • View as matrix problem • “ill-posed” = singular • Analyze in terms of eigenvalues/vectors and condition number • Statistical error • Truncate eigenfunctions when below error bound
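
A two-bin toy makes the eigenvalue picture concrete. The sketch assumes a symmetric response matrix [[1-eps, eps], [eps, 1-eps]] with eps < 0.5, where eps is a hypothetical migration probability between the bins. Its eigenvectors are the sum mode (1, 1) with eigenvalue 1 and the difference mode (1, -1) with eigenvalue 1-2·eps. The function names and the truncation rule are illustrative assumptions.

```python
import math

def condition_number(eps):
    """Ratio of largest to smallest eigenvalue of the 2x2 response matrix."""
    return 1.0 / abs(1.0 - 2.0 * eps)

def unfold_two_bins(measured, eps, n_sigma=None):
    """Invert the response mode by mode; optionally drop an insignificant difference mode."""
    m1, m2 = measured
    total = m1 + m2                 # coefficient of mode (1, 1), eigenvalue 1
    diff = m1 - m2                  # coefficient of mode (1, -1), eigenvalue 1 - 2*eps
    eigenvalue = 1.0 - 2.0 * eps
    diff_error = math.sqrt(total)   # Poisson error on m1 - m2
    if n_sigma is not None and abs(diff) < n_sigma * diff_error:
        true_diff = 0.0             # truncate: the mode is below its error bound
    else:
        true_diff = diff / eigenvalue  # dividing by a small eigenvalue amplifies noise
    return [(total + true_diff) / 2.0, (total - true_diff) / 2.0]

measured = [84.0, 76.0]             # what a true [100, 60] looks like after eps = 0.4 smearing
print(condition_number(0.4))
print(unfold_two_bins(measured, 0.4))
print(unfold_two_bins(measured, 0.4, n_sigma=1.0))
```

Plain inversion recovers the true spectrum here, but only by multiplying a measured difference of 8 (with a statistical error near 12.6) by 5. Truncating that mode gives a smoother, biased answer with far less noise.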

  29. Blobel Unfolding Results • Oversmoothed • Statistical error in neighboring bins no longer uncorrelated! • High frequencies not measured: report fewer bins? (or supply from prior??) • Higher modes (noisy) converge slowly: iterate only a few times (d’Agostini)

  30. Cousins’ Last Words (for now!), from conference summary • The area under the likelihood function is meaningless. • Mode of a probability density is metric-dependent, as are shortest intervals. • A confidence interval is a statement about P(data | parameters), not P(parameters | data) • Don’t confuse confidence intervals (statements about a parameter) with goodness of fit (statement about the model itself). • P(non-SM physics | data) requires a prior; you won’t get it from frequentist statistics. • The argument for coherence of Bayesian P is based on P = subjective degree of belief.
