
Results from small numbers



Presentation Transcript


  1. Results from small numbers Roger Barlow Manchester Bohr Lunch 16th October 2009

  2. The basic equation: σ = (N − B) / (A η L), where N is the number of observed events, B the expected background, A the acceptance, η the efficiency and L the integrated luminosity. Roger Barlow

  3. Equation in use Roger Barlow

  4. With errors. Example: see 100 events, expected background 5.5. With A = 0.5, η = 0.1, L = 10 fb⁻¹ this gives σ = 189 ± 20 fb. Systematic errors come from the uncertainties in A, η, L and B. Roger Barlow
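A minimal numerical sketch of this example, assuming the basic equation σ = (N − B)/(A η L) from slide 2 and taking √N as the Poisson uncertainty on the observed count (systematics not included):

```python
# Cross-section estimate from the numbers on this slide.
import math

N, B = 100, 5.5               # observed events, expected background
A, eta, L = 0.5, 0.1, 10.0    # acceptance, efficiency, luminosity in fb^-1

sigma = (N - B) / (A * eta * L)              # 189 fb
stat_error = math.sqrt(N) / (A * eta * L)    # 20 fb
print(f"sigma = {sigma:.0f} +/- {stat_error:.0f} fb")
```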

  5. Why this talk? This treatment assumes N >> 0 and N >> B. In real life, one or other or both of these may not be true, particularly if you are looking for a signal that is not there. N = 1, B = 0.0: quote 1 ± 1? N = 100, B = 101.0: quote −1 ± 10? N = 0, B = 1.1: quote −1.1 ± 0???????? Roger Barlow

  6. Step back: Definition of Probability. Ensemble of Everything, containing A. Limit of frequency: P(A) = lim(N→∞) N(A)/N. P(A) depends not just on A but on the ensemble, which must be specified. Sometimes this is not possible. Roger Barlow

  7. Events with no ensemble Some events are unique. Consider “It will probably rain tomorrow.” There is only one tomorrow (Saturday 17th October). There is NO ensemble. P(rain) is either 0/1 =0 or 1/1 = 1 Strict frequentists cannot say 'It will probably rain tomorrow'. This presents severe social problems. Roger Barlow

  8. Circumventing the limitation Assemble an ensemble of statements of which 70% (say) are true. (E.g. Weather forecasts with a verified track record) A frequentist can say: “The statement ‘It will rain tomorrow’ has a 70% probability of being true.” It will rain tomorrow - with 70% confidence For unique events, confidence level statements replace probability statements. Roger Barlow

  9. Measurement and Frequentist probability. MT = 171 ± 2 GeV. Is there a 68% probability that MT lies between 169 and 173 GeV? No. MT is unique: it is either in the range or outside. (Soon we’ll know.) But an interval constructed as the measurement ± 2 GeV does bracket the true MT 68% of the time, so: the statement ‘MT lies between 169 and 173 GeV’ has a 68% probability of being true. MT lies between 169 and 173 GeV with 68% confidence. Roger Barlow

  10. Frequentist confidence intervals beyond the simple Gaussian. Select a confidence level CL and a strategy. From P(x; μ) choose a construction (functions x1(μ), x2(μ)) for which P(x ∈ [x1(μ), x2(μ)]) ≥ CL for all μ. Given a measurement X, make the statement μ ∈ [μLO, μHI] @ CL, where X = x2(μLO) and X = x1(μHI). (Neyman technique) Roger Barlow

  11. Confidence Belt. Constructed horizontally such that the probability of a result lying inside the belt is 68% (or whatever). Read vertically using the measurement. Example: proportional Gaussian, σ = 0.1 μ (measures with 10% accuracy). Result (say) 100.0 → μLO = 90.91, μHI = 111.1. Roger Barlow
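A short sketch of reading this belt vertically for the example above: with σ = 0.1 μ the central 68% belt has edges x1(μ) = 0.9 μ and x2(μ) = 1.1 μ, and inverting them at the measured value X reproduces the quoted interval.

```python
# Central 68% belt for a Gaussian with sigma = 0.1*mu: edges x1(mu) = 0.9*mu
# and x2(mu) = 1.1*mu.  Reading the belt vertically at the measurement X
# means solving X = x2(mu_LO) and X = x1(mu_HI).
X = 100.0
mu_lo = X / 1.1    # 90.91
mu_hi = X / 0.9    # 111.1
print(f"mu in [{mu_lo:.2f}, {mu_hi:.1f}] @ 68% CL")
```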

  12. Strategies. Two-sided central: (1−CL)/2 above and below. Upper limit: 1−CL below. Lower limit: 1−CL above. Roger Barlow

  13. Small number counting experiments. Poisson distribution: P(r; μ) = e^−μ μ^r / r!. For large μ one can use the Gaussian approximation, but not for small μ. Choose CL. Just use one curve to give an upper limit. The discrete observable makes smooth curves into ugly staircases. Observe n. Quote the upper limit μHI from solving Σ(r=0..n) P(r; μHI) = Σ(r=0..n) e^−μHI μHI^r / r! = 1 − CL. Translation: n is small, so μ can’t be very large. If the true value is μHI (or higher), then the chance of a result this small (or smaller) is only 1 − CL (or less). Roger Barlow
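A small sketch of that inversion (assuming scipy is available): solve the cumulative Poisson sum for μHI numerically. Looping over n and CL reproduces the table on the next slide.

```python
# Solve sum_{r=0}^{n} e^-mu mu^r / r! = 1 - CL for the upper limit mu_HI,
# using the Poisson CDF from scipy.
from scipy.optimize import brentq
from scipy.stats import poisson

def poisson_upper_limit(n, cl):
    """Classical upper limit mu_HI for n observed events at confidence level cl."""
    return brentq(lambda mu: poisson.cdf(n, mu) - (1.0 - cl), 1e-9, 100.0)

# Reproduces the table on the next slide:
for n in range(6):
    print(n, [round(poisson_upper_limit(n, cl), 2) for cl in (0.90, 0.95, 0.99)])
```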

  14. Poisson Table: upper limits.
      n     90%     95%     99%
      0     2.30    3.00    4.61
      1     3.89    4.74    6.64
      2     5.32    6.30    8.41
      3     6.68    7.75   10.05
      4     7.99    9.15   11.60
      5     9.27   10.51   13.11
      .....
  Notice! If you see nothing, you have set a limit of 3.00 @ 95% CL. Roger Barlow

  15. Pause to collect thoughts • N >> 0 requirement cracked • N >> B requirement not solved • Suppose you see 0 events and have an expected background of B = 0.1 – report limit 2.9? B = 2.9 – report limit 0.1?? B = 3.1 – report limit −0.1???? Roger Barlow

  16. Alternative definition: Bayesian (Subjective) Probability. P(A) is a number describing my degree of belief in A: 1 = certain belief, 0 = total disbelief. It can be calibrated against simple classical probabilities: P(A) = 0.5 means I would be indifferent given the choice of betting on A or betting on a coin toss. A can be anything: death, rain, horse races, existence of SUSY… Very adaptable. But there is no guarantee my P(A) is the same as your P(A). Subjective = unscientific? Roger Barlow

  17. Bayes’ Theorem. General (uncontroversial) form: P(A|B) P(B) = P(A & B) = P(B|A) P(A), so P(A|B) = P(B|A) P(A) / P(B). P(B) can be written P(B|A) P(A) + P(B|not A) (1 − P(A)). Examples: People: P(Artist|Beard) = P(Beard|Artist) P(Artist) / P(Beard). π/K Cherenkov counter: P(π|signal) = P(signal|π) P(π) / P(signal). Medical diagnosis: P(disease|symptom) = P(symptom|disease) P(disease) / P(symptom) = 0.9 × 0.5 / (0.9 × 0.5 + 0.01 × 0.5) = 0.989. Roger Barlow
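The medical-diagnosis numbers as a tiny sketch (the function name here is just illustrative):

```python
# Bayes' theorem with P(symptom|disease) = 0.9, P(symptom|no disease) = 0.01
# and prior P(disease) = 0.5.
def bayes(p_b_given_a, p_a, p_b_given_not_a):
    p_b = p_b_given_a * p_a + p_b_given_not_a * (1.0 - p_a)
    return p_b_given_a * p_a / p_b

print(bayes(0.9, 0.5, 0.01))   # 0.989...
```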

  18. Bayes’ Theorem, “Bayesian” form: P(Theory|Data) = P(Data|Theory) P(Theory) / P(Data). “Theory” may be an event (e.g. rain tomorrow) or a parameter value (e.g. the Higgs mass): then P(Theory) is a function. P(MH|Data) = P(Data|MH) P(MH) / P(Data), where P(MH) is the prior and P(MH|Data) is the posterior. Roger Barlow

  19. Measurements: Bayes at work. Result value x, theoretical ‘true’ value μ: P(μ|x) ∝ P(x|μ) P(μ). The prior P(μ) is generally taken as uniform (ignore normalisation problems). This constructs a theory of measurements: the prior of the second measurement is the posterior of the first. P(x|μ) is often Gaussian, but can be anything (Poisson, etc.). For a Gaussian measurement and a uniform prior, you get a Gaussian posterior. Roger Barlow
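A numerical sketch of this with made-up numbers (a measurement x = 5.0 with Gaussian σ = 1.0 and a uniform prior on a grid): the posterior comes out Gaussian with the same mean and width.

```python
# Posterior P(mu|x) ∝ P(x|mu) * P(mu) on a grid, for a single Gaussian
# measurement and a uniform prior (illustrative numbers).
import numpy as np

x, sigma = 5.0, 1.0
mu = np.linspace(x - 8 * sigma, x + 8 * sigma, 4001)
likelihood = np.exp(-0.5 * ((x - mu) / sigma) ** 2)
prior = np.ones_like(mu)                   # uniform prior
posterior = likelihood * prior
posterior /= np.trapz(posterior, mu)       # normalise

mean = np.trapz(mu * posterior, mu)
width = np.sqrt(np.trapz((mu - mean) ** 2 * posterior, mu))
print(mean, width)                         # ~5.0 and ~1.0: a Gaussian posterior
```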

  20. Aside: “Objective” Bayesian statistics • An attempt to lay down rules for the choice of prior • ‘Uniform’ is not enough. Uniform in what? • Suggestion (Jeffreys): uniform in a variable for which the expected Fisher information −⟨d²ln L/dμ²⟩ is constant (statisticians call this a ‘flat prior’). • Has not met with general agreement – different measurements of the same quantity have different objective priors Roger Barlow

  21. Pause for breath For Gaussian measurements of quantities with no constraints/objective prior knowledge the same results are given by: • Frequentist confidence intervals • Bayesian posteriors from uniform priors A frequentist and a simple Bayesian will report the same outcome from the same raw data, except one will say ‘confidence’ and the other ‘probability’. Roger Barlow

  22. Bayesian limits from small number counts. P(r; μ) = e^−μ μ^r / r!. With a uniform prior this gives the posterior P(μ), shown for various small observed r (r = 0, 1, 2, 6). Read off intervals from the posterior. Roger Barlow

  23. Upper limits. The upper limit from n events: ∫(0 to μHI) e^−μ μ^n / n! dμ = CL. Repeated integration by parts gives Σ(r=0..n) e^−μHI μHI^r / r! = 1 − CL. Same as the frequentist limit. This is a coincidence! The lower-limit formula is not the same. Roger Barlow
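A sketch of the Bayesian upper limit (assuming scipy): with a uniform prior the posterior is a Gamma(n+1) density, so the integral condition can be inverted with the regularised incomplete gamma function, and the numbers match the frequentist table above.

```python
# With a uniform prior the posterior for mu given n events is e^-mu mu^n / n!,
# a Gamma(n+1) density, so the integral condition is inverted directly.
from scipy.special import gammaincinv

def bayesian_upper_limit(n, cl):
    return gammaincinv(n + 1, cl)

print(bayesian_upper_limit(0, 0.95))   # 3.00 -- matches the frequentist table
print(bayesian_upper_limit(3, 0.90))   # 6.68
```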

  24. Result depends on the Prior. Example: the 90% CL limit from 0 events. A prior flat in μ gives 2.30; a prior flat in a different function of μ gives 1.65. Roger Barlow
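A sketch of the prior dependence for n = 0. The flat-in-μ prior reproduces 2.30; the second prior below (flat in √μ) is only an illustrative alternative, not necessarily the one shown graphically on the slide, so its limit differs from 1.65.

```python
# 90% limits on mu from n = 0 events for two priors.
import numpy as np

mu = np.linspace(1e-6, 20.0, 200001)
likelihood = np.exp(-mu)                      # Poisson likelihood for n = 0

for name, prior in [("flat in mu", np.ones_like(mu)),
                    ("flat in sqrt(mu)", 1.0 / np.sqrt(mu))]:
    post = likelihood * prior
    post /= np.trapz(post, mu)
    cdf = np.cumsum(post) * (mu[1] - mu[0])
    print(name, round(mu[np.searchsorted(cdf, 0.90)], 2))
```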

  25. Which is right? • The Bayesian method is generally easier, conceptually and in practice • The frequentist method is truly objective. Bayesian probability is a personal ‘degree of belief’. This does not worry biologists but should worry physicists. • Ambiguity appears in Bayesian results, as different priors give different answers, though with enough data these differences vanish • Checking for ‘robustness under change of prior’ is a standard statistical technique, generally ignored by physicists • ‘Uniform priors’ is not a sufficient answer. Uniform in what? Roger Barlow

  26. Problems for Frequentists: add a background. μ = S + b. Frequentist method (b known, μ measured, S wanted): • Find the range for μ • Subtract b to get the range for S. Examples: See 5 events, background 1.2. 95% upper limit: 10.5 → 9.3 ✓ See 5 events, background 5.1. 95% upper limit: 10.5 → 5.4 ? See 5 events, background 10.6. 95% upper limit: 10.5 → −0.1 ✗ Roger Barlow
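A sketch of these three cases (assuming scipy): compute the 95% Poisson upper limit for n = 5 and subtract the known background.

```python
# 95% Poisson upper limit for n = 5 observed events, minus the background b.
from scipy.optimize import brentq
from scipy.stats import poisson

n, cl = 5, 0.95
mu_hi = brentq(lambda mu: poisson.cdf(n, mu) - (1.0 - cl), 1e-9, 100.0)  # ~10.51
for b in (1.2, 5.1, 10.6):
    print(f"b = {b:5.1f}:  S < {mu_hi - b:5.1f}")   # 9.3, 5.4 and -0.1
```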

  27. S< -0.1? What’s going on? If N<b we know that there is a downward fluctuation in the background. (Which happens…) But there is no way of incorporating this information without messing up the ensemble Really strict frequentist procedure is to go ahead and publish. • We know that 5% of 95%CL statements are wrong – this is one of them • Suppressing this publication will bias the global results Roger Barlow

  28. Similar problems • Expected number of events must be non-negative • Mass of an object must be non-negative • Mass-squared of an object must be non-negative • Higgs mass from EW fits must be bigger than LEP2 limit of 114 GeV 3 Solutions • Publish a ‘clearly crazy’ result • Use Feldman-Cousins technique • Switch to Bayesian analysis Roger Barlow

  29. μ = S + b for Bayesians • No problem! • The prior for μ is uniform for μ ≥ b (i.e. S ≥ 0) • Multiply and normalise as before: posterior ∝ likelihood × prior • Read off confidence levels by integrating the posterior. Roger Barlow
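A numerical sketch of this (grid integration, assuming numpy and scipy; grid range and spacing are illustrative choices): the posterior for S is the Poisson likelihood at μ = S + b times a prior that is uniform for S ≥ 0 and zero below. Unlike the frequentist subtraction above, the limit stays positive even when b > n.

```python
# Bayesian upper limit on S for n observed events on a known background b:
# posterior(S) ∝ e^-(S+b) (S+b)^n / n! for S >= 0, zero below.
import numpy as np
from scipy.stats import poisson

def bayes_upper_limit(n, b, cl=0.95, s_max=50.0, npts=5001):
    s = np.linspace(0.0, s_max, npts)
    post = poisson.pmf(n, s + b)              # likelihood x uniform prior, S >= 0
    post /= np.trapz(post, s)
    cdf = np.cumsum(post) * (s[1] - s[0])
    return s[np.searchsorted(cdf, cl)]

for b in (1.2, 5.1, 10.6):
    print(b, round(bayes_upper_limit(5, b), 2))   # limits stay positive
```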

  30. A Frequentist Solution*: the Feldman-Cousins Method. Works by attacking what looks like a different problem... (* by Feldman and Cousins, mostly) Roger Barlow

  31. Feldman Cousins: μ = s + b. b is known, N is measured, s is what we're after. Deciding after looking at the data whether to quote a 1-sided 90% or a 2-sided 90% interval is called 'flip-flopping', and it is BAD because it wrecks the whole design of the Confidence Belt. Suggested solution: 1) Construct belts at the chosen CL as before. 2) Find a new ranking strategy to determine what's inside and what's outside. Roger Barlow

  32. Feldman Cousins: Ranking. First idea (almost right): sum/integrate over the range of N values with the highest probabilities for this s and b (this has the advantage of giving the shortest interval). Glitch: suppose N is small (a low fluctuation). Then P(N; s+b) will be small for any s, and N never gets counted. Roger Barlow

  33. How it works. Instead: compare P(N; s+b) to the 'best' probability for this N, at s = N − b (or s = 0 if N < b), and rank on that ratio. This has to be computed for the appropriate value of the background b. (Sounds complicated, but there is lots of software around.) As n increases, the interval flips from a 1-sided to a 2-sided limit, and the probability of being in the belt is preserved. This means that sensible 1-sided limits are quoted instead of nonsensical 2-sided limits! Roger Barlow
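A compact sketch of the construction (not the authors' own code; grid range and step size are arbitrary illustrative choices): for each trial s, observations are ranked by R = P(n; s+b)/P(n; s_best+b) with s_best = max(0, n−b) and accepted in decreasing order of R until the coverage reaches the CL; the reported interval is the set of s whose acceptance region contains the observed count.

```python
# Feldman-Cousins construction for a Poisson signal s on a known background b.
import numpy as np
from scipy.stats import poisson

def fc_interval(n0, b, cl=0.90, s_max=30.0, ds=0.01, n_max=100):
    n = np.arange(n_max + 1)
    accepted_s = []
    for s in np.arange(0.0, s_max, ds):
        p = poisson.pmf(n, s + b)
        p_best = poisson.pmf(n, np.maximum(n - b, 0.0) + b)
        order = np.argsort(p / p_best)[::-1]      # highest likelihood ratio first
        coverage, accept = 0.0, set()
        for i in order:
            accept.add(int(n[i]))
            coverage += p[i]
            if coverage >= cl:
                break
        if n0 in accept:
            accepted_s.append(s)
    return min(accepted_s), max(accepted_s)

print(fc_interval(0, b=3.0))   # sensible upper limit even for n = 0 with b > n
```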

  34. Arguments against using Feldman Cousins • Argument 1 It takes control out of hands of physicist. You might want to quote a 2 sided limit for an expected process, an upper limit for something weird • Counter argument: This is the virtue of the method. This control invalidates the conventional technique. In rare cases it is permissible to say ”We set a 2 sided limit, but we're not claiming a signal” Roger Barlow

  35. Feldman Cousins: Argument 2 • Argument 2: If zero events are observed by two experiments, the one with the higher background b will quote the lower limit. This is unfair to hardworking physicists. • Counterargument: An experiment with a higher background has to be ‘lucky’ to get zero events. Luckier experiments will always quote better limits. Averaging over luck, lower values of b get lower limits to report. Example: you reward a good student with a lottery ticket which has a 10% chance of winning £1000. A moderate student gets a ticket with a 1% chance of winning £2000. They both win. Were you unfair? Roger Barlow

  36. Including Systematic Errors. μ = a S + b, where μ is the predicted number of events; S is the (unknown) signal source strength, probably a cross section or branching ratio or decay rate; a is an acceptance/luminosity factor, known with some (systematic) error; b is the background rate, known with some (systematic) error. Roger Barlow

  37. Full Bayesian. Assume priors • for S (uniform?) • for a (Gaussian?) • for b (Poisson or Gaussian?) Write down the posterior P(S, a, b). Integrate over all a, b to get the marginalised P(S). Read off the desired limits by integration. Roger Barlow
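A grid-integration sketch of this recipe (all input values below are made up for illustration): uniform prior on S ≥ 0, Gaussian priors on a and b, marginalise numerically and integrate the posterior for the limit.

```python
# mu = a*S + b with a uniform prior on S >= 0 and Gaussian priors on a and b.
# The nuisance parameters are integrated out on a grid to give P(S|n).
import numpy as np
from scipy.stats import norm, poisson

n_obs = 5
a0, sigma_a = 0.5, 0.05      # acceptance factor and its systematic error (illustrative)
b0, sigma_b = 1.2, 0.3       # background and its systematic error (illustrative)

S = np.linspace(0.0, 40.0, 401)
a = np.linspace(a0 - 4 * sigma_a, a0 + 4 * sigma_a, 41)
b = np.linspace(max(0.0, b0 - 4 * sigma_b), b0 + 4 * sigma_b, 41)

Sg, ag, bg = np.meshgrid(S, a, b, indexing="ij")
integrand = (poisson.pmf(n_obs, ag * Sg + bg)
             * norm.pdf(ag, a0, sigma_a) * norm.pdf(bg, b0, sigma_b))

posterior = np.trapz(np.trapz(integrand, b, axis=2), a, axis=1)
posterior /= np.trapz(posterior, S)

cdf = np.cumsum(posterior) * (S[1] - S[0])
print("95% upper limit on S:", round(S[np.searchsorted(cdf, 0.95)], 1))
```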

  38. Hybrid Bayesian. Assume priors • for a (Gaussian?) • for b (Poisson or Gaussian?) Integrate over all a, b to get the marginalised P(r; S). Read off the desired limits from Σ(r=0..n) P(r; S) = 1 − CL, etc. Done approximately for small errors (Cousins and Highland): shows that the limits are pretty insensitive to σa, σb. Done numerically for general errors (RB: Java applet on the SLAC web page), which includes 3 priors (for a) that give slightly different results. Roger Barlow

  39. Etcetera… • Extend Feldman Cousins • Profile Likelihood: use P(S) = P(n; S, amax, bmax), where amax, bmax give the maximum for this S and n • Empirical Bayes • And more… Roger Barlow

  40. Summary • Straight Frequentist approach is objective and clean but sometimes gives ‘crazy’ results • Bayesian approach useful but has problems. Check for robustness under choice of prior • Feldman-Cousins deserves more widespread adoption • Lots of work still going on • This will all be needed at the LHC But always know what you are doing and say what you are doing. Roger Barlow
