Results from small numbers. Roger Barlow. Manchester Bohr Lunch, 16th October 2009.
The basic equation: N = σ A η L + B, i.e. observed events = cross section × acceptance × efficiency × integrated luminosity, plus background.
Equation in use: solve for the cross section, σ = (N − B) / (A η L).
With errors. Example: see 100 events, expected background 5.5, A = 0.5, η = 0.1, L = 10 fb⁻¹. Then σ = (100 − 5.5)/(0.5 × 0.1 × 10 fb⁻¹) = 189 fb, with statistical error √N/(AηL) = √100/(0.5 fb⁻¹) = 20 fb, so σ = 189 ± 20 fb. Systematic errors come from the uncertainties in A, η, L and B.
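A minimal sketch of that arithmetic in Python (the variable names are mine, not the talk's):

```python
# The arithmetic on this slide, spelled out.
import math

N, B = 100, 5.5               # observed events, expected background
A, eta, L = 0.5, 0.1, 10.0    # acceptance, efficiency, luminosity [fb^-1]

sigma = (N - B) / (A * eta * L)        # cross section [fb]
stat = math.sqrt(N) / (A * eta * L)    # Poisson error on N, propagated
print(f"sigma = {sigma:.0f} +/- {stat:.0f} fb")   # sigma = 189 +/- 20 fb
```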
Why this talk? This treatment assumes N >> 0 and N >> B. In real life one or the other (or both) may not be true, particularly if you are looking for a signal that is not there. N=1, B=0.0: quote 1 ± 1 ? N=100, B=101.0: quote −1 ± 10 ? N=0, B=1.1: quote −1.1 ± 0 ????????
Step back: definition of probability. Frequentist: P(A) is the limit of a frequency over an ensemble of everything, P(A) = lim_{N→∞} N(A)/N. P(A) depends not just on A but on the ensemble, which must be specified. Sometimes this is not possible.
Events with no ensemble. Some events are unique. Consider “It will probably rain tomorrow.” There is only one tomorrow (Saturday 17th October). There is NO ensemble. P(rain) is either 0/1 = 0 or 1/1 = 1. Strict frequentists cannot say 'It will probably rain tomorrow'. This presents severe social problems.
Circumventing the limitation. Assemble an ensemble of statements of which 70% (say) are true (e.g. weather forecasts with a verified track record). A frequentist can say: “The statement ‘It will rain tomorrow’ has a 70% probability of being true.” It will rain tomorrow - with 70% confidence. For unique events, confidence level statements replace probability statements.
Measurement and frequentist probability. MT = 171 ± 2 GeV. Is there a 68% probability that MT lies between 169 and 173 GeV? No. MT is unique: it is either in the range or outside. (Soon we’ll know.) But the interval m ± 2 GeV does bracket the true value 68% of the time: the statement ‘MT lies between 169 and 173 GeV’ has a 68% probability of being true. MT lies between 169 and 173 GeV with 68% confidence.
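A quick Monte Carlo makes the frequentist reading concrete; a sketch using the slide's example numbers (true mass 171 GeV, resolution 2 GeV) as assumed inputs:

```python
# Sketch: for a Gaussian measurement x of a fixed true value, the
# interval [x - sigma, x + sigma] brackets the truth 68% of the time.
import numpy as np

rng = np.random.default_rng(1)
mt_true, sigma = 171.0, 2.0                  # the slide's example numbers
x = rng.normal(mt_true, sigma, 1_000_000)    # ensemble of measurements
covered = (x - sigma <= mt_true) & (mt_true <= x + sigma)
print(covered.mean())                        # ~0.683
```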
Frequentist confidence intervals beyond the simple Gaussian. Select a confidence level CL and a strategy. From P(x; μ) choose a construction (functions x1(μ), x2(μ)) for which P(x ∈ [x1(μ), x2(μ)]) ≥ CL for all μ. Given a measurement X, make the statement μ ∈ [μLO, μHI] @ CL, where X = x2(μLO) and X = x1(μHI). (Neyman technique)
Confidence belt. Constructed horizontally, such that the probability of a result lying inside the belt is 68% (or whatever CL you choose); read vertically, using the measurement. Example: proportional Gaussian, σ = 0.1 μ (measures with 10% accuracy). Result (say) 100.0 gives μLO = 90.91, μHI = 111.1.
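Inverting the belt edges reproduces the numbers on the slide; a minimal sketch, assuming the 68% central belt x1(μ) = 0.9μ, x2(μ) = 1.1μ (my parametrisation of "measures with 10% accuracy"):

```python
# Reading the belt vertically at the measurement X inverts the belt
# edges x1(mu) = 0.9*mu and x2(mu) = 1.1*mu (the 68% central belt for
# a Gaussian with sigma = 0.1*mu).
X = 100.0
mu_lo = X / 1.1    # X = x2(mu_lo)  ->  90.91
mu_hi = X / 0.9    # X = x1(mu_hi)  ->  111.11
print(f"[{mu_lo:.2f}, {mu_hi:.1f}] @ 68% CL")
```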
Strategies: • Two-sided central: (1−CL)/2 above and (1−CL)/2 below • Upper limit: 1−CL below • Lower limit: 1−CL above
Small number counting experiments. Poisson distribution: P(r; μ) = e^{−μ} μ^r / r!. For large μ you can use the Gaussian approximation, but not for small μ. Choose a CL; just use one curve to give an upper limit. The discrete observable makes smooth curves into ugly staircases. Observe n; quote the upper limit μHI from solving Σ_{r=0}^{n} P(r; μHI) = Σ_{r=0}^{n} e^{−μHI} μHI^r / r! = 1 − CL. Translation: n is small, so μ can’t be very large. If the true value is μHI (or higher), then the chance of a result this small (or smaller) is only 1 − CL (or less).
Poisson table: upper limits
n     90%     95%     99%
0     2.30    3.00    4.61
1     3.89    4.74    6.64
2     5.32    6.30    8.41
3     6.68    7.75    10.05
4     7.99    9.15    11.60
5     9.27    10.51   13.11
.....
Notice! If you see nothing, you have set a limit of 3.0 @ 95% CL.
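The table can be reproduced by solving the defining equation numerically; a sketch using SciPy (the bracketing interval and rounding are my choices):

```python
# Reproduce the table above: solve
# sum_{r=0}^{n} exp(-mu) mu^r / r! = 1 - CL for mu.
from scipy.optimize import brentq
from scipy.stats import poisson

def upper_limit(n, cl):
    # poisson.cdf(n, mu) is the left-hand sum; find mu where it equals 1-CL
    return brentq(lambda mu: poisson.cdf(n, mu) - (1 - cl), 1e-9, 100.0)

for n in range(6):
    print(n, [round(upper_limit(n, cl), 2) for cl in (0.90, 0.95, 0.99)])
# n=0 -> [2.30, 3.00, 4.61], n=1 -> [3.89, 4.74, 6.64], ...
```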
Pause to collect thoughts • N >> 0 requirement cracked • N >> B requirement not solved • Suppose you see 0 events and have an expected background of: • B = 0.1 – report limit 2.9 ? • B = 2.9 – report limit 0.1 ?? • B = 3.1 – report limit −0.1 ????
Alternative definition: Bayesian (subjective) probability. P(A) is a number describing my degree of belief in A: 1 = certain belief, 0 = total disbelief. Can be calibrated against simple classical probabilities. P(A) = 0.5 means: I would be indifferent given the choice of betting on A or betting on a coin toss. A can be anything: death, rain, horse races, existence of SUSY… Very adaptable. But there is no guarantee my P(A) is the same as your P(A). Subjective = unscientific?
Bayes’ theorem, general (uncontroversial) form: P(A|B) P(B) = P(A & B) = P(B|A) P(A), so P(A|B) = P(B|A) P(A) / P(B). P(B) can be written P(B|A) P(A) + P(B|not A) (1 − P(A)). Examples: People: P(Artist|Beard) = P(Beard|Artist) P(Artist) / P(Beard). π/K Cherenkov counter: P(π|signal) = P(signal|π) P(π) / P(signal). Medical diagnosis: P(disease|symptom) = P(symptom|disease) P(disease) / P(symptom) = 0.9 × 0.5 / (0.9 × 0.5 + 0.01 × 0.5) = 0.989.
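The diagnosis arithmetic, spelled out; a trivial sketch (the 0.9, 0.01 and 0.5 inputs are the slide's):

```python
# P(disease|symptom) = P(symptom|disease) P(disease) / P(symptom),
# with P(symptom) expanded over disease / no disease.
p_sym_given_dis = 0.9
p_sym_given_healthy = 0.01
p_dis = 0.5

p_sym = p_sym_given_dis * p_dis + p_sym_given_healthy * (1 - p_dis)
print(p_sym_given_dis * p_dis / p_sym)   # 0.989
```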
Bayes’ theorem, “Bayesian” form: P(Theory|Data) = P(Data|Theory) P(Theory) / P(Data). “Theory” may be an event (e.g. rain tomorrow) or a parameter value (e.g. the Higgs mass); then P(Theory) is a function: P(MH|Data) = P(Data|MH) P(MH) / P(Data). P(MH) is the prior; P(MH|Data) is the posterior.
Measurements: Bayes at work. Result value x, theoretical ‘true’ value μ: P(μ|x) ∝ P(x|μ) P(μ). The prior is generally taken as uniform; ignore normalisation problems. Construct a theory of measurements: the prior of the second measurement is the posterior of the first. P(x|μ) is often Gaussian, but can be anything (Poisson, etc.). For a Gaussian measurement and a uniform prior, you get a Gaussian posterior.
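A grid-based sketch of that last statement (the numbers are mine): multiply a Gaussian likelihood by a flat prior and the posterior in μ is the same Gaussian, centred on the measurement:

```python
# Gaussian likelihood P(x|mu) times a uniform prior gives a Gaussian
# posterior P(mu|x), checked numerically on a grid.
import numpy as np
from scipy.stats import norm

x_meas, sigma = 100.0, 10.0
mu = np.linspace(50.0, 150.0, 2001)            # grid of 'true' values
post = norm.pdf(x_meas, loc=mu, scale=sigma)   # flat prior: posterior = likelihood shape
post /= post.sum() * (mu[1] - mu[0])           # normalise

mean = (mu * post).sum() * (mu[1] - mu[0])
sd = np.sqrt(((mu - mean) ** 2 * post).sum() * (mu[1] - mu[0]))
print(mean, sd)   # ~100 and ~10: Gauss(x_meas, sigma) in mu
```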
Aside: “objective” Bayesian statistics • An attempt to lay down rules for the choice of prior • ‘Uniform’ is not enough: uniform in what? • Suggestion (Jeffreys): uniform in a variable for which the expected Fisher information ⟨−d²ln L/dθ²⟩ is constant (statisticians call this a ‘flat prior’). • Has not met with general agreement – different measurements of the same quantity have different objective priors
Pause for breath. For Gaussian measurements of quantities with no constraints/objective prior knowledge, the same results are given by: • Frequentist confidence intervals • Bayesian posteriors from uniform priors A frequentist and a simple Bayesian will report the same outcome from the same raw data, except one will say ‘confidence’ and the other ‘probability’.
Bayesian limits from small number counts. P(r; μ) = e^{−μ} μ^r / r!. With a uniform prior this gives the posterior for μ, shown for various small r results. Read off intervals... [Figure: posterior P(μ) versus μ for r = 0, 1, 2, 6]
Upper limits. Upper limit from n events: ∫₀^{μHI} e^{−μ} μ^n / n! dμ = CL. Repeated integration by parts: Σ_{r=0}^{n} e^{−μHI} μHI^r / r! = 1 − CL. Same as the frequentist limit. This is a coincidence! The lower limit formula is not the same.
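The coincidence is easy to check numerically: with a flat prior the posterior from n events is a Gamma(n+1) density, so its CL quantile is the Bayesian upper limit. A sketch (my check, not the talk's):

```python
# Compare the Bayesian flat-prior upper limit (Gamma(n+1) quantile)
# with the frequentist Poisson limit at 90% CL.
from scipy.optimize import brentq
from scipy.stats import gamma, poisson

for n in range(4):
    bayes = gamma.ppf(0.90, n + 1)
    freq = brentq(lambda mu: poisson.cdf(n, mu) - 0.10, 1e-9, 100.0)
    print(n, round(bayes, 2), round(freq, 2))   # identical: 2.30, 3.89, ...
```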
Result depends on the prior. Example: 90% CL limit from 0 events. Prior flat in μ: 2.30. Prior flat in √μ: 1.65. [Figure: prior × likelihood = posterior for each choice of prior]
Which is right? • The Bayesian method is generally easier, conceptually and in practice • The frequentist method is truly objective. Bayesian probability is a personal ‘degree of belief’. This does not worry biologists but should worry physicists • Ambiguity appears in Bayesian results, as different priors give different answers, though with enough data these differences vanish • Checking for ‘robustness under change of prior’ is a standard statistical technique, generally ignored by physicists • ‘Uniform priors’ is not a sufficient answer. Uniform in what?
Problems for frequentists: add a background. m = S + b. Frequentist method (b known, m measured, S wanted): • Find the range for m • Subtract b to get the range for S Examples: See 5 events, background 1.2: 95% upper limit 10.5, giving S < 9.3. See 5 events, background 5.1: 95% upper limit 10.5, giving S < 5.4 ? See 5 events, background 10.6: 95% upper limit 10.5, giving S < −0.1
S < −0.1? What’s going on? If N < b we know that there is a downward fluctuation in the background. (Which happens…) But there is no way of incorporating this information without messing up the ensemble. The really strict frequentist procedure is to go ahead and publish. • We know that 5% of 95% CL statements are wrong – this is one of them • Suppressing this publication will bias the global results
Similar problems • The expected number of events must be non-negative • The mass of an object must be non-negative • The mass-squared of an object must be non-negative • The Higgs mass from EW fits must be bigger than the LEP2 limit of 114 GeV Three solutions: • Publish a ‘clearly crazy’ result • Use the Feldman-Cousins technique • Switch to Bayesian analysis
m = S + b for Bayesians • No problem! • The prior for m is uniform for S ≥ 0, i.e. for m ≥ b • Multiply and normalise as before: Posterior ∝ Likelihood × Prior • Read off confidence levels by integrating the posterior
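A numerical sketch of this recipe (the grid, step size and the reuse of the earlier n = 5 examples are my choices):

```python
# Flat prior in S >= 0, Poisson likelihood P(n | S + b); integrate the
# normalised posterior up to the CL to get the upper limit on S.
import numpy as np
from scipy.stats import poisson

def bayes_upper_limit(n, b, cl=0.95):
    s = np.linspace(0.0, 50.0, 200001)   # grid over the signal strength
    post = poisson.pmf(n, s + b)         # likelihood x flat prior
    cdf = np.cumsum(post)
    cdf /= cdf[-1]                       # normalise the posterior
    return s[np.searchsorted(cdf, cl)]

# The n < b cases that embarrassed the frequentist now behave sensibly:
for b in (1.2, 5.1, 10.6):
    print(b, round(bayes_upper_limit(5, b), 2))
```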
A frequentist solution: the Feldman-Cousins method*. Works by attacking what looks like a different problem... (* by Feldman and Cousins, mostly)
Feldman-Cousins: m = s + b. b is known, N is measured, s is what we're after. Choosing after seeing the data whether to quote an upper limit (result looks small) or a two-sided interval (result looks like a signal) is called 'flip-flopping', and it is BAD because it wrecks the whole design of the confidence belt. Suggested solution: 1) Construct belts at the chosen CL as before. 2) Find a new ranking strategy to determine what's inside and what's outside. [Figure: 1-sided 90% and 2-sided 90% belts]
Feldman-Cousins: ranking. First idea (almost right): sum/integrate over the range of N values with the highest probabilities for this s and b (with the advantage that this gives the shortest interval). Glitch: suppose N is small (a low fluctuation). P(N; s+b) will be small for any s, and that N never gets counted. [Figure: belt in the (N, s) plane]
How it works. Instead: compare to the 'best' probability for this N, at s = N − b (or s = 0 if N < b), and rank on that ratio. This has to be computed for the appropriate value of the background b. (Sounds complicated, but there is lots of software around.) As n increases, the interval flips from a 1-sided to a 2-sided limit – and the probability of being in the belt is preserved. This means that sensible 1-sided limits are quoted instead of nonsensical 2-sided limits!
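A compact sketch of the construction (grid ranges, step sizes and the lack of tie handling are simplifications of mine):

```python
# Feldman-Cousins for Poisson + known background: for each s, rank N by
# R = P(N; s+b) / P(N; s_best+b) with s_best = max(0, N-b), accept the
# best-ranked N until coverage >= CL, then read the interval off
# vertically at the observed N.
import numpy as np
from scipy.stats import poisson

def fc_interval(n_obs, b, cl=0.90, s_max=15.0, ds=0.005, n_max=50):
    n = np.arange(n_max)
    accepted = []
    for s in np.arange(0.0, s_max, ds):
        p = poisson.pmf(n, s + b)
        s_best = np.maximum(n - b, 0.0)
        r = p / poisson.pmf(n, s_best + b)     # likelihood-ratio ranking
        order = np.argsort(r)[::-1]            # best-ranked first
        k = np.searchsorted(np.cumsum(p[order]), cl) + 1
        if n_obs in n[order[:k]]:
            accepted.append(s)
    return (min(accepted), max(accepted)) if accepted else (0.0, 0.0)

print(fc_interval(0, b=3.0))   # roughly (0, 1.08), cf. Feldman & Cousins
```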
Arguments against using Feldman-Cousins. Argument 1: it takes control out of the hands of the physicist. You might want to quote a 2-sided limit for an expected process, an upper limit for something weird. Counter-argument: this is the virtue of the method. That control invalidates the conventional technique. In rare cases it is permissible to say “We set a 2-sided limit, but we're not claiming a signal.”
Feldman-Cousins: Argument 2. If zero events are observed by two experiments, the one with the higher background b will quote the lower limit. This is unfair to hardworking physicists. Counterargument: an experiment with higher background has to be ‘lucky’ to get zero events. Luckier experiments will always quote better limits. Averaging over luck, lower values of b get lower limits to report. Example: you reward a good student with a lottery ticket which has a 10% chance of winning £1000. A moderate student gets a ticket with a 1% chance of winning £2000. They both win. Were you unfair?
Including systematic errors: m = aS + b. m is the predicted number of events. S is the (unknown) signal source strength, probably a cross section or branching ratio or decay rate. a is an acceptance/luminosity factor, known with some (systematic) error. b is the background rate, known with some (systematic) error.
Full Bayesian. Assume priors • for S (uniform?) • for a (Gaussian?) • for b (Poisson or Gaussian?) Write down the posterior P(S, a, b). Integrate over all a, b to get the marginalised P(S). Read off the desired limits by integration.
Hybrid Bayesian. Assume priors • for a (Gaussian?) • for b (Poisson or Gaussian?) Integrate over all a, b to get the marginalised P(r; S). Read off the desired limits from Σ_{r=0}^{n} P(r; S) = 1 − CL etc. Done approximately for small errors (Cousins and Highland); shows that the limits are pretty insensitive to σa, σb. Done numerically for general errors (RB: java applet on SLAC web page); includes 3 priors (for a) that give slightly different results.
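A Monte Carlo sketch in the Cousins-Highland spirit (not the applet's algorithm; the n, b, acceptance mean and σa values here are made up):

```python
# Smear the Poisson probability over a Gaussian prior for the
# acceptance a, then set the upper limit from the smeared
# sum_{r<=n} P(r; S) = 1 - CL, as in the pure-Poisson case.
import numpy as np
from scipy.optimize import brentq
from scipy.stats import poisson

rng = np.random.default_rng(2)
a = np.clip(rng.normal(1.0, 0.1, 20000), 0.0, None)  # acceptance draws, truncated at 0

def smeared_cdf(n, S, b):
    return poisson.cdf(n, a * S + b).mean()           # marginalise over a

n, b = 3, 0.5
limit = brentq(lambda S: smeared_cdf(n, S, b) - 0.10, 1e-6, 50.0)
print(round(limit, 2))   # slightly above the no-smearing 90% limit
```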
Etcetera… • Extend Feldman-Cousins • Profile likelihood: use P(S) = P(n; S, amax, bmax), where amax, bmax maximise the likelihood for this S and n (sketched below) • Empirical Bayes • And more…
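A rough sketch of the profile-likelihood idea, under assumed Gaussian constraint terms for the nuisance parameters (all numbers and names here are illustrative, not from the talk):

```python
# For each S, maximise the likelihood over the nuisances a and b
# (with assumed Gaussian constraints); scan the profiled curve for a limit.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm, poisson

n = 5
a0, sig_a = 0.5, 0.05   # measured acceptance and its error (made up)
b0, sig_b = 1.2, 0.3    # estimated background and its error (made up)

def nll(params, S):
    a, b = params
    if a <= 0.0 or b <= 0.0:
        return 1e9                      # keep the optimiser in bounds
    return (-poisson.logpmf(n, a * S + b)
            - norm.logpdf(a, a0, sig_a)
            - norm.logpdf(b, b0, sig_b))

def profile_nll(S):
    return minimize(nll, x0=[a0, b0], args=(S,), method="Nelder-Mead").fun

scan = np.arange(0.0, 40.0, 0.5)
dll = 2.0 * (np.array([profile_nll(S) for S in scan]) - min(profile_nll(S) for S in scan))
imin = dll.argmin()
up = np.argmax(dll[imin:] > 2.71) + imin   # first crossing above the minimum
print(scan[up])                            # crude 90% upper limit on S
```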
Summary • The straight frequentist approach is objective and clean but sometimes gives ‘crazy’ results • The Bayesian approach is useful but has problems. Check for robustness under choice of prior • Feldman-Cousins deserves more widespread adoption • Lots of work is still going on • This will all be needed at the LHC But always know what you are doing and say what you are doing.