An Answer and a Question
Limits: Combining 2 results
Significance: Does Δχ² give χ²?
Roger Barlow, BIRS meeting, July 2006
Revisit s+b
• Calculator (used in BaBar) based on Cousins and Highland: frequentist for s, Bayesian integration for ε and b
• See http://www.slac.stanford.edu/~barlow/java/statistics2.html and C.P.C. 149 (2002) 97
• 3 different priors (uniform in ε, 1/ε, ln ε)
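A minimal numerical sketch of this kind of calculation (not the BaBar/Java calculator itself, and using a single truncated-Gaussian prior for ε rather than the three priors listed above): frequentist in s, with the efficiency integrated out by Monte Carlo.

```python
# Sketch of a Cousins-Highland style upper limit: frequentist construction in s,
# with the efficiency eps averaged over a (truncated Gaussian) prior by Monte Carlo.
# Illustration only; numbers and the prior choice are assumptions.
import numpy as np
from scipy.stats import poisson
from scipy.optimize import brentq

def p_le_nobs(s, n_obs, b, eps_mean, eps_sigma, n_mc=20000, seed=1):
    """P(N <= n_obs | s), averaged over the prior for the efficiency."""
    rng = np.random.default_rng(seed)           # fixed seed: deterministic in s
    eps = rng.normal(eps_mean, eps_sigma, n_mc)
    eps = eps[eps > 0]                          # crude truncation at zero
    return poisson.cdf(n_obs, eps * s + b).mean()

def upper_limit(n_obs, b, eps_mean, eps_sigma, cl=0.90):
    """Find s_up such that P(N <= n_obs | s_up) = 1 - CL."""
    f = lambda s: p_le_nobs(s, n_obs, b, eps_mean, eps_sigma) - (1.0 - cl)
    return brentq(f, 0.0, 1000.0)

print(upper_limit(n_obs=3, b=0.5, eps_mean=0.8, eps_sigma=0.1))
```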
Combining Limits?
With 2 measurements x = 1.1 ± 0.1 and x = 1.2 ± 0.1, the combination is obvious.
With 2 limits x < 1.1 @ 90% CL and x < 1.2 @ 90% CL, all we can say is x < 1.1 @ 90% CL.
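For the two measurements, the "obvious" combination is the inverse-variance weighted mean; a short check in Python, for illustration only:

```python
# Inverse-variance weighted combination of 1.1 +- 0.1 and 1.2 +- 0.1.
import numpy as np

x = np.array([1.1, 1.2])
sigma = np.array([0.1, 0.1])
w = 1.0 / sigma**2
x_comb = np.sum(w * x) / np.sum(w)          # weighted mean
sigma_comb = 1.0 / np.sqrt(np.sum(w))       # combined uncertainty
print(f"{x_comb:.3f} +- {sigma_comb:.3f}")  # 1.150 +- 0.071
```

No such simple rule exists for the two limits, which is the point of this slide.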
Frequentist problem
Given N1 events with efficiency ε1, background b1, and N2 events with efficiency ε2, background b2 (could be 2 experiments, or 2 channels in the same experiment).
For significance we need to calculate, given source strength s, the probability of the result {N1, N2} or less.
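Written out, the probability of a particular outcome {N1, N2} for a given source strength s is just the setup above in formula form, a product of two Poisson terms:

```latex
P(\{N_1,N_2\}\mid s) \;=\; \prod_{i=1}^{2}
  \frac{(\varepsilon_i s + b_i)^{N_i}}{N_i!}\;
  e^{-(\varepsilon_i s + b_i)}
```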
What does “Or less” mean?
Is (3,4) larger or smaller than (2,5)?
[Schematic: the (N1, N2) plane, with candidate “more” and “less” regions.]
Constraint
If ε1 = ε2 and b1 = b2 then N1 + N2 is sufficient, so one cannot just take the lower-left quadrant as ‘less’. (And the example given yesterday is trivial.)
Suggestion
• Could estimate s by maximising the log (Poisson) likelihood Σi [ −(εi s + bi) + Ni ln(εi s + bi) ], hence solving Σi [ Ni εi / (εi s + bi) − εi ] = 0
• Order results by the value of ŝ they give from solving this
• Easier than it looks. For a given {Ni} this quantity is monotonically decreasing in s. Solve once to get ŝdata, then explore s space generating many {Ni}: the sign of Σi [ Ni εi / (εi ŝdata + bi) − εi ] tells you whether that toy's estimated ŝ is greater or less than ŝdata (see the sketch below).
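An illustrative sketch of that ordering (not the actual calculator code; the efficiencies, backgrounds, counts and hypothesised s below are made-up numbers):

```python
# Rank outcomes {N_i} by the maximum-likelihood estimate s_hat, exploiting the
# monotonicity of the score: only the data need a root-find, toys only need the
# sign of the score evaluated at s_hat_data.
import numpy as np
from scipy.optimize import brentq

eps   = np.array([0.5, 0.3])   # efficiencies eps_i (assumed values)
bkg   = np.array([1.0, 2.0])   # expected backgrounds b_i (assumed values)
n_obs = np.array([4, 3])       # observed counts N_i (assumed values)

def score(s, n):
    """d(ln L)/ds = sum_i [ N_i eps_i / (eps_i s + b_i) - eps_i ],
    monotonically decreasing in s."""
    return np.sum(n * eps / (eps * s + bkg) - eps)

# Solve score(s) = 0 once for the observed data (s allowed slightly below 0,
# down to where a mean eps_i*s + b_i would turn negative).
lo = -np.min(bkg / eps) + 1e-6
s_hat_data = brentq(lambda s: score(s, n_obs), lo, 1e4)

# For a hypothesised true s, P("this result or less"):
# score_toy(s_hat_data) <= 0  <=>  the toy's s_hat is <= s_hat_data.
s_true = 2.0
rng = np.random.default_rng(0)
toys = rng.poisson(eps * s_true + bkg, size=(20000, 2))
p_less_or_equal = np.mean([score(s_hat_data, n) <= 0 for n in toys])
print(s_hat_data, p_less_or_equal)
```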
Message • This is implemented in the code – ‘Add experiment’ button (up to 10) • Comments as to whether this is useful are welcome
Significance
Analysis looking for bumps. Pure background gives χ²old of 60 for 37 dof (Prob 1%): not good, but not totally impossible. Fit to background + bump (4 new parameters) gives a better χ²new of 28.
Question: Is this significant? Answer: Yes.
Question: How much? Answer: Significance is √(χ²old − χ²new) = √(60 − 28) = 5.66
(Schematic only!! No reference to any experimental data, real or fictitious.)
Puzzle: How does a 3 sigma discrepancy become a 5 sigma discovery?
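The numbers behind the puzzle, for the figures quoted above (a quick scipy check; how the 1% tail translates into "about 3 sigma" depends on the one- vs two-sided convention):

```python
# Background-only fit: chi2 = 60 for 37 dof; bump fit: chi2 = 28.
from scipy.stats import chi2, norm
import numpy as np

p_flat = chi2.sf(60, 37)      # ~0.010, the "Prob 1%" quoted above
z_flat = norm.isf(p_flat)     # ~2.3 sigma (one-sided equivalent)
z_quoted = np.sqrt(60 - 28)   # ~5.66 sigma, the quoted significance
print(p_flat, z_flat, z_quoted)
```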
Justification?
• ‘We always do it this way’
• ‘Belle does it this way’
• ‘CLEO does it this way’
Possible Justification
Likelihood Ratio Test, a.k.a. Maximum Likelihood Ratio Test: if M1 and M2 are models with maximum likelihoods L1 and L2 for the data, then 2 ln(L2/L1) is distributed as a χ² with N1 − N2 degrees of freedom, provided that
• M2 contains M1
• Ns are large
• Errors are Gaussian
• Models are linear
Does it matter?
• Investigate with toy MC
• Generate with a uniform distribution in 100 bins, <events/bin> = 100. 100 is large, so Poisson is reasonably Gaussian
• Fit with
  • Uniform distribution (99 dof)
  • Linear distribution (98 dof)
  • Cubic (96 dof): a0 + a1 x + a2 x² + a3 x³
  • Flat + Gaussian (96 dof): a0 + a1 exp(−0.5 (x − a2)²/a3²)
The cubic is linear in its parameters; the Gaussian is not linear in a2 and a3. (A sketch of this toy study follows below.)
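A minimal sketch of one such toy ‘experiment’ (assumed details: bin centres on [0, 1), χ² fits with σi = √ni; repeating over many toys gives the probability distributions shown on the following slides):

```python
# One toy experiment: 100 bins, mean 100 events/bin, fitted with flat, cubic
# and flat+Gaussian models; p-values of the chi-square differences (3 dof each).
import numpy as np
from scipy.optimize import curve_fit
from scipy.stats import chi2

rng = np.random.default_rng(42)
nbins, mu = 100, 100.0
x = (np.arange(nbins) + 0.5) / nbins          # bin centres (assumed range [0, 1))
n = rng.poisson(mu, nbins).astype(float)      # one toy 'experiment'
sig = np.sqrt(n)                              # assumed per-bin errors

def chisq(model, p0):
    popt, _ = curve_fit(model, x, n, p0=p0, sigma=sig,
                        absolute_sigma=True, maxfev=5000)
    return np.sum(((n - model(x, *popt)) / sig) ** 2)

flat  = lambda x, a0: a0 + 0 * x
cubic = lambda x, a0, a1, a2, a3: a0 + a1 * x + a2 * x**2 + a3 * x**3
gauss = lambda x, a0, a1, a2, a3: a0 + a1 * np.exp(-0.5 * ((x - a2) / a3) ** 2)

c_flat  = chisq(flat,  [100.0])
c_cubic = chisq(cubic, [100.0, 0.0, 0.0, 0.0])
c_gauss = chisq(gauss, [100.0, 5.0, 0.5, 0.05])

print(chi2.sf(c_flat - c_cubic, 3), chi2.sf(c_flat - c_gauss, 3))
```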
[Figure: one toy ‘experiment’, with the flat, linear, cubic, and flat+Gaussian fits overlaid.]
Calculate χ² probabilities of the differences between models
• Compare linear and uniform models: 1 dof. Probability distribution is flat. Method OK.
• Compare flat+Gaussian and uniform models: 3 dof. Probability distribution is very unflat. Method invalid. The peak at low P corresponds to large Δχ², i.e. false claims of a significant signal.
• Compare cubic and uniform models: 3 dof. Probability distribution is flat. Method OK.
Not all parameters are equally useful
If 2 models have the same number of parameters and both contain the true model, one can still give better results than the other. This tells us nothing about the data.
[Plot: Δχ² for flat+Gaussian vs. cubic — same number of parameters, but flat+Gaussian tends to be lower.]
Conclude: Δχ² does not give χ²?
But surely…
• In the large-N limit, ln L is parabolic in the fitted parameters.
• Model 2 contains Model 1 with a2 = 0 etc., so expect ln L to increase by the equivalent of 3 units of χ².
Question: What is wrong with this argument? Not yet asymptotic? A different probability? Or is it right and the previous analysis is wrong?