Dive into practical statistics for particle physicists covering likelihoods, Bayesian methods, hypothesis testing, and parameter determination. Learn to combine experimental results and improve precision using theoretical input.
Practical Statistics for Particle Physicists
Louis Lyons (Imperial College and Oxford, CMS expt at LHC) & Lorenzo Moneta (CERN)
l.lyons@physics.ox.ac.uk / Lorenzo.moneta@cern.ch
IPMU, March 2017
Books
Statistics for Nuclear and Particle Physicists, Cambridge University Press, 1986. Available from CUP.
Update/errata in these lectures: www-cdf.fnal.gov/physics/notes/Errata2.pdf
Topics
1) Introduction
2) Likelihoods
3) χ²
4) Bayes and Frequentism
5) Search for New Phenomena (e.g. Higgs): Upper Limits and Discovery
6) Miscellaneous
Time for discussion
Introductory remarks
What is Statistics?
Probability and Statistics
Conditional probability
Combining experimental results
Binomial, Poisson and Gaussian distributions
What do we do with Statistics?
Parameter Determination (best value and range), e.g. Mass of Higgs = 80 ± 2
Goodness of Fit: does the data agree with our theory?
Hypothesis Testing: does the data prefer Theory 1 to Theory 2?
(Decision Making: what experiment shall I do next?)
Why bother? HEP is expensive and time-consuming, so it is worth investing effort in statistical analysis → better information from data
Probability and Statistics
Example: Dice
Probability (THEORY → DATA): Given P(5) = 1/6, what is P(20 5's in 100 trials)? If unbiassed, what is P(n evens in 100 trials)?
Statistics (DATA → THEORY): Given 20 5's in 100 trials, what is P(5)? And its uncertainty? Given 60 evens in 100 trials, is it unbiassed? Or is P(evens) = 2/3?
Probability and Statistics
Example: Dice
Probability: Given P(5) = 1/6, what is P(20 5's in 100 trials)? If unbiassed, what is P(n evens in 100 trials)?
Statistics: Given 20 5's in 100 trials, what is P(5)? And its uncertainty? → Parameter Determination. Given 60 evens in 100 trials, is it unbiassed? → Goodness of Fit. Or is P(evens) = 2/3? → Hypothesis Testing
N.B. In more interesting cases, parameter values are not sensible if the goodness of fit is not good
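To make the two directions concrete, here is a minimal sketch in Python (assuming scipy, which is not part of the lectures): the probability question evaluates the binomial pmf, and the statistics question estimates P(5) from the observed counts.

```python
from scipy.stats import binom

# Probability question (THEORY -> DATA): given P(5) = 1/6,
# what is the chance of exactly 20 fives in 100 throws?
p_exact = binom.pmf(20, n=100, p=1/6)           # ~0.07

# Statistics question (DATA -> THEORY): given 20 fives in 100 throws,
# estimate P(5) as s/N, with binomial uncertainty sqrt(p_hat*(1-p_hat)/N).
s, N = 20, 100
p_hat = s / N                                    # 0.20
sigma_p = (p_hat * (1 - p_hat) / N) ** 0.5       # ~0.04

print(f"P(20 fives | p=1/6) = {p_exact:.3f}")
print(f"p_hat = {p_hat:.2f} +/- {sigma_p:.2f}")
```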
CONDITIONAL PROBABILITY
P(A|B) = probability of A, given that B has occurred
P(A and B) = N(A and B)/N_tot = {N(A and B)/N(B)} × {N(B)/N_tot} = P(A|B) × P(B)
If A and B are independent, P(A|B) = P(A), so P(A and B) = P(A) × P(B)
e.g. P(rainy and Sunday) = P(rainy) × 1/7 (independent)
BUT P(rainy and December) ≠ P(rainy) × 1/12 (NOT independent)
P(E_e large and E_ν large) ≠ P(E_e large) × P(E_ν large) (NOT independent)
P(A and B) = P(A|B) × P(B) = P(B|A) × P(A)
⟹ P(A|B) = P(B|A) × P(A) / P(B) ***** Bayes' Theorem
N.B. Usually P(A|B) ≠ P(B|A). Examples later.
(Venn diagram: N_tot, N(A), N(B))
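A quick numerical check of Bayes' theorem, as a sketch; the event counts below are made up purely for illustration.

```python
# Hypothetical counts: N(A), N(B) and N(A and B) out of N_tot events
N_tot, N_A, N_B, N_AB = 1000, 300, 200, 120

P_A, P_B = N_A / N_tot, N_B / N_tot
P_A_given_B = N_AB / N_B          # P(A|B) = 0.6
P_B_given_A = N_AB / N_A          # P(B|A) = 0.4 -- usually not equal!

# Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)
assert abs(P_A_given_B - P_B_given_A * P_A / P_B) < 1e-12
print(P_A_given_B, P_B_given_A)
```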
Combining Experimental Results
N.B. Better to combine data!
BEWARE: combining 100 ± 10 with 1 ± 1 — is the answer 2 ± 1, or 50.5 ± 5?
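A minimal sketch of the standard inverse-variance weighted average, applied to the BEWARE example (assuming numpy; `weighted_average` is a hypothetical helper name, not from the lectures):

```python
import numpy as np

def weighted_average(values, sigmas):
    """Inverse-variance weighted average for uncorrelated measurements."""
    v = np.asarray(values, float)
    w = 1.0 / np.asarray(sigmas, float) ** 2
    mean = np.sum(w * v) / np.sum(w)
    sigma = 1.0 / np.sqrt(np.sum(w))
    return mean, sigma

# The BEWARE example: 100 +/- 10 combined with 1 +/- 1
print(weighted_average([100, 1], [10, 1]))   # ~(1.98, 0.995), i.e. 2 +/- 1
```

The unweighted average would be 50.5 ± 5; the weighted answer is pulled almost entirely to the more precise measurement, which is only sensible if the two results are genuinely compatible.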
BLUE: Best Linear Unbiassed Estimate
Combine several possibly correlated estimates of the same quantity, e.g. v1, v2, v3
Covariance matrix:
( σ1²    cov12   cov13 )
( cov12  σ2²    cov23 )
( cov13  cov23   σ3²  )
cov_ij = ρ_ij σ_i σ_j with −1 ≤ ρ ≤ 1 (ρ = 0: uncorrelated; ρ > 0: positive correlation; ρ < 0: negative correlation)
Lyons, Gibaut and Clifford, NIM A270 (1988) 42
v_best = w1 v1 + w2 v2 + w3 v3 (Linear)
with w1 + w2 + w3 = 1 (Unbiassed)
chosen to give σ_best = minimum wrt w1, w2, w3 (Best)
For the uncorrelated case, w_i ∝ 1/σ_i²
For a correlated pair of measurements with σ1 < σ2:
v_best = α v1 + β v2, with β = 1 − α
β = 0 for ρ = σ1/σ2
β < 0 for ρ > σ1/σ2, i.e. extrapolation! e.g. v_best = 2 v1 − v2
(Figure: extrapolation is sensible — axis V showing v_true, v1, v2)
Beware extrapolations, because:
[a] v_best is sensitive to ρ and σ1/σ2
[b] σ_best tends to zero for ρ = +1 or −1
N.B. For different analyses of ~ the same data, ρ ~ 1, so choose the 'better' analysis rather than combining
N.B. σ_best depends on σ1, σ2 and ρ, but not on v1 − v2
e.g. combining 0 ± 3 and x ± 3 gives x/2 ± 2, whatever x is
BLUE = χ²: form S(v_best) = Σ_ij (v_i − v_best) E⁻¹_ij (v_j − v_best), and minimise S wrt v_best
S_min is distributed like χ², so it measures Goodness of Fit
But BLUE also gives the weight of each v_i, which can be used to see the contributions to σ_best from each source of uncertainty, e.g. statistical and systematic, or different systematics
Extended by Valassi to combining more than one measured quantity, e.g. intercepts and gradients of a straight line
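A sketch of the BLUE recipe in Python, assuming numpy (`blue` is a hypothetical helper name): it minimises S analytically, and returns the weights and S_min so the goodness of fit and the per-measurement contributions described above can be inspected.

```python
import numpy as np

def blue(values, cov):
    """BLUE combination: minimise S = (v - v_best)^T E^-1 (v - v_best).

    Returns v_best, sigma_best, the weights (which sum to 1), and S_min
    (distributed like chi^2, so a goodness-of-fit measure)."""
    v = np.asarray(values, float)
    Einv = np.linalg.inv(np.asarray(cov, float))
    ones = np.ones_like(v)
    w = Einv @ ones / (ones @ Einv @ ones)        # BLUE weights
    v_best = w @ v
    sigma_best = 1.0 / np.sqrt(ones @ Einv @ ones)
    S_min = (v - v_best) @ Einv @ (v - v_best)
    return v_best, sigma_best, w, S_min

# Correlated pair with sigma1 < sigma2 and rho > sigma1/sigma2 = 0.5:
s1, s2, rho = 1.0, 2.0, 0.8
cov = [[s1*s1, rho*s1*s2], [rho*s1*s2, s2*s2]]
print(blue([10.0, 11.0], cov))   # second weight is negative: extrapolation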
Difference between averaging and adding
Isolated island with conservative inhabitants (so the numbers of married men and married women must be equal)
How many married people?
Number of married men = 100 ± 5 K
Number of married women = 80 ± 30 K
Adding: Total = 180 ± 30 K
CONTRAST Weighted average = 99 ± 5 K, so Total = 2 × (99 ± 5) = 198 ± 10 K
GENERAL POINT: adding (uncontroversial) theoretical input can improve the precision of the answer. Compare 'kinematic fitting'.
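The island arithmetic as a worked sketch, where the equal-numbers constraint plays the role of the 'theoretical input':

```python
m, sm = 100.0, 5.0     # married men (K)
f, sf = 80.0, 30.0     # married women (K)

# Naive adding: 180 +/- 30 K
total_add = m + f
sigma_add = (sm**2 + sf**2) ** 0.5

# Using the constraint (men = women): weighted average, then double it
w_m, w_f = 1/sm**2, 1/sf**2
avg = (w_m*m + w_f*f) / (w_m + w_f)       # ~99 K
sigma_avg = (w_m + w_f) ** -0.5           # ~5 K
print(total_add, sigma_add)               # 180 +/- 30 K
print(2*avg, 2*sigma_avg)                 # 198 +/- 10 K
```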
BINOMIAL DISTRIBUTION
P_s = [N! / ((N−s)! s!)] p^s (1−p)^(N−s)
Expected number of successes = Σ s P_s = Np, as is obvious
Variance of number of successes = Np(1−p)
  ~ Np for p ~ 0
  ~ N(1−p) for p ~ 1
NOT Np in general. NOT s ± √s
e.g. 100 trials, 99 successes: NOT 99 ± 10
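A quick check of the 99-out-of-100 example in plain Python:

```python
from math import sqrt

# 100 trials, 99 successes: the binomial uncertainty is
# sqrt(N * p * (1-p)), NOT sqrt(s).
N, s = 100, 99
p_hat = s / N
sigma_binomial = sqrt(N * p_hat * (1 - p_hat))   # ~1.0
sigma_naive = sqrt(s)                            # ~9.9 (wrong here)
print(f"99 +/- {sigma_binomial:.1f}, NOT 99 +/- {sigma_naive:.0f}")
```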
Poisson Distribution
Probability of n independent events occurring in time t when the rate is r (constant), e.g. events in a bin of a histogram. (NOT radioactive decay for t ~ τ or larger.)
Limit of Binomial: N → ∞, p → 0, with Np → μ
P_n = e^(−rt) (rt)^n / n! = e^(−μ) μ^n / n!   (μ = rt)
<n> = rt = μ (no surprise!)
σ_n² = μ → "n ± √n". BEWARE: 0 ± 0?
μ → ∞: Poisson → Gaussian, with mean = μ, variance = μ. Important for χ².
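A sketch comparing the Poisson pmf with the matching Gaussian as μ grows (assuming scipy):

```python
from scipy.stats import poisson, norm

# Poisson -> Gaussian as mu grows: compare P(n) at n = mu with a
# Gaussian density of mean mu and variance mu.
for mu in (2, 10, 50):
    p = poisson.pmf(mu, mu)
    g = norm.pdf(mu, loc=mu, scale=mu**0.5)
    print(f"mu={mu:3d}: Poisson {p:.4f}  Gaussian {g:.4f}")

# The "n +/- sqrt(n)" trap: observing n = 0 does NOT mean
# the underlying rate is 0 +/- 0.
```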
(Plots: Poisson distributions for increasing μ — approximately Gaussian for large μ)
For your thought
Poisson: P_n = e^(−μ) μ^n / n!
P_0 = e^(−μ), P_1 = μ e^(−μ), P_2 = (μ²/2) e^(−μ)
For small μ, P_1 ~ μ and P_2 ~ μ²/2
If the probability of 1 rare event is ~ μ, why isn't the probability of 2 events ~ μ²?
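A numerical nudge for the thought question (plain Python; it verifies the factor of 2 without giving the reason away):

```python
from math import exp

mu = 0.01
P0 = exp(-mu)
P1 = mu * exp(-mu)
P2 = (mu**2 / 2) * exp(-mu)
print(P1, mu)           # 0.00990..., 0.01      -> P1 ~ mu
print(P2, mu**2 / 2)    # 4.975e-05, 5e-05      -> P2 ~ mu^2/2, NOT mu^2
```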
Relation between Poisson and Binomial
N people in a lecture, m males and f females (N = m + f)
Assume these are representative of basic rates: ν people, νp males, ν(1−p) females
Probability of observing N people: P_Poisson = e^(−ν) ν^N / N!
Probability of the given male/female division: P_Binom = [N! / (m! f!)] p^m (1−p)^f
Probability of N people, m male and f female:
P_Poisson × P_Binom = [e^(−νp) (νp)^m / m!] × [e^(−ν(1−p)) (ν(1−p))^f / f!]
= Poisson probability for males × Poisson probability for females
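The factorisation can be checked numerically (assuming scipy; ν, p, N and m below are made-up illustrative values):

```python
from scipy.stats import poisson, binom

# Check: Poisson(N; nu) * Binomial(m; N, p)
#      = Poisson(m; nu*p) * Poisson(f; nu*(1-p)),  with f = N - m
nu, p = 40.0, 0.6
N, m = 45, 30
f = N - m

lhs = poisson.pmf(N, nu) * binom.pmf(m, N, p)
rhs = poisson.pmf(m, nu * p) * poisson.pmf(f, nu * (1 - p))
print(lhs, rhs)   # identical up to rounding
```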
Gaussian or Normal
y = [1/(σ√(2π))] exp{−(x−μ)² / (2σ²)}
Significance of σ:
i) RMS of Gaussian = σ (hence the factor of 2 in the definition)
ii) At x = μ ± σ, y = y_max/√e ≈ 0.606 y_max (i.e. σ = half-width at 'half' height)
iii) Fractional area within μ ± σ = 68%
iv) Height at maximum = 1/(σ√(2π))
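Properties ii)–iv) can be verified directly (a sketch assuming scipy):

```python
from math import sqrt, pi, e
from scipy.stats import norm

mu, sigma = 0.0, 1.0
y_max = 1.0 / (sigma * sqrt(2 * pi))            # iv) height at maximum
y_at_sigma = norm.pdf(mu + sigma, mu, sigma)

print(y_at_sigma / y_max, 1 / sqrt(e))          # ii) 0.6065... = 1/sqrt(e)
print(norm.cdf(1) - norm.cdf(-1))               # iii) 0.6827 ~ 68%
```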
(Plot: area in the tail(s) of a Gaussian, labelled 0.002)
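Tail areas are one line with scipy (a sketch, not from the lectures): k ≈ 3.09 gives a two-sided tail area of 0.002, matching the number above, and k = 5 is the usual particle-physics discovery threshold.

```python
from scipy.stats import norm

# Two-sided tail area beyond mu +/- k*sigma
for k in (1, 2, 3, 3.09, 5):
    tail = 2 * norm.sf(k)        # sf(k) = 1 - cdf(k)
    print(f"{k:4}: {tail:.2e}")
# 1: 3.17e-01, 2: 4.55e-02, 3: 2.70e-03, 3.09: 2.00e-03, 5: 5.73e-07
```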