Dive into practical statistics for particle physicists covering likelihoods, Bayesian methods, hypothesis testing, and parameter determination. Learn to combine experimental results and improve precision using theoretical input.
Practical Statistics for Particle Physicists
Louis Lyons (Imperial College and Oxford, CMS expt at LHC) & Lorenzo Moneta (CERN)
l.lyons@physics.ox.ac.uk / Lorenzo.moneta@cern.ch
IPMU, March 2017
Books
Statistics for Nuclear and Particle Physicists, Cambridge University Press, 1986. Available from CUP.
Update/errata in these lectures: www-cdf.fnal.gov/physics/notes/Errata2.pdf
Topics
1) Introduction
2) Likelihoods
3) χ²
4) Bayes and Frequentism
5) Search for New Phenomena (e.g. Higgs): Upper Limits and Discovery
6) Miscellaneous
Time for discussion
Introductory remarks
What is Statistics?
Probability and Statistics
Conditional probability
Combining experimental results
Binomial, Poisson and Gaussian distributions
What do we do with Statistics?
Parameter Determination (best value and range), e.g. Mass of Higgs = 80 ± 2
Goodness of Fit: does the data agree with our theory?
Hypothesis Testing: does the data prefer Theory 1 to Theory 2?
(Decision Making: what experiment shall I do next?)
Why bother? HEP is expensive and time-consuming, so it is worth investing effort in statistical analysis → better information from data
Probability and Statistics
Example: Dice
Probability (THEORY → DATA): Given P(5) = 1/6, what is P(20 5's in 100 trials)? If unbiassed, what is P(n evens in 100 trials)?
Statistics (DATA → THEORY): Given 20 5's in 100 trials, what is P(5)? And its uncertainty? Given 60 evens in 100 trials, is it unbiassed? Or is P(evens) = 2/3?
Probability and Statistics
Example: Dice
Probability: Given P(5) = 1/6, what is P(20 5's in 100 trials)? If unbiassed, what is P(n evens in 100 trials)?
Statistics: Given 20 5's in 100 trials, what is P(5)? And its uncertainty? → Parameter Determination. Given 60 evens in 100 trials, is it unbiassed? → Goodness of Fit. Or is P(evens) = 2/3? → Hypothesis Testing
N.B. In more interesting cases, parameter values are not sensible if the goodness of fit is not good
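To make the two directions concrete, here is a minimal sketch in Python (assuming scipy, which is not part of the lectures): the probability question evaluates the binomial pmf, and the statistics question estimates P(5) from the observed counts.

```python
from scipy.stats import binom

# Probability question (THEORY -> DATA): given P(5) = 1/6,
# what is the chance of exactly 20 fives in 100 throws?
p_exact = binom.pmf(20, n=100, p=1/6)           # ~0.07

# Statistics question (DATA -> THEORY): given 20 fives in 100 throws,
# estimate P(5) as s/N, with binomial uncertainty sqrt(p_hat*(1-p_hat)/N).
s, N = 20, 100
p_hat = s / N                                    # 0.20
sigma_p = (p_hat * (1 - p_hat) / N) ** 0.5       # ~0.04

print(f"P(20 fives | p=1/6) = {p_exact:.3f}")
print(f"p_hat = {p_hat:.2f} +/- {sigma_p:.2f}")
```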
CONDITIONAL PROBABILITY
P(A|B) = probability of A, given that B has occurred
P(A and B) = N(A and B)/N_tot = {N(A and B)/N(B)} × {N(B)/N_tot} = P(A|B) × P(B)
If A and B are independent, P(A|B) = P(A), so P(A and B) = P(A) × P(B)
e.g. P(rainy and Sunday) = P(rainy) × 1/7 (independent)
BUT P(rainy and December) ≠ P(rainy) × 1/12 (NOT independent)
P(E_e large and E_ν large) ≠ P(E_e large) × P(E_ν large) (NOT independent)
P(A and B) = P(A|B) × P(B) = P(B|A) × P(A)
⟹ P(A|B) = P(B|A) × P(A) / P(B) ***** Bayes' Theorem
N.B. Usually P(A|B) ≠ P(B|A). Examples later.
(Venn diagram: N_tot, N(A), N(B))
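A quick numerical check of Bayes' theorem, as a sketch; the event counts below are made up purely for illustration.

```python
# Hypothetical counts: N(A), N(B) and N(A and B) out of N_tot events
N_tot, N_A, N_B, N_AB = 1000, 300, 200, 120

P_A, P_B = N_A / N_tot, N_B / N_tot
P_A_given_B = N_AB / N_B          # P(A|B) = 0.6
P_B_given_A = N_AB / N_A          # P(B|A) = 0.4 -- usually not equal!

# Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)
assert abs(P_A_given_B - P_B_given_A * P_A / P_B) < 1e-12
print(P_A_given_B, P_B_given_A)
```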
Combining Experimental Results
N.B. Better to combine data!
BEWARE: combining 100 ± 10 with 1 ± 1 — is the answer 2 ± 1, or 50.5 ± 5?
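A minimal sketch of the standard inverse-variance weighted average, applied to the BEWARE example (assuming numpy; `weighted_average` is a hypothetical helper name, not from the lectures):

```python
import numpy as np

def weighted_average(values, sigmas):
    """Inverse-variance weighted average for uncorrelated measurements."""
    v = np.asarray(values, float)
    w = 1.0 / np.asarray(sigmas, float) ** 2
    mean = np.sum(w * v) / np.sum(w)
    sigma = 1.0 / np.sqrt(np.sum(w))
    return mean, sigma

# The BEWARE example: 100 +/- 10 combined with 1 +/- 1
print(weighted_average([100, 1], [10, 1]))   # ~(1.98, 0.995), i.e. 2 +/- 1
```

The unweighted average would be 50.5 ± 5; the weighted answer is pulled almost entirely to the more precise measurement, which is only sensible if the two results are genuinely compatible.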
BLUE: Best Linear Unbiassed Estimate
Combine several possibly correlated estimates of the same quantity, e.g. v1, v2, v3
Covariance matrix:
( σ1²    cov12   cov13 )
( cov12  σ2²    cov23 )
( cov13  cov23   σ3²  )
cov_ij = ρ_ij σ_i σ_j with −1 ≤ ρ ≤ 1 (ρ = 0: uncorrelated; ρ > 0: positive correlation; ρ < 0: negative correlation)
Lyons, Gibaut and Clifford, NIM A270 (1988) 42
v_best = w1 v1 + w2 v2 + w3 v3 (Linear)
with w1 + w2 + w3 = 1 (Unbiassed)
chosen to give σ_best = minimum wrt w1, w2, w3 (Best)
For the uncorrelated case, w_i ∝ 1/σ_i²
For a correlated pair of measurements with σ1 < σ2:
v_best = α v1 + β v2, with β = 1 − α
β = 0 for ρ = σ1/σ2
β < 0 for ρ > σ1/σ2, i.e. extrapolation! e.g. v_best = 2 v1 − v2
(Figure: extrapolation is sensible — axis V showing v_true, v1, v2)
Beware extrapolations, because:
[a] v_best is sensitive to ρ and σ1/σ2
[b] σ_best tends to zero for ρ = +1 or −1
N.B. For different analyses of ~ the same data, ρ ~ 1, so choose the 'better' analysis rather than combining
N.B. σ_best depends on σ1, σ2 and ρ, but not on v1 − v2
e.g. combining 0 ± 3 and x ± 3 gives x/2 ± 2, whatever x is
BLUE = χ²: form S(v_best) = Σ_ij (v_i − v_best) E⁻¹_ij (v_j − v_best), and minimise S wrt v_best
S_min is distributed like χ², so it measures Goodness of Fit
But BLUE also gives the weight of each v_i, which can be used to see the contributions to σ_best from each source of uncertainty, e.g. statistical and systematic, or different systematics
Extended by Valassi to combining more than one measured quantity, e.g. intercepts and gradients of a straight line
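A sketch of the BLUE recipe in Python, assuming numpy (`blue` is a hypothetical helper name): it minimises S analytically, and returns the weights and S_min so the goodness of fit and the per-measurement contributions described above can be inspected.

```python
import numpy as np

def blue(values, cov):
    """BLUE combination: minimise S = (v - v_best)^T E^-1 (v - v_best).

    Returns v_best, sigma_best, the weights (which sum to 1), and S_min
    (distributed like chi^2, so a goodness-of-fit measure)."""
    v = np.asarray(values, float)
    Einv = np.linalg.inv(np.asarray(cov, float))
    ones = np.ones_like(v)
    w = Einv @ ones / (ones @ Einv @ ones)        # BLUE weights
    v_best = w @ v
    sigma_best = 1.0 / np.sqrt(ones @ Einv @ ones)
    S_min = (v - v_best) @ Einv @ (v - v_best)
    return v_best, sigma_best, w, S_min

# Correlated pair with sigma1 < sigma2 and rho > sigma1/sigma2 = 0.5:
s1, s2, rho = 1.0, 2.0, 0.8
cov = [[s1*s1, rho*s1*s2], [rho*s1*s2, s2*s2]]
print(blue([10.0, 11.0], cov))   # second weight is negative: extrapolation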
Difference between averaging and adding
Isolated island with conservative inhabitants (so the numbers of married men and married women must be equal)
How many married people?
Number of married men = 100 ± 5 K
Number of married women = 80 ± 30 K
Adding: Total = 180 ± 30 K
CONTRAST Weighted average = 99 ± 5 K, so Total = 2 × (99 ± 5) = 198 ± 10 K
GENERAL POINT: adding (uncontroversial) theoretical input can improve the precision of the answer. Compare 'kinematic fitting'.
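The island arithmetic as a worked sketch, where the equal-numbers constraint plays the role of the 'theoretical input':

```python
m, sm = 100.0, 5.0     # married men (K)
f, sf = 80.0, 30.0     # married women (K)

# Naive adding: 180 +/- 30 K
total_add = m + f
sigma_add = (sm**2 + sf**2) ** 0.5

# Using the constraint (men = women): weighted average, then double it
w_m, w_f = 1/sm**2, 1/sf**2
avg = (w_m*m + w_f*f) / (w_m + w_f)       # ~99 K
sigma_avg = (w_m + w_f) ** -0.5           # ~5 K
print(total_add, sigma_add)               # 180 +/- 30 K
print(2*avg, 2*sigma_avg)                 # 198 +/- 10 K
```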
BINOMIAL DISTRIBUTION
P_s = [N! / ((N−s)! s!)] p^s (1−p)^(N−s)
Expected number of successes = Σ s P_s = Np, as is obvious
Variance of number of successes = Np(1−p)
  ~ Np for p ~ 0
  ~ N(1−p) for p ~ 1
NOT Np in general. NOT s ± √s
e.g. 100 trials, 99 successes: NOT 99 ± 10
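A quick check of the 99-out-of-100 example in plain Python:

```python
from math import sqrt

# 100 trials, 99 successes: the binomial uncertainty is
# sqrt(N * p * (1-p)), NOT sqrt(s).
N, s = 100, 99
p_hat = s / N
sigma_binomial = sqrt(N * p_hat * (1 - p_hat))   # ~1.0
sigma_naive = sqrt(s)                            # ~9.9 (wrong here)
print(f"99 +/- {sigma_binomial:.1f}, NOT 99 +/- {sigma_naive:.0f}")
```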
Poisson Distribution
Probability of n independent events occurring in time t when the rate is r (constant), e.g. events in a bin of a histogram. (NOT radioactive decay for t ~ τ or larger.)
Limit of Binomial: N → ∞, p → 0, with Np → μ
P_n = e^(−rt) (rt)^n / n! = e^(−μ) μ^n / n!   (μ = rt)
<n> = rt = μ (no surprise!)
σ_n² = μ → "n ± √n". BEWARE: 0 ± 0?
μ → ∞: Poisson → Gaussian, with mean = μ, variance = μ. Important for χ².
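A sketch comparing the Poisson pmf with the matching Gaussian as μ grows (assuming scipy):

```python
from scipy.stats import poisson, norm

# Poisson -> Gaussian as mu grows: compare P(n) at n = mu with a
# Gaussian density of mean mu and variance mu.
for mu in (2, 10, 50):
    p = poisson.pmf(mu, mu)
    g = norm.pdf(mu, loc=mu, scale=mu**0.5)
    print(f"mu={mu:3d}: Poisson {p:.4f}  Gaussian {g:.4f}")

# The "n +/- sqrt(n)" trap: observing n = 0 does NOT mean
# the underlying rate is 0 +/- 0.
```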
(Plots: Poisson distributions for increasing μ — approximately Gaussian for large μ)
For your thought
Poisson: P_n = e^(−μ) μ^n / n!
P_0 = e^(−μ), P_1 = μ e^(−μ), P_2 = (μ²/2) e^(−μ)
For small μ, P_1 ~ μ and P_2 ~ μ²/2
If the probability of 1 rare event is ~ μ, why isn't the probability of 2 events ~ μ²?
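A numerical nudge for the thought question (plain Python; it verifies the factor of 2 without giving the reason away):

```python
from math import exp

mu = 0.01
P0 = exp(-mu)
P1 = mu * exp(-mu)
P2 = (mu**2 / 2) * exp(-mu)
print(P1, mu)           # 0.00990..., 0.01      -> P1 ~ mu
print(P2, mu**2 / 2)    # 4.975e-05, 5e-05      -> P2 ~ mu^2/2, NOT mu^2
```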
Relation between Poisson and Binomial
N people in a lecture, m males and f females (N = m + f)
Assume these are representative of basic rates: ν people, νp males, ν(1−p) females
Probability of observing N people: P_Poisson = e^(−ν) ν^N / N!
Probability of the given male/female division: P_Binom = [N! / (m! f!)] p^m (1−p)^f
Probability of N people, m male and f female:
P_Poisson × P_Binom = [e^(−νp) (νp)^m / m!] × [e^(−ν(1−p)) (ν(1−p))^f / f!]
= Poisson probability for males × Poisson probability for females
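The factorisation can be checked numerically (assuming scipy; ν, p, N and m below are made-up illustrative values):

```python
from scipy.stats import poisson, binom

# Check: Poisson(N; nu) * Binomial(m; N, p)
#      = Poisson(m; nu*p) * Poisson(f; nu*(1-p)),  with f = N - m
nu, p = 40.0, 0.6
N, m = 45, 30
f = N - m

lhs = poisson.pmf(N, nu) * binom.pmf(m, N, p)
rhs = poisson.pmf(m, nu * p) * poisson.pmf(f, nu * (1 - p))
print(lhs, rhs)   # identical up to rounding
```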
Gaussian or Normal
y = [1/(σ√(2π))] exp{−(x−μ)² / (2σ²)}
Significance of σ:
i) RMS of Gaussian = σ (hence the factor of 2 in the definition)
ii) At x = μ ± σ, y = y_max/√e ≈ 0.606 y_max (i.e. σ = half-width at 'half' height)
iii) Fractional area within μ ± σ = 68%
iv) Height at maximum = 1/(σ√(2π))
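Properties ii)–iv) can be verified directly (a sketch assuming scipy):

```python
from math import sqrt, pi, e
from scipy.stats import norm

mu, sigma = 0.0, 1.0
y_max = 1.0 / (sigma * sqrt(2 * pi))            # iv) height at maximum
y_at_sigma = norm.pdf(mu + sigma, mu, sigma)

print(y_at_sigma / y_max, 1 / sqrt(e))          # ii) 0.6065... = 1/sqrt(e)
print(norm.cdf(1) - norm.cdf(-1))               # iii) 0.6827 ~ 68%
```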
(Plot: area in the tail(s) of a Gaussian, labelled 0.002)
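Tail areas are one line with scipy (a sketch, not from the lectures): k ≈ 3.09 gives a two-sided tail area of 0.002, matching the number above, and k = 5 is the usual particle-physics discovery threshold.

```python
from scipy.stats import norm

# Two-sided tail area beyond mu +/- k*sigma
for k in (1, 2, 3, 3.09, 5):
    tail = 2 * norm.sf(k)        # sf(k) = 1 - cdf(k)
    print(f"{k:4}: {tail:.2e}")
# 1: 3.17e-01, 2: 4.55e-02, 3: 2.70e-03, 3.09: 2.00e-03, 5: 5.73e-07
```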