1 / 41

Probability, Statistics and Errors in High Energy Physics

Probability, Statistics and Errors in High Energy Physics. Wen-Chen Chang Institute of Physics, Academia Sinica 章文箴 中央研究院 物理研究所. Outline. Errors Probability distribution: Binomial, Poisson, Gaussian Confidence Level Monte Carlo Method. Why do we do experiments?.

damita
Download Presentation

Probability, Statistics and Errors in High Energy Physics

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Probability, Statistics and Errorsin High Energy Physics Wen-Chen Chang Institute of Physics, Academia Sinica 章文箴 中央研究院 物理研究所

  2. Outline • Errors • Probability distribution: Binomial, Poisson, Gaussian • Confidence Level • Monte Carlo Method

  3. Why do we do experiments? • Parameter determination: determine the numerical value of some physical quantity. • Hypothesis testing: test whether a particular theory is consistent with our data.

  4. Why estimate errors? • We are concerned not only with the answer but also with its accuracy. • For example, speed of light 2.998x108 m/sec • (3.090.15) x108: • (3.090.01) x108: • (3.092) x108:

  5. Source of Errors • Random (Statistic) error: the inability of any measuring device to give infinitely accurate answers. • Systematic error: uncertainty.

  6. Systematic Errors Systematic Error: reproducible inaccuracy introduced by faulty equipment, calibration, or technique Bevington Systematic effects is a general category which includes effects such as background, scanning efficiency, energy resolution, angle resolution, variation of counter efficiency with beam position and energy, dead time, etc. The uncertainty in the estimation of such as systematic effect is called a systematic error Orear Error=mistake? Error=uncertainty?

  7. Experimental Examples • Energy in a calorimeter E=aD+b a & b determined by calibration expt • Branching ratio B=N/(NT)  found from Monte Carlo studies • Steel rule calibrated at 15C but used in warm lab If not spotted, this is a mistake If temp. measured, not a problem If temp. not measured guess uncertainty Repeating measurements doesn’t help

  8. The Binomial n trials r successes Individual success probability p Variance V=<(r-  )2>=<r2>-<r>2 =np(1-p) Mean =<r>=rP( r ) = np A random process with exactly two possible outcomes which occur with fixed probabilities. 1-p p  q

  9. Binomial Examples p=0.1 n=5 n=50 n=20 n=10 p=0.2 p=0.5 p=0.8

  10. Poisson =2.5 ‘Events in a continuum’ The probability of observing r independent events in a time interval t, when the counting rate is  and the expected number events in the time interval is . Mean =<r>=rP( r ) =  Variance V=<(r-  )2>=<r2>-<r>2 =

  11. More about Poisson • The approach of the binomial to the Poisson distribution as N increases. • The mean value of r for a variable with a Poisson distribution is  and so is the variance. This is the basis of the well known nn formula that applies to statistical errors in many situations involving the counting of independent events during a fixed interval. • As , the Poisson distribution tends to a Gaussian one.

  12. Poisson Examples =2.0 =1.0 =0.5 =25 =10 =5.0

  13. Examples • The number of particles detected by a counter in a time t, in a situation where the particle flux  and detector are independent of time, and where counter dead-time  is such that   <<1. • The number of interactions produced in a thin target when an intense pulse of N beam particles is incident on it. • The number of entries in a given bin of a histogram when the data are accumulated over a fixed time interval.

  14. Binomial and Poisson From an exam paper A student is standing by the road, hoping to hitch a lift. Cars pass according to a Poisson distribution with a mean frequency of 1 per minute. The probability of an individual car giving a lift is 1%. Calculate the probability that the student is still waiting for a lift (a) After 60 cars have passed (b) After 1 hour • 0.9960=0.5472 b) e-0.6 * 0.60 /0! =0.5488

  15. Gaussian (Normal) Probability Density Mean =<x>=xP( x ) dx = Variance V=<(x-  )2>=<x2>-<x>2 =

  16. Different Gaussians There’s only one! Width scaling factor Falls to 1/e of peak at x= Normalisation (if required) Location change 

  17. Probability Contents 68.27% within 1 95.45% within 2 99.73% within 3 90% within 1.645  95% within 1.960  99% within 2.576  99.9% within 3.290 These numbers apply to Gaussians and only Gaussians Other distributions have equivalent values which you could use of you wanted

  18. Central Limit Theorem Or: why is the Gaussian Normal? If a variable x is produced by the convolution of variables x1,x2…xN I) <x>=1+2+…N • V(x)=V1+V2+…VN • P(x) becomes Gaussian for large N

  19. Multidimensional Gaussian

  20. Chi squared Sum of squared discrepancies, scaled by expected error Integrate all but 1-D of multi-D Gaussian

  21. About Estimation Theory Probability Calculus Data Given these distribution parameters, what can we say about the data? Given this data, what can we say about the properties or parameters or correctness of the distribution functions? Statistical Inference Data Theory

  22. What is an estimator? An estimator (written with a hat) is a function of the data whose value, the estimate, is intended as a meaningful guess for the value of the parameter . (from PDG)

  23. What is a good estimator? A perfect estimator is: • Consistent • Unbiassed • Efficient minimum One often has to work with less-than-perfect estimators Minimum Variance Bound

  24. The Likelihood Function Set of data {x1, x2, x3, …xN} Each x may be multidimensional – never mind Probability depends on some parameter a a may be multidimensional – never mind Total probability (density) P(x1;a) P(x2;a) P(x3;a) …P(xN;a)=L(x1, x2, x3, …xN ;a) The Likelihood

  25. Ln L a â Maximum Likelihood Estimation Given data {x1, x2, x3, …xN} estimate a by maximising the likelihood L(x1, x2, x3, …xN ;a) In practice usually maximise ln L as it’s easier to calculate and handle; just add the ln P(xi) ML has lots of nice properties

  26. Ln L a â Properties of ML estimation • It’s consistent (no big deal) • It’s biased for small N May need to worry • It is efficient for large N Saturates the Minimum Variance Bound • It is invariant If you switch to using u(a), then û=u(â) Ln L u û

  27. It is not ‘right’. Just sensible. It does not give the ‘most likely value of a’. It’s the value of a for which this data is most likely. Numerical Methods are often needed Maximisation / Minimisation in >1 variable is not easy Use MINUIT but remember the minus sign More about ML

  28. ML does not give goodness-of-fit • ML will not complain if your assumed P(x;a) is rubbish • The value of L tells you nothing Fit P(x)=a1x+a0 will give a1=0; constant P L= a0N Just like you get from fitting

  29. Least Squares y • Measurements of y at various x with errors  and prediction f(x;a) • Probability • Ln L • To maximise ln L, minimise 2 x So ML ‘proves’ Least Squares. But what ‘proves’ ML? Nothing

  30. Least Squares: The Really nice thing • Should get 21 per data point • Minimise 2 makes it smaller – effect is 1 unit of2 for each variable adjusted. (Dimensionality of MultiD Gaussian decreased by 1.) Ndegrees Of Freedom=Ndata pts– N parameters • Provides ‘Goodness of agreement’ figure which allows for credibility check

  31. Large 2 comes from Bad Measurements Bad Theory Underestimated errors Bad luck Small 2 comes from Overestimated errors Good luck Chi Squared Results

  32. Fitting Histograms Often put {xi} into bins Data is then {nj} nj given by Poisson, mean f(xj) =P(xj)x 4 Techniques Full ML Binned ML Proper 2 Simple 2 x x

  33. Full ML Binned ML Proper 2 Simple 2 What you maximise/minimise

  34. Confidence Level:Meaning of Error Estimates • How often we expect to include “the true fixed value of our paramter” P0, within our quoted range, pp, for a repeated series of experiments? • For the actual value P0, the probability that a measurement will give us an answer in a specific range of p is given by the area under the relevant part of Gaussian curve. A conventional choice of this probability is 68%.

  35. The Straightforward Example Apples of different weights Need to describe the distribution  = 68g  = 17 g All weights between 24 and 167 g (Tolerance) 90% lie between 50 and 100 g 94% are less than 100 g 96% are more than 50 g Confidence level statements 50 100

  36. Confidence Levels L U’ • Can quote at any level (68%, 95%, 99%…) • Upper or lower or two-sided (x<U x<L L<x<U) • Two-sided has further choice (central, shortest…) U

  37. Maximum Likelihood and Confidence Levels ML estimator (large N) has variance given by MVB At peak For large N Ln L is a parabola (L is a Gaussian) Ln L Falls by ½ at a Falls by 2 at Read off 68% , 95% confidence regions

  38. Monte Carlo Calculations • The Monte Carlo approach provides a method of solving probability theory problems in situations where the necessary integrals are too difficult to perform. • Crucial element: random number generator.

  39. An Example

  40. References • Lectures and Notes on Statistics in HEP, http://www.ep.ph.bham.ac.uk//group/locdoc/lectures/stats/index.html • Lecture notes of Prof. Roger Barlow, http://www.hep.man.ac.uk/u/roger/ • Louis Lyons, “Statistics for Nuclear and Particle Physicists”, Cambridge 1986. • Particle Data Group, http://pdg.lbl.gov/2004/reviews/contents_sports.html#mathtoolsetc

More Related