250 likes | 441 Views
Lessons about Likelihood Functions from Nuclear Physics. Kenneth M. Hanson T-16, Nuclear Physics; Theoretical Division Los Alamos National Laboratory. Bayesian Inference and Maximum Entropy Workshop, Saratoga Springs, NY, July 8-13, 2007 .
E N D
Lessons about Likelihood Functions from Nuclear Physics Kenneth M. Hanson T-16, Nuclear Physics; Theoretical DivisionLos Alamos National Laboratory Bayesian Inference and Maximum Entropy Workshop, Saratoga Springs, NY, July 8-13, 2007 This presentation available at http://www.lanl.gov/home/kmh/ LA-UR-07-2971 Bayesian Inference and Maximum Entropy 2007
Overview • Uncertainties in physics experiments • Particle Data Group (PDG) • Lifetime data • Outliers • Uncertainty in the uncertainty • Student t distributions vs. Normal distribution • Analysis of lifetime data using t distributions Bayesian Inference and Maximum Entropy 2007
Physics experiments • Experimenters state their measurement of physical quantity y asmeasurement ± standard error or d± σd • Experimenter’s degree of belief in measurement described by a normal distribution (Gaussian) and σd its standard deviation • Experimental uncertainty often composed of two components: • statistical uncertainty • from noise in signal or event counting (Poisson) • Type A – determined by repeated meas., frequentist methods • systematic uncertainty • from equipment calibration, experimental procedure, corrections • Type B – determined by nonfrequentist methods • based on experimenter’s judgment, hence subjective; difficult to assess • these usually added in quadrature (rms sum) Bayesian Inference and Maximum Entropy 2007
Physics experiments – likelihood functions • In probabilistic terms, experimentalist’s statement y± σdis interpreted as a likelihood functionp( d | y σd I)where I is background information about situation, including how experiment is performed • Inference about the physical quantity y is obtained by Bayes lawp( y | d σyI) ~ p( y | d σdI) p( y | I)where p( y | I) is the prior information about y Bayesian Inference and Maximum Entropy 2007
Exploratory data analysis • John Tukey (1977) suggested each set of measurements of a quantity be scrutinized • find quantile positions, Q1, Q2, Q3 • calculate the inter-quantile range IQR = Q3 – Q2 (for normal distr., IQR = 1.35 σ) • determine fraction of data outside interval, SO Q1 – 1.5 IQR < y < Q3 + 1.5 IQRlabeling these as suspected outliers(for normal distr., 0.7%) • IQR measures width of core • SO measures extent of tail Bayesian Inference and Maximum Entropy 2007
Particle Data Group (PDG) • Particle Data Group formed in 1957 • annually summarizes state-of-knowledge of properties of elementary particles • For each particle property • list all relevant experimental data • committee decides which data to include in final analysis • state best current value (usually least-squares average)and its standard error • often magnified by sqrt[χ2/(N – 1)] (avg of 2.0, 50% of time) • PDG reports are excellent source of information about measurements of unambiguous physical quantities • available online, free • provide insight into how physicists interpret data Bayesian Inference and Maximum Entropy 2007
Lambda lifetime measurement in the 60s Hydrogen bubble- chamber photo • Hydrogen bubble chambers were used in 1950s and 60s to record elementary particles • Picture shows reaction sequence: K- + p →Ξ- + K+Ξ-→Λ0 + π-Λ0→ p + π- • Track lengths and particle momenta determined from curvature in magnetic field yield survival time of Ξ- and Λ • Hubbard et al. observed 828 such events to obtain lifetimes:τΞ = 1.69 ±0.06×10-10 sτΛ = 2.59 ±0.09×10-10 s From J.R. Hubbard et al., Phys.Rev.135B (1964) Bayesian Inference and Maximum Entropy 2007
Measurements of neutron lifetime Neutron lifetime measurements • Because n lifetime is so long, it is difficult to measure without slowing neutrons or trapping them • Plot shows all measurements of neutron lifetime • Red line is PDG value, which includes 7 data sets,but excludes older ones and #2 because it is discrepant • χ2 for red line = 149/21 pts. • Evidence of outliers • Large systematic uncertainties Bayesian Inference and Maximum Entropy 2007
Measurements of lifetimes of other particles Bayesian Inference and Maximum Entropy 2007
Collection of lifetime measurements • Goal: determine distribution of measurements relative to their estimated uncertainties • Upper graph shows deviations of 99 lifetime meas. for 5 particles from PDG values, divided by their standard errors, i.e. Δx/σ • Lower graph shows histogram • Objective: characterize the distribution of Δx/σ for these expts • χ2 = 367/99 DOF • IQR = 1.83 (1.35 for normal) • Suspected outliers = 6.6 % Bayesian Inference and Maximum Entropy 2007
Uncertainty in the uncertainty • Suppose there is uncertainty in the stated standard error σ0 for measurement d • Dose and von der Linden (2000) gave plausible derivation: • assume likelihood has underlying Normal distr • assume uncertainty distr for ω, where σ is scaled by • marginalizing over ω, the likelihood is Student t distr., (2a = ν) • Many have contributed to outlier story: Box and Tiao, O’Hagan, Fröhner, Press, Sivia, Hanson and Wolf Bayesian Inference and Maximum Entropy 2007
Student t distribution ν = 1, 5, ∞ • Student* t distribution • long tail for ν< 9 (SO > 1%) • outlier-tolerant likelihood function • ν= 1 is Cauchy distr (solid red) • ν= ∞ is Normal distr (solid blue) * Student (1908) was pseudonym for W.S Gossett, who was not allowed to publish by his employer, Guiness brewery Bayesian Inference and Maximum Entropy 2007
Physical analogy of probability • Drawing analogy between φ(Δx) = minus-log-posterior and a physical potential • is a force with which each datum pulls on fit model • Outlier-tolerant likelihoods • generally have long tails • restoring force eventually decreases for large residuals G 2G G G+E 2G G+C G+C G+E Bayesian Inference and Maximum Entropy 2007
Analysis of a collection of data • To calculate “average” value of a data set, use the Student t distribution for the likelihood of each datum:where s is scaling factor of standard error for whole data set • Select ν based on data using model selection • Scale factor s marginalized out of posterior Bayesian Inference and Maximum Entropy 2007
Model selection t distr • Odds ratios of t distr (t) to Normal (N) is = 1.32x10-85 /2.2x10-90 = 6x104 • for prior ratio on models = 1 • evidence is integral over x (lifetime) and s; includes prior on s proportional to 1/s • Thus, t distr is strongly preferred by data to Normal distr • ν ≈ 2.6 (maximizes evidence) Normal distr Bayesian Inference and Maximum Entropy 2007
Measurements of neutron lifetime Neutron lifetime measurements • Upper plot shows all measurements of neutron lifetime • Lower plot shows results based on all 21 data points: • posterior for t-distr analysis (ν = 2.6, margin. over s) • least-squares result (with and w/o χ2 scaling) • PDG results (using 7 selected data points, Serebrov rejected) Bayesian Inference and Maximum Entropy 2007
Measurements of π0 lifetime π0 lifetime measurements • Upper plot shows all measurements of π0 lifetime • Lower plot shows results based on all 13 data points: • posterior for t-distr analysis (ν = 2.6, margin. over s) • least-squares result (with χ2 scaling) • PDG results (using 4 selected data points, excl. latest one) Bayesian Inference and Maximum Entropy 2007
Measurements of lambda lifetime Lambda lifetime measurements • Upper plot shows all measurements of lambda lifetime • Lower plot shows results based on all 27 data points: • posterior for t-distr analysis (ν = 2.6, margin. over s) • least-squares result (with χ2 scaling) • PDG results (using 3 latest data points) Bayesian Inference and Maximum Entropy 2007
Tests • Draw 20 data points from various t distrs. and analyze them using likelihoods: a) t distr with ν = 3 b) Normal distr • scale uncertainties according to data variance • results from 10,000 random trials • Conclude • t distr results well behaved • normal distr results unstable when data have significant outliers dashed line = estimated σ Normal t distr Bayesian Inference and Maximum Entropy 2007
Summary • Technique presented for dealing gracefully with outliers • is based on using for likelihood function the Student t distr. instead of the Normal distr. • copes with outliers, while treating every datum identically • Particle lifetime data distribution matched by t distr. with ν ≈ 2.6 to 3.0 • using likelihood functions based on t distr. produce stable results when outliers exist in data sets, whereas Normal distr. does not Bayesian Inference and Maximum Entropy 2007
Bibliography • “A further look at robustness via Bayes;s theorem,” G.E.P. Box and G.C. Tiao, Biometrica49, pp. 419-432 (1962) • “On outlier rejection phenomena in Bayes inference,” A. O’Hagan, J. Roy. Statist. Soc.B 41, 358–367 (1979) • “Bayesian evaluation of discrepant experimental data,” F.H. Fröhner, Maximum Entropy and Bayesian Methods, pp. 467–474 (Kluwer Academic, Dordrecht, 1989) • “Estimators for the Cauchy distribution,” K.M. Hanson and D.R. Wolf, Maximum Entropy and Bayesian Methods, pp. 157-164 (Kluwer Academic, Dordrecht, 1993) • “Dealing with duff data,” D. Sivia, Maximum Entropy and Bayesian Methods, pp. 157-164 (1996) • “Understanding data better with Bayesian and global statistical methods,” in W.H. Press, Unsolved Problems in Astrophysics, pp. 49-60 (1997) • “Outlier-tolerant parameter estimation,” V. Dose and W. von der Linden, Maximum Entropy and Bayesian Methods, pp. 157-164 (AIP, 2000) This presentation available at http://www.lanl.gov/home/kmh/ Bayesian Inference and Maximum Entropy 2007