
Bayesian Within The Gates A View From Particle Physics


Presentation Transcript


  1. Bayesian Within The Gates: A View From Particle Physics Harrison B. Prosper, Florida State University, SAMSI, 24 January 2006

  2. Outline • Measuring Zero as Precisely as Possible! • Signal/Background Discrimination • 1-D Example • 14-D Example • Some Open Issues • Summary

  3. Measuring Zero! Diamonds may not be forever: neutron <-> anti-neutron transitions, CRISP experiment (1982–1985), Institut Laue-Langevin, Grenoble, France. Method: fire a gas of cold neutrons onto a graphite foil and look for annihilation of the anti-neutron component.

  4. Measuring Zero! Count the number of signal + background events, N. Suppress the putative signal and count the background events, B, independently. Results: N = 3, B = 7.

  5. Measuring Zero! Classic 2-parameter counting experiment: N ~ Poisson(s + b), B ~ Poisson(b). Wanted: a statement of the form s < u(N, B) @ 90% CL.

  6. Measuring Zero! In 1984, no exact solution existed in the particle physics literature! But, surely it must have been solved by statisticians. Alas, from Kendall and Stuart I learnt that calculating exact confidence intervals is “a matter of very considerable difficulty”.

  7. Measuring Zero! Exact in what way? Over the ensemble of statements of the form s ∈ [0, u), at least 90% of them should be true whatever the true value of the signal s AND whatever the true value of the background parameter b. Blame… Neyman (1937).

  8. “Keep it simple, but no simpler” (Albert Einstein)

  9. Bayesian @ the Gate (1984) Solution: p(N, B|s, b) = Poisson(N|s + b) Poisson(B|b), the likelihood; p(s, b) = uniform(s, b), the prior. Compute the posterior density: p(s, b|N, B) = p(N, B|s, b) p(s, b) / p(N, B). Marginalize over b: p(s|N, B) = ∫ p(s, b|N, B) db. This reasoning was compelling to me then, and is much more so now!
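The marginalization above can be sketched numerically. A minimal grid-based version, using the counts N = 3, B = 7 from slide 4; the grid ranges and step sizes are ad hoc choices, not from the talk:

```python
import math
import numpy as np

N, B = 3, 7  # observed counts

# grids for the signal and background parameters (ranges chosen ad hoc)
s = np.linspace(0.0, 20.0, 401)
b = np.linspace(0.0, 30.0, 601)
S, Bg = np.meshgrid(s, b, indexing="ij")

def poisson_pmf(k, mu):
    return np.exp(-mu) * mu**k / math.factorial(k)

# likelihood p(N,B|s,b) = Poisson(N|s+b) Poisson(B|b), with a flat prior p(s,b)
joint = poisson_pmf(N, S + Bg) * poisson_pmf(B, Bg)

# marginalize over b (Riemann sum), then normalize to get p(s|N,B)
post = joint.sum(axis=1)
post /= post.sum()

# 90% Bayesian upper limit u: smallest grid value with sum_{s'<=u} p(s'|N,B) >= 0.9
cdf = np.cumsum(post)
u = s[np.searchsorted(cdf, 0.90)]
print(f"90% Bayesian upper limit on s: {u:.2f}")
```

With B > N, the posterior peaks near s = 0, and the upper limit comes out at a few events, consistent with "measuring zero".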

  10. Particle Physics Data proton + anti-proton -> positron (e+) + neutrino (ν) + Jet1 + Jet2 + Jet3 + Jet4. This event “lives” in 3 + 2 + 3 × 4 = 17 dimensions.

  11. Particle Physics Data CDF/DZero discovery of the top quark (1995). Data: red. Signal: green. Background: blue, magenta. DZero: 17-D -> 2-D.

  12. But that was then, and now is now! Today we have 2 GHz laptops with 2 GB of memory! It is fun to deploy huge, sometimes unreliable, computational resources, that is, brains, to reduce the dimensionality of data. But perhaps it is now feasible to work directly in the original high-dimensional space, using hardware!

  13. Signal/Background Discrimination The optimal solution is to compute p(S|x) = p(x|S) p(S) / [p(x|S) p(S) + p(x|B) p(B)]. Every signal/background discrimination method is ultimately an algorithm to approximate this solution, or a mapping thereof. Therefore, if a method is already at the Bayes limit, no other method, however sophisticated, can do better!
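As a toy illustration of this formula, assume unit Gaussian densities centered at +1 (signal) and -1 (background); these are stand-ins, not the talk's actual distributions:

```python
import math

def norm_pdf(x, mu, sigma=1.0):
    """Gaussian density; plays the role of p(x|S) or p(x|B)."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def p_S_given_x(x, p_S=0.5):
    """Bayes discriminant p(S|x) for hypothetical unit Gaussians at +1 (S) and -1 (B)."""
    num = norm_pdf(x, +1.0) * p_S
    den = num + norm_pdf(x, -1.0) * (1.0 - p_S)
    return num / den

print(p_S_given_x(0.0))  # 0.5 by symmetry
print(p_S_given_x(2.0))  # ≈ 0.98: well into the signal region
```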

  14. Signal/Background Discrimination Given D = (x, y), with x = {x1,…,xN}, y = {y1,…,yN}, a set of N training examples, infer a discriminant function f(x, w) with parameters w: p(w|x, y) = p(x, y|w) p(w) / p(x, y) = p(y|x, w) p(x|w) p(w) / [p(y|x) p(x)] = p(y|x, w) p(w) / p(y|x), assuming p(x|w) -> p(x).

  15. Signal/Background Discrimination A typical likelihood for classification: p(y|x, w) = Πi f(xi, w)^yi [1 – f(xi, w)]^(1 – yi), where yi = 0 for background events and yi = 1 for signal events. If f(x, w) is flexible enough, then maximizing p(y|x, w) with respect to w yields f = p(S|x), asymptotically.
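A minimal sketch of this maximization, using a logistic f(x, w) = sigmoid(w0 + w1·x) and a hypothetical 1-D Gaussian dataset (signal at +1, background at -1), not the talk's data:

```python
import math
import random

random.seed(1)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# toy 1-D training set: 500 signal (y=1) and 500 background (y=0) events
data  = [(random.gauss(+1, 1), 1) for _ in range(500)]
data += [(random.gauss(-1, 1), 0) for _ in range(500)]

# maximize log p(y|x,w) = sum_i [y_i log f + (1 - y_i) log(1 - f)] by gradient ascent
w0 = w1 = 0.0
for _ in range(500):
    g0 = g1 = 0.0
    for x, y in data:
        r = y - sigmoid(w0 + w1 * x)  # gradient of the per-event log-likelihood
        g0 += r
        g1 += r * x
    w0 += 0.5 * g0 / len(data)
    w1 += 0.5 * g1 / len(data)

# for equal-size unit Gaussians at ±1 the exact p(S|x) is sigmoid(2x),
# so the fit should drive w1 toward ~2 and w0 toward ~0
print(w0, w1)
```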

  16. Signal/Background Discrimination However, in a full Bayesian calculation one usually averages with respect to the posterior density: y(x) = ∫ f(x, w) p(w|D) dw. Questions: 1. Do suitably flexible functions f(x, w) exist? 2. Is there a feasible way to do the integral?
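The posterior average amounts to a Monte Carlo estimate over sampled weight points. A sketch with a toy (1, 3, 1) network; the randomly generated "chain" here is a stand-in for weights actually drawn from p(w|D) by MCMC:

```python
import math
import random

random.seed(0)
H = 3  # hidden units in a toy (1, 3, 1) network

def network(x, w):
    """n(x, w) with w = (u, a, v, b): tanh hidden layer, sigmoid output."""
    u, a, v, b = w
    z = sum(v[j] * math.tanh(u[j] * x + a[j]) for j in range(H)) + b
    return 1.0 / (1.0 + math.exp(-z))

# stand-ins for the last M weight points of an MCMC chain from p(w|D)
chain = [([random.gauss(0, 1) for _ in range(H)],   # u
          [random.gauss(0, 1) for _ in range(H)],   # a
          [random.gauss(0, 1) for _ in range(H)],   # v
          random.gauss(0, 1))                       # b
         for _ in range(100)]

def y(x):
    # Monte Carlo estimate of y(x) = ∫ f(x, w) p(w|D) dw
    return sum(network(x, w) for w in chain) / len(chain)

print(y(0.5))  # a probability in (0, 1)
```

Averaging networks in this way is what produces the smooth black curve on slide 21.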

  17. Answer 1: Hilbert’s 13th Problem! Prove that the following is impossible: y(x, y, z) = F(A(x), B(y), C(z)). In 1957, Kolmogorov proved the contrary conjecture: y(x1,…,xn) = F(f1(x1),…,fn(xn)). I’ll call such functions F Kolmogorov functions.

  18. Kolmogorov Functions [Diagram: a two-input neural network n(x, w) with inputs x1, x2, hidden-layer weights and biases (u, a), and output weights and bias (v, b)] A neural network is an example of a Kolmogorov function, that is, a function capable of approximating arbitrary mappings f: R^N -> U. The parameters w = (u, a, v, b) are called weights.

  19. Answer 2: Use Hybrid MCMC Computational method: generate a Markov chain of N points {w} drawn from the posterior density p(w|D) and average over the last M points. Each point corresponds to a network. Software: Flexible Bayesian Modeling by Radford Neal, http://www.cs.utoronto.ca/~radford/fbm.software.html

  20. A 1-D Example Signal: p + pbar -> t q b. Background: p + pbar -> W b b. NN model class: (1, 15, 1). MCMC: 500 tqb + Wbb events; use the last 20 networks in a Markov chain of 500. [Plot: tqb and Wbb distributions vs. x]

  21. A 1-D Example [Plot vs. x] Dots: p(S|x) = HS/(HS + HB), where HS and HB are 1-D histograms. Curves: individual NNs n(x, wk). Black curve: <n(x, w)>.

  22. A 14-D Example (Finding Susy!) [Plot: transverse momentum spectra; signal: black curve] Signal/noise: 1/100,000.

  23. A 14-D Example (Finding Susy!) Missing transverse momentum spectrum (caused by the escape of neutrinos and Susy particles). Variable count: 4 × (ET, η, φ) + (ET, φ) = 14.

  24. A 14-D Example (Finding Susy!) Signal: 250 p + pbar -> top + anti-top (MC) events. Background: 250 p + pbar -> gluino gluino (MC) events. NN model class: (14, 40, 1) (a 641-D parameter space!). MCMC: use the last 100 networks in a Markov chain of 10,000, skipping every 20.

  25. But does it Work? Signal to noise can reach 1/1 with an acceptable signal strength.

  26. But does it Work? Let d(x) = N p(x|S) + N p(x|B) be the density of the data, containing 2N events, assuming, for simplicity, p(S) = p(B). A properly trained classifier y(x) approximates p(S|x) = p(x|S)/[p(x|S) + p(x|B)]. Therefore, if the signal and background events are weighted with y(x), we should recover the signal density.
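A quick numerical check of this weighting argument, with hypothetical unit Gaussians at ±1 standing in for the signal and background densities: since d(x)·p(S|x) = N p(x|S) exactly, the weighted sample should reproduce the signal density.

```python
import math
import random

random.seed(2)

def norm_pdf(x, mu):
    return math.exp(-0.5 * (x - mu) ** 2) / math.sqrt(2 * math.pi)

def p_S(x):
    # exact Bayes discriminant for unit Gaussians at +1 (S) and -1 (B)
    return norm_pdf(x, +1) / (norm_pdf(x, +1) + norm_pdf(x, -1))

# d(x): an equal mix of N signal and N background events
N = 5000
sample = [random.gauss(+1, 1) for _ in range(N)] + \
         [random.gauss(-1, 1) for _ in range(N)]

# weight each event by p(S|x); the weighted sample should follow p(x|S)
wsum  = sum(p_S(x) for x in sample)
wmean = sum(p_S(x) * x for x in sample) / wsum
print(wmean)  # ≈ +1, the signal mean
```

The sum of weights also estimates the signal yield N, which is the practical payoff of the technique.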

  27. But does it Work? Amazingly well!

  28. Some Open Issues • Why does this insane function p(w1,…,w641|x1,…,x500) behave so well? 641 parameters > 500 events! • How should one verify that an n-D (n ~ 14) swarm of simulated background events matches the n-D swarm of observed events (in the background region)? • How should one verify that y(x) is indeed a reasonable approximation to the Bayes discriminant, p(S|x)?

  29. Summary • Bayesian methods have been, and are being, used with considerable success by particle physicists. Happily, the frequentist/Bayesian Cold War is abating! • The application of Bayesian methods to highly flexible functions, e.g., neural networks, is very promising and should be broadly applicable. • Needed: a powerful way to compare high-dimensional swarms of points. Agree, or not agree, that is the question!
