1 / 18

General-purpose unfolding framework in ROOT

General-purpose unfolding framework in ROOT. Tim Adye Rutherford Appleton Laboratory BaBar UK Meeting Liverpool University 30 th November 2005. Outline. What is Unfolding? and why might you want to do it? Overview of a few techniques Regularised unfolding Iterative method

zubin
Download Presentation

General-purpose unfolding framework in ROOT

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. General-purpose unfolding framework in ROOT Tim Adye Rutherford Appleton Laboratory BaBar UK Meeting Liverpool University 30th November 2005 Tim Adye

  2. Outline • What is Unfolding? • and why might you want to do it? • Overview of a few techniques • Regularised unfolding • Iterative method • RooUnfold package • Currently implements three methods with a common interface • Status and Plans • References Tim Adye

  3. Unfolding • In other fields known as “deconvolution”, “unsmearing” • Given a “true” PDF in μ, that is corrupted by detector effects, described by a response function, R, we measure a distribution in ν. In terms of histograms • This may involve • inefficiencies: lost events • bias and smearing: events moving between bins(off-diagonal Rij) • With infinite statistics, it would be possible to recover the original PDF by inverting the response matrix Tim Adye

  4. Not so simple… • Unfortunately, if there are statistical fluctuations between bins this information is destroyed • Since R washes out statistical fluctuations, R-1 cannot distinguish between wildly fluctuating and smooth PDFs • Obtain large negative correlations between adjacent bins • Large fluctuations in reconstructed bin contents • Need some procedure to remove wildly fluctuating solutions • Give added weight to “smoother” solutions • Solve for µ iteratively, starting with a reasonable guess and truncate iteration before it gets out of hand • Ignore bin-to-bin fluctuations altogether Tim Adye

  5. What happens if you don’t smooth Tim Adye

  6. True Gaussian, with Gaussian smearing, systematic translation, and variable inefficiency – trained using a different Gaussian

  7. Double Breit-Wigner, with Gaussian smearing, systematic translation, and variable inefficiency – trained using a single Gaussian

  8. So why don’t we always do this? • If the true PDF and resolution function can be parameterised, then a Maximum Likelihood fit is usually more convenient • Directly returns parameters of interest • Does not require binning • If the response function doesn’t include smearing (ie. it’s diagonal), then apply bin-by-bin efficiency correction directly • If result is just needed for comparison (eg. with MC), could apply response function to MC • simpler than un-applying response to data Tim Adye

  9. When to use unfolding • Use unfolding to recover theoretical distribution where • there is no a-priori parameterisation • this is needed for the result and not just comparison with MC • there is significant bin-to-bin migration of events Tim Adye

  10. Where could we use unfolding? • Traditionally used to extract structure functions • Widely used outside PP for image reconstruction • Dalitz plots • Cross-feed between bins due to misreconstruction • “True” decay momentum distributions • Theory at parton level, we measure hadrons • Correct for hadronisation as well as detector effects • Maybe could use smoothing for standard ML fits? Tim Adye

  11. 1. Regularised Unfolding • Use Maximum Likelihood to fit smeared bin contents to measured data, but include regularisation function where the regularisation parameter, α, controls the degree of smoothness (selectαto, eg., minimise mean squared error) • Various choices of regularisation function, S, are used • Tikhonov regularisation: minimise curvature • for some definition of curvature, eg. • RooUnfHistoSvdby Kerstin Tackmann and Heiko Lacker • based onGURU by Andreas Höcker and Vakhtang Kartvelishvili • uses Singular Value Decomposition • RUN by Volker Blobel • Maximum entropy: Tim Adye

  12. 2. Iterative method • Uses Bayes’ theorem to invert and using an initial set of probabilities, pi(eg. flat) obtain an improved estimate • Repeating with new pi from these new bin contents converges quite rapidly • Truncating the iteration prevents us seeing the bad effects of statistical fluctuations • Fergus Wilson and I have implemented this method in ROOT/C++ • Supports 1D, 2D, and 3D cases Tim Adye

  13. 2D Unfolding Example2D Smearing, bias, variable efficiency, and variable rotation

  14. RooUnfold Package • Make these different methods available as ROOT/C++ classes with a common interface to specify • unfolding method and parameters • response matrix • pass directly or fill from MC sample • measured histogram • return reconstructed truth histogram and errors • full covariance matrix • Easy to do with multiple dimensions (when supported) • This would make it easy to try and compare different methods in your analysis • Could also be useful outside BaBar! Tim Adye

  15. RooUnfold Status • Implements • RooUnfoldResponse • response matrix with various filling and access methods • create from MC, use on data (can be stored in a file) • RooUnfold – unfolding algorithm base class • RooUnfoldBayes – Iterative method • RooUnfoldSvd – Inteface to RooUnfHistoSvd package • RooUnfoldBinByBin – Simple bin-by-bin method • Trivial implementation, but useful to compare with full unfolding • RooUnfoldTest and RooUnfoldTest2D • Test with different training and unfolding distributions • Ready for CVS release (next few days) • Announce in Statistics HN • Interface can still be adjusted based on comments Tim Adye

  16. Plans and possible improvements • So far this is mostly a programming exercise • Would be interesting to compare the different methods for some real analysis distributions • But YMMV • Add common tools, useful for all algorithms • Inputs and results in different formats • already supports histograms and ROOT vectors/matrices • Automatic calculation of figures of merit (eg. Â2) • can also use standard ROOT functions on histograms • Simplify selection of regularisation parameter • More algorithms? • Maximum entropy regularisation • Simple matrix inversion without regularisation • perhaps useful with large statistics Tim Adye

  17. References - Overview • G. Cowan, A Survey of Unfolding Methods for Particle Physics, Proc. Advanced Statistical Techniques in Particle Physics, Durham (2002) http://www.ippp.dur.ac.uk/Workshops/02/statistics/ • G. Cowan, Statistical Data Analysis, Oxford University Press (1998), Chapter 11: Unfolding • R. Barlow, SLUO Lectures on Numerical Methods in HEP (2000),Lecture 9: Unfolding www-group.slac.stanford.edu/sluo/Lectures/Stat_Lectures.html Tim Adye

  18. References - Techniques • V. Blobel, Unfolding Methods in High Energy Physics,DESY 84-118 (1984); also CERN 85-02 • A. Höcker and V. Kartvelishvili, SVD Approach to Data Unfolding, NIM A 372 (1996) 469 www.lancs.ac.uk/depts/physics/staff/kartvelishvili.html • K. Tackmann, H. Lacker, Unfolding the Hadronic Mass Spectrumin B->XulνDecays, BAD 894. • G. D’Agostini, A multidimensional unfolding method based on Bayes’ theorem, NIM A 362 (1995) 487 Tim Adye

More Related