Geant4 Workshop, Catania, October 4th-9th 2004

STATISTICAL TOOLKIT
B. Mascialino, A. Pfeiffer, M.G. Pia, A. Ribon, P. Viarengo

A project to develop a statistical comparison system
Comparison of distributions: qualitative evaluation, quantitative evaluation
Goodness-of-fit testing
Provide tools for the statistical comparison of distributions:
• equivalent reference distributions
• experimental measurements
• data from reference sources
• functions deriving from theoretical calculations or fits
Use cases: detector monitoring, simulation validation, reconstruction vs. expectation, regression testing, physics analysis

Architectural guidelines
• The project adopts a solid architectural approach
  • to offer the functionality and the quality needed by the users
  • to be maintainable over a large time scale
  • to be extensible, to accommodate future evolutions of the requirements
• Component-based approach
  • to facilitate re-use and integration in different frameworks
• AIDA
  • adopt a (HEP) standard
  • no dependence on any specific analysis tool

Software process guidelines
• Unified Software Development Process, specifically tailored to the project
  • practical guidance and tools from the RUP
  • both rigorous and lightweight
  • mapping onto ISO 15504
• Guidance from ISO 15504
• Incremental and iterative life cycle model
SPIRAL APPROACH

User Requirements
User requirements elicited, analysed and formally specified
• Functional (capability) and non-functional (constraint) requirements
• User Requirements Document available from the web site
Requirement traceability
• Requirements
• Design
• Implementation
• Test & test results
• Documentation

The algorithms are specialised for the kind of distribution (binned/unbinned).
Every algorithm has been rigorously tested!
The Toolkit is downloadable from the web: http://www.ge.infn.it/geant4/analysis/HEPstatistics/
It is externally distributed with PI.

Chi-squared test
• Applies to binned distributions
• It can also be useful for unbinned distributions, but the data must be grouped into classes
• Cannot be applied if the theoretical frequency in any class is < 5
  • When this requirement is not met, one can try to merge contiguous classes until the minimum theoretical frequency is reached
  • Otherwise one can use Yates’ formula
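The class-merging rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Toolkit's implementation: the function names `merge_sparse_bins` and `chi_squared_statistic` are hypothetical, bins are merged left to right, and any sparse tail is folded into the last merged class.

```python
def merge_sparse_bins(observed, expected, min_expected=5.0):
    """Merge contiguous bins until every merged bin has expected >= min_expected.

    Assumption of this sketch: bins are scanned left to right, and a
    leftover sparse tail is folded into the last merged bin.
    """
    merged_obs, merged_exp = [], []
    acc_obs, acc_exp = 0.0, 0.0
    for o, e in zip(observed, expected):
        acc_obs += o
        acc_exp += e
        if acc_exp >= min_expected:
            merged_obs.append(acc_obs)
            merged_exp.append(acc_exp)
            acc_obs, acc_exp = 0.0, 0.0
    if acc_exp > 0.0 or acc_obs > 0.0:
        if merged_exp:
            # Fold the sparse tail into the previous merged bin.
            merged_obs[-1] += acc_obs
            merged_exp[-1] += acc_exp
        else:
            # Everything together is still below the threshold.
            merged_obs.append(acc_obs)
            merged_exp.append(acc_exp)
    return merged_obs, merged_exp


def chi_squared_statistic(observed, expected):
    """Pearson chi-squared statistic: sum of (O - E)^2 / E over the bins."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Example with made-up counts: the first two and last two bins are sparse.
obs, exp = merge_sparse_bins([1, 3, 10, 12, 2, 1], [2.0, 4.0, 9.0, 11.0, 2.0, 1.0])
chi2 = chi_squared_statistic(obs, exp)
```

Merging preserves the total counts but reduces the number of classes, so the number of degrees of freedom used to obtain a p-value has to be reduced accordingly.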

More sophisticated algorithms (unbinned distributions)
• Kolmogorov-Smirnov test
• Goodman approximation of KS test
• Kuiper test
SUPREMUM STATISTICS
[Figure: ORIGINAL DISTRIBUTIONS and their EMPIRICAL DISTRIBUTION FUNCTION, with Dmn marked]
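As an illustration of what a supremum statistic measures, here is a minimal Python sketch of the two-sample Kolmogorov-Smirnov distance Dmn and of the Kuiper statistic, both computed from the empirical distribution functions of two unbinned samples. The function names are hypothetical and do not reflect the Toolkit's interface; only the test statistics are computed, not the p-values.

```python
def edf_deviations(sample_a, sample_b):
    """Largest positive and negative deviations between two empirical
    distribution functions, F_m (sample_a) and G_n (sample_b)."""
    a, b = sorted(sample_a), sorted(sample_b)
    m, n = len(a), len(b)
    i = j = 0
    d_plus = d_minus = 0.0
    while i < m and j < n:
        x = min(a[i], b[j])
        # Advance both samples past x so that ties are handled together.
        while i < m and a[i] <= x:
            i += 1
        while j < n and b[j] <= x:
            j += 1
        diff = i / m - j / n          # F_m(x) - G_n(x)
        d_plus = max(d_plus, diff)
        d_minus = max(d_minus, -diff)
    return d_plus, d_minus


def ks_statistic(sample_a, sample_b):
    """Kolmogorov-Smirnov distance: sup |F_m - G_n|."""
    return max(edf_deviations(sample_a, sample_b))


def kuiper_statistic(sample_a, sample_b):
    """Kuiper statistic: sup (F_m - G_n) + sup (G_n - F_m)."""
    d_plus, d_minus = edf_deviations(sample_a, sample_b)
    return d_plus + d_minus
```

The Kuiper statistic adds the two one-sided deviations instead of taking the larger one, so it responds to differences in both directions along the range of the data.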

More powerful algorithms (unbinned distributions)
• Cramer-von Mises test
• Anderson-Darling test
• Fisz-Cramer-von Mises test
• k-sample Anderson-Darling test
TESTS CONTAINING A WEIGHTING FUNCTION
These algorithms are so powerful that we decided to implement their equivalents for binned distributions.
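To show what the weighting function changes, the following Python sketch computes two-sample statistics of the Cramer-von Mises type (unit weight) and of the Anderson-Darling type (weight 1/(H(1-H)), with H the pooled empirical distribution function). It is a textbook-style illustration under stated assumptions, not the Toolkit's code: the function name is hypothetical, the data are assumed to contain no ties, and no p-value is computed.

```python
def weighted_edf_statistics(sample_a, sample_b):
    """Return (cvm, ad): two-sample quadratic EDF statistics.

    cvm = m*n/N^2 * sum over pooled points of (F_m - G_n)^2
    ad  = m*n/N^2 * sum over pooled points of (F_m - G_n)^2 / (H * (1 - H))
    where N = m + n and H is the pooled EDF. Assumes no tied values.
    """
    m, n = len(sample_a), len(sample_b)
    total = m + n
    # Label 0 marks points from sample_a, label 1 points from sample_b.
    pooled = sorted([(x, 0) for x in sample_a] + [(x, 1) for x in sample_b])
    count_a = 0
    cvm_sum = 0.0
    ad_sum = 0.0
    for rank, (_, label) in enumerate(pooled, start=1):
        if label == 0:
            count_a += 1
        diff = count_a / m - (rank - count_a) / n   # F_m - G_n at this point
        cvm_sum += diff ** 2
        if rank < total:
            # The last pooled point has H = 1 and diff = 0, so it is skipped.
            h = rank / total
            ad_sum += diff ** 2 / (h * (1.0 - h))
    scale = m * n / total ** 2
    return scale * cvm_sum, scale * ad_sum
```

Because 1/(H(1-H)) grows near H = 0 and H = 1, the Anderson-Darling form gives more weight to discrepancies in the tails of the distributions than the unweighted form does.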

Is2 the most powerful algorithm? Test Power Characteristics Supremum statistics tests Tests containing a weight function 2 < < The power of a test is the probability of rejecting the null hypothesis correctly In terms of power: Talk at IEEE NSS, Rome, 16-22 October 2004 + paper submitted for publication November 2004

GPL License Feedback from users is welcome!

User Documentation
• Download
• Installation
• User Guide
• Statistics Reference Guide

User’s point of view
The user is completely shielded from both statistical and computing complexity.
• Simple user layer
• Only deal with AIDA objects and choice of comparison algorithm
USER EXTRACTS THE ALGORITHM WRITING ONE LINE OF CODE
[Diagram: STATISTICAL TOOLKIT, RESULT]

Examples of practical applications

Microscopic validation of physics
[Figure: NIST, Geant4 Standard, Geant4 LowE]
Chi-squared test: p=1
Geant4 simulations are statistically comparable with reference data (NIST database http://www.nist.gov)
READY FOR REGRESSION TESTING
THANKS TO SUSANNA GUATELLI

Test beam at Bessy, Bepi-Colombo mission
X-ray fluorescence spectrum in Iceland basalt (EIN=6.5 keV)
[Figure: counts vs. energy (keV); very complex distributions]
χ² not appropriate (< 5 entries in some bins, physical information would be lost if rebinned)
Anderson-Darling: p>0.05
Experimental measurements are comparable with Geant4 simulations
THANKS TO ALFONSO MANTERO

Medical physics: IMRT treatment at
Kolmogorov-Smirnov test
THANKS TO MICHELA PIERGENTILI

Conclusions
• This is a new, up-to-date, easy-to-handle and powerful tool for statistical comparison in particle physics.
• Rigorous software process to contribute to the quality of the product
• Component-based architecture, OO methods + generic programming to ensure openness to evolution, maintainability, ease of use
• It is the first tool supplying such a variety of sophisticated and powerful statistical tests in HEP.
• AIDA interfaces allow its integration in any other concrete data analysis tool.
Applications in: HEP, astrophysics, medical physics, …
WE INVITE ANYONE TO USE IT!!!!

Future developments
• Power comparison among algorithms
• Extension to theoretical functions
• Extensions to bidimensional distributions
