110 likes | 263 Views
Tutorial of the Statistical Toolkit. Geant4 Workshop Vancouver, September 2003. Alberto Ribon CERN. http://www.ge.infn.it/geant4/analysis/TandA. Download/setup the Statistical Toolkit. http://www.ge.infn.it/geant4/analysis/HEPstatistics
E N D
Tutorial of the Statistical Toolkit Geant4 Workshop Vancouver, September 2003 Alberto Ribon CERN http://www.ge.infn.it/geant4/analysis/TandA
Download/setup the Statistical Toolkit http://www.ge.infn.it/geant4/analysis/HEPstatistics Download : StatisticsTesting-V1-00-00.tgz tarball You also need to install the following libraries: 1) GSL (GNU Scientific Library) 2) AIDA 3.0 3) Anaphe 5.0.5 Edit the script buildAll.py to set the proper GSL path GSL_DIR=/afs/cern.ch/sw/lhcxx/specific/redhat73/gcc3.2/ PublicDomainPackages/2.0.0/ Run the script: $ ./buildAll.py
Statistiscal Tests available Currently the available statistical tests are: 1) Chi2 test (for binned distributions); 2) Kolmogorov-Smirnov test (for unbinned distributions). 3) Cramer-von Mises test (for both binned and unbinned distributions); 4) Anderson-Darling test (for both binned and unbinned distributions); …
The AIDA classes you need to know AIDA::IHistogram1D : for 1-dimensional binned distribution AIDA::ICloud1D : for 1-dimensional unbinned distribution AIDA::IDataPointSet : vector of IDataPoint; IDataPoint : is a vector of IMeasurement; IMeasurement : (value, errorPlus, errorMinus).
The StatisticsTesting classes you need to know namespace StatisticsTesting { template < class Algorithm > // For instance: Chi2ComparisonAlgorithm , // KolmogorovSmirnovComparisonAlgorithm . class StatisticsComparator { public: ComparisonResult compare (const AIDA::IDataPointSet & dps1, const AIDA::IDataPointSet & dps2); ComparisonResult compare (const AIDA::IHistogram1D & histo1, const AIDA::IHistogram1D & histo2); ComparisonResult compare (const AIDA::ICloud1D & cloud1, const AIDA::ICloud1D & cloud2); ... } class ComparisonResult { public: double distance( ); double quality ( ); double ndf ( ); ... } }
Chi2 test between histograms #Include “AIDA/AIDA.h” #include “StatisticsTesting/StatisticsComparator.h” #include “Chi2ComparisonAlgorithm.h” #include “ComparisonResult.h” using namespace StatisticsTesting; std::auto_ptr<AIDA::IAnalysisFactory> af( AIDA_createAnalysisFactory() ); std::auto_ptr<AIDA::ITreeFactory> tf( af -> createTreeFactory() ); std::auto_ptr<AIDA::ITree> tree( tf -> create() ); std::auto_ptr<AIDA::IHistogramFactory> hf( af->createHistogramFactory( *tree ) ); AIDA::IHistogram1D& hA = *( hf->createHistogram1D( "A", 100, 0.0, 50.0) ); AIDA::IHistogram1D& hB = *( hf->createHistogram1D( "B", 100, 0.0, 50.0) ); hA.fill( 15.7 ); ... hB.fill( 23.4 ); ... StatisticsComparator< Chi2ComparisonAlgorithm > comparator; ComparisonResult result = comparator.compare( hA, hB ); std::cout << “ distance=“ << result.distance() << “ ndf=“ << result.ndf() << “ p-value=“ << result.quality();
Kolmorogov-Smirnov test between clouds #Include “AIDA/AIDA.h” #include “StatisticsTesting/StatisticsComparator.h” #include “KolmogorovSmirnovComparisonAlgorithm.h" #include “ComparisonResult.h” using namespace StatisticsTesting; std::auto_ptr<AIDA::IAnalysisFactory> af( AIDA_createAnalysisFactory() ); std::auto_ptr<AIDA::ITreeFactory> tf( af -> createTreeFactory() ); std::auto_ptr<AIDA::ITree> tree( tf -> create() ); std::auto_ptr<AIDA::IHistogramFactory> hf( af->createHistogramFactory( *tree ) ); AIDA::ICloud1D& cloudA = *( hf->createCloud1D( "A" ) ); AIDA::ICloud1D& cloudB = *( hf->createCloud1D( "B" ) ); cloudA.fill( 15.7 ); ... cloudB.fill( 23.4 ); ... StatisticsComparator< KolmogorovSmirnovComparisonAlgorithm > comparator; ComparisonResult result = comparator.compare( cloudA, cloudB ); std::cout << “ K-S distance=“ << result.distance() << “ p-value=“ << result.quality();
Example of an XML data file <?xml version=“1.0” encoding=“ISO-8859-1”?> <!DOCTYPE aida SYSTEM http://aida.freehep.org/schemas/3.0/aida.dtd > <aida version=“3.0”> <implementation package=“Anaphe” version=“5.0.0”/> <dataPointSet dimension=“2” name=“attenuation coefficient” path=“/” title=“attenuation coefficient in Ge”> <annotation> <item key=“Title” value=“attenuation coefficient in Ge”/> <item key=“Name” value=“attenuation coefficient”/> <item key=“Size” value=“24”/> </annotation> <dataPoint> <measurement value=“1.000e-03”/> <measurement errorMinus=“9.465e+01” errorPlus=“9.465e+01” value=“1.893e+03” /> </dataPoint> <measurement value=“1.500e-03/> <measurement errorMinus=“1.3555e+02” errorPlus=“2.7375e+02” value=“5.475e+03”/> </dataPoint> … <dataPoint> <measurement value=“1.500e+01/> <measurement errorMinus=“1.670e-03” errorPlus=“1.670e-03” value=“3.340e-02”/> </dataPoint> </dataPointSet> </aida>
How to write a XML data file aFact = AIDA_createAnalysisFactory(); treeFact = aFact -> createTreeFactory(); … theTree = treeFact -> create(“test50.xml”, “xml”, false, true, “uncompress” ); dataPointFactory = aFact -> createDataPointSetFactory( *theTree ); particleTransmissionDataPoint = dataPointFactory -> create(“Transmission test”, 2); … particleTransmissionDataPoint -> addPoint(); AIDA::IDataPoint* point = particleTransmissionDataPoint -> point( PointNumber ); AIDA::IMeasurement* coordinateX = point -> coordinate( 0 ); coordinateX -> setValue( primaryParticleEnergy ); AIDA::IMeasurement* coordinateY = point -> coordinate( 1 ); coordinateY -> setValue( TransFraction ); coordinateY -> setErrorPlus( TransError ); coordinateY -> setErrorMinus( TransError ); … theTree -> commit(); theTree -> close();
How to compare XML dataPointSets #Include “AIDA/AIDA.h” #include “StatisticsTesting/StatisticsComparator.h” #include “Chi2ComparisonAlgorithm.h” #include “ComparisonResult.h” using namespace StatisticsTesting; std::auto_ptr<AIDA::IAnalysisFactory> af( AIDA_createAnalysisFactory() ); std::auto_ptr<AIDA::ITreeFactory> tf( af -> createTreeFactory() ); std::auto_ptr<AIDA::ITree> tree( tf -> create() ); std::auto_ptr<AIDA::IDataPointSetFactory> dpsf( af -> createDataPointSetFactory( *tree ) ); std::auto_ptr<AIDA::ITree> treeXML1( tf -> create(“gamma_lowE_Ge.xml”, “xml”, true, false); std::auto_ptr<AIDA::ITree> treeXML2( tf -> create(“NIST_attenuationGamma_Ge.xml”, true, false); AIDA::IDataPointSet & dps1 = * ( dynamic_cast<AIDA::IDataPointSet*>( treeXML1 -> find(“Gamma attenuation coefficient test” ) ) ); AIDA::IDataPointSet & dps2 = * ( dynamic_cast<AIDA::IDataPointSet*>( treeXML2 -> find(“ Gamma attenuation coefficient test” ) ) ); std::cout << “ title ” << dps1.title() << “ size=“ << dps1.size() << “ dimension=“ << dps1.dimension() StatisticsComparator< Chi2ComparisonAlgorithm > comparator; ComparisonResult result = comparator.compare( dps1, dps2 ); std::cout << “ distance=“ << result.distance() << “ ndf=“ << result.ndf() << “ p-value=“ << result.quality();
Conclusions • The Statistical Toolkit provides already some important functionalities; • It is already used in test50 for Geant4 physics tests and regression; • it is simple to install and to use; we are working on the documentation; • other statistical tests are under development; • various long-term extensions are foreseen …