The Statistical Testing Project

This project introduces a new statistical tool for comparing physical distributions of real data, applicable to physics validation and regression testing. It focuses on goodness-of-fit tests such as Pearson’s chi-squared, Kolmogorov-Smirnov, and Anderson-Darling. The software design follows an object-oriented approach. User requirements include handling, converting, comparing, and plotting distributions, as well as confidence levels. Open issues include the need for a Gamma function in the chi-squared test and a request for suggestions on unit-test cases.


Presentation Transcript


  1. The Statistical Testing Project. Stefania Donadio and Barbara Mascialino, January 15th, 2003

  2. Aim of the project This project will provide a new way of analysing physical distributions of real data. It was conceived as a tool for the statistical testing of Geant4: its application areas are physics validation, regression testing and system testing. Its generality may also make it of interest in other experimental contexts. At the moment, the core statistical component is designed to compare two distributions, independently of their origin.

  3. Distributions
     By means of this statistical tool, the user shall be able to compare Geant4 simulation results with:
     • equivalent reference distributions,
     • experimental measurements,
     • data libraries from reference distribution sources,
     • functions deriving from theoretical calculations,
     • functions deriving from fits, …

  4. Goodness-of-Fit tests Goodness-of-fit tests verify the hypothesis that experimental data come from a random variable whose distribution is known. This problem is important in both theoretical and experimental analysis: the researcher must decide whether the theoretical and experimental distributions follow the same functional law. In other words, the problem is the choice between two alternative hypotheses: H0: F0(x) = FT(x) versus H1: F0(x) ≠ FT(x) (two-sided), or F0(x) < FT(x) or F0(x) > FT(x) (one-sided). In this kind of test, accepting the null hypothesis H0 means that the researcher is able to specify the distribution analysed.
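As an illustration of how the two-sided hypothesis H0: F0(x) = FT(x) can be turned into a number, the sketch below computes the Kolmogorov distance D = sup |F0(x) - FT(x)| between the empirical distribution of an unbinned sample and a known reference CDF. The function names and the uniform reference are assumptions made for this example only; they are not taken from the project's code.

```python
def kolmogorov_distance(sample, reference_cdf):
    """Return D = sup_x |F0(x) - FT(x)| for an unbinned sample.

    F0 is the empirical CDF of `sample`; FT is `reference_cdf`.
    """
    xs = sorted(sample)
    n = len(xs)
    if n == 0:
        raise ValueError("sample must not be empty")
    d = 0.0
    for i, x in enumerate(xs):
        ft = reference_cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x,
        # so the distance is checked on both sides of the jump.
        d = max(d, (i + 1) / n - ft, ft - i / n)
    return d


def uniform_cdf(x):
    """Hypothetical reference: CDF of the uniform distribution on [0, 1]."""
    return min(1.0, max(0.0, x))
```

A small D supports H0, a large D speaks against it; turning D into a confidence level requires the distribution of D under H0, which is not covered here.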

  5. GOF tests inserted in the statistical package
     • Pearson’s χ² test
     • Kolmogorov test
     • Kolmogorov-Smirnov test
     • Anderson-Darling test (for both continuous and discrete distributions)

  6. Description of tests Pearson’s chi-squared test was introduced to study the fit of discrete distributions (both quantitative and qualitative). The Kolmogorov-Smirnov test is useful for verifying the fit of a sample drawn from a continuous random variable. The Anderson-Darling test is suitable for any data set (Aksenov and Savageau, 2002) with any skewness (symmetric, left-skewed or right-skewed distributions). Moreover, it appears to be sensitive to the fat tails of distributions.
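To make the descriptions above concrete, here is a minimal sketch of two of the test statistics in their textbook form: Pearson's chi-squared for binned counts and the Anderson-Darling A² for an unbinned sample against a continuous reference CDF. The function names are hypothetical and the code is not the project's implementation; the discrete variant of Anderson-Darling is not shown.

```python
import math


def pearson_chi_squared(observed, expected):
    """Sum over bins of (observed - expected)^2 / expected.

    Bins with zero expected content are skipped (a simplifying assumption).
    """
    if len(observed) != len(expected):
        raise ValueError("histograms must have the same number of bins")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


def anderson_darling(sample, reference_cdf):
    """A^2 = -n - (1/n) * sum_{i=1..n} (2i - 1) [ln F(x_i) + ln(1 - F(x_{n+1-i}))].

    Requires 0 < F(x) < 1 for every sample point.
    """
    xs = sorted(sample)
    n = len(xs)
    if n == 0:
        raise ValueError("sample must not be empty")
    total = 0.0
    for i in range(1, n + 1):
        f_low = reference_cdf(xs[i - 1])   # F at the i-th smallest point
        f_high = reference_cdf(xs[n - i])  # F at the i-th largest point
        total += (2 * i - 1) * (math.log(f_low) + math.log(1.0 - f_high))
    return -n - total / n
```

The logarithms give extra weight to points where F is close to 0 or 1, which is why the statistic responds to differences in the tails.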

  7. Other tests planned for GOF The statistical package could be extended with other goodness-of-fit tests, for instance: the Lilliefors test, the Cramér-von Mises test, the Kuiper test, Bayesian methods…

  8. Other methods The Kolmogorov-Smirnov test can be applied only to continuous distributions, whereas physical distributions are typically binned and therefore not continuous. Following Dagum, these binned distributions could be fitted (a mixture of more than one fitted function is also possible). The Kolmogorov-Smirnov test statistic could then be computed between the fitted function and the theoretical distribution, simply changing the number of degrees of freedom of the test.

  9. User requirements
     • Comparing distributions
     • Converting distributions
     • Confidence levels
     • Handling distributions
     • Treatment of errors
     • Plotting

  10. Software Design
      • User layer <=> Developer layer
      • Based on AIDA interfaces
      • It is a general tool with an object-oriented approach
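One way to picture the split between a user layer and a developer layer is sketched below. All class and method names are hypothetical, and plain lists of bin contents stand in for AIDA histograms; the actual interfaces of the package are not described in the slides.

```python
from abc import ABC, abstractmethod


class ComparisonAlgorithm(ABC):
    """Developer layer: one subclass per goodness-of-fit test."""

    @abstractmethod
    def statistic(self, first, second):
        """Return the test statistic for two binned distributions."""


class ChiSquaredComparison(ComparisonAlgorithm):
    def statistic(self, first, second):
        # Two-histogram chi-squared, assuming equal total counts.
        return sum((a - b) ** 2 / (a + b)
                   for a, b in zip(first, second) if a + b > 0)


class KolmogorovSmirnovComparison(ComparisonAlgorithm):
    def statistic(self, first, second):
        # Largest distance between the two normalised cumulative histograms.
        total_a, total_b = sum(first), sum(second)
        cum_a = cum_b = d = 0.0
        for a, b in zip(first, second):
            cum_a += a / total_a
            cum_b += b / total_b
            d = max(d, abs(cum_a - cum_b))
        return d


class DistributionComparator:
    """User layer: the user picks a test and compares two distributions."""

    def __init__(self, algorithm):
        self._algorithm = algorithm

    def compare(self, first, second):
        if len(first) != len(second):
            raise ValueError("distributions must have the same binning")
        return self._algorithm.statistic(first, second)
```

With this arrangement a further test can be added as a new subclass without touching the user-facing class, for example `DistributionComparator(ChiSquaredComparison()).compare(simulated, reference)`.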

  11. The code
      • Chi-squared test => OK
      • Anderson-Darling test (discrete distributions)
      • Kolmogorov-Smirnov test => OK
      • Anderson-Darling test (continuous distributions)

  12. Problems with the existing code The Chi Squared Quality Checker needs a Gamma function. One was found in the GNU Scientific Library, but it does not work for N > 171. This could be a problem!
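The limit at 171 is what one expects from double-precision arithmetic: Γ(172) = 171! is about 1.2e309, which exceeds the largest representable double (about 1.8e308). A common workaround, sketched below under the assumption that the Gamma function is only needed to turn a chi-squared value into a p-value, is to work with the logarithm of Γ and the regularized incomplete gamma function Q(a, x), so that Γ(a) itself is never formed. The function names are hypothetical, and the series and continued-fraction expansions are the standard textbook ones, not the project's code. The GNU Scientific Library also provides `gsl_sf_lngamma` and `gsl_sf_gamma_inc_Q`, which may serve the same purpose.

```python
import math


def regularized_gamma_q(a, x, eps=1e-12, max_iter=10000):
    """Upper regularized incomplete gamma Q(a, x) = Gamma(a, x) / Gamma(a)."""
    if a <= 0 or x < 0:
        raise ValueError("need a > 0 and x >= 0")
    if x == 0:
        return 1.0
    # log of x^a * exp(-x) / Gamma(a), using lgamma to avoid overflow.
    log_prefactor = a * math.log(x) - x - math.lgamma(a)
    if x < a + 1.0:
        # Series expansion of P(a, x) = 1 - Q(a, x).
        term = 1.0 / a
        total = term
        n = a
        for _ in range(max_iter):
            n += 1.0
            term *= x / n
            total += term
            if abs(term) < abs(total) * eps:
                break
        return 1.0 - total * math.exp(log_prefactor)
    # Continued fraction for Q(a, x), evaluated with the modified Lentz method.
    tiny = 1e-300
    b = x + 1.0 - a
    c = 1.0 / tiny
    d = 1.0 / b
    h = d
    for i in range(1, max_iter + 1):
        an = -i * (i - a)
        b += 2.0
        d = an * d + b
        if abs(d) < tiny:
            d = tiny
        c = b + an / c
        if abs(c) < tiny:
            c = tiny
        d = 1.0 / d
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h * math.exp(log_prefactor)


def chi_squared_p_value(chi2, dof):
    """Probability of a chi-squared value at least this large under H0."""
    return regularized_gamma_q(0.5 * dof, 0.5 * chi2)
```

With 400 degrees of freedom the first argument is a = 200, beyond the point where Γ(a) overflows, yet `chi_squared_p_value(400.0, 400)` still evaluates to a finite probability.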

  13. Unit tests Unit tests are to be performed on the statistical package. We need suggestions on reference distributions to use as test cases for the code. Test levels: acceptance test, integration test, system test, unit test. Any suggestions?
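One possible starting point for unit-test cases is a set of small deterministic samples for which the expected statistic can be worked out by hand. The sketch below does this for a two-sample Kolmogorov-Smirnov distance; the function, the test class and the chosen samples are all hypothetical illustrations, not cases agreed for the project.

```python
import unittest


def ks_two_sample(first, second):
    """Largest distance between the empirical CDFs of two unbinned samples."""
    a, b = sorted(first), sorted(second)
    if not a or not b:
        raise ValueError("samples must not be empty")
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Step both empirical CDFs past every point equal to x.
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


class KsReferenceCases(unittest.TestCase):
    def test_identical_samples_give_zero(self):
        self.assertEqual(ks_two_sample([1, 2, 3], [1, 2, 3]), 0.0)

    def test_disjoint_samples_give_one(self):
        self.assertEqual(ks_two_sample([1, 2, 3], [4, 5, 6]), 1.0)

    def test_half_overlapping_samples(self):
        self.assertAlmostEqual(ks_two_sample([1, 2, 3, 4], [3, 4, 5, 6]), 0.5)
```

Saved in a file, these cases can be run with `python -m unittest`. Samples drawn from known distributions (uniform, Gaussian, exponential) with a fixed random seed would be a natural next step.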
