
Statistical Learning in Astrophysics


Presentation Transcript


  1. Statistical Learning in Astrophysics
   Max-Planck-Institut für Physik, München · MPI für extraterrestrische Physik, München · Forschungszentrum Jülich GmbH
   Jens Zimmermann, zimmerm@mppmu.mpg.de
   Outline: Statistical Learning? · Three Classes of Learning Methods · Applications in Physics Analysis · Training the Learning Methods · Examples · Conclusion
   Jens Zimmermann, Forschungszentrum Jülich, Astroteilchenschule 10/04

  2. Some Events. [scatter plots: # formulas (×10, 0–6) vs. # slides (×10, 0–6) for events from Experimentalists and from Theorists]

  3. First Analysis. [the same scatter plots together with the one-dimensional projections of # formulas and # slides for Experimentalists and Theorists]

  4. Decision Trees. One-dimensional cuts leave an ambiguous region: # formulas > 60 → theorist and # formulas < 20 → experimentalist, but what about 20 < # formulas < 60? (Similarly, # slides > 40 → experimentalist, # slides < 40 → theorist.) A decision tree applies the cuts in sequence: all events with # formulas > 60 are theorists, all with # formulas < 20 are experimentalists, and the remaining subset with 20 < # formulas < 60 is split on the slide count: # slides > 40 → experimentalist, # slides < 40 → theorist.
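This tree is small enough to write down directly. A minimal sketch in Python; the function name classify_event and the string labels are illustrative, only the cut values are taken from the slide:

```python
def classify_event(n_formulas, n_slides):
    """Hand-built decision tree from the toy example: theorists write
    many formulas, experimentalists show many slides."""
    if n_formulas > 60:
        return "theorist"
    if n_formulas < 20:
        return "experimentalist"
    # ambiguous region 20 < n_formulas < 60: split on the slide count
    return "experimentalist" if n_slides > 40 else "theorist"
```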

  5. Local Density Estimators. Search for similar events that are already classified and count the members of the two classes (e.g. k-Nearest-Neighbour). [scatter plots: # formulas (×10) vs. # slides (×10); a new event takes the majority class of its nearest neighbours]
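A minimal k-Nearest-Neighbour sketch; knn_classify is a hypothetical helper, assuming a NumPy feature matrix and integer class labels (0, 1):

```python
import numpy as np

def knn_classify(x, X_train, y_train, k=5):
    """Classify x by majority vote among its k nearest training events."""
    dist = np.linalg.norm(X_train - x, axis=1)     # Euclidean distance to every event
    nearest = np.argsort(dist)[:k]                 # indices of the k closest events
    return np.bincount(y_train[nearest]).argmax()  # count the classes, take the larger
```

Choosing k odd avoids ties between the two classes.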

  6. Methods Based on Linear Separation. Divide the input space into regions separated by one or more hyperplanes. Note that extrapolation is done: the hyperplanes assign a class everywhere, even in regions that contain no training events. [scatter plots: # formulas (×10) vs. # slides (×10) with separating lines]
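One classical way to find such a hyperplane is the perceptron rule. A minimal sketch, assuming labels y in {−1, +1}; the learning rate and epoch count are illustrative:

```python
import numpy as np

def train_perceptron(X, y, epochs=100, lr=0.1):
    """Find a hyperplane w.x + b = 0 separating labels y in {-1, +1}."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            if yi * (w @ xi + b) <= 0:   # xi lies on the wrong side: move the plane
                w += lr * yi * xi
                b += lr * yi
    return w, b
```

The returned (w, b) classifies every point of the input space, which is exactly the extrapolation noted above.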

  7. Neural Networks. Two routes to the same classifier: construct a NN with two separating hyperplanes by hand, or train a NN with two hidden neurons by gradient descent. [network diagram with weights such as −1.8, +3.6, +3.6, −50, +20, +1.1, −1.1, +0.1, +0.2, and the resulting decision boundary in the # formulas (×10) vs. # slides (×10) plane]
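The forward pass of such a two-hidden-neuron network is only a few lines. The exact wiring of the slide's weights is not recoverable from the transcript, so the numbers below are illustrative placeholders; each hidden neuron realises one separating line:

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

# Illustrative weights (not the slide's exact values): hidden neuron 1
# cuts mainly on #formulas, hidden neuron 2 mainly on #slides.
W_hidden = np.array([[ 3.6, 0.1],
                     [-1.1, 3.6]])
b_hidden = np.array([-50.0, -1.8])
w_out = np.array([20.0, 1.1])
b_out = 0.2

def nn_output(x):
    """x = (#formulas, #slides); returns a class probability in (0, 1)."""
    h = sigmoid(W_hidden @ x + b_hidden)   # responses of the two hyperplanes
    return sigmoid(w_out @ h + b_out)      # the output neuron combines them
```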

  8. NN Training. 8 hidden neurons = 8 separating lines. [plots: signal vs. background decision boundary, and train error and test error as a function of training epochs]
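Train/test error curves like these are typically used for early stopping. A sketch under the assumption of a hypothetical network object net with train_one_epoch(), error() and state() methods; none of these names come from the slide:

```python
def train_with_early_stopping(net, train_set, test_set, max_epochs=1000):
    """Keep the weights from the epoch with the lowest test error: beyond
    that point the train error still falls, but the network fits noise."""
    best_err, best_state = float("inf"), None
    for _ in range(max_epochs):
        net.train_one_epoch(train_set)   # one gradient-descent pass
        err = net.error(test_set)        # estimate of the generalisation error
        if err < best_err:
            best_err, best_state = err, net.state()
    return best_state
```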

  9. Training of Statistical Learning Methods. A statistical learning method infers a rule from N examples. The important issue is generalisation vs. overtraining: is the data easily separable but noisy, or noise-free and separable only by a complicated boundary? Too high a degree of polynomial results in interpolation (the noise is fitted), but too low a degree means a bad approximation. [scatter plot: # formulas (×10) vs. # slides (×10) with a simple and a complicated boundary]
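The polynomial analogy can be reproduced in a few lines of NumPy (the sine target and noise level are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 20)
y = np.sin(2.0 * np.pi * x) + rng.normal(0.0, 0.2, x.size)  # noisy examples

for degree in (1, 3, 9):
    coeffs = np.polyfit(x, y, degree)            # least-squares polynomial fit
    rms = np.sqrt(np.mean((y - np.polyval(coeffs, x)) ** 2))
    print(degree, rms)   # the train error always shrinks with the degree;
                         # a high degree interpolates the noise instead of
                         # approximating the underlying curve
```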

  10. Applications in Physics Analysis.
   Classification offline („Purification“): gamma vs. hadron, MAGIC.
   Classification online („Trigger“): H1 L2NN, charged current.
   Regression: XEUS X-ray CCD (~10 µm, ~300 µm).

  11. Features. Choose raw quantities, then derive features from them: PCA, FFT, symmetry, fit.
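As an example of such a derived feature, a minimal PCA sketch in plain NumPy, under the assumption that the raw quantities are arranged as one row per event:

```python
import numpy as np

def pca_features(X, n_components=2):
    """Project raw event quantities onto the directions of largest variance."""
    Xc = X - X.mean(axis=0)                  # centre each raw quantity
    cov = np.cov(Xc, rowvar=False)           # covariance of the raw quantities
    eigval, eigvec = np.linalg.eigh(cov)     # eigenvalues in ascending order
    top = eigvec[:, ::-1][:, :n_components]  # leading principal axes
    return Xc @ top                          # coordinates along those axes
```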

  12. XEUS: Regression of the Incident Position of X-ray Photons. [CCD schematic: transfer direction, electron potential, structures of ~10 µm and ~300 µm]
   σ of reconstruction in µm:
   Neural Networks: 3.6
   k-Nearest-Neighbour: 3.7
   ETA (classical method): 3.9
   CCOM (classical method): 4.0
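Of the classical methods, CCOM (charge centre of mass) is simple enough to sketch; the helper name and the example pixel coordinates are illustrative, not taken from the XEUS analysis:

```python
import numpy as np

def ccom_position(charges, positions):
    """Classical CCOM estimate: the incident position is the
    charge-weighted mean of the pixel centre coordinates."""
    q = np.asarray(charges, dtype=float)
    return (q @ np.asarray(positions, dtype=float)) / q.sum()

# e.g. charge split over three neighbouring pixels along the transfer direction:
# ccom_position([120.0, 800.0, 250.0], [[0.0], [10.0], [20.0]])
```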

  13. Pileup Recognition – Setup. Pileup vs. single photon: is a given pixel pattern produced by two piled-up photons or by a single one? Results (pileup rejection [%] / photon efficiency [%]): 99/67, versus 99/52 for the classical algorithm „XMM“.

  14. MAGIC - Gamma/Hadron Separation. Observation of Mkn421, 22.04.04. Significance: SuperCuts alone 39.0 σ; SuperCuts + Neural Network 46.8 σ.

  15. Conclusion
   • Three classes of statistical learning methods:
     - Decision Trees (Bagging): Random Forest, C4.5, CART
     - Local Density Estimators: k-Nearest-Neighbour, Maximum Likelihood
     - Linear Separation: Neural Networks, Support Vector Machines, Linear Discriminant Analysis
   • Many applications in current astrophysics experiments and analysis
   • Compared to classical methods, usually at least small improvements

  16. Theory of Communication: Minimum Description Length Principle. Hypothesis H and data D: our hypothesis should have the maximum probability given the data (Bayes, 18th century). Shannon's theory of communication (1948) identifies probabilities with code lengths, and Rissanen (1990) turned this into the Minimum Description Length Principle (MDLP): prefer the hypothesis with the shortest total description of hypothesis and data.
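The chain from Bayes to the MDLP, written out here in its standard textbook form since the slide's formulas did not survive the transcript:

```latex
\[
  P(H \mid D) \;=\; \frac{P(D \mid H)\,P(H)}{P(D)}
\]
% Shannon (1948): an outcome of probability P(x) can be coded in
% L(x) = -log2 P(x) bits. Taking -log2 of Bayes' rule (P(D) is fixed
% by the data) turns "maximise P(H|D)" into Rissanen's MDLP:
\[
  \hat{H} \;=\; \arg\min_{H}\bigl[\,L(D \mid H) + L(H)\,\bigr],
  \qquad L(x) = -\log_2 P(x).
\]
```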

  17. Statistical Learning Theory: Structural Risk Minimization (1996). We have N training events with input x_i and correct output y_i. The empirical risk R_emp(α) is the error on the training sample; the actual risk R(α) is the expected error on unseen data. The relationship between the two is uniform convergence, which yields an upper bound for the actual risk (Vapnik), valid with probability 1 − η:
   R(\alpha) \le R_{\mathrm{emp}}(\alpha) + \sqrt{\frac{h\,\bigl(\ln(2N/h) + 1\bigr) - \ln(\eta/4)}{N}}
   where h is the VC dimension of the learning method, a measure of its complexity. Structural risk minimization: create nested subsets of function spaces with rising complexity h1 < h2 < h3 < … and choose the subset that minimizes the bound.

  18. Support Vector Machines. Maximum margin classifier: the separating hyperplane with maximum distance to each data point. It is found by setting up the condition for correct classification, y_i (w \cdot x_i + b) \ge 1, and minimizing \|w\|^2 / 2, which leads to the Lagrangian
   L = \tfrac{1}{2}\|w\|^2 - \sum_i \alpha_i \bigl[ y_i (w \cdot x_i + b) - 1 \bigr].
   A necessary condition for a minimum is \partial L / \partial w = 0, i.e. w = \sum_i \alpha_i y_i x_i, so the output becomes
   f(x) = \mathrm{sign}\bigl( \sum_i \alpha_i y_i (x_i \cdot x) + b \bigr).
   By the KKT conditions, only the support vectors have \alpha_i \ne 0. Only linear separation? No! Replace the dot products by a kernel K(x_i, x): the mapping to feature space is hidden in the kernel. The non-separable case is handled with slack variables.
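A minimal kernel-SVM example; scikit-learn's SVC is this sketch's assumption, not a tool named on the slide:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))
y = (X[:, 0] ** 2 + X[:, 1] ** 2 > 1.0).astype(int)  # circular boundary:
                                                     # not linearly separable

clf = SVC(kernel="rbf", C=1.0)   # the feature-space mapping is hidden in the kernel
clf.fit(X, y)
print(clf.n_support_)            # only the support vectors define the boundary
print(clf.score(X, y))           # training accuracy
```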

  19. Finally. Include statistical learning theory: ~25 formulas on 19 slides. Skip the theory: ~7 formulas on 16 slides. [the # formulas (×10) vs. # slides (×10) scatter plots from the opening slides once more]
