http://tmva.sourceforge.net/
Multivariate Analysis, TMVA, and Artificial Neural Networks
Matt Jachowski
jachowski@stanford.edu
Multivariate Analysis
• Techniques dedicated to the analysis of data with multiple variables
• Active field: many recently developed techniques rely on the computational power of modern computers
Multivariate Analysis and HEP
• Goal is to classify events as signal or background
• A single event is defined by several variables (energy, transverse momentum, etc.)
• Use all the variables to classify the event
• Multivariate analysis!
Multivariate Analysis and HEP
• Rectangular cut optimization is common
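As a rough illustration of what rectangular cuts mean in practice, here is a minimal Python sketch. The variable names, the toy events, and the figure of merit S/sqrt(S+B) are assumptions made for the example; they are not taken from the slides or from TMVA.

```python
import math

def passes_cuts(event, cuts):
    """True if every cut variable lies inside its [low, high] window."""
    return all(low <= event[var] <= high for var, (low, high) in cuts.items())

def significance(signal, background, cuts):
    """Assumed figure of merit: S / sqrt(S + B) for events passing the cuts."""
    s = sum(passes_cuts(e, cuts) for e in signal)
    b = sum(passes_cuts(e, cuts) for e in background)
    return s / math.sqrt(s + b) if s + b > 0 else 0.0

def best_lower_cut(signal, background, var, candidates, high=float("inf")):
    """Scan candidate lower edges for one variable and keep the best one."""
    return max(candidates,
               key=lambda low: significance(signal, background, {var: (low, high)}))

# Hypothetical toy events (made-up numbers, arbitrary units).
signal = [{"energy": 52.0, "pt": 31.0},
          {"energy": 61.0, "pt": 28.0},
          {"energy": 47.0, "pt": 35.0}]
background = [{"energy": 20.0, "pt": 12.0},
              {"energy": 33.0, "pt": 30.0},
              {"energy": 49.0, "pt": 9.0}]

best = best_lower_cut(signal, background, "energy", [10.0, 30.0, 40.0, 50.0])
```

A real optimization would scan both edges of every variable at once; the one-dimensional scan only shows the idea.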
Multivariate Analysis and HEP
• Likelihood Estimator analysis is also common
• Use of more complicated methods (Neural Networks, Boosted Decision Trees) is less common, though growing. Why?
  • Difficult to implement
  • Physicists are skeptical of new methods
Toolkit for Multivariate Analysis (TMVA)
• ROOT-integrated software package with several MVA techniques
• Automatic training, testing, and evaluation of MVA methods
• Guidelines and documentation describe the methods for users: this isn’t a black box!
Toolkit for Multivariate Analysis (TMVA)
• Easy to configure methods
• Easy to “plug in” HEP data
• Easy to compare different MVA methods
TMVA in Action
TMVA and Me
• TMVA started in October 2005
• Still young
• Very active group of developers
• My involvement
  • Decorrelation for the Cuts Method (mini project)
  • New Artificial Neural Network implementation (main project)
Decorrelated Cuts Method
• Some MVA methods suffer if the data has linear correlations
  • e.g. Likelihood Estimator, Cuts
• Linear correlations can be easily transformed away
• I implemented this for the Cuts Method
Decorrelated Cuts Method
• Find the square root C′ of the covariance matrix C (C = C′C′)
• Decorrelate the data
• Apply cuts to the decorrelated data
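The three steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the TMVA code: it handles only two variables, uses a closed-form square root that is valid for 2x2 positive-definite matrices, and assumes the decorrelation step is x′ = C′⁻¹x. The toy events are made up.

```python
import math

def covariance_2d(events):
    """Covariance matrix (divide-by-n convention) of a list of (x, y) pairs."""
    n = len(events)
    mx = sum(x for x, _ in events) / n
    my = sum(y for _, y in events) / n
    cxx = sum((x - mx) ** 2 for x, _ in events) / n
    cyy = sum((y - my) ** 2 for _, y in events) / n
    cxy = sum((x - mx) * (y - my) for x, y in events) / n
    return [[cxx, cxy], [cxy, cyy]]

def sqrt_2x2(c):
    """Square root C' of a 2x2 positive-definite matrix C, so that C = C'C'.

    Closed form: C' = (C + s*I) / t with s = sqrt(det C), t = sqrt(trace C + 2s).
    """
    s = math.sqrt(c[0][0] * c[1][1] - c[0][1] * c[1][0])
    t = math.sqrt(c[0][0] + c[1][1] + 2.0 * s)
    return [[(c[0][0] + s) / t, c[0][1] / t],
            [c[1][0] / t, (c[1][1] + s) / t]]

def inverse_2x2(m):
    """Inverse of a 2x2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def decorrelate(events):
    """Apply x' = C'^-1 x to every event (assumed form of the transformation)."""
    inv = inverse_2x2(sqrt_2x2(covariance_2d(events)))
    return [(inv[0][0] * x + inv[0][1] * y, inv[1][0] * x + inv[1][1] * y)
            for x, y in events]

# Hypothetical, linearly correlated toy data.
events = [(1.0, 2.0), (2.0, 1.5), (3.0, 4.0), (4.0, 3.0), (5.0, 5.5)]
decorrelated = decorrelate(events)
cov_after = covariance_2d(decorrelated)  # close to the identity matrix
```

After the transformation the two variables are uncorrelated with unit variance, so rectangular cuts on them no longer have to fight a tilted distribution. For more than two variables the square root would normally come from an eigenvalue decomposition of C.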
Artificial Neural Networks (ANNs)
• Robust non-linear MVA technique
Training an ANN
• The challenge is training the network
• Like the human brain, the network learns from seeing data over and over again
• Technical details: ask me if you’re really interested
MLP
• MLP (Multi-Layer Perceptron): my ANN implementation for TMVA
• MLP is TMVA’s main ANN
• MLP serves as the base for any future ANN developments in TMVA
MLP – Information & Statistics
• Implemented in C++
• Object-oriented
• 4,000+ lines of code
• 16 classes
Acknowledgements
• Joerg Stelzer
• Andreas Hoecker
• CERN
• University of Michigan
• Ford
• NSF
Questions?
(I have lots of technical slides in reserve that I would be glad to talk about)
Synapses and Neurons
[Diagram: neurons 0 … n, each with input v_i and output y_i, connected to neuron j (input v_j, output y_j) by synapses with weights w_0j, w_1j, …, w_nj]
Synapses and Neurons
[Diagram: neuron j with input v_j and output y_j]
Universal Approximation Theorem
Every continuous function that maps intervals of real numbers to some output interval of real numbers can be approximated arbitrarily closely by a multi-layer perceptron with just one hidden layer (with non-linear activation functions).
[Equation labels: output; inputs; weights between hidden and output layer; weights between input and hidden layer; non-linear activation function; bias]
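The labelled pieces (inputs, weights between input and hidden layer, bias, non-linear activation function, weights between hidden and output layer, output) fit the usual forward pass of a one-hidden-layer perceptron. A minimal Python sketch follows; the choice of tanh as activation, the linear output, the function name, and all numbers are assumptions for illustration.

```python
import math

def mlp_output(inputs, w_input_hidden, biases, w_hidden_output):
    """Output of a one-hidden-layer perceptron with a linear output neuron.

    inputs          : list of input values x_i
    w_input_hidden  : w_input_hidden[j][i] is the weight from input i to hidden neuron j
    biases          : one bias per hidden neuron
    w_hidden_output : one weight per hidden neuron, feeding the output
    """
    hidden = []
    for weights_j, bias_j in zip(w_input_hidden, biases):
        # Weighted sum of the inputs plus the bias ...
        v_j = sum(w * x for w, x in zip(weights_j, inputs)) + bias_j
        # ... passed through the non-linear activation function (tanh assumed).
        hidden.append(math.tanh(v_j))
    # The output is a weighted sum of the hidden-neuron outputs.
    return sum(w * h for w, h in zip(w_hidden_output, hidden))

# Hypothetical network: 2 inputs, 3 hidden neurons, 1 output.
inputs = [0.5, -1.0]
w_input_hidden = [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
biases = [0.0, 0.5, -0.5]
w_hidden_output = [1.0, -1.0, 0.5]

y = mlp_output(inputs, w_input_hidden, biases, w_hidden_output)
```

The theorem says that with enough hidden neurons and suitable weights, a function of this form can get arbitrarily close to any continuous target on a bounded interval. It does not say how to find those weights; that is the training problem.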
Training an MLP
[Diagram: a training event and the network, with inputs x0, x1, x2, x3 and output y]
Training an MLP
• Adjust the weights to minimize the error (or an estimator that is some function of the error)
Back-Propagation Algorithm
• Make corrections in the direction of steepest descent
• Corrections are made to the output layer first, then propagated backwards
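To make the idea concrete, here is a small Python sketch of back-propagation for a network with one hidden layer. It is not the TMVA MLP code: the class name, sigmoid activations, the squared-error estimator E = 0.5 (y - target)^2, the learning rate, and the toy events are all assumptions chosen for the example.

```python
import math
import random

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

class TinyMLP:
    """One hidden layer, one output neuron, sigmoid activations (illustrative)."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        # Each weight list ends with an extra entry that acts as the bias.
        self.w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                         for _ in range(n_hidden)]
        self.w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

    def forward(self, x):
        xs = list(x) + [1.0]  # inputs plus a constant 1 for the bias
        hs = [sigmoid(sum(w * xi for w, xi in zip(row, xs)))
              for row in self.w_hidden] + [1.0]
        y = sigmoid(sum(w * h for w, h in zip(self.w_out, hs)))
        return xs, hs, y

    def train_event(self, x, target, rate=0.5):
        xs, hs, y = self.forward(x)
        # Output layer first: dE/dv for the output neuron.
        delta_out = (y - target) * y * (1.0 - y)
        # Propagate backwards: dE/dv for each hidden neuron.
        delta_hidden = [delta_out * self.w_out[j] * hs[j] * (1.0 - hs[j])
                        for j in range(len(self.w_hidden))]
        # Steepest descent: step each weight against its gradient.
        for j in range(len(self.w_out)):
            self.w_out[j] -= rate * delta_out * hs[j]
        for j, row in enumerate(self.w_hidden):
            for i in range(len(row)):
                row[i] -= rate * delta_hidden[j] * xs[i]

def total_error(net, events):
    return sum(0.5 * (net.forward(x)[2] - t) ** 2 for x, t in events)

# Hypothetical training events: target 1 = signal, 0 = background.
events = [((0.9, 0.8), 1.0), ((0.8, 1.0), 1.0),
          ((0.1, 0.2), 0.0), ((0.2, 0.0), 0.0)]

net = TinyMLP(n_in=2, n_hidden=3)
error_before = total_error(net, events)
for _ in range(500):  # the network sees the same data over and over
    for x, target in events:
        net.train_event(x, target)
error_after = total_error(net, events)
```

The hidden-layer corrections are computed from the output weights before those weights are updated, so every correction in one step refers to the same network state.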