Experiments with Distributed Training of Neural Networks on the Grid

Maciej Malawski 1, Marian Bubak 1,2, Elżbieta Richter-Wąs 3,4, Grzegorz Sala 3,5, Tadeusz Szymocha 3

1 Institute of Computer Science AGH, Mickiewicza 30, 30-059 Kraków, Poland
2 Academic Computer Centre CYFRONET, Nawojki 11, 30-950 Kraków, Poland
3 Institute of Nuclear Physics, Polish Academy of Sciences, Kraków, Poland
4 Institute of Physics, Jagiellonian University, Kraków, Poland
5 Faculty of Physics and Applied Computer Science AGH, Kraków, Poland

{bubak,malawski}@agh.edu.pl, elzbieta.richter-was@cern.ch, sala@fatcat.ftj.agh.edu.pl, Tadeusz.Szymocha@ifj.edu.pl

• Target application
  • High Energy Physics
  • Discrimination between signal and background events coming from the particle detector (simulation)
  • ROOT and Athena as the basic data analysis tools

• Challenges
  • Neural network training is a highly compute-intensive task – may need High Performance Computing
  • Finding an optimal configuration may be time-consuming: many experiments with various parameters – may need High Throughput Computing

• Why neural networks
  • Once trained, they are efficient and accurate
  • Applicable to classification and prediction
  • Proven in a wide range of applications

• Solution: the Grid
  • Distributing the computation over a cluster of machines can significantly reduce computation time.
  • Utilizing resources (multiple clusters) available on the Grid makes this task less time-consuming for the researcher.

• Observation
  • Training neural networks on the Grid requires many repeated tasks:
    • job preparation,
    • submission,
    • monitoring of status,
    • gathering results.
  • Performing them manually is time-consuming for the researcher.
  • → Tools that automate these tasks can facilitate the whole process considerably.
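The repeated job-handling tasks listed above lend themselves to scripting. As a minimal sketch, a parameter sweep over network configurations can be turned into a batch of gLite-style JDL job descriptions, ready for bulk submission to the Grid. The executable name, its arguments, and the sandbox file names below are illustrative assumptions, not part of the original work.

```python
# Hypothetical sketch: generate one gLite-style JDL job description per
# neural-network configuration, so an entire sweep can be submitted in bulk.
# The wrapper script "train_nn.sh", its arguments, and the input/output
# file names are illustrative assumptions.
from itertools import product

def make_jdl(hidden_units: int, learning_rate: float) -> str:
    """Return a JDL job description for one training configuration."""
    tag = f"h{hidden_units}_lr{learning_rate}"
    return "\n".join([
        'Executable = "train_nn.sh";',                    # assumed wrapper script
        f'Arguments = "--hidden {hidden_units} --lr {learning_rate}";',
        f'StdOutput = "train_{tag}.out";',
        f'StdError = "train_{tag}.err";',
        'InputSandbox = {"train_nn.sh", "events.root"};', # assumed input files
        f'OutputSandbox = {{"train_{tag}.out", "train_{tag}.err", "weights_{tag}.dat"}};',
    ])

def sweep(hidden_sizes, learning_rates) -> dict:
    """Build JDL texts for the full Cartesian product of parameters."""
    return {f"job_h{h}_lr{lr}.jdl": make_jdl(h, lr)
            for h, lr in product(hidden_sizes, learning_rates)}

jobs = sweep([5, 10, 20], [0.01, 0.1])
print(len(jobs))  # 6 job descriptions, one per configuration
```

Each generated file could then be submitted and tracked by a wrapper loop, removing the manual preparation, submission, and result-gathering steps entirely.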
• Our Goals
  • Develop tools facilitating the use of the Grid for multiple classification experiments
  • Investigate and validate algorithms for distributed neural network training
  • Allow seamless integration with data analysis tools such as ROOT

• Testbed for our experiments: the EGEE project
  • Virtual Organization for Central Europe
  • Grid sites at CYFRONET Kraków, PSNC Poznań, KFKI Budapest, CESNET Prague, and TU Košice
  • Support for MPI applications
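One common approach to distributed training, consistent with the MPI support mentioned above, is data parallelism: each worker computes a gradient on its own shard of the signal/background sample, and the gradients are averaged (as an MPI allreduce would do) before every weight update. The sketch below simulates the workers in a single process with NumPy; the single-layer logistic model and the synthetic data are illustrative stand-ins for the real network and detector events.

```python
# Sketch of data-parallel training: each "worker" holds a shard of the
# signal/background sample, computes a local gradient, and the gradients
# are averaged (mimicking an MPI allreduce) before the shared update.
# The logistic model and synthetic data are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 2-class sample: signal shifted away from background in feature space.
n, d = 400, 3
X = np.vstack([rng.normal(0.0, 1.0, (n // 2, d)),   # background events
               rng.normal(1.5, 1.0, (n // 2, d))])  # signal events
X = np.hstack([X, np.ones((n, 1))])                 # bias column
y = np.concatenate([np.zeros(n // 2), np.ones(n // 2)])

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30.0, 30.0)))

def local_gradient(w, Xs, ys):
    """Logistic-loss gradient computed on one worker's shard."""
    return Xs.T @ (sigmoid(Xs @ w) - ys) / len(ys)

def train(n_workers=4, steps=200, lr=0.5):
    shards = list(zip(np.array_split(X, n_workers),
                      np.array_split(y, n_workers)))
    w = np.zeros(d + 1)
    for _ in range(steps):
        # "allreduce": average the per-worker gradients, then update everywhere
        g = np.mean([local_gradient(w, Xs, ys) for Xs, ys in shards], axis=0)
        w -= lr * g
    return w

w = train()
acc = np.mean((sigmoid(X @ w) > 0.5) == y)
print(f"training accuracy: {acc:.2f}")
```

Because the shards here are equal-sized, the averaged gradient equals the full-batch gradient, so the distributed run follows the same trajectory as a sequential one; the gain is that each gradient step costs only a fraction of the data per worker.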