ACAT07 Highlights NIKHEF, Amsterdam 23-27 April 2007
The Workshop For more information http://agenda.nikhef.nl/conferenceDisplay.py?confId=55 • Plenary talks - 9 • Parallel sessions - 3 Computer Technology for Physics Research (24 talks) Data Analysis – Algorithms and Tools (28 talks) Methodology of Computations in Theoretical Physics (28 talks) • Round-table discussions • Plenary panel discussion • Summary talks 114 worldwide participants
Plenary Talks • Jos Engelen (CERN): Exploration of the Terascale Challenges • Markus Schulz (CERN): Bootstrapping a Grid Infrastructure (advised by Jeff Templon)
Providing HPC Resources for Philips Research and Partners Plenary talk by Ronald van Driel
To What Extent Can We Rely on the Results of Scientific Computations For more information and papers see http://www.leshatton.org/ Plenary talk by Les Hatton
Grids : Interaction and Scaling(Analysis on the Grid) Plenary talk by Jeff Templon
Panel Discussion: Critical Issues of Distributed Computing • Bruce Allen: Einstein@Home and BOINC • Fons Rademakers: Interactive Parallel Distributed Data Analysis Using PROOF
Session 1: Computing Technology for Physics Research • 7 sessions 24 talks • 12 Grid • 5 Monitoring/online • 3 Math packages • 2 GUIs • 1 PROOF • 1 Simulation
Middleware • All experiments make heavy use of existing middleware (MW)… but all have developed private solutions to complement it Grid
Ganga Grid
AliEn Grid
CMS – Crab Grid
Virtualization Grid
Storage - dCache Grid
Data Placement • Ideally... • Data is produced and placed randomly • Data replication or job placement at data location are optimised by the Grid • In reality • Data is placed in a given location • Jobs are sent to this location Grid
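The "jobs are sent to where the data sits" policy described above can be sketched in a few lines. This is a hypothetical illustration (site names, dataset names, and the replica catalogue are invented, not any experiment's actual workload-management code):

```python
# Hypothetical sketch of the "jobs go to the data" placement policy.
# Replica catalogue: which sites hold a copy of each dataset.
replicas = {
    "run2007A": ["CERN", "FZK"],
    "run2007B": ["NIKHEF"],
}

# Free job slots per site (invented numbers).
free_slots = {"CERN": 120, "FZK": 40, "NIKHEF": 300}

def place_job(dataset):
    """Send the job to the least-loaded site that already hosts the data,
    instead of replicating the data to wherever the job happens to start."""
    sites = replicas.get(dataset, [])
    if not sites:
        raise LookupError(f"no replica of {dataset}")
    return max(sites, key=lambda s: free_slots[s])

print(place_job("run2007A"))  # CERN (more free slots than FZK)
print(place_job("run2007B"))  # NIKHEF (only replica)
```

The "ideal" case on the slide would instead let the scheduler choose freely between replicating data towards jobs and shipping jobs towards data.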
Analysis Grid
CMS RCMS Online Monitoring
ATLAS Muon Calibration Online Monitoring
DZero Online Monitoring
CDF Experiment (Collider Detector at Fermilab) Online Monitoring
Simulation Simulation
Summary of Session 2 • Session on Data Analysis, Algorithms and Tools : • Neural Networks and Other Pattern Recognition Techniques • Evolutionary Algorithms • Advanced Data Analysis Environments • Statistical Methods • Detector and Accelerator Simulations • Reconstruction Algorithms • Visualization Techniques
Multi-Variate Analysis Methods • Large fraction of the talks on MVA: • General purpose implementation (e.g. TMVA, SPR) • New methods (self-organizing maps (SOM) ) • Varied usage: • Running experiments (Tevatron) • New experiments (LHC, BESIII) • Cosmic Ray experiments • Trigger • Reconstruction (Track, vertex, e/γ, b-, τ-tagging) • Data analysis, event selection
Discussion on Usage of Multivariate Methods P. Bhat: Multivariate Methods in HEP • Sociological Issues • We have been conservative in the use of MV methods for discovery • We have been more aggressive in the use of MV methods for setting limits • But discovery is more important and needs all the power you can muster! • This is expected to change at LHC (?) • Multivariate Analysis Issues • Dimensionality Reduction: optimal choice of variables without losing information • Choosing the right method for the problem • Controlling Model Complexity • Testing Convergence • Validation • Computational Efficiency • Correctness of modeling • Worries about hidden bias • Worries about underestimating errors Thomas Speer
MultiVariate Packages • Implemented multi-purpose MVA packages • TMVA (available in ROOT and on SourceForge) • presented by H. Voss • StatPatternRecognition (available on SourceForge) • by I. Narsky (presented by A. Buckley) • Provide common platform & interface for all classifiers • training, testing and evaluation of the MVAs • easy and convenient for users • can choose their preferred methods • Provide some identical or very similar methods • need to compare the implementations • have standard benchmarks (with reference data-sets)
TMVA (H. Voss) • algorithms currently in TMVA: • Rectangular cut optimisation • Projective and Multi-dimensional likelihood estimator • Fisher discriminant and H-Matrix (χ² estimator) • Artificial Neural Network (3 different implementations) • Boosted/bagged Decision Trees • Rule Fitting • Support Vector Machines
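Of the classifiers listed, the Fisher discriminant is compact enough to sketch in full. The following NumPy toy (Gaussian "signal" and "background" samples invented for illustration; this is the textbook Fisher construction, not TMVA's implementation) shows the idea behind it:

```python
import numpy as np

# Minimal Fisher-discriminant sketch on toy 2D data.
rng = np.random.default_rng(0)
signal = rng.normal(+1.0, 1.0, size=(500, 2))
background = rng.normal(-1.0, 1.0, size=(500, 2))

mu_s, mu_b = signal.mean(axis=0), background.mean(axis=0)
# Within-class scatter: sum of the two class covariance matrices.
Sw = np.cov(signal.T) + np.cov(background.T)
# Fisher direction: maximises class separation relative to class spread.
w = np.linalg.solve(Sw, mu_s - mu_b)
# Cut at the midpoint of the two projected class means.
cut = w @ (mu_s + mu_b) / 2.0

sig_eff = np.mean(signal @ w > cut)       # signal efficiency
bkg_rej = np.mean(background @ w <= cut)  # background rejection
print(f"signal efficiency {sig_eff:.2f}, background rejection {bkg_rej:.2f}")
```

TMVA wraps this (and the other methods above) behind one common training/testing interface; the toy only shows the underlying linear classifier.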
StatPatternRecognition (A. Buckley for I. Narsky) • Large choice of implemented Classifiers: • Decision split, or stump • Decision trees (2 flavors: regular tree and top-down tree) • Bump hunter (PRIM, Friedman & Fisher) with different FOM • LDA (aka Fisher) and QDA • Logistic regression • Boosting: discrete AdaBoost, real AdaBoost, and epsilon-Boost. • Arc-x4 (a variant of boosting from Breiman) • Bagging. Can bag any sequence of classifiers. • Random forest • Backprop NN with a logistic activation function (original implementation) • Multi-class learner (Allwein, Schapire and Singer) • Interfaces to SNNS neural nets (without training): • Backprop neural net, and Radial Basis Function
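Discrete AdaBoost, the first boosting flavour in the list, combines many weak "stumps" by re-weighting misclassified events. A minimal NumPy sketch of the standard algorithm (toy one-dimensional data and all parameters invented; this is not SPR's code):

```python
import numpy as np

# Discrete AdaBoost on 1D decision stumps, labels in {-1, +1}.
rng = np.random.default_rng(1)
X = rng.normal(size=200)
y = np.where(X + rng.normal(scale=0.5, size=200) > 0, 1, -1)  # noisy labels

w = np.full(len(X), 1.0 / len(X))   # event weights, start uniform
stumps = []                          # (threshold, sign, alpha)
for _ in range(20):
    best = None
    # Weak learner: exhaustive search over thresholds and orientations.
    for thr in np.linspace(-2, 2, 41):
        for sign in (+1, -1):
            pred = sign * np.where(X > thr, 1, -1)
            err = w[pred != y].sum()
            if best is None or err < best[0]:
                best = (err, thr, sign)
    err, thr, sign = best
    alpha = 0.5 * np.log((1 - err) / max(err, 1e-12))  # stump weight
    stumps.append((thr, sign, alpha))
    pred = sign * np.where(X > thr, 1, -1)
    w *= np.exp(-alpha * y * pred)   # boost weights of misclassified events
    w /= w.sum()

# Boosted classifier: sign of the weighted vote of all stumps.
F = sum(a * s * np.where(X > t, 1, -1) for t, s, a in stumps)
print("training accuracy:", np.mean(np.sign(F) == y))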
Various Uses of MVA • Presentations showing usage of MVA: • P. Bhat: Usage in HEP (at the Tevatron) • Bayesian NN • A. Heikkinen: Separation of Higgs boson with SOM • M. Wolter: Optimization of tau identification in ATLAS • comparison of TMVA algorithms • R.C.Torres: Online electron/jet-identification in ATLAS using NN • J. Seixas: Online electron/jet-identification in ATLAS using SOM • S. Riggi: NN for high energy cosmic rays mass identification • S. Khatchadourian: NN Level 2 Trigger in Gamma Ray Astronomy
Self-Organizing Maps (SOM) (A. Heikkinen) • Self-organising maps (SOM): mapping from the n-dimensional input data space onto a regular two-dimensional array of neurons: • Every neuron of the map is associated with an n-dimensional reference vector • The neurons of the map are connected to adjacent neurons by a neighborhood relation, which dictates the topology of the map • Similar input patterns are mapped to adjacent regions of the map • Unsupervised training phase: the SOM forms an elastic net that folds onto the "cloud" formed by the input data and approximates its density • Use of SOM for b-tagging at CMS: • pp → bbH_SUSY, H_SUSY → ττ • Track-based IP tag • Can accommodate missing data • Tagging efficiency 73%, mistagging rate 11%
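The training loop described above (best-matching unit plus neighbourhood update) fits in a few lines of NumPy. This is a generic Kohonen-style sketch with invented data, map size, and learning-rate/neighbourhood schedules, not the code used in the CMS study:

```python
import numpy as np

# Minimal self-organising map: an 8x8 grid of neurons, each holding an
# n-dimensional reference vector, trained on toy 3D Gaussian data.
rng = np.random.default_rng(0)
data = rng.normal(size=(1000, 3))            # n = 3 input dimensions
grid = 8
codebook = rng.normal(size=(grid, grid, 3))  # reference vectors
iy, ix = np.indices((grid, grid))            # neuron grid coordinates

for t, x in enumerate(data):
    # Best-matching unit: neuron whose reference vector is closest to x.
    d = np.linalg.norm(codebook - x, axis=2)
    by, bx = np.unravel_index(d.argmin(), d.shape)
    # Neighbourhood and learning rate shrink over time, so the map first
    # unfolds coarsely onto the data "cloud", then fine-tunes locally.
    sigma = 3.0 * np.exp(-t / 500.0)
    lr = 0.5 * np.exp(-t / 500.0)
    h = np.exp(-((iy - by) ** 2 + (ix - bx) ** 2) / (2 * sigma**2))
    codebook += lr * h[:, :, None] * (x - codebook)

# After training, neighbouring neurons hold similar reference vectors,
# so similar inputs land in adjacent map regions.
print(codebook.shape)
```

The b-tagging application then colours each trained map region by the fraction of b-jets landing there; missing inputs can be handled by restricting the distance computation to the available components.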
Evolutionary Algorithms • Natural evolution: generate a population of individuals with increasing fitness to environment • Evolutionary computation simulates the natural evolution on a computer • Process leading to maintenance or increase of a population • Ability to survive and reproduce in a specific environment • Quantitatively measured by evolutionary fitness • Goal of evolutionary computation: to generate a set of solutions (to a problem) of increasing quality • Genetic Algorithms (GA) (J. H. Holland, 1975): A. Drozdetskiy - GARCON • Genetic Programming (GP) (J. R. Koza, 1992) • Gene Expression Programming (GEP) (C. Ferreira, 2001): L. Teodorescu • Main differences: • Encoding method • Reproduction method Thomas Speer
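The selection/reproduction/fitness loop above is easiest to see on a toy problem. A minimal genetic-algorithm sketch (the "OneMax" fitness function, population size, and mutation rate are all invented for illustration, not taken from GARCON or the GEP talk):

```python
import random

# Minimal genetic algorithm: evolve bit-strings towards all ones.
random.seed(0)

def fitness(bits):
    # "OneMax" toy problem: individuals with more 1-bits are fitter.
    return sum(bits)

def evolve(pop_size=40, length=32, generations=60):
    pop = [[random.randint(0, 1) for _ in range(length)]
           for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: tournament of two, the fitter individual reproduces.
        parents = [max(random.sample(pop, 2), key=fitness)
                   for _ in range(pop_size)]
        # Reproduction: one-point crossover plus rare bit-flip mutation.
        pop = []
        for a, b in zip(parents[::2], parents[1::2]):
            cut = random.randrange(1, length)
            for child in (a[:cut] + b[cut:], b[:cut] + a[cut:]):
                pop.append([bit ^ (random.random() < 0.01) for bit in child])
    return max(pop, key=fitness)

best = evolve()
print(fitness(best))  # close to the optimum of 32
```

GP and GEP differ mainly in the encoding (trees, respectively linear chromosomes expressed as trees) and in the reproduction operators, as the slide notes.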
Session 3 • Methodology of Computations in Theoretical Physics • Loop technology • FORM and parallel version (ParForm) • Generators and Automators (automatic computation systems) • from physics processes to event generators • advances in algorithms and systems • optimize FFT with max in-cache operations (J. Raynolds) • Error-free algorithms to solve systems of linear equations (M. Morhac)
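The idea behind error-free solution of linear systems can be illustrated with exact rational arithmetic: every elimination step is performed over the rationals, so no rounding error can occur. This is a generic sketch of that idea (test matrix invented), not M. Morhac's specific algorithm:

```python
from fractions import Fraction

def solve_exact(A, b):
    """Gauss-Jordan elimination with partial pivoting over the rationals:
    all arithmetic is exact, so the result is free of rounding error."""
    n = len(A)
    M = [[Fraction(x) for x in row] + [Fraction(bi)]
         for row, bi in zip(A, b)]
    for col in range(n):
        # Pivot on the largest entry in the column (for the singular check).
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if M[pivot][col] == 0:
            raise ValueError("singular matrix")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

# A small Hilbert-like system, ill-conditioned in floating point but
# solved exactly here.
A = [[1, Fraction(1, 2)], [Fraction(1, 2), Fraction(1, 3)]]
b = [1, 0]
print(solve_exact(A, b))  # [Fraction(4, 1), Fraction(-6, 1)]
```

The price of exactness is that numerators and denominators grow during elimination, which is why such algorithms matter mostly for small, badly conditioned systems.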
Statistical Algorithms • Bayesian approach for upper limits and Confidence Levels (Zhu, Bityukov) • Chi2 for comparison of weighted and unweighted histograms (N. Gagunashvili) • algorithm introduced in ROOT by D. Hertl last summer • Machine learning approach to unfolding (N. Gagunashvili) • Two dimensional goodness of fit testing (R. Lopes)
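As a flavour of the upper-limit topic: for a background-free counting experiment with a flat prior on the Poisson mean, the Bayesian credible upper limit satisfies P(N ≤ n_observed | s) = 1 − CL. A stdlib-only sketch of that textbook case (not the specific methods of the Zhu or Bityukov talks):

```python
import math

def upper_limit(n_observed, cl=0.95):
    """Bayesian upper limit on a Poisson mean s, flat prior, no background:
    solve sum_{k<=n} exp(-s) s^k / k! = 1 - cl for s by bisection."""
    def tail(s):
        return sum(math.exp(-s) * s**k / math.factorial(k)
                   for k in range(n_observed + 1))
    lo, hi = 0.0, 50.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if tail(mid) > 1 - cl:  # tail decreases monotonically in s
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

for n in range(3):
    # n = 0 gives the familiar ~3.0 events at 95% CL.
    print(f"n = {n}: 95% upper limit = {upper_limit(n):.2f}")
```

With background or informative priors the posterior is no longer this simple, which is where the methods discussed in the session come in.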
Plenary Sessions • J. Schmidhuber: Recent Progress in Machine Learning • Recurrent Neural Networks • new feedback network: • Long Short-Term Memory (LSTM) • various applications: • robotics, speech recognition, time-series prediction, etc... • no slides posted, see his Web-site • (google Schmidhuber or Recurrent Neural Networks)
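The gating scheme behind LSTM can be sketched as a single forward step in NumPy (dimensions and random weights invented; the standard cell equations, not Schmidhuber's code):

```python
import numpy as np

# One LSTM-cell forward step: gates decide what to write to, erase from,
# and expose out of the cell state, which carries long-term memory.
rng = np.random.default_rng(0)
n_in, n_hid = 4, 8

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One weight matrix per gate, acting on [input, previous hidden state].
W = {g: rng.normal(scale=0.1, size=(n_hid, n_in + n_hid)) for g in "ifog"}

def lstm_step(x, h_prev, c_prev):
    z = np.concatenate([x, h_prev])
    i = sigmoid(W["i"] @ z)   # input gate: what to write
    f = sigmoid(W["f"] @ z)   # forget gate: what to erase
    o = sigmoid(W["o"] @ z)   # output gate: what to expose
    g = np.tanh(W["g"] @ z)   # candidate cell update
    c = f * c_prev + i * g    # cell state: additive, so gradients persist
    h = o * np.tanh(c)        # hidden state fed to the next step
    return h, c

h = c = np.zeros(n_hid)
for x in rng.normal(size=(5, n_in)):  # run a short toy sequence
    h, c = lstm_step(x, h, c)
print(h.shape, c.shape)
```

The additive cell-state update is what lets LSTM hold information over long sequences where plain recurrent networks forget.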
Plenary Sessions • J. Vink (Shell): Computing Challenges in Oil and Gas Field Simulation
Plenary sessions • L. Mullin (NSF): Grand Challenges in Computational Mathematics: Numerical, Symbolic and Algebraic Computing. An NSF View • Processing power doubles every 18 months (Moore's law) • Compilers (SW) double every 18 years
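The two doubling rates compound very differently; a back-of-the-envelope comparison over one 18-year span makes the point:

```python
# Hardware doubles every 18 months, compilers every 18 years (per the slide):
# over 18 years that is 12 hardware doublings versus a single compiler one.
years = 18
hardware_factor = 2 ** (years * 12 / 18)  # doubles every 18 months
compiler_factor = 2 ** (years / 18)       # doubles every 18 years
print(hardware_factor, compiler_factor)   # 4096.0 vs 2.0
```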
Conclusion • Proceedings on DVD with an ISBN number • It was a very nice conference • Many interesting talks on various subjects • Useful discussions • Well organized… with nice weather