Machine Learning in High Energy Physics

Satyaki Bhattacharya, Saha Institute of Nuclear Physics

Why machine learning? We are always looking for a faint signal against a large background. To do so, we work in a high-dimensional (multivariate) space. We would like to draw the best decision boundaries we can in this space. ML algorithms are “universal approximators” and can do this job well.

Brief history: Machine-learning-based multivariate techniques have been used since the Large Electron Positron (LEP) collider era, when the artificial neural network (ANN) was the tool of choice. A study in the MiniBooNE experiment (Byron Roe et al.) demonstrated better performance with boosted decision trees (BDT). BDTs have been (almost) the default since then, with many successes. The Higgs discovery was heavily aided by BDTs.

The new era of ML in HEP: The era of deep learning came with the advent of graphics processing units (GPUs). Computing power increased by orders of magnitude, which made possible what was not possible before. Since the discovery of the Higgs there has been an explosion of applications of machine learning in HEP: in calorimetry and tracking, Monte Carlo generation, data quality and compute job monitoring...

In this talk... There are two broad areas of machine learning applications: image processing and natural language processing. Both have found potential applications in HEP. I will talk about a few interesting use cases. This talk is not a comprehensive review.

Resources: Community white paper for machine learning in HEP (https://arxiv.org/pdf/1807.02876.pdf). Review by Kyle Cranmer (https://arxiv.org/pdf/1806.11484v1.pdf).

Artificial Neural Network (ANN). Picture credit: Shamik Ghosh. [Figure: an input layer Ui connected through weights W1j, W2j to the outputs Hj of a hidden layer. Input vector U (2x1) = [5, 6]; weight matrix Wij (2x4) = [[3, 7, 1, 3], [-2, 8, 5, 9]]; multiplied output vector WijT x U (4x1) = [3, 83, 35, 69]; output vector H (4x1) = [0.95, 1, 1, 1].]
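
The numbers in this figure can be reproduced with a few lines of NumPy. The sketch below is only illustrative: the slide does not name the activation function, so the sigmoid is an assumption, chosen because it maps 3 to about 0.95 and the larger values to 1, as in the figure.

```python
import numpy as np

def sigmoid(z):
    # Assumed activation; the slide does not state which one was used.
    return 1.0 / (1.0 + np.exp(-z))

# Input vector U (2x1) and weight matrix W (2x4), as given in the figure.
U = np.array([[5.0], [6.0]])
W = np.array([[3.0, 7.0, 1.0, 3.0],
              [-2.0, 8.0, 5.0, 9.0]])

Z = W.T @ U          # multiplied output vector, shape (4, 1)
H = sigmoid(Z)       # output of the hidden layer, shape (4, 1)

print(Z.ravel())                 # weighted sums: 3, 83, 35, 69
print(np.round(H.ravel(), 2))    # activations, rounded to two decimals
```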

Artificial Neural Network (ANN)... The output of layer n-1 is fed to layer n. All nodes are connected. [Figure: input layer, hidden layer 1, hidden layer 2, ..., hidden layer n.]

Artificial Neural Network (ANN)... Softmax converts the output to probabilities. [Figure: input layer, hidden layers, outputs [6, 9], output probabilities [0.04, 0.95].]
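
As a minimal sketch of the softmax step, the function below exponentiates the raw outputs and normalises them to sum to one. Subtracting the maximum before exponentiating is a standard numerical-stability trick and is not something stated on the slide. For the outputs 6 and 9 from the figure it gives roughly 0.05 and 0.95.

```python
import numpy as np

def softmax(z):
    """Convert raw network outputs into probabilities that sum to 1."""
    z = np.asarray(z, dtype=float)
    e = np.exp(z - z.max())   # shift by the maximum for numerical stability
    return e / e.sum()

p = softmax([6.0, 9.0])       # raw outputs taken from the figure
print(np.round(p, 3))
```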

Gamma, pi0, beam halo: Distinct shower shapes of prompt photons, pi0s and photons from beam halo. S. Ghosh, A. Harilal, A. R. Sahasranshu, R. Singh, S. Bhattacharya, arXiv:1808.03987v3.

Gamma, pi0, beam halo: Lateral shapes of e.m. showers in a simulated crystal calorimeter. [Figure panels: halo, pi0, prompt, converted.]

What can image processing do? Can a machine algorithm, unaware of the underlying physics, pick up the visual features that separate event classes? The answer seems to be: yes, very well.

Convolutional neural network: The convolutional layer of a CNN extracts features such as edges, horizontal lines and vertical lines from the bitmap image. A “filter”, a small n x n matrix of 0s and 1s, slides over the original image and produces a convolved image. Different filters --> different internal images, so the network holds many internal representations of the original image. The large image data is then reduced by a coarse-graining algorithm: pooling.
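
The sliding-filter and pooling steps can be sketched in plain NumPy. This is only an illustration: the 6x6 test image, the two 3x3 filters and the choice of max pooling are assumptions made for the example, and real frameworks learn the filter values during training.

```python
import numpy as np

def convolve2d_valid(image, kernel):
    """Slide the kernel over the image (no padding, stride 1).
    As in most deep learning frameworks, the kernel is not flipped."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(image, size=2):
    """Coarse-grain by keeping the maximum of each size x size block."""
    h = image.shape[0] // size * size
    w = image.shape[1] // size * size
    blocks = image[:h, :w].reshape(h // size, size, w // size, size)
    return blocks.max(axis=(1, 3))

# Hypothetical 6x6 bitmap containing one vertical line.
image = np.zeros((6, 6))
image[:, 2] = 1.0

vertical = np.array([[0, 1, 0], [0, 1, 0], [0, 1, 0]], dtype=float)
horizontal = np.array([[0, 0, 0], [1, 1, 1], [0, 0, 0]], dtype=float)

feat_v = convolve2d_valid(image, vertical)     # responds strongly to the line
feat_h = convolve2d_valid(image, horizontal)   # responds only weakly
pooled = max_pool(feat_v)                      # 4x4 feature map reduced to 2x2

print(feat_v.max(), feat_h.max())
print(pooled)
```

The vertical filter gives a much larger response than the horizontal one on this image, which is the sense in which different filters produce different internal images.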

CNN architecture: Multiple layers of convolution and pooling, followed by a fully connected layer (just an ANN). https://adeshpande3.github.io/A-Beginner%27s-Guide-To-Understanding-Convolutional-Neural-Networks/

End to End (E2E) event image. [Figure labels: photon, jet.]

End to End (E2E) event classifier. [Figure labels: CNN, pure kinematic network.]

End to End (E2E) event classifier: It can learn the angular distribution as well as the energy scale of the constituent hits. It can learn about the shower shape. On this simplified analysis example, it performs significantly better than a purely kinematics-based analysis. Its true potential lies in BSM searches with complex final states.

CNN in neutrino event classification: Narrow-band, ~2 GeV, (mainly) νμ beam. Near and far detectors 810 km apart. The detectors are liquid scintillator cell arrays. The time and intensity of the scintillation are recorded and tracks are reconstructed. 896 planes with 384 cells each. arXiv:1604.01444v3.

NOvA event classes:
νμ CC: A muon plus a hadronic component. A long, low dE/dx track corresponding to the track of a minimum ionizing muon.
νe CC: An electron plus a hadronic component. The electron topology is typically a wide shower, rather than a track, whose dimensions are related to the radiation length of the detector material.
ντ CC: A tau plus a hadronic component. The tau decays immediately with varying final state probabilities that may produce pions, electrons, muons, and neutrinos. The production threshold is 3.4 GeV, at the upper end of the energy spectrum seen in the NOvA detectors.
ν NC: The outgoing lepton is a neutrino. Only the hadronic component is visible, making the flavor impossible to identify. Can fake CC events.

A charged current event

Charged current resonant topology (RESonant)

A CC event with a muon track

What was the input? Two sets of images, the X-Z and Y-Z views, rendered as 100 x 80 grids representing a region 14.52 m deep x 4.18 m wide. 8-bit encoding was used instead of floating point. This doesn’t significantly compromise representational capability and gains a factor of 8 in memory footprint.
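
The memory gain from 8-bit encoding can be illustrated with a toy pixel map. The sketch assumes the floating-point baseline is 64-bit, which is what makes the ratio exactly 8; the random image, the `to_uint8` helper and the linear scaling to the range 0 to 255 are all hypothetical and are not taken from the CVN code.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical 100 x 80 pixel map with float64 values in [0, 1).
image = rng.random((100, 80))

def to_uint8(img, max_value):
    """Linearly map [0, max_value] onto the 256 levels of an 8-bit integer."""
    scaled = np.clip(img / max_value, 0.0, 1.0) * 255.0
    return np.round(scaled).astype(np.uint8)

encoded = to_uint8(image, max_value=1.0)
decoded = encoded.astype(np.float64) / 255.0

ratio = image.nbytes / encoded.nbytes        # 8 bytes per pixel vs 1 byte
max_err = np.abs(decoded - image).max()      # worst-case rounding error

print(ratio, max_err)
```

The rounding error is bounded by half a quantisation step (0.5/255 of full scale), which gives a feel for why the loss of precision is small.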

Convolutional Visual Network: CVN was developed using the Caffe framework. Caffe is an open framework for deep learning applications; it is highly modular and makes accelerated training on graphics processing units (GPUs) straightforward. Common layer types are pre-implemented in Caffe, and it is easy to make new architectures by specifying the desired layers and their connections in a configuration file. Caffe is packaged with a configuration file implementing GoogLeNet.

What did CVN learn on? 4.3 million simulated events with realistic beam composition: 1/3 unoscillated, 1/3 oscillated to νe and 1/3 to ντ. Regularisation was done with a penalty term in the loss function that is quadratic in the weights.
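
A penalty term quadratic in the weights is commonly called L2 regularisation or weight decay. The sketch below shows the general form only; the function names, the toy weights and the value of the coefficient `lam` are assumptions for illustration and are not the CVN settings.

```python
import numpy as np

def l2_penalty(weights, lam):
    """Quadratic penalty: lam times the sum of all squared weights."""
    return lam * sum(np.sum(w ** 2) for w in weights)

def regularised_loss(data_loss, weights, lam):
    """Total loss = loss on the training data + quadratic weight penalty."""
    return data_loss + l2_penalty(weights, lam)

# Hypothetical weights of a two-layer network and a hypothetical data loss.
weights = [np.array([[1.0, -2.0], [0.5, 0.0]]), np.array([3.0])]
loss = regularised_loss(data_loss=0.7, weights=weights, lam=0.01)

print(loss)   # 0.7 + 0.01 * (1 + 4 + 0.25 + 0 + 9)
```

Large weights are penalised more strongly than small ones, which discourages the network from fitting fluctuations in the training sample.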

CVN architecture

CVN: convolutional layers. [Figure labels: 64 filters, input image.]

Performance and output

CVN performs better! For muon events the performance is comparable. For electron events (harder to separate) CVN achieves 49% efficiency against 35% for the established NOvA analysis, at very similar purity! CVN is not overly sensitive to systematics.
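
For readers unfamiliar with the two figures of merit, the sketch below uses the usual definitions: efficiency is the fraction of true signal events that pass the selection, and purity is the fraction of selected events that are true signal. The event counts are hypothetical and are not NOvA numbers.

```python
def efficiency(selected_signal, total_signal):
    """Fraction of all true signal events that pass the selection."""
    return selected_signal / total_signal

def purity(selected_signal, selected_background):
    """Fraction of the selected events that are true signal."""
    return selected_signal / (selected_signal + selected_background)

# Hypothetical counts, for illustration only.
total_signal = 200
selected_signal = 80
selected_background = 20

eff = efficiency(selected_signal, total_signal)
pur = purity(selected_signal, selected_background)
print(eff, pur)
```

Comparing two selections at similar purity, as the slide does, isolates the gain in efficiency.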

Natural Language Processing: Tomorrow, and tomorrow, and tomorrow, Creeps in this petty pace from day to day, To the last syllable of recorded time; And all our yesterdays have lighted fools The way to dusty death.

Natural Language Processing: [[[Tomorrow], and tomorrow], and tomorrow], [creeps in [this [petty pace]]] [from [day to day]], to the last syllable of recorded time; And all our yesterdays have lighted fools the way to dusty death. Languages have a nested (or recursive) structure.

ML in Natural Language Processing: Recursive neural networks. Language model: words are represented as vectors. Nearby words --> nearby vectors. [Figure: “The cat sat on the hat”.]
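
One common way to quantify how near two word vectors are is the cosine similarity. The sketch below uses hand-picked three-dimensional vectors purely for illustration; real embeddings are learned from text and have many more dimensions, and the word "dog" is added here only as a hypothetical neighbour of "cat".

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1 means same direction."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical embeddings, chosen by hand for this example.
embedding = {
    "cat": np.array([0.9, 0.8, 0.1]),
    "dog": np.array([0.8, 0.9, 0.2]),
    "hat": np.array([0.1, 0.2, 0.9]),
}

sim_cat_dog = cosine_similarity(embedding["cat"], embedding["dog"])
sim_cat_hat = cosine_similarity(embedding["cat"], embedding["hat"])
print(sim_cat_dog, sim_cat_hat)
```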

ML in Natural Language Processing: The cat sat on the hat

ML in Natural Language Processing: The cat sat on the hat

Jet clustering in HEP: Parameters: a, R and pTmin. a = 1, 0, -1: kT, Cambridge-Aachen, anti-kT. The clustering history is a recursive binary tree. [Figure: binary tree with root Vjet and leaves V1, V2, V3, V4, V5, V6.]
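
A minimal sketch of sequential recombination with the three parameters named above, assuming the standard generalised-kT distances d_ij = min(pTi^2a, pTj^2a) ΔR_ij^2 / R^2 and d_iB = pTi^2a, with four-momenta added on merging. The function names and the toy event are hypothetical, and this brute-force loop is for illustration only; production analyses use an optimised library such as FastJet.

```python
import math

def massless(pt, eta, phi):
    """Four-momentum (px, py, pz, E) of a massless particle."""
    return (pt * math.cos(phi), pt * math.sin(phi),
            pt * math.sinh(eta), pt * math.cosh(eta))

def pt_of(p):
    return math.hypot(p[0], p[1])

def delta_r2(p, q):
    """Squared distance in the rapidity-azimuth plane."""
    def rapidity(v):
        return 0.5 * math.log((v[3] + v[2]) / (v[3] - v[2]))
    dy = rapidity(p) - rapidity(q)
    dphi = abs(math.atan2(p[1], p[0]) - math.atan2(q[1], q[0]))
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi
    return dy * dy + dphi * dphi

def cluster(particles, a, R, pt_min):
    """a = 1: kT, a = 0: Cambridge-Aachen, a = -1: anti-kT."""
    objs = list(particles)
    jets = []
    while objs:
        best = None                      # (distance, i, j); j is None for the beam
        for i, p in enumerate(objs):
            d_ib = pt_of(p) ** (2 * a)
            if best is None or d_ib < best[0]:
                best = (d_ib, i, None)
            for j in range(i + 1, len(objs)):
                q = objs[j]
                d_ij = (min(pt_of(p) ** (2 * a), pt_of(q) ** (2 * a))
                        * delta_r2(p, q) / R ** 2)
                if d_ij < best[0]:
                    best = (d_ij, i, j)
        _, i, j = best
        if j is None:
            jets.append(objs.pop(i))     # closest to the beam: promote to a jet
        else:
            q = objs.pop(j)              # j > i, so remove j first
            p = objs.pop(i)
            objs.append(tuple(x + y for x, y in zip(p, q)))  # merge the pair
    return sorted((jet for jet in jets if pt_of(jet) >= pt_min),
                  key=pt_of, reverse=True)

# Hypothetical event: two collimated sprays of particles and one soft particle.
particles = [massless(50, 0.0, 0.0), massless(30, 0.1, 0.1),
             massless(5, -0.1, 0.05), massless(40, 1.0, 3.0),
             massless(20, 1.1, 2.9), massless(1, -2.0, 1.5)]

jets = cluster(particles, a=-1, R=0.4, pt_min=10.0)
print([round(pt_of(jet), 1) for jet in jets])
```

Each merge joins exactly two objects, so recording the merges would give the binary tree shown in the figure.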

Jet clustering: jet embedding. [Figure: binary tree with root Vjet and leaves V1, V2, V3, V4, V5, V6.]

Jet clustering: event embedding

Generative vs. discriminative: Discriminative: features (x) --> labels (y); given the features, output p(y|x). Generative: labels --> features; given a label, generate features according to p(x|y).
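
The two directions can be illustrated with a one-dimensional toy model. The sketch assumes two classes with equal priors whose feature x is Gaussian with unit variance and class-dependent mean; the means and function names are hypothetical. `generate` samples from p(x|y), and `posterior_signal` evaluates p(y=1|x) through Bayes' theorem.

```python
import numpy as np

# Hypothetical class means; unit variance and equal priors are assumed.
MEANS = {0: -1.0, 1: 1.0}

def generate(label, n, rng):
    """Generative direction: given a label y, draw features x from p(x|y)."""
    return rng.normal(MEANS[label], 1.0, size=n)

def gaussian_pdf(x, mean):
    return np.exp(-0.5 * (x - mean) ** 2) / np.sqrt(2.0 * np.pi)

def posterior_signal(x):
    """Discriminative direction: given features x, return p(y=1|x)."""
    p1 = gaussian_pdf(x, MEANS[1])
    p0 = gaussian_pdf(x, MEANS[0])
    return p1 / (p0 + p1)

rng = np.random.default_rng(1)
samples = generate(1, 10000, rng)

print(samples.mean())            # close to the class-1 mean
print(posterior_signal(0.0))     # 0.5: x = 0 is equally likely in both classes
print(posterior_signal(2.0))     # close to 1
```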

Generative Adversarial Network: “Most interesting idea in machine learning in the last decade” - LeCun

“Christie’s sold a portrait for $432,000 that had been generated by a GAN, based on open-source code written by Robbie Barrat of Stanford. Like most true artists, he didn’t see any of the money, which instead went to the French company, Obvious.”

Monte Carlo generation with GAN: First attempts have been made to generate calorimetric data with GANs. GEANT shower simulation can take minutes; a GAN can be 5 orders of magnitude faster. There are challenges with accuracy. Next: explore architectures and run systematic hyper-parameter scans on HPCs to achieve the required performance. The technique can be useful for final states with boosted objects, where full GEANT-based simulations are required. An example is CaloGAN, a simulation of a segmented liquid-argon calorimeter. M. Paganini, L. de Oliveira, and B. Nachman, “CaloGAN: Simulating 3D High Energy Particle Showers in Multi-Layer Electromagnetic Calorimeters with Generative Adversarial Networks” (2017), arXiv:1705.02355 [hep-ex].

How well does CaloGAN learn?

Does the GAN memorize? The Euclidean distance between each GAN event and its nearest GEANT event shows that the GAN does not just memorize. Promising, but it still needs tuning.
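
A nearest-neighbour check of this kind can be sketched as follows. Random arrays stand in for shower images here, and the function name and image shape are hypothetical; this is not the CaloGAN authors' code. A generator that had simply memorized its training set would produce distances of zero.

```python
import numpy as np

def nearest_distances(generated, reference):
    """For each generated image, the Euclidean distance to the closest
    reference image (images are flattened to vectors first)."""
    g = generated.reshape(len(generated), -1)
    r = reference.reshape(len(reference), -1)
    diff = g[:, None, :] - r[None, :, :]            # all pairwise differences
    return np.sqrt((diff ** 2).sum(axis=2)).min(axis=1)

rng = np.random.default_rng(0)
reference = rng.random((50, 12, 12))   # stand-in for GEANT shower images
memorized = reference[:5].copy()       # what a memorizing generator would emit
fresh = rng.random((5, 12, 12))        # stand-in for genuinely new showers

d_memorized = nearest_distances(memorized, reference)
d_fresh = nearest_distances(fresh, reference)
print(d_memorized)
print(d_fresh)
```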

The Tracking Challenge

Conventional tracking: The most CPU-intensive low-level reconstruction. Hits --> clusters --> track seeds --> Kalman filter. 10^8 channels --> 10^4 hits --> 10^3 tracks.

Trend in tracking: The explosion in combinatorics is the computing challenge. https://indico.cern.ch/event/686555/contributions/2976579/attachments/1680748/2700965/ICHEP_TrackML_Jul052018.pdf
