Machine Learning in High Energy Physics

Satyaki Bhattacharya, Saha Institute of Nuclear Physics

Presentation Transcript


  1. Machine Learning in High Energy Physics. Satyaki Bhattacharya, Saha Institute of Nuclear Physics.

  2. Why machine learning? We are always looking for a faint signal against a large background. To do so we work in a high-dimensional (multivariate) space. We would like to draw the best decision boundaries we can in this space. ML algorithms are “universal approximators” and can do this job well.

  3. Brief history. Machine-learning-based multivariate techniques have been used since the Large Electron Positron (LEP) collider era. The Artificial Neural Network (ANN) was the tool of choice. A study (Byron Roe et al.) in the MiniBooNE experiment demonstrated better performance with Boosted Decision Trees (BDT). BDT has been (almost) the default since then, with many successes. The Higgs discovery was heavily aided by BDT.

  4. The new era of ML in HEP. The era of deep learning came along with the advent of graphics processing units (GPU): an orders-of-magnitude increase in computing power made possible what was not possible before. Since the discovery of the Higgs there has been an explosion of applications of machine learning in HEP: in calorimetry and tracking, Monte Carlo generation, data quality and compute job monitoring...

  5. In this talk... Two broad areas of machine learning application are image processing and natural language processing. Both have found potential applications in HEP. I will talk about a few interesting use cases. This talk is not a comprehensive review.

  6. Resources. Community white paper for machine learning in HEP (https://arxiv.org/pdf/1807.02876.pdf). Review by Kyle Cranmer (https://arxiv.org/pdf/1806.11484v1.pdf).

  7. Artificial Neural Network (ANN). Picture credit: Shamik Ghosh. [Figure: input layer Ui, weights W1j, output of hidden layer Hj, weights W2j.] Worked example: input vector U (2x1) = [5, 6]^T; weight matrix Wij (2x4) = [[3, 7, 1, 3], [-2, 8, 5, 9]]; multiplied output vector Wij^T x U (4x1) = [3, 83, 35, 69]^T; output vector H (4x1) = [0.95, 1., 1., 1.]^T.
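The numbers on slide 7 can be reproduced with a few lines of plain Python. This is only a sketch: the slide does not name the activation function, so a sigmoid is assumed here because it maps 3 to about 0.95 and the larger values to 1, matching the output vector H shown. The function names are illustrative.

```python
import math

def sigmoid(z):
    """Logistic activation (an assumption; the slide does not name the activation)."""
    return 1.0 / (1.0 + math.exp(-z))

def hidden_layer(weights, inputs):
    """Return (W^T u, sigmoid(W^T u)) for a weight matrix with n_in rows and n_hidden columns."""
    n_hidden = len(weights[0])
    pre_activation = [
        sum(weights[i][j] * inputs[i] for i in range(len(inputs)))
        for j in range(n_hidden)
    ]
    return pre_activation, [sigmoid(z) for z in pre_activation]

U = [5, 6]                 # input vector (2x1) from the slide
W = [[3, 7, 1, 3],         # weight matrix (2x4) from the slide
     [-2, 8, 5, 9]]

pre_activation, H = hidden_layer(W, U)
print(pre_activation)               # [3, 83, 35, 69]
print([round(h, 2) for h in H])     # [0.95, 1.0, 1.0, 1.0]
```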

  8. Artificial Neural Network (ANN)... The output of layer n-1 is fed to layer n. All nodes are connected. [Figure: input layer, hidden layer 1, hidden layer 2, ..., hidden layer n.]

  9. Artificial Neural Network (ANN)... Softmax converts the output to probabilities. [Figure: input layer, hidden layers, output probabilities; example outputs [6, 9] become probabilities [0.04, 0.95].]
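As a rough check of the softmax example on slide 9, the sketch below applies the standard softmax formula to the two output values 6 and 9. Subtracting the maximum before exponentiating is a common numerical-stability choice and is not taken from the slides. The exact results are about 0.047 and 0.953, which the slide appears to show truncated to two decimals.

```python
import math

def softmax(scores):
    """Convert raw output scores into probabilities that sum to 1."""
    largest = max(scores)                       # shift for numerical stability
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

probabilities = softmax([6.0, 9.0])
print([round(p, 3) for p in probabilities])     # [0.047, 0.953]
```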

  10. Gamma, pi0, beam-halo. Distinct shower shapes of prompt photon, pi0 and photon from beam halo. S. Ghosh, A. Harilal, A. R. Sahasranshu, R. Singh, S. Bhattacharya, arXiv:1808.03987v3.

  11. Gamma, pi0, beam-halo. E.m. shower: lateral shower shapes in a simulated crystal calorimeter. [Figure panels: halo, pi0, prompt, converted.]

  12. What can image processing do? Can a machine algorithm, unaware of the underlying physics, pick up the visual features needed to separate event classes? The answer seems to be: yes, very well.

  13. Convolutional neural network. The convolutional layer of a CNN extracts features like edges, horizontal lines and vertical lines from the bitmap image. A “filter”, a small n x n matrix of 0s and 1s, slides over the original image, producing a convolved image. Different filters --> different internal images, giving many internal representations of the original image. Large image data is reduced by a coarse-graining algorithm: pooling.
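The sliding filter and the pooling step from slide 13 can be sketched in plain Python as follows. The 6 x 6 image, the two 3 x 3 filters and the 2 x 2 max pooling are illustrative choices, not taken from the talk; as is usual in CNNs, the "convolution" here is a sliding product-sum without flipping the filter.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (stride 1, no padding) and sum the products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

def max_pool(image, size=2):
    """Coarse-grain the image by keeping the maximum of each size x size block."""
    return [
        [
            max(image[r + i][c + j] for i in range(size) for j in range(size))
            for c in range(0, len(image[0]) - size + 1, size)
        ]
        for r in range(0, len(image) - size + 1, size)
    ]

# Toy 6 x 6 bitmap containing a single vertical line in column 2.
image = [[0, 0, 1, 0, 0, 0] for _ in range(6)]

vertical_filter = [[0, 1, 0], [0, 1, 0], [0, 1, 0]]
horizontal_filter = [[0, 0, 0], [1, 1, 1], [0, 0, 0]]

vertical_map = convolve2d(image, vertical_filter)      # responds strongly to the line
horizontal_map = convolve2d(image, horizontal_filter)  # responds only weakly

print(vertical_map[0])           # [0, 3, 0, 0]
print(max_pool(vertical_map))    # [[3, 0], [3, 0]]
print(max_pool(horizontal_map))  # [[1, 1], [1, 1]]
```

The vertical filter gives a response of 3 where it overlaps the line and 0 elsewhere, while the horizontal filter never exceeds 1; each filter yields a different internal image of the same input.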

  14. CNN architecture. Multiple layers of convolution and pooling followed by a fully connected layer (just an ANN). https://adeshpande3.github.io/A-Beginner%27s-Guide-To-Understanding-Convolutional-Neural-Networks/

  16. End to End (E2E) event image. [Figure labels: photon, jet.]

  17. End to End (E2E) event classifier. [Figure labels: CNN, pure kinematic network.]

  18. End to End (E2E) event classifier. Can learn the angular distribution as well as the energy scale of constituent hits. Can learn about the shower shape. On this simplified analysis example, it performs significantly better than a purely kinematics-based analysis. The true potential lies in BSM searches with complex final states.

  19. CNN in neutrino event classification. Narrow-band ~2 GeV (mainly) νμ beam. Near and far detectors 810 km apart. Detectors are liquid scintillator cell arrays. Time and intensity of scintillation are recorded and tracks reconstructed. 896 planes with 384 cells each. arXiv:1604.01444v3.

  20. NOvA event classes. νμ CC: a muon plus a hadronic component. Long, low dE/dx track corresponding to the track of a minimum-ionizing muon. νe CC: an electron plus a hadronic component. The electron topology is typically a wide shower, rather than a track, whose dimensions are related to the radiation length of the detector material. ντ CC: a tau plus a hadronic component. The tau decays immediately with varying final state probabilities that may produce pions, electrons, muons, and neutrinos. The production threshold is 3.4 GeV, at the upper end of the energy spectrum seen in the NOvA detectors. ν NC: the outgoing lepton is a neutrino. Only the hadronic component is visible, making the flavor impossible to identify. Can fake CC.

  21. A charged current event

  22. Charged current resonant topology (RESonant)

  23. A CC event with a muon track

  24. What was input? Two sets of images, X-Z and Y-Z views. Rendered as 100 x 80 grids representing a region 14.52 m deep x 4.18 m wide. 8-bit encoding instead of floating point. This doesn’t significantly compromise representational capability and reduces the memory footprint by a factor of 8.
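The memory argument on slide 24 can be illustrated with a small sketch. Several things here are assumptions: the slide does not say which floating-point width the factor of 8 refers to (64-bit floats are assumed below), the pixel intensities are made-up values, and the linear scaling to 0..255 is just one simple way to quantise.

```python
from array import array

def quantise_to_8bit(values, max_value):
    """Map floats in [0, max_value] linearly to integers 0..255 (out-of-range values are clipped)."""
    scale = 255.0 / max_value
    return array("B", (min(255, max(0, round(v * scale))) for v in values))

# Hypothetical pixel intensities for one 100 x 80 view (8000 pixels).
max_charge = 4.9
charges = [(i % 50) * 0.1 for i in range(100 * 80)]

as_float64 = array("d", charges)                  # 8 bytes per pixel
as_uint8 = quantise_to_8bit(charges, max_charge)  # 1 byte per pixel

bytes_float64 = len(as_float64) * as_float64.itemsize
bytes_uint8 = len(as_uint8) * as_uint8.itemsize

# Largest error introduced by the 8-bit encoding, in the original units.
step = max_charge / 255.0
worst_error = max(abs(q * step - v) for q, v in zip(as_uint8, charges))

print(bytes_float64, bytes_uint8, bytes_float64 // bytes_uint8)   # 64000 8000 8
```

In this toy case the worst-case error is at most half a quantisation step, about 0.2% of the full range.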

  25. Convolutional Visual Network. CVN was developed using the Caffe framework. Caffe is an open framework for deep learning applications; it is highly modular and makes accelerated training on graphics processing units (GPU) straightforward. Common layer types are pre-implemented in Caffe. It is easy to make new architectures by specifying the desired layers and their connections in a configuration file. Caffe is packaged with a configuration file implementing GoogLeNet.

  26. What did CVN learn on? 4.3 million simulated events with realistic beam composition: 1/3 unoscillated, 1/3 to νe and 1/3 to ντ. Regularisation was done with a penalty term quadratic in the weights in the loss function.
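A penalty term quadratic in the weights, as mentioned on slide 26, is commonly written as the data loss plus a strength times the sum of squared weights. The sketch below assumes a cross-entropy data loss and a hypothetical penalty strength; neither is specified in the slides, and the numbers are made up.

```python
import math

def cross_entropy(probabilities, true_index):
    """Data term: negative log-probability assigned to the true class."""
    return -math.log(probabilities[true_index])

def l2_penalty(weights, strength):
    """Regularisation term: strength times the sum of squared weights."""
    return strength * sum(w * w for w in weights)

def l2_penalty_gradient(weights, strength):
    """Gradient of the penalty: each weight is pulled towards zero."""
    return [2.0 * strength * w for w in weights]

def regularised_loss(probabilities, true_index, weights, strength):
    return cross_entropy(probabilities, true_index) + l2_penalty(weights, strength)

# Hypothetical classifier output, true class, weights and penalty strength.
probabilities = [0.7, 0.2, 0.1]
weights = [0.5, -1.0, 2.0]
strength = 0.01

loss = regularised_loss(probabilities, 0, weights, strength)
penalty = l2_penalty(weights, strength)
gradient = l2_penalty_gradient(weights, strength)
print(round(penalty, 4), round(loss, 4))
```

Because the penalty grows with the square of each weight, its gradient is proportional to the weight itself, which discourages very large weights during training.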

  27. CVN architecture

  28. CVN: convolutional layers. [Figure labels: input image, 64 filters.]

  29. Performance and output

  30. CVN performs better! For muon events the performance is comparable. For electron events (harder to separate) CVN achieves 49% efficiency against 35% for the established NOvA analysis, at very similar purity! CVN is not overly sensitive to systematics.

  31. Natural Language Processing. Tomorrow, and tomorrow, and tomorrow, Creeps in this petty pace from day to day, To the last syllable of recorded time; And all our yesterdays have lighted fools The way to dusty death.

  32. Natural Language Processing. [[[Tomorrow], and tomorrow], and tomorrow], [creeps in [this [petty pace]]] [from [day to day]], to the last syllable of recorded time; And all our yesterdays have lighted fools the way to dusty death. Languages have a nested (or recursive) structure.

  33. ML in Natural Language Processing. Recursive Neural Networks. The cat sat on the hat. Language model: words are represented as vectors. Nearby words --> nearby vectors.

  34. ML in Natural Language Processing. The cat sat on the hat

  35. ML in Natural Language Processing. The cat sat on the hat

  36. ML in Natural Language Processing. The cat sat on the hat

  37. ML in Natural Language Processing. The cat sat on the hat

  38. Jet clustering in HEP. Parameters: a, R and pTmin. a = 1, 0, -1: kT, Cambridge-Aachen, anti-kT. Recursive binary tree. [Figure: binary tree with nodes Vjet, V1, V2, V3, V4, V5, V6.]
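To make the role of a, R and pTmin on slide 38 concrete, here is a minimal sketch of sequential recombination using the standard generalised-kT distances d_ij = min(pT_i^(2a), pT_j^(2a)) ΔR_ij^2 / R^2 and d_iB = pT_i^(2a). The formulas are not given on the slide, and several simplifications are assumed: a brute-force pair search, a pT-weighted recombination instead of four-vector addition, and no treatment of the phi wrap-around when averaging. The five input particles are made up. Each jet records its clustering history as a nested tuple, which is the recursive binary tree the slide refers to.

```python
import math

def delta_r2(p, q):
    """Squared distance in the rapidity-azimuth plane."""
    dphi = abs(p["phi"] - q["phi"])
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi
    return (p["y"] - q["y"]) ** 2 + dphi ** 2

def merge(p, q):
    """Simplified pT-weighted recombination (assumes phi values far from the +-pi boundary)."""
    pt = p["pt"] + q["pt"]
    return {
        "pt": pt,
        "y": (p["pt"] * p["y"] + q["pt"] * q["y"]) / pt,
        "phi": (p["pt"] * p["phi"] + q["pt"] * q["phi"]) / pt,
        "tree": (p["tree"], q["tree"]),   # binary clustering history
    }

def cluster(particles, a=-1, R=0.4, pt_min=0.0):
    """a = 1: kT, a = 0: Cambridge-Aachen, a = -1: anti-kT."""
    objects = [dict(p, tree=i) for i, p in enumerate(particles)]
    jets = []
    while objects:
        # Smallest beam distance d_iB = pT^(2a).
        beam_index = min(range(len(objects)), key=lambda i: objects[i]["pt"] ** (2 * a))
        smallest = objects[beam_index]["pt"] ** (2 * a)
        pair = None
        # Look for a pair distance d_ij below the current minimum.
        for i in range(len(objects)):
            for j in range(i + 1, len(objects)):
                d_ij = (min(objects[i]["pt"] ** (2 * a), objects[j]["pt"] ** (2 * a))
                        * delta_r2(objects[i], objects[j]) / R ** 2)
                if d_ij < smallest:
                    smallest, pair = d_ij, (i, j)
        if pair is None:
            jet = objects.pop(beam_index)      # promote to a final jet
            if jet["pt"] >= pt_min:
                jets.append(jet)
        else:
            merged = merge(objects[pair[0]], objects[pair[1]])
            objects = [o for k, o in enumerate(objects) if k not in pair] + [merged]
    return jets

# Hypothetical particles: three close together, two more far away.
particles = [
    {"pt": 50.0, "y": 0.0, "phi": 0.0},
    {"pt": 30.0, "y": 0.1, "phi": 0.1},
    {"pt": 5.0, "y": -0.1, "phi": 0.05},
    {"pt": 40.0, "y": 2.0, "phi": 2.0},
    {"pt": 2.0, "y": 2.1, "phi": 2.1},
]

jets = cluster(particles, a=-1, R=0.4, pt_min=10.0)
for jet in jets:
    print(jet["pt"], jet["tree"])
# 85.0 (1, (0, 2))
# 42.0 (3, 4)
```

The nested tuples are what a recursive network would traverse: each inner pair is a node whose embedding is built from the embeddings of its two children.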

  39. Jet clustering: jet embedding. [Figure: binary tree with nodes Vjet, V1, V2, V3, V4, V5, V6.]

  40. Jet clustering: event embedding

  41. Generative vs. Discriminative. Discriminative: features (x) --> labels (y). Given features, output p(y|x). Generative: labels --> features. Given a label, generate features from p(x|y).
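The two directions on slide 41 can be contrasted on a toy one-dimensional problem. Everything in this sketch is assumed for illustration: two classes with Gaussian p(x|y), equal priors, and made-up means. Sampling from p(x|y) is the generative direction; p(y|x) is what a discriminative model outputs (a real classifier would learn it directly from data, whereas here it is computed with Bayes' rule only to show the quantity).

```python
import math
import random

# Hypothetical class-conditional Gaussians and priors.
CLASSES = {
    "signal": {"prior": 0.5, "mean": 1.0, "sigma": 1.0},
    "background": {"prior": 0.5, "mean": -1.0, "sigma": 1.0},
}

def gaussian_pdf(x, mean, sigma):
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def generate(label, rng):
    """Generative direction: given a label y, draw a feature x from p(x|y)."""
    params = CLASSES[label]
    return rng.gauss(params["mean"], params["sigma"])

def posterior(x):
    """Discriminative output: p(y|x) for every label, here obtained via Bayes' rule."""
    joint = {
        label: p["prior"] * gaussian_pdf(x, p["mean"], p["sigma"])
        for label, p in CLASSES.items()
    }
    total = sum(joint.values())
    return {label: value / total for label, value in joint.items()}

rng = random.Random(0)
samples = [generate("signal", rng) for _ in range(10000)]
sample_mean = sum(samples) / len(samples)

print(round(posterior(0.0)["signal"], 3))   # 0.5
print(round(posterior(2.0)["signal"], 3))   # 0.982
```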

  42. Generative Adversarial Network. “Most interesting idea in machine learning in the last decade” (LeCun).

  43. Machine Learning in High Energy Physics. Satyaki Bhattacharya, Saha Institute of Nuclear Physics.

  44. “Christie’s sold a portrait for $432,000 that had been generated by a GAN, based on open-source code written by Robbie Barrat of Stanford. Like most true artists, he didn’t see any of the money, which instead went to the French company, Obvious.”

  45. Monte Carlo generation with GAN. First attempts with GAN to generate calorimetric data. GEANT shower simulation can take minutes; a GAN can be 5 orders of magnitude faster. Challenges with accuracy. Next: explore architectures and systematic hyper-parameter scans on HPCs to achieve the required performance. The technique can be useful for final states with boosted objects, where full GEANT-based simulation is required. An example is CaloGAN: simulation of a liquid-Ar segmented calorimeter. M. Paganini, L. de Oliveira, and B. Nachman, “CaloGAN: Simulating 3D High Energy Particle Showers in Multi-Layer Electromagnetic Calorimeters with Generative Adversarial Networks” (2017), arXiv:1705.02355 [hep-ex].

  46. How well does CaloGAN learn?

  47. Does GAN memorize? The Euclidean distance between a GAN event and the nearest GEANT event shows that the GAN does not just memorize. Promising, but still needs tuning.
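The memorization check on slide 47 can be sketched as a nearest-neighbour search: for each generated event, find the Euclidean distance to the closest training event. The tiny three-cell "events" below are made up; a real check would use full calorimeter images and look at the distribution of these distances.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two events given as flat lists of cell energies."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest_distance(event, reference_events):
    """Distance from one generated event to its closest reference (training) event."""
    return min(euclidean(event, ref) for ref in reference_events)

# Hypothetical training (GEANT) and generated (GAN) events.
geant_events = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 3.0]]
gan_events = [
    [1.0, 0.0, 0.0],    # exact copy of a training event: distance 0
    [0.5, 1.5, 0.2],    # new event: non-zero distance to every training event
]

distances = [nearest_distance(event, geant_events) for event in gan_events]
print([round(d, 3) for d in distances])   # [0.0, 0.735]
```

A generator that only memorized its training set would produce distances piled up at zero; distances clearly away from zero indicate genuinely new events.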

  48. The Tracking Challenge

  49. Conventional Tracking. Most CPU-intensive low-level reconstruction. Hits --> clusters --> track seeds --> Kalman filter. 10^8 channels --> 10^4 hits --> 10^3 tracks.

  50. Trend in tracking. Explosion in combinatorics is the computing challenge. https://indico.cern.ch/event/686555/contributions/2976579/attachments/1680748/2700965/ICHEP_TrackML_Jul052018.pdf
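One way to get a feel for the combinatorial explosion is to count how many three-hit seed candidates exist if every triplet of hits is considered. This naive count is an assumption made purely for illustration; real seeding applies geometric constraints that remove most combinations, and the slides do not describe the seeding in detail.

```python
import math

def naive_triplet_count(n_hits):
    """Number of unordered hit triplets, with no geometric constraints applied."""
    return math.comb(n_hits, 3)

for n_hits in (10**3, 10**4, 2 * 10**4):
    print(n_hits, naive_triplet_count(n_hits))

# Doubling the number of hits multiplies the naive count by roughly 2^3 = 8.
growth = naive_triplet_count(2 * 10**4) / naive_triplet_count(10**4)
```

With 10^4 hits the unconstrained count is already above 10^11, and it grows with the cube of the number of hits.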
