Study of electron/pion separation in TRD. Recent results of BDT applications.

Semen Lebedev (GSI, Darmstadt, Germany and LIT JINR, Dubna, Russia), Claudia Höhne (GSI, Darmstadt, Germany), Gennady Ososkov (LIT JINR, Dubna, Russia)
Contact: S.Lebedev@gsi.de

TR production reminder and outline
Problems to consider:
1. Choose the parameters of the TR model (number of foils, foil thickness, gas thickness), grouped into 3 sets, for a better fit to experimental data.
2. Compare methods for e- identification in order to find the one that suppresses pions most efficiently:
• Simple cut on the sum of energy losses
• Photon cluster counting
• Ordered statistics (median)
• Artificial Neural Network
• Boosted Decision Tree
3. Study how robust the best method is to experimental factors such as calibration of measurements, pile-up of signals, etc.

1. Choice of the TR model parameters
Three sets of radiator parameters, each defined by:
• trdNFoils (number of polyethylene foils)
• trdDFoils (thickness of one foil [cm])
• trdDGap (thickness of the gap between foils [cm])

Comparison of TR simulation and experimental results for 1.5 GeV/c
[Figure: energy loss distributions, simulation vs. experiment, for params1, params2 and params3. Panel labels: overestimation, underestimation, realistic (params3).]
Distributions of energy losses for pions are the same for all parameter sets.

Checking the simulation: number of zero-TR layers
• The number of layers without TR is a very important value for electron identification.
• The best pion suppression results are expected for param1.
[Figure: distributions of the number of zero-TR layers for 1.5 GeV/c electrons, for param1, param2 and param3.]

2. Methods for e- identification and π suppression
• Simple cut on the sum of energy losses
• Photon cluster counting: a cluster is counted when the signal in one of the 12 TRD layers exceeds a 5 keV threshold
• Ordered statistics (median) [Figure: median distributions for π and e-]
• Artificial Neural Network: the main factor is an appropriate transform of all ΔE = dE/dx from the 12 TRD layers before they are input to the ANN. The ANN was taken from the ROOT package TMVA (Toolkit for MultiVariate data Analysis).
The main lesson: a transformation is needed to reduce the Landau tails of the dE/dx losses.
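The three simplest discriminants in the list above can be sketched in a few lines of Python. This is only an illustration: the function names and the example energy losses (in keV) are hypothetical, and the cluster definition assumed here is "one cluster per layer whose energy deposit exceeds the 5 keV threshold".

```python
from statistics import median

def sum_of_losses(losses):
    """Discriminant 1: total energy loss over all TRD layers."""
    return sum(losses)

def count_clusters(losses, threshold_kev=5.0):
    """Discriminant 2: number of layers above the threshold (assumed cluster definition)."""
    return sum(1 for e in losses if e > threshold_kev)

def median_loss(losses):
    """Discriminant 3: median of the losses, insensitive to a single Landau-tail hit."""
    return median(losses)

# Hypothetical energy losses in keV for 12 layers (not measured data).
pion_like = [1.2, 1.1, 1.4, 1.3, 9.0, 1.2, 1.5, 1.1, 1.3, 1.2, 1.4, 1.3]
electron_like = [1.3, 6.2, 1.5, 7.8, 1.4, 5.5, 9.1, 1.6, 6.7, 1.2, 8.3, 5.9]

for name, track in (("pion-like", pion_like), ("electron-like", electron_like)):
    print(name, sum_of_losses(track), count_clusters(track), median_loss(track))
```

In the pion-like example a single large deposit inflates the sum, while the median stays near the bulk of the distribution, which is the motivation for the ordered-statistics approach.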

2. Methods for e- identification (continued, 1)
• Decision Tree (DT): multiple cuts on X and Y arranged in a big tree (only growth steps 1-4 are shown).
[Figure: data sample; final result of DT training on a large sample (~1000).]

2. Methods for e- identification (continued, 2)
• Boosted Decision Tree (BDT)
• Given a training sample, boosting increases the weights of misclassified events (background which is classified as signal, or vice versa), such that they have a higher chance of being correctly classified in subsequent trees.
• Trees with more misclassified events are also weighted, having a lower weight than trees with fewer misclassified events.
• Build many trees (~1000) and do a weighted sum of event scores from all trees (the score is 1 if signal leaf, -1 if background leaf).
• The renormalized sum of all the scores, possibly weighted, is the final score of the event. High scores mean the event is most likely signal, low scores that it is most likely background.
[Figure: many weak trees (single-cut trees) combined (only 4 trees shown); the boosting algorithm combines 500 weak trees.]
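The boosting procedure described above can be illustrated with a minimal discrete AdaBoost over single-cut trees. This is a pure-Python sketch under stated assumptions, not the TMVA implementation used in the study: the function names, the two-variable toy sample and the choice of 3 trees are all hypothetical.

```python
import math

def stump_predict(stump, x):
    """A single-cut tree: +1 (signal) on one side of the cut, -1 (background) on the other."""
    _, feature, threshold, sign = stump
    return sign if x[feature] > threshold else -sign

def train_stump(xs, ys, weights):
    """Pick the single cut with the lowest weighted misclassification rate."""
    best = None
    for feature in range(len(xs[0])):
        for threshold in sorted({x[feature] for x in xs}):
            for sign in (1, -1):
                candidate = (0.0, feature, threshold, sign)
                err = sum(w for x, y, w in zip(xs, ys, weights)
                          if stump_predict(candidate, x) != y)
                if best is None or err < best[0]:
                    best = (err, feature, threshold, sign)
    return best

def train_adaboost(xs, ys, n_trees):
    n = len(xs)
    weights = [1.0 / n] * n
    forest = []
    for _ in range(n_trees):
        stump = train_stump(xs, ys, weights)
        err = min(max(stump[0], 1e-10), 1.0 - 1e-10)
        alpha = 0.5 * math.log((1.0 - err) / err)  # trees with fewer errors get more weight
        forest.append((alpha, stump))
        # Increase the weights of misclassified events, decrease the others.
        weights = [w * math.exp(-alpha * y * stump_predict(stump, x))
                   for x, y, w in zip(xs, ys, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return forest

def bdt_score(forest, x):
    """Renormalized weighted sum of the +1/-1 tree outputs, in [-1, 1]."""
    total_alpha = sum(alpha for alpha, _ in forest)
    return sum(alpha * stump_predict(stump, x) for alpha, stump in forest) / total_alpha

# Toy sample: signal (+1) only where both variables are large (hypothetical data).
xs = [(0.2, 0.3), (0.4, 0.8), (0.7, 0.2), (0.3, 0.5),
      (0.8, 0.9), (0.9, 0.6), (0.6, 0.7), (0.7, 0.8)]
ys = [-1, -1, -1, -1, 1, 1, 1, 1]

forest = train_adaboost(xs, ys, n_trees=3)
for x, y in zip(xs, ys):
    print(x, y, round(bdt_score(forest, x), 3))
```

No single cut separates this toy sample, but the weighted combination of three cuts does, which is the point of combining many weak trees.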

BDT algorithm (I)
Two steps of the algorithm:
• transform the energy losses
• evaluate the probability using a Boosted Decision Tree (BDT)

BDT algorithm (II)
• The Boosted Decision Tree (BDT) classifier from the TMVA package was used.
• Before it can be used, the BDT has to be trained.
Energy loss transformation:
1) Sort the energy losses.
2) Prepare a probability density function (PDF) for the ordered energy losses.
3) Calculate the likelihood ratio for each energy loss; these ratios are the input to the BDT.
The transformation is a very important step: without it the classifiers could not be trained properly.
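The transformation steps can be sketched as follows. The exact form of the likelihood ratio is not given in the text above, so this sketch assumes one common choice, PDF_e / (PDF_e + PDF_π), evaluated separately for each rank of the sorted losses. The histogram binning, the toy track generator and all names are hypothetical.

```python
import random

N_LAYERS = 12
N_BINS = 50
E_MAX = 25.0  # keV, hypothetical histogram range

def bin_index(e):
    """Histogram bin for an energy loss, clamped to the valid range."""
    return min(max(int(e / E_MAX * N_BINS), 0), N_BINS - 1)

def build_pdfs(tracks):
    """One normalized histogram per rank of the sorted energy losses."""
    hists = [[1.0] * N_BINS for _ in range(N_LAYERS)]  # start at 1 to avoid empty bins
    for losses in tracks:
        for rank, e in enumerate(sorted(losses)):
            hists[rank][bin_index(e)] += 1.0
    pdfs = []
    for h in hists:
        total = sum(h)
        pdfs.append([c / total for c in h])
    return pdfs

def transform(losses, pdf_e, pdf_pi):
    """Sort the losses and replace each by an assumed likelihood ratio in (0, 1)."""
    out = []
    for rank, e in enumerate(sorted(losses)):
        b = bin_index(e)
        pe, ppi = pdf_e[rank][b], pdf_pi[rank][b]
        out.append(pe / (pe + ppi))
    return out

def toy_track(rng, is_electron):
    """Toy model only: ionisation loss with a long tail, plus TR photons for electrons."""
    losses = []
    for _ in range(N_LAYERS):
        e = 1.0 + rng.expovariate(1.0)
        if is_electron and rng.random() < 0.5:
            e += rng.uniform(3.0, 10.0)
        losses.append(e)
    return losses

rng = random.Random(1)
electrons = [toy_track(rng, True) for _ in range(2000)]
pions = [toy_track(rng, False) for _ in range(2000)]
pdf_e = build_pdfs(electrons)
pdf_pi = build_pdfs(pions)

example = transform(electrons[0], pdf_e, pdf_pi)
print([round(v, 2) for v in example])
```

Each transformed value is bounded between 0 and 1, so the long Landau tails of the raw losses no longer dominate the classifier inputs.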

Results, different radiator parameters
• Statistics: around 1M electrons and 1M pions for each momentum
• Standard TRD geometry
• The standard ANN and the new BDT methods were used for electron identification
• ANN and BDT were trained separately for each momentum
[Figure: ANN and BDT results at 90% electron efficiency.]

Results of electron identification in TRD (I)
Electrons and pions with parameters θ = (2.5, 25), ϕ = (0, 360) for momentum 1.5 GeV/c.
[Figure: results at 90% electron efficiency.]

Results, ANN and BDT comparison (II)
Statistics: 1M electrons and 1M pions with parameters θ = (2.5, 25), ϕ = (0, 360) for certain momenta (1, 1.5, 2, 3, 4, 5, 7, 9, 11, 13 GeV/c).
[Figure: pion suppression vs. momentum at 90% electron efficiency. Black: BDT method. Blue: ANN method.]
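A figure of merit such as "pion suppression at 90% electron efficiency" can be computed from classifier outputs roughly as below. This sketch assumes that the pion suppression factor is the total number of pions divided by the number of pions passing the cut; the function name and the example scores are hypothetical.

```python
def pion_suppression(electron_scores, pion_scores, electron_eff=0.9):
    """Return (cut, suppression) for the cut that keeps `electron_eff` of the electrons."""
    ranked = sorted(electron_scores, reverse=True)
    n_keep = max(1, int(round(electron_eff * len(ranked))))
    cut = ranked[n_keep - 1]  # lowest score among the kept electrons
    n_misidentified = sum(1 for s in pion_scores if s >= cut)
    if n_misidentified == 0:
        return cut, float("inf")  # no pion passes the cut in this sample
    return cut, len(pion_scores) / n_misidentified

# Hypothetical classifier outputs (not results from the study).
electron_scores = [i / 10 for i in range(1, 11)]   # 0.1, 0.2, ..., 1.0
pion_scores = [0.05] * 95 + [0.5] * 5

cut, suppression = pion_suppression(electron_scores, pion_scores)
print(cut, suppression)
```

With around 1M pions per momentum, as in the statistics quoted above, suppression factors of a few thousand still correspond to hundreds of misidentified pions, so the estimate remains statistically meaningful.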

Stabilityofthe suppression algorithm • One should considernot only apion rejection procedure,as it is, but it is necessaryto take into account its robustnessto such experimental factors as • calibration of measurements, pile up of signals • missing one or two hits • erroneous substitution e- hit by  hit First factors were simulated by adding error to the energy loss for each hit: Eloss=Eloss+Gauss(0, Sigma) • BDT method was used • Keep in mind: • the most probable value of Eloss for pion is around • 1.05-1.5 keV 90% electron efficiency

Summary and outlook
• Investigation of three parameter sets of the TR simulation shows that the param3 set is the most suitable.
• Comparative analysis of various methods for electron/pion separation demonstrated the effectiveness of the BDT method.
• First results on the stability of the electron identification method show its robustness to Eloss measurement errors up to 30-50% of the maximum value of Eloss.
• A study of the stability of the electron identification to missing and substituted hits is planned.
