Work to Improve νe Identification Alex Smith, University of Minnesota
Overview • Migration of JM EID to the TMVA framework • Consistency between old and new implementations • Plans for implementation • Addition of reconstructed E(νe) to the training • Summary and plans
Motivation • JM EID currently uses ROOT's TMultiLayerPerceptron as the multivariate analysis algorithm • TMVA (Toolkit for Multivariate Analysis) • Open-source framework • Supports many different algorithms • Good diagnostic tools • Standard framework already used by others within NOvA • Migrating JM EID to the TMVA framework lets us take advantage of these features
Consistency With Previous Implementation • TMVA includes the ROOT TMultiLayerPerceptron as one option • Run this and compare with the results from Jianming's EID • It is not possible to specify exactly the same events, so a large sample must be run in order to compare • Monte Carlo samples • Signal: swap MC, applied to the sample used in training • Background: generic FD MC (includes beam νe)
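Because the two trainings cannot be matched event by event, the comparison rests on efficiency curves computed over large independent samples. A minimal sketch of such an efficiency calculation is below; the function and variable names are illustrative assumptions, not the actual NOvA code.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: fraction of events whose MVA score exceeds a cut.
// Scanning the cut value traces out the efficiency-vs-MVA-variable curves
// shown on the comparison slides.
double efficiencyAboveCut(const std::vector<double>& scores, double cut) {
    if (scores.empty()) return 0.0;
    std::size_t pass = std::count_if(scores.begin(), scores.end(),
                                     [cut](double s) { return s > cut; });
    return static_cast<double>(pass) / scores.size();
}
```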
Comparison: Previous to TMVA — [Plots: efficiency vs. MVA variable for beam νe CC and NC background, TMVA vs. previous implementation]
Comparison: Previous to TMVA — [Plots: efficiency vs. MVA variable for νμ CC and for νe CC with the wrong particle chosen as the electron, TMVA vs. previous implementation]
Comparison: Previous to TMVA — [Plots: figure of merit and signal νe CC efficiency vs. MVA variable, TMVA vs. previous implementation]
Comparison: Previous to TMVA — [Plot: event distribution vs. MVA variable, TMVA vs. previous implementation]
Comparison: Previous to TMVA — [Further plots: event distributions vs. MVA variable for additional samples, TMVA vs. previous implementation]
Consistency With Previous Implementation • The TMVA implementation is consistent with the ROOT TMultiLayerPerceptron version of the current EID • It is not possible to select exactly the same training/test subsamples from the TMVA interface • The FOM is the same within uncertainties • We can move forward with this implementation of EID
Plan for Implementation in the NOvA Framework • We plan to provide several EID variable options in the analysis framework: • Without E(νe) • Including E(νe) • Other sets of variables to be determined (suggestions welcome) • Different MVA algorithms (artificial neural networks, boosted decision trees, k-nearest neighbor, H-matrix, etc.) • Ultimately, users can select the EID variable that best suits their analysis
Addition of Reconstructed E(νe) to ANN Training • We will provide EID variables both with and without E(νe) included in the training • Caution: using E(νe) in the training will bias the E(νe) distribution • If the E(νe) shape matters for your analysis, it is preferable to use the MVA trained without E(νe) and perform a 2D fit to reconstructed E(νe) and the MVA discriminator • Providing the E(νe) option also allows comparison with other EID packages that include E(νe)
Definition of Input Variables
egLLL   = evtSh1DedxLLL[0] - evtSh1DedxLLL[1];
egLLT   = evtSh1DedxLLT[0] - evtSh1DedxLLT[1];
emuLLL  = evtSh1DedxLLL[0] - evtSh1DedxLLL[2];
emuLLT  = evtSh1DedxLLT[0] - evtSh1DedxLLT[2];
epi0LLL = evtSh1DedxLLL[0] - evtSh1DedxLLL[3];
epi0LLT = evtSh1DedxLLT[0] - evtSh1DedxLLT[3];
epLLL   = evtSh1DedxLLL[0] - evtSh1DedxLLL[5];
epLLT   = evtSh1DedxLLT[0] - evtSh1DedxLLT[5];
enLLL   = evtSh1DedxLLL[0] - evtSh1DedxLLL[6];
enLLT   = evtSh1DedxLLT[0] - evtSh1DedxLLT[6];
epiLLL  = evtSh1DedxLLL[0] - evtSh1DedxLLL[7];
epiLLT  = evtSh1DedxLLT[0] - evtSh1DedxLLT[7];
gap     = evtSh1Gap;
pi0mass = Max(evtSh1Pi0Mgg, 0.0);
vtxgev  = evtSh1VtxGeV;
shE     = evtSh1Energy / evtSh1SliceGeV;
nueRecEnergy = (evtSh1Energy + 0.282525 + 1.0766*(evtEtot - evtSh1Energy));
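As a rough illustration, the last two input definitions above can be written as standalone functions. The constants 0.282525 and 1.0766 are taken from the slide; the function and argument names are assumptions, not the actual NOvA code.

```cpp
#include <algorithm>

// Sketch of the reconstructed nu_e energy estimator from the slide:
// shower energy plus an offset plus a linear correction applied to the
// remaining (non-shower) energy in the event. Constants are from the
// slide; names are illustrative.
double nueRecEnergy(double showerEnergy, double totalEnergy) {
    const double offset = 0.282525;  // constant term from the slide
    const double slope  = 1.0766;    // scale factor for the non-shower energy
    return showerEnergy + offset + slope * (totalEnergy - showerEnergy);
}

// The pi0 mass input is clamped at zero, mirroring Max(evtSh1Pi0Mgg, 0.0).
double pi0MassInput(double mgg) {
    return std::max(mgg, 0.0);
}
```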
Comparison: With/Without E(νe) — [Plots: efficiency vs. MVA variable for νμ CC and for νe CC with the wrong particle chosen as the electron, training with vs. without E(νe)]
Comparison: With/Without E(νe) — [Plots: efficiency vs. MVA variable for beam νe CC and NC background, training with vs. without E(νe)]
Comparison: With/Without E(νe) — [Plots: figure of merit and signal νe CC efficiency vs. MVA variable, training with vs. without E(νe)]
Comparison: With/Without E(νe) — [Plot: event distribution vs. MVA variable, training with vs. without E(νe)]
Comparison: With/Without E(νe) — [Further plots: event distributions vs. MVA variable for additional samples, training with vs. without E(νe)]
Other MVA Algorithms Can Be Used • Boosted decision tree • k-nearest neighbor • Not yet optimized for performance; default parameters were used • Performance can certainly be improved
Summary and Plans • JM EID training migrated to TMVA • Demonstrated that the results are consistent • Working on code to implement this in the NOvA analysis framework; it should be available soon for others to use • Added E(νe) to the training • Figure of merit (FOM) increases from 6.4 to 6.5 • If E(νe) is used as in other EID packages, FOM ≈ 6.8 • Next: investigate other variables and MVA algorithms
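The slides quote FOM values (6.4 → 6.5) without stating the formula. A common choice for a counting-experiment figure of merit is s/√(s+b), sketched below; this definition, and the function name, are assumptions, not the documented NOvA definition.

```cpp
#include <cmath>

// Assumed figure of merit: expected signal significance s / sqrt(s + b),
// where s and b are the expected signal and background yields after the
// MVA cut. Illustrative only; the slides do not define their FOM.
double figureOfMerit(double signal, double background) {
    const double total = signal + background;
    if (total <= 0.0) return 0.0;
    return signal / std::sqrt(total);
}
```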