This paper explores the use of genetic algorithms for optimizing discriminant functions in the search for the Higgs boson at the Large Hadron Collider (LHC). The paper also discusses the optimization of neural weights and the search for hyperplanes and hypersurfaces.
ACAT05, May 22 - 27, 2005, DESY, Zeuthen, Germany
Search for the Higgs boson at LHC by using Genetic Algorithms
Mostafa MJAHED, Ecole Royale de l'Air, Mathematics and Systems Dept., Marrakech, Morocco
Search for the Higgs boson at LHC by using Genetic Algorithms
• Introduction
• Genetic Algorithms
• Search for the Higgs boson at LHC by using Genetic Algorithms
• Optimization of discriminant functions
• Optimization of neural weights
• Hyperplane search
• Hypersurface search
• Conclusion
Introduction: Higgs production at LHC
• Several mechanisms contribute to the production of the SM Higgs boson in proton collisions
• The dominant mechanism is the gluon fusion process, gg → H
Introduction: Decay Modes
• decay into quarks: H → bb and H → cc
• leptonic decay: H → τ+τ−
• gluonic decay: H → gg
• decay into a virtual W boson pair: H → W+W−
Introduction: Main discovery modes
• MH < 2MZ:
• H → bb + X
• H → W+W− → lνlν
• H → ZZ → 4l
• H → γγ
H → W+W− → lνlν (1)
• The decay channel chosen: H → W+W− → e+μ−, μ+e−, e+e−, μ+μ− (plus neutrinos)
• Signature:
• Two oppositely charged leptons with large transverse momentum PT
• Two energetic jets in the forward detectors
• Large missing transverse momentum PTmiss
• Main background:
• tt production, with tt → WbWb → lνj lνj
• QCD W+W− + jets production
• Electroweak WWjj
H → W+W− → lνlν (2): Main Variables
• Δηll, Δφll: the pseudo-rapidity and azimuthal angle differences between the two leptons
• Δηjj, Δφjj: the pseudo-rapidity and azimuthal angle differences between the two jets
• Mll, Mjj: the invariant masses of the two leptons and of the two jets
• Mnm (n, m = 1, 2, 3): rapidity-weighted transverse momentum moments, built from ηi, the rapidity of the leptons or jets, and piT, their transverse momenta
Pattern Recognition
• Measurement → Feature Extraction / Feature Selection → Feature Classification → Decision
Pattern Recognition Methods
• Statistical Methods: PCA, Decision Trees, Discriminant Analysis, Clustering
• Connectionist Methods: Neural Networks, Fuzzy Logic
• Other Methods: Genetic Algorithms, Wavelets
Genetic Algorithms
• Based on Darwin's theory of "survival of the fittest": living organisms reproduce, individuals evolve/mutate, and individuals survive or die based on fitness
• The input is an initial set of possible solutions
• The output of a genetic algorithm is the set of "fittest solutions" that survive in a particular environment
• The process:
• Produce the next generation (by a crossover function)
• Evolve solutions (by a mutation function)
• Discard weak solutions (based on a fitness function)
Genetic Algorithms
• Preparation:
• Define an encoding to represent solutions (i.e., use a character sequence to represent a class)
• Create possible initial solutions (and encode them as strings)
• Perform the 3 genetic functions: Crossover, Mutation, Fitness Test
• Why Genetic Algorithms (GAs)?
• Many real-life problems cannot be solved in a polynomial amount of time using a deterministic algorithm
• Near-optimal solutions that can be generated quickly are often more desirable than optimal solutions that require a huge amount of time
• Such problems can be modeled as optimization problems
Genetic Functions
• Crossover
• Mutation
• Roulette Wheel Selection
(illustrated in the slides on binary-string chromosomes)
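A minimal sketch of these three operators on chromosomes represented as lists of 0/1 bits (illustrative Python, not the author's code; the function names and the default mutation rate are assumptions):

```python
import random

def crossover(parent_a, parent_b):
    """Single-point crossover: swap the tails of two bit-list chromosomes."""
    point = random.randint(1, len(parent_a) - 1)
    return (parent_a[:point] + parent_b[point:],
            parent_b[:point] + parent_a[point:])

def mutate(chromosome, rate=0.01):
    """Flip each bit independently with probability `rate`."""
    return [bit ^ 1 if random.random() < rate else bit for bit in chromosome]

def roulette_select(population, fitnesses):
    """Roulette-wheel selection: pick an individual with probability
    proportional to its (non-negative) fitness -- higher fitness,
    larger slice of the wheel."""
    spin = random.uniform(0, sum(fitnesses))
    cumulative = 0.0
    for individual, fitness in zip(population, fitnesses):
        cumulative += fitness
        if cumulative >= spin:
            return individual
    return population[-1]
```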
GA Process
• Initialize population → evaluate fitness → if the termination criterion is met, output the solution; otherwise perform selection, crossover and mutation, evaluate fitness again, and repeat
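Tying the loop together, again as an illustrative sketch that reuses the crossover, mutate and roulette_select helpers from the previous block; the defaults for pop_size and generations, and the convention that larger fitness is better, are assumptions:

```python
import random  # crossover, mutate, roulette_select: see the sketch above

def run_ga(fitness_fn, chromosome_len, pop_size=20, generations=100,
           mutation_rate=0.01):
    """Generic GA loop following the flowchart: initialize, evaluate fitness,
    then select / cross over / mutate until the generation budget is spent.
    Convention: fitness_fn returns larger values for better chromosomes."""
    population = [[random.randint(0, 1) for _ in range(chromosome_len)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness_fn)
    for _ in range(generations):
        fitnesses = [fitness_fn(c) for c in population]
        children = []
        while len(children) < pop_size:
            a = roulette_select(population, fitnesses)
            b = roulette_select(population, fitnesses)
            for child in crossover(a, b):
                children.append(mutate(child, mutation_rate))
        population = children[:pop_size]
        best = max(population + [best], key=fitness_fn)
    return best
```

To minimize a misclassification rate, as in the applications below, one would maximize e.g. 1 − error.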
GAs for Pattern Classification
• Optimization of discriminant functions
• Optimization of neural weights
• Hyperplane search
• Hypersurface search
Efficiency and Purity of Classification
• Efficiency for Ci classification
• Purity for Ci events
• Misclassification rate (error) for Ci
• Validation
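The slide's formulas for these quantities are not reproduced in this transcript; the usual definitions, assumed here, can be written as a small helper:

```python
def classification_metrics(n_correct_i, n_true_i, n_assigned_i):
    """Usual definitions (an assumption; the slide's own formulas are not
    reproduced in this transcript):
      efficiency = Ci events correctly classified / all true Ci events
      purity     = Ci events correctly classified / all events assigned to Ci
      error      = 1 - efficiency  (misclassification rate for Ci)
    """
    efficiency = n_correct_i / n_true_i
    purity = n_correct_i / n_assigned_i
    return efficiency, purity, 1.0 - efficiency
```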
Search for the Higgs boson at LHC by using Genetic Algorithms
Search for the Higgs boson at LHC by using Genetic Algorithms
• Events generated by the LUND MC PYTHIA 6.1 at √s = 14 TeV
• MH = 115 - 150 GeV/c2
• 10000 Higgs events and 10000 background events are used
• Signal: pp → HX → W+W−X → l+νl−νX
• Background: pp → ttX → WbWbX → lνj lνjX; pp → qqX → W+W−X
• Search for discriminating variables
Variables
• Δηll, Δφll: the pseudo-rapidity and azimuthal angle differences between the two leptons
• Δηjj, Δφjj: the pseudo-rapidity and azimuthal angle differences between the two jets
• Mjj: the invariant mass of the two jets
• Mll: the invariant mass of the two leptons
• Mnm (n = 1, ..., 6): rapidity- and momentum-weighted moments, built from ηi, the rapidity of the leptons or jets, and piT, their transverse momenta
• Retained variables: Δηll, Δφll, Δηjj, Δφjj, Mll, Mjj, M11, M21, M31, M41
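The moments' defining formula did not survive this transcript; one common choice, assumed here only for illustration, weights each lepton or jet transverse momentum by a power of its rapidity:

```latex
% Illustrative definition (an assumption, not taken from the slides):
% the sum runs over the leptons and jets of the event
M_{nm} \;=\; \sum_{i \,\in\, \{\mathrm{leptons,\ jets}\}} |\eta_i|^{\,n}\,\bigl(p_T^{\,i}\bigr)^{m}
```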
Optimization of Discriminant Functions (1)
• Discriminant Analysis: F(x) = (g_signal − g_back)T V−1 x = Σi αi xi
• The most separating discriminant function FHiggs/Back between the classes CHiggs and CBack is:
FHiggs/Back = −0.02 + 0.12 Δll + 0.4 Δjj + 0.35 Mll + 0.61 Mjj + 0.74 M11 + 1.04 M21
(Δll and Δjj denote the lepton and jet angular-difference variables)
• The classification of a test event x0 is then obtained according to the condition: if FHiggs/Back(x0) ≥ 0 then x0 ∈ CHiggs else x0 ∈ CBack
• Classification of test events:
CHiggs: efficiency 0.601, purity 0.606
CBack: efficiency 0.610, purity 0.604
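For reference, a sketch (not the author's code) of how a discriminant of the form F(x) = (g_signal − g_back)T V−1 x can be built from the class centroids and a pooled covariance matrix; the function names and the simple averaging used for the pooling are assumptions:

```python
import numpy as np

def fisher_discriminant(x_signal, x_back):
    """Linear discriminant of the form F(x) = (g_signal - g_back)^T V^-1 x,
    built from the two class centroids and a pooled (here simply averaged)
    covariance matrix.  Inputs: (n_events, n_variables) arrays."""
    g_signal = x_signal.mean(axis=0)
    g_back = x_back.mean(axis=0)
    v = 0.5 * (np.cov(x_signal, rowvar=False) + np.cov(x_back, rowvar=False))
    return np.linalg.solve(v, g_signal - g_back)   # the coefficients alpha_i

def classify(alpha, x0, offset=0.0):
    """Assign x0 to C_Higgs if F(x0) >= 0, else to C_Back (the constant
    term, -0.02 on the slide, enters through `offset`)."""
    return "C_Higgs" if offset + alpha @ x0 >= 0.0 else "C_Back"
```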
Optimization of Discriminant Functions (2)
• GA Parameters:
• Chromosome I: the coefficients α0i, α1i, ..., α6i of a discriminant function, e.g. (−0.02, 0.12, 0.4, 0.35, 0.61, 0.74, 1.01), together with its fitness (0.395)
• Fitness function: misclassification rate
• Number of generations
• Selection, Crossover, Mutation
• GA Code: Matlab GA Toolbox
Optimization of Discriminant Functions (3)
• Genetic Process:
• Generation of N solutions (Chromosome 1, ..., Chromosome N), each encoding the coefficients α0j, ..., α6j of a candidate discriminant function F
• Fitness function: misclassification rate
• Genetic loop: initialize population → evaluate fitness → if the termination criterion is met, output the solution; otherwise perform selection, crossover and mutation and re-evaluate
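A minimal sketch of this fitness function (illustrative Python, not the Matlab GA Toolbox objective actually used): the chromosome holds the constant term and the six variable coefficients, and the fitness to minimize is the fraction of labelled training events placed on the wrong side of F(x) = 0.

```python
import numpy as np

def misclassification_rate(coeffs, x_events, labels):
    """GA fitness to minimize: fraction of labelled training events that the
    linear function F(x) = a0 + sum_i a_i * x_i puts on the wrong side of 0.
    coeffs: [a0, a1, ..., a6]; x_events: (n_events, 6) array of the selected
    variables; labels: +1 for C_Higgs, -1 for C_Back."""
    a0, a = coeffs[0], np.asarray(coeffs[1:])
    predicted = np.where(a0 + x_events @ a >= 0.0, 1, -1)
    return float(np.mean(predicted != labels))
```

With the Matlab GA Toolbox quoted on the slide, this quantity would be supplied as the objective function; with the bit-string GA sketched earlier, the seven real coefficients would first have to be decoded from the chromosome (or real-coded operators used instead).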
Optimization of Discriminant Functions (4)
• Optimization Results: number of generations = 10000, CPU time: 120 s
• Optimal discriminant function: coefficients (−0.02, 0.12, 0.4, 0.35, 0.61, 0.74, 1.01), fitness 0.35
• Classification of test events:
CHiggs: efficiency 0.652, purity 0.649
CBack: efficiency 0.648, purity 0.650
(mean efficiency 0.65, misclassification rate 0.35)
Optimization of Neural Weights (1)
• Neural Analysis: NN architecture (10, 10, 10, 1), with the 10 discriminating variables (Δηll, Δφll, Δηjj, Δφjj, Mll, Mjj, M11, M21, M31, M41) as inputs, two hidden layers h1 and h2, and one output O1
• Classification of test events: if O1(x) ≥ 0.5 then x ∈ CHiggs else x ∈ CBack
Optimization of Neural Weights (2)
• GA Parameters:
• Chromosome: connection weights + thresholds: Wij(x→h1) and θh1 (100 + 10), Wij(h1→h2) and θh2 (100 + 10), Wij(h2→O1) (10)
• Total number of parameters to be optimized: 230
• Fitness: misclassification rate
• Optimization Results: number of generations = 1000, CPU time: 6 mn
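A sketch of how such a 230-parameter chromosome can be decoded and scored (illustrative Python, not the author's code; the sigmoid activations are an assumption, while the layer split follows the 100 + 10, 100 + 10, 10 counts quoted above):

```python
import numpy as np

def unpack(chromosome):
    """Split a flat 230-parameter chromosome into the (10, 10, 10, 1) network:
    W1 (10x10) + th1 (10), W2 (10x10) + th2 (10), W3 (10x1) -- matching the
    100 + 10, 100 + 10, 10 counts quoted on the slide."""
    c = np.asarray(chromosome, dtype=float)
    return (c[:100].reshape(10, 10), c[100:110],
            c[110:210].reshape(10, 10), c[210:220],
            c[220:230].reshape(10, 1))

def nn_output(chromosome, x_events):
    """Forward pass with sigmoid activations (the activation function is an
    assumption of this sketch); returns O1 in [0, 1] for each event."""
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
    w1, th1, w2, th2, w3 = unpack(chromosome)
    h1 = sigmoid(x_events @ w1 + th1)
    h2 = sigmoid(h1 @ w2 + th2)
    return sigmoid(h2 @ w3).ravel()

def nn_misclassification(chromosome, x_events, labels):
    """GA fitness to minimize: fraction of events on the wrong side of
    O1 = 0.5.  labels: 1 for C_Higgs, 0 for C_Back."""
    predicted = (nn_output(chromosome, x_events) >= 0.5).astype(int)
    return float(np.mean(predicted != labels))
```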
Hyperplane search (1)
• Hyperplane Hj: H(x) = a0 + Σi ai xi = 0, i = 1, ..., 10
• 11 parameters to optimize
• Classification rule: if H(x) ≥ 0 then x ∈ CHiggs else x ∈ CBack
• Genetic Process:
• Generation of N hyperplanes Hj (j = 1, ..., N = 20), each encoded as a chromosome (a0j, a1j, ..., a10j)
• Genetic loop: initialize population → evaluate fitness → terminate or perform selection, crossover and mutation and re-evaluate
• Number of generations = 10000
• CPU time: 4 mn
Hyperplane search (2)
• Hyperplane search results, classification of test events:
CHiggs: efficiency 0.652, purity 0.649
CBack: efficiency 0.648, purity 0.650
(mean efficiency 0.65, misclassification rate 0.35)
• Same results as the discriminant function optimization
Hypersurface search (1)
• Hypersurface Sj: a nonlinear surface S(x) defined by three sets of coefficients (indexed i = 0:10, i = 1:10 and i = 1:10)
• 31 parameters to optimize
• Classification rule: if S(x) ≥ 0 then x ∈ CHiggs else x ∈ CBack
• Genetic Process:
• Generation of N hypersurfaces Sj (j = 1, ..., N = 20)
• Genetic loop: initialize population → evaluate fitness → terminate or perform selection, crossover and mutation and re-evaluate
• Number of generations = 10000
• CPU time: 6 mn
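The exact functional form of S(x) is not legible here, so the sketch below (illustrative Python, not the author's code) keeps the basis of the surface as a parameter: a purely linear basis recovers the hyperplane case above, while a 31-term nonlinear basis, chosen here only to match the quoted parameter count, gives a hypersurface.

```python
import numpy as np

def surface_value(coeffs, x, basis):
    """Evaluate S(x) = sum_k c_k * phi_k(x) for a chosen list of basis
    functions.  With basis = [1, x1, ..., x10] this reduces to the hyperplane
    H(x); adding nonlinear terms gives a hypersurface."""
    return float(np.dot(coeffs, [phi(x) for phi in basis]))

def surface_misclassification(coeffs, events, labels, basis):
    """GA fitness to minimize: fraction of events whose sign(S(x)) disagrees
    with the true class (+1 for C_Higgs, -1 for C_Back)."""
    predicted = [1 if surface_value(coeffs, x, basis) >= 0.0 else -1
                 for x in events]
    return float(np.mean(np.asarray(predicted) != labels))

# One possible 31-term basis (an assumption that matches only the parameter
# count quoted on the slide, 1 + 10 + 10 + 10 = 31): constant, linear,
# quadratic and cubic terms in each of the 10 variables.
basis_31 = ([lambda x: 1.0]
            + [lambda x, i=i: x[i] for i in range(10)]
            + [lambda x, i=i: x[i] ** 2 for i in range(10)]
            + [lambda x, i=i: x[i] ** 3 for i in range(10)])
```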
Hypersurface search (2)
• Hypersurface search results, classification of test events:
CHiggs: efficiency 0.691, purity 0.696
CBack: efficiency 0.699, purity 0.693
(mean efficiency 0.695, misclassification rate 0.305)
• Same results as the NN weights optimization
Conclusion
• The Genetic Algorithms method allows the classification error to be minimized and the efficiencies and purities of the classifications to be improved.
• The performances are on average 3 to 5% higher than those obtained with the other methods.
• Discriminant function optimization is comparable to the hyperplane search approach.
• Neural weights optimization is comparable to the hypersurface search approach.
• Importance of Pattern Recognition methods: improving any identification depends on combining the multidimensional treatment offered by PR methods with the discriminating power of the proposed variables.
Conclusion (continued)
• Variables: for the characterisation of Higgs boson events, other variables should be examined
• Physics processes: other processes should be considered
• Detector effects should be added to the simulated events