Example of a CMS benchmark analysis chain N. De Filippis LLR-Ecole Polytechnique
Design of an analysis chain
Main concepts:
• Coherence, robustness and reproducibility of the results
• Flexibility and adaptability for fast and efficient implementation of new ideas
• quasi-online automatic chain to run on real data and on new MC samples
• uniformity of use/access to physics objects and algorithms
• modularity of the analysis steps
• plug and play addition of new components
• easier debug and bug fix
• support for parallel branches and generalized analyses
How to implement that chain in CMS?
• Common analysis framework across homogeneous analyses: common basic tools (and reference tunes)
• easy interface to EDM/PAT (via FWLite) and to bare ROOT objects via macros
• well-tested baseline code for fast validation and cross-checks of results
• use of efficient, validated physics objects and algorithms, profiting from the tools provided by the POGs and PAGs
• provide guidelines and documentation
• provide control histograms to validate the results
• use of common statistical approaches and tools
• encourage a collaborative effort among more people, to avoid duplication
Objects and tools
• Physics objects provided through POGs: electrons, muons, jets, photons, taus, tracks, MET, PFlow and PAT objects
• Common tools already provided or to be provided centrally:
  • combinatorics (PhysicsTools)
  • vertex fitting (PhysicsTools)
  • boosting particles (PhysicsTools)
  • dumping MC truth (PhysicsTools)
  • matching reco to MC truth (PhysicsTools)
  • kinematic fit (PhysicsTools)
  • selection of particles based on pT (PhysicsTools)
  • isolation, eID and muonID algorithms provided by the POGs
  • optimization machinery (Garcon, N95, neural networks, MVA)
  • significance, CL calculation
• Official analysis: clear combination of FW modules and bare ROOT classes
Example: HZZ4l
Benchmark: the HZZ4l analysis at 14 TeV
Details in Clementine's talk
• Signatures: 4e, 4mu and 2e2mu final states
• Backgrounds:
  • irreducible ZZ (each virtual or real Z → μ+μ-)
  • reducible Zbb (Z → μ+μ- and semileptonic decay of the b)
  • reducible tt (each t → bW and semileptonic decay of the b)
  • and tt+jets, Z+jets, W+jets, QCD
• Preselection strategy (to get rid of the QCD background with fake leptons):
  • single and double lepton triggers
  • 4 loosely isolated leptons, opposite charge and electron ID
  • mll > 12 GeV, m4l > 100 GeV
• Main selection observables:
  • tight isolation (against tt, Zbb)
  • impact parameter (against Zbb and tt)
  • 50 < mZ < 100 GeV, 20 < mZ* < 100 GeV
• Control from real data of:
  • the efficiencies (lepton and jet reconstruction)
  • the estimation of the ZZ and Zbb background rates
Baseline cut-based analysis, mH-independent, able to get rid of the main backgrounds
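The mass cuts quoted above can be sketched in a few lines of plain Python. This is only an illustration, not code from the HiggsToZZ4Leptons package: the function names are hypothetical, leptons are assumed to be given as (E, px, py, pz) tuples in GeV, and the mll cut is assumed to apply to whichever dilepton masses the caller supplies.

```python
import math

def invariant_mass(four_vectors):
    """Invariant mass in GeV of a set of (E, px, py, pz) four-vectors."""
    e = sum(v[0] for v in four_vectors)
    px = sum(v[1] for v in four_vectors)
    py = sum(v[2] for v in four_vectors)
    pz = sum(v[3] for v in four_vectors)
    m2 = e * e - px * px - py * py - pz * pz
    return math.sqrt(max(m2, 0.0))  # guard against small negative rounding

def passes_preselection_masses(dilepton_masses, m4l):
    """Preselection mass cuts: each supplied mll > 12 GeV and m4l > 100 GeV."""
    return all(m > 12.0 for m in dilepton_masses) and m4l > 100.0

def passes_z_windows(m_z, m_zstar):
    """Main-selection windows: 50 < mZ < 100 GeV and 20 < mZ* < 100 GeV."""
    return 50.0 < m_z < 100.0 and 20.0 < m_zstar < 100.0
```

The two selection functions are kept separate so that the looser preselection and the tighter main selection can be applied at different steps of the chain.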
Software for HZZ analysis (1)
• Goal: build a common basic analysis to unify the treatment of the homogeneous channels, use the same algorithms, propagate changes to all the channels, etc.
• First prototype in December 2008, now mature enough to run all the 4l analyses in a coherent way:
  - cvs repo: HiggsAnalysis/HiggsToZZ4Leptons
• Supported features:
  • skim selection via HLT trigger bits
  • electron selection, muon selection
  • Z→ee, Z→μμ and H→4l candidate building
  • electron isolation, muon isolation
  • scan of the parameters of the electron ID, muon isolation and electron isolation algorithms
  • support for several vertexing algorithms
  • CP observables, boosted particles
  • best Higgs candidate building
  • support for brem recovery studies
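As a rough illustration of the candidate-building step, the sketch below pairs opposite-charge, same-flavour leptons into Z candidates. The slides do not say how the best candidate is chosen, so picking the pair closest to the nominal Z mass is an assumption made here for illustration; the lepton dictionary layout and all function names are hypothetical and are not taken from the package.

```python
import itertools
import math

Z_MASS = 91.1876  # GeV, nominal Z boson mass

def pair_mass(a, b):
    """Invariant mass of two leptons whose 'p4' entry is (E, px, py, pz) in GeV."""
    e, px, py, pz = (a["p4"][i] + b["p4"][i] for i in range(4))
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))

def build_z_candidates(leptons):
    """Return (lepton, lepton, mass) for every opposite-charge, same-flavour pair."""
    candidates = []
    for a, b in itertools.combinations(leptons, 2):
        if a["flavour"] == b["flavour"] and a["charge"] * b["charge"] < 0:
            candidates.append((a, b, pair_mass(a, b)))
    return candidates

def best_z(candidates):
    """Assumed criterion: the candidate whose mass is closest to the nominal Z mass."""
    if not candidates:
        return None
    return min(candidates, key=lambda c: abs(c[2] - Z_MASS))
```

In a 2e2mu event this yields one e+e- and one μ+μ- candidate; in 4e or 4mu events several pairings are possible, which is where a selection criterion becomes necessary.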
Software for HZZ analysis (2)
• Filter on MC particles, building of MC truth candidates and RECO matching
• Flags for preselection and offline selection steps
• Flexible application of cuts via ROOT-tuple or EDM format
• Interfaced to edm::View; PATv2 integration by the end of June
• Common ROOT Tree
• Common ROOT macros for fast iteration
• Optimization tool via Bayesian neural net
• Many contributors: a common effort of the HZZ team
Workflow of the HZZ analysis (1)
L1/HLT selection + Skimming (CMSSW / EDM format)
  • Triggered + skimmed reco data (1 M events / fb^-1; 100 kB/event) (or triggered-only data + skimming flags)
Pre-selection (CMSSW / EDM format)
  • Triggered + skimmed reco data (1 M events / fb^-1; 100 kB/event) + common pre-selection ROOT-tuple + PAT-tuple (or pre-selected events only)
  • Optimise common working point, i.e. loose lepton ID and isolation, ambiguity resolving etc.
Selection (CMSSW / EDM format)
  • Pre-selected reco data (1 k events / fb^-1; 100 kB/event) + analysis ROOT-tuple + PAT-tuple
  • Optimise common baseline, i.e. topology-dependent lepton ID and isolation, PT(li) and M(Zi) cuts, global kinematics etc.
Analysis
Workflow of the HZZ analysis (2)
L1/HLT selection + Skimming
  • Three leptons (e or μ) above some PT thresholds
  • Triggered + skimmed reco data (1 M events / fb^-1; 100 kB/event) (or triggered-only data + skimming flags)
Pre-selection
  • Eliminate QCD multi-jet background
  • Select 4l candidates
  • Triggered + skimmed reco data (1 M events / fb^-1; 100 kB/event) + common pre-selection ROOT-tuple + PAT-tuple (or pre-selected events only)
Selection
  • Perform baseline selection (common for any MH > 100 GeV/c2)
  • Pre-selected reco data (1 k events / fb^-1; 100 kB/event) + analysis ROOT-tuple + PAT-tuple
  • Provide MH hypothesis independent results
Analysis
  • Perform MH dependent selection, MVA analyses
  • Perform fits and statistical analysis
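The event rates and event sizes quoted in the workflow translate directly into storage per unit of integrated luminosity. The short sketch below does that arithmetic, reading the quoted rates as one million events per fb^-1 after trigger and skim and one thousand events per fb^-1 after preselection, both at 100 kB per event. The function name is hypothetical and decimal units (1 GB = 10^6 kB) are assumed.

```python
KB_PER_GB = 1_000_000  # decimal units: 1 GB = 10^6 kB

def sample_size_gb(events_per_fb, kb_per_event, lumi_fb=1.0):
    """Storage in GB for a sample with the given event rate, event size and luminosity."""
    return events_per_fb * kb_per_event * lumi_fb / KB_PER_GB

skimmed_gb = sample_size_gb(1_000_000, 100)   # triggered + skimmed data
preselected_gb = sample_size_gb(1_000, 100)   # pre-selected data
```

Under these assumptions the skimmed sample is about 100 GB per fb^-1 and the pre-selected sample about 0.1 GB (100 MB) per fb^-1, a reduction by a factor of 1000 that makes the later steps practical at a Tier-3 or on a local machine.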
HiggsToZZ4Leptons analysis schema
• Generator filter on signal: HZZ4LeptonsMCGenFilter
• HLT analysis: HZZ4leptonsHLTAnalysis
• Skim 4 leptons: higgsToZZ4LeptonsFilter
• Preselection sequence (cleaning of electrons, electron ID included): hTozzTo4LeptonsCommonPreSelection
• Full reconstruction of the channel, also in the H rest frame to study CP properties: hTozzTo4Leptons, hTozzTo4LeptonsHiggsFrame, hTozzTo4LeptonsCP
• Electron and muon isolation: hTozzTo4LeptonsMuonIsolation, hTozzTo4LeptonsElectronIsolation
• Vertex constraint: hTozzTo4leptonsVertex
• Bremsstrahlung recovery: hTozzTo4leptonsInnerBremProducer
• MC truth matching: hTozzTo4leptonsMCTruthMatching and Dumper
• ROOT Tree creation (module used as a probe at each step of the analysis): hTozzTo4leptonsRootTree
Any user analysis module can be plugged in anywhere in the sequence.
CMSSW + ROOT interplay
HiggsToZZ4Leptons package
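The plug-and-play idea behind this schema, with a probe module inspecting the event at each step, can be illustrated in plain Python. This is a conceptual sketch only and does not use the CMSSW framework API: events are assumed to be dictionaries, modules are assumed to be functions returning True or False, and every name below is hypothetical.

```python
from collections import Counter

def hlt_filter(event):
    """Toy stand-in for an HLT-bit skim: keep events flagged as triggered."""
    return bool(event.get("hlt_pass"))

def four_lepton_skim(event):
    """Toy stand-in for the 4-lepton skim: require at least four leptons."""
    return len(event.get("leptons", [])) >= 4

def make_counting_probe():
    """Return a counter and a probe that records how many events pass each step."""
    counts = Counter()
    def probe(step_name, event, accepted):
        if accepted:
            counts[step_name] += 1
    return counts, probe

def run_chain(events, modules, probe=None):
    """Run each event through the modules in order, stopping at the first rejection."""
    kept = []
    for event in events:
        accepted = True
        for name, module in modules:
            accepted = module(event)
            if probe is not None:
                probe(name, event, accepted)
            if not accepted:
                break
        if accepted:
            kept.append(event)
    return kept
```

Because the chain is just an ordered list of (name, module) pairs, a user module can be inserted at any position without touching the others, and the probe gives a cut-flow count at every step.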
Use of the package
-- Brief instructions: HiggsToZZ4Leptons/test/README_tags
-- Files to run the preselection: 2e2mu, 4e and 4mu, all together
   HiggsToZZ4Leptons/test/HiggsToZZPreselection_2e2mu.py
   HiggsToZZPreselection_4mu.py
   HiggsToZZPreselection_4e.py
   HiggsToZZPreselection_4l.py
-- Files to run the complete analysis: 2e2mu, 4e and 4mu, all together
   HiggsToZZ4Leptons/test/HiggsToZZCompleteAnalysis_2e2mu.py
   HiggsToZZCompleteAnalysis_4mu.py
   HiggsToZZCompleteAnalysis_4e.py
   HiggsToZZCompleteAnalysis_4l.py
Sequence of the analysis: 2e2mu
-- preselection
-- 3D IP and 2D IP
-- full selection
-- BestCandidate building
-- CP variables
-- production of ROOT Tree
How to run the analysis chain on the GRID
• Production of samples:
  • use of official samples accessible at Tier-2 via CRAB
  • generation of private samples via CRAB at a dedicated Tier-2
  • local submission via CRAB at Tier-3 (to be supported for some schedulers)
• Filtering, skimming and preselection in CMSSW via CRAB at Tier-2
• Complete analysis in CMSSW via CRAB, or via FWLite/ROOT macros, at Tier-2/3
• At each step the samples are published in the Higgs PAG DBS to make them available to the full community via CRAB access
Tier-2 sites for central Higgs PAG activities: IFCA, GRIF, MIT and Rome