
Pi0 Calibration: Status and Plans

Vladimir Litvin, Marat Gataullin Caltech, CMS. Pi0 Calibration: Status and Plans. Intro. Part I: Producer and HLT implementation are ready Part II: Problem with output file size – cmssw framework problem. Part I.


Presentation Transcript


  1. Vladimir Litvin, Marat Gataullin Caltech, CMS Pi0 Calibration: Status and Plans

  2. Intro • Part I: Producer and HLT implementation are ready • Part II: Problem with output file size – cmssw framework problem

  3. Part I • Part I: Producer and HLT implementation are ready

  4. CVS tags for 172
     • Pi0 calibration producer and associated cfg/cff/cfi files:
       > cvs co -r lva091207 Calibration/EcalAlCaRecoProducers
     • HLT changes:
       > cvs co -r lva091207HLT HLTrigger/special
     • To run the pi0 calibration HLT stream in the regular production chain, you need to modify HLTrigger/Configuration/data/main/Special.cff. See here:
       http://www.hep.caltech.edu/~litvin/Special.cfg
     • Or you can add these two lines to it by hand:
       include "HLTrigger/special/data/AlcastreamEcalPi0.cff"
       path pathAlcastreamEcalPi0 = {doRegionalEgammaEcal,seqAlcastreamEcalPi0}

  5. Calibration selection (I)
     • Regional reconstruction is based on EMIso and EM NonIso objects.
     • We will unpack the corresponding FED (0.35 x 0.35 area in eta-phi) and everything around this FED within margins of 0.25 x 0.40 in eta-phi, but only if the RCT Pt rank of the object is larger than Ptmin_iso (for EMIso objects) or Ptmin_noniso (for EM NonIso objects). Ptmin_iso = Ptmin_noniso = 5 by default.
     • [Pending feature: the ability to unpack the FED only if the RCT Pt rank is in a specified range, from Ptmin_iso to Ptmax_iso for EMIso objects and from Ptmin_noniso to Ptmax_noniso for EM NonIso objects. Done, but it needs a change in the FED unpacker before it can be used.]
     • Digis will be made from all raw unpacked data.
     • RecHits will be made from the Digis, based on the unpacked data.
     • The regionally unpacked collection will be created in EcalRegionalRecHitsEB transparently, on the fly.
     • The selection uses this EcalRegionalRecHitsEB collection, prepared on the fly.

  6. Calibration selection (II)
     • Loop over RecHits to prepare simple clusters. All RecHits with energies larger than clusSeedThr_ (default = 0.5 GeV) are considered as seeds and sorted by energy. Then all RecHits in the clusEtaSize_ x clusPhiSize_ window (default 3x3) are considered as a cluster. If clusters overlap, a RecHit belongs to the cluster with the higher energy, so no RecHit belongs to two clusters.
     • Based on the vector of these simple clusters, a further preselection identifies the pi0 candidates. The candidates are pairs of simple clusters that pass the following simplified selection:
     • Two cuts on the Pt of the photon candidates: Pt_gamma1 > selePtGammaOne_ (default = 1 GeV), Pt_gamma2 > selePtGammaTwo_ (default = 1 GeV), where gamma1 is the Island Basic Cluster with the highest Pt and gamma2 is the Island Basic Cluster with the second highest Pt.
     • A cut on the Pt of the pi0 candidate: Pt_Pi0 > selePtPi0_ (default = 2.5 GeV).
     • A cut on the invariant mass of the two photons (the invariant mass of the pi0 candidate): seleMinvMinPi0_ < Minv_Pi0 < seleMinvMaxPi0_. The default value of seleMinvMinPi0_ is 0.09 GeV and the default value of seleMinvMaxPi0_ is 0.16 GeV.
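
The pair preselection on this slide can be sketched as follows. This is only an illustration under stated assumptions, not the producer's code: the `Cluster` class and `select_pi0_pairs` function are hypothetical names, each cluster is treated as a massless photon coming from the origin, and the clustering step itself is not reproduced. The default cut values are the ones quoted above.

```python
import math
from dataclasses import dataclass
from itertools import combinations


@dataclass
class Cluster:
    """Hypothetical stand-in for a simple cluster, treated as a massless photon."""
    energy: float  # GeV
    eta: float
    phi: float

    def p4(self):
        """Return (E, px, py, pz) assuming the photon comes from the origin."""
        pt = self.energy / math.cosh(self.eta)
        return (self.energy, pt * math.cos(self.phi),
                pt * math.sin(self.phi), pt * math.sinh(self.eta))


def select_pi0_pairs(clusters, pt_gamma1=1.0, pt_gamma2=1.0,
                     pt_pi0=2.5, minv_min=0.09, minv_max=0.16):
    """Return (cluster_a, cluster_b, minv) for every pair passing the cuts."""
    selected = []
    for a, b in combinations(clusters, 2):
        ea, pxa, pya, pza = a.p4()
        eb, pxb, pyb, pzb = b.p4()
        # gamma1 is the higher-Pt photon of the pair, gamma2 the lower-Pt one.
        pts = sorted((math.hypot(pxa, pya), math.hypot(pxb, pyb)), reverse=True)
        if pts[0] <= pt_gamma1 or pts[1] <= pt_gamma2:
            continue
        px, py, pz, e = pxa + pxb, pya + pyb, pza + pzb, ea + eb
        if math.hypot(px, py) <= pt_pi0:
            continue
        # Invariant mass of the pair; clamp tiny negative values from rounding.
        minv = math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))
        if minv_min < minv < minv_max:
            selected.append((a, b, minv))
    return selected
```

Looping over all pairs is the simplest choice for the handful of clusters expected in a regionally unpacked event; it is not meant as a statement about how the real module iterates.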

  7. Calibration selection (III)
     • Each pair that passes this selection is treated as a good pi0 candidate.
     • All ECAL Barrel RecHits with energies above seleXtalMinEnergy_ (default value 0.0 GeV) in a gammaCandEtaSize_ x gammaCandPhiSize_ matrix (default 21x21) around the most energetic crystals of both clusters are counted.
     • If the number of RecHits counted above is less than seleNRHMax_ (default value 75), this RecHits collection is stored in the file; otherwise the collection is discarded.
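
A minimal sketch of the counting step, assuming RecHits are given as (ieta, iphi, energy) tuples with integer crystal indices. The function names are hypothetical, and the sketch ignores the phi wrap-around and the gap at ieta = 0 in the barrel indexing, which a real implementation would have to handle.

```python
def count_window_hits(rechits, seeds, eta_size=21, phi_size=21, min_energy=0.0):
    """Collect RecHits above min_energy inside the window around any seed crystal.

    rechits: iterable of (ieta, iphi, energy)
    seeds:   iterable of (ieta, iphi) for the most energetic crystal of each cluster
    """
    half_eta, half_phi = eta_size // 2, phi_size // 2
    kept = []
    for ieta, iphi, energy in rechits:
        if energy <= min_energy:
            continue
        for seed_eta, seed_phi in seeds:
            if abs(ieta - seed_eta) <= half_eta and abs(iphi - seed_phi) <= half_phi:
                kept.append((ieta, iphi, energy))
                break  # count each RecHit once even if both windows contain it
    return kept


def store_candidate(kept, max_hits=75):
    """The collection is stored only if it has fewer than max_hits RecHits."""
    return len(kept) < max_hits
```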

  8. RecHits and pi0 cand rates with no PU
     [Table not preserved in the transcript; its notes follow.]
     (1) … energy (float = 8B), and unique ID of the crystal (int = 4B).
     (2) 100 events for each pthat bin
     (3) Xeon 2.8 GHz, 5.6k bogomips, 1GB RAM, process VIRT = 1.1GB
     (4) EE+ES problem: no regional digis? See next slide
     (5) Need the pending feature to reject L1 EM objects with high ranks

  9. Future EE+ES problem (?)
     • If you peek into the 15-20 GeV pthat bin, you can get a feeling for what working with ES+EE will be like:
       TimeReport   0.237284   0.243369   0.237284   0.243369  pathAlcastreamEcalPi0
       TimeReport ---------- Modules in Path: pathAlcastreamEcalPi0 ---[sec]----
       TimeReport       per event        per module-visit
       TimeReport        CPU       Real        CPU       Real  Name
       TimeReport   0.000430   0.000426   0.000430   0.000426  ecalPreshowerDigis
       TimeReport   0.000900   0.000984   0.000900   0.000984  ecalRegionalEgammaFEDs
       TimeReport   0.004739   0.005622   0.004739   0.005622  ecalRegionalEgammaDigis
       TimeReport   0.021177   0.022190   0.021177   0.022190  ecalRegionalEgammaWeightUncalibRecHit
       TimeReport   0.014748   0.015573   0.014748   0.015573  ecalRegionalEgammaRecHit
       TimeReport   0.194550   0.194647   0.194550   0.194647  ecalPreshowerRecHit
       TimeReport   0.000690   0.003887   0.000690   0.003887  alCaPi0RegRecHits
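
To make the point of this TimeReport explicit, the short sketch below computes each module's share of the per-event CPU time of the path. The numbers are copied from the report above; the variable names are illustrative only.

```python
# Per-event CPU time in seconds, copied from the TimeReport above.
module_cpu = {
    "ecalPreshowerDigis": 0.000430,
    "ecalRegionalEgammaFEDs": 0.000900,
    "ecalRegionalEgammaDigis": 0.004739,
    "ecalRegionalEgammaWeightUncalibRecHit": 0.021177,
    "ecalRegionalEgammaRecHit": 0.014748,
    "ecalPreshowerRecHit": 0.194550,
    "alCaPi0RegRecHits": 0.000690,
}
path_cpu = 0.237284  # pathAlcastreamEcalPi0, CPU per event

# Fraction of the path CPU time spent in each module.
shares = {name: cpu / path_cpu for name, cpu in module_cpu.items()}
slowest = max(shares, key=shares.get)

for name, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{name:40s} {100 * share:5.1f}%")
```

With these inputs the preshower RecHit module accounts for roughly four fifths of the path time, which is the reason for the concern about ES+EE.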

  10. Part II • Part II: Problem with output file size – cmssw framework problem

  11. Example
     • QCD 30-50, 100 events, default selection.
     • There are three ways to evaluate the size of the useful information.
     • I: Based on the number of RecHits.
       We need to store the energy (float = 8B) and the unique ID of the crystal (int = 4B): 12B per RecHit in total.
       374 RH * 12B = 4488B = 4.5kB total (including event and run numbers)
     • II: Based on the edmEventSize -v output:
       File alCaPi0RegRCT_0.09_0.16_RH75_30_50_100ev.root Events 100
       EcalRecHitsSorted_alCaPi0RegRecHits_pi0EcalRecHitsEB_recHitsToAlCaPi0RecHitsProcess. 569.61 569.61
       570B per event => 570B * 100 = 57000B = 57kB total, or 152B per RecHit
     • III: Based on the ls -al output (keeping only the pi0 collection and dropping the rest):
       cithep90 [] ls -al alCaPi0RegRCT_0.09_0.16_RH75_30_50_100ev.root
       -rw-rw-r-- 1 litvin litvin 2275860 Dec 9 05:41 alCaPi0RegRCT_0.09_0.16_RH75_30_50_100ev.root
       2275860B = 2.28MB total, or 6085B per RecHit
     • II/I = 12.7, III/I = 505
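
The three estimates can be cross-checked with a few lines of arithmetic. The inputs are the numbers quoted on this slide; the variable names are illustrative only.

```python
n_rechits = 374
n_events = 100

bytes_useful = n_rechits * 12   # method I: 12B per RecHit
bytes_branch = 570 * n_events   # method II: 570B per event from edmEventSize
bytes_file = 2275860            # method III: file size from ls -al

# Bytes per stored RecHit for each method.
per_rechit = {
    "I": bytes_useful / n_rechits,
    "II": bytes_branch / n_rechits,
    "III": bytes_file / n_rechits,
}

ratio_II_I = bytes_branch / bytes_useful
ratio_III_I = bytes_file / bytes_useful

print(per_rechit)
print(f"II/I = {ratio_II_I:.1f}, III/I = {ratio_III_I:.0f}")
```

Computed this way, III/I comes out near 507 rather than the 505 quoted on the slide; the difference is at the rounding level and does not change the conclusion.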

  12. Example
     • Inside edmEventSize -v:
       EcalRecHitsSorted_alCaPi0RegRecHits_pi0EcalRecHitsEB_recHitsToAlCaPi0RecHitsProcess. 18079
       EcalRecHitsSorted_alCaPi0RegRecHits_pi0EcalRecHitsEB_recHitsToAlCaPi0RecHitsProcess.edm::EDProduct 2323
       EcalRecHitsSorted_alCaPi0RegRecHits_pi0EcalRecHitsEB_recHitsToAlCaPi0RecHitsProcess.present 1250
       EcalRecHitsSorted_alCaPi0RegRecHits_pi0EcalRecHitsEB_recHitsToAlCaPi0RecHitsProcess.obj 12791
       EcalRecHitsSorted_alCaPi0RegRecHits_pi0EcalRecHitsEB_recHitsToAlCaPi0RecHitsProcess.obj.obj 12500
       EcalRecHitsSorted_alCaPi0RegRecHits_pi0EcalRecHitsEB_recHitsToAlCaPi0RecHitsProcess.obj.obj.id_.id_ 3338
       EcalRecHitsSorted_alCaPi0RegRecHits_pi0EcalRecHitsEB_recHitsToAlCaPi0RecHitsProcess.obj.obj.energy_ 3346
       EcalRecHitsSorted_alCaPi0RegRecHits_pi0EcalRecHitsEB_recHitsToAlCaPi0RecHitsProcess.obj.obj.time_ 3334
       EcalRecHitsSorted_alCaPi0RegRecHits_pi0EcalRecHitsEB_recHitsToAlCaPi0RecHitsProcess. 18079
       EcalRecHitsSorted_alCaPi0RegRecHits_pi0EcalRecHitsEB_recHitsToAlCaPi0RecHitsProcess. 56961 56961
       File alCaPi0RegRCT_0.09_0.16_RH75_30_50_100ev.root Events 100
       EcalRecHitsSorted_alCaPi0RegRecHits_pi0EcalRecHitsEB_recHitsToAlCaPi0RecHitsProcess. 569.61 569.61
     • Observations (374 RHs were stored):
       - It looks like edmEventSize counts the nested folders multiple times.
       - id_, energy_ and time_ have the same size = 9B per RecHit (do we really need time_ in RecHits at all, or is it there for calibration purposes?) (12%)
       - obj.obj contains 2.5kB more than id_ + energy_ + time_ (25%)
       - Process contains 4kB more than obj.obj + present + EDProduct (20%)
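
The observations can be checked against the dump with simple differences. This is a sketch only: the sizes are copied as printed, the dictionary keys are shortened labels chosen here, and how the entries nest inside each other is an assumption read off the branch names.

```python
n_rechits = 374

# Sizes as printed by edmEventSize -v above (labels shortened for readability).
sizes = {
    "branch": 18079,
    "EDProduct": 2323,
    "present": 1250,
    "obj": 12791,
    "obj.obj": 12500,
    "id_": 3338,
    "energy_": 3346,
    "time_": 3334,
}

# Size per stored RecHit for the three leaf members.
bytes_per_field = {k: sizes[k] / n_rechits for k in ("id_", "energy_", "time_")}

# What obj.obj holds beyond its three leaves.
leaves = sizes["id_"] + sizes["energy_"] + sizes["time_"]
objobj_extra = sizes["obj.obj"] - leaves

# Two ways of reading the top-level overhead.
branch_extra = sizes["branch"] - (sizes["obj.obj"] + sizes["present"] + sizes["EDProduct"])
branch_extra_alt = sizes["branch"] - (sizes["obj"] + sizes["present"])

print(bytes_per_field, objobj_extra, branch_extra, branch_extra_alt)
```

With these inputs each leaf is about 9B per RecHit and obj.obj carries about 2.5kB beyond its leaves, matching the slide. The difference to obj.obj + present + EDProduct comes to about 2.0kB, whereas the top-level size minus obj and present gives about 4.0kB, which is closer to the figure quoted in the last observation.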

  13. Example
     • Let's calculate the bandwidth requirements for all three cases. We start from 1kHz of pi0s and use a 3:1 factor, giving 3kHz of candidates (this is a bit of an overkill, but let's be conservative here), with ~50 RecHits per candidate (1kHz pi0s = 2000 pi0s/crystal, BW = 200GB/d).
     • I: 12B/RecHit
       3kHz * 12B * 50 * 10^5 s = 180GB (2 days, no prescale)
     • II: 152B/RecHit
       3kHz * 152B * 50 * 10^5 s = 2.28TB (24 days, 1:12 prescale)
     • III: 6085B/RecHit
       3kHz * 6085B * 50 * 10^5 s = 91.2TB (1080 days, 1:505 prescale)
     • There are two choices:
       - Heavily optimize the IO layer of cmssw to store small pieces of data efficiently, with small system overhead (no more than 10%-20%)
       - Develop an alternative way to store the data; an ASCII file will be just fine
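
The three data volumes follow from one product: rate, bytes per RecHit, RecHits per candidate, and running time. A minimal sketch with the slide's inputs (variable names are illustrative only):

```python
rate_hz = 3000             # pi0 candidates per second (1kHz of pi0s, 3:1 factor)
rechits_per_candidate = 50
live_seconds = 10**5

bytes_per_rechit = {"I": 12, "II": 152, "III": 6085}

# Total stored volume in bytes for each size estimate.
volumes = {
    case: rate_hz * size * rechits_per_candidate * live_seconds
    for case, size in bytes_per_rechit.items()
}

for case, volume in volumes.items():
    print(f"{case}: {volume / 1e12:.2f} TB")
```

This reproduces 180GB and 2.28TB exactly; case III evaluates to about 91.3TB, which agrees with the quoted 91.2TB up to rounding of the per-RecHit size.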

  14. Example
     • Tentative ASCII format:
       Int #Run
       Int #Event Float ERH1 int ID1 Float ERH2 int ID2 … Float ERHN int IDN
       Int #Event Float ERH1 int ID1 Float ERH2 int ID2 … Float ERHN int IDN
     • Overhead (~50 RecHits per pi0 candidate) will be 4B/600B < 1%.
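
A minimal sketch of a writer and reader for such a format. Several details are assumptions made here and not taken from the slide: one line for the run number, one line per event, whitespace-separated fields, and energies written with four decimals. The function names are hypothetical.

```python
def write_events(run, events):
    """Serialize events as text.

    events: list of (event_number, [(energy, crystal_id), ...])
    Layout (assumed): first line is the run number, then one line per event
    holding the event number followed by energy/ID pairs.
    """
    lines = [str(run)]
    for event_number, hits in events:
        fields = [str(event_number)]
        for energy, crystal_id in hits:
            fields += [f"{energy:.4f}", str(crystal_id)]
        lines.append(" ".join(fields))
    return "\n".join(lines) + "\n"


def read_events(text):
    """Parse the text produced by write_events back into (run, events)."""
    lines = text.splitlines()
    run = int(lines[0])
    events = []
    for line in lines[1:]:
        fields = line.split()
        if not fields:
            continue  # tolerate blank lines
        values = fields[1:]
        hits = [(float(values[i]), int(values[i + 1]))
                for i in range(0, len(values), 2)]
        events.append((int(fields[0]), hits))
    return run, events
```

The only per-event bookkeeping is the event number, which is what keeps the overhead small relative to the RecHit payload.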

  15. Plans
     • Produce 17x and 18x samples large enough to check the quality of pi0 candidates with the new online algorithms in different pthat bins. Optimize the cuts for different trigger scenarios at startup. (top)
     • Look at muon and tau candidates; perhaps we can use them as well if the rate is not sufficient. Right now we have plenty of trigger objects to work with, but this will still be investigated further based on 17x/18x data.
     • Split the regional unpacking between EB and EE so that EB or EE L1 objects can be unpacked separately. If we are working with EB, we don't need to unpack EE.
     • Develop the procedure to calibrate EE using the data from ES. (top)
     • Regional EE+ES digitization/unpacking?
     • A special algo should be developed to take into account the position of the ES clusters. The position resolution of the clusters in EE should be improved.
     • Check the eta calibration again with the new software.
     • Check the algo part in a set of exercises in Jan-Feb, based on 167 data produced in CSA07. (top)
     • How fast is the cmssw algo? How stable is castor? What is the castor IO performance compared with usual disks?
     • Check the standalone algo with the intermediate format in case there is a problem with cmssw, and compare. Scripts and format are ready now.
     • Check the DB access from within cmssw and with standalone tools.
     • DQM tools to monitor both the online HLT path and the offline CAF processes.

  16. Conclusion
     • The pi0 producer was improved and ported to 17x.
     • The HLT module was developed and committed to cvs.
     • The pi0 selection itself used ~5ms per event. The max and min values of the time spent in the path should be investigated further.
     • Bandwidth is OK if we store just the useful information. It is worrisome if we go by the branch size, and it is unacceptable if we use the default IO layer in cmssw.
     • There are two ways of dealing with it: optimize cmssw, or develop an alternative way to store the data with small overhead. One possible data format was proposed with virtually zero overhead (<1%).
     • Work on EE calibration will start in Jan. Regional unpacking in EE+ES is needed.
     • EB and EE+ES should be treated separately in regional unpacking.
