220 likes | 397 Views
Upgrades for the CMS simulation. David J Lange LLNL September 4, 2014. CMS simulation faces significant challenges for both today and tomorrow. The CMS Physics goal is to keep same performances as in Run 1 despite the increasing more harsh conditions. We are busy preparing for 2015 to
E N D
Upgrades for the CMS simulation David J Lange LLNL September 4, 2014 D. Lange, ACAT 2014
CMS simulation faces significant challenges for both today and tomorrow The CMS Physics goal is to keep same performances as in Run 1 despite the increasing more harsh conditions. We are busy preparing for 2015 to • The higher LHC beam energy means more complex hard-scatter events • The higher LHC luminosity means larger number of interactions per bunch crossing (higher pileup) and thus more time consuming to simulate and reconstruct • Higher output rate of trigger (~1kHz) means demand for larger samples of simulated events We have achieved significant progress on the resource needs of our simulation during LHC shutdown D. Lange, ACAT 2014
CMS simulation faces significant challenges for both today and tomorrow Preparing for CMS at the start of Phase 2 (HL-LHC): • The CMS detector configuration is still to be determined • Even higher output rate of trigger (potentially 10kHz) • Even higher luminosity and pileup (140+ interactions/crossing) 1034 1035 HL-LHC presents increased challenges for Triggering, Tracking and Calorimetry, in particular for low to medium PT objects D. Lange, ACAT 2014
CMS Upgrade Strategy - Overview LS1 LS2 LS3 • Upgrades 2013/14 ongoing: • Completes muon coverage (ME4) • Improve muon trigger (ME1), DT electronics • Replace HCAL photo-detectors in forward • and outer (HPD → SiPM) Complete original detector Address operational issues Start upgrade for high PU • Phase 2 Upgrades: 2023-2025 (Technical Proposal in preparation) • Further Trigger/DAQ upgrade • Barrel ECAL Electronics upgrade • Tracker replacement/ Track Trigger • End-Cap Calorimeter replacement • Phase 1 Upgrades2018/19 (TDRs): • New Pixels, HCAL SiPMs and electronics, L1-Trigger • Preparatory work during LS1: • new beam pipe • test slices of new systems (Pixel cooling, HCAL, L1-trigger) Maintain/Improve performance at extreme PU. Sustain rates and radiation doses Maintain/Improve performance at high PU D. Lange, ACAT 2014
Changes we are making to facilitate 2015 and PhaseI/PhaseII upgrade development D. Lange, ACAT 2014
Flexible geometry infrastructure supports development of proposed systems • Each “CMS” geometry is made up of XML components for each subdetector • XML development done mostly by subdetector experts • Consistency checks and definition of overall “CMS” geometry scenarios done by overall geometry expert • Geometry is controlled via runtime job configuration to make it easy for users to get the geometry they want. The initial CMS geometryinfrastructure design has workedvery well for upgrade evaluation D. Lange, ACAT 2014
Example configuration of Phase II geometries implemented in CMSSW geometry D. Lange, ACAT 2014
Improving technical performance of the CMS detector simulation We have evaluated and integrated a number of technical improvements • Monte Carlo sampling techniques (e.g., Russian Roulette) • Deployment of Geant4.10 • Improved library packaging Together these improvements have gained a factor of 2 in simulation time/event in the last year of development D. Lange, ACAT 2014
Russian Roulette: Sampling of low-energy particles in Geant4 • Method from neutron shielding calculations: Track only a small fraction of low-energy particles through the detector with no noticeable change in simulation results • We found that it was necessarily to have sampling factors and thresholds that depend on both detector region and particle type. • Two parameters: • RR factor (1/W): Fraction of particles to keep • Upper energy limit (ERR) • Hits from Particles below ERR that are tracked are given a weight W. D. Lange, ACAT 2014
Russian Roulette now used by default after long tuning and validation process • RR factor of W=10 for neutrons and gammas found to give between 25% and 40% performance improvement with no observable effect on physics output • Energy and shower shape response in the high-resolution ECAL barrel detector were the most sensitive to RR parameter tuning D. Lange, ACAT 2014
10% performance gain from hidden visibility without playing with linker scripts Repackage all shared libraries in CMSSW that depend on Geant4 into a single static library to “hide” Geant4 from the rest of CMSSW • Use single archive library for Geant4 itself • This allowed us to more aggressively optimize at link time:adding “-flto -Wl,--exclude-libs,ALL” works best Constraints this imposes: • Must control dependencies to use Geant4 only within this single library: • This is “easy” for simulation (<2% of our libraries) • However, extending this idea to something effecting the full reconstruction is difficult • Impact on simulation code developers minimized by keeping .so cached in release. Static library rebuild is the only extra step if developer builds a package in this static library. D. Lange, ACAT 2014
Simulation approach: Hardscatter and pileup events simulated separately and “mixed” together Physics Generators Particle 4-vectors Geant 4 Geometry/Material Description Simulated Hits Simulated Raw Data Simulated Hits from Pileup Interactions Electronics Simulation Noise Model = “Digitization” D. Lange, ACAT 2014
Simulating Extreme Luminosities: The “old” way • Model pileup byincluding G4hits fromMinBias events generatedseparately from the hard-scatter event • The pileup interaction simulation From individual minbias events From “hard-scatter” MC event Simulated Hits Simulated Hits from Pileup Interactions … Simulated Hits from all interactions in BX N Simulated Hits from all interactions in BX N+1 Simulated Hits from all interactions in BX N+2 Simulated Hits from Pileup Interactions Electronics Simulation This loads all interactions in all beam crossings – all in memory simultaneously! ⇒ unsustainable at HL-LHC luminosities: ~140 interactions x 16 BXs = 2240 events in memory D. Lange, ACAT 2014
Modifications to allow very high pileup simulation within memory constraints • We re-factored the pileup simulation to process each interaction sequentially • Required substantial rewrite of digitization code, and the re-organization of internal event processing • The content of individual interaction events is dropped once they have been processed: • only 1 event in memory at any given time, so arbitrarily many pileup events can be included in the digitization THEN Simulated hits from one interaction repeat until all interactions are processed, including “hard-scatter” “Accumulate” hits/Energy Electronics Simulation D. Lange, ACAT 2014
We are now solving the remaining I/O issues with high pileup simulations • Even now that the MinBias events are read one-by-one, we still must read more than 2000 minbias event to produce a single event with appropriate pileup at HL-LHC luminosities • Potential Solution: “Pre-Mixing” • Create library of events containing only pileup contributions, following pre-determined luminosity profile to calculate how many interactions to include THEN Simulated hits from one interaction Simulated Raw Data repeat until all minbias interactions are processed “Accumulate” hits/Energy Electronics Simulation D. Lange, ACAT 2014
The hard-scatter and pileup events are combined as “RAW” data format events Physics Generators • The hard-scatter sample is created and processed through the digitization step with no pileup, convert to our raw data format • Then the two streams are merged • Only 1 pileup event is needed for each “hard scatter” MC event • Much easier to process through computing infrastructure once the premixing sample is created Simulated Raw Data Simulated Raw Data Geant 4 Simulated Raw Data Unpacking (RawToDigi) Unpacking (RawToDigi) Digi-Mixer (channels) Reconstruction, etc. Electronics Simulation D. Lange, ACAT 2014
Premixing deployment • Initial version implemented and deployed for part of our large 2015 preparation samples • Production tests went smoothly and demonstrated gains in CPU efficiency at high pileup • We have done an extensive validation against our normal mixing simulation approach. At this point, there are only a few known physics issues left to solve (saturation effects) • Need to take care with the re-use of events. • Generating 1 pre-mixed event for each hard-scatter event (e.g., 10 billion) would significantly reduce the potential benefit • We are evaluating the statistical issues with reuse of pileup events • We have always reused individual MinBias events in the pileup simulation, but with pre-mixing, we would reuse groups of MinBias events • In any case, care is needed in the MC production system to avoid excess resuse of pre-mixed events D. Lange, ACAT 2014
Managing CMS detector configurations • We expect that a single “CMSSW” version will support simulation and reconstruction of past, current and future detector configurations • We are not there yet as we have a release for the 2010-2016 configurations and another for 2017-202X • Challenges • The geometry and detailed channel-level information is part of our conditions database • Limitations in detector numbering schemes: Most were written specifically to the Run1 CMS detector specification. These are the biggest issue for generalizing our software • Application configuration needs to be flexible for both detector and LHC conditions differences D. Lange, ACAT 2014
Not everything that is detector dependent is easily made a condition • CMS uses a PYTHON based configuration language for SIM/RECO/ANALYSIS job configuration control. • Developers define “modules” • Sequences of modules define a “process” for our workflows • Upgrade software developers often need to change the configuration from that of the current CMS detector. Examples: • Detector dependent algorithm parameters (detector segmentation) • Pileup dependent tracking configuration • Sometimes it is simply the most convenient way to do something that could/should be a condition or derived from the geometry D. Lange, ACAT 2014
We have so far centralized these configuration changes • To keep control over these, we kept all configuration changes in one location per detector configuration (10s of lines of python) cms2017Customize.py defcustomize_reco(process): process.trkingIter1.ptMin=0.2 process.particleFlow.endcapEcalSource=‘shashlik’ …… # 10-200 lines of changes typically needed return process D. Lange, ACAT 2014
“ERA”s concept will let us de-centralize configuration changes • Top level configuration defines “ERA” (eventually derived from the geometry and conditions specified) • Handful of parameters that can be used by modules to change their configuration appropriately • Moves the development and maintenance of the upgrade detector configurations back to the developers (where the expertise is) • Better integration with on-going developments for the 2015 startup • Can also be used for the small number of configuration differences needed for 2015 vs 2012 CMS detectors trkingIter1=EDProducer(“IterativeTrackingModule”, ptMin=0.15) if era.pileup > 80: trkingIter1.ptMin=0.2 D. Lange, ACAT 2014
Conclusions • Recent development work in CMS means a big reduction in simulation resource needs for 2015 even in the face of higher event complexity and trigger rates. • CMS detector upgrades push us to use today’s software/computing for tomorrow’s event complexity. • The detector upgrade developments have proven to be an excellent platform for the quick deployment of new simulation features D. Lange, ACAT 2014