Realistic MC Needs/Status
Mike Hildreth, Université de Notre Dame du Lac
Representing the Full Simulation Group, with Charles Plager, UCLA/FNAL
Mike Hildreth/Charles Plager – CMS Physics Week, Bodrum

Philosophic Overview
• Full Simulation: should include all details relevant for physics analysis
  • accurate representation of the interaction of particles with the detector
  • tails on resolution distributions from all known effects
    • to the extent that these tails/effects are important…
  • including noise, pileup, geometry, sensitivity, etc.
• Just because we can make the simulation perfect doesn’t mean we should
  • we are currently victims of our own hard work/success
  • when things look “too good”, analysts will expect perfection
  • in reality, it will never be perfect
  • some a posteriori corrections will always be necessary, so we will need to plan for the implementation of these corrections
• Goal: reduce the size of the necessary corrections and the associated errors so that they have minimal impact on physics
  • implies improvements to the Simulation if needed
  • needs to be “good enough”

Talk Overview
The 64M Lira Question: “What is ‘Good Enough’?”
• Right now, we don’t know the answer...
• Here: review strategies for achieving Data/MC agreement, and look at the advantages/disadvantages of each approach
  • Run-Dependent Monte Carlo
    • Technical Readiness
    • Possible Implementation Scenarios
    • Data Recovery Test
  • Event Re-Weighting (or “Run-Independent” MC)
    • Overview of Techniques
    • Examples
Emphasize: Studies needed to arrive at MC Production Strategy

Realistic MC
• Definition of “Realistic MC”:
  • IOV-based (IOV = interval of validity; here, run-number based)
    • uses DB to control live/dead channels, calibrations, etc.
  • Realistic “Conditions”
    • beam spot in the correct position for a given run
    • pileup distribution matches the Data
  • “Realistic” Detector Geometry
    • mis-alignments/smeared geometry matching the real detector
  • NOT: completely realistic trigger simulation
    • Offline L1 Triggers do not exactly match what was run online
• One proposal: create MC samples with a run distribution that statistically samples all Data runs with the proper luminosity distribution
• Alternatively, a set of “representative” runs could be decided upon by the Physics groups to give an appropriate sampling of run-dependent epochs of detector conditions

Status of Necessary Pieces (I)
• Run-number setting (√)
  • Simulation/Framework allows the specification of arbitrary lists of runs, each with a probability weight, to assign run numbers for Generation
  • set before SIM step
• Run-number dependent code in Sim Packages (√)
  • All subdetectors have code that reads conditions from the DB for simulation of masked/dead channels
  • can easily be made run-dependent by assigning IOVs
• Beam Spot from DB (X)
  • under discussion with experts to evaluate time scale
  • not complicated, could be done quickly
  • must be set in SIM step, matching the Run number
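The run-number setting described above (a list of runs, each with a probability weight) can be illustrated with a short sketch. This is not the CMS framework code: the function name `sample_run_numbers`, the run numbers, and the weights are all hypothetical, and the weights are assumed to be proportional to the integrated luminosity of each run.

```python
import random

def sample_run_numbers(run_weights, n_events, seed=12345):
    """Draw one run number per event, with probability proportional to its weight."""
    runs = list(run_weights)
    weights = [run_weights[run] for run in runs]
    if any(w < 0 for w in weights) or sum(weights) <= 0:
        raise ValueError("weights must be non-negative with a positive sum")
    rng = random.Random(seed)  # fixed seed so the assignment is reproducible
    return rng.choices(runs, weights=weights, k=n_events)

# Hypothetical run numbers with integrated luminosities (arbitrary units).
run_weights = {140001: 2.0, 140002: 6.0, 140003: 12.0}
assigned = sample_run_numbers(run_weights, n_events=10000)
```

With these illustrative weights, run 140003 carries 60% of the luminosity, so roughly 60% of the generated events would be assigned to it.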

Status of Necessary Pieces (II)
• Arbitrary Pileup Distributions (√)
  • currently, distributions must be put in “by hand”
    • either a ROOT file or a list of numbers
  • reading from the DB would obviously be better
    • another relatively quick project
  • actual distributions will have to be measured
    • correspondence of the number of vertices with the true number of interactions, for example
    • dedicated pileup studies needed
• Geometry (X)
  • “Large” movements, such as mis-centering of the pixels with respect to the Silicon Tracker, beampipe, (Ecal?), will be put in the Geant geometry
    • would be the new default
  • Geometry DB would allow use of new and old samples

Realistic Conditions?
A couple of items missing from the previous discussion:
• Beam Backgrounds
  • may be time-dependent
  • should be taken directly from data if necessary
  • we don’t know what the impacts are on analysis yet
• Asynchronous HPD noise in Hcal
  • Request (plan?) for Hcal to use data in simulation as default
• Thermal neutrons in cavern
  • simulation plan evolving; may be difficult to get right
• All these may be easiest using DataMixer
  • Not clear if such detail is necessary...

Comments on DataMixer
Some attractive attributes:
• Backgrounds, multiple interactions, detector noise, etc. are automatically “correct” if samples are properly formulated
  • no incorrect physics model of low-pT interactions
  • taken directly from Nature
• Overlay is done at Digi level, so it can be re-done without excessive computation
But: it is the ultimate run-dependent MC
• beamspots must match between simulated and overlay events
• zero-bias samples will need to be constructed before MC production representing a given time period can start
• some care is necessary to make sure IOVs are correct for MC
Need Pileup Studies (Data/MC comparisons) to know whether or not (or at what level) we need to use data overlay

“Data Recovery” Test
• Working with DPGs, we will generate run-dependent MC for some runs that are currently declared “BAD” due to hardware problems
  • use DB to kill MC hits in channels that are off
    • (will have to watch load on DPGs if this becomes routine)
  • e.g.: Tracker power supply failure
• DPGs/POGs will check MC/Data agreement for these epochs
• “Good” agreement between Data and MC may allow the “recovery” of some data with minor problems, since MC will correctly simulate the degradation
[Figure: same JSON file used for Data & MC. Credit: Gordon Kaußen]
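As a rough illustration of "use DB to kill MC hits in channels that are off", the sketch below filters simulated hits against a run-dependent dead-channel list. It is only a toy model: the conditions layout, the channel numbers, and the function names `dead_channels_for_run` and `kill_dead_hits` are hypothetical, and a plain Python list stands in for the conditions database.

```python
def dead_channels_for_run(conditions, run):
    """Return the set of dead channels whose IOV (first_run..last_run) contains `run`."""
    for first_run, last_run, dead_channels in conditions:
        if first_run <= run <= last_run:
            return set(dead_channels)
    return set()  # no matching IOV: assume every channel is live

def kill_dead_hits(hits, dead_channels):
    """Keep only the simulated hits that sit in live channels."""
    return [hit for hit in hits if hit["channel"] not in dead_channels]

# Hypothetical conditions: (first_run, last_run, dead channels).
CONDITIONS = [
    (100, 149, []),
    (150, 160, [7, 8]),  # e.g. an epoch with a power-supply failure
]

hits = [
    {"channel": 6, "adc": 41},
    {"channel": 7, "adc": 55},
    {"channel": 9, "adc": 12},
]
surviving = kill_dead_hits(hits, dead_channels_for_run(CONDITIONS, run=155))
```

For run 155 in this toy setup, the hit in channel 7 is dropped and the other two survive, which is the degradation the MC would need to reproduce for the affected epoch.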

Reweighting Techniques
• Almost all physics analyses will not be able to use “straight” MC, but instead will need to apply correction factors (i.e., “scale factors”) or efficiencies to correct for MC/Data disagreement
• In some very simple cases, these may be able to be applied after running over the MC samples
  • e.g., an electron reconstruction efficiency/scale factor that is independent of the electron kinematics, for a single-electron analysis
• In many (most?) cases, it is either necessary or much easier to apply these corrections/efficiencies while running over the MC, e.g.:
  • electron efficiency/scale factor that is not independent of kinematics
  • electron efficiencies for multi-electron analyses
  • b-tagging efficiencies depending on jet kinematics
  • jet energy corrections
  • etc.
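To make the kinematics-dependent case concrete, here is a minimal sketch of a per-event weight built from per-electron scale factors. The bin edges and scale-factor values are invented for illustration, and the sketch assumes that every electron in the event must pass the selection, so the event weight is the product of the individual scale factors.

```python
import bisect

# Hypothetical bin edges and scale factors, for illustration only.
PT_EDGES = [20.0, 40.0, 80.0]   # GeV; bins: [20, 40), [40, 80), [80, inf)
ABS_ETA_EDGES = [0.0, 1.5]      # bins: [0, 1.5), [1.5, inf)
SCALE_FACTORS = [
    [0.95, 0.97, 0.98],         # |eta| < 1.5
    [0.90, 0.93, 0.96],         # |eta| >= 1.5
]

def electron_scale_factor(pt, eta):
    """Look up the Data/MC scale factor for one electron by (pT, |eta|) bin."""
    if pt < PT_EDGES[0]:
        raise ValueError("electron below the lowest pT bin")
    i_eta = bisect.bisect_right(ABS_ETA_EDGES, abs(eta)) - 1
    i_pt = bisect.bisect_right(PT_EDGES, pt) - 1
    return SCALE_FACTORS[i_eta][i_pt]

def event_weight(electrons):
    """Product of the scale factors of all selected electrons in the event."""
    weight = 1.0
    for pt, eta in electrons:
        weight *= electron_scale_factor(pt, eta)
    return weight

# A two-electron MC event: (pT in GeV, eta).
weight = event_weight([(25.0, 0.3), (90.0, -2.0)])
```

Because the weight depends on each electron's kinematics, it has to be computed event by event while running over the MC, which is the point made in the slide.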

Reweighting Techniques
• Sample from Monte Carlo generated with various conditions to build an MC analysis dataset equivalent to the Data
  • different “correction factors” for different data epochs
• Can make use of constants stored in the Performance DB (PerfDB) and IOVs
  • MC sample must supply run & luminosity information to PerfDB to obtain the appropriate correction factors
    • distribution of runs matches Data
  • caveat: often, have to worry about changing conditions for more than one object (e.g., b-tagging, electron efficiencies, etc.)
    • overlapping IOVs have to be treated correctly, e.g. relative weights given by integrated lumi
[Figure: run numbers 1 to 10 along the horizontal axis, with rows showing the b-tag efficiency IOVs, the electron efficiency IOVs, and the resulting total IOVs modeled in MC (A, B, C)]
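The treatment of overlapping IOVs can be sketched as follows: every run at which any quantity changes starts a new combined IOV, and each combined IOV is weighted by its integrated luminosity. The IOV boundaries and luminosities below are assumptions chosen to give three combined IOVs, as in the A/B/C illustration; they are not taken from the original figure.

```python
def combine_iovs(iov_starts_by_quantity, first_run, last_run):
    """Merge the IOV start runs of several quantities into combined IOVs.

    Returns a list of (first_run, last_run) tuples covering the run range.
    """
    starts = {first_run}
    for quantity_starts in iov_starts_by_quantity.values():
        starts.update(s for s in quantity_starts if first_run < s <= last_run)
    ordered = sorted(starts)
    ends = [s - 1 for s in ordered[1:]] + [last_run]
    return list(zip(ordered, ends))

def iov_weights(combined, lumi_by_run):
    """Relative weight of each combined IOV, given by its integrated luminosity."""
    per_iov = [
        sum(lumi for run, lumi in lumi_by_run.items() if first <= run <= last)
        for first, last in combined
    ]
    total = sum(per_iov)
    return [value / total for value in per_iov]

# Hypothetical IOV start runs for two quantities over runs 1..10.
IOV_STARTS = {"btag_efficiency": [1, 5], "electron_efficiency": [1, 8]}
LUMI_BY_RUN = {run: 1.0 for run in range(1, 11)}  # equal lumi per run, for simplicity

combined = combine_iovs(IOV_STARTS, first_run=1, last_run=10)
weights = iov_weights(combined, LUMI_BY_RUN)
```

With these inputs the combined IOVs are runs 1 to 4, 5 to 7, and 8 to 10, playing the role of A, B, and C.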

Making Generic MC Run-Dependent
• General idea:
  1. Query the PerfDB about the different IOVs for the quantities in which we are interested.
  2. Get the luminosity profile for the data in which we are interested.
  3. Calculate the intersection of the data luminosity and IOV lists.
  4. When running over MC, a run-lumi tuple will be generated for each event such that the different IOVs will be represented in the correct fraction as dictated by the data.
• Complete tools will be provided that work both in cmsRun and FWLite
  • central solution, not 40 different ones…
• Reweighting will be necessary whether or not we have Run-Dependent MC; tools should be centrally developed
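The last step of the general idea, generating a run-lumi tuple per MC event so that each IOV appears in the fraction dictated by the data, might look like the sketch below. It is not the central tool mentioned in the slide: the luminosity profile, the IOV list, and the names `make_run_lumi_sampler` and `iov_label` are hypothetical, and the draw uses a simple inverse-CDF lookup over the cumulative luminosity.

```python
import bisect
import itertools
import random

def make_run_lumi_sampler(lumi_profile, seed=2011):
    """Build a function that draws (run, lumi_section) proportionally to luminosity.

    lumi_profile maps (run, lumi_section) -> integrated luminosity.
    """
    keys = sorted(lumi_profile)
    cumulative = list(itertools.accumulate(lumi_profile[k] for k in keys))
    total = cumulative[-1]
    rng = random.Random(seed)

    def draw():
        x = rng.random() * total                   # uniform in [0, total)
        idx = bisect.bisect_right(cumulative, x)   # first bin whose upper edge exceeds x
        return keys[min(idx, len(keys) - 1)]       # guard against rounding at the edge

    return draw

def iov_label(run, iovs):
    """Return the label of the IOV (first_run, last_run, label) containing `run`."""
    for first_run, last_run, label in iovs:
        if first_run <= run <= last_run:
            return label
    raise KeyError(f"run {run} is not covered by any IOV")

# Hypothetical luminosity profile and IOV list.
LUMI_PROFILE = {(1, 1): 1.0, (1, 2): 1.0, (2, 1): 2.0, (3, 1): 4.0}
IOVS = [(1, 2, "A"), (3, 3, "B")]

draw = make_run_lumi_sampler(LUMI_PROFILE)
run, lumi_section = draw()
conditions_epoch = iov_label(run, IOVS)
```

In this toy profile, IOVs A and B each hold half of the luminosity, so about half of the MC events would pick up the conditions of each.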

Luminosity Reweighting
• For a first pass, assume we only need to worry about the number of pileup interactions.
• General idea:
  1. Split the data into IOVs as previously discussed.
  2. For each data IOV, convert the luminosity profile into a distribution of the expected number of pileup interactions.
     • This depends on the bunch structure at the LHC and not just the instantaneous luminosity.
     • We will also have to worry about out-of-time pileup…
  3. Get the expected number of pileup interactions summed over all data IOVs (i.e., sum up the distributions from step 2 above).
  4. For the MC sample, get the distribution of pileup interactions.
  5. For each MC event, given its number of pileup interactions, weight the event based on the pileup distributions, and pick an IOV based on the per-IOV distributions of the number of pileup interactions, so as to match the overall Data distribution.
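One way to read steps 3 to 5 is sketched below: the event weight for n pileup interactions is the ratio of the normalised Data and MC distributions at n, and the probability of assigning the event to a given IOV is that IOV's share of the Data distribution at n. The input histograms are invented, and the function name `pileup_weights` is hypothetical.

```python
def pileup_weights(data_by_iov, mc_dist):
    """Compute per-n event weights and per-n IOV probabilities.

    data_by_iov: {iov: [expected events with n pileup interactions]},
                 luminosity-weighted but not normalised.
    mc_dist:     [MC events with n pileup interactions], same binning.
    """
    n_bins = len(mc_dist)
    data_total = [sum(dist[n] for dist in data_by_iov.values()) for n in range(n_bins)]
    data_norm = sum(data_total)
    mc_norm = sum(mc_dist)

    weights, iov_probs = [], []
    for n in range(n_bins):
        if mc_dist[n] > 0:
            weights.append((data_total[n] / data_norm) / (mc_dist[n] / mc_norm))
        else:
            weights.append(0.0)  # no MC events at this n, so nothing to reweight
        if data_total[n] > 0:
            iov_probs.append({iov: dist[n] / data_total[n] for iov, dist in data_by_iov.items()})
        else:
            iov_probs.append({iov: 0.0 for iov in data_by_iov})
    return weights, iov_probs

# Hypothetical distributions for n = 0, 1, 2, 3 pileup interactions.
DATA_BY_IOV = {"early": [6.0, 3.0, 1.0, 0.0], "late": [0.0, 1.0, 3.0, 6.0]}
MC_DIST = [5.0, 5.0, 5.0, 5.0]

weights, iov_probs = pileup_weights(DATA_BY_IOV, MC_DIST)
```

In this toy input, an MC event with n = 1 is three times more likely to be assigned to the early IOV than to the late one, while its weight corrects the flat MC distribution to the shape of the summed Data distribution.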

Quick Aside: Pileup at CMS
• We now have all of the tools in place to estimate pileup at CMS.
• Data/Prediction agree well:
[Figure: Predicted vs. Measured Primary Vertices (credit: Charles Plager). Data: luminosity-weighted zero-bias data. Predicted: instantaneous luminosity information convolved with the MC minimum-bias vertex efficiency. Annotation: circulation rate.]
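For orientation, the standard relation between instantaneous luminosity and the mean number of interactions per bunch crossing can be sketched as below. This is a generic estimate, not the prediction code behind the figure: it omits the vertex-efficiency convolution mentioned in the caption, and the cross-section and bunch count in the example are assumed values. The LHC revolution frequency of about 11245 Hz is presumably what the "circulation rate" annotation refers to.

```python
import math

LHC_REVOLUTION_FREQUENCY_HZ = 11245.0  # one bunch passes a given point ~11245 times per second
MB_TO_CM2 = 1e-27                      # 1 millibarn in cm^2

def mean_interactions(inst_lumi_cm2s, sigma_mb, n_colliding_bunches):
    """Mean interactions per crossing: mu = L * sigma / (n_bunches * f_rev)."""
    crossing_rate = n_colliding_bunches * LHC_REVOLUTION_FREQUENCY_HZ
    return inst_lumi_cm2s * sigma_mb * MB_TO_CM2 / crossing_rate

def poisson_pmf(n, mu):
    """Probability of exactly n interactions in a crossing with mean mu."""
    return math.exp(-mu) * mu ** n / math.factorial(n)

# Assumed example values: L = 1e32 cm^-2 s^-1, sigma = 70 mb, 348 colliding bunches.
mu = mean_interactions(1e32, sigma_mb=70.0, n_colliding_bunches=348)
pileup_distribution = [poisson_pmf(n, mu) for n in range(15)]
```

A predicted vertex multiplicity would additionally fold in the probability of reconstructing each interaction as a vertex, which is where the MC minimum-bias vertex efficiency enters.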

Example of Luminosity Reweighting
• Take a Tevatron sample as an example
[Figure: unnormalized weights]

Example of Luminosity Reweighting
• An MC event with one pileup interaction is much more likely to come from early data
[Figure annotations: “use this IOV ratio (early/late) for conditions”; “use this weight”]

Example of Luminosity Reweighting
• An MC event with 6 pileup interactions is equally likely to come from early or late data
[Figure annotations: “use this IOV ratio (early/late) for conditions”; “use this weight”]

Reweighting Redux
• Requires no knowledge of detector conditions for MC generation
  • if reweighting is sufficient for run-dependence
  • potentially faster turn-around for physics analysis
• Adaptable to most corrections for MC/Data agreement
  • However, requires most corrections to be in the form of scale factors, not something more subtle
• Widely used
  • will be necessary to correct for residual MC/Data disagreement
    • e.g., deficiencies in generators, if nothing else
• Will put quite some strain on the Performance DB
  • may need careful design considerations, light payloads

Back to Run-Dependent MC
• Do we need it?
• Questions to answer (not a complete list):
  • Do we have “epochs” of dramatically different detector performance that have impacts on physics?
    • either large correction factors or a different simulation
    • can we recover “BAD” data with proper simulation?
  • How much does beam spot motion affect
    • tracking efficiency?
    • tracking errors?
    • B-tagging efficiency?
    • NB: beam spot motion cannot be re-done at Digi time
  • What are the effects of beam backgrounds on physics, and do they vary in time in a way that should be modeled?
    • same for neutrons in the cavern
  • Are there significant differences in pileup distributions between Data and MC that we cannot simulate by modifying MC parameters?
  • What about degradation of the detector over time?

Back to Run-Dependent MC
• Do we need it?
• Will need to explore both weighting and Run-Dependent options to decide which is necessary
  • implies efforts from Physics, POGs, DPGs to study the issues
  • close coordination with the simulation effort
• In particular, we now have sufficient quantities of events with large numbers of interactions in a single beam crossing
  • need to have matching MC samples to make thorough comparison studies:
    • Can we simulate pileup, or should we use data instead?
  • need some sort of Pileup Study Group to coordinate
    • again, close coordination between Physics and Simulation/Generators

Conclusions
• MC will never be perfect
  • Reweighting tools will be necessary and should be centrally developed
• Studies are needed to determine how perfect the MC needs to be
  • quite broad in scope
    • pileup
    • efficiency epochs
    • beamspot effects
  • need central coordination
• Technically, Run-Dependent MC is possible ~now
  • some small updates needed
  • validation needed