Online calibration for offline
What are we talking about?
• Goal for offline is to remove the need for "end of year" reprocessing
  • Avoid a huge peak in processing needs once per year
  • Make calibrated data available for analysis soon after it is taken
• Implies having the "final" alignment and calibration soon after the data are taken
  • Two to four weeks was achieved for Reco14 during September-December 2012
  • Certain fine-tune calibrations can be applied later in DaVinci
    • e.g. foresee a full end-of-year stripping where refined calibrations are applied
• Implies suppression of "prompt" processing
  • Insufficient resources to run "prompt" and "re-"processing in parallel
Prompt processing
• What is the purpose of prompt processing?
  • Data Quality monitoring
    • Need the Brunel reconstruction for many data quality checks
    • In 2012: 40% of bandwidth was reconstructed for this purpose
  • Stripped dataset for Calo calibration
    • Including the FemtoDST stream
• Can we get rid of it?
  • Data quality can be done online, from OnlineBrunel histograms
    • Needs work on selecting appropriate events, to reduce the statistics that need to be reconstructed
    • Can such a selection be done in HLT1, if HLT2 is deferred?
  • Mechanism needed to archive (offline) DQ histograms by run (a sketch follows this slide)
  • Work in progress to calibrate the Calo differently, online
  • FemtoDST can be done offline, after the full reconstruction
    • Cross-check for readjusting the online calibration, e.g. after a TS
    • Produce the fine tuning that can be applied in the end-of-year stripping
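The "archive DQ histograms by run" item is the kind of thing a small helper could make concrete. A minimal sketch, assuming the OnlineBrunel histograms arrive as one file per run and that a plain directory tree is an acceptable archive; ARCHIVE_ROOT, the path layout, and the function name are all hypothetical, not an agreed interface:

```python
# Hypothetical sketch of a per-run DQ histogram archive;
# none of these paths or names are fixed anywhere yet.
import shutil
from pathlib import Path

ARCHIVE_ROOT = Path("/hist/archive/dq")  # assumed archive location

def archive_dq_histograms(run_number: int, histogram_file: Path) -> Path:
    """Copy the OnlineBrunel histogram file for one run into a
    run-keyed directory so offline DQ can retrieve it later."""
    dest_dir = ARCHIVE_ROOT / f"run_{run_number:06d}"
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / histogram_file.name
    shutil.copy2(histogram_file, dest)
    return dest
```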
Calibration and alignment
• No judgement here on whether this can be fully automated
  • Up to the sub-systems to evaluate
  • Clear though that the initial calibration/alignment takes time and cannot be automated
    • Also in cases of changes that must be understood
    • Not critical for offline, which can wait for the green light!
• Input data
  • All calibrations and alignments require selected RAW data
  • Express stream, selected events -> output of HLT1
  • Jobs must run automatically, ideally on dedicated resources
• Output constants
  • Need to be distributed to "the grid"
  • Has to be automated (one possible shape is sketched after this slide)
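To make "has to be automated" concrete, here is one possible shape for the distribution step: a watcher that picks up freshly produced constants and hands them to a publish step. Everything here (the drop-box paths, the .db suffix, publish_to_grid itself) is an assumption for illustration, not an agreed interface:

```python
# Minimal sketch, assuming calibration jobs drop constants as files
# and "distribution to the grid" is a single publish call.
import time
from pathlib import Path

DROPBOX = Path("/calib/constants/new")     # assumed output area of calibration jobs
PUBLISHED = Path("/calib/constants/done")  # assumed area for handled files

def publish_to_grid(constants_file: Path) -> None:
    """Placeholder for the real distribution step; in practice this
    would inject the constants into the Online conditions partition."""
    print(f"publishing {constants_file.name}")

def watch_and_publish(poll_seconds: int = 60) -> None:
    """Poll the drop box and publish anything new, with no human in the loop."""
    PUBLISHED.mkdir(parents=True, exist_ok=True)
    while True:
        for f in sorted(DROPBOX.glob("*.db")):
            publish_to_grid(f)
            f.rename(PUBLISHED / f.name)  # mark as handled
        time.sleep(poll_seconds)
```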
Conditions database - reminders
• Two partitions supporting IOVs:
  • Online: time-varying constants but a single version
    • Automatically distributed
  • LHCbCond: multiple versions, version selected by tag
    • Tagging is manual, and the tag needs to be supplied to the application
    • or: "use the latest tag"
• In 2012, we regularly provided a new global tag and updated the productions
  • Manpower intensive, little scope for automation
• Need a scheme for automatic injection of tags
  • Actually, semi-automatic: should it be systematically validated?
  • Then "use latest" in both HLT2 and Offline
    • The meaning of "latest" can change. How do we keep track of what was used? (illustrated in the toy model below)
    • Same "latest" in HLT2 and Offline?
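The two lookup semantics, and why "use latest" is fragile, can be shown with a toy model. This is not the real CondDB schema: the partitions are modelled as plain lists of (IOV start time, value) pairs and tags as dictionary keys, all invented for illustration:

```python
# Toy model: Online has one value per interval of validity (IOV);
# LHCbCond keeps several versions and a tag selects one of them.
from bisect import bisect_right

# (iov_start_time, value) pairs, sorted by start time: single version
online_partition = [(0, "align-v1"), (1000, "align-v2")]

# tag -> that tag's own list of (iov_start_time, value) pairs
lhcb_cond = {
    "cond-2012a": [(0, "calib-A")],
    "cond-2012b": [(0, "calib-A"), (500, "calib-B")],
}

def lookup(iovs, event_time):
    """Return the value whose IOV contains event_time."""
    starts = [t for t, _ in iovs]
    return iovs[bisect_right(starts, event_time) - 1][1]

def lookup_tagged(tag, event_time):
    """'latest' is resolved at query time, so its meaning can change."""
    if tag == "latest":
        tag = max(lhcb_cond)  # assumes tag names sort chronologically
    return lookup(lhcb_cond[tag], event_time)

print(lookup(online_partition, 1200))  # align-v2
print(lookup_tagged("latest", 600))    # calib-B, via cond-2012b
```

The point of lookup_tagged is exactly the bookkeeping question raised above: the same call can return different answers once a newer tag appears, so whatever "latest" resolved to must be recorded per run if HLT2 and Offline are to stay reproducible and consistent.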
CondDB validation
• My belief is that, even if the injection of tags is automatic, the behaviour should be validated before it is used in production
  • In the HLT, if it goes wrong, you lose the data
  • In offline, if it goes wrong, you almost lose the data
    • Given the volumes of data to be handled, anything wrongly reconstructed is likely to be flagged BAD, and not redone until a reprocessing at least a year later
• For Offline, this may mean going back to the old design in which a run is reconstructed only once it has been validated in some way (sketched below)
• For the HLT some mechanism is needed
  • And if it's OK for the HLT, it's OK for offline...
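The "reconstruct only once validated" gate might look like the following. The in-memory sets standing in for the bookkeeping (validated_tags, runs_awaiting, the run number) are placeholders; the real mechanism would live in the production system:

```python
# Sketch of a validation gate: a run is released for reconstruction
# only after the conditions tag it would use has been validated.
validated_tags = {"cond-2012a"}          # tags that passed validation
runs_awaiting = {133742: "cond-2012b"}   # run -> tag it would be processed with

def runs_ready_for_reconstruction():
    """Yield runs whose conditions tag has been validated; anything
    else waits, rather than being reconstructed wrongly, flagged BAD,
    and left untouched until the next reprocessing."""
    for run, tag in runs_awaiting.items():
        if tag in validated_tags:
            yield run

validated_tags.add("cond-2012b")              # validation gives the green light
print(list(runs_ready_for_reconstruction()))  # [133742]
```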