Software & Computing: Data Distribution (Heavy Ion Group)
Heavy Ion Collisions • Large spread in event multiplicity from the change of impact parameter • Collision geometry mostly produces peripheral (low-multiplicity) events (illustrated in the sketch below) • But the highest multiplicities may reach 5x the average • A central event comes up randomly only about once every ~30 collisions • In production of detector simulations this is aggravated by event ordering, which places the most central events together at the end of production jobs [Figures: Glauber model sketch; HIJING event displays of a peripheral and a central collision]
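The geometric bias toward peripheral events follows from the impact-parameter distribution, dσ ∝ b db, so b² is uniform up to b_max². A minimal toy sketch of this point (the 15 fm maximum impact parameter and the 3% "central" cut are illustrative assumptions, not the Glauber or HIJING settings used in production):

```python
import random

# For a purely geometric cross-section dsigma ~ b db, b^2 is uniform in [0, b_max^2],
# so small-b (central) collisions are rare and large-b (peripheral) ones dominate.
B_MAX = 15.0              # fm, illustrative maximum impact parameter (assumption)
CENTRAL_FRACTION = 0.03   # treat the ~3% smallest-b collisions as "central" (assumption)
N_EVENTS = 100_000

b_central_cut = B_MAX * CENTRAL_FRACTION ** 0.5   # b below this counts as central

central = sum(
    1 for _ in range(N_EVENTS)
    if B_MAX * random.random() ** 0.5 < b_central_cut   # sample b with p(b) ~ b
)

print(f"central events: {central} of {N_EVENTS} (~1 in {N_EVENTS // max(central, 1)})")
```

With a ~3% central class this reproduces the "one central event per ~30 collisions" quoted above.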
Distinct Features • Collision geometry parameters • Heavy Ion physics results are often presented as a function of centrality • Centrality can be obtained from calculations based on the Glauber model, but it is better to use results from Glauber-based Monte Carlo models (a centrality-binning sketch follows this slide) • Parameters need to be preserved through the chain of simulation production • Special features of collisions of heavy ions, bulk matter objects • The different physics program requires specific reconstruction algorithms • The different collision properties require modification or replacement of some of the standard ATLAS reconstruction algorithms • High particle multiplicities • Heavy Ion collisions produce much higher particle multiplicities than pp • This leads to longer reconstruction times and critically high memory usage • The large event size requires more disk space per event than pp data
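In practice a centrality class is simply a percentile interval of an event-activity distribution (multiplicity, total ET) taken from a Glauber-based Monte Carlo. A minimal sketch of that binning step, assuming a per-event multiplicity array from such a model (the negative-binomial toy input and the class edges are illustrative, not the HICentrality implementation):

```python
import numpy as np

# Toy stand-in for per-event multiplicities from a Glauber-based MC model (assumption).
rng = np.random.default_rng(42)
multiplicity = rng.negative_binomial(n=2, p=0.002, size=100_000)

# Centrality classes are percentiles of the distribution: the 10% highest-multiplicity
# events form the 0-10% most central class, and so on (illustrative class edges).
edges_pct = [0, 10, 20, 40, 60, 80, 100]
cuts = np.percentile(multiplicity, [100 - e for e in edges_pct])  # descending multiplicity cuts

def centrality_class(mult):
    """Return the centrality interval (in %) an event with this multiplicity falls into."""
    for lo, hi, cut_hi, cut_lo in zip(edges_pct, edges_pct[1:], cuts, cuts[1:]):
        if cut_lo <= mult <= cut_hi:
            return f"{lo}-{hi}%"
    return f"{edges_pct[-2]}-{edges_pct[-1]}%"

print(centrality_class(multiplicity.max()))             # -> "0-10%"
print(centrality_class(int(np.median(multiplicity))))   # falls in the 40-60% class
```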
Event Generators • Models Hijing and Hydjet used • Using code from official Genser distribution • Athena interface implemented for both models • Problems? • Hijing code had a problem on SLC5 • Fixed in June and partially validated (high statistics validation soon)
Simulation & Digitization • Using standard Athena tools • Collision parameters are transferred when using HeavyIonSimConfig.py • Problems? • Detector simulation of a single central event takes a long time (up to 24 hours), which is still acceptable for Grid production at this collision energy • Rate of permanent simulation failures: ~0.1%
Reconstruction • Using standard Athena tools • Heavy Ion specific modifications activated when using HeavyIonRecConfig.py • Collision parameters transferred • Trigger algorithms selected by HI menus • Heavy Ion specific algorithms from HeavyIonRec used • Modifications in standard reconstruction algorithms activated • Problems? • No production problems in rel. 15
HI Algorithms • HIGlobal: Global variables reconstruction • HICentrality: Event centrality • HIGlobalEt: Total Et • HIFlow: charged particle elliptic flow v2 • HIGlobalNSiCluster: dNch/dη based on pixel cluster density • HIPixelTracklets: dNch/dη based on 2-point tracklets in the Pixel detector • HIJetRec: Jet reconstruction • extends standard JetRec + new background subtraction and fake jet rejection (see the subtraction sketch below) • HIPhoton: Direct photon pre-analysis • based on pp photon algorithms, produces a special ntuple for final analysis
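The extra background-subtraction step in HIJetRec exists because the heavy ion underlying event deposits far more energy than pp; jets have to be built and corrected on top of that pedestal. A minimal sketch of an average-density subtraction over calorimeter towers (an illustration of the general technique with assumed inputs, not the HIJetRec implementation):

```python
from statistics import median

def subtract_underlying_event(tower_et, tower_area):
    """Subtract an event-averaged background ET density from each calorimeter tower.

    tower_et   : list of tower transverse energies in GeV
    tower_area : eta-phi area of one tower (assumed constant here)
    Returns background-subtracted tower ETs, clipped at zero.
    """
    # Estimate the underlying-event density from the median tower, which is
    # less biased by the few towers containing real jets than the mean would be.
    rho = median(tower_et) / tower_area          # GeV per unit eta-phi area
    return [max(et - rho * tower_area, 0.0) for et in tower_et]

# Example: a flat 5 GeV-per-tower background with one 65 GeV jet tower on top.
towers = [5.0] * 99 + [65.0]
print(subtract_underlying_event(towers, tower_area=0.01)[-1])   # -> 60.0
```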
pp Algorithms • Trigger processing using dedicated HI menus • several trigger menus developed by Tomasz and Iwona Grabowska-Bołd • Tracking running in newTracking mode • newTracking with modified cuts activated by doHeavyIon switch • to lower CPU and memory requirements • no lowPt, no BackTracking, no conversions • Vertexing run in simple "DefaultFastFinding" mode • no V0 finder, no secondary vertices • Regular jets off • Heavy Ion version of jet reconstruction run instead • no ETMiss, no ETFlow, no BTagging, no TauRec • Calorimeter reconstruction based on cells • CaloTopoClusters activated for monitoring purposes • Muon reconstruction on • no MuidLowPt
Simulation Production • Official production at collision energy ECM = 2.75 TeV done in the recent campaign with releases 15.6.X.Y • Description of mc09 samples • Hijing and Hydjet minimum-bias and central samples • Hijing with particle flow • 5-10k events • Additional Hijing requests for increased statistics and more physics samples accepted and running now (actually waiting in the queue with low priority) • Hijing tasks • Hydjet tasks
Real Data Production Planning • Total amount and rate of data taking – fit data to available storage and computing resources • Reconstruction properties and requirements, data types and sizes – required cpu time and disk space for storage of reconstruction results • Tier0, Tier1 and Group Grid resources available – input for production and data distribution strategy • Software development and installation procedures – deadlines, possible scenarios for running tests and production • Production strategy – which resources will be used in which step • Analysis Model – where the data should be distributed
CPU/Mem in rel. 15/16 Reconstruction • Rel. 15.6.9.13 has acceptable CPU and memory consumption, with 100% reconstruction job success • Rel. 16.0.0 reconstruction on simulations (only) exceeds the available ~4 GB memory limit in 55% of jobs • Reason 1: increased memory consumption between releases due to a test run with tracking from min pT = 0.5 GeV, ranging from a 50 MB difference at lower multiplicity up to a 700 MB difference in the most central events! • Reason 2: increased memory consumption by monitoring algorithms, adding 200 MB more at high multiplicity! • To reduce memory usage we may look for a compromise in the tracking min pT and reduce monitoring requirements, or run reconstruction on simulations without monitoring altogether.
Data Reconstruction • Reconstruction strategy at Tier0 • Performance of Heavy Ion reconstruction with monitoring in rel. 15.6.9.13: 45 CPU s/event (no trigger, no truth), assuming <2 min/event Panda wall time • Tier0 capacity • current total CPU capacity of the Tier0 farm (2500 cores!) • efficiency of CPU use at Tier0 (~100%!) • no additional CPU needed for other processing (file merging, etc.) at Tier0 • Calculated throughput: 30 PbPb events/hour/CPU core, i.e. 1,800k events/day = 20.8 Hz (the arithmetic is sketched below) • Expected rate of PbPb event data taking is > 60 Hz, so additional resources are needed • Separate data by streaming at DAQ • Express stream with duplicated events used for prompt calibration at Tier0 • Construct just one physics stream and reconstruct it promptly at Tier1 sites • Construct two exclusive physics streams: physics.prompt and physics.bulk • Prompt stream to be reconstructed promptly at Tier0 • Bulk stream reconstructed at Tier1s with possibly updated software
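The quoted Tier0 throughput is simple arithmetic; a quick check using only the numbers on this slide:

```python
# Tier0 throughput estimate using the numbers quoted above
# (45 CPU s/event, assumed <2 min/event Panda wall time, 2500 cores, ~100% efficiency).
cores = 2500
wall_s_per_event = 120.0                                    # <2 min/event assumption

events_per_hour_per_core = 3600.0 / wall_s_per_event        # 30 PbPb events/h/core
events_per_day = events_per_hour_per_core * cores * 24      # ~1,800k events/day
rate_hz = events_per_day / 86400.0                          # ~20.8 Hz

print(f"{events_per_hour_per_core:.0f} events/h/core, "
      f"{events_per_day / 1e3:.0f}k events/day, {rate_hz:.1f} Hz")

# Expected PbPb data-taking rate is > 60 Hz, so Tier0 alone is roughly 3x short.
print(f"deficit factor vs 60 Hz: {60.0 / rate_hz:.1f}x")
```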
Data Distribution • Resources available • General ATLAS disk resources on the Grid • Group disk resource for the PHYS-HI group • sites: BNL, CERN, CYFRONET, WEIZMANN • Total promised: 115 TB, used 15 TB (so far) • Data formats and sizes (a simple storage-budget sketch follows this slide) • RAW: 2.2 MB/event • ESD: 5.5 MB/event (and possible dESD with reduced size) • D3PD: 4.0 MB/event (measured with truth; in development, so the size will change) • Distribution strategy • General ATLAS disk will be used to store RAW and ESD files from official production • Some RAW and ESD may be partially replicated to PHYS-HI group space for group tests • D3PDs are expected to be stored on PHYS-HI resources • The number of versions/copies will have to be adjusted to the available resources • Problems? New CYFRONET storage hardware installation delays
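Given the per-event sizes above, disk needs scale linearly with the number of events kept; a small budgeting sketch (the 10M-event sample size is an illustrative assumption, not a number from this slide):

```python
# Per-event sizes quoted above, in MB.
SIZES_MB = {"RAW": 2.2, "ESD": 5.5, "D3PD": 4.0}

GROUP_QUOTA_TB = 115            # promised PHYS-HI group disk
N_EVENTS = 10_000_000           # illustrative sample size (assumption)

for fmt, mb_per_event in SIZES_MB.items():
    tb = mb_per_event * N_EVENTS / 1e6      # 1 TB ~ 1e6 MB in decimal units
    print(f"{fmt}: {tb:6.1f} TB for {N_EVENTS:,} events")

# Only D3PDs are planned for group space; RAW and ESD go to general ATLAS disk.
d3pd_tb = SIZES_MB["D3PD"] * N_EVENTS / 1e6
print(f"D3PD versions/copies fitting in {GROUP_QUOTA_TB} TB: {GROUP_QUOTA_TB // d3pd_tb:.0f}")
```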
Software Readiness • Heavy Ion reconstruction code • Mostly ready, expect minor changes • D3PD making in full development • Validation available and running • HIValidation package running tests and comparing to reference • HIInDetValidation package testing HI-mode tracking • Tests on MC and data in RecJobTransforms • We need more tests in the Tier0 configuration • Production scripts • Mostly ready, solutions using the Reco_trf.py command • HeavyIonD3PDMaker ready for the interface in Reco_trf.py (David Cote) • Tested in private production with standard monitoring and some special ones for trigger and tracking validation • Athena release schedule • End of October: Tier0 release changes (from the reprocessing schedule) • we need to have all updates in by that time
Issues & Plans • PHYS-HI group has no access to the CAF area • we may want to run calibration (tests) outside Tier0 • we may want to run reconstruction tests on recent raw data • we shall ask for access and disk quota • Group AFS space at CERN? • for doing common development • for keeping a group software release? • Group releases and official group production? • probably not planned for this year • Panda group queues at sites with group space • I would like to request such queues with higher priorities for phys-hi group production at CYFRONET and WEIZMANN • Could be used for increased efficiency in group production of D3PDs with the latest versions of software and for (group/user) analysis jobs
CPU/Mem in rel. 16 reconstruction • Rel. 16.0.0 with tracking from min pT = 0.5 GeV increased memory consumption by 500 MB! • Trying tracking with min pT = 1.0 GeV still did not reduce memory use enough to prevent some jobs from crashing
CPU/Mem in rel. 16 reconstruction • nohialgs: only standard algorithms, but in heavy ion mode • notrigger: standard + heavy ion algorithms • nomonitor: standard + heavy ion + trigger algorithms • full: all algorithms with monitoring