CMS Monte Carlo Production in LCG
J. Caballero, J.M. Hernández, P. García-Abia (CIEMAT), CMS Collaboration
Computing in High Energy and Nuclear Physics, T.I.F.R., Mumbai, India, 13-17 February 2006
Outline
• Introduction
• Monte Carlo production framework:
  • Data tiers, metadata attachment, publication of data
  • Production workflow
• First experiences
• Improvements to production:
  • Output ZIP archives, treatment of pile-up, local software installation
• Production operations:
  • Efficiency, problems
• Migration to LFC
• The new MC production system
• Conclusions
Introduction
• Monte Carlo (MC) production is crucial for detector studies and physics analysis
• Event simulation and reconstruction are typically done in the computing farms of CMS institutions
• Porting production to LCG gives access to a large amount of computing, storage and network resources
• MC simulation was previously run on a dedicated LCG-0/LCG-1 testbed:
  • Small-scale production
  • Low efficiency: RLS, site configuration
• We introduced novel concepts that made it possible to run the full production chain on LCG, from event generation to the publication of data for analysis
• We coupled production to the CMS data transfer system (PhEDEx) and made the tools more robust
• Important implications for the design of the new production framework
Introduction II
• The CMS event data model (EDM) and the MC production framework are somewhat monolithic and not well suited to a Grid environment:
  • Lack of modularity
• The Grid provides only basic services:
  • Reliability, stability and flexibility are important issues
• We identified the main weak points of LCG and made the production framework more robust:
  • Running production efficiently in LCG is manpower intensive
  • Availability of resources and responsiveness of the local administrators are crucial
• Code development and testing, and the running of production in LCG, were done by ~1.5 FTE at CIEMAT
Production framework
• The basic unit in MC production is the dataset:
  • a given physics process with a well-defined set of parameters
• The production chain is:
  • generation of events, simulation (hits), digitization (digis) and reconstruction (DST)
  • these are called data tiers or steps
  • owner: a data tier with defined geometry, software version and pile-up (PU) sample
• Detector and physics groups request events of a specific dataset/owner pair
• For practical reasons, requests are split into small assignments composed of a number of runs (~1000 events each)
• The data relevant to production (requests, owner/dataset, assignments, runs, data attributes) are kept in a global database (RefDB)
• The MC production framework is McRunjob, a Python application developed at FNAL and long used for local farm production
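The request/assignment/run bookkeeping above can be sketched in Python; the entity and field names here are illustrative, not the actual RefDB schema:

```python
# Hypothetical sketch of the RefDB bookkeeping entities: a request for a
# dataset/owner pair is split into assignments, each covering a number
# of runs of ~1000 events.

from dataclasses import dataclass, field

@dataclass
class Assignment:
    assignment_id: int
    runs: list  # run numbers; each run holds ~1000 events

@dataclass
class Request:
    dataset: str          # physics process with a well-defined set of parameters
    owner: str            # data tier + geometry + software version + PU sample
    n_events: int
    assignments: list = field(default_factory=list)

def split_request(req, events_per_run=1000, runs_per_assignment=50):
    """Split a request into assignments of consecutive runs."""
    n_runs = -(-req.n_events // events_per_run)  # ceiling division
    runs = list(range(1, n_runs + 1))
    for i, start in enumerate(range(0, n_runs, runs_per_assignment)):
        req.assignments.append(
            Assignment(i + 1, runs[start:start + runs_per_assignment]))
    return req

req = split_request(Request("ttbar_inclusive", "sim_geom1_pu761", 100_000))
```

With 100k events at 1000 events/run and 50 runs per assignment, this yields 100 runs grouped into 2 assignments.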
Data tiers
• Generation:
  • no input, small output (10 to 50 MB ntuples)
  • pure CPU: a few minutes, up to a few hours if hard filtering is applied
• Simulation (hits): GEANT4
  • small input
  • CPU and memory intensive: 24 to 48 hours
  • large output: ~500 MB in three files (EVD files); the smallest is ~100 KB!
• Digitization:
  • lower CPU/memory requirements: 5 to 10 hours
  • I/O intensive: persistent reading of PU over the LAN
  • large output: similar to simulation
• Reconstruction:
  • even less CPU: ~5 hours
  • smaller output: ~200 MB in two files
Event metadata attachment
• To run the digitization step, event metadata must be generated for the whole collection of simulated events
• When running reconstruction, the metadata of both the simulated and the digitized events are required
• The generation of metadata (metadata attachment) needs direct access to the event files, which is not suitable for distributed systems:
  • the job output is potentially distributed among several Storage Elements with no POSIX-like I/O access
• Metadata attachment was the main show-stopper for porting the MC production system to LCG:
  • lack of modularity (atomicity) in the old EDM
• We introduced the concept of atomic attachment:
  • metadata attachment is done on the Worker Node for the run being processed
  • negligible overhead: the EVD files are already in the working area
Publication of data
• We coupled production in LCG to the CMS data transfer system:
  • PhEDEx is used to collect event files at the T1/T2 sites that host data for analysis
  • However, data handling for the intermediate steps is not done by PhEDEx (one of the main problems in production)
• For each owner/dataset, a global metadata attachment is performed:
  • metadata and local XML POOL file catalogs are produced and published in the data location system (global RefDB and local data location DB, PubDB)
• Analysis tools inspect RefDB/PubDB for data discovery:
  • analysis jobs are submitted to the appropriate T1/T2
Production workflow
• Job preparation:
  • McRunjob downloads assignment information from RefDB:
    • list of runs, job templates, application data-cards, input file specification, input/output virgin metadata
  • Jobs are created for each run using the templates:
    • application scripts
    • JDL file with grid requirements (CPU, memory, SW tags, site...)
    • wrapper script: the specifics needed to let the job run on an LCG WN
• Jobs are submitted to an LCG CE using the JDL
• At runtime on the WN:
  • input files are downloaded from the SE
  • metadata are generated for the input event files
  • after the application runs, the output EVDs are copied to the SE
• The summary file is returned in the output sandbox and sent to RefDB from the UI for validation of the job
  • originally, the application output/error was also returned
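The per-run JDL generation from a template can be sketched as below; the template fields are illustrative and differ from the actual McRunjob templates:

```python
# Hypothetical sketch of per-run JDL creation from a template. The
# attribute values (file names, software tag) are illustrative only.

JDL_TEMPLATE = """\
Executable = "wrapper.sh";
Arguments = "{run}";
InputSandbox = {{"wrapper.sh", "run_{run}.xml"}};
OutputSandbox = {{"run_{run}.summary"}};
Requirements = Member("{sw_tag}", other.GlueHostApplicationSoftwareRunTimeEnvironment);
"""

def make_jdl(run, sw_tag="VO-cms-CMSSW"):
    """Fill in the JDL template for one run."""
    return JDL_TEMPLATE.format(run=run, sw_tag=sw_tag)

jdl = make_jdl(42)
```

One JDL file per run keeps submission simple: each run is an independent grid job whose requirements (software tag, site) are evaluated by the resource broker.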
First experiences
• The first experiences were disappointing:
  • extremely high submission time (very low rate)
  • very low job efficiency
  • job retrieval time too high: huge output
• Failure causes:
  • local configuration problems: unavailability of CMS software (installation problems), NFS
  • instability of the RLS global catalog
  • problems staging files in/out: weak staging procedure, unreliable copies from/to the SE
  • poor error reporting from the application: hard to automate job resubmission, typically done after visual inspection of the logs
  • real-time monitoring unavailable
Improvements to production
• We introduced new ideas in the production system to make it more robust
• Output/error files of the application removed from the output sandbox:
  • size largely reduced: significant improvement in the job retrieval rate
• Virgin metadata and the XML POOL catalog of the job removed from the input sandbox (size reduced to 10 KB):
  • stored in several SEs at job submission time (an atomic operation) to improve their availability
  • significant improvement in the job submission rate
• More robust stage in/out procedure:
  • failing input/output operations to/from the SE are retried several times (with a delay) to ride out temporary access problems to the SE/RLS
  • the copy of the job output is tried on several SEs if one fails
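The retry-with-delay stage-out logic can be sketched as follows; the `copy_func` helper that performs the actual transfer is hypothetical:

```python
# Sketch of the robust stage-out procedure: retry each copy a few
# times with a delay, then fall back to the next SE in the list.
# copy_func(filename, se) is a hypothetical helper that raises on failure.

import time

def robust_stage_out(copy_func, filename, se_list, retries=3, delay=30):
    """Try each SE in turn; retry each one several times with a delay."""
    for se in se_list:
        for attempt in range(retries):
            try:
                copy_func(filename, se)
                return se  # success: report which SE holds the file
            except OSError:
                time.sleep(delay)  # ride out temporary SE/catalog glitches
    raise RuntimeError("could not stage out %s to any SE" % filename)
```

The same pattern applies to stage-in: the delay absorbs transient SE/RLS hiccups, and the SE fallback handles a site that is down outright.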
Output ZIP archives
• At job completion, the output EVD files are packed into a ZIP archive (without compression) together with other important files:
  • checksums of the EVD files, the XML POOL catalog fragment of the output files, the summary file, and the output and error files of the application
• Just one big file is copied to the SE instead of several EVD files
  • one of the EVDs is only 100 KB in size (very bad for MSS performance)
• CMS applications can read files inside uncompressed ZIP archives (without unpacking them)
• Zipping had implications for the job preparation of subsequent steps:
  • we instrumented the job wrapper to deal properly with ZIPs
• ...and for the publication of data:
  • the publication tool, CMSGLIDE (M.A. Afaq, FNAL), was modified to create XML POOL catalogs and attached metadata for production ZIPs
  • global metadata attachment is done using the ZIPs (without unzipping)
• Zipping of EVDs has been widely adopted in CMS: EVDs produced in local farms are merged into 2 GB files
• Great benefits for PhEDEx and MSS: far fewer, much larger files
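Uncompressed archives of this kind can be produced with Python's standard `zipfile` module using `ZIP_STORED`; the function and file names here are illustrative:

```python
# Sketch of packing the job output into one uncompressed ZIP archive,
# as described above. File names are illustrative.

import os
import zipfile

def pack_job_output(archive, evd_files, extras):
    """Pack EVD files plus catalogs/checksums/summaries into one stored ZIP."""
    with zipfile.ZipFile(archive, "w", compression=zipfile.ZIP_STORED) as z:
        for f in evd_files + extras:
            # ZIP_STORED = no compression, so members remain readable in place
            z.write(f, arcname=os.path.basename(f))
```

Storing rather than compressing is what lets the applications read the EVD files inside the archive directly, while still turning many small files into one large one for the SE and MSS.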
Treatment of pile-up
• Proper simulation requires superimposing events from inelastic pp interactions on the events of the simulated physics processes
• Large pile-up (PU) samples prepared at CERN: about 100 GB
• Local farm: PU installed locally and made accessible to the jobs at runtime via POSIX-like I/O (rfio, dcap)
• LCG: the PU sample, EVDs (zipped) and metadata are transferred with PhEDEx to the T1/T2 sites that will run digitization/reconstruction:
  • the XML POOL catalog of the PU, with site-dependent PFNs/protocol, is placed in a standard location
  • an LCG software tag for the PU is published in the grid information system and used as a requirement in the JDL of the jobs
  • at runtime, the job wrapper merges the PU catalog with that of the job
• This simple (novel) implementation has been crucial for running digitization and reconstruction jobs in LCG
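The catalog-merging step performed by the job wrapper can be sketched with the standard ElementTree module, assuming the usual `<POOLFILECATALOG>` layout with one `<File>` entry per file:

```python
# Sketch of merging the site-local pile-up POOL XML catalog into the
# job's catalog, assuming a <POOLFILECATALOG> root with <File> entries.

import xml.etree.ElementTree as ET

def merge_catalogs(job_catalog, pu_catalog, out):
    """Append the pile-up <File> entries to the job's POOL XML catalog."""
    job = ET.parse(job_catalog)
    root = job.getroot()
    for entry in ET.parse(pu_catalog).getroot().findall("File"):
        root.append(entry)  # PU entries carry the site-dependent PFNs
    job.write(out)
```

Because the PU catalog already contains the site-dependent PFNs and protocol, the merged catalog lets the application resolve pile-up files locally with no further lookup.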
Other interesting ideas
• Local CMS software installation at runtime:
  • we instrumented the job wrapper to install the CMS software in the working area of the job
  • little overhead: the software is downloaded from the SE
  • avoids NFS problems, software installation problems and black holes
  • suitable for running at sites with little or no support for CMS
• Local pile-up installation at runtime (à la ATLAS):
  • store and replicate the PU sample in several SEs
  • download a (random) fraction of the PU sample
  • generate metadata for the PU runs downloaded
• An experimental version exists but is not used for physics:
  • important to determine the number of events required to have minimal or no impact on physics
  • need to study the tradeoff between local access and downloading of files (LAN)
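The "download a random fraction of the PU sample" step can be sketched as a seeded random selection; the function name and parameters are illustrative:

```python
# Sketch of choosing a random fraction of the replicated pile-up runs
# to download to the worker node (the download itself is not shown).

import random

def pick_pu_runs(all_runs, fraction, seed=None):
    """Choose a random subset of pile-up runs covering the given fraction."""
    rng = random.Random(seed)  # seedable, so a job's choice is reproducible
    k = max(1, round(len(all_runs) * fraction))
    return rng.sample(all_runs, k)
```

The open question noted above is precisely how large `fraction` must be before the reduced pile-up variety has no measurable impact on physics.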
Production operations
• Production in LCG started slowly one year ago with reduced manpower:
  • development/testing of the McRunjob-LCG software and production operations were done for a long time by ~1.5 FTE
  • other production operators joined the effort a few months later
• The number of events (in millions) produced in LCG per data tier:
  • 13.1 generated, 11.7 simulated, 11.4 digitized, 5.1 reconstructed
[Plots: cumulative produced events (up to ~14 M); simulation 11/04 to 02/06, digitization 06/05 to 02/06]
Production operations II
• We decided to use white lists due to grid/site unreliability:
  • sites selected for their size and robustness
  • big sites still running production in local farm mode (FZK, IN2P3)
  • local administrators providing fast responses to fix problems
  • availability of PU
• This represents a fraction (~30%) of the production in LCG
• No proper bookkeeping in the initial phase of production
[Plots: job statistics for 9800 digitization and reconstruction jobs and 4275 simulation jobs]
Efficiency and failures
• Rather low efficiency
• Stage in/out and catalog (RLS) related problems
• LCG and site problems: RB, CE
Examples of problems
• Lack of automatic monitoring/resubmission
• Lack of coupling to the CMS data management system (pre-staging)
• Temporary grid and site problems: CE, SE, RLS
• Lack of manpower
• Organization: lack of dedicated resources (PU)
• Lack of priorities: competition with CMS analysis and other experiments' jobs
Migration to LFC
• CMS recently migrated from RLS to LFC as the global file catalog for LCG (thanks to S. Lemaitre, A. Sciabà, J. Casey)
• We adapted McRunjob to use LFC instead of RLS
• So far, a small fraction of the LCG production has been done using LFC
• Very satisfactory results compared to RLS
Performance of production (LFC)
• Significant improvement in performance when using LFC as the global catalog
• A bunch of jobs died due to an unscheduled power cut
New MC production system
• Expert system (ProdAgent)
• Automatic data merging step
• Job chaining
• Coupled to the Data Management System
• New EDM (no metadata attachment)
• Improved monitoring
• Better error handling
Conclusions
• End-to-end production system in LCG
• Invaluable experience for the next-generation Monte Carlo production system
• Robustness is a very important issue given the current unreliability and instability of grids/sites