Applying a Virtual Data Catalog to CMS Monte Carlo Production
Rick Cavanaugh, Raj Rajmani, Jens Voeckler, Mike Wilde
GriPhyN All Hands Meeting, Oct. 15 and 16, ISI
Purpose
• Apply Virtual Data concepts/technologies to an actual application problem: CMS Monte Carlo production
• Expose possible architectural "pitfalls" by using a realistic case study
• Produce a simple demo for SC2001
• Useful for CMS: introduce fault tolerance via Condor/DAGMan
4 Steps in the CMS production pipeline
1. Generate simulated "truth" data for a physics event
2. Simulate the response of the CMS detector to the "truth" data; produces simulated "hit" data
3. Copy "flat" simulated "hit" data to an OODB format (Objectivity/DB)
4. Digitize the "hit" data into electronic signals
Steps 1 and 2: FORTRAN based; flat file format
Steps 3 and 4: C++ based (ORCA); OODBMS (Objectivity)
Generate Simulated "Truth" Data
Inputs: pythia.cards (~20 params); CMS environment variables (~10)
Executable: pythia.exe
Outputs: truth.ntpl; pythia.log
Generate Simulated Detector "Hits"
Inputs: truth.ntpl; cmsim.cards (~90 params); CMS environment variables (~10)
Executable: cmsim.exe
Outputs: hits.fz; cmsim.hbook; cmsim.log
Copy Flat "Hits" Data File to OODBMS
Inputs: hits.fz; .orcarc (~3 params); ORCA environment variables! (~25); Objectivity environment variables! (~20); CMS environment variables! (~10)
Executable: writeHits
Outputs: Federation (diagram labels: cmsim.fz, cmsim.fz, cmsim.fz, hits.DB); writeHits.log
Generate Electronic Digitization from Simulated "Hits" Data
Inputs: Federation (diagram labels: cmsim.fz, cmsim.fz, cmsim.fz, hits.DB); .orcarc (~30 params); Objectivity environment variables! (~20); ORCA environment variables (~15); pileup.DB (ignore for now)
Executable: writeDigis
Outputs: Federation (diagram labels: hits.DB, hits.DB, digis.DB); writeDigis.log
Full CMS production pipeline (no pileup):

Stage        CPU       Output       Data
pythia       2 min     truth.ntpl   0.5 MB
cmsim        8 hours   hits.fz      175 MB
writeHits    5 min     hits.DB      275 MB
writeDigis   45 min    digis.DB     105 MB

The chain is repeated once per run (1 run, 1 run, 1 run, ... 1 run).
1 run = 500 events
SC2001 Demo Version: 1 event
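The per-stage figures above can be added up to see where a run spends its time. The following is a minimal sketch, assuming the four stages run one after another and that every stage's output is kept on disk; the variable names are illustrative and not part of the CMS tools.

```python
# Per-stage figures copied from the slide: (stage, CPU minutes, output size in MB).
STAGES = [
    ("pythia", 2, 0.5),
    ("cmsim", 8 * 60, 175.0),
    ("writeHits", 5, 275.0),
    ("writeDigis", 45, 105.0),
]
EVENTS_PER_RUN = 500  # "1 run = 500 events"

# Assumption: stages run sequentially, so CPU times simply add.
total_cpu_min = sum(cpu for _, cpu, _ in STAGES)
# Assumption: all intermediate files are retained, so sizes simply add.
total_data_mb = sum(mb for _, _, mb in STAGES)

cmsim_share = dict((name, cpu) for name, cpu, _ in STAGES)["cmsim"] / total_cpu_min
cpu_sec_per_event = total_cpu_min * 60 / EVENTS_PER_RUN

print(f"CPU per run:   {total_cpu_min} min")
print(f"Data per run:  {total_data_mb} MB")
print(f"cmsim share:   {cmsim_share:.1%}")
print(f"CPU per event: {cpu_sec_per_event:.2f} s")
```

Under these assumptions the detector simulation (cmsim) accounts for roughly nine tenths of the CPU time of a run, which is why a failure in that stage is the most expensive one to repeat.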
• Define a logical pipeline structure using the Virtual Data Catalog
• Physically instantiate a particular CMS pipeline using the Replica Catalog
Define Logical Pipeline:

    begin v /usr/local/demo/scripts/cmkin_input.csh
      file i ntpl_file_path
      file i template_file
      file i num_events
      stdout cmkin_param_file
    end
    begin v /usr/local/demo/binaries/kine_make_ntpl_pyt_cms121.exe
      pre cms_env_var
      stdin cmkin_param_file
      stdout cmkin_log
      file o ntpl_file
    end
    begin v /usr/local/demo/scripts/cmsim_input.csh
      file i ntpl_file
      file i fz_file_path
      file i hbook_file_path
      file i num_trigs
      stdout cmsim_param_file
    end
    begin v /usr/local/demo/binaries/cms121.exe
      condor copy_to_spool=false
      condor getenv=true
      stdin cmsim_param_file
      stdout cmsim_log
      file o fz_file
      file o hbook_file
    end
    begin v /usr/local/demo/binaries/writeHits.sh
      condor getenv=true
      pre orca_hits
      file i fz_file
      file i detinput
      file i condor_writeHits_log
      file i oo_fd_boot
      file i datasetname
      stdout writeHits_log
      file o hits_db
    end
    begin v /usr/local/demo/binaries/writeDigis.sh
      pre orca_digis
      file i hits_db
      file i oo_fd_boot
      file i carf_input_dataset_name
      file i carf_output_dataset_name
      file i carf_input_owner
      file i carf_output_owner
      file i condor_writeDigis_log
      stdout writeDigis_log
      file o digis_db
    end

Transformations, in pipeline order: pythia_input, pythia.exe, cmsim_input, cmsim.exe, writeHits, writeDigis
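Because each transformation names its inputs and outputs as logical files, the ordering between transformations can be derived from the definitions themselves. The following is a minimal sketch, not the actual Virtual Data Catalog code: it assumes that `file i` and `stdin` mark inputs, that `file o` and `stdout` mark outputs, and that `pre` and `condor` lines can be ignored for dependency purposes. The function names are hypothetical.

```python
# Two of the six transformations from the slide, copied as sample input.
VDC_TEXT = """
begin v /usr/local/demo/binaries/kine_make_ntpl_pyt_cms121.exe
  pre cms_env_var
  stdin cmkin_param_file
  stdout cmkin_log
  file o ntpl_file
end
begin v /usr/local/demo/scripts/cmsim_input.csh
  file i ntpl_file
  file i fz_file_path
  file i hbook_file_path
  file i num_trigs
  stdout cmsim_param_file
end
"""


def parse_transformations(text):
    """Return one dict per begin/end block: executable, inputs, outputs."""
    transformations = []
    current = None
    for raw in text.splitlines():
        words = raw.split()
        if not words:
            continue
        if words[0] == "begin":
            # "begin v <executable>": the executable is the third word.
            current = {"exe": words[2], "inputs": [], "outputs": []}
        elif words[0] == "end":
            transformations.append(current)
            current = None
        elif current is None:
            continue  # text outside any block
        elif words[0] == "stdin" or (words[0] == "file" and words[1] == "i"):
            current["inputs"].append(words[-1])
        elif words[0] == "stdout" or (words[0] == "file" and words[1] == "o"):
            current["outputs"].append(words[-1])
        # "pre" and "condor" lines are deliberately skipped (assumption).
    return transformations


def dependencies(transformations):
    """Return (producer_exe, consumer_exe, logical_file) edges."""
    producer = {}
    for t in transformations:
        for name in t["outputs"]:
            producer[name] = t["exe"]
    edges = []
    for t in transformations:
        for name in t["inputs"]:
            if name in producer and producer[name] != t["exe"]:
                edges.append((producer[name], t["exe"], name))
    return edges


transformations = parse_transformations(VDC_TEXT)
edges = dependencies(transformations)
for parent, child, logical_file in edges:
    print(f"{parent} -> {child} via {logical_file}")
```

Inputs with no producer in the catalog (such as `num_trigs`) are treated as externally supplied and create no edge.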
Replica Catalog

Each entry has the form: rc <Logical File Name> <Physical File Name>

    #### log file locations
    rc condor_writeDigis_log /usr/local/demo/output/condor_writeDigis.log
    rc condor_writeHits_log /usr/local/demo/output/condor_writeHits.log
    rc writeDigis_log /usr/local/demo/output/writeDigis.log
    rc writeHits_log /usr/local/demo/output/writeHits.log
    rc cmsim_log /usr/local/demo/output/cmsim.log
    rc cmkin_log /usr/local/demo/output/cmkin.log
    #### environment variables
    rc cms_env_var /usr/local/demo/scripts/l1_topjets_1200061.cmkin.csh
    rc carf_input_dataset_name l1_topjets
    rc carf_output_dataset_name l1_topjets
    rc carf_input_owner hits
    rc carf_output_owner digis
    rc datasetname l1_topjets
    rc oo_fd_boot /grinraid/raid1/cavanaug/cms/databases/test2/ORCATEST.boot
    rc detinput /grintest/user0/cavanaug/cmsinabox/DAG/cms_geom_out.rz
    #### output files
    rc digis_db /grinraid/raid1/cavanaug/cms/databases/test2/EVD0_Digis.l1_topjets.digis.ORCATEST.DB
    rc hits_db /grinraid/raid1/cavanaug/cms/databases/test2/EVD0_Hits.l1_topjets.hits.ORCATEST.DB
    rc hbook_file /usr/local/demo/output/l1_topjets_1200061.hbook
    rc hbook_file_path /usr/local/demo/output/l1_topjets_1200061.hbook
    rc fz_file /usr/local/demo/output/l1_topjets_1200061.fz
    rc fz_file_path /usr/local/demo/output/l1_topjets_1200061.fz
    rc ntpl_file /usr/local/demo/output/l1_topjets_1200061.ntpl
    rc ntpl_file_path /usr/local/demo/output/l1_topjets_1200061.ntpl
    #### input Parameter Files/Templates
    rc cmsim_param_file /usr/local/demo/scripts/l1_topjets_1200061.sim
    rc cmkin_param_file /usr/local/demo/scripts/l1_topjets_1200061.gen
    rc template_file /grintest/user0/cavanaug/cmsinabox/DAG/simulations/Templates/level1/l1_topjets_gen.tit
    #### misc variables in parameter files
    rc num_events 1
    rc num_trigs 1
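Instantiating the pipeline amounts to substituting each logical name with the physical value recorded in the Replica Catalog. The following is a minimal sketch of that lookup, assuming the `rc <logical> <physical>` line format and `####` comment lines shown above; `load_replica_catalog` and `resolve` are hypothetical helper names, not part of any GriPhyN tool.

```python
# A few entries copied from the slide as sample input.
RC_TEXT = """
#### log file locations
rc cmkin_log /usr/local/demo/output/cmkin.log
#### output files
rc ntpl_file /usr/local/demo/output/l1_topjets_1200061.ntpl
#### input Parameter Files/Templates
rc cmkin_param_file /usr/local/demo/scripts/l1_topjets_1200061.gen
#### misc variables in parameter files
rc num_events 1
"""


def load_replica_catalog(text):
    """Map each logical name to its physical value (a path or a literal)."""
    catalog = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank line or section comment
        keyword, logical, physical = line.split(None, 2)
        if keyword != "rc":
            raise ValueError(f"unexpected entry: {line!r}")
        catalog[logical] = physical
    return catalog


def resolve(names, catalog):
    """Return the physical values for a list of logical names."""
    missing = [name for name in names if name not in catalog]
    if missing:
        raise KeyError(f"no replica catalog entry for: {missing}")
    return [catalog[name] for name in names]


catalog = load_replica_catalog(RC_TEXT)
stdin_path, stdout_path, ntpl_path = resolve(
    ["cmkin_param_file", "cmkin_log", "ntpl_file"], catalog
)
print("Input     =", stdin_path)
print("Output    =", stdout_path)
print("Arguments =", ntpl_path)
```

Failing loudly on a missing entry is a design choice of this sketch: a logical name that cannot be resolved would otherwise surface only later, as a job that fails at run time.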
Condor/DAGMan: VD catalog generates physical DAG

Submit files referenced by the DAG: B.sub, C.sub, D.sub, E.sub, F.sub, G.sub

cms-pipeline.dag:

    Job B B.sub
    Job C C.sub
    Job D D.sub
    Job E E.sub
    Job F F.sub
    Job G G.sub
    Script PRE C /usr/local/demo/scripts/l1_topjets_1200061.cmkin.csh
    Script PRE F /usr/local/demo/scripts/orcarc_hits.csh
    Script PRE G /usr/local/demo/scripts/orcarc_digis.csh
    PARENT B CHILD C
    PARENT C CHILD D
    PARENT D CHILD E
    PARENT E CHILD F
    PARENT F CHILD G

For example, C.sub:

    #
    # Filename: C.sub
    # Transformation 2
    #
    Universe = vanilla
    Executable = /usr/local/demo/binaries/kine_make_ntpl_pyt_cms121.exe
    Log = full-pipeline.log
    Input = /usr/local/demo/scripts/l1_topjets_1200061.gen
    Output = /usr/local/demo/output/cmkin.log
    Arguments = /usr/local/demo/output/l1_topjets_1200061.ntpl
    Notification = NEVER
    Queue
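The DAG file above is regular enough to be generated from an ordered job list. The following is a minimal sketch of such a generator for a strictly linear pipeline, written only to illustrate the file layout; it is not the catalog's actual generator, and `linear_dag` is a hypothetical name.

```python
def linear_dag(jobs, pre_scripts=None):
    """Build DAGMan file text for jobs that run strictly one after another.

    jobs        -- job names in execution order; each uses "<name>.sub"
    pre_scripts -- optional mapping of job name to its PRE script path
    """
    pre_scripts = pre_scripts or {}
    lines = [f"Job {name} {name}.sub" for name in jobs]
    lines += [
        f"Script PRE {name} {pre_scripts[name]}"
        for name in jobs
        if name in pre_scripts
    ]
    # Each job is the parent of the job that follows it in the list.
    lines += [f"PARENT {parent} CHILD {child}" for parent, child in zip(jobs, jobs[1:])]
    return "\n".join(lines) + "\n"


dag_text = linear_dag(
    ["B", "C", "D", "E", "F", "G"],
    pre_scripts={
        "C": "/usr/local/demo/scripts/l1_topjets_1200061.cmkin.csh",
        "F": "/usr/local/demo/scripts/orcarc_hits.csh",
        "G": "/usr/local/demo/scripts/orcarc_digis.csh",
    },
)
print(dag_text, end="")
```

With the job names and PRE scripts taken from the slide, the generated text matches the cms-pipeline.dag listing line for line. A pipeline with branches would need explicit parent/child pairs rather than an ordered list.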