Management of Large Scale Data Productions for the CMS Experiment
Presented by L.M.Barone, Università di Roma & INFN
ACAT2000 - FNAL
The Framework
• The CMS experiment is producing a large amount of MC data for the development of High Level Trigger (HLT) algorithms for fast data reduction at LHC
• Current production is half traditional (Pythia + CMSIM/Geant3) and half OO (ORCA using Objectivity/DB)
The Problem
We are dealing with actual MC productions, not with 2005 data taking
• Data size: ~10^6 - 10^7 events at 1 MB/ev, i.e. ~10^4 files (typically 500 events/file)
• Resource dispersion: many production sites (CERN, FNAL, Caltech, INFN, etc.)
The Problem (cont'd)
• Data relocation: data produced at site A are stored centrally (at CERN); site B may need a fraction of them; the combinatorics keep increasing
• Objectivity/DB does not make life easier (but the problem would exist anyway)
[Diagram: ORCA Production 2000 data flow. HEPEVT ntuples → CMSIM MC production → signal Zebra files with HITS → ORCA ooHit formatter → Objectivity databases (catalog import) → ORCA digitization (merging signal and minimum bias) → HLT algorithms writing new reconstructed objects into the HLT group databases, mirrored to the US, Russia, Italy, etc.]
The Old Days
• Question: how was it done before? A mix of ad hoc scripts and programs with a lot of manual intervention... but the problem was smaller and less dispersed
Requirements for a Solution
• The solution must be as automatic as possible, to decrease the manpower needed
• Tools should be independent of data type and of site
• Network traffic should be optimized (or minimized?)
• Users need complete information on data location
Present Status
• Job creation is managed by a variety of scripts at different sites
• Job submission again goes through diverse methods, from plain UNIX commands to LSF or Condor
• File transfer has been managed up to now by Perl scripts: not generic, not site independent
Present Status (cont'd)
• The autumn 2000 production round is a trial towards standardization:
  • same layout (OS, installation)
  • same scripts (T.Wildish) for non-Objectivity data transfer
  • first use of GRID tools (see talk by A.Samar)
  • validation procedure for production sites
Collateral Activities
• Linux + CMS software automatic installation kit (INFN)
• Globus installation kit (INFN)
• Production monitoring tools with Web interface
What is missing?
• Scripts and tools are still too specific and not robust enough: we need practice on this scale
• The information service needs a clear definition in our context and then an effective implementation (see later)
• File replication management is just appearing and needs careful evaluation
Ideas for Replica Management
• A case study with Objectivity/DB (thanks to C.Grandi, Bologna, INFN)
• It can be extended to any kind of file
Cloning federations
• Cloned federations have a local catalog (boot file)
• Each of them can be managed independently: some databases may be attached (or exist) only at one site
• "Manual work" is needed to keep the schemas synchronized (this is not the key point today...)
[Diagram: cloning federations. The CERN federation (CERN Boot, CERN FD, DB1...DBn, DB_b) is cloned into regional-centre federations RC1 (RC1 Boot, RC1 FD, DB_a) and RC2 (RC2 Boot, RC2 FD), each with its own boot file and catalog]
Productions
• Using a DB-id pre-allocation system, databases can be produced at the RCs and then exported to other sites (see the sketch below)
• A notification system is needed to inform other sites when a database is completed
• This is accomplished today by GDMP, using a publish-subscribe mechanism
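The following sketch is not part of the original slides; it only illustrates the idea of DB-id pre-allocation, assuming a central allocator hands each production site a disjoint range of Objectivity database ids so that databases produced independently never collide. Class and site names are illustrative.

```python
# Hypothetical DB-id pre-allocation: each regional centre (RC) receives a
# disjoint block of database ids, so databases produced independently can
# later be attached to any federation without id clashes.

class DBIdAllocator:
    def __init__(self, first_id=1, block_size=1000):
        self.next_id = first_id          # next unassigned DB id
        self.block_size = block_size     # ids handed out per site
        self.blocks = {}                 # site name -> (first, last)

    def allocate_block(self, site):
        """Reserve a block of DB ids for a production site."""
        first = self.next_id
        last = first + self.block_size - 1
        self.blocks[site] = (first, last)
        self.next_id = last + 1
        return first, last

# Example: central allocation for three production sites
allocator = DBIdAllocator()
for site in ("CERN", "RC1", "RC2"):
    first, last = allocator.allocate_block(site)
    print(f"{site}: DB ids {first}-{last}")
```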
Productions (cont'd)
• When a site receives the notification, it can (see the handler sketch below):
  • ooattachdb the database from the remote site
  • copy the database and ooattachdb it locally
  • ignore it
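A minimal sketch, not from the original slides, of how a site could act on a GDMP-style completion notice. The attach_db and copy_db helpers are stand-ins for the real ooattachdb command and wide-area transfer machinery; the decision policy is purely illustrative.

```python
# Hypothetical handler for a "database completed" notification.
import shutil

def attach_db(federation_boot, db_path):
    # Placeholder for the real Objectivity ooattachdb invocation.
    print(f"ooattachdb {db_path} to federation {federation_boot}")

def copy_db(remote_path, local_path):
    shutil.copy(remote_path, local_path)   # in reality a WAN transfer (e.g. GDMP)
    return local_path

def on_db_completed(notice, federation_boot, wanted_datasets, local_dir):
    """notice: dict with 'dataset', 'db_name', 'remote_path', optional 'access'."""
    if notice["dataset"] not in wanted_datasets:
        return "ignored"                                      # option 3: ignore it
    if notice.get("access") == "remote-ok":
        attach_db(federation_boot, notice["remote_path"])     # option 1: attach the remote DB
        return "attached-remote"
    local = copy_db(notice["remote_path"], f"{local_dir}/{notice['db_name']}")
    attach_db(federation_boot, local)                         # option 2: copy, then attach locally
    return "copied-and-attached"
```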
[Diagram: production with pre-allocated DB ids. The CERN federation (CERN Boot, CERN FD) holds DB1...DBn; RC1 (RC1 Boot, RC1 FD) produces DBn+1...DBn+m; RC2 (RC2 Boot, RC2 FD) produces DBn+m+1...DBn+m+k]
Analysis
• Each site needs a complete catalog with the location of all the datasets; some DBs are local and some are remote
• When several copies of a DB are available, it would be nice to have the closest one in the local catalog (NWS)
Information service
• Create an Information Service (IS) with information about all the replicas of the databases (a GIS?)
• At each RC there is a reference catalog, which is updated taking the available replicas into account (see the sketch below)
• It is even possible to create a catalog on the fly, containing only the datasets needed by a job
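Not part of the original slides: a minimal sketch of how a regional centre's reference catalog could be refreshed from such an Information Service, keeping for each DB the "closest" replica according to some network cost estimate (an NWS-style measurement is assumed). The IS query format and the cost function are hypothetical.

```python
# Hypothetical refresh of a site's reference catalog from the Information
# Service: for every logical DB, keep the replica with the lowest estimated
# network cost from this site.

def network_cost(site, replica_host):
    """Stand-in for an NWS-like estimate of the transfer cost site -> host."""
    return 0.0 if replica_host.endswith(site) else 1.0   # trivially prefer local copies

def refresh_reference_catalog(information_service, site):
    """information_service: dict DBid -> list of 'host::/path' replica strings."""
    catalog = {}
    for dbid, replicas in information_service.items():
        best = min(replicas, key=lambda r: network_cost(site, r.split("::")[0]))
        catalog[dbid] = best
    return catalog

# Example with two replicas of DB 12345
is_snapshot = {12345: ["shift23.cern.ch::/db45/Hmm1.hits.DB",
                       "pccms1.bo.infn.it::/data1/Hmm1.hits.DB"]}
print(refresh_reference_catalog(is_snapshot, "bo.infn.it"))
```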
[Diagram: analysis view. The CERN catalog (CERN Boot, CERN FD) lists DB1...DBn together with the RC-produced DBn+1...DBn+m and DBn+m+1...DBn+m+k; the RC1 and RC2 catalogs (RC1 Boot/FD, RC2 Boot/FD) reference both their local databases and the remote ones]
Logical vs Physical Datasets
• Each dataset is composed of one or more databases
  • datasets are managed by the application software
• Each DB is uniquely identified by a DBid
  • DBid assignment is the logical-db creation
• The physical-db is the file
  • zero, one or more instances may exist
• The IS manages the link between a dataset, its logical-dbs and its physical-dbs (see the example and sketch below)
Logical vs Physical Datasets (example)
Dataset: H2
• Hmm.1.hits.DB (id=12345): pccms1.bo.infn.it::/data1/Hmm1.hits.DB, shift23.cern.ch::/db45/Hmm1.hits.DB
• Hmm.2.hits.DB (id=12346): pccms1.bo.infn.it::/data1/Hmm2.hits.DB, shift23.cern.ch::/db45/Hmm2.hits.DB, pccms3.pd.infn.it::/data3/Hmm2.hits.DB
• Hmm.3.hits.DB (id=12347): shift23.cern.ch::/db45/Hmm3.hits.DB
Dataset: H2e
• Hee.1.hits.DB (id=5678): pccms5.roma1.infn.it::/data/Hee1.hits.DB, shift49.cern.ch::/db123/Hee1.hits.DB
• Hee.2.hits.DB (id=5679): pccms5.roma1.infn.it::/data/Hee2.hits.DB, shift49.cern.ch::/db123/Hee2.hits.DB
• Hee.3.hits.DB (id=5680): pccms5.roma1.infn.it::/data/Hee3.hits.DB, shift49.cern.ch::/db123/Hee3.hits.DB
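A minimal sketch, not from the original slides, of the dataset → logical-db → physical-db link that the IS would manage, populated with part of the example above; the class names are illustrative.

```python
# Hypothetical data model for the IS: a dataset groups logical DBs (one per
# DBid), and each logical DB has zero or more physical replicas ("host::/path").
from dataclasses import dataclass, field

@dataclass
class LogicalDB:
    dbid: int                                       # unique DB id
    name: str                                       # e.g. "Hmm.1.hits.DB"
    replicas: list = field(default_factory=list)    # physical copies

@dataclass
class Dataset:
    name: str
    logical_dbs: list = field(default_factory=list)

h2 = Dataset("H2", [
    LogicalDB(12345, "Hmm.1.hits.DB",
              ["pccms1.bo.infn.it::/data1/Hmm1.hits.DB",
               "shift23.cern.ch::/db45/Hmm1.hits.DB"]),
    LogicalDB(12347, "Hmm.3.hits.DB",
              ["shift23.cern.ch::/db45/Hmm3.hits.DB"]),
])
```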
Database creation
• In each production site we have:
  • a production federation, including incomplete databases
  • a reference federation with only complete databases (both local and remote ones)
• When a DB is completed it is attached to the site reference federation
• The IS monitors the reference federations of all the sites and updates the database list (see the sketch below)
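The sketch below is not from the original slides; it illustrates, under assumed helper names, the step where a completed database is attached to the site's reference federation and its location is published so the IS can update the global database list.

```python
# Hypothetical "database completed" step at a production site.

def attach_db(federation_boot, db_path):
    print(f"ooattachdb {db_path} -> {federation_boot}")    # placeholder for the real command

def publish_to_is(information_service, dbid, location):
    information_service.setdefault(dbid, []).append(location)

def on_local_db_complete(dbid, db_path, site_host, ref_boot, information_service):
    attach_db(ref_boot, db_path)                            # reference FD holds only complete DBs
    publish_to_is(information_service, dbid, f"{site_host}::{db_path}")

# Example: DB5 finishes at pc.rc1.net and is announced to the IS
is_registry = {}
on_local_db_complete(5, "/pc/data/DB5.DB", "pc.rc1.net", "/pc/rc1_ref.boot", is_registry)
print(is_registry)   # {5: ['pc.rc1.net::/pc/data/DB5.DB']}
```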
[Diagram: database creation. DB4 and DB5 are produced at pc.rc1.net in the RC1 production federation; once complete they are attached to the RC1 reference federation, and the catalogs gain entries 0004 DB4.DB and 0005 DB5.DB at pc.rc1.net::/pc/data alongside the CERN entries 0001-0003 (DB1.DB-DB3.DB at shift.cern.ch::/shift/data)]
Replica Management
• When multiple copies of the same DB exist, each site may choose which copy to use:
  • it should be possible to update the reference federation at given times
  • it should be possible to create on the fly a mini-catalog with information only about the datasets requested by a job (see the sketch below)
  • this kind of operation is managed by the application software (e.g. ORCA)
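Not part of the original slides: a minimal sketch of building such an on-the-fly mini-catalog for a single job, assuming the dataset → DBid mapping and the per-site reference catalog from the earlier sketches; the function name is illustrative.

```python
# Hypothetical on-the-fly mini-catalog: restrict the site reference catalog to
# the DB ids of the datasets a given job actually requests.

def build_job_catalog(requested_datasets, dataset_to_dbids, reference_catalog):
    """
    requested_datasets: e.g. ["H2"]
    dataset_to_dbids:   dict dataset name -> list of DB ids
    reference_catalog:  dict DBid -> chosen replica 'host::/path'
    """
    mini = {}
    for ds in requested_datasets:
        for dbid in dataset_to_dbids.get(ds, []):
            if dbid in reference_catalog:
                mini[dbid] = reference_catalog[dbid]
    return mini

# Example: a job that only analyses the H2 dataset
dbids = {"H2": [12345, 12346, 12347]}
ref = {12345: "pccms1.bo.infn.it::/data1/Hmm1.hits.DB",
       12347: "shift23.cern.ch::/db45/Hmm3.hits.DB",
       5678:  "shift49.cern.ch::/db123/Hee1.hits.DB"}
print(build_job_catalog(["H2"], dbids, ref))
```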
[Diagram: replica management. DB1, DB2 and DB3 exist at shift.cern.ch (CERN FD); copies of DB1 and DB2 also exist at pc1.bo.infn.it; the Bologna (BO Ref) and Padova (PD Ref) reference catalogs record, for each DB, which replica that site has chosen to use]
Summary of the Case Study
• The basic functionalities of a Replica Manager for production are already implemented in GDMP
• The use of an Information Service would allow easy synchronization of federations and optimized data access during analysis
• The same functionalities offered by the Objectivity/DB catalog may be implemented for other kinds of files
Conclusions (?)
Globus and the various GRID projects try to address the issue of large-scale distributed data access.
Their effectiveness is still to be proven.
The problem, again, is not the software: it is the organization.