A proposal: from CDR to CDH
Paolo Valente – INFN Roma [Acknowledgements to A. Di Girolamo]
NA62 collaboration meeting
Requirements/1
1. Execute each operation [transfer, reconstruction, …]
2. Log operations and errors
• Execute/launch the transfer/reconstruction operations
• Typically done with a set of scripts, partly running as daemons, partly controlled by an operator
• The [adapted] NA48 CDR and the [adapted] COMPASS CDR were used in 2007-2008 and during the technical run
• Logging and [in some cases] error recovery
Requirements/2
1. Execute each operation, controlling the sequence of all steps
2. Record every operation, keeping a catalog of all files and the relative operations on them
• Not only execute operations, but also know their status, recognize success/failure, handle anomalies, interface with the operator, …
• Know and control the sequence of operations
• Handle/notify the status of the “sequence”
3. Monitor/display the status of the entire process, following each element during its lifetime
Central Data Recording → Central Data Handler
States and Transitions
• The atomic unit is the burst (e.g. burst 99923-0000, RAW file cdr00099923-0000)
• A burst is connected to a sequence of operations to be performed:
   • First of all, generation of the RAW file
   • From the RAW file, a number of tasks involving generation of other files or file transfers
• Each operation is a transition from one state to another:
   • RAW_on_farm_disk → RAW_file_on_disk_pool → RAW_on_tape
   • RAW_generated → RECO-1_generated → THIN-1_generated → …
• An “operation” is performed for each transition: for RAW → RECO-1, the reconstruction pass-1 has to be executed; for file transfers, the appropriate copy or remote copy command (see the sketch below)
• A new entry has to be created in the file catalog for each transition
• Essentially, 2 kinds of transition:
   • File generation
   • File transfer
[Diagram: transitions and operations — RAW → RECO (Reconstruction), RAW → RAW (Filter), RECO → THIN (Thinning), RECO → RECO (Split), THIN → NTUP (MakeNtup)]
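As a minimal sketch (names and structure are assumptions for illustration, not the actual NA62 code), the allowed transitions and the operation attached to each could be expressed as a lookup table:

```python
# Hypothetical sketch: allowed state transitions of a burst and the
# operation attached to each. All names are illustrative.
TRANSITIONS = {
    # (from_state, to_state): operation to launch
    ("RAW_on_farm_disk", "RAW_on_disk_pool"): "transfer_to_disk_pool",
    ("RAW_on_disk_pool", "RAW_on_tape"):      "transfer_to_tape",
    ("RAW_generated",    "RECO-1_generated"): "reconstruction_pass_1",
    ("RECO-1_generated", "THIN-1_generated"): "thinning",
}

def operation_for(from_state: str, to_state: str) -> str:
    """Return the operation that realizes a given transition."""
    try:
        return TRANSITIONS[(from_state, to_state)]
    except KeyError:
        raise ValueError(f"illegal transition {from_state} -> {to_state}")
```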
The idea
• We must have the catalog of all the files [+ their meta-data, e.g. data quality information, basic information from TDAQ, etc.]
• Link all the files relative to a given burst: the logical unit is the burst
• Define the sequence of states through which each burst has to pass
• Each state transition defines an operation to be performed on the files
• Define a “task” as the operations to be applied to a given set of entries in the file catalog [thus causing a state transition for the relative bursts]
• We build a “Handler” process to control operations: given a task, the Handler will (see the sketch below):
   • Create the list of files on which to execute the command(s)
   • Trigger the execution of the appropriate command(s) on them [typically launching a script]
   • The trigger for starting this can be either automatic or given by an operator
   • Check the execution and notify/handle anomalies or failures
[Diagram: the file catalog and the “Handler” driving the file-storage sequence: on_farm_disk → on_disk_pool → on_T0_tape → Distributed_to_T1]
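A minimal sketch of such a Handler loop, assuming a hypothetical catalog interface (query_bursts, add_replica and set_state are illustrative names, not an existing API):

```python
import subprocess

def handle_task(catalog, task):
    """Hypothetical Handler step: pick bursts in the task's source state,
    run the task's command on each file, and record the outcome."""
    for burst in catalog.query_bursts(state=task.from_state):
        cmd = task.build_command(burst)            # e.g. a transfer script
        result = subprocess.run(cmd, capture_output=True)
        if result.returncode == 0:
            catalog.add_replica(burst, task.to_state)    # new catalog entry
            catalog.set_state(burst, task.to_state)
        else:
            catalog.set_state(burst, task.failed_state)  # notify/handle failure
```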
Burst 99923-0000
• The atomic unit is the burst. A burst is connected to a number of files:
   • There is only one RAW for each burst (cdr00099923-0000.dat)
   • Many RECO, THIN, NTUP, …, files can be generated starting from one burst
   • The files can have multiple copies on different filesystems and at different sites
   • Files of different kinds are generated (RECO, THIN, …)
• Use the burst id as primary key
• Generate the first entry as soon as the RAW file appears on the farm disk
• Then attach to it all the following steps in the lifetime of the burst:
   • cdr00099923-0000.dat
   • cdr00099923-0000.reco-1
   • cdr00099923-0000.reco-1.thin-1
   • cdr00099923-0000.reco-1.thin-2
   • …
   • cdr00099923-0000.reco-2
   • cdr00099923-0000.reco-2.thin-1
   • cdr00099923-0000.reco-2.thin-2
   • …
   • …
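For illustration, deriving the burst primary key from a file name could look like this (the naming convention is taken from the slide; the helper itself is hypothetical):

```python
import re

# cdr00099923-0000.reco-1.thin-2 -> burst key ("00099923", "0000")
BURST_RE = re.compile(r"^cdr(\d{8})-(\d{4})")

def burst_key(filename: str) -> tuple[str, str]:
    """Extract the burst id linking any derived file back to its burst."""
    m = BURST_RE.match(filename)
    if m is None:
        raise ValueError(f"not a burst file: {filename}")
    return m.group(1), m.group(2)
```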
Let’s make a toy example: file storage
For the first step it would be ideal to have the MERGER insert a new record into the catalog for each new burst, as soon as it creates a new RAW file (otherwise we’ll have to poll):
• /merger/../cdr00099923-0000.dat
• /merger/../cdr00099924-0000.dat
• /merger/../cdr00099925-0000.dat
• The Handler queries for bursts in the state on_farm_disk and creates the list of files to be copied
• The Handler creates the appropriate transfer command, e.g. xrdcp file root://eosna62.cern.ch//eos/na62/data/cdr
• The Handler issues the execution of the command on each of the files in the list and checks for success:
   • If success:
      • Create a new entry in the file catalog, corresponding to the new replica of the RAW file:
         • //eos/../cdr00099923-0000.dat
         • //eos/../cdr00099924-0000.dat
         • //eos/../cdr00099925-0000.dat
      • Change the status of the burst N to on_disk_pool
   • Otherwise: handle or just notify the failure
Probably intermediate states are needed in order to correctly handle the progress of the operation: on_farm_disk → on_disk_pool_pending → on_disk_pool_started → on_disk_pool [with on_disk_pool_failed and on_disk_pool_canceled for anomalies], as in the sketch below
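A sketch of that transfer step with the intermediate states, again with hypothetical catalog calls (the xrdcp destination is the one shown on the slide, with an assumed file layout):

```python
import subprocess

def copy_to_disk_pool(catalog, burst, src_path):
    """Hypothetical transfer step: farm disk -> EOS disk pool via xrdcp,
    walking the burst through the intermediate states."""
    dest = "root://eosna62.cern.ch//eos/na62/data/cdr/" + src_path.split("/")[-1]
    catalog.set_state(burst, "on_disk_pool_pending")
    catalog.set_state(burst, "on_disk_pool_started")
    result = subprocess.run(["xrdcp", src_path, dest], capture_output=True)
    if result.returncode == 0:
        catalog.add_replica(burst, dest)            # new entry for the copy
        catalog.set_state(burst, "on_disk_pool")
    else:
        catalog.set_state(burst, "on_disk_pool_failed")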
The database
• The file catalog and the states, plus all necessary information, will be in this database
• Basic tasks of the catalog:
   • Give a unique file-id and relate it to the local filename
   • Relate it to its metadata
• We also want to:
   • Keep the relations between all the files related to the same burst
   • Keep the state related to the reconstruction/transfer steps
• The Handler will trigger the transition, based on the current state of the file
A possible schema (see the sketch below):
• Table: Burst — Number*, MotherRAW [File], RunType, RunNumber, …
• Table: File — Name*, FileType [FileType], CustodialLevel, Version, CreationTimestamp, ModificationTimestamp, DeletionTimeStamp, Site [Site], Storage [Storage], CopyNumber, Mother [File], …
• Table: Site — Name*, SiteType [SiteType], Location, ContactPerson, isActive, … [e.g. NA62-FARM, CERN-PROD, RAL, INFN-CNAF, …]
• Table: Storage — Name*, StorageType [StorageType], isActive, hasReplica, … [e.g. SCRATCH-1, FARMDISK-1, EOSNA62, CASTORNA62, …]
• Table: SiteType — Name*, hasTape, hasDisk, … [e.g. FARM, TIER-0, TIER-1, TIER-2, …]
• Table: StorageType — Name*, isCustodial, … [e.g. TAPE, EOS, DISK, …]
• Table: FileType — Name*, isData, hasVersion, … [e.g. RAW, RECO, THIN, NTUP, …]
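A minimal sketch of the two central tables (column names follow the slide; the sqlite3 backend, the types and the State column are assumptions for illustration only):

```python
import sqlite3

conn = sqlite3.connect("catalog.db")
conn.executescript("""
CREATE TABLE IF NOT EXISTS Burst (
    Number     TEXT PRIMARY KEY,        -- e.g. '99923-0000'
    MotherRAW  TEXT REFERENCES File(Name),
    RunType    TEXT,
    RunNumber  INTEGER,
    State      TEXT                     -- assumed column for Handler state tracking
);
CREATE TABLE IF NOT EXISTS File (
    Name              TEXT PRIMARY KEY, -- e.g. 'cdr00099923-0000.dat'
    FileType          TEXT,             -- RAW, RECO, THIN, NTUP, ...
    CustodialLevel    TEXT,
    Version           INTEGER,
    CreationTimestamp TEXT,
    Site              TEXT,             -- e.g. CERN-PROD
    Storage           TEXT,             -- e.g. EOSNA62
    CopyNumber        INTEGER,
    Mother            TEXT REFERENCES File(Name)
);
""")
conn.commit()
```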
Example
[Diagram: lifetime of one burst — the Burst entry points to File entries for RAW (farm), RAW (disk pool), RAW (T0 tape), RAW (T1 disk), RAW (T1 tape); reconstruction & thinning add RECO-1 (T1 disk), RECO-1 (T1 tape), THIN-1 (T1 disk), THIN-1 (T2 disk); a first reprocessing adds RAW (T1 disk, copy 2), RECO-2 (T1 disk), THIN-2 (T1 disk)]
300k bursts/year × 3 years ≈ 1,000,000 bursts × O(100) entries = 100M entries
Which DB technology?
300k bursts/year × 3 years ≈ 1,000,000 bursts × O(100) entries = 100M entries
• Looks huge e.g. for MySQL, but ALiEn (the ALICE distributed environment, including catalog and job management) successfully uses MySQL
• A number of optimizations/tricks can be used (see the sketch below):
   • Partitioning
   • Indexes
   • Common queries/caching
   • …
• Of course there are alternatives. SQUID caching is necessary.
• By the way…
   • ALiEn is a very close example: it uses open source software and can be inspirational or even reused
   • The ALiEn project started to provide a file catalog to ALICE and then expanded
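For instance, continuing the SQLite sketch above (illustrative only; MySQL-style partitioning would be the analogous step there), the Handler's most frequent query is served by an index:

```python
# Illustrative optimization: index the columns hit by the Handler's
# hottest queries, so they become index lookups instead of full scans.
conn.execute("CREATE INDEX IF NOT EXISTS idx_burst_state ON Burst(State);")
conn.execute("CREATE INDEX IF NOT EXISTS idx_file_type   ON File(FileType);")

# The common "bursts in state X" query:
rows = conn.execute(
    "SELECT Number FROM Burst WHERE State = ?", ("on_farm_disk",)
).fetchall()
```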
Grid services
The other piece needed for a complete system…
[Diagram: the complete system — User Interface, Handler, Catalog, Job management, Grid services]
[Slide: WMS solutions of the LHC experiments — ALICE, LHCb, ATLAS]
Pull vs. Push job submission
• gLite: a set of grid middleware components responsible for the distribution and management of tasks across grid resources
• Push model:
   • Works as a super-batch system
   • Jobs are submitted to the WMS, which schedules them to a Grid CE (Computing Element)
   • Computing centres implement their internal batch queues to schedule jobs on the worker nodes
   • Experiments have implemented their own solutions to integrate the middleware and application layers
   • Frameworks were born to manage high-level workflows
   • Direct control on the translation from workflow into grid jobs
Independently, the LHC experiments are evolving towards “pilot job” systems:
• Pull model (see the sketch below):
   • Pilot jobs are asynchronously submitted jobs running on the worker nodes
   • Users submit jobs to a centralized queue
   • Pilot jobs communicate with the WMS (pilot aggregator), pulling user jobs from the repository
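As a toy sketch of the pull model (the queue endpoint and the payload format are invented for illustration; real pilot frameworks are far more elaborate):

```python
import json
import subprocess
import time
import urllib.request

QUEUE_URL = "https://example.org/taskqueue"  # hypothetical central queue

def pilot_loop():
    """Toy pilot job: runs on a worker node and pulls user jobs
    from a central queue until none are left."""
    while True:
        with urllib.request.urlopen(QUEUE_URL + "/next") as resp:
            job = json.load(resp)          # e.g. {"id": ..., "cmd": [...]}
        if not job:
            break                          # queue drained: pilot exits
        subprocess.run(job["cmd"])         # execute the pulled payload
        time.sleep(1)                      # avoid hammering the queue
```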