This talk discusses the gridification of the LHCb Monte Carlo production system: the integration of DataGrid middleware, monitoring and control of running jobs, and the requirements placed on the DataGrid middleware.
Gridifying the LHCb Monte Carlo production system
Eric van Herwijnen, CERN (eric.van.herwijnen@cern.ch)
Tuesday, 19 February 2002
Talk given at GGF4, Toronto
Contents
• LHCb
• LHCb distributed computing environment
• Current GRID involvement
• Functionality of current Monte Carlo system
• Integration of DataGrid middleware
• Monitoring and control
• Requirements on DataGrid middleware
LHCb
• LHC collider experiment
• 10^9 events × 1 MB = 1 PB
• Problems of data storage, access and computation
• Monte Carlo simulation very important for detector design
• Need a distributed model
• Create, distribute and keep track of data automatically
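Written out, the data-volume arithmetic above (assuming, as on the slide, roughly 1 MB per event) is:

```latex
10^{9}\ \text{events} \times 1\ \text{MB/event}
  = 10^{9} \times 10^{6}\ \text{B}
  = 10^{15}\ \text{B}
  = 1\ \text{PB}
```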
LHCb distributed computing environment
• 15 countries (13 European + Brazil and China), 50 institutes
• Tier-0: CERN
• Tier-1: RAL, IN2P3 (Lyon), INFN (Bologna), Nikhef, CERN + ?
• Tier-2: Liverpool, Edinburgh/Glasgow, Switzerland + ? (grow to ~10)
• Tier-3: 50 throughout the collaboration
• Ongoing negotiations for Tier-1/2/3 centres: Germany, Russia, Poland, Spain, Brazil
Current GRID involvement
• EU DataGrid project (involves HEP, Biology, Medicine and Earth Observation sciences)
• Active in WP8 (HEP applications) of DataGrid
• Use "middleware" (WP1-5) + Testbed (WP6) + Network (WP7)
• The current distributed system has been working for some time: LHCb is Grid enabled, but not Grid dependent
[Workflow diagram: the current production cycle]
• Submit jobs remotely via Web
• Execute on farm
• Transfer data to mass store
• Update bookkeeping database
• Data quality check
• Monitor performance of farm via Web
GRID-enabling production
• Construct job script and submit via Web (dg-authentication, dg-job-submit); see the sketch after this list
• Run MC executable: write log to Web, copy data to local mass store (dg-data-copy), call CERN servlet
• CERN FTP servlet (dg-data-replication): copy data from the local mass store to the CERN mass store, update bookkeeping db (LDAP? now Oracle)
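As a sketch only: the dg-* command names below come straight from the slide, but every argument and helper here is an assumption made for illustration, not the actual DataGrid or LHCb interface.

```python
# Hypothetical sketch of the grid-enabled production chain above.
# Command names (dg-job-submit, dg-data-copy) are from the slide;
# their argument syntax is assumed, not taken from DataGrid docs.
import subprocess

def run(cmd: list[str]) -> None:
    """Run one middleware command, echoing it first."""
    print("$", " ".join(cmd))
    subprocess.run(cmd, check=True)

def grid_produce(job_script: str, output_file: str, mass_store: str) -> None:
    run(["dg-job-submit", job_script])              # submit job script via the Grid
    run(["dg-data-copy", output_file, mass_store])  # copy output to the local mass store
    # The CERN FTP servlet (dg-data-replication) would then replicate the
    # data to the CERN mass store and update the bookkeeping database.
```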
Gridifying the MC production system
• Provide a convenient tool for DataGrid Testbed validation tests
• Feed improvements back into the MC system currently in production
• Clone the current system, replacing its commands with DataGrid middleware
• Report back to WP8 and other workpackages as required
Monitoring and control of running jobs
• Control system to monitor distributed production (based on PVSS, author: Clara Gaspar)
• Initially for MC production, later for all Grid computing
• Automatic quality checks on final data samples
• Online histograms and comparisons between histograms (an illustrative check is sketched below)
• Use DataGrid monitoring tools
• Feed improvements back into the production MC system
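To make "comparisons between histograms" concrete, here is a minimal illustrative quality check using a chi-square-like distance between binned histograms; this is a sketch, not the actual LHCb/PVSS implementation, and the threshold is an arbitrary placeholder.

```python
# Illustrative automatic quality check: compare a freshly produced
# histogram against a reference one. Not the actual LHCb/PVSS code.

def chi2_distance(reference: list[float], sample: list[float]) -> float:
    """Chi-square-like distance between two binned histograms."""
    assert len(reference) == len(sample)
    total = 0.0
    for r, s in zip(reference, sample):
        if r + s > 0:
            total += (r - s) ** 2 / (r + s)
    return total

def passes_quality_check(reference: list[float], sample: list[float],
                         threshold: float = 10.0) -> bool:
    # The threshold value is arbitrary, chosen for illustration only.
    return chi2_distance(reference, sample) < threshold

# Example: a sample that closely follows the reference passes.
ref = [100.0, 250.0, 400.0, 250.0, 100.0]
new = [ 95.0, 260.0, 390.0, 255.0, 105.0]
print(passes_quality_check(ref, new))  # True
```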
Requirements on DataGrid middleware
• Security: single user logon
• Job submission: use "sandboxes" to package the environment so that AFS is unnecessary
• Monitoring: integrate with WP3 tools where possible for farm monitoring; use our own tools for data quality monitoring
• Data moving: use a single API to move data (see the sketch after this list)
• We are in a cycle of requirements, design, implementation and testing
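As an illustration of the single data-moving API requirement: one uniform entry point that dispatches to whichever transport applies. Everything here, including the castor: scheme and the dispatch rule, is a hypothetical sketch, not the DataGrid API.

```python
# Hypothetical illustration of the "single API to move data" requirement.
# The dg-* names come from the talk; signatures and dispatch are invented.
import subprocess

def move_data(source: str, destination: str) -> None:
    """Single entry point for moving data between sites."""
    if destination.startswith("castor:"):  # replicate to the CERN mass store
        cmd = ["dg-data-replication", source, destination]
    else:                                  # plain copy to a local mass store
        cmd = ["dg-data-copy", source, destination]
    print("$", " ".join(cmd))
    subprocess.run(cmd, check=True)
```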