ATLAS Grid Computing Model and Data Challenges 2 June 2004

Dario Barberis (CERN & Genoa University)

Event Data Flow from Online to Offline
• The trigger system will reduce the event rate from 40 MHz to:
  • 20-30 kHz after the Level-1 trigger (muons and calorimetry)
  • ~3000 Hz after the Level-2 trigger (several algorithms in parallel, running independently for each subdetector)
  • ~200 Hz after the Event Filter (“offline” algorithms on full event)
• These rates are almost independent of luminosity:
  • there is more “interesting” physics than 200 Hz even at low luminosity
  • trigger thresholds will be adjusted to follow the luminosity
• The “nominal” event size is 1.6 MB
  • initially it may be much larger (7-8 MB) until data compression in the calorimetry is switched on
• The nominal rate from online to offline is therefore 320 MB/s (200 Hz × 1.6 MB)
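The 320 MB/s figure follows directly from the Event Filter output rate and the nominal event size. A minimal sketch of that arithmetic, using only the numbers quoted on this slide (the variable and function names are illustrative, not part of any ATLAS software):

```python
# Numbers quoted on the slide; names are illustrative assumptions.
EVENT_FILTER_RATE_HZ = 200           # event rate after the Event Filter
NOMINAL_EVENT_SIZE_MB = 1.6          # nominal raw event size
INITIAL_EVENT_SIZES_MB = (7.0, 8.0)  # before calorimeter data compression


def data_rate_mb_per_s(rate_hz, event_size_mb):
    """Data rate in MB/s for a given event rate and event size."""
    return rate_hz * event_size_mb


nominal_mb_s = data_rate_mb_per_s(EVENT_FILTER_RATE_HZ, NOMINAL_EVENT_SIZE_MB)
initial_mb_s = [
    data_rate_mb_per_s(EVENT_FILTER_RATE_HZ, size)
    for size in INITIAL_EVENT_SIZES_MB
]

print(f"nominal rate: {nominal_mb_s:.0f} MB/s")
print(f"initial rate: {initial_mb_s[0]:.0f}-{initial_mb_s[1]:.0f} MB/s")
```

With uncompressed calorimeter data (7-8 MB per event) the same 200 Hz would correspond to roughly 1.4-1.6 GB/s, which is why the event size matters as much as the trigger rate.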

Parameters of the Computing Model
• Data Sizes:
  • Simulated Event Data: 2.0 MB (raw data + MC truth)
  • Raw Data: 1.6 MB (from DAQ system)
  • Event Summary Data: 0.5 MB (full reconstruction output)
  • Analysis Object Data: 0.1 MB (summary of reconstruction)
  • TAG Data: 0.5 kB (event tags in SQL database)
• Other parameters:
  • Total Trigger Rate: 200 Hz
  • Physics Trigger Rate: 180 Hz
  • Nominal year: 10^7 s
  • Time/event for Simulation: 60 kSI2k·s
  • Time/event for Reconstruction: 6.4 kSI2k·s
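These parameters are enough to estimate yearly data volumes and the CPU power needed to reconstruct events as fast as they arrive. The sketch below is only that arithmetic: it assumes decimal units (1 TB = 10^6 MB), one stored copy of each format, and the total trigger rate of 200 Hz; all names are illustrative.

```python
# Parameters from the slide; decimal units (1 TB = 1e6 MB) are an assumption.
NOMINAL_YEAR_S = 1e7
TRIGGER_RATE_HZ = 200
EVENT_SIZE_MB = {"Raw": 1.6, "ESD": 0.5, "AOD": 0.1, "TAG": 0.0005}
RECON_COST_KSI2K_S = 6.4  # CPU cost to reconstruct one event

events_per_year = TRIGGER_RATE_HZ * NOMINAL_YEAR_S

# Yearly volume of a single copy of each data format, in TB.
volume_tb = {
    name: size_mb * events_per_year / 1e6
    for name, size_mb in EVENT_SIZE_MB.items()
}

# CPU power needed for first-pass reconstruction to keep up with data taking.
recon_cpu_ksi2k = RECON_COST_KSI2K_S * TRIGGER_RATE_HZ

for name, tb in volume_tb.items():
    print(f"{name}: {tb:,.0f} TB/year")
print(f"real-time reconstruction: {recon_cpu_ksi2k:,.0f} kSI2k")
```

Under these assumptions a nominal year gives 2 × 10^9 events, about 3.2 PB of raw data and 1 PB of ESD per copy, and real-time reconstruction needs about 1280 kSI2k.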

Operation of Tier-0
• The Tier-0 facility at CERN will have to:
  • hold a copy of all raw data on tape
  • copy all raw data to the Tier-1’s in real time (the second copy is also useful for later reprocessing)
  • keep calibration data on disk
  • run first-pass reconstruction
  • distribute ESD’s to external Tier-1’s (2/N to each one of N Tier-1’s)
• Currently under discussion:
  • “shelf” vs “automatic” tapes
  • archiving of simulated data
  • sharing of facilities between HLT and Tier-0
• Tier-0 will have to be a dedicated facility, where the CPU power and network bandwidth match the real-time event rate

Operation of Tier-1’s and Tier-2’s
• We envisage at least 6 Tier-1’s for ATLAS. Each one will:
  • keep on disk 2/N of the ESD’s and a full copy of AOD’s and TAG’s
  • keep on tape 1/N of Raw Data
  • keep on disk 2/N of currently simulated ESD’s and on tape 1/N of previous versions
  • provide facilities (CPU and disk space) for user analysis (~200 users/Tier-1)
  • run simulation, calibration and/or reprocessing of real data
• We estimate ~4 Tier-2’s for each Tier-1. Each one will:
  • keep on disk a full copy of AOD’s and TAG’s
  • (possibly) keep on disk a selected sample of ESD’s
  • provide facilities (CPU and disk space) for user analysis (~50 users/Tier-2)
  • run simulation and/or calibration procedures
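The 2/N and 1/N shares can be checked with exact fractions. The sketch below takes N = 6 Tier-1’s and 4 Tier-2’s per Tier-1 (the minimum and the estimate quoted above) and shows that the scheme yields two full disk copies of the ESD and one full tape copy of the raw data across the Tier-1’s. The function and variable names are illustrative.

```python
from fractions import Fraction

N_TIER1 = 6            # "at least 6" on the slide; 6 is assumed here
TIER2_PER_TIER1 = 4
USERS_PER_TIER1 = 200
USERS_PER_TIER2 = 50


def tier1_shares(n_tier1):
    """Fraction of each data format held by a single Tier-1."""
    return {
        "esd_disk": Fraction(2, n_tier1),
        "raw_tape": Fraction(1, n_tier1),
        "aod_disk": Fraction(1),
        "tag_disk": Fraction(1),
    }


shares = tier1_shares(N_TIER1)

# Summed over all Tier-1's: number of complete copies outside CERN.
esd_copies = shares["esd_disk"] * N_TIER1
raw_copies = shares["raw_tape"] * N_TIER1

n_tier2 = N_TIER1 * TIER2_PER_TIER1
analysis_users = N_TIER1 * USERS_PER_TIER1 + n_tier2 * USERS_PER_TIER2

print(f"each Tier-1 holds {shares['esd_disk']} of the ESD on disk")
print(f"ESD copies across Tier-1's: {esd_copies}, raw copies: {raw_copies}")
print(f"{n_tier2} Tier-2's, about {analysis_users} analysis users in total")
```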

Analysis on Tier-2’s and Tier-3’s
• This area is under the most active change
  • We are trying to forecast resource usage and usage patterns from the Physics Working Groups
• Assume ~10 selected large AOD datasets, one for each physics analysis group
• Assume that each large local centre will have the full TAG to allow simple selections
  • Using these, jobs are submitted to the Tier-1 cloud to select on the full ESD
  • A new collection or ntuple-equivalent is returned to the local resource
• Distributed analysis systems are under development
  • Metadata integration, event navigation and database designs are all top priority
  • ARDA may help, but will be late in the day for DC2 (risk of interference with DC2 developments)

Data Challenge 2
• DC2 operation in 2004:
  • distributed production of (>10^7) simulated events in May-July
  • events sent to CERN in ByteStream (raw data) format to Tier-0
  • reconstruction processes run on prototype Tier-0 in a short period of time (~10 days, “10% data flow test”)
  • reconstruction results distributed to Tier-1s and analysed on Grid
• Main “new” software to be used (wrt DC1 in 2002/2003):
  • Geant4-based simulation, pile-up and digitization in Athena
  • complete “new” EDM and Detector Description interfaced to simulation and reconstruction
  • POOL persistency
  • LCG-2 Grid infrastructure
  • Distributed Production and Analysis environment

Phases of DC2 operation
• Consider DC2 as a three-part operation:
  • part I: production of simulated data (May-July 2004)
    • needs Geant4, digitization and pile-up in Athena, POOL persistency
    • “minimal” reconstruction just to validate simulation suite
    • will run on any computing facilities we can get access to around the world
  • part II: test of Tier-0 operation (July-August 2004)
    • needs full reconstruction software following RTF report design, definition of AODs and TAGs
    • reconstruction will run on Tier-0 prototype as if data were coming from the online system (at 10% of the rate)
    • output (ESD+AOD) will be distributed to Tier-1s in real time for analysis
    • in parallel: run distributed reconstruction on simulated data
      • this is useful for the Physics community as MC truth info is kept
  • part III: test of distributed analysis on the Grid (Aug.-Oct. 2004)
    • access to event and non-event data from anywhere in the world both in organized and chaotic ways

DC2: Scenario & Time scale
• Milestones:
  • September 03: Release 7
  • March 17th 04: Release 8 (simulation)
  • May 3rd 04: DC2/I
  • End June 04: Release 9 (reconstruction)
  • July 15th 04: DC2/II
  • August 1st 04: DC2/III
• Activities (in chronological order):
  • Put in place, understand & validate: Geant4; POOL; LCG applications; Event Data Model; digitization; pile-up; byte-stream; persistency tests and reconstruction
  • Testing and validation; run test-production
  • Start final validation
  • Start simulation; pile-up & digitization; event mixing; transfer data to CERN
  • Intensive reconstruction on “Tier0”; distribution of ESD & AOD
  • Start physics analysis; reprocessing

DC2 resources (needed)

DC2: Mass Production tools
• We use:
  • 3 Grid flavours (LCG-2; Grid3+; NorduGrid)
    • We must build over all three (submission, catalogues, ...)
  • Automated production system
    • New production DB (Oracle)
    • Supervisor-executor component model
      • Windmill supervisor project
      • Executors for each Grid and legacy systems (LSF, PBS)
  • Data management system
    • Don Quijote DMS project
      • Successor of Magda
      • but uses native catalogs
  • AMI (ATLAS Metadata Interface, MySQL database) for bookkeeping
    • Going to web services
    • Integrated with POOL

New Production System (1)
• DC1 production in 2002/2003 was done mostly with traditional tools (scripts)
  • Manpower intensive!
• Main features of new system:
  • Common production database for all of ATLAS
  • Common ATLAS supervisor run by all facilities/managers
  • Common data management system
  • Executors developed by middleware experts (LCG, NorduGrid, Chimera teams)
  • Final verification of data done by supervisor

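New Production System (2)
[Diagram: the production database (ProdDB), AMI and the Don Quijote data management system feed several Windmill supervisor instances. The supervisors talk over SOAP or Jabber to the executors: Lexor (LCG), Dulcinea (NorduGrid), Capone (Grid3) and an LSF executor. LCG, NorduGrid and Grid3 each have their own RLS catalog.]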
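The supervisor-executor split in the new production system can be illustrated with a small object model: a supervisor takes job definitions, hands each one to the executor for its target system, and does the final verification itself. The sketch below is purely illustrative. The class names, job fields and status strings are hypothetical; it does not reproduce the Windmill interfaces or the SOAP/Jabber messaging.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class Job:
    """Hypothetical job record, standing in for a production DB entry."""
    job_id: int
    target: str              # e.g. "LCG", "NG", "Grid3", "LSF"
    status: str = "defined"


class Executor(ABC):
    """Translates a generic job into one back-end's submission mechanism."""

    @abstractmethod
    def run(self, job: Job) -> bool:
        """Run the job and report whether it produced output."""


class DummyExecutor(Executor):
    """Stand-in back-end whose outcome is fixed at construction time."""

    def __init__(self, succeeds: bool = True):
        self.succeeds = succeeds

    def run(self, job: Job) -> bool:
        return self.succeeds


class Supervisor:
    """Dispatches jobs to executors and performs the final verification."""

    def __init__(self, executors):
        self.executors = executors

    def process(self, jobs):
        for job in jobs:
            executor = self.executors.get(job.target)
            if executor is None:
                job.status = "no-executor"
                continue
            job.status = "verified" if executor.run(job) else "failed"
        return jobs


supervisor = Supervisor({
    "LCG": DummyExecutor(),
    "NG": DummyExecutor(),
    "LSF": DummyExecutor(succeeds=False),
})
jobs = supervisor.process([Job(1, "LCG"), Job(2, "NG"), Job(3, "LSF"), Job(4, "Grid3")])
for job in jobs:
    print(job.job_id, job.target, job.status)
```

Keeping the executor behind one abstract interface is what lets the same supervisor drive three Grids and the legacy batch systems without knowing their submission details.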

Roles of Tiers in DC2 (1)
• Tier-0
  • 20% of simulation will be done at CERN
  • All data in ByteStream format (~16 TB) will be copied to CERN
  • Reconstruction will be done at CERN (in ~10 days)
  • Reconstruction output (ESD) will be exported in 2 copies from Tier-0 (2 x ~5 TB)
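The ~16 TB and ~5 TB figures are consistent with 10^7 events at the nominal raw and ESD event sizes. The sketch below redoes that arithmetic and derives the average rates implied by a 10-day reconstruction pass. It assumes exactly 10^7 events, decimal units (1 TB = 10^6 MB), and that “10% of the rate” refers to the 200 Hz Event Filter output; these are assumptions made for illustration.

```python
# Assumed inputs: exactly 1e7 events (the deck quotes ">10^7").
N_EVENTS = 1e7
RAW_EVENT_MB = 1.6
ESD_EVENT_MB = 0.5
RECON_DAYS = 10
NOMINAL_RATE_HZ = 200
SECONDS_PER_DAY = 86400

bytestream_tb = N_EVENTS * RAW_EVENT_MB / 1e6   # raw data copied to CERN
esd_tb = N_EVENTS * ESD_EVENT_MB / 1e6          # one copy of the ESD
esd_export_tb = 2 * esd_tb                      # two copies leave Tier-0

recon_seconds = RECON_DAYS * SECONDS_PER_DAY
avg_event_rate_hz = N_EVENTS / recon_seconds
avg_input_mb_s = N_EVENTS * RAW_EVENT_MB / recon_seconds

# Time needed if the prototype ran at exactly 10% of the nominal rate.
days_at_10_percent = N_EVENTS / (0.1 * NOMINAL_RATE_HZ) / SECONDS_PER_DAY

print(f"ByteStream: {bytestream_tb:.0f} TB, ESD export: {esd_export_tb:.0f} TB")
print(f"average over {RECON_DAYS} days: {avg_event_rate_hz:.1f} Hz, {avg_input_mb_s:.1f} MB/s")
print(f"at 10% of nominal rate: {days_at_10_percent:.1f} days")
```

Under these assumptions, running at a steady 20 Hz would take just under 6 days, so a ~10-day window leaves some margin for inefficiencies.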

Roles of Tiers in DC2 (2)
• Tier-1s will have to
  • Host simulated data produced by them or coming from Tier-2s; plus ESD (& AOD) coming from Tier-0
  • Run reconstruction in parallel to Tier-0 exercise (~2 months)
    • This will include links to MCTruth
    • Produce and host ESD and AOD
  • Provide access to the ATLAS V.O. members
• Tier-2s
  • Run simulation (and other components if they wish to)
  • Copy (replicate) their data to Tier-1
• All information will be entered into the relevant database and catalog

ATLAS production
• Will be done as much as possible on Grid
  • Few production managers
  • Data stored on Tier-1s
    • “Expressions of Interest” to distribute the data in an “efficient” way; this anticipates efficient migration of data
  • Keep the possibility to use “standard” batch facilities, but using the same production system
  • Will use several “catalogs”; the DMS will take care of them
• Plan:
  • 20% Grid3
  • 20% NorduGrid
  • 60% LCG-2 (10 “Tier-1s”)
  • To be adapted based on experience

Current Grid3 Status (3/1/04) (http://www.ivdgl.org/grid2003)
• 28 sites, multi-VO
• shared resources
• ~2000 CPUs
• dynamic: sites roll in/out

NorduGrid Resources: details
• NorduGrid middleware is deployed in:
  • Sweden (15 sites)
  • Denmark (10 sites)
  • Norway (3 sites)
  • Finland (3 sites)
  • Slovakia (1 site)
  • Estonia (1 site)
• Sites to join before/during DC2 (preliminary):
  • Norway (1-2 sites)
  • Russia (1-2 sites)
  • Estonia (1-2 sites)
  • Sweden (1-2 sites)
  • Finland (1 site)
  • Germany (1 site)
• Many of the resources will be available for ATLAS DC2 via the NorduGrid middleware
  • Nordic countries will coordinate their shares
  • For others, ATLAS representatives will negotiate the usage

LCG-2 today (May 14)

“Tiers” in DC2

ATLAS Distributed Analysis & GANGA
• The ADA (ATLAS Distributed Analysis) project started in late 2003 to bring together coherently all efforts already present in the ATLAS Collaboration to develop a DA infrastructure:
  • GANGA (GridPP in the UK): front-end, splitting
  • DIAL (PPDG in the USA): job model
• It is based on a client/server model with an abstract interface between services
  • a thin client on the user’s computer, and an “analysis service” on the server that is itself a collection of services
• The vast majority of GANGA modules fit easily into this scheme (or are being integrated right now):
  • GUI, CLI, JobOptions editor, job splitter, output merger, ...
• Job submission will go through (a clone of) the production system
  • using the existing infrastructure to access resources on the 3 Grids and the legacy systems
• The forthcoming release of ADA (with GANGA 2.0) will have the first basic functionality to allow DC2 Phase III to proceed

Analysis
[Diagram: layered ADA/GANGA architecture. Client tools (GANGA GUI, graphical job builder, GANGA command-line client, GANGA task and job management, ROOT) use high-level service interfaces (AJDL) to reach the Analysis Service and other high-level services (catalogue services, job management, dataset splitter, dataset merger). These use middleware service interfaces to reach the middleware services (CE, WMS, file catalog, etc.).]
• This is just the first step
  • Integrate with the ARDA back-end
  • Much work needed on metadata for analysis (LCG and GridPP metadata projects)
• NB: GANGA allows non-production MC job submission and data reconstruction end-to-end in LCG
• Interface to ProdSys will allow submission to any ATLAS resource

Monitoring & Accounting
• At a very early stage in DC2
  • Needs more discussion within ATLAS
    • Metrics to be defined
    • Development of a coherent approach
• Current efforts:
  • Job monitoring “around” the production database
    • Publish on the web, in real time, relevant data concerning the running of DC2 and event production
    • SQL queries are submitted to the Prod DB hosted at CERN
    • The result is formatted as HTML and published on the web
    • A first basic tool is already available as a prototype
  • On LCG: effort to verify the status of the Grid
    • two main tasks: site monitoring and job monitoring
    • based on GridICE and R-GMA, integrated with the current production Grid middleware
  • MonALISA is deployed for Grid3 and NG monitoring
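The job-monitoring approach described above (an SQL query against the production database, with the result formatted as HTML) can be sketched in a few lines. This is not the ATLAS prototype: an in-memory SQLite database stands in for the Oracle Prod DB, and the `jobs` table, its columns and the sample rows are all hypothetical.

```python
import sqlite3
from html import escape


def job_summary_html(conn):
    """Count jobs per grid and status and render the result as an HTML table."""
    rows = conn.execute(
        "SELECT grid, status, COUNT(*) FROM jobs "
        "GROUP BY grid, status ORDER BY grid, status"
    ).fetchall()
    lines = ["<table>", "<tr><th>Grid</th><th>Status</th><th>Jobs</th></tr>"]
    for grid, status, count in rows:
        lines.append(
            f"<tr><td>{escape(grid)}</td><td>{escape(status)}</td><td>{count}</td></tr>"
        )
    lines.append("</table>")
    return "\n".join(lines)


# Hypothetical stand-in for the production database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobs (grid TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO jobs VALUES (?, ?)",
    [
        ("LCG", "done"),
        ("LCG", "done"),
        ("LCG", "running"),
        ("NorduGrid", "done"),
        ("Grid3", "failed"),
    ],
)

page = job_summary_html(conn)
print(page)
```

Escaping every string value before it goes into the markup keeps unexpected characters in job metadata from breaking the published page.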

DC2: where are we?
• DC2 Phase I
  • Part 1: event generation
    • Release 8.0.1 (end April) for Pythia generation (70% of data)
      • tested, validated, distributed
      • test production started 2 weeks ago
      • real production started this week with current release
  • Part 2: Geant4 simulation
    • Release 8.0.2 (mid May) reverted to Geant4 6.0 (with MS from 5.2)
      • tested, validated, distributed
      • production will start later this week with current release
  • Part 3: pile-up and digitization
    • Release 8.0.4 (bug fix release, if needed, next week)
      • currently under test (performance optimization)
      • production later in June

ATLAS Computing Timeline
Time axis: 2003, 2004 (NOW), 2005, 2006, 2007. Milestones in chronological order:
• POOL/SEAL release (done)
• ATLAS release 7 (with POOL persistency) (done)
• LCG-1 deployment (done)
• ATLAS complete Geant4 validation (done)
• ATLAS release 8 (done)
• DC2 Phase 1: simulation production (in progress)
• DC2 Phase 2: intensive reconstruction (the real challenge!)
• Combined test beams (barrel wedge)
• Computing Model paper
• Computing Memorandum of Understanding
• ATLAS Computing TDR and LCG TDR
• DC3: produce data for PRR and test LCG-n
• Physics Readiness Report
• Start commissioning run
• GO!

Final prototype: DC3
• We should consider DC3 as the “final” prototype, for both software and computing infrastructure
  • tentative schedule is Q4-2005 to end Q1-2006
  • cosmic run will be later in 2006
• This means that on that timescale (in fact, earlier than that, if we have learned anything from DC1 and DC2) we need:
  • a complete s/w chain for “simulated” and for “real” data
    • including aspects missing from DC2: trigger, alignment etc.
  • a deployed Grid infrastructure capable of dealing with our data
  • enough resources to run at ~50% of the final data rate for a sizable amount of time
• After DC3 we will surely be forced to sort out problems day by day, as the need arises, for real, imperfect data coming from the DAQ: no time for more big developments!
