The ATLAS Computing Model. Eric Lançon (Saclay). LCG-France, Lyon, 14-15 Dec. 2005
Overview • The ATLAS Facilities and their roles • Growth of resources: CPU, Disk, Mass Storage • Network requirements: CERN, Tier-1, Tier-2 • Data & Service challenges
Computing Resources • Computing Model fairly well evolved, but still being revised • Documented in: http://doc.cern.ch//archive/electronic/cern/preprints/lhcc/public/lhcc-2005-022.pdf • There are (and will remain for some time) many unknowns • Calibration and alignment strategy is still evolving • Physics data access patterns MAY start to be exercised this Spring • Unlikely to know the real patterns until 2007/2008! • There are still uncertainties on event sizes and reconstruction times • Lesson from the previous round of experiments at CERN (LEP): reviews in 1988 underestimated the computing requirements by an order of magnitude!
ATLAS Facilities • Event Filter Farm at CERN (pit) • Located near the experiment • Assembles data into a stream to the Tier-0 Center • Tier-0 Center at CERN (computing center) • Raw data → mass storage at CERN and → Tier-1 centers • Prompt reconstruction producing Event Summary Data (ESD) and Analysis Object Data (AOD) • Ship ESD, AOD to Tier-1 centers • Tier-1 Centers distributed worldwide (approximately 10 centers) • Re-reconstruction of raw data, producing new ESD, AOD • Tier-2 Centers distributed worldwide (approximately 30 centers) • Monte Carlo simulation, producing ESD, AOD • ESD, AOD → Tier-1 centers • Physics analysis • CERN Analysis Facility • Tier-3 Centers distributed worldwide • Physics analysis
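As a rough, purely illustrative sketch (not from the original slides), the facility roles just listed can be collected into a small structure; the role strings below only restate the bullets above.

```python
# Minimal summary of the ATLAS facility roles described on this slide.
# The structure and role strings simply restate the slide content.
ATLAS_FACILITIES = {
    "Event Filter Farm (CERN pit)": ["assemble data into a stream to the Tier-0 center"],
    "Tier-0 (CERN computing center)": [
        "raw data to mass storage and to Tier-1 centers",
        "prompt reconstruction producing ESD and AOD",
        "ship ESD and AOD to Tier-1 centers",
    ],
    "Tier-1 (~10 centers)": ["re-reconstruction of raw data, producing new ESD and AOD"],
    "Tier-2 (~30 centers)": [
        "Monte Carlo simulation, producing ESD and AOD",
        "send ESD and AOD to Tier-1 centers",
        "physics analysis",
    ],
    "CERN Analysis Facility": ["physics analysis"],
    "Tier-3 (worldwide)": ["physics analysis"],
}

for site, roles in ATLAS_FACILITIES.items():
    print(site)
    for role in roles:
        print("  -", role)
```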
Processing • Tier-0: • First pass processing on express/calibration physics stream • 24-48 hours later, process full physics data stream with reasonable calibrations • These imply large data movement from T0 to T1s • Tier-1: • Reprocess 1-2 months after arrival with better calibrations • Reprocess all resident RAW at year end with improved calibration and software • These imply large data movement from T1 to T1 and T1 to T2
Processing cont’d • Tier-1: • 1/10 of RAW data and derived samples • Shadow the ESD for another Tier-1 (e.g. 2/10 of whole sample) • Full AOD sample • Reprocess 1-2 months after arrival with better calibrations (to produce a coherent dataset) • Reprocess all resident RAW at year end with improved calibration and software • Provide scheduled access to ESD samples • Tier-2s: • Provide access to AOD and group Derived Physics Datasets • Carry the full simulation load
Analysis Model • The analysis model is broken into two components: • Scheduled central production of augmented AOD, tuples & TAG collections from ESD • Derived files moved to other T1s and to T2s • Chaotic user analysis of augmented AODs, tuples, new selections, etc., plus individual user simulation and CPU-bound tasks matching the official MC production • Modest job traffic between T2s
Inputs to the ATLAS Computing Model (1)
Inputs to the ATLAS Computing Model (2)
Data Flow from experiment to T2s • T0: raw data → mass storage at CERN • T0 → T1: raw data (1/10 in each T1) • T0 → T1: ESD (2 copies of ESD distributed worldwide, i.e. 2/10 in each T1) • T0 → T1: AOD (full copy to each Tier-1 center) • T1 → T2: some ESD, ALL AOD? • T0 → T2: calibration processing? (dedicated T2s for some sub-detectors)
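The "Inputs to the Computing Model" tables did not survive the extraction, so the sketch below uses commonly quoted TDR-era figures (RAW ~1.6 MB, ESD ~0.5 MB, AOD ~0.1 MB, ~2×10^9 events/year), treated here as assumptions, only to show how the per-Tier-1 storage share follows from the 1/10 RAW, 2/10 ESD and full-AOD fractions above.

```python
# Illustrative back-of-the-envelope for one Tier-1's share of a year of data.
# Event sizes and event count are assumed TDR-era figures; the sharing
# fractions (1/10 RAW, 2/10 ESD, full AOD) come from the slide above.
EVENTS_PER_YEAR = 2e9
SIZE_MB = {"RAW": 1.6, "ESD": 0.5, "AOD": 0.1}
SHARE = {"RAW": 1 / 10, "ESD": 2 / 10, "AOD": 1.0}

for fmt in ("RAW", "ESD", "AOD"):
    total_tb = EVENTS_PER_YEAR * SIZE_MB[fmt] / 1e6      # MB -> TB
    per_t1_tb = total_tb * SHARE[fmt]
    print(f"{fmt}: total {total_tb:7.0f} TB/year, per Tier-1 {per_t1_tb:7.0f} TB/year")
```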
Total ATLAS Requirements for 2008
Important points • Storage of simulation data from Tier-2s • Assumed to be at T1s • Need partnerships to plan networking • Must have fail-over to other sites? • Commissioning • These numbers are calculated for the steady state, but with the requirement of flexibility in the early stages • The simulation fraction is an important tunable parameter in the T2 numbers (illustrated below)! • Simulation / real data = 20% in the TDR • Heavy-ion running still under discussion
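A minimal sketch of why the simulation fraction is a tunable knob in the T2 numbers: the required simulation CPU scales linearly with the simulation/real-data ratio. The per-event cost and event count below are placeholders, not TDR values.

```python
# Purely illustrative: Tier-2 simulation CPU vs. the simulation / real-data
# ratio (20% in the TDR).  The constants below are placeholders.
EVENTS_PER_YEAR = 2e9            # real events per year (assumption)
KSI2K_SEC_PER_SIM_EVENT = 100.0  # placeholder simulation cost per event
SECONDS_PER_YEAR = 3.15e7        # wall-clock seconds available in a year

def t2_sim_cpu(sim_fraction: float) -> float:
    """CPU (kSI2k) needed to simulate sim_fraction * EVENTS_PER_YEAR events."""
    sim_events = sim_fraction * EVENTS_PER_YEAR
    return sim_events * KSI2K_SEC_PER_SIM_EVENT / SECONDS_PER_YEAR

for frac in (0.10, 0.20, 0.30):
    print(f"sim/real = {frac:.0%}: ~{t2_sim_cpu(frac):,.0f} kSI2k across all Tier-2s")
```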
ATLAS T0 Resources
ATLAS T1 Resources
ATLAS T2 Resources
Resource offers per T1 for 2008
Tier-1 ↔ Tier-2 Bandwidth • The projected time profile of the nominal aggregate bandwidth expected for an average ATLAS Tier-1 and its three associated Tier-2s.
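For orientation, a small sketch of the arithmetic behind such bandwidth profiles: converting an assumed daily transfer volume into the sustained rate that must be provisioned. The volumes used are examples, not the actual ATLAS Tier-1/Tier-2 figures.

```python
# Sketch: converting a daily data volume into the sustained network rate that
# bandwidth profiles like the one above express.  Volumes are examples only.
def sustained_rate_mb_s(tb_per_day: float) -> float:
    """Sustained MB/s needed to move tb_per_day terabytes every 24 h."""
    return tb_per_day * 1e6 / 86_400  # TB -> MB, divided by seconds per day

for tb in (1, 5, 10, 20):
    mb_s = sustained_rate_mb_s(tb)
    print(f"{tb:3d} TB/day  ->  {mb_s:6.1f} MB/s  (~{mb_s * 8 / 1000:4.2f} Gb/s)")
```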
Tier-1 ↔ CERN Bandwidth • The projected time profile of the nominal bandwidth required between CERN and the Tier-1 cloud.
Tier-1 ↔ Tier-1 Bandwidth • The projected time profile of the nominal bandwidth required between a Tier-1 and the other Tier-1s.
Conclusions • The Computing Model data flow is understood for placing RAW, ESD and AOD at the tiered centers • Still need to understand the data flow implications of physics analysis • How often do you need to "back navigate" from AOD to ESD? • How distributed is Distributed Analysis? • Some of these issues will be addressed in the upcoming (early 2006) Computing System Commissioning exercise • Some will only be resolved with real data in 2007-2008
Key dates for Service Preparation • [Timeline 2005-2008: SC3, SC4, LHC service operation; first cosmics and first beams in 2007, full physics run in 2008] • Sep05 – SC3 Service Phase • May06 – SC4 Service Phase • Sep06 – Initial LHC Service in stable operation • Apr07 – LHC Service commissioned • SC3 – Reliable base service – most Tier-1s, some Tier-2s – basic experiment software chain – grid data throughput 1 GB/sec, including mass storage 500 MB/sec (150 MB/sec & 60 MB/sec at Tier-1s) • SC4 – All Tier-1s, major Tier-2s – capable of supporting full experiment software chain incl. analysis – sustain nominal final grid data throughput (~1.5 GB/sec mass storage throughput) • LHC Service in Operation – September 2006 – ramp up to full operational capacity by April 2007 – capable of handling twice the nominal data throughput
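To put the service-challenge throughput targets quoted above into perspective, the sketch below converts them into daily volumes, assuming each rate is sustained around the clock; labelling the per-Tier-1 150 and 60 MB/sec figures as disk and tape targets is an assumption.

```python
# Sketch: daily data volumes implied by the SC3/SC4 throughput targets above,
# if each rate were sustained for 24 hours.
TARGETS_MB_S = {
    "SC3 grid data throughput": 1000,
    "SC3 mass storage at CERN": 500,
    "SC3 per Tier-1, disk (assumed)": 150,
    "SC3 per Tier-1, tape (assumed)": 60,
    "SC4 mass storage": 1500,
}

for name, rate in TARGETS_MB_S.items():
    tb_per_day = rate * 86_400 / 1e6   # MB/s sustained for a day, in TB
    print(f"{name:32s} {rate:5d} MB/s  ~= {tb_per_day:6.1f} TB/day")
```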
Data Challenges
Statistics on LCG usage • 305,410 jobs • IN2P3 = Clermont + Lyon • Quite far from the ambitions • Many technical problems at start-up • Towards the end, ~10% of LCG jobs ran at the CC
SC3: T0 → T1 exercise • ATLAS SC3 T0 → T1 dataflow
Distributed Data Management (DDM) • Central catalogues (LFC): • Contents of the datasets (1 dataset = several files) • Location of the datasets at the sites (T0-T1-T2) • List of dataset transfer requests, etc. • Local catalogues (LFC): • Location within the site of the files of each dataset • Each site takes charge, through agents (VOBox), of: • Retrieving from the central catalogue the list of datasets and associated files to be transferred • Managing the transfer • Registering the information in the local and central catalogues
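A minimal sketch of the per-site VOBox agent cycle described in the last bullets; all function and method names are hypothetical and do not correspond to the real DDM code, they only mirror the three responsibilities listed above.

```python
# Hypothetical sketch of the per-site VOBox agent loop described on the slide.
# None of these names exist in the real DDM software; they only mirror the
# three responsibilities listed above.

def fetch_pending_subscriptions(central_catalogue, site):
    """Ask the central catalogue which datasets this site should receive."""
    return central_catalogue.subscriptions_for(site)          # hypothetical call

def transfer_files(files, site):
    """Hand the file list to the site's transfer machinery (e.g. FTS)."""
    for f in files:
        site.transfer_service.copy(f)                          # hypothetical call

def register(dataset, files, local_catalogue, central_catalogue, site):
    """Record the new replicas locally and advertise them centrally."""
    local_catalogue.add_replicas(dataset, files)               # hypothetical call
    central_catalogue.mark_dataset_at_site(dataset, site)      # hypothetical call

def vobox_agent_cycle(central_catalogue, local_catalogue, site):
    for dataset in fetch_pending_subscriptions(central_catalogue, site):
        files = central_catalogue.files_in(dataset)            # hypothetical call
        transfer_files(files, site)
        register(dataset, files, local_catalogue, central_catalogue, site)
```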
DDM at the CC • 1 dCache server • 2 pools of 1-2 TB each + tape drive: 40 MB/s • Transfers via FTS (Géant, 1 Gb/s) • When do we get 10 Gb/s? • LCG VOBox, CCIN2P3 mode • Linux SL3 machine • LFC catalogue • FTS server (for future T1-T1 or T1-T2 transfers) • Grid certificates • gsissh access • DDM software installed • No root access needed • Can be installed from CERN • Problem: cron is not managed per grid user
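A quick back-of-the-envelope calculation behind the "when do we get 10 Gb/s?" question: time to move one 2 TB pool's worth of data over the current 1 Gb/s link versus a 10 Gb/s one. The 50% usable-link efficiency is an assumption.

```python
# Time needed to move a given volume over the current 1 Gb/s link versus a
# 10 Gb/s one.  The 2 TB figure matches one dCache pool quoted above; the
# 50% efficiency factor is an assumption.
def transfer_hours(volume_tb: float, link_gbps: float, efficiency: float = 0.5) -> float:
    usable_mb_s = link_gbps * 1000 / 8 * efficiency   # Gb/s -> MB/s, derated
    return volume_tb * 1e6 / usable_mb_s / 3600       # TB -> MB, then to hours

for link in (1.0, 10.0):
    print(f"2 TB over {link:4.1f} Gb/s link: ~{transfer_hours(2.0, link):5.1f} h")
```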
Service challenges
Integrated TB transferred
Backup Slides
From LATB…
Heavy Ion Running
Preliminary Tier-1 Resource Planning: Capacity at all Tier-1s in 2008
Preliminary Tier-2 Resource Planning: Capacity at all Tier-2s in 2008 • Includes resource planning from 27 centres/federations • 11 known Tier-2 federations have not yet provided data • These include potentially significant resources: USA CMS, Canada East+West, MPI Munich, …
‘Rome’ production: French sites
‘Rome’ production: Italian sites
Distributed Data Management (DDM) • Avoid using a completely centralised, flat catalogue • Hierarchical organisation of the data catalogues • Dataset = collection of files • Datablock = collection of datasets • Tagging of datasets to maintain full consistency • Information about each physical file is stored locally • Data movements are done per dataset • Movement to a site is triggered by a subscription, through a client program
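A small sketch of the hierarchical data model described above: files grouped into datasets, datasets into datablocks, and movement requested per dataset via a subscription. Class and attribute names are illustrative, not the real DDM schema.

```python
# Illustrative data model for the DDM hierarchy described on this slide.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Dataset:
    name: str
    version: int = 1                                   # dataset tagging for consistency
    files: List[str] = field(default_factory=list)     # logical file names

@dataclass
class Datablock:
    name: str
    datasets: List[Dataset] = field(default_factory=list)

@dataclass
class Subscription:
    dataset: Dataset
    destination_site: str                              # movement is triggered per dataset

# Example: subscribe one dataset of a block to a Tier-1 site (names hypothetical).
esd = Dataset("mc.express.ESD", files=["file1.root", "file2.root"])
block = Datablock("mc.express", datasets=[esd])
sub = Subscription(esd, destination_site="CCIN2P3")
print(f"Subscribe {sub.dataset.name} v{sub.dataset.version} -> {sub.destination_site}")
```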