The ATLAS Computing Model

The ATLAS Computing Model. Roger Jones Lancaster University CHEP06 Mumbai 13 Feb. 2006. Overview. Brief summary ATLAS Facilities and their roles Growth of resources CPU, Disk, Mass Storage Network requirements CERN↔ Tier 1 ↔ Tier 2 Operational Issues and Hot Topics.

Presentation Transcript


  1. The ATLAS Computing Model Roger Jones Lancaster University CHEP06 Mumbai 13 Feb. 2006

  2. Overview • Brief summary ATLAS Facilities and their roles • Growth of resources • CPU, Disk, Mass Storage • Network requirements • CERN↔ Tier 1 ↔ Tier 2 • Operational Issues and Hot Topics RWL Jones 13 Feb. 2006 Mumbai

  3. Computing Resources • Computing Model fairly well evolved, documented in C-TDR • Externally reviewed • http://doc.cern.ch//archive/electronic/cern/preprints/lhcc/public/lhcc-2005-022.pdf • There are (and will remain for some time) many unknowns • Calibration and alignment strategy is still evolving • Physics data access patterns MAY be exercised from June • Unlikely to know the real patterns until 2007/2008! • Still uncertainties on the event sizes and reconstruction time • Lesson from the previous round of experiments at CERN (LEP, 1989-2000) • Reviews in 1988 underestimated the computing requirements by an order of magnitude!

  4. ATLAS Facilities • Event Filter Farm at CERN • Located near the Experiment, assembles data into a stream to the Tier 0 Center • Tier 0 Center at CERN • Raw data → Mass storage at CERN and to Tier 1 centers • Swift production of Event Summary Data (ESD) and Analysis Object Data (AOD) • Ship ESD, AOD to Tier 1 centers → Mass storage at CERN • Tier 1 Centers distributed worldwide (10 centers) • Re-reconstruction of raw data, producing new ESD, AOD • Scheduled, group access to full ESD and AOD • Tier 2 Centers distributed worldwide (approximately 30 centers) • Monte Carlo Simulation, producing ESD, AOD; ESD, AOD → Tier 1 centers • On demand user physics analysis • CERN Analysis Facility • Analysis • Heightened access to ESD and RAW/calibration data on demand • Tier 3 Centers distributed worldwide • Physics analysis

  5. Processing • Tier-0: • Prompt first pass processing on express/calibration physics stream • 24-48 hours later, process full physics data stream with reasonable calibrations • Implies large data movement from T0 →T1s • Tier-1: • Reprocess 1-2 months after arrival with better calibrations • Reprocess all resident RAW at year end with improved calibration and software • Implies large data movement from T1↔T1 and T1 → T2

  6. Analysis model • The analysis model is broken into two components • Scheduled central production of augmented AOD, tuples & TAG collections from ESD • Derived files moved to other T1s and to T2s • Chaotic user analysis of augmented AOD streams, tuples, new selections etc. and individual user simulation and CPU-bound tasks matching the official MC production • Modest job traffic between T2s

  7. Inputs to the ATLAS Computing Model (1)

  8. Inputs to the ATLAS Computing Model (2)

  9. Data Flow (diagram panels: Tier 0 view, Tier 2 view) • EF farm → T0 • 320 MB/s continuous • T0 Raw data → Mass Storage at CERN • T0 Raw data → Tier 1 centers • T0 ESD, AOD, TAG → Tier 1 centers • 2 copies of ESD distributed worldwide • T1 → T2 • Some RAW/ESD, All AOD, All TAG • Some group derived datasets • T2 → T1 • Simulated RAW, ESD, AOD, TAG • T0 → T2 Calibration processing?

  10. ATLAS partial & “average” T1 Data Flow (2008) • [Figure: flow diagram linking the Tier-0, tape, disk buffer, CPU farm, disk storage, the other Tier-1s and each Tier-2; each arrow is annotated with file size, file rate, throughput and daily volume] • Out of the Tier-0, RAW, ESD (2x), AODm (10x): 1 Hz, 85K f/day, 720 MB/s • Combined RAW, ESD2, AODm2 flow: 0.044 Hz, 3.74K f/day, 44 MB/s, 3.66 TB/day • RAW arrows: 1.6 GB/file, 0.02 Hz, 1.7K f/day, 32 MB/s, 2.7 TB/day • ESD arrows: 0.5 GB/file, 0.02 Hz, 1.7K f/day, 10 MB/s, 0.8 TB/day • AOD arrows: 10 MB/file, 0.2 Hz, 17K f/day, 2 MB/s, 0.16 TB/day • AODm arrows: 500 MB/file at 0.04 Hz (3.4K f/day, 20 MB/s, 1.6 TB/day), 0.036 Hz (3.1K f/day, 18 MB/s, 1.44 TB/day) or 0.004 Hz (0.34K f/day, 2 MB/s, 0.16 TB/day) • Plus simulation and analysis data flow
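The per-arrow figures on the data-flow slide are tied together by simple arithmetic: throughput is file size times file rate. The short sketch below checks that for the three flows that make up the combined "RAW ESD2 AODm2" total. The pairing of labels with numbers is inferred from that total and is an assumption, as are the helper names.

```python
# Per-flow figures quoted on the slide: (file size in MB, files per second).
# The pairing of each label with its numbers is an assumption inferred from
# the combined "RAW ESD2 AODm2" total; it is not spelled out in the transcript.
FLOWS = {
    "RAW": (1600, 0.02),
    "ESD2": (500, 0.02),
    "AODm2": (500, 0.004),
}

SECONDS_PER_DAY = 86_400


def throughput_mb_s(size_mb, rate_hz):
    """Sustained throughput in MB/s for files of size_mb arriving at rate_hz."""
    return size_mb * rate_hz


def files_per_day(rate_hz):
    """Number of files arriving per day at a steady rate_hz."""
    return rate_hz * SECONDS_PER_DAY


per_flow = {name: throughput_mb_s(*spec) for name, spec in FLOWS.items()}
total_mb_s = sum(per_flow.values())
total_rate_hz = sum(rate for _, rate in FLOWS.values())

for name, mb_s in per_flow.items():
    rate_hz = FLOWS[name][1]
    print(f"{name}: {mb_s:.0f} MB/s, {files_per_day(rate_hz):.0f} files/day")
print(f"total: {total_mb_s:.0f} MB/s at {total_rate_hz:.3f} Hz")
```

Under these assumptions the three flows sum to the 44 MB/s and 0.044 Hz quoted for the combined arrow, and 0.02 Hz corresponds to 1728 files per day, which the slide rounds to 1.7K. The daily-volume figures on the slide are rounded and are not reproduced here.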

  11. Total ATLAS Requirements for 2008

  12. Important points: • Discussion on disk vs tape storage at Tier-1’s • Tape in this discussion means low-access secure storage • No ‘disk buffers’ included except input to Tier 0 • Storage of Simulation data from Tier 2’s • Assumed to be at T1s • Need partnerships to plan networking • Must have fail-over to other sites • Commissioning • Requirement of flexibility in the early stages • Simulation is a tunable parameter in T2 numbers! • Heavy Ion running still under discussion.

  13. ATLAS T0 Resources

  14. ATLAS T1 Resources

  15. ATLAS T2 Resources

  16. Required Network Bandwidth • Caveats • No safety factors • No headroom • Just sustained average numbers • Assumes no years/datasets are ‘junked’ • Physics analysis pattern still under study…

  17. T1 ↔ CERN Bandwidth I+O • Mainly outward data movement • Figure: the projected time profile of the nominal bandwidth required between CERN and the Tier-1 cloud.

  18. T1 ↔ T1 Bandwidth I+O • About half is scheduled analysis • Figure: the projected time profile of the nominal bandwidth required between a T1 and the T1 cloud.

  19. T1 ↔ T2 Bandwidth I+O • Dominated by AOD • Figure: the projected time profile of the nominal aggregate bandwidth expected for an average ATLAS Tier-1 and its three associated Tier-2s.

  20. Issues 1: T1 Reprocessing • Reprocessing at Tier 1s is understood in principle • In practice, requires efficient recall of data from archive and processing • Pinning, pre-staging, DAGs all required? • Requires the different storage roles to be well understood

  21. Issues 2: Streaming • This is *not* a theological issue • All discussions are about optimisation of data access • TDR has 4 streams from event filter • primary physics, calibration, express, problem events • Calibration stream has split at least once since! • At AOD, envisage ~10 streams • ESD streaming? • Straw man streaming schemes (trigger based) being agreed • Will explore the access improvements in large-scale exercises • Will also look at overlaps, bookkeeping etc
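To make the overlap question on the streaming slide concrete, here is a minimal sketch of trigger-based streaming in which an event is written to every stream whose trigger fired. The trigger names, stream names and the toy event list are all hypothetical; the talk does not specify the AOD streams or whether overlapping events would be duplicated.

```python
from collections import Counter

# Hypothetical trigger-to-stream map; neither the trigger names nor the
# stream names come from the talk.
STREAM_OF_TRIGGER = {"mu20": "muon", "e25": "electron", "j100": "jet"}


def streams_for(fired_triggers):
    """Return the set of streams an event belongs to, given its fired triggers."""
    return {STREAM_OF_TRIGGER[t] for t in fired_triggers if t in STREAM_OF_TRIGGER}


# Toy events, each described only by the set of triggers that fired.
events = [{"mu20"}, {"e25", "j100"}, {"mu20", "e25"}, {"j100"}]

stream_sizes = Counter()   # number of events written to each stream
overlap = 0                # events that land in more than one stream
for fired in events:
    streams = streams_for(fired)
    stream_sizes.update(streams)
    if len(streams) > 1:
        overlap += 1

print(dict(stream_sizes), "overlapping events:", overlap)
```

In this toy case the stream sizes add up to more than the number of events, which is the kind of duplication and bookkeeping cost the large-scale exercises would need to quantify.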

  22. TAG Access • TAG is a keyed list of variables/event • Two roles • Direct access to event in file via pointer • Data collection definition function • Two formats, file and database • Now believe large queries require full database • Restricts it to Tier1s and large Tier2s/CAF • Ordinary Tier2s hold file-based TAG corresponding to locally-held datasets
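The two TAG roles described on this slide can be illustrated with a small sketch: each record carries a few per-event variables plus a pointer to the full event, and a query over the variables defines a data collection as a list of pointers. The field names, variable names and file names below are hypothetical and do not reflect the actual ATLAS TAG schema.

```python
from dataclasses import dataclass


@dataclass
class TagRecord:
    """One TAG entry: per-event variables plus a pointer to the full event."""
    run: int
    event: int
    variables: dict   # hypothetical selection variables, e.g. {"n_muons": 2}
    file_name: str    # file holding the full event (hypothetical name)
    entry: int        # position of the event inside that file


def select(tags, predicate):
    """Define a data collection: pointers to events whose variables pass the cut."""
    return [(t.file_name, t.entry) for t in tags if predicate(t.variables)]


tags = [
    TagRecord(1, 10, {"n_muons": 2, "missing_et": 35.0}, "aod_0001.root", 0),
    TagRecord(1, 11, {"n_muons": 0, "missing_et": 80.0}, "aod_0001.root", 1),
    TagRecord(2, 57, {"n_muons": 3, "missing_et": 12.0}, "aod_0002.root", 4),
]

# Role 2: the query defines a collection. Role 1: each pointer gives direct
# access to the selected event without scanning the whole file.
dimuon = select(tags, lambda v: v["n_muons"] >= 2)
print(dimuon)
```

A file-based TAG would hold such records next to the locally held datasets, while the database format would serve the same records to large queries; the in-memory list here stands in for either.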

  23. Conclusions • Computing Model Data Flow understood for placing Raw, ESD and AOD at Tiered centers • Still need to understand data flow implications of Physics Analysis • SC4/Computing System Commissioning in 2006 is vital. • Some issues will only be resolved with real data in 2007-8

  24. Backup Slides

  25. Heavy Ion Running
