CERN IT-SDC The ATLAS data flow Simone Campana
Implementation of the ATLAS computing model: tiers and clouds
• Hierarchical tier organization based on the MONARC network topology
• Sites are grouped into clouds for organizational reasons
• Possible communications:
  • Optical Private Network
    • T0-T1
    • T1-T1
  • National networks
    • Intra-cloud T1-T2
• Restricted communications: general public network
  • Inter-cloud T1-T2
  • Inter-cloud T2-T2
ADC Retreat Napoli 1-4 February 2011
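The link rules on this slide can be written down as a small lookup. The following is a minimal sketch, not ATLAS code: the `Site` class, the `classify_link` function, and the labels it returns are all hypothetical names chosen for illustration. Pairings the slide does not mention (for example intra-cloud T2-T2) are reported as "unspecified" rather than guessed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    """Hypothetical site record: a name, a tier (0, 1 or 2) and a cloud label."""
    name: str
    tier: int
    cloud: str


def classify_link(a: Site, b: Site) -> str:
    """Classify a direct a<->b link according to the slide's lists."""
    tiers = tuple(sorted((a.tier, b.tier)))
    same_cloud = a.cloud == b.cloud
    if tiers in ((0, 1), (1, 1)):
        return "opn"          # T0-T1 and T1-T1: Optical Private Network
    if tiers == (1, 2):
        # Intra-cloud T1-T2 uses national networks; inter-cloud is restricted.
        return "national" if same_cloud else "restricted"
    if tiers == (2, 2) and not same_cloud:
        return "restricted"   # inter-cloud T2-T2: general public network
    return "unspecified"      # not covered by the slide


if __name__ == "__main__":
    t1 = Site("T1_A", 1, "A")
    t2_other = Site("T2_B", 2, "B")
    print(classify_link(t1, t2_other))
```

The direction of the transfer does not matter here, which is why the two tiers are sorted before the comparison.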
Detector Data Distribution
• RAW and reconstructed data are generated at CERN and dispatched to the T1s.
• Reconstructed data are further replicated downstream to the T2s of the SAME cloud.
• Files are O(2 to 4 GB), with exceptions.
[Diagram: Tier-0, three Tier-1s, nine Tier-2s]
Data distribution after Reprocessing and Monte Carlo Reconstruction
• RAW data are re-processed at T1s to produce a new version of derived data
• Derived data are replicated to T1s of the same cloud
• Derived data are replicated to a few other T1s (or CERN)
• And, from there, to other T2s of the same cloud
• Files are O(2 to 4 GB), with exceptions
[Diagram: Tier-0, two Tier-1s, five Tier-2s]
Monte Carlo production
• Simulation (and some reconstruction) runs at T2s
• Input data hosted at T1s are transferred to (and cached at) T2s
• Output data are copied and stored back to T1s
• For reconstruction, derived data are
  • replicated to a few other T1s (or CERN)
  • and, from there, to other T2s of the same cloud
[Diagram: Tier-0, two Tier-1s, three Tier-2s, with INPUT and OUTPUT flows]
Analysis
• The paradigm is “jobs go to data”, i.e.
  • Jobs are brokered to sites where data have been pre-placed
  • Jobs access data only from the local storage of the site where they run
  • Jobs store their output in the storage of the site where they run
• No WAN involved.
Simone Campana – ATLAS TIM Tokyo
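The “jobs go to data” rule amounts to intersecting the replica locations of a job's inputs. The sketch below is illustrative only: the `broker` function, the replica catalogue layout (a dict from dataset name to a set of site names), and the dataset and site labels are assumptions, not the actual ATLAS brokering code.

```python
def broker(job_inputs, replicas):
    """Return the sites that already hold every input dataset of a job.

    job_inputs: list of dataset names (hypothetical labels).
    replicas:   dict mapping dataset name -> set of site names holding a copy.
    """
    candidates = None
    for dataset in job_inputs:
        holders = set(replicas.get(dataset, ()))
        # Keep only the sites that hold all datasets seen so far.
        candidates = holders if candidates is None else candidates & holders
    return sorted(candidates or [])


if __name__ == "__main__":
    catalogue = {"ds1": {"T2_A", "T2_B"}, "ds2": {"T2_A"}}
    print(broker(["ds1", "ds2"], catalogue))
```

An empty result means no site has all the inputs pre-placed, so under this paradigm the job cannot be brokered until the data are replicated somewhere.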
Issues - I
• You need data at some T2 (normally “your” T2)
• The inputs are at some other T2 in a different cloud
• Examples:
  • Outputs of analysis jobs
  • Replication of particular samples on demand
According to the model you should:
[Diagram: two Tier-1s, two Tier-2s]
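One reading of the restrictions listed earlier (no inter-cloud T1-T2 or T2-T2 links) is that an inter-cloud T2-to-T2 copy has to hop through the T1 of each cloud. The sketch below illustrates that reading only. It assumes exactly one T1 per cloud, and the `hierarchical_route` function and all site and cloud names are hypothetical.

```python
def hierarchical_route(src_t2, src_cloud, dst_t2, dst_cloud, cloud_t1):
    """List the hops for an inter-cloud T2 -> T2 copy under the strict model.

    cloud_t1 maps a cloud label to its T1 (assumption: one T1 per cloud).
    Only the different-cloud case is covered.
    """
    if src_cloud == dst_cloud:
        raise ValueError("this sketch only covers transfers between different clouds")
    # T2 -> own T1 (national network) -> remote T1 (OPN) -> remote T2 (national network)
    return [src_t2, cloud_t1[src_cloud], cloud_t1[dst_cloud], dst_t2]


if __name__ == "__main__":
    t1s = {"A": "T1_A", "B": "T1_B"}
    route = hierarchical_route("T2_A", "A", "T2_B", "B", t1s)
    print(route, "->", len(route) - 1, "transfers")
```

Each file is therefore copied three times and leaves two intermediate replicas at the T1s, which is the overhead this slide points at.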
Issues - II
• You need to process data available only at a given T1
• All sites of that cloud are very busy
• You assign jobs to some T2 of a different cloud
According to the model you should:
[Diagram: two Tier-1s, one Tier-2, with INPUT and OUTPUT flows]
Evolution of the model
• ATLAS decided to relax the MONARC model
• Allow T1-T2 and T2-T2 traffic between different clouds
• Any site can exchange data with any site if the system believes it is convenient
• The “convenience” is measured in terms of per-file transfer rate, comparing direct transfers with multi-hop transfers
Evolution of the model
[Diagram: four Tier-1s and three Tier-2s, with flows labeled "Multi-Cloud Monte Carlo production" and "Analysis Output"]
How to choose a transfer path?
• Currently
  • If a source file is available in the same cloud as the destination, transfer directly
  • If a source file is available in another cloud
    • Compute the time it would take to transfer directly
    • Compute the time it would take to create intermediate replica(s) at T1(s)
    • Times are computed based on previous transfers between the same sites
    • Take the most “time convenient” option
• In future
  • We intend to eliminate the separation between cloud and non-cloud sites
  • We intend to eliminate the creation of intermediate replicas (and always use direct transfers)
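The current decision procedure can be sketched as a comparison of estimated times. This is a rough illustration under stated assumptions, not the production logic: the function names, the history layout (a dict from a (source, destination) pair to a list of past durations in seconds), the use of the mean as estimator, the summing of hop times for a multi-hop path, and the fallback to a direct transfer when no history exists are all choices made for this example.

```python
from statistics import mean


def estimate(history, src, dst):
    """Mean duration of earlier src -> dst transfers, or None if never observed."""
    samples = history.get((src, dst))
    return mean(samples) if samples else None


def choose_path(src, dst, same_cloud, t1_chains, history):
    """Pick a direct or multi-hop path for one file.

    t1_chains: candidate lists of intermediate T1s, e.g. [["T1_A", "T1_B"]].
    """
    if same_cloud:
        return [src, dst]  # same cloud: always transfer directly
    options = [[src, dst]] + [[src, *chain, dst] for chain in t1_chains]
    best, best_time = None, None
    for path in options:
        hop_times = [estimate(history, a, b) for a, b in zip(path, path[1:])]
        if any(t is None for t in hop_times):
            continue  # no history for some hop: cannot estimate this option
        total = sum(hop_times)  # assumption: hops run one after another
        if best_time is None or total < best_time:
            best, best_time = path, total
    # Assumption: with no usable history at all, fall back to a direct transfer.
    return best if best is not None else [src, dst]


if __name__ == "__main__":
    past = {
        ("T2_A", "T2_B"): [100, 120],
        ("T2_A", "T1_A"): [20],
        ("T1_A", "T1_B"): [10],
        ("T1_B", "T2_B"): [30],
    }
    print(choose_path("T2_A", "T2_B", False, [["T1_A", "T1_B"]], past))
```

With the sample history above, the three hops are estimated at 60 seconds in total against 110 seconds for the direct copy, so the multi-hop path is chosen. The future direction stated on the slide would reduce this function to its first return.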
To Conclude
Evolution of the model and impact on the connectivity for a large T2 (like Tokyo)
• So far ATLAS has asked (large) T2s:
  • To be well connected to their T1
  • To be well connected to the T2s of their cloud
• Now we are asking large T2s:
  • To be well connected to all T1s
  • To foresee non-negligible traffic from/to other (large) T2s