An overview of the challenges, data acquisition, and impact of the Large Hadron Collider (LHC) Computing Grid: a worldwide collaboration, illustrated with the ATLAS experiment, whose computing service operates in coordination with some 140 computer centres and international grid projects.
The LHC Computing Grid
Visit of Prof. Frantisek Gaher, Rector, Comenius University, Bratislava, Slovak Republic
Friday 13th March 2009
Frédéric Hemmer, IT Department Head
Outline • Computing Challenges • Current Grid Usage • EGEE & EGI
The ATLAS experiment: 7000 tons, 150 million sensors generating data 40 million times per second, i.e. about a petabyte per second
The Data Acquisition
Tier 0 at CERN: Acquisition, first-pass processing, storage & distribution. 1.25 GB/sec (ions)
The LHC Computing Challenge • Signal/Noise: 10⁻⁹ • Data volume • High rate × large number of channels × 4 experiments • 15 PetaBytes of new data each year • Compute power • Event complexity × number of events × thousands of users • 100 k of (today's) fastest CPUs • 45 PB of disk storage • Worldwide analysis & funding • Computing funding locally in major regions & countries • Efficient analysis everywhere • GRID technology
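As a rough illustration of what these numbers imply (a back-of-envelope sketch, not taken from the slides), 15 PB of new data per year corresponds to an average recording rate of a few hundred MB/s:

```python
# Back-of-envelope check (illustrative only): what average data rate does
# "15 PetaBytes of new data each year" correspond to?

SECONDS_PER_YEAR = 365 * 24 * 3600      # ~3.15e7 seconds
new_data_bytes_per_year = 15e15         # 15 PB/year, figure quoted on the slide

avg_rate_mb_per_s = new_data_bytes_per_year / SECONDS_PER_YEAR / 1e6
print(f"Average recording rate: {avg_rate_mb_per_s:.0f} MB/s")   # ~475 MB/s
```

That is the same order of magnitude as the ~650 MB/s Tier-0 export rate quoted later in the talk.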
The Worldwide LHC Computing • The LHC Grid Service is a worldwide collaboration between: • 4 LHC experiments and • ~140 computer centres that contribute resources • International grid projects providing software and services • The collaboration is brought together by a MoU that: • Commits resources for the coming years • Agrees a certain level of service availability and reliability • As of today 33 countries have signed the MoU: • CERN (Tier 0) + 11 large Tier 1 sites • 130 Tier 2 sites in 60 “federations” • Other sites are expected to participate but without formal commitment
Tier 0 – Tier 1 – Tier 2 • Tier-0 (CERN): • Data recording • Initial data reconstruction • Data distribution • Tier-1 (11 centres): • Permanent storage • Re-processing • Analysis • Tier-2 (~130 centres): • Simulation • End-user analysis
Preparation for accelerator start-up • Since 2004 WLCG has been running a series of challenges to demonstrate aspects of the system, with increasing targets for: • Data throughput • Workloads • Service availability and reliability • Culminating in a one-month challenge in May 2008 with • All 4 experiments running realistic work (simulating what will happen in data taking) • Demonstrated that we were ready for real data • In essence the LHC Grid service has been running for several years
Recent grid use • CERN: 11% • Tier 1: 35% • Tier 2: 54% • The grid concept really works: all sites, large and small, contribute to the overall effort!
Recent grid activity • In readiness testing WLCG ran more than 10 million jobs/month, i.e. ~350k jobs/day • (1 job is ~8 hours' use of a single processor) • These workloads are at the level anticipated for 2009 data
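A rough consistency check of those figures (an illustrative sketch using only the numbers quoted above, not an official WLCG calculation):

```python
# Rough consistency check of the slide's figures (illustrative only).
jobs_per_month = 10_000_000
cpu_hours_per_job = 8                 # "1 job is ~8 hours' use of a single processor"
hours_per_month = 30 * 24             # ~720 hours

jobs_per_day = jobs_per_month / 30
busy_processors = jobs_per_month * cpu_hours_per_job / hours_per_month

print(f"Jobs per day:         {jobs_per_day:,.0f}")      # ~333,000, i.e. the ~350k/day quoted
print(f"Processors kept busy: {busy_processors:,.0f}")    # ~111,000, consistent with the ~100k CPUs above
```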
Data transfer out of Tier 0 • Full experiment rate needed is 650 MB/s • Desire capability to sustain twice that, to allow Tier 1 sites to shut down and recover • Have demonstrated far in excess of that • All experiments exceeded required rates for extended periods, & simultaneously • All Tier 1s achieved (or exceeded) their target acceptance rates
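To put those rates in perspective, here is a small illustrative calculation (assumed arithmetic, not taken from the slides) of the daily volume and the headroom target:

```python
# Illustrative arithmetic on the quoted Tier-0 export rates (not from the slides).
nominal_rate_mb_s = 650                       # full experiment rate out of Tier 0
headroom_rate_mb_s = 2 * nominal_rate_mb_s    # target: sustain twice the nominal rate

daily_volume_tb = nominal_rate_mb_s * 86_400 / 1e6   # MB/s over one day, in TB
print(f"Nominal export volume: {daily_volume_tb:.0f} TB/day")          # ~56 TB/day
print(f"Headroom target:       {headroom_rate_mb_s / 1000:.2f} GB/s")  # 1.30 GB/s
```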
WLCG Site reliability • Overall average 75-80% • Top 50% of sites: 95% • Top 20% of sites: 98% • >70% of resources are at sites with >90% reliability • Tier-1 reliability above target (93%) in May 2008 (chart: CERN + Tier-1 reliability)
Improving Reliability • Monitoring • Metrics • Workshops • Data challenges • Experience • Systematic problem analysis • Priority from software developers
Production Grids • WLCG relies on a production quality infrastructure • Requires standards of: • Availability/reliability • Performance • Manageability • Will be used 365 days a year ... (has been for several years!) • Tier 1s must store the data for at least the lifetime of the LHC - ~20 years • Not passive – requires active migration to newer media • Vital that we build a fault-tolerant and reliable system • That can deal with individual sites being down and recover
Impact of the LHC Computing Grid in Europe • LCG has been the driving force for the European multi-science Grid EGEE (Enabling Grids for E-sciencE) • EGEE is now a global effort, and the largest Grid infrastructure worldwide • Co-funded by the European Commission (cost: ~130 M€ over 4 years, funded by EU ~70 M€) • EGEE already used for >100 applications, including bio-informatics, medical imaging, and education & training…
Grid infrastructure project co-funded by the European Commission, now in its 3rd phase, with partners in 45 countries • 240 sites • 45,000 CPUs • 12 PetaBytes • >5000 users • >100 VOs • >100,000 jobs/day • Application domains: • Archeology • Astronomy • Astrophysics • Civil Protection • Comp. Chemistry • Earth Sciences • Finance • Fusion • Geophysics • High Energy Physics • Life Sciences • Multimedia • Material Sciences • …
The EGEE project • EGEE-III Slovak Beneficiary: Ustav Informatiky, Slovenska Akademia Vied • EGEE • Started in April 2004, now in third phase (2008-2010) • Brings together more than 240 institutions in 45 countries world-wide • Objectives • Large-scale, production-quality grid infrastructure for e-Science • Attracting new resources and users from industry as well as science • Maintain and further improve the "gLite" Grid middleware
Sustainability • Need to prepare for permanent Grid infrastructure • Ensure a high quality of service for all user communities • Independent of short project funding cycles • Infrastructure managed in collaboration with National Grid Initiatives (NGIs) • European Grid Initiative (EGI)
The LHC Grid Service • A worldwide collaboration • Has been in production for several years • Is now being used for real data • Is ready to face the computing challenges as LHC gets up to full speed
The very first beam-splash event from the LHC in ATLAS, at 10:19 on 10th September 2008