The Worldwide LHC Computing Grid
Visit of INTEL ISEF CERN Special Award Winners 2012
Thursday, 21st June 2012
Frédéric Hemmer, IT Department Head
The LHC Data Challenge
• The accelerator will run for 20 years
• Experiments are producing about 20 million gigabytes of data each year (about 3 million DVDs – 700 years of movies!)
• LHC data analysis requires computing power equivalent to ~100,000 of today's fastest PC processors
• This requires many cooperating computer centres, as CERN can only provide ~20% of the capacity
June 2012 - Frédéric Hemmer
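The figures on this slide can be cross-checked with a little arithmetic. The sketch below assumes decimal units (1 PB = 1,000,000 GB), a 365-day year, and roughly two hours of film per DVD; the two-hour figure is an assumption for illustration and is not stated on the slide.

```python
# Back-of-the-envelope check of the LHC data-volume figures.
# Assumptions (not from the slide): decimal units, a 365-day year,
# and roughly 2 hours of film per DVD.

GB_PER_PB = 1_000_000
SECONDS_PER_YEAR = 365 * 24 * 3600
HOURS_PER_YEAR = 365 * 24

data_gb_per_year = 20_000_000   # "about 20 million gigabytes" per year
dvds = 3_000_000                # "about 3 million DVDs"
hours_per_dvd = 2               # assumed running time per DVD
processors_needed = 100_000     # "~100,000 of today's fastest PC processors"
cern_share = 0.20               # "~20% of the capacity"

data_pb_per_year = data_gb_per_year / GB_PER_PB
avg_rate_gb_s = data_gb_per_year / SECONDS_PER_YEAR
movie_years = dvds * hours_per_dvd / HOURS_PER_YEAR
cern_processors = processors_needed * cern_share

print(f"{data_pb_per_year:.0f} PB per year")
print(f"{avg_rate_gb_s:.2f} GB/s averaged over a full year")
print(f"{movie_years:.0f} years of continuous movies")
print(f"{cern_processors:.0f} processor-equivalents provided by CERN")
```

Under these assumptions, 3 million DVDs come out at roughly 685 years of viewing, which is consistent with the slide's "700 years of movies".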
WLCG – what and why?
• A distributed computing infrastructure to provide the production and analysis environments for the LHC experiments
• Managed and operated by a worldwide collaboration between the experiments and the participating computer centres
• The resources are distributed – for funding and sociological reasons
• Our task was to make use of the resources available to us – no matter where they are located
• Tier-0 (CERN):
  • Data recording
  • Initial data reconstruction
  • Data distribution
• Tier-1 (11 centres):
  • Permanent storage
  • Re-processing
  • Analysis
• Tier-2 (~130 centres):
  • Simulation
  • End-user analysis
Global Lambda Integrated Facility
Data acquired in 2012
• 2012 data written: total 9.4 PB to end of May
• >3 PB in May (cf. 2 PB/month in 2011)
[Plot: Data accessed from tape, 2012]
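The data-taking figures above can be turned into average rates. This is a rough sketch that assumes decimal units, a 31-day May, and that "to end May" spans five calendar months; in practice data-taking may not have run continuously over that period.

```python
# Rough average rates from the 2012 data-taking figures.
# Assumptions (not from the slide): decimal units, five calendar
# months elapsed, and the ">3 PB" lower bound used as the May value.

GB_PER_PB = 1_000_000
SECONDS_IN_MAY = 31 * 24 * 3600

total_pb_to_end_may = 9.4       # "Total 9.4 PB to end May"
may_pb = 3.0                    # lower bound from ">3 PB in May"
rate_2011_pb_per_month = 2.0    # "2 PB/month in 2011"
months_elapsed = 5              # January through May (assumed)

avg_pb_per_month = total_pb_to_end_may / months_elapsed
may_vs_2011 = may_pb / rate_2011_pb_per_month
may_avg_gb_s = may_pb * GB_PER_PB / SECONDS_IN_MAY

print(f"Average over the period: {avg_pb_per_month:.2f} PB/month")
print(f"May 2012 vs. 2011 monthly rate: at least {may_vs_2011:.1f}x")
print(f"May 2012 sustained average: at least {may_avg_gb_s:.2f} GB/s")
```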
Data transfers
• Global transfers (last month)
• Global transfers > 10 GB/s (1 day)
• CERN Tier 1s (last 2 weeks)
WLCG – no stop for computing
• Activity on 3rd Jan
Sequence Production & IT Infrastructure at EMBL
• Sequencers: 4 x Illumina HiSeq 2000, 2 x Illumina GAIIx
• 25 TB of data each week
• Compute power: 2000+ CPU cores, 6+ TB RAM
• Storage: 1+ PB high-performance disk
NGS - The Big Picture
• ~8.7 million species in the world (estimate)
• ~7 billion people
• Sequencers exist in both large centres & small research groups
• > 200 Illumina HiSeq sequencers in Europe alone => capacity to sequence 1600 human genomes / month
• Largest centre: Beijing Genomics Institute (BGI)
  • 167 sequencers, 130 HiSeq
  • 2,000 human genomes / day
• 500-1000 HiSeq devices worldwide today
  • 3-6 PB / day
  • 1.1 – 2.2 Exabytes / year
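The worldwide sequencing figures can be checked the same way. The sketch below assumes decimal units and a 365-day year, and it pairs the low device count with the low daily volume (and high with high); that pairing is an assumption, since the slide gives only the ranges.

```python
# Consistency check of the worldwide sequencing-output figures.
# Assumptions (not from the slide): decimal units, a 365-day year,
# and low/high ends of the ranges paired with each other.

PB_PER_EB = 1000
TB_PER_PB = 1000
DAYS_PER_YEAR = 365

devices_low, devices_high = 500, 1000        # HiSeq devices worldwide
pb_per_day_low, pb_per_day_high = 3, 6       # "3-6 PB / day"

eb_per_year_low = pb_per_day_low * DAYS_PER_YEAR / PB_PER_EB
eb_per_year_high = pb_per_day_high * DAYS_PER_YEAR / PB_PER_EB

tb_per_device_day_low = pb_per_day_low * TB_PER_PB / devices_low
tb_per_device_day_high = pb_per_day_high * TB_PER_PB / devices_high

# European capacity: 1600 genomes/month across 200 sequencers.
genomes_per_sequencer_month = 1600 / 200

print(f"{eb_per_year_low:.1f} to {eb_per_year_high:.1f} EB/year")
print(f"{tb_per_device_day_low:.0f} to {tb_per_device_day_high:.0f} TB per device per day")
print(f"{genomes_per_sequencer_month:.0f} genomes per sequencer per month")
```

Both ends of the range imply the same per-device output of about 6 TB per day, and the annual totals reproduce the slide's 1.1 to 2.2 exabytes per year.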
The CERN Data Centre in Numbers
• Data Centre Operations (Tier 0)
  • 24x7 operator support and system administration services to support 24x7 operation of all IT services
  • Hardware installation & retirement
    • ~7,000 hardware movements/year; ~1,800 disk failures/year
  • Management and automation framework for large-scale Linux clusters
Scaling CERN Data Center(s) to anticipated needs
• Renovation of the “barn” to accommodate 450 kW of “critical” IT loads – an EN, FP, GS, HSE, IT joint venture
• Exploitation of 100 kW of remote facility downtown
  • Understanding costs, remote dynamic management, ensuring business continuity
• Exploitation of a remote data center in Hungary
  • 100 Gbps connections
• Agile infrastructure
  • Virtualization
• CERN Data Center dates back to the 1970s
  • Now optimizing the current facility (cooling automation, temperatures, infrastructure)