(European) Networking infrastructure needs for Particle Physics
Matthias Kasemann, DESY and CERN
Serenate Users Workshop, January 17-19, 2003
HEP institutes (from PDG, 2002)

Country      No.   Country      No.   Country      No.   Country      No.   Country      No.
Argentina      6   Colombia       4   India         17   Pakistan       1   Taiwan         5
Armenia        1   Costa Rica     1   Indonesia      2   Peru           2   Ukraine        5
Australia      8   Croatia        2   Iran           3   Poland         7   UK            24
Austria        6   Cuba           1   Ireland        3   Portugal       5   Uruguay        1
Azerbaijan     2   Czech Rep.     4   Israel         6   Romania        3   USA          166
Belarus        1   Denmark        3   Italy         33   Russia        15   Uzbekistan     2
Belgium        8   Egypt          2   Japan         37   Slovakia       4   Venezuela      2
Brazil        10   Finland        4   Korea         15   Slovenia       2   Vietnam        3
Bulgaria       2   France        30   Mexico         8   South Africa   2   Yugoslavia     3
Canada        22   Georgia        3   Mongolia       1   Spain         11
Chile          5   Germany       39   Netherlands    6   Sweden         6
China         22   Hungary        3   Norway         3   Switzerland   10

602 institutes, collaborating at ~10 accelerator centers around the world
HEP networking needs M. Kasemann
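As a quick consistency check, the per-country counts from the PDG table can be summed and compared with the quoted total of 602 institutes. The snippet below is only an illustrative sketch: the dictionary name is hypothetical, and the values are copied from the slide.

```python
# Institute counts per country, copied from the PDG 2002 table on the slide.
# The variable name is illustrative and not part of the original talk.
institutes_by_country = {
    "Argentina": 6, "Armenia": 1, "Australia": 8, "Austria": 6,
    "Azerbaijan": 2, "Belarus": 1, "Belgium": 8, "Brazil": 10,
    "Bulgaria": 2, "Canada": 22, "Chile": 5, "China": 22,
    "Colombia": 4, "Costa Rica": 1, "Croatia": 2, "Cuba": 1,
    "Czech Rep.": 4, "Denmark": 3, "Egypt": 2, "Finland": 4,
    "France": 30, "Georgia": 3, "Germany": 39, "Hungary": 3,
    "India": 17, "Indonesia": 2, "Iran": 3, "Ireland": 3,
    "Israel": 6, "Italy": 33, "Japan": 37, "Korea": 15,
    "Mexico": 8, "Mongolia": 1, "Netherlands": 6, "Norway": 3,
    "Pakistan": 1, "Peru": 2, "Poland": 7, "Portugal": 5,
    "Romania": 3, "Russia": 15, "Slovakia": 4, "Slovenia": 2,
    "South Africa": 2, "Spain": 11, "Sweden": 6, "Switzerland": 10,
    "Taiwan": 5, "Ukraine": 5, "UK": 24, "Uruguay": 1,
    "USA": 166, "Uzbekistan": 2, "Venezuela": 2, "Vietnam": 3,
    "Yugoslavia": 3,
}

total_institutes = sum(institutes_by_country.values())
country_count = len(institutes_by_country)

# Largest contributors, sorted by institute count (descending).
top_five = sorted(institutes_by_country.items(), key=lambda kv: kv[1], reverse=True)[:5]

print(f"{total_institutes} institutes in {country_count} countries")
print("Largest:", top_five)
```

The sum reproduces the 602 institutes quoted on the slide, spread over 57 listed countries.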
From Physics to Raw Data: what happens in a detector
[Figure: an e+ e- collision producing a Z0 that decays to a fermion pair (f, f-bar), followed through the stages below down to a block of raw data words; event size 250 KB to 1 MB]
• Theoretical model of particle interaction
• Fragmentation, decay
• Interaction with detector material: multiple scattering, interactions
• Detector response: noise, pile-up, cross-talk, inefficiency, ambiguity, resolution, response function, alignment, temperature
• Raw data (bytes): read-out addresses, ADC, TDC values, bit patterns
Particle production and decays observed in detectors are quantum mechanical processes. Hundreds or thousands of different production and decay channels are possible, all with different probabilities. In the end, all we measure are probabilities!
From Raw Data to Physics: what happens during analysis
[Figure: the same e+ e- -> Z0 -> f f-bar chain, traversed from a block of raw data words back to the basic physics]
• Raw data: convert to physics quantities
• Detector response: apply calibration, alignment
• Interaction with detector material: pattern recognition, particle identification
• Fragmentation, decay: physics analysis
• Basic physics: results
Processing steps shown: Reconstruction, Analysis, Simulation (Monte Carlo)
Event size labels along the chain: 250 KB to 1 MB, 100 KB, 25 KB, 5 KB, 500 B
Particle Physics Computing Challenges
• Geographical dispersion: of people and resources
• Complexity: the detector and the data
• Scale: Petabytes per year of data per experiment
Example: CMS Experiment, with 1750 physicists, 150 institutes, 32 countries
• Major challenges associated with:
  • Communication and collaboration at a distance
  • Distributed computing resources
  • Remote software development and physics analysis
LHC Data reduction and recording: ATLAS, CMS, ALICE, LHCb
• On-line System
  • Multi-level trigger
  • Filter out background
  • Reduce data volume
  • 24 x 7 operation
Trigger chain (figure, CMS example):
• Beam collisions: 40 MHz (1000 TB/sec)
• Level 1, special hardware: 75 kHz (75 GB/sec)
• Level 2, embedded processors: 5 kHz (5 GB/sec)
• Level 3, farm of commodity CPUs: 100 Hz (100 MB/sec)
• Data recording & offline analysis
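The rates quoted for the trigger chain can be cross-checked with a little arithmetic. The sketch below is illustrative only: it assumes decimal units (1 MB = 10^6 bytes), treats each quoted rate as the output of the preceding stage, and uses 10^7 seconds of data taking per year, which is an assumption not stated on the slide. All names are hypothetical.

```python
# (stage label, event rate in Hz, data rate in bytes/sec), as quoted on the slide.
# Decimal units are assumed: 1 TB = 1e12 bytes, 1 GB = 1e9, 1 MB = 1e6.
STAGES = [
    ("collisions", 40e6, 1000e12),
    ("after Level 1", 75e3, 75e9),
    ("after Level 2", 5e3, 5e9),
    ("after Level 3", 100.0, 100e6),
]

# ASSUMPTION (not from the slide): about 1e7 seconds of data taking per year.
SECONDS_OF_RUNNING_PER_YEAR = 1e7


def bytes_per_event(rate_hz, bytes_per_sec):
    """Average data volume per event implied by a rate and a throughput."""
    return bytes_per_sec / rate_hz


def rejection_factors(stages):
    """Ratio of event rates between each pair of consecutive stages."""
    return [(after[0], before[1] / after[1]) for before, after in zip(stages, stages[1:])]


event_sizes = {name: bytes_per_event(rate, flow) for name, rate, flow in STAGES}
factors = rejection_factors(STAGES)
overall_rejection = STAGES[0][1] / STAGES[-1][1]
recorded_bytes_per_year = STAGES[-1][2] * SECONDS_OF_RUNNING_PER_YEAR

for name, size in event_sizes.items():
    print(f"{name:>14}: {size / 1e6:.0f} MB per event")
for name, factor in factors:
    print(f"{name:>14}: rate reduced by a factor of {factor:.0f}")
print(f"overall rejection: {overall_rejection:.0f}")
print(f"recorded per year: {recorded_bytes_per_year / 1e15:.1f} PB")
```

Under these assumptions every stage after Level 1 corresponds to about 1 MB per event, and the recorded stream amounts to roughly a petabyte per year, in line with the "Petabytes per year of data per experiment" scale quoted earlier in the talk.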
Goal for LHC analysis
• Example: ATLAS guiding principles (true for all LHC experiments):
  • Every physicist in ATLAS must have the best possible access to the data necessary for the analysis, irrespective of his/her location.
  • The access to the data should be transparent and efficient.
  • We should profit from resources (money, manpower and hardware) available in the different countries.
  • We should benefit from the outcome of the Grid projects.
Computing for the LHC experiments
A new project has been set up at CERN: the LHC Computing Grid Project (LCG)
The first phase of the project: 2002-2005
• preparing the prototype computing environment, including
  • support for applications: libraries, tools, frameworks, common developments, …..
  • global grid computing service
• Shared funding by Regional Centers, CERN, Contributions
• Grid software developments by national and regional Grid projects
Phase 2: 2005-2007
• construction and operation of the initial LHC Computing Service
CERN will provide the data reconstruction & recording service (Tier 0), but only a small part of the analysis capacity
Distributed Analysis must work
• current planning for capacity at CERN + principal Regional Centers
  • 2002: 650 KSI2000, <1% of the capacity required in 2008
  • 2005: 6,600 KSI2000, <10% of 2008 capacity
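The two planning figures each imply a lower bound on the capacity required in 2008. The following is a minimal sketch of that arithmetic; the function and variable names are hypothetical, and the result is only the bound implied by the quoted percentages, not a figure stated in the talk.

```python
def implied_minimum_2008_capacity(capacity_ksi2000, percent_upper_bound):
    """If a capacity is less than `percent_upper_bound` percent of the 2008
    requirement, the 2008 requirement must exceed the returned value."""
    return capacity_ksi2000 * 100 / percent_upper_bound


# year -> (planned capacity in KSI2000, "less than N percent of 2008" from the slide)
planning = {2002: (650, 1), 2005: (6600, 10)}

bounds = {
    year: implied_minimum_2008_capacity(capacity, percent)
    for year, (capacity, percent) in planning.items()
}
growth_2002_to_2005 = planning[2005][0] / planning[2002][0]

for year, bound in bounds.items():
    print(f"{year}: 2008 requirement exceeds {bound:,.0f} KSI2000")
print(f"planned growth 2002 to 2005: about x{growth_2002_to_2005:.1f}")
```

Both statements are consistent with each other: they point to a 2008 requirement above roughly 65,000 to 66,000 KSI2000.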
Centers taking part in LCG-1
Centers that have declared resources, December 2002
Tier 0
• CERN
Tier 1 Centers
• Brookhaven National Lab
• CNAF Bologna
• Fermilab
• FZK Karlsruhe
• IN2P3 Lyon
• Rutherford Appleton Lab (UK)
• University of Tokyo
• CERN
Other Centers
• Academia Sinica (Taipei)
• Barcelona
• Caltech
• GSI Darmstadt
• Italian Tier 2s (Torino, Milano, Legnaro)
• Manno (Switzerland)
• Moscow State University
• NIKHEF Amsterdam
• Ohio Supercomputing Centre
• Sweden (NorduGrid)
• Tata Institute (India)
• TRIUMF (Canada)
• UCSD
• UK Tier 2s
• University of Florida, Gainesville
• University of Prague
• ……
Target for the end of the decade
LHC data analysis using “global collaborative environments integrating large-scale, globally distributed computational systems and complex data collections linking tens of thousands of computers and hundreds of terabytes of storage”
The researchers concentrating on science, unaware of the details and complexity of the environment they are exploiting
Success will be when the scientist does not mention the Grid
Data Grid for LHC experiments
CERN/Outside Resource Ratio ~1:2
Tier0/(Σ Tier1)/(Σ Tier2) ~1:1:1
[Figure: tiered data grid, from the experiment down to physicists' workstations]
• Experiment: bunch crossing every 25 nsecs; 100 triggers per second; event is ~1 MByte in size
  • link to Online System: ~PBytes/sec
• Online System
  • link to CERN Computer Center: ~100 MBytes/sec
• Tier 0 +1: CERN Computer Center, > 20 TIPS, HPSS
  • link to Tier 1: 2.5 Gbits/sec
• Tier 1: France Center, Italy Center, UK Center, USA Center (each with HPSS)
  • link to Tier 2: 0.15 - 2.5 Gbits/sec
• Tier 2: Tier2 Centers
  • link to Tier 3: ~622 Mbits/sec
• Tier 3: Institutes, ~0.25 TIPS each, with a physics data cache
  • link to Tier 4: 100 - 1000 Mbits/sec
• Tier 4: Workstations, other portals
Physicists work on analysis “channels”. Each institute has ~10 physicists working on one or more channels
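To get a feeling for what the quoted link speeds mean, one can estimate how long a single day of recorded data would take to cross a link of each speed. This is a rough sketch under loudly stated assumptions: decimal units, a fully dedicated link with no protocol overhead, and the whole recorded stream sent over one link. None of these assumptions come from the talk, and all names are hypothetical.

```python
# Values quoted on the slide.
TRIGGERS_PER_SEC = 100
EVENT_SIZE_BYTES = 1e6  # "~1 MByte", decimal units assumed

# Link speeds quoted on the slide, in Gbits/sec.
LINK_SPEEDS_GBPS = {
    "2.5 Gbits/sec": 2.5,
    "622 Mbits/sec": 0.622,
    "0.15 Gbits/sec": 0.15,
}

SECONDS_PER_DAY = 86_400


def recorded_bytes_per_day():
    """Data volume recorded in 24 hours at the quoted trigger rate and event size."""
    return TRIGGERS_PER_SEC * EVENT_SIZE_BYTES * SECONDS_PER_DAY


def transfer_hours(num_bytes, link_gbps):
    """Ideal transfer time: ASSUMES a dedicated link and no protocol overhead."""
    seconds = num_bytes * 8 / (link_gbps * 1e9)
    return seconds / 3600


daily_bytes = recorded_bytes_per_day()
hours_by_link = {
    label: transfer_hours(daily_bytes, gbps) for label, gbps in LINK_SPEEDS_GBPS.items()
}

print(f"one day of recording: {daily_bytes / 1e12:.2f} TB")
for label, hours in hours_by_link.items():
    print(f"{label:>15}: {hours:6.1f} hours")
```

In this idealized picture only the 2.5 Gbits/sec link moves a day of recorded data in less than a day; the slower links would fall behind a continuous full copy.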
Building a Grid for LHC
[Figure: collaborating computer centers]
Building a Grid for LHC
[Figure: collaborating computer centers forming the “virtual” LHC Computing Center, with the Alice VO and CMS VO]
HENP Major Links: Bandwidth Roadmap (Scenario) in Gbps
From: ICFA SCIC, H. Newman, Feb. 2002
The “virtual” LHC Computing Center
• The aim is to build
  • a general computing service
  • for a very large user population
  • of independently-minded scientists
  • using a large number of independently managed sites!
• This is NOT a collection of sites providing pre-defined services
  • it is the user’s job that defines the service
  • it is current research interests that define the workload
  • it is the workload that defines the data distribution
• DEMAND: Unpredictable & Chaotic
• But the SERVICE had better be Available & Reliable
HENP Lambda Grids: Fibers for Physics
• Analysis Problem: Extract “Small” Data Subsets of 1 to 100 Terabytes from 1 to 1000 Petabyte Data Stores
• Survivability of the HENP Global Grid System, with hundreds of such transactions per day (by ~2007), requires that each transaction be completed in a relatively short time.
• Example: Take 800 secs to complete the transaction. Then

Transaction Size (TB)   Net Throughput (Gbps)
1                       10
10                      100
100                     1000 (Capacity of Fiber Today)

• Summary: Providing Switching of 10 Gbps wavelengths within ~3 years, and Terabit Switching within 5-7 years, would enable “Petascale Grids with Terabyte transactions”, as required to fully realize the discovery potential of major HENP programs, as well as other data-intensive fields.
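The throughput figures in the 800-second example follow directly from dividing the transaction size by the transaction time. The sketch below reproduces them; it assumes decimal units (1 TB = 10^12 bytes, 1 Gbps = 10^9 bits/sec) and ignores protocol overhead, and the function name is hypothetical.

```python
TRANSACTION_TIME_SEC = 800  # from the slide's example


def required_throughput_gbps(size_tb, seconds=TRANSACTION_TIME_SEC):
    """Net throughput needed to move `size_tb` terabytes in `seconds`.

    Decimal units are assumed: 1 TB = 1e12 bytes, 1 Gbps = 1e9 bits/sec.
    """
    bits = size_tb * 1e12 * 8
    return bits / seconds / 1e9


transaction_sizes_tb = [1, 10, 100]
throughputs = {size: required_throughput_gbps(size) for size in transaction_sizes_tb}

print("Transaction Size (TB)   Net Throughput (Gbps)")
for size, gbps in throughputs.items():
    print(f"{size:<24}{gbps:.0f}")
```

The same function can be called with a different `seconds` value to see how relaxing the completion time lowers the required bandwidth, for example `required_throughput_gbps(1, seconds=8000)` for a 1 TB transfer allowed to take ten times longer.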
A major concern: the Digital Divide
• In an era of global collaborations, it is a particular concern that Digital Divide problems will delay, and in some cases prevent, physicists in the economically less favored regions of the world from participating effectively as equals in their experiments.
• One can decompose the Digital Divide problems into three components:
  • the Long Range (wide area) Connection,
  • the Last Mile Connection and
  • the Institution Internal Network.
• If any one of these components presents a bandwidth bottleneck, it will not be possible to effectively exploit the new generation of Grid architectures that are being designed and implemented to empower physicists in all world regions to work with the data of experiments like those at the LHC.
Pointed out by: The International Committee for Future Accelerators, Standing Committee on Inter-Regional Connectivity (ICFA SCIC), subcommittee on the Digital Divide
HEP Computing & Networking needs: Summary
• LHC computing is based on emerging models and on experience gained in current HEP experiments
• Experiments need substantial resources to perform computing and analysis (10 - 20% of detector costs)
• Central laboratories are not able to provide all required resources
• It is essential to fund and operate this in a distributed way (using Grid ideas and technology)
• LHC experiments will need a substantial amount of computing and will start the Exabyte Era for event storage.
• To work in this computing world, the GRID is a mandatory tool. As a consequence, so are networks at compatible speeds connecting the institutes.
• Bandwidths that allow working at long distance with distributed data are essential.
• Infrastructure bottlenecks cannot be tolerated, or the whole system will not work well.