Grids for the LHC
Paula Eerola, Lund University, Sweden
Four Seas Conference, Istanbul, 5-10 September 2004
Acknowledgement: much of the material is from Ian Bird, Lepton-Photon Symposium 2003, Fermilab.
Outline
• Introduction: What is a Grid? Grids and high-energy physics?
• Grid projects: EGEE, NorduGrid
• LHC Computing Grid project: using grid technology to access and analyze LHC data
• Outlook
Introduction: What is a Grid?
About the Grid
• WEB: get information from any computer in the world
• GRID: get CPU, disk and tape resources from any computer in the world
• The Grid needs advanced software, middleware, which connects the computers together
• The Grid is the future infrastructure of computing and data management
Short history
• 1996: start of the Globus project for connecting US supercomputers together (funded by the US Defense Advanced Research Projects Agency, DARPA)
• 1998: early Grid testbeds in the USA, with supercomputing centers connected together
• 1998: Ian Foster and Carl Kesselman publish "The Grid: Blueprint for a New Computing Infrastructure"
• 2000 onwards: PC capacity increases and prices drop; supercomputers become obsolete, and the Grid focus moves from supercomputers to PC clusters
• 1990s: the WEB; 2000s: the GRID?
• Huge commercial interest: IBM, HP, Intel, …
Grid prerequisites
• Powerful PCs are cheap
• PC clusters are everywhere
• Networks are improving even faster than CPUs
• Network, storage and computing exponentials:
• CPU performance (number of transistors) doubles every 18 months
• Data storage (bits per area) doubles every 12 months
• Network capacity (bits per second) doubles every 9 months
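To see how sharply these curves diverge, note that with doubling time T months, capacity grows by a factor 2^(t/T) after t months. A back-of-the-envelope illustration over five years, using the doubling times above (this worked example is an addition, not a figure from the talk):

```latex
% Growth factor after t months with doubling time T months: 2^{t/T}.
% Over t = 60 months (five years):
\begin{align*}
\text{CPU } (T = 18):     &\quad 2^{60/18} \approx 10\times  \\
\text{Storage } (T = 12): &\quad 2^{60/12} = 32\times        \\
\text{Network } (T = 9):  &\quad 2^{60/9} \approx 100\times
\end{align*}
```

Over the same five years the network gains roughly an order of magnitude more than processing does, which is why distributing the computing over the network becomes attractive.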
Grids and high-energy physics?
• The Large Hadron Collider, LHC, starts in 2007
• 4 experiments, ATLAS, CMS, ALICE and LHCb, with physicists from all over the world
• LHC computing = data processing, data storage, and production of simulated data
• LHC computing is of unprecedented scale: a massive data flow, with the 4 experiments accumulating 5-8 PetaBytes of data per year
Needed capacity
• Storage: 10 PetaBytes of disk and tape
• Processing: 100,000 of today's fastest PCs
• World-wide data analysis: physicists are located on all continents
• Computing must be distributed for many reasons:
• Not feasible to put all the capacity in one place
• Political, economic, staffing: easier to get funding for resources in the home country
• Faster access to data for all physicists around the world
• Better sharing of the computing resources required by physicists
A back-of-the-envelope check of these numbers is sketched below.
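The scale figures can be sanity-checked with simple arithmetic; a minimal sketch (the 5-8 PB/year figure is from the slide above, the decimal unit convention is an assumption of ours):

```python
# Back-of-the-envelope check of the LHC data volumes quoted above.
# Input (from the slides): 5-8 PB of data per year across the 4 experiments.

SECONDS_PER_YEAR = 365 * 24 * 3600          # ~3.15e7 s

def sustained_rate_mb_per_s(petabytes_per_year: float) -> float:
    """Average data rate needed to absorb a yearly volume, in MB/s."""
    megabytes = petabytes_per_year * 1e9     # 1 PB = 1e9 MB (decimal units)
    return megabytes / SECONDS_PER_YEAR

for pb in (5, 8):
    print(f"{pb} PB/year ~ {sustained_rate_mb_per_s(pb):.0f} MB/s sustained")
# -> 5 PB/year ~ 159 MB/s, 8 PB/year ~ 254 MB/s on average,
#    consistent with the ~100-1500 MBytes/s burst rates shown
#    for the Tier 0 link in the computing hierarchy below.
```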
LHC Computing Hierarchy
[Diagram: the experiments send data at ~100-1500 MBytes/s to Tier 0 at CERN (PBs of disk, tape robot); Tier 0 feeds Tier 1 centers (FNAL, IN2P3, INFN, RAL, ...), which feed Tier 2 centers, institutes with physics data caches, and workstations.]
• Tier 0 = CERN. Tier 0 receives raw data from the experiments, records them on permanent mass storage, and performs first-pass reconstruction of the data, producing summary data.
• Tier 1 Centres = large computer centers (about 10). Tier 1's provide permanent storage and management of raw, summary and other data needed during the analysis process.
• Tier 2 Centres = smaller computer centers (several 10's). Tier 2 centres provide disk storage and concentrate on simulation and end-user analysis.
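The division of labour between the tiers can be captured in a few lines; a purely illustrative sketch (the tier roles follow the slide, the data structure and lookup function are ours):

```python
# Illustrative-only model of the multi-tier hierarchy described above;
# tier names and roles come from the slide, the code structure is ours.

TIERS = {
    "Tier 0": {"sites": ["CERN"],
               "roles": ["record raw data", "first-pass reconstruction"]},
    "Tier 1": {"sites": ["FNAL", "IN2P3", "INFN", "RAL"],  # about 10 in total
               "roles": ["permanent storage", "managed analysis data"]},
    "Tier 2": {"sites": ["(several tens of smaller centers)"],
               "roles": ["simulation", "end-user analysis"]},
}

def tiers_for(task: str) -> list[str]:
    """Return the tiers whose advertised roles mention a given task."""
    return [name for name, tier in TIERS.items()
            if any(task in role for role in tier["roles"])]

print(tiers_for("analysis"))   # -> ['Tier 1', 'Tier 2']
```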
Grid technology as a solution
• Grid technology can provide optimized access to and use of the computing and storage resources
• Several HEP experiments currently running (BaBar, CDF/D0, STAR/PHENIX), with significant data and computing requirements, have already started to deploy grid-based solutions
• Grid technology is not yet an off-the-shelf product: it requires development of middleware, protocols, services, …
• Grid development and engineering projects: EDG, EGEE, NorduGrid, Grid3, …
US, Asia, Australia
USA
• NASA Information Power Grid
• DOE Science Grid
• NSF National Virtual Observatory
• NSF GriPhyN
• DOE Particle Physics Data Grid
• NSF TeraGrid
• DOE ASCI Grid
• DOE Earth Systems Grid
• DARPA CoABS Grid
• NEESGrid
• DOH BIRN
• NSF iVDGL
• …
Asia, Australia
• Australia: ECOGRID, GRIDBUS, …
• Japan: BIOGRID, NAREGI, …
• South Korea: National Grid Basic Plan, Grid Forum Korea, …
• …
Europe
• EGEE
• NorduGrid
• EDG, LCG
• UK GridPP
• INFN Grid, Italy
• Cross-grid projects in order to link together Grid projects
• Many Grid projects have particle physics as the initiator; other fields are joining in: healthcare, bioinformatics, …
• The projects address different aspects of grids: middleware, infrastructure, networking, cross-Atlantic interoperation
EGEE: a seamless international Grid infrastructure to provide researchers in academia and industry with a distributed computing facility.
PARTNERS: 70 partners organized in nine regional federations. Coordinating and Lead Partner: CERN. Central Europe, France, Germany & Switzerland, Italy, Ireland & UK, Northern Europe, South-East Europe, South-West Europe, Russia, USA.
STRATEGY
• Leverage current and planned national and regional Grid programmes
• Build on existing investments in Grid Technology by EU and US
• Exploit the international dimensions of the HEP-LCG programme
• Make the most of planned collaboration with the NSF CyberInfrastructure initiative
ACTIVITY AREAS
SERVICES
• Deliver "production level" grid services (manageable, robust, resilient to failure)
• Ensure security and scalability
MIDDLEWARE
• Professional Grid middleware re-engineering activity in support of the production services
NETWORKING
• Proactively market Grid services to new research communities in academia and industry
• Provide necessary education
EGEE: goals and partners
• Create a European-wide Grid infrastructure for the support of research in all scientific areas, on top of the EU Research Network infrastructure (GEANT)
• Integrate regional grid efforts: 9 regional federations covering 70 partners in 26 countries
http://public.eu-egee.org/
EGEE project
• Project funded by EU FP6, 32 MEuro for 2 years
• Project start 1 April 2004
• Activities:
• Grid Infrastructure: provide a Grid service for science research
• Next generation of Grid middleware: gLite
• Dissemination, Training and Applications (initially HEP & Bio)
EGEE: timeline
Grid in Scandinavia: the NorduGrid Project
Nordic Testbed for Wide Area Computing and Data Handling
www.nordugrid.org
NorduGrid: original objectives and current status
Goals 2001 (project start):
• Introduce the Grid to Scandinavia
• Create a Grid infrastructure in the Nordic countries
• Apply available Grid technologies/middleware
• Operate a functional testbed
• Expose the infrastructure to end-users of different scientific communities
Status 2004:
• The project has grown world-wide: nodes in Germany, Slovenia, Australia, ...
• 39 nodes, 3500 CPUs
• Created its own NorduGrid middleware, ARC (Advanced Resource Connector), which is operating in a stable way
• Applications: massive production of ATLAS simulation and reconstruction
• Other applications: AMANDA simulation, genomics, bioinformatics, visualization (for meteorological data), multimedia applications, ... (a sketch of job submission through ARC follows below)
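As a flavour of what running on this infrastructure looks like, here is a minimal, illustrative sketch of submitting a job through ARC. The xRSL job description language and the ngsub/ngstat/ngget client commands are part of ARC, but the job content below is a made-up "hello world", not an example from the talk:

```python
# Illustrative sketch: submit a toy job through NorduGrid ARC.
# xRSL and the ngsub client are real ARC components; the executable,
# arguments and file names below are hypothetical.
import subprocess

xrsl = ('&(executable="/bin/echo")'
        '(arguments="hello from the Grid")'
        '(stdout="hello.out")'
        '(jobName="arc-demo")')

# ngsub consults the ARC information system and picks a suitable
# cluster; ngstat (not shown) polls the job and ngget fetches hello.out.
subprocess.run(["ngsub", xrsl], check=True)
```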
Current NorduGrid status
The LHC Computing Grid, LCG The distributed computing environment to analyse the LHC data lcg.web.cern.ch
LCG - goals
Goal: prepare and deploy the computing environment that will be used to analyse the LHC data.
Phase 1: 2003 - 2005
• Build a service prototype
• Gain experience in running a production grid service
Phase 2: 2006 - 2008
• Build and commission the initial LHC computing environment
[Timeline 2003-2006, milestones: LCG service opens; event simulation productions; LCG full multi-tier prototype, batch+interactive service; Technical Design Report for Phase 2; LCG with upgraded middleware, management etc.]
LCG composition and tasks
• The LCG Project is a collaboration of: the LHC experiments, the Regional Computing Centres, and physics institutes
• Development and operation of a distributed computing service: computing and storage resources in computing centres, physics institutes and universities around the world; a reliable, coherent environment for the experiments
• Support for applications: provision of common tools, frameworks, environment, data persistency
Resource targets '04
LCG status Sept '04
Tier 0:
• CERN
Tier 1 Centres:
• Brookhaven
• CNAF Bologna
• PIC Barcelona
• Fermilab
• FZK Karlsruhe
• IN2P3 Lyon
• Rutherford (UK)
• Univ. of Tokyo
• CERN
Tier 2 centers:
• South-East Europe: HellasGrid, AUTH, Tel-Aviv, Weizmann
• Budapest
• Prague
• Krakow
• Warsaw
• Moscow region
• Italy
• …
LCG status Sept '04
• First production service for the LHC experiments is operational
• Over 70 centers and over 6000 CPUs, although many of these sites are small and cannot run big simulations
• LCG-2 middleware: testing, certification, packaging, configuration, distribution and site validation
• Grid operations centers at RAL and Taipei (+US): performance monitoring and problem solving, 24x7 globally
• Grid call centers at FZK Karlsruhe and Taipei
• Progress towards inter-operation between LCG, NorduGrid and Grid3 (US) (a sketch of LCG-2 job submission follows below)
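For comparison with the ARC sketch earlier: on LCG-2, jobs are described in JDL and handed to the EDG-derived workload management system. A minimal, illustrative sketch (JDL and the edg-job-* commands come from the EDG middleware that LCG-2 builds on; the job content and file names are made up):

```python
# Illustrative sketch: submit a toy job to LCG-2.
# JDL and edg-job-submit are part of the EDG-based middleware;
# the job below is a hypothetical "hello world".
import subprocess

jdl = """\
Executable    = "/bin/hostname";
StdOutput     = "std.out";
StdError      = "std.err";
OutputSandbox = {"std.out", "std.err"};
"""

with open("hello.jdl", "w") as f:
    f.write(jdl)

# The resource broker matches the job to a site; edg-job-status and
# edg-job-get-output (not shown) track it and retrieve the sandbox.
subprocess.run(["edg-job-submit", "hello.jdl"], check=True)
```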
Outlook: EU vision of e-infrastructure in Europe
Moving towards an e-infrastructure
[Diagram: today's separate Grids and middleware running on top of the GÉANT network and IPv6.]
Moving towards an e-infrastructure
[Diagram: a Grid-empowered e-infrastructure, "all in one": Grids and middleware combined into a single e-Infrastructure.]
Summary
• Huge investment in e-science and Grids in Europe: regional, national, cross-national, EU
• Emerging vision of a European-wide e-science infrastructure for research
• High Energy Physics is a major application that needs this infrastructure today and is pushing the limits of the technology