GridPP Overview (emphasis on beyond GridPP). Tony Doyle. Executive Summary II.
GridPP Overview (emphasis on beyond GridPP)
Tony Doyle
Collaboration Meeting
Executive Summary II
• “2004 was a pivotal year, marked by extraordinary and rapid change with respect to Grid deployment, in terms of scale and throughput. The scale of the Grid in the UK is more than 2000 CPUs and 1PB of disk storage (from a total of 9,000 CPUs and over 5PB internationally), providing a significant fraction of the total resources required by 2007. A peak load of almost 6,000 simultaneous jobs in August, with individual Resource Brokers able to handle up to 1,000 simultaneous jobs, gives confidence that the system should be able to scale up to the required 100,000 CPUs by 2007. A careful choice of sites leads to acceptable (>90%) throughput for the experiments, but the inherent complexity of the system is apparent and many operational improvements are required to establish and maintain a production Grid of the required scale. Numerous issues have been identified that are now being addressed as part of GridPP2 planning in order to establish the required resource for particle physics computing in the UK.”
• Most projects fail in going from prototype to production…
• There are many issues: a methodical approach is required.
At the end of GridPP2 Year 1, the initial foundations of “The Production Grid” are built. The focus is on “efficiency”.
Some open questions: What will it take to build upon this foundation? What are the underlying problems?
Science Committee Meeting
Open Questions
• "LCG Service Challenges" (plans for SC4 based on experience of SC3; how do we all prepare?)
• "Running Applications on the Grid" (Why won't my jobs run?)
• "Grid Documentation" (What documentation is needed/missing? Is it a question of organisation?)
• "What value does GridPP add?"
• "Beyond GridPP2 and e-Infrastructure" (What is the current status of planning?)
• "Managing Large Facilities in the LHC era" (What works? What doesn't? What won't?)
• "What is a workable Tier-2 Deployment Model?"
• "What is Middleware Support?" (What is it really all about?)
Aim: to recognise the problems (at all levels), respond accordingly, and define appropriate actions
Beyond GridPP2..
Funding from September 2007 will be incorporated as part of PPARC’s request for planning input for LHC exploitation from the LHC experiments and GridPP. This input will be considered by a Panel consisting of Prof. G. Lafferty (Chair), Prof. S. Watts and Dr. P. Harris, meeting over the summer to provide input to Science Committee in the Autumn.
An important issue to note is the need to ensure matching funding is fully in place for the full term of EGEE-2, anticipated to be 1st April 2006 to 31st March 2008. Such funding for SA1 and JRA1 is currently provided by PPARC through GridPP2, but this will terminate under current arrangements at the end of GridPP2 in August 2007.
LCG Tier-1 Planning (CPU & Storage)
Experiment requests are large, e.g. in 2008: CPU ~50 MSi2k, Storage ~50 PB! They can be met globally except in 2008. UK expected to contribute ~7%. [Currently more]
First LCG Tier-1 Compute Law: CPU:Storage ~ 1 [kSi2k/TB]
Second LCG Tier-1 Storage Law: Disk:Tape ~ 1
(The number to remember is.. 1)
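The two Tier-1 "laws" and the ~7% UK contribution can be turned into a quick back-of-envelope check. The sketch below is illustrative only: the function name `tier1_estimate` is hypothetical, and it assumes that "Storage" means disk plus tape, which the slide does not state.

```python
# Approximate 2008 figures quoted on the slide.
GLOBAL_CPU_KSI2K = 50_000    # ~50 MSi2k
GLOBAL_STORAGE_TB = 50_000   # ~50 PB
UK_SHARE = 0.07              # ~7%


def tier1_estimate(cpu_ksi2k, storage_tb, share, disk_to_tape=1.0):
    """Scale global Tier-1 requests by a share and split storage into disk and tape."""
    cpu = cpu_ksi2k * share
    storage = storage_tb * share
    # Assumption: "Storage" is disk + tape, divided according to the Disk:Tape ratio.
    disk = storage * disk_to_tape / (1.0 + disk_to_tape)
    tape = storage - disk
    return {
        "cpu_ksi2k": cpu,
        "disk_tb": disk,
        "tape_tb": tape,
        "cpu_per_tb": cpu / storage,  # First law: expected to be ~1 kSi2k/TB
    }


uk = tier1_estimate(GLOBAL_CPU_KSI2K, GLOBAL_STORAGE_TB, UK_SHARE)
for name, value in uk.items():
    print(f"{name}: {value:.1f}")
```

Under these assumptions the UK share comes to roughly 3.5 MSi2k of CPU and 3.5 PB of storage, split about evenly between disk and tape.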
LCG Tier-1 Planning (Storage)
LCG Tier-1 Planning
• 2006: March 2005 detailed planning (bottom up) v26b
  [uncertainty on when within 2006 - bid to PPARC]
• PPARC signatures required in Q4 2005
• 2007-10:
  • March 2005 detailed planning (bottom up) v26b [current plan]
  • August 2005 minimal Grid (top down) [input requiring LHC-UK experiments support, further iteration(s)..]
LCG Tier-2 Planning
Third LCG Tier-2 Compute Law: Tier-1:Tier-2 CPU ~ 1
Zeroth LCG Law: There is no Zeroth law – all is uncertain
Fifth LCG Tier-2 Storage Law: CPU:Disk ~ 5 [kSi2k/TB]
• 2006: October 2004 Institute MoU commitments
  [deployment, 2005] requirement currently less than “planned”; reduced CPU and disk currently delivered, need to monitor this..
• PPARC signatures required in Q4 2005
• 2007-10:
  • 2007 MoU, followed by pessimistic guess [current plan]
  • August 2005 minimal Grid (top down) [input requiring LHC-UK experiments support, further iteration(s)..]
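As a rough illustration of the Tier-2 "laws", the sketch below derives a Tier-2 CPU and disk estimate from a Tier-1 CPU figure. The function name `tier2_estimate` is hypothetical, and the input of ~50 MSi2k is the approximate global Tier-1 CPU request for 2008 quoted on the Tier-1 planning slide; the ratios are rules of thumb, not a sizing method.

```python
def tier2_estimate(tier1_cpu_ksi2k, t1_to_t2_cpu=1.0, cpu_per_disk_tb=5.0):
    """Estimate Tier-2 CPU (kSi2k) and disk (TB) from Tier-1 CPU using the ratio laws."""
    # Third law: Tier-1:Tier-2 CPU ~ 1
    tier2_cpu = tier1_cpu_ksi2k / t1_to_t2_cpu
    # Fifth law: CPU:Disk ~ 5 kSi2k/TB
    tier2_disk = tier2_cpu / cpu_per_disk_tb
    return tier2_cpu, tier2_disk


# Assumed input: ~50 MSi2k of global Tier-1 CPU.
cpu, disk = tier2_estimate(50_000)
print(f"Tier-2 CPU ~{cpu:.0f} kSi2k, Tier-2 disk ~{disk:.0f} TB")
```

With these ratios, Tier-2 CPU matches Tier-1 CPU and Tier-2 disk comes to about 10 PB.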
Cascaded Pledges..
• T2 resource LCG pledges depend upon MoU commitments
• Current (Q2) Status:
  • SouthGrid has (already) met its MoU commitment
  • Other T2s have not
• The Q3 status will be reported to PPARC as the year 1 outturn (info. must be correct)
"What value does GridPP add?" Collaboration Meeting
"What happens when GridPP disappears?" Collaboration Meeting
"Beyond GridPP2 and e-Infrastructure" LHC EXPLOITATION PLANNING REVIEW Input is requested from the UK project spokespersons, for ATLAS and CMS for each of the financial years 2008/9 to 2011/12, and for LHCb, ALICE and GridPP for 2007/8 to 2011/12. Physics programme Please give a brief outline of the planned physics programme. Please also indicate how this planned programme could be enhanced with additional resources. In total this should be no more than 3 sides of A4. The aim is to understand the incremental physics return from increasing resources. Input was based upon PPAP roadmap input E-Science and LCG-2 (26 Oct 2004) and feedback from CB (12 Jan & 7 July 2005) 3 page description: “The Grid for LHC Exploitation” submitted in August 2005 Collaboration Meeting
Beyond GridPP2..
• 3 page description: “The Grid for LHC Exploitation”
• “In order to calculate the minimum amount of resource required at the UK Tier-1 and Tier-2 we have taken the total Tier-1 and Tier-2 requirements of the experiments multiplied by a UK ‘share’.”
• Experiments should determine the “incremental physics return from increasing resources”.
UK Support for the LHC Experiments
• The basic functionality of the Tier-1 is:
  • ALICE - Reconstruction, Chaotic Analysis
  • ATLAS - Reconstruction, Scheduled Analysis/strimming, Calibration
  • CMS - Reconstruction
  • LHCb - Reconstruction, Scheduled strimming, Chaotic Analysis
• The basic functionality of the Tier-2s is:
  • ALICE - MC Production, Chaotic Analysis
  • ATLAS - Simulation, Analysis, Calibration
  • CMS - Analysis, All Simulation Production
  • LHCb - MC Production, No analysis
Support for the LHC Experiments in 2008
• UK Tier-1 (~7% of Global Tier-1):
• UK Tier-2 (pre-SRIF3):
Status of current UK planning by experiment
Tier-1 Requirements
• Minimal UK Grid – each experiment may wish to increase its share
(tape omitted for clarity)
Tier-2 Requirements
• Initial requirements can be met via SRIF3 (2007-08..)
• Uncertain beyond this..
Manpower
• Input Requirements for “minimal” Grid
• Supports LHC and other experiments
• Does not include wider E-Infrastructure (EGEE and beyond)
Estimated Costs
• Naïve Full Economic Cost approach: ~£10m p.a.
Cost Breakdown
Total: £9,393k
Deliver a 24/7 Grid service to European science
• build a consistent, robust and secure Grid network that will attract additional computing resources.
• continuously improve and maintain the middleware in order to deliver a reliable service to users.
• attract new users from industry as well as science and ensure they receive the high standard of training and support they need.
100 million euros/4 years, funded by EU
>400 software engineers + service support
70++ European partners
Viewpoint: Enabling Grids for E-science in Europe is “E-Infrastructure”
Phase 2 Overview
EGEE is the Grid Infrastructure Project in Europe
• Take the lead in developing roadmaps, white papers, collaborations
• Organise European flagship events
• Collaborate with other projects (including CPS)
• Start date = April 1 2006
• UK partners
  • CCLRC+NeSC+PPARC (+TCD) (n.b. UK e-Science, not only HEP)
  • NeSC: Training, Dissemination & Applications
  • NeSC: Networking
  • CCLRC: Grid Operations, Support & Management
  • CCLRC: Middleware Engineering (R-GMA)
• UK phase 2 added partners
  • Glasgow, ICSTM, Leeds(?), Manchester, Oxford, (+QMUL)
  • Funded effort dedicated to deploying regional grids (+dissemination)
  • UK T2 coordinators (+newsletter)
Summary
• This meeting aims to address the uncertain areas of developing and maintaining a Production Grid
• Long-term planning (2007-12) is one of these (particularly) uncertain areas
• LCG MoUs will be signed shortly based upon Worldwide planning
• GridPP is providing PPARC with planning input for the LHC Exploitation Grid (+input from ALICE, ATLAS, CMS, LHCb)
• The (full economic) costs involved for even a minimal LHC Computing Grid are significant
• GridPP needs to demonstrate its wider significance (in order to enhance PPARC’s funding at a higher level)
• EGEE 2 starting, but beyond EGEE requires more planning
• Real work required for "Beyond GridPP2 and e-Infrastructure": open for (tomorrow’s) discussion..