GridPP aims to develop and deploy a large-scale science grid in the UK for use by the worldwide particle physics community. This document outlines the project goals, metrics for success, project elements, and risks/dependencies.
Tony Doyle GridPP Oversight Committee 15 May 2002
Document Mapping
• Outline: Exec Summary; Goals; Metrics for success; Project Elements; Risks/Dependencies (and mechanisms); Summary
• Documents (in slide order): PMB-02-EXEC; PMB-01-VISION; PMB-02-EXEC; Gantt Charts, PMB-05-LCG, TB-01-Q5-Report, TB-02-UKRollout, PMB-06-TierAstatus, PMB-04-Resources; PMB-03-STATUS, PMB-07-INSTRUMENTS; PMB-02-EXEC
Tony Doyle - University of Glasgow
Outline: The Vision Thing… • Grid • Scale • Integration • Dissemination • LHC Analyses • Other Analyses • DataGrid • LCG • Interoperability • Infrastructure • Finances • Summary
GridPP Documents
GridPP Vision
From Web to Grid - Building the next IT Revolution
Premise: The next IT revolution will be the Grid. The Grid is a practical solution to the data-intensive problems that must be overcome if the computing needs of many scientific communities and industry are to be fulfilled over the next decade.
Aim: The GridPP Collaboration aims to develop and deploy the largest-scale science Grid in the UK for use by the worldwide particle physics community.
Many Challenges.. Shared distributed infrastructure for all experiments
GridPP Objectives (…. WHAT WE SAID WE COULD DO IN THE PROPOSAL)
1. SCALE: GridPP will deliver the Grid software (middleware) and hardware infrastructure to enable the testing of a prototype of the Grid for the LHC of significant scale.
2. INTEGRATION: The GridPP project is designed to integrate with the existing Particle Physics programme within the UK, thus enabling early deployment and full testing of Grid technology and efficient use of limited resources.
3. DISSEMINATION: The project will disseminate the GridPP deliverables in the multi-disciplinary e-science environment and will seek to build collaborations with emerging non-PPARC Grid activities both nationally and internationally.
4. UK PHYSICS ANALYSES (LHC): The main aim is to provide a computing environment for the UK Particle Physics Community capable of meeting the challenges posed by the unprecedented data requirements of the LHC experiments.
5. UK PHYSICS ANALYSES (OTHER): The process of creating and testing the computing environment for the LHC will naturally provide for the needs of the current generation of highly data-intensive Particle Physics experiments: these will provide a live test environment for GridPP research and development.
6. DATAGRID: Grid technology is the framework used to develop this capability: key components will be developed as part of the EU DataGrid project and elsewhere.
7. LCG: The collaboration builds on the strong computing traditions of the UK at CERN. The CERN working groups will make a major contribution to the LCG research and development programme.
8. INTEROPERABILITY: The proposal is also integrated with developments from elsewhere in order to ensure the development of a common set of principles, protocols and standards that can support a wide range of applications.
9. INFRASTRUCTURE: Provision is made for facilities at CERN (Tier-0), RAL (Tier-1) and use of up to four Regional Centres (Tier-2).
10. OTHER FUNDING: These centres will provide a focus for dissemination to the academic and commercial sector and are expected to attract funds from elsewhere such that the full programme can be realised.
Grid – A Single Resource
GRID: A unified approach
• Many millions of events
• Various conditions
• Many samples
• Petabytes of data storage
• Distributed resources
• Many 1000s of computers required
• Worldwide collaboration
• Heterogeneous operating systems
OGSA Grid - What’s been happening?
• A lot…
• GGF4, OGSA and support of IBM (and others)
  [as opposed to .NET development framework and passports to access services]
• Timescale? September 2002
• W3C architecture for web services
• Chose (gzipped) XML as opposed to other solutions for metadata descriptions… and web-based interfaces
• Linux
  [as opposed to other platforms… lindows??]
• C++ (experiments) and C, Java (middleware) APIs
  [mono - Open Source implementation of the .NET Development Framework??]
GridPP Context
• Provide architecture and middleware
• Future LHC Experiments
• Running US Experiments
• Build Tier-A/prototype Tier-1 and Tier-2 centres in the UK and join worldwide effort to develop middleware for the experiments
• Use the Grid with simulated data
• Use the Grid with real data
EDG TestBed 1 Status
Web interface showing status of (~400) servers at testbed 1 sites
GRID: extend to all expts
LHC computing at a glance (1. scale)
• The investment in LHC computing will be massive
  • LHC Review estimated 240 MCHF (before LHC delay)
  • 80 MCHF/y afterwards
• These facilities will be distributed
  • Political as well as sociological and practical reasons
Europe: 267 institutes, 4603 users
Elsewhere: 208 institutes, 1632 users
RTAG Status (7. LCG)
• 6 RTAGs created to date:
  • RTAG1 (Persistency Framework; status: completed)
  • RTAG2 (Managing LCG Software; status: running)
  • RTAG3 (Math Library Review; status: running)
  • RTAG4 (GRID Use Cases; status: starting)
  • RTAG5 (Mass Storage; status: running)
  • RTAG6 (Regional Centres; status: starting)
• Two more in advanced state of preparation:
  • Simulation components
  • Data Definition Tools
Fabrics & Grid Deployment (7. LCG)
• LCG Level 1 Milestone: deploy a Global Grid Service within 1 year
  • sustained 24 x 7 service
  • including sites from three continents
  • identical or compatible Grid middleware and infrastructure
  • several times the capacity of the CERN facility
  • and as easy to use
• Ongoing work at CERN to increase automation and streamline configuration, especially for migration to RedHat 7.2.
• Aim to phase out old CERN solutions by mid-2003.
LCG Timeline (1. timescale)
[Timeline chart by quarter, Q1 2002 to Q4 2005. Milestones shown:]
• Prototype of Hybrid Event Store (Persistency Framework)
• Hybrid Event Store available for general users
• applications: Distributed production using grid services
• Full Persistency Framework
• Distributed end-user interactive analysis
• LHC Global Grid TDR
• grid: “50% prototype” (LCG-3) available
• LCG-1 reliability and performance targets
• First Global Grid Service (LCG-1) available
LCG Development – Long Term Attachment at CERN
Be a part of this?
This will enable Grid developments in the UK to be (more) fully integrated with long-term Grid development plans at CERN. The proposed mechanism is:
1. Submit a short one-page outline of current and proposed work, noting how this work can best be developed within a named team at CERN, by e-mail to the GridPP Project Leader (Tony Doyle) and GridPP CERN Liaison (Tony Cass).
2. This case will be discussed at the following weekly GridPP PMB meeting and outcomes will be communicated as soon as possible by e-mail following that meeting.
Notes
1. The minimum period for LTA is 3 months. It is expected that a work programme will typically be for 6 months (or more).
2. Prior Grid work on DataGrid and LHC (or other) experiments is normally expected.
3. It is worthwhile reading http://cern.ch/lcg/peb/applications in order to get an idea of the areas covered, and the emphasis placed, by the LCG project on specific areas (building upon DataGrid and LHC experiments' developments).
4. Please send all enquiries and proposals to: Tony Doyle <a.doyle@physics.gla.ac.uk> and Tony Cass <tnt@mail.cern.ch>
Summary of LCG (7. LCG)
• Project got under way early this year
• Launch workshop and early RTAGs give good input for high-level planning …
  • … to be presented to LHCC in July
• New plan takes account of first beam in 2007
• No serious problems foreseen in synchronising LCG plans with those of the experiments
• Collaboration with the many Grid projects needs more work
• Technical collaboration with the Regional Centres has to be established
• Recruitment of special staff going well (but need to keep the recruitment momentum going)
• Serious problem with materials funding
Building upon Success (6. DataGrid)
• The most important criterion for establishing the status of this project was the European Commission review on March 1st 2002.
• The review report of project IST-2000-25182 DATAGRID is available from PPARC.
• The covering letter states “As a general conclusion, the reviewers found that the overall performance of the project is good and in some areas beyond expectations.”
• The reviewers state “The deliverables due for the first review were in general of excellent quality, and all of them were available on time… All deliverables are approved. The project is doing well, exceeding expectations in some areas, and coping successfully with the challenges due to its size.”
6. DataGrid
WP1 – Workload Management (Job Submission) (6. DataGrid)
1. Authentication: grid-proxy-init
2. Job submission to DataGrid: dg-job-submit
3. Monitoring and control: dg-job-status, dg-job-cancel, dg-job-get-output
4. Data publication and replication (WP2): globus-url-copy, GDMP
5. Resource scheduling – use of CERN MSS: JDL, sandboxes, storage elements
Important to implement this for all experiments…
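The first three steps above can be sketched as an ordered command sequence. This is only a minimal sketch in Python: the command names come from the slide, but the argument shapes (a JDL file path for dg-job-submit, a job identifier for the status and output commands), the JDL attribute names, and the helper names render_jdl and workflow_commands are assumptions made for illustration. Nothing is executed; the commands are only listed.

```python
from typing import List


def render_jdl(executable: str, stdout: str = "job.out", stderr: str = "job.err") -> str:
    """Render a minimal job description.

    Assumption: attribute names (Executable, StdOutput, StdError,
    OutputSandbox) follow the ClassAd-style JDL; check the middleware
    documentation before relying on them.
    """
    sandbox = ", ".join(f'"{name}"' for name in (stdout, stderr))
    return (
        f'Executable = "{executable}";\n'
        f'StdOutput = "{stdout}";\n'
        f'StdError = "{stderr}";\n'
        f"OutputSandbox = {{{sandbox}}};\n"
    )


def workflow_commands(jdl_path: str, job_id: str) -> List[List[str]]:
    """Return the slide's steps 1 to 3 as argument vectors, in order.

    Assumption: each command takes the single positional argument shown;
    no flags are included because none appear on the slide.
    """
    return [
        ["grid-proxy-init"],              # 1. authentication
        ["dg-job-submit", jdl_path],      # 2. job submission
        ["dg-job-status", job_id],        # 3. monitoring
        ["dg-job-get-output", job_id],    # 3. retrieve the output sandbox
    ]


if __name__ == "__main__":
    print(render_jdl("/bin/hostname"))
    for argv in workflow_commands("job.jdl", "<job-id>"):
        print(" ".join(argv))
```

Keeping the commands as data rather than running them makes the sequence easy to review per experiment before it is wired to a real testbed.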
WP2 - Spitfire (6. DataGrid)
WP3 - R-GMA (6. DataGrid)
[Architecture diagram. Components shown: Application Code; Sensor Code; Producer API and ProducerServlet; DBProducer and DBProducer Servlet; Consumer API and Consumer Servlet; Archiver API and Archiver Servlet; Registry API and Registry Servlet; Schema API and Schema Servlet; “Event Dictionary”.]
Annotations: User code here. User code monitors output here. Builds on R-GMA Database Structures.
WP4 - LCFG (6. DataGrid)
WP5 – Storage Element (6. DataGrid)
• A consistent interface to MSS.
• MSS: Castor, HPSS, RAID arrays, SRM, DMF, Enstore
• Interfaces: GridFTP, GridRFIO, /grid, OGSA
[Data Flow Diagram for SE. Labels shown: Interface Layer; The Core and the Bottom Layer; Interface Queue Manager; Network; Request Manager; Pipe Manager; Handler; Disk; MSM; Pipe Store; Named Pipe; Tape; steps numbered 1 to 8.]
WP6 - TestBed 1 Status (6. DataGrid)
Web interface showing status of (~400) servers at testbed 1 sites
GRID: extend to all expts
WP7 – Network Monitoring (6. DataGrid)
WP7 - EDG Authorisation: grid-mapfile generation (6. DataGrid)
[Diagram. Directory entries shown: o=xyz,dc=eu-datagrid,dc=org and o=testbed,dc=eu-datagrid,dc=org, with ou=People, ou=Testbed1 and ou=??? beneath them; labelled VO Directory and “Authorization Directory”. People entries shown: CN=John Smith, CN=Mario Rossi, CN=Franz Elmer, each with an Authentication Certificate. mkgridmap takes these entries, a ban list and local users, and produces the grid-mapfile.]
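The generation step in the diagram can be illustrated with a small sketch. It is written in Python under stated assumptions and is not the mkgridmap implementation: the function name make_gridmap, the example certificate subjects, and the local account names are hypothetical, and the VO membership is passed in as plain dictionaries rather than read from a directory server. The output follows the usual grid-mapfile convention of a quoted certificate subject followed by a local account name.

```python
from typing import Dict, Iterable, List


def make_gridmap(
    vo_members: Dict[str, Iterable[str]],
    local_accounts: Dict[str, str],
    banned: Iterable[str],
) -> List[str]:
    """Build grid-mapfile lines from VO membership lists.

    vo_members     maps a VO name to the certificate subjects of its members
                   (hypothetical stand-in for a VO directory lookup).
    local_accounts maps a VO name to the local user its members run as.
    banned         lists certificate subjects that must never be mapped.
    """
    banned_set = set(banned)
    seen = set()
    lines: List[str] = []
    for vo, subjects in vo_members.items():
        account = local_accounts.get(vo)
        if account is None:
            continue  # no local user configured for this VO
        for subject in subjects:
            if subject in banned_set or subject in seen:
                continue  # banned, or already mapped via an earlier VO
            seen.add(subject)
            lines.append(f'"{subject}" {account}')
    return lines


if __name__ == "__main__":
    members = {
        "xyz": ["/O=Example/CN=John Smith", "/O=Example/CN=Mario Rossi"],
        "testbed": ["/O=Example/CN=Franz Elmer", "/O=Example/CN=John Smith"],
    }
    accounts = {"xyz": "xyzuser", "testbed": "tbuser"}
    for line in make_gridmap(members, accounts, ["/O=Example/CN=Mario Rossi"]):
        print(line)
```

In this sketch a subject that appears in more than one VO keeps its first mapping; a real site would choose its own precedence rule.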
WP8 - Applications (6. DataGrid)
• 1. Realistic Large-Scale Tests
  • Reliability! Need reliable dg-job-* command suite
• 2. Data management
  • Reliability! Need reliable gdmp-* command suite, file-transfer commands
• 3. Mass Storage Support
  • Working access to MSS (CASTOR and HPSS at CERN, Lyon)
• 4. Lightweight User Interface
  • Put on a laptop or std. desktop machine
• 5. Portability
  • Demonstrable portability of middleware: a) use other resources, b) debugging
• 6. Scratch Space
  • Job requests X amount of scratch space to be available during execution, system tells job where it is
• 7. Output File Support
  • JDL support for output files: specify where output should go in JDL, not in job script
Expt. Feedback 4. and 5. Expts
5. Other Expts 8. Interoperability = Minimal e-Bureaucracy
GRID JOB SUBMISSION – External User Experience 5. Other Expts
Things Missing, apparently… 5. Other Expts
Expt. Feedback 4. and 5. Expts
GridPP Poster 3. Dissemination
Tier 1/A EDG Poster 3. Dissemination
BaBar Poster 3. Dissemination
LHCb Poster 3. Dissemination
ScotGRID Poster 3. Dissemination
Identifiable Progress... 3. Dissemination t0 t1
WebLog Allows every area/sub group to have its own 'news' pages
GridPP & Core e-Science Centres (3. Dissemination)
Written formally to all e-Science centres inviting contact and collaboration with GridPP.
• NeSC
  • Close ties, hosted 2nd GridPP Collaboration Meeting, Collaboration on EDIKT Project? Training...
• Belfast
  • Replied but not yet up and running.
• Cambridge
  • Close ties, hosted 3rd GridPP Collaboration Meeting. Share one post with GridPP. Will collaborate on ATLAS Data Challenges.
• Cardiff
  • Replied - contacts through QM (Vista) and Brunel GridPP Group.
GridPP & Core e-Science Centres (3. Dissemination)
• London
  • No formal reply but close contacts through IC HEP Group. IC will host 5th GridPP Collaboration Meeting.
• Manchester
  • No collaborative projects so far. Manchester HEP Group will host 4th GridPP Collaboration Meeting.
• Newcastle
  • In contact - Database projects?
• Oxford
  • Close ties, collaboration between Oxford HEP Group and GridPP on establishment of central Tier-2 centre? CS/Core-GridPP-EDG links? Probably host 6th GridPP Collaboration Meeting.
• Southampton
  • Replied but no collaboration as yet.
GLUE (8. Interoperability)
• How do we integrate with developments from elsewhere in order to ensure the development of a common set of principles, protocols and standards that can support a wide range of applications?
• GGF…
• Within the Particle Physics community, these ideas are currently encapsulated in the Grid Laboratory Uniform Environment (GLUE).
• Recommend this as a starting point for the wider deployment of Grids across the Atlantic. See http://www.hicb.org/glue/GLUE-v0.1.doc (Ruth Pordes et al.)
8. Interoperability
UK Tier-A/prototype Tier-1 Centre (9. Infrastructure)
• Roles
  • Tier-A Centre for BaBar
  • EDG testbed(s)
  • LCG prototype Tier-1 Centre
  • prototype Tier-1 for LHC experiments (Data Challenges independent of LCG development…)
  • Interworking with other UK resources (JIF, JREI, eSC) = UK portal
  • existing LEP, DESY and non-accelerator experiments
• Purchases
  • First year = Hardware Advisory Group (HAG1)
  • Determine balance between CPU, disk, and tape
  • Experts on specific technologies
  • Propose more HAGs (2 and 3)..
• Needs to be successful in all roles...
Rollout of the UK Grid for PP (9. Infrastructure)
• Operational stability of GridPP middleware = Testbed team
• The “gang of four” … Andrew McNab, Steve Traylen, Dave Colling (other half) and Owen Moroney
• Ensures the release of “Testbed” quality EDG software
  • documentation
  • lead for other system managers in terms of implementation
  • pre-defined software cycle releases (2 months..)
• Subject of the Rollout Plan… “Planning for EDG Testbed software deployment and support at participating UK sites” (Pete Clarke, John Gordon)
• LCG is the proposed mechanism by which the EDG testbed at CERN becomes an LCG Grid Service. The evolution of the EDG testbed to the LCG Grid Service will take account of both EDG and US grid technology. Need to take account of this..
Longer Term.. (9. Infrastructure)
• LCG Grid Service
  • Takes account of EDG and US grid technology
• A large-scale Grid resource, consistent with the LCG timeline, within the UK.
• Scale in UK? 0.5 Pbytes and 2,000 distributed CPUs = GridPP in Sept 2004
  • “50% prototype”
£17m 3-Year Project (10. Finances)
Dave Britton
• Five components
  • Tier-1/A = Hardware + ITD Support Staff
  • DataGrid = DataGrid Posts + PPD Staff
  • Applications = Experiments Posts
  • Operations = Travel + Management + e Early Investment
  • CERN = LCG posts + Tier-0 + e LTA
1. Recruitment
• EDG Funded Posts (Middleware/Testbed)
  • All 5 in post + 1 additional
• EDG Unfunded Posts (Middleware/Testbed)
  • 15 out of 15 in post
• GridPP Posts (Applications + Tier1/A)
  • Allocated Dec 2001
  • 13 out of 15 in post
• CERN Posts
  • First Round = 105 Applicants, 12 Offers, 9 Accepted
    • 4 in Applications, 2 Data Management, 3 Systems
  • Second Round = 140 Applicants, 9 Offers
  • Third Round ~ 70 Applicants
  • Aim ~ 28 posts
2. Monitoring Staff Effort [SM] Robin Middleton
3. Progress towards deliverables.. Pete Clarke