Update on EU DataGrid progress and plans for EGEE. Fabrizio Gagliardi EU DataGrid Project Leader Fabrizio.Gagliardi@cern.ch www.edg.org. Overview. Project Outline Atlas task force CMS stress test Tutorials Relationships with other grid projects Future Directions (EGEE) Summary.
Update on EU DataGrid progress and plans for EGEE Fabrizio Gagliardi EU DataGrid Project Leader Fabrizio.Gagliardi@cern.ch www.edg.org
Overview • Project Outline • Atlas task force • CMS stress test • Tutorials • Relationships with other grid projects • Future Directions (EGEE) • Summary EDG/EGEE status
The Project • 9.8 M Euros of EU funding over 3 years • 90% for middleware and applications (HEP, Earth Obs. and Bio Med.) • Three-year phased developments & demos (2001-2003) • Total of 21 partners • Research and academic institutes as well as industrial companies • Related projects and activities: • DataTAG (2002-2003) • CrossGrid (2002-2004) • GRIDSTART (2002-2004) • Grace (2002-2004)
DataGRID project priorities After initial middleware development and testbed deployment, effort has been refocused on quality and stability • Quality Policy Statement published: http://eu-datagrid.web.cern.ch/eu-datagrid/WP12/default.htm • List of priorities defined at the project retreat: http://documents.cern.ch/age?a021130 • Followed up at the project conference: http://www.tomiexpress.hu/datagrid/ • Show-stoppers found by users on the application testbed are the highest priority • Incremental improvements to the current release are driven by the needs of the applications (HEPCAL)
ATLAS-EDG Task Force (by Oxana Smirnova) • ATLAS is eager to use Grid tools for the Data Challenges • ATLAS Data Challenges are already on the Grid (NorduGrid, USA) • DC1/phase2 (to start in October) is expected to use the Grid tools to a greater extent • The ATLAS-EDG Task Force was put together in August with two aims: • To assess the usability of the EDG testbed for immediate production tasks • To introduce Grid awareness to the ATLAS collaboration • The Task Force has representatives from both ATLAS and EDG: 40 members (!) on the mailing list, ca. 10 of them working nearly full-time • The initial task: to process 5 input partitions of the Dataset 2000 at the EDG Testbed + one non-EDG site (Karlsruhe); if this works, continue with other datasets
Achievements (by Oxana Smirnova): • A team of hard-working people across Europe • ATLAS software (release 3.2.1) is packaged into relocatable RPMs, distributed and validated elsewhere • The DC1 production script is “gridified” and a submission script is produced • User-friendly testbed status monitor deployed • 5 Dataset 2000 input files are replicated to 5 sites (2 @ each) • After fixing the “long jobs” problem, 50% of the planned challenge is performed (5 researchers × 10 jobs); unfortunately, only the CERN testbed was fully available • With the rest of the testbed being fixed, jobs are getting scheduled and executed elsewhere • Second test: 4 input files (ca. 400 MB each) replicated to 4 sites; 250 jobs submitted, adjusted to run ca. 4 hours each. The jobs were distributed across the whole testbed by the Resource Broker
Summary (by Oxana Smirnova) • Advantages of the Grid: • Possibility to execute tasks and move files over a distributed computing infrastructure using a single personal certificate (no need to memorize dozens of passwords) • Possibility to distribute the workload adequately and automatically, without logging in explicitly to each remote system • Possibility to do worldwide production in a perfectly coordinated way, using identical software (RPMs), scripts and databases • Where we are now: • Several Grid toolkits are on the market • EDG is probably the most elaborate, but still in development • This development goes much faster with the help of users running real applications • The common efforts of the ATLAS-EDG Task Force proved that it is already possible to execute real tasks on the EDG Testbed • Thanks to all the members for the efforts so far, but there’s more to be done!
CMS/EDG stress test status Andrea Sciabà on behalf of CMS & EDG collaboration CCS general meeting December 3, 2002
Sites and resources
CMSIM events vs. time
Current issues • The biggest problems are related to the Information System: • Symptom: no resources are found. Cause: instability of the MDS when it is overloaded. Solution: submitting jobs at a lower rate improves the chances of success • Symptom: the RB gets stuck (no job ever starts). Cause: under investigation • Symptom: grid elements disappear from the II. Cause: services on some machines stopped working. Solution: restart the services • Symptom: timeouts when copying the input sandbox • Symptom: log file lost (“Stdout does not contain useful data”). Cause: several (no free files/inodes, broken connection between CE & RB, …) • Problems related to the replica manager: • Symptom: file registration in the RC fails from time to time
Current issues • None of these problems is a show-stopper, and they affect only a fraction of the jobs! • Fixes already exist for some of them (but are not yet deployed)
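The mitigation noted for the overloaded MDS, submitting jobs at a lower rate, can be illustrated with a minimal sketch. It assumes a hypothetical `submit` callable that returns True when a job is accepted; this stands in for whatever submission command a site uses and is not an EDG interface. The interval and retry count are illustrative values only.

```python
import time
from typing import Callable, Iterable, List, Tuple


def submit_throttled(
    jobs: Iterable[str],
    submit: Callable[[str], bool],
    min_interval_s: float = 30.0,
    max_attempts: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[List[str], List[str]]:
    """Submit jobs one at a time, pausing after every attempt.

    `submit` is a hypothetical stand-in for the real submission step.
    Returns (accepted, failed) job identifiers.
    """
    accepted: List[str] = []
    failed: List[str] = []
    for job in jobs:
        for attempt in range(1, max_attempts + 1):
            ok = submit(job)
            # Pause after each attempt; the pause grows with the attempt
            # number so that retries put less load on the service.
            sleep(min_interval_s * attempt)
            if ok:
                accepted.append(job)
                break
        else:
            # All attempts were rejected or timed out.
            failed.append(job)
    return accepted, failed
```

Passing `sleep` as a parameter keeps the pacing logic testable without real delays; jobs returned in the failed list can be resubmitted later.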
Conclusions • 50000 events (FZ files) produced in ~2 days! • The CMS Task Force has made impressive progress and the first results are promising. A few issues have been identified and solutions are being worked out or applied • The task force as a whole shows fruitful cooperation between CMS and EDG!
Tutorials The tutorials are aimed at users wishing to "gridify" their applications using EDG software and are organized over 2 full consecutive days. Approx. 100 people have followed the tutorial since August. • DAY1: Introduction to Grid computing and overview of the DataGrid project; Security; Testbed overview; Job Submission; lunch; hands-on exercises: job submission • DAY2: Data Management; LCFG, fabric mgmt & sw distribution & installation; Applications and Use cases; Future Directions; lunch; hands-on exercises: data mgmt • Sessions: October 3 & 4 – CERN; October 31 & Nov 1 – CERN; December 2 & 3 – Edinburgh; December 5 & 6 – Italy; December 9 & 10 – NIKHEF; December 12 – Cracow • More sessions will be organised in the future: http://hep-proj-grid-tutorials.web.cern.ch/hep-proj-grid-tutorials/
Related Grid projects • Through links with sister projects, there is the potential for a truly global scientific applications grid • Diagram labels: GriPhyN, PPDG, iVDGL
Interaction with sister projects No strict boundaries, with a large cross-fertilization of ideas, software and people. DataGRID is learning from the experiences in these projects. • CrossGrid: Using the same security certs; Testbed sites install EDG software; Extending it for the needs of intensive interactive applications; Participating in the EDG testing activities; Representatives in each project's architecture & management groups • DataTAG (EDT): EDT is deploying EDG sw to investigate inter-operability with US projects (iVDGL, GriPhyN, PPDG); Results feed back into EDG software releases (e.g. GLUE compatible information providers/consumers) • NorduGrid: Using the same security certs; Involved in EDG architecture work; Good ideas for gatekeeper and MDS configuration; Helped develop GDMP and GSI extensions for Replica Catalog; Involved in Glue schema work; Security policy; Middleware testing; Working in WP8 (HEP applications) • iVDGL/GriPhyN/PPDG: US members in EDG architecture group; Looking for common packaging and toolkit usage solutions
Plans for the future • Further development in 2003 • Further iterative improvements to middleware driven by LCG and users' needs • More extensive testbeds providing more computing resources • Prepare EDG software for future migration to Open Grid Services Architecture • Interaction with LCG • LCG intends to make use of the DataGRID middleware • LCG is contributing to DataGRID • Testbed support and infrastructure • Get access to more computing resources in HEP computing centres • Testing and verification • Reinforce the testing group and maintain a certification testbed • Fabric management and middleware development • New EU project (EGEE) • Make plans to preserve the current major asset of the project: probably the largest Grid development team in the world • EoI for FP6 (www.cern.ch/egee-ei)
EGEE vision: Enabling Grids for E-science in Europe • Goal • Create a general, production-quality European Grid infrastructure on top of present and future EU RN infrastructure • Build on • The major investment in Grid technology by the EU and EU member states • Several pioneering prototype results • Largest Grid development team in the world • The goal can be achieved for about €100m over 4 years, on top of the national and regional initiatives • Approach • Leverage current and planned national and regional Grid programmes (e.g. LCG) • Work closely with relevant industrial Grid developers, NRNs and the US • Diagram layers: Applications, EGEE, Geant network
Work done so far • EoI for FP6 (www.cern.ch/egee-ei) submitted on June 7th • Several follow-up meetings • An editorial board and an Interim Task Force established to prepare a position paper and a presentation for an EU Grid workshop in Brussels on October 3-4 • Both bodies extended to follow up with the EU (IST02, ER02, individual contacts)
GÉANT and GRIDs: The model • GRIDs use GÉANT infrastructure • GÉANT profits from technological innovation • Diagram labels: Application areas; GRIDs empowered GÉANT; R&D on GRIDs; GRIDs platforms; GÉANT network; International dimension
Instruments (source: K. Baxevanidis, EU) • IST Programme: • Integrated Projects • Networks of Excellence • Specific Targeted Projects • Coordinated actions • Support actions • Structuring the ERA Programme, Research Infrastructures: • Integrated Infrastructure Initiatives • Coordinated actions • Support actions • Budget figures shown on the slide: 665 M Euro; 2.655 M Euro; 3.825 M Euro; GÉANT, GRIDs, other ICT-RI: 100 + 200 M Euro • Separate calls for proposals! • More info on: http://www.cordis.lu/ist/fp6/activities.htm
Communication Network Development Call • 45-47 Million Euros available in the first EU call (Dec 17th, 2002) • It will be hard to get the whole budget: we will need to share with one, two or more projects, and a lot of competition is to be expected (1200 EoIs received in this area!) • Focus on support and integration of already established Grid infrastructures • Build a Grid production layer on top of the EU RN infrastructure • No major funds for H/W, CS research or application development (to a first approximation)
Integrated Infrastructure Initiative (I3) • Three lines of funding supported (with possible budget breakdown): • Networking activities (nothing to do with networks…): • This is the overhead: management, coordination, dissemination and outreach (7-10% of the total funding) • Specific service activities: • Provision and procurement of Grid services (60% of total funding) • Joint research activity • Engineering development to improve the services provided by the Grid infrastructure (20% of total funding) • Application support and focused R&D (10% of total funding)
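As a rough illustration of what the possible budget breakdown above implies, the sketch below splits a total budget across the funding lines. Two assumptions are made: networking is taken at the upper bound of its 7-10% range so that the shares sum to 100%, and the total of 100 M Euro is only an illustrative input. The function and key names are hypothetical.

```python
from typing import Dict

# Shares quoted in the breakdown. Networking is quoted as 7-10%; the upper
# bound is assumed here so that the shares add up to 100%.
I3_SHARES: Dict[str, float] = {
    "networking_activities": 0.10,
    "specific_service_activities": 0.60,
    "joint_research_engineering": 0.20,
    "joint_research_application_support": 0.10,
}


def split_budget(total_meuro: float, shares: Dict[str, float] = I3_SHARES) -> Dict[str, float]:
    """Return the amount (in M Euro) allocated to each funding line."""
    if abs(sum(shares.values()) - 1.0) > 1e-9:
        raise ValueError("shares must sum to 1.0")
    return {line: total_meuro * share for line, share in shares.items()}


if __name__ == "__main__":
    # 100 M Euro is an illustrative total, not a confirmed allocation.
    for line, amount in split_budget(100.0).items():
        print(f"{line}: {amount:.1f} M Euro")
```

With networking at the lower bound of 7%, about 3% of the total would remain unallocated in this breakdown.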
Networking activities • Coordination and management of the participating Grid infrastructures • Management structure to be defined • Dissemination, training and outreach • Leverage EDG and other project tutorials • Proposal from Terena received • User clubs, industry forum etc.
Specific service activities • Integration of major national and international Grid infrastructures • Two-tier structure: • 1st Tier: Major Grid centres (6-8). Must satisfy a minimum level of Grid resources and staffing • 2nd Tier: POPs in all other Geant supported countries • EU resources for doubling the 1st tier centres' Grid support staff, a central operation centre and a distributed call and support centre • Interface to the Geant follow-on project • Mostly staff and overhead (computer fabrics and storage provided by the partners)
Joint research activity • Focus on hardening and re-engineering of Middleware • Leverage current EU Grid projects and international Grid technology developers (large and established M/W development community) • 8-10 WPs with critical mass in a single geographical center, dedicated WP managers hired by the project and reporting to the project technical management (possible international and industrial participation) • Quality assurance group, integration, certification and distribution group with industrial quality • International senior advisory group for project review, long term technology development and direction
Additional activities • Application support: • high level interface and portals • user requirements (à la HEPCAL) • CS focused activity: • Long term CS issues for production quality Grids
Distribution of responsibilities Motivation: provide a transparent, effective process for proposal preparation EGEE Executive Committee: • Responsible for defining Work Packages and setting up Task Forces to deliver technical content for the proposal. Max ~10 persons for an effective process • Should represent stakeholders with major, proven computer and human resources to contribute to EGEE • US has observer status (Ian Foster) EGEE technical advisory board: • Advises the Executive Committee on the overall architecture and specific technical issues • US participation confirmed
Distribution of responsibilities EGEE Editorial board: • Responsible for gathering input from task forces, overall editing of the proposal, filling out administrative forms and maintaining the timeline EGEE National Partners board: • Responsible for coordinating and communicating with interested parties at national/regional level. Ideally one person per country/region • Consulted by the Executive Committee during preparation of the proposal, to ensure adequate transparency; must be seen as impartial EGEE interest group: • All institutes, companies and organisations interested in remaining informed about progress of the EGEE proposal. Includes potential subcontractors for different workpackages
EGEE proposal timeline Tentative Schedule (continued) • EU call out on Dec 17th • Draft 1: overall project structure, end of February 2003 • Draft 2: with detailed workpackages, end of March 2003 • Final proposal including admin and management, end of April 2003 • Submission by May 6th 2003 • First feedback from EU in June-July • Contract negotiation late summer, fall ’03 • Contract signature by the end of ’03 • Start of project Q1-Q2 ’04
Summary • The ATLAS/DataGRID task force has been a successful experience for EDG • The CMS stress test, still ongoing, is a major advance in production-quality performance ahead of the next EU EDG review on February 4-5 • Deployment of a very large production Grid testbed is being explored with the EU (EGEE) • This needs to be done in close collaboration with LCG and the US Grid developers for the maximum benefit of the LCG experiments and potential application to other international scientific communities (also good for the long term future of HEP…)