EGEE Service Activity 1 (SA1)
EGEE is proposed as a project funded by the European Union under contract IST-2003-508833
SA1 Objectives
• Core Infrastructure services:
  • Operate essential grid services
• Grid monitoring and control:
  • Proactively monitor the operational state and performance
  • Initiate corrective action
• Middleware deployment and resource induction:
  • Validate and deploy middleware releases
  • Set up operational procedures for new resources
• Resource provider and user support:
  • Coordinate the resolution of problems from both Resource Centres and users
  • Filter and aggregate problems, providing or obtaining solutions
• Grid management:
  • Coordinate Regional Operations Centres (ROC) and Core Infrastructure Centres (CIC)
  • Manage the relationships with resource providers via service-level agreements
• International collaboration:
  • Drive collaboration with peer organisations in the U.S. and in Asia-Pacific
  • Ensure interoperability of grid infrastructures and services for cross-domain VOs
  • Participate in liaison and standards bodies in the wider grid community
SA1 Partners
48 partners are involved in SA1. ROCs in several regions are distributed across many sites.
• CERN (OMC, CIC)
• UK+Ireland (CIC, ROC)
• France (CIC, ROC)
• Italy (CIC, ROC)
• Germany+Switzerland (ROC)
• Northern Europe (ROC)
• South West Europe (ROC)
• South East Europe (ROC)
• Central Europe (ROC)
• Russia (CIC – M12, ROC)
ROCs must take responsibility:
• Chapter in execution plan
• Organisation within region
• Reporting to overall SA1
Abbreviations:
• OAG: Operations Advisory Group
• OMC: Operations Management Centre
• CIC: Core Infrastructure Centres
• ROC: Regional Operations Centres
Steps to project start-up
• Hiring
• Refine roles and interactions between CIC, ROC, NOC and other operational organisations
• Clarify interactions with JRA1
• Clarify security issues
• Clarify the relationship to international grid projects for VOs that are larger than EGEE (LCG, other HEP, …)
• Understand national infrastructure integration issues
  • What requirements will this place on operations, middleware, processes, security, etc.?
• Set up basic policy frameworks:
  • How VOs access resources, limitations, rules, etc.; sufficient that the infrastructure can be exploited from project start
• Identify sites willing to run the development service (based on EDG testbed sites?)
• Set up the OMC team
• Organise membership of the QA, Security, Requirements, etc. groups
• Define the transition process from LCG operations to EGEE operations
OAG Objectives
• Advise the operations management on operational policy issues
• Negotiate the agreements among the Resource Centres (security policies, access policies, acceptance of CAs, etc.)
OMC Objectives
• Manage the operations of the entire EGEE grid from a single centralised location (CERN)
CIC Objectives
• Provide the basic service infrastructure of the Grid
• Operate the key services which connect users with resources
• Support the Regional Operations Centres
ROC Objectives and Activities
• Customize and certify the middleware releases
• Provide the middleware release and up-to-date documentation to a set of Resource Centers (RC)
• Keep the Grid Resource Centers aligned with the release
• Validate the set of grid services and components that form a functioning Grid to serve user communities
• Validate installation/upgrade procedures and documentation
• Develop and run test suites for components and Grid services to certify a site installation
• Develop expertise in usage and in troubleshooting the problems experienced by resource centers and users
ROC Objectives and Activities (cont.)
• Provide support to resource centers and users; interact with Application/VO-specific support
• Refer and escalate middleware problems to developers
• Collaborate with the CIC to keep monitoring systems and performance evaluators running
• Define the tools to measure the service level provided by resource centers
• Manage access policies and the sharing of resources
• Provide a 24x7 support service
Certification - Release documentation and distribution
• Certification of the release:
  • Certification is a collaborative activity carried out by the ‘ROC Release team’
  • The team stays in contact with the middleware developers
• ROCs customize the release for their region:
  • Add region-specific VOs
  • Provide configuration and automatic installation tools
• A CVS package repository will be used:
  • CERN central package repository
  • ROC-specific repository if needed
• A certification testbed is used to verify installation procedures and to certify the RCs joining the grid
ROC Management Team
• Is responsible for coordinating upgrades and new installations with the Resource Centers
• Maintains a repository of Resource Center configurations
• Collaborates with Resource Centers to install/upgrade the middleware release
• ‘Certifies’ the resources (in collaboration with RC managers) and grid services:
  • Development of certification suites
  • Registration of the resources in the Grid Information Service
• Provides support for the deployment of the release in collaboration with the sites (Resource Centers) and the Release Group
ROC: Support
• User and Service Support:
  • VO Services
  • Grid Services
  • Resource Centers and local resource managers
  • ROC Groups (Release Group and Management team)
• Development of specific procedures to recover from, or proactively avoid, congestion or faulty situations in grid services or sites
• Knowledge Base
• Escalation to middleware developers (need to define who is allowed to submit bugs)
• Need for coordination of support procedures and ticket-database exchange procedures, to provide interoperability between the ROCs’ support systems and give users ‘transparent’ problem resolution
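One way to picture the ticket exchange and escalation described on this slide is a shared ticket record with a simple escalation rule. The sketch below is purely illustrative: the field names, the two support levels, and the `escalate` function are assumptions, not part of any EGEE support tool.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical escalation path: a problem is first handled by ROC support,
# and middleware problems may then be passed on to the developers.
LEVELS = ["ROC support", "middleware developers"]


@dataclass
class Ticket:
    ticket_id: str          # identifier shared between ROC support systems
    origin_roc: str         # ROC whose support system opened the ticket
    summary: str
    level: int = 0          # index into LEVELS
    history: List[str] = field(default_factory=list)


def escalate(ticket: Ticket) -> Ticket:
    """Move the ticket one level up and record the handover.

    A ticket already at the top level is left unchanged.
    """
    if ticket.level < len(LEVELS) - 1:
        ticket.history.append(
            f"{LEVELS[ticket.level]} -> {LEVELS[ticket.level + 1]}"
        )
        ticket.level += 1
    return ticket


ticket = escalate(Ticket("T-1", "Italy", "Resource Broker unreachable"))
print(LEVELS[ticket.level], ticket.history)
```

Keeping the handover history on the ticket itself is one way to let a user follow a problem across support systems without knowing which ROC is handling it.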
ROC: Monitoring (in collaboration with CIC)
Monitoring of:
• Computing and storage resources at Resource Centers
• Grid services (Resource Brokers, RLS, GIS, etc.)
• VO usage/availability of resources
• Jobs at VO level
Monitoring tools already exist (GridICE) but need improvements
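As a rough illustration of what monitoring grid services across Resource Centers could produce, the sketch below aggregates probe results into an availability figure per service type. The probe records, site names, and the `availability_by_service` function are hypothetical; this is not the GridICE data model or API.

```python
from collections import defaultdict

# Hypothetical probe results: (resource center, service type, is_up).
probes = [
    ("RC-A", "Resource Broker", True),
    ("RC-A", "RLS", False),
    ("RC-B", "Resource Broker", True),
    ("RC-B", "GIS", True),
]


def availability_by_service(results):
    """Return the fraction of successful probes for each service type."""
    up = defaultdict(int)
    total = defaultdict(int)
    for _site, service, is_up in results:
        total[service] += 1
        if is_up:
            up[service] += 1
    return {service: up[service] / total[service] for service in total}


summary = availability_by_service(probes)
for service, fraction in summary.items():
    print(f"{service}: {fraction:.0%} available")
```

The same grouping could be done per VO or per Resource Center by changing the key used for the counters.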
Personnel (cont.)
TOTAL: 20 funded + 8.5 unfunded
EU contribution:
• 8.5 FTE for INFN (~20 people funded + 8.5 unfunded)
• 2 FTE for ENEA + Unics + Unile + Unina (each will provide 1 FTE to the ROC)
Capital equipment requests
For the Certification Testbed:
• 5 compute nodes (10 keuro) and 1 TB (3 keuro) for each of the Tier2 sites plus CNAF
• One or more sites (to be defined) will take part in the EGEE Certification Testbed; the others will take part in the Italian Certification Testbed
Service machines:
• 8 machines (10 keuro) for the grid services at each of two sites (main service at CNAF, mirror at Padova/LNL), for a total of 16 machines
Capital equipment requests (cont.)
Overall, the capital equipment requests amount to:
CNAF: 30 keuro
MI: 13 keuro
PD/LNL: 29 keuro
ROMA1: 13 keuro
TO: 13 keuro
--------------------------------------
TOTAL: 98 keuro
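The total can be checked by summing the per-site figures. A minimal sketch, using the values from the slide; the variable names are my own:

```python
# Capital equipment requests per site, in keuro, as listed on the slide.
requests_keuro = {
    "CNAF": 30,
    "MI": 13,
    "PD/LNL": 29,
    "ROMA1": 13,
    "TO": 13,
}

total_keuro = sum(requests_keuro.values())
print(f"Total: {total_keuro} keuro")  # prints "Total: 98 keuro"
```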
Travel requests
• No change from the September 2003 request
• Review and possible adjustment in September