LCG Deployment in the UK
John Gordon, GridPP10
You’ve heard about LCG…
• … so what’s happening in the UK?
• LCG deployment, now and future
• The wider UK picture
• … and what’s this EGEE?
• The Plan
A. Management Structure
[Organisation chart, shown in LCG context. Box labels: LCG, ARDA, EGEE, Expmts; CB, PMB, Deployment Board, User Board; Tier1/Tier2, Testbeds, Rollout; Service specification & provision; Requirements; Application Development; User feedback; Metadata, Storage, Workload, Network, Security, Info. Mon.]
Recent LCG
[Map labels: ScotGrid, NorthGrid, SouthGrid, London Grid]
• Tier1 + 10 other sites
• DCs
• Tier2 structure
• Support structure
• GOC Monitoring
• LCG Accounting
GridPP Summary: From Prototype to Production
[Timeline figure with three stages:
• 2001: Separate Experiments, Resources, Multiple Accounts (CERN Computer Centre, RAL Computer Centre, 19 UK Institutes)
• 2004: Prototype Grids (CERN Prototype Tier-0 Centre, UK Prototype Tier-1/A Centre, 4 UK Prototype Tier-2 Centres)
• 2007: 'One' Production Grid (CERN Tier-0 Centre, UK Tier-1/A Centre, 4 UK Tier-2 Centres)
Project and experiment labels: BaBarGrid, BaBar, EGEE, SAMGrid, CDF, D0, ATLAS, EDG, LHCb, ARDA, GANGA, LCG, ALICE, CMS]
Vision
• GridPP2 should deliver a production quality grid
• Meeting the computing needs of UK Particle Physics
• Autonomous and self-supporting with its own identity
• Participating in LCG, EGEE, BaBarGrid, SAMGrid, and any others desired by its members
• Part of an integrated UK Grid
• Independent but integrated, separate but seamless
Delivery Plans
• Keep up with LCG
• Participate in LHC Data Challenges
• TierA for BaBar and BaBarGrid
• Participate in LCG Service Challenges
• Use by other VOs
• Put in place the structure to deliver this
• … and more
Production Team
• Deployment
• User Support
• Middleware Support
• Applications Support
• Network Support
• Security
• Operations
UK Tier-2 Centres
• NorthGrid ****: Daresbury, Lancaster, Liverpool, Manchester, Sheffield
• SouthGrid *: Birmingham, Bristol, Cambridge, Oxford, RAL PPD, Warwick
• ScotGrid *: Durham, Edinburgh, Glasgow
• LondonGrid ***: Brunel, Imperial, QMUL, RHUL, UCL
Current UK Status: 11 sites via LCG
Tier2 Centres
[Map labels: ScotGrid, NorthGrid, SouthGrid, London Grid]
• UK model of distributed Tier2 Centres
• Managerial and organisational ‘centre’
• Tier2 is free to organise internally
  • so I cannot describe yet
• Tier2 is smaller than an EGEE Region
  • but some aspects of the model may be useful (their own VO? own RB?)
• May hide some of the internal structure (CE, GIIS?)
Deployment
• A team to roll out software across the UK
• Software release certification, installation support, site certification
• Specialist support for sysadmins
• Consists of staff from T1 + T2
User Support
• Migrate from mailing list to problem-tracking
• From sysadmin support to user support
• Managed Helpdesk
  • for assignment, tracking, escalation
• We already have a lot of experience
  • we haven’t encapsulated it in FAQs etc.
Middleware, Security and Network Development
[Diagram labels: Security, Middleware, Networking; Network Monitoring, Configuration Management, Grid Data Management, Storage Interfaces, Information Services, Security]
M/S/N builds upon UK strengths as part of international development
Middleware Support
• GridPP2 Middleware development should have an emphasis on delivery and support
• Middleware teams should support their software area
• T2 assigned 5 specialist support posts
• Integrate support effort into Production Team
Applications Support
• Stephen Burke – roaming support
• 2 T1 experiment-facing people
• UK experiments
• Get deployment and middleware support working with experiments
  • to ensure successful UK involvement in experiments’ use of Grid.
Network Support
• Mark Leese (CCLRC-DL)
• Rolled out network monitoring to UK Core e-Science programme
• GridPP2 role in network support
• Network optimisation
• Participation in service challenges
• Hopefully using lightpaths
Security
• New Security Officer (to be appointed)
• Security operations
• Consultants
  • Kelsey – Joint EGEE-LCG Security
  • Jensen – technical advice to CA/middleware
  • McNab – e-Science Security Centre
• Track UK developments (Permis, Shibboleth)
GOC
[Diagram: Secure Database Management via HTTPS / X.509. Labels: GOC, GridSite, MySQL, Monitoring; Resource Centre (RC), Resources & Site Information; EDG, LCG-1, LCG-2, …; bdii, ce, se, rb]
Operations
• LCG Operations centre
• EGEE ROC
• Monitor GridPP (and NGS and GridIreland)
• Developed tools for LCG, reuse for GridPP
• Continue developing for EGEE
• EGEE CIC running grid-wide services
• Accounting
Wider Support
• GSC
• UK helpdesk
• UK e-Science CA
• Training
  • Our own and EGEE (NeSC)
Other UK Grids
• NGS
  • National Grid Service
  • 4 large clusters + 2 UK supercomputers
  • Already using VDT and BDII
• ETF
  • Developing UK OGSA/WSRF Grid
• UK Grid Operations Centre Director
  • Speaking next
• Should all be part of EGEE
EGEE
• UK/I Region in EGEE covers GridPP, NGS, and Grid Ireland – one of 10 regions
• EGEE’s aim is to integrate national grids
  • Not to interfere or impose limits on them
• All of the work I have described, short of actually running the Resource Centres, is EGEE work
• Many sites are actually signed up to EGEE so we can report it formally as such
• Many of you will be asked to report work to EGEE (timesheets, quarterly reports) but this shouldn’t be an imposition
• The development of GridPP will be aligned with EGEE
  • But EGEE is not well defined, so we plan GridPP and participate in the developing EGEE to learn, adopt, and influence.
EGEE Issues
• EGEE = LCG?
  • non-European sites in LCG
  • non-LCG sites in EGEE
• Platform Support
  • non-Linux, free Linux (cf. RHEL)
• Integrated user support
• Support for new VOs
• Security, security, security
The Next Steps
• Just appointed Jeremy Coles
  • as GridPP Production Manager
• Grid Definition
  • define GridPP
  • get buy-in of stakeholders
• Production Team
  • build the team
• Workplan
Production Manager Tasks
• Develop work plan (deliverables/milestones)
• Compile problems and issues list (implement tracking)
• Organise a GridPP deployment group workshop
• Better establish GridPP identity – address UK specific needs
• Review/develop operating procedures to maintain GridPP service
• Get GridPP more involved at UK/experiment software meetings
• Coordinate UK Tier-2 resource input to LCG and EGEE
• Work with other grids to establish a single production grid.
Running a production service: areas to be reviewed and developed
Main areas to be considered (transparency, control, accountability, security, improvement)
• Grid accounting
  • Who needs to know what and in what form? Where are the gaps in LCG accounting?
• Grid monitoring
  • Service-level management tools. Efficiency of resource usage. Replication issues.
  • Detailed metrics to be agreed
  • Real-time notification and problem resolution
• Management & reporting
  • Grid management: VO setup procedures; adding new Tier-2 resources
  • Frequency, structure and content of reports to be agreed (e.g. resource usage, job success rates against targets)
• Security
  • Processes and procedures (e.g. incident handling)
  • Mechanics of trust model defined: identity, privacy, policy and authority (e.g. how are rights revoked? Appeals.)
  • Misuse of resources (intrusion), user & usage audits
• Support
  • Installation (joining) requirements/guidelines
  • Integration & helpdesk requirements
  • Library – deployment documentation. User feedback – mechanism to inform future developments
• Training
  • For new GridPP users and new operations staff
• Middleware release strategy (and stabilisation!)
• Tier-2 management
  • Service levels (SLAs/MoUs to be developed)
  • Resource, quota and priority handling
• Resource
  • Maintenance plans
• Audit
  • Of Grid usage by user/VO
Vision
• GridPP2 should deliver a production quality grid
• Meeting the computing needs of UK Particle Physics
• Autonomous and self-supporting with its own identity
• Participating in LCG, EGEE, BaBarGrid, SAMGrid, and any others desired by its members
• Part of an integrated UK Grid
• Independent but integrated, separate but seamless
Challenge
• LCG has given us a good base
• We now have a critical mass based on LCG2
• Make it a production-quality grid
• Attract the satellite grids: UKQCD, BaBar
• And bring in other experiments
• Participate fully in LCG and EGEE
• Without alienating non-LHC experiments
Can we do it? Yes, we can!