SEE-GRID-2 Infrastructure and Operations Overview
2nd Regional Grid Projects Concertation Workshop, EGEE’07, Budapest, Hungary, 4 October 2007
Antun Balaz, WP3 Leader
Institute of Physics, Belgrade
antun@phy.bg.ac.yu
The SEE-GRID-2 initiative is co-funded by the European Commission under the FP6 Research Infrastructures contract no. 031775

WP3 Objectives
• Develop the next-generation SEE-GRID infrastructure
  • Next generation of EGEE middleware (gLite) and services
• Support in deployment and operations of the Resource Centres
  • Monitoring, helpdesk, overall upgrade of infrastructure
• Network resource provision and assurance
  • In close cooperation with the SEEREN2 project
  • Bandwidth-on-Demand requirements
• CA and RA guidelines and deployment
  • Catch-all Certification Authority (CA)
  • Per-country CA deployment and operations
• User portal deployment and operations
  • P-GRADE

WP3 Deliverables & Milestones
• Deliverables
  • D3.1a - Infrastructure Deployment Plan, M04 (CERN)
  • D3.2 - CA and RA guidelines for new candidates, M05 (GRNET/AUTH)
  • D3.3 - Portal specifications and functionality, M06 (SZTAKI)
  • D3.1b - Infrastructure Deployment Plan, M14 (CERN)
  • D3.4 - Infrastructure overview and assessment, M23 (UOB-IPB)
• Milestones
  • M3.1 - Infrastructure Deployment Plan defined (M04)
  • M3.2 - CA and RA guidelines for new candidates defined (M05)
  • M3.3 - Portal operational across the pilot Grid (M12)

WP3 Activities
• A3.1 - Implementation of the advanced SEE-GRID infrastructure (UOB-IPB/IPP)
  • Supports configuration, deployment and operations of the Resource Centres within the SEE-GRID pilot infrastructure, as well as the transition of mature centres into EGEE.
• A3.2 - Network Resource Provision and BoD requirements (IPP)
  • Supports liaison actions to ensure adequate network provision, including Bandwidth-on-Demand requirements where an application needs them.
• A3.3 - Deploy and operate Grid CAs (GRNET)
  • Provides CA and RA guidelines and helps establish per-country CAs to cover authentication issues.
• A3.4 - Provide a user portal (SZTAKI)
  • Supports the deployment of a user-friendly, multi-grid interoperable portal for convenient Grid access and usage.

WP3 Main Achievements
• Infrastructure maintained and expanded
  • Core services deployed redundantly and maintained with no interruptions in operation
• Operations maintained and improved:
  • BBmSAM deployed and integrated with HGSM
  • Other operational tools developed, deployed and integrated
  • SLA conformance; availabilities
  • Grid-Operator-On-Duty shifts
• Development areas identified and significant progress already achieved: HGSM, BBmSAM, WiatG, application-level accounting, YAIM customizations, JAVA Data Management API, etc.

Network Status
• Majority of SEE-GRID countries covered by GEANT2 and SEEREN2; connectivity problems remain for
  • Albania
  • Moldova
• Liaison with SEEREN2 for effective network and services provision
• Two applications with BoD requirements have been identified:
  • EMMIL (developed by International Business School, Hungary)
  • VIVE (developed by the University of Belgrade, Serbia)
• SALUTE application actively uses FTS
• Several applications use site-level MPI

SEE-GRID Infrastructure (1)

SEE-GRID Infrastructure (2)
• The SEE-GRID infrastructure currently contains the following resources:
  • 32 sites in SEE-GRID production
  • 6 sites in certification phase (2 AL + 1 HR + 2 RO + 1 MD)
  • Over 1100 CPUs available
  • Storage: 18 TB + 27 TB in preparation
• All sites on gLite-3, with 7 sites on gLite-3.1 and the rest on gLite-3.0
  • glite-CE: the final assessment by EGEE and SEE-GRID is that this service is not stable enough, so it has been officially removed from production
  • glite-WMSLB actively used
• Guides provided for deployment of gLite-3.1 WNs on SL4.5, for both 32-bit and 64-bit architectures

SEE-GRID Infrastructure (3)
Figure: SEE-GRID total and free CPUs from October 2006 (from GStat)

SEE-GRID Infrastructure (4)
• SEE-GRID Core services
  • Catch-all Certification Authority
    • Enables regional sites to obtain user and host certificates
  • Virtual Organisation Management Service (VOMS)
    • Authorization system for the SEE-GRID Virtual Organisation (VO), supporting groups and roles
    • Two instances deployed (master and slave) for failover
  • Workload management service (lcg-RB and glite-WMSLB)
    • Several instances deployed for failover
  • Information Services (BDII)
    • Several instances deployed for failover
  • MyProxy is operational
    • Supports certificate renewal
  • FTS deployed
    • Used in production

SEE-GRID Infrastructure (5)
• As sites mature, they migrate to EGEE
  • Croatia, Turkey, Serbia, Romania
• However, this depends on agencies providing funding for the hardware
• Each participating institute has its own strategy

t-infrastructure: sites
• AEGIS02-RCUB: no restriction on the number of jobs (i.e. up to 11 with SEEGRID VO)
• AEGIS04-KG: no restriction on the number of jobs (i.e. up to 8 with SEEGRID VO)
• BG04-ACAD: 8 CPUs; provides glite-CE together with lcg-CE
• GR-01-AUTH: 13 CPUs (shared with other supported VOs)
• MK-02-ETF: 1 CPU
• RO-03-UPB: 10 CPUs (shared between SEEGRID and other supported VOs)
• TR-01-ULAKBIM: 16 CPUs
• TOTAL: ~60 CPUs

t-infrastructure: Core services
• SGDEMO CA
  • http://www.grid.auth.gr/pki/seegrid-demo-ca/ (GR-01-AUTH)
• VOMS
  • https://voms.grid.auth.gr:8443/voms/sgdemo/ (GR-01-AUTH)
  • https://voms.irb.hr:8443/voms/sgdemo/ (HR-01-RBI)
• BDIIs
  • bdii.phy.bg.ac.yu (AEGIS01-PHY-SCL)
  • bdii.ulakbim.gov.tr (TR-01-ULAKBIM)
  • bdii01.grid.auth.gr (GR-01-AUTH)
• RBs
  • rb.phy.bg.ac.yu (AEGIS01-PHY-SCL)
  • rb.ulakbim.gov.tr (TR-01-ULAKBIM)
  • rb01.grid.auth.gr (GR-01-AUTH)
• WMS
  • grid-wms.ii.edu.mk (MK-01-UKIM_II)
  • wms.phy.bg.ac.yu (AEGIS01-PHY-SCL)
  • wms.ulakbim.gov.tr (TR-01-ULAKBIM)
• LFC
  • grid-lfc.ii.edu.mk, primary (MK-01-UKIM_II)
  • grid02.rcub.bg.ac.yu, lfcLocal (AEGIS02-RCUB)
  • lfc.phy.bg.ac.yu, lfcLocal (AEGIS01-PHY-SCL)

SEE-GRID Operations (1)

SEE-GRID Operations (2)
• Distributed Operations - currently one ROC
  • EGI: SEE ROC probably integrated with the SEE-GRID ROC
• Pilot SLA established
• Monitoring and Accounting Tools
• Helpdesk ticket procedures
  • Generic support group for users
    • TPM-like (monitoring open tickets created by users, trying to solve the simple ones, routing the tickets, etc.)
  • Country-level user support groups
• Step towards stand-alone operations
  • Grid-Operator-On-Duty shifts introduced; initial results very positive in terms of improved site availability
• SEEGRID Wiki with detailed information for site admins:
  • http://wiki.egee-see.org/index.php/SEE-GRID_Wiki
• VOMS Role=ops used for SAM job submission

Operational & monitoring tools (1)
[Figure labels: GStat (Taiwan), Helpdesk, RTM (UK), VOMS, MonALISA, BDII, SAM, Nagios, BBmSAM, R-GMA, GridICE, HGSM, Google Maps]

Operational & monitoring tools (2)
• Operational & monitoring tools deployment status
  • Hierarchical Grid Site Management (HGSM) - Turkey
  • Service Availability Monitoring (SAM) (+ porting to MySQL) - Bosnia and Herzegovina with CERN support
  • Helpdesk - Romania
  • BBmSAM - Bosnia and Herzegovina
  • GridICE - FYR of Macedonia
  • SEE-GRID GoogleEarth - Turkey + Gidon Moont (ic.ac.uk)
  • SEE-GRID GoogleMaps - Turkey
  • Global Grid Information Monitoring System (GStat) - Min-Hong Tsai (ASGC, Taiwan)
  • Relational Grid Monitoring Architecture (R-GMA) - Bulgaria
  • Nagios - Bulgaria
  • Real Time Monitor (RTM) - Gidon Moont (ic.ac.uk) and Turkey (HGSM)
  • MONitoring Agents using a Large Integrated Services Architecture (MonALISA) - Romania
  • What is at the Grid (WiatG) - CERN with support from Serbia

Operational & monitoring tools (3)
[Figure labels: R-GMA, BBmSAM, SAM, BDII, HGSM, GStat, RTM, Google Maps, VOMS, Helpdesk]
• Integration status
  • HGSM+SAM, HGSM+BBmSAM
    • Automatic creation of the list of sites to be tested
  • HGSM+BDII
    • Automatic creation of the list of sites in the infrastructure
  • HGSM+GStat
    • Automatic creation of the list of sites to be monitored
  • HGSM+RTM, HGSM+R-GMA
    • Automatic creation of the list of sites for monitoring and accounting
  • VOMS+Helpdesk
    • New user accounts created automatically on first access to the helpdesk
    • Certificate-based access to the helpdesk

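The HGSM-driven list generation above can be pictured with a minimal sketch. It assumes a site inventory exported as simple records with a name and a status; the record layout, the status values and the `site_list` helper are hypothetical and are not HGSM's actual export format or API.

```python
# Hypothetical inventory records; field names and values are invented
# for illustration and do not reflect HGSM's real export format.
SITES = [
    {"name": "SITE-A", "status": "production"},
    {"name": "SITE-B", "status": "certification"},
    {"name": "SITE-C", "status": "production"},
]


def site_list(sites, statuses=("production",)):
    """Return the sorted names of sites whose status is in `statuses`."""
    return sorted(site["name"] for site in sites if site["status"] in statuses)


# One inventory, several derived lists: each tool gets the subset it needs.
to_test = site_list(SITES)
to_monitor = site_list(SITES, statuses=("production", "certification"))

print(to_test)
print(to_monitor)
```

Deriving every tool's list from one inventory means a site added or retired in one place is picked up everywhere, which is the point of the integrations listed on the slide.
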
BBmSAM & BBmobileSAM
• BBmSAM portal
  • Created for SLA monitoring
    • Generates site availability statistics according to several criteria
    • Overview (HTML) and full dump (CSV) of data possible
  • Extended into a full SAM portal
    • Availability for the last 24h period for all sites/services
    • Latest results per service
    • History for nodes/services
• BBmobileSAM
  • Optimized for small-screen devices and low bandwidth
  • Sites can be filtered
  • Three levels of detail available

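As a rough illustration of per-site availability statistics with a CSV dump, the sketch below assumes availability is simply the fraction of test results with an "ok" status. The record layout, status names, sample results and helper functions are all invented; this is not BBmSAM's actual schema or formula.

```python
import csv
import io
from collections import defaultdict

# Hypothetical test results: (site, service, status). Invented for illustration.
RESULTS = [
    ("SITE-A", "CE", "ok"),
    ("SITE-A", "CE", "ok"),
    ("SITE-A", "SE", "error"),
    ("SITE-A", "SE", "ok"),
    ("SITE-B", "CE", "error"),
    ("SITE-B", "CE", "ok"),
]


def availability_by_site(results):
    """Return {site: fraction of that site's results whose status is 'ok'}."""
    passed = defaultdict(int)
    total = defaultdict(int)
    for site, _service, status in results:
        total[site] += 1
        if status == "ok":
            passed[site] += 1
    return {site: passed[site] / total[site] for site in total}


def to_csv(availability):
    """Render per-site availability as CSV text, one row per site."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["site", "availability"])
    for site in sorted(availability):
        writer.writerow([site, f"{availability[site]:.2f}"])
    return buffer.getvalue()


stats = availability_by_site(RESULTS)
print(to_csv(stats))
```

A real SLA calculation would likely weight results by time window and account for scheduled downtime; the sketch leaves both out.
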
WiatG
• Web application for visualization of BDII information
  • http://bdii.phy.bg.ac.yu/WiatG/pl/WiatG.pl
• Used as an operational tool for site monitoring
• Highly responsive tool because it uses AJAX
  • Partial refresh (the client receives the page part by part)
  • Asynchronous (the server processes requests in the background, so one may send several requests)
• Current version looks for: CE, gCE, RB, gRB, SE, LFC, FTS and GridICE
• Documentation available:
  • http://wiki.egee-see.org/index.php/WiatG
SEE-GRID-2 PSC05 meeting, Thessalonica, Greece - September 11-12, 2007

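A BDII publishes Glue-schema records over LDAP, so any tool that visualizes its content has to turn those entries into per-service views. The sketch below groups service endpoints by type from a small LDIF fragment. It is not WiatG's implementation: the sample entries, hostnames and helper functions are invented, and the parser is deliberately simplified. `GlueServiceType` and `GlueServiceEndpoint` are Glue 1.x attribute names; the type values shown are only illustrative.

```python
from collections import defaultdict

# Invented LDIF fragment in the style of Glue 1.x service entries.
SAMPLE_LDIF = """\
dn: GlueServiceUniqueID=lfc.example.org,mds-vo-name=SITE-A,o=grid
GlueServiceType: lcg-file-catalog
GlueServiceEndpoint: lfc.example.org

dn: GlueServiceUniqueID=fts.example.org,mds-vo-name=SITE-A,o=grid
GlueServiceType: org.glite.FileTransfer
GlueServiceEndpoint: https://fts.example.org:8443/fts

dn: GlueServiceUniqueID=lfc2.example.org,mds-vo-name=SITE-B,o=grid
GlueServiceType: lcg-file-catalog
GlueServiceEndpoint: lfc2.example.org
"""


def parse_entries(ldif_text):
    """Parse blank-line-separated LDIF entries into dicts.

    Simplified: ignores line folding, base64 values and repeated
    attributes (a later value overwrites an earlier one).
    """
    entries = []
    for block in ldif_text.strip().split("\n\n"):
        entry = {}
        for line in block.splitlines():
            # Split on the first colon only, so URLs in values stay intact.
            key, _, value = line.partition(":")
            entry[key.strip()] = value.strip()
        entries.append(entry)
    return entries


def endpoints_by_type(entries):
    """Group GlueServiceEndpoint values by GlueServiceType."""
    grouped = defaultdict(list)
    for entry in entries:
        service_type = entry.get("GlueServiceType")
        if service_type:
            grouped[service_type].append(entry.get("GlueServiceEndpoint", ""))
    return dict(grouped)


services = endpoints_by_type(parse_entries(SAMPLE_LDIF))
for service_type, endpoints in sorted(services.items()):
    print(service_type, len(endpoints))
```
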
WiatG Usage
• Several regional projects
  • EUMedGRID (bdii.isabella.grnet.gr)
  • EUChinaGrid (euchina-bdii-1.cnaf.infn.it)
  • EELA (lnx112.eela.if.ufrj.br)
  • BalticGrid (bdii.mif.vu.lt)
  • Int-EU-Grid (i2g-ii01.lip.pt)
  • Health-e-Child (hec-maat-server2.cern.ch): http://hec-maat-server1.cern.ch/WiatG/pl/WiatG.pl
• ROC CERN
  • PROD (lcg-bdii.cern.ch)
  • PPS (pps-bdii.cern.ch)
  • OPS (sam-bdii.cern.ch)

SEE-GRID-2 SLA
• Hardware and connectivity criteria
  • Minimum amount of resources for sites to participate in the infrastructure
  • Network to fulfill operations test requirements
• Level of support
  • Site and security administrators' availability and response time
• Level of expertise
  • Site and security administrators' declaration of expertise
• VO support
  • Site to provide support to SEEGRID VO and its OPS role
• Conformance to Operational Metrics
  • Site availability
  • Downtimes
• SEE-GRID-2 SLA communicated to EGEE

Conformance to SEE-GRID-2 SLA
Improvements seen after three quarters of pilot SLA enforcement

VO Management
• Regional catch-all SEEGRID VO
  • Members from all participating institutes
  • Distributed VO management: all countries have VOMS admin representatives
• National VOs
  • Serbia
  • Romania
  • Turkey
• Regional VO is supported on all sites
• Other regional discipline-oriented VOs will be created soon (SEE-GRID-SCI)
  • Seismology
  • Meteorology
  • Environmental sciences
  • etc.

WP3 Contribution Areas
• HGSM
• Application-level accounting tool
• YAIM customizations
• SAM porting to MySQL (BBmSAM)
• WiatG
• New tool “What should be at the Grid” (WsbatG)
  • Based on the site configuration exported from HGSM, should provide the expected status of BDII
• JAVA Data Management API
• Contributions to standards (e.g. Glue Schema)
  • Mainly providing feedback
• Coordination with other projects missing

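The idea behind WsbatG, deriving the expected BDII content from the HGSM site configuration, lends itself to a comparison of expected against published services. The sketch below is only an illustration of that idea under stated assumptions: the data structures, site names and the `compare` helper are hypothetical, and nothing here reflects how WsbatG is actually built.

```python
# Hypothetical expected services per site, as they might be derived
# from an HGSM configuration export.
EXPECTED = {
    "SITE-A": {"CE", "SE", "LFC"},
    "SITE-B": {"CE", "SE"},
}

# Hypothetical services actually published in the BDII.
PUBLISHED = {
    "SITE-A": {"CE", "SE"},
    "SITE-B": {"CE", "SE", "FTS"},
}


def compare(expected, published):
    """Report, per site, services that are missing or unexpectedly published.

    Sites whose published services match the expectation are omitted.
    """
    report = {}
    for site in sorted(set(expected) | set(published)):
        want = expected.get(site, set())
        have = published.get(site, set())
        missing = sorted(want - have)
        unexpected = sorted(have - want)
        if missing or unexpected:
            report[site] = {"missing": missing, "unexpected": unexpected}
    return report


report = compare(EXPECTED, PUBLISHED)
for site, diff in report.items():
    print(site, diff)
```
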
Joint Development Areas (1)
• Full interoperation with EGEE
• M/W issues - what happens if a user has a problem that cannot be solved inside the project because it is M/W related?
  • Reported and followed up by the SEE-GRID operations team through GGUS and Savannah
  • Participation in operations meetings
  • Common SEE-GRID and EGEE partners crucial
• How can the project send requests for middleware improvements to EGEE and have them taken into account?
  • Participation in EGEE operations meetings (including ROC managers meetings, on behalf of SEE-GRID)
  • Contributions to EGEE working groups and other bodies

Joint Development Areas (2)
• How can we send our own modifications to middleware and have them included in the official gLite release?
  • Much more difficult
  • Some of our YAIM customizations made it
  • voms-renewd for lcg-RB never did
  • Basically, only through common partners
• How can we contribute our operational tools to EGEE-III? How can middleware extensions be passed to EGEE?
  • Common partners will certainly be important
  • SEE-GRID-SCI will develop additional operations- and app-related tools
  • Coordination with other regional Grid projects crucial

CA Status
• CAs accredited in the region in 2007
  • Bulgaria (BG.ACAD CA), accredited on March 5, 2007
  • Serbia (AEGIS CA), accredited on June 1, 2007
  • Romania (ROSA CA), accredited on August 1, 2007
• Earlier accredited CAs
  • Greece (HellasGrid CA)
  • Croatia (SRCE CA)
  • Turkey (TRGRID CA)
• Grid CA candidates
  • Montenegro CA (MREN CA)
    • CP/CPS reviewed by GridAUTH (via see-ca-incubation mailing list) on July 10, 2007
  • F.Y.R.O.M. CA (MARGI CA)
    • Accreditation request on May 4, 2007
    • First CP/CPS not yet available

CA Map
[Map legend: Catch-all CA, Established CA, New CA, Candidate CA, Training CA, RA]