
Site Validation Session Report

Co-Chairs: Piotr Nyczyk (CERN IT/GD), Leigh Grundhoefer (IU / OSG). Notes from Judy Novak. WLCG-OSG-EGEE Workshop, CERN, June 19-20th 2006.


Presentation Transcript


  1. Site Validation Session Report
     Co-Chairs: Piotr Nyczyk, CERN IT/GD; Leigh Grundhoefer, IU / OSG
     Notes from Judy Novak
     WLCG-OSG-EGEE Workshop, CERN, June 19-20th 2006

  2. Service Availability Monitoring (SAM) - “extension” of SFT:
     • generalized framework to monitor all LCG/EGEE services, not only CE: BDII, RB, LFC, FTS, etc.
     • most of the sensors run remotely (from a central machine)
     • no installation needed on service machines
     • moved from MySQL to Oracle, optimized data schema
     Available at: https://lcg-sam.cern.ch:8443/sam/sam.cgi

  3. SAM sensors:
     • currently: BDII (Taiwan), RB (RAL), CE, SRM, LFC, FTS, SE (CERN)
     • release updates + SAM (SFT)
       • certifying current tests with each new release
       • create or update tests as necessary
       • CA cert. releases are special
     • Availability views
       • current, daily, weekly, monthly
       • for CE, SE, SRM, siteBDII
       • displayed with GridView
     http://glite.cvs.cern.ch/cgi-bin/glite.cgi/sft2/tests/
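
The availability views above can be illustrated with a small sketch. It assumes each test result is a (timestamp, passed) pair and defines availability as the fraction of passed tests in a period; this definition, the function names, and the sample data are assumptions for illustration, not the algorithm SAM or GridView actually uses.

```python
from collections import defaultdict
from datetime import datetime


def availability(results):
    """Fraction of results that passed; None when there are no results."""
    if not results:
        return None
    passed = sum(1 for _, ok in results if ok)
    return passed / len(results)


def daily_availability(results):
    """Group (timestamp, passed) pairs by calendar day and rate each day."""
    by_day = defaultdict(list)
    for ts, ok in results:
        by_day[ts.date()].append((ts, ok))
    return {day: availability(items) for day, items in sorted(by_day.items())}


# Hypothetical results for one service at one site (not real SAM data).
sample = [
    (datetime(2006, 6, 19, 0), True),
    (datetime(2006, 6, 19, 6), False),
    (datetime(2006, 6, 19, 12), True),
    (datetime(2006, 6, 19, 18), True),
    (datetime(2006, 6, 20, 0), True),
    (datetime(2006, 6, 20, 6), True),
]

overall = availability(sample)
per_day = daily_availability(sample)
```

Weekly and monthly views would follow the same pattern with a different grouping key.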

  4. OSG Validation services
     • CE/SE Validation aggregation: VORS - site scanner, BDII info
       • http://vors.grid.iu.edu/
     • OSG VO’s VOMS validation
       • http://voms-monitor.grid.iu.edu/
     • GridEX - application validation (pilot job submissions)
       • http://www.cs.wisc.edu/condor/tools/exerciser/
     • Site Policy template and publication
       • http://vors.grid.iu.edu/site_policies.html
     • GIP Validation
       • http://grow.its.uiowa.edu/osg-gip/Production.shtml
     • Monitoring validation: MonALISA Client status (VO Jobs I/O)
       • http://grid02.uits.indiana.edu:8080/stats?page=summary
     • GridCat and the MIS-CI client
       • http://osg-cat.grid.iu.edu/ - Production instance
       • Client software: http://software.grid.iu.edu/pacman/tarballs/misci-0.4.1.tar.gz

  5. Summary
     • It seems to be impossible to avoid cross-monitoring (OSG monitoring doesn't include LCG-specific services, and the other way around)
     • We should synchronize on VO level, but LCG/EGEE is also using regional structuring

  6. OSG and EGEE Validation Interoperability
     • Site discovery - sites are discovered using BDII
     • Ops VO - supported only on OSG sites which are interoperable (fully deployed in July)
     • How can we determine whether an EGEE site is interoperable? Review certain BDII information
     • Cross installation of necessary tools and libraries for site validation
       • LCG tools - added as an optionally installed package for OSG sites
       • OSG environment variables - ? (GIP)
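
One way to picture "review certain BDII information" is a check on whether a site's CE entries admit the Ops VO. The sketch below assumes the entries have already been fetched from a BDII and parsed into dictionaries mapping attribute names to lists of values, and that access is published in the GLUE attribute GlueCEAccessControlBaseRule as "VO:<name>". Which attributes the session actually intended to review is not stated in the notes; the helper name and the sample hosts are hypothetical.

```python
def supports_vo(ce_entries, vo="ops"):
    """True if any CE entry lists the given VO in its access control rules."""
    rule = "VO:" + vo
    return any(
        rule in entry.get("GlueCEAccessControlBaseRule", [])
        for entry in ce_entries
    )


# Hypothetical, already-parsed CE entries for two sites.
site_a = [
    {
        "GlueCEUniqueID": ["ce.site-a.example:2119/jobmanager-pbs-ops"],
        "GlueCEAccessControlBaseRule": ["VO:ops", "VO:cms"],
    },
]
site_b = [
    {
        "GlueCEUniqueID": ["ce.site-b.example:2119/jobmanager-condor-cms"],
        "GlueCEAccessControlBaseRule": ["VO:cms"],
    },
]

a_ok = supports_vo(site_a)
b_ok = supports_vo(site_b)
```

A real check would probably combine several published attributes; this only shows the shape of such a filter.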

  7. OSG and EGEE Validation Interoperability (cont)
     • Use of existing GGUS - OSG GOC ticket exchange for error reporting
     • SAM database to use contact information for OSG GOC
     • Issue of coordinating scheduled downtime
       • OSG GOC will maintain a web page with downtimes
     • Propose review of effort to add OSG-specific validations to SAM framework
     • Testing and iterative development will be accomplished using Pre-Production sites and OSG ITB

  8. DB monitoring in SAM for Tier 1’s (Dirk Duellmann)
     • Jobs connect to the DB either via http (VO lib) or directly via Oracle (instant client)
     • Should be completed by October, when experiments will start using DBs
     • CMS + ALICE don't need them, only 'squid'
     • Existing DB monitoring is too detailed for SAM/SFT, but SAM could provide high-level monitoring of the DB service
     • Some DB services (like LFC) are already tested by SAM, BUT only the functionality is tested, not the DB! The test could be:
       • threshold for connection between T0 -> T1
       • user access (squid)
       • client latency (?)
     • Oracle client will be installed on the Worker Nodes
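
The threshold and latency tests proposed above could take roughly the following shape: time a connection attempt and map the elapsed time to a status. This is a minimal sketch under assumptions; the function name, the status labels, and the threshold values are hypothetical, and the connect callable stands in for whatever http or Oracle client call a real sensor would make.

```python
import time


def probe(connect, warn_after=1.0, fail_after=5.0, clock=time.perf_counter):
    """Time one connection attempt and classify it as ok, warn, or error.

    Returns a (status, elapsed_seconds, detail) tuple. The thresholds are
    illustrative defaults, not values taken from SAM.
    """
    start = clock()
    try:
        connect()
    except Exception as exc:  # any failure to connect counts as an error
        return ("error", None, str(exc))
    elapsed = clock() - start
    if elapsed >= fail_after:
        status = "error"
    elif elapsed >= warn_after:
        status = "warn"
    else:
        status = "ok"
    return (status, elapsed, "")


# A stand-in connection that succeeds immediately.
result = probe(lambda: None)
```

Passing the clock as a parameter keeps the classification logic testable without waiting on a real database.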

  9. Comments/Discussion
