24th September 2013
Status September 2013
Information System meeting with users
Performed releases
• Update 3 on 09.09.2013: http://gridinfo.web.cern.ch/sys-admins/bdii-releases
• bdii 5.2.22-1
• Fix for hardcoded path affecting ARC
• glite-info-provider-ldap 1.4.6-1
• Rollback to previous version in order to publish GLUE 2 Contact and Location objects
• These were not published after a modification in the LDAP query requested by ARC
• New GOCDB v5 release
• Tested with top BDII: backwards compatible for retrieving the site BDII endpoints
Information System meeting with users - 1st October 2013
Upcoming releases
• Fix in the glite-info-provider-ldap needed
• Top BDIIs may not publish all sites if the host is performing slowly
• Tracked in https://savannah.cern.ch/bugs/?102608
• Documented in http://gridinfo.web.cern.ch/sys-admins/known-issues
• A few top BDIIs seem to be affected
• Looking into GLUE 2 (we could look at GLUE 1 too):
• 328 site BDIIs published by GOCDB
• Average of 30 EGI site BDIIs unresponsive
• Publishing more than 300 sites means top BDII is OK
• 20 out of 80 top BDII endpoints may be affected
• Performance issues
• Currently monitoring performance of top BDII
• Due to LDAP design feature
• Performance issues already showed up in GLUE 1!
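The rule of thumb on this slide (a top BDII publishing more than 300 sites is considered OK) could be sketched as a simple check. Only the threshold of 300 comes from the slide; the function names, host names, and per-endpoint counts below are hypothetical, and real counts would have to come from querying each top BDII.

```python
from __future__ import annotations

# Threshold taken from the slide: more than 300 published sites means OK.
SITES_OK_THRESHOLD = 300


def top_bdii_ok(published_sites: int, threshold: int = SITES_OK_THRESHOLD) -> bool:
    """Return True when a top BDII publishes more than `threshold` sites."""
    return published_sites > threshold


def affected_endpoints(site_counts: dict[str, int]) -> list[str]:
    """Return the endpoints (sorted by name) at or below the threshold."""
    return sorted(host for host, count in site_counts.items()
                  if not top_bdii_ok(count))


# Hypothetical sample data, for illustration only.
sample = {
    "top-bdii-a.example.org": 312,
    "top-bdii-b.example.org": 187,
    "top-bdii-c.example.org": 300,
}
print(affected_endpoints(sample))
```

Note that an endpoint publishing exactly 300 sites is flagged, since the slide says "more than 300".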
BDII deployment status
EGI Technical Forum
• Training on glue-validator
• Recorded and available at: https://documents.egi.eu/public/ShowDocument?docid=1955
• 17 people registered
• A few non-registered participants
• A few connected remotely
• OGF GLUE WG meeting
• Discussion to include ARC changes in LDAP
• Information System workshop
• Unicore and Globus resources now integrated in BDII
BDII and ARC DITs
• Works for ARC but not for BDII (glite-info-provider-ldap 1.4.4-1)
• GLUE2Contact and GLUE2Location missing!
ldapsearch -x -LLL -h site-bdii -p port -b GLUE2DomainID=site-name,o=glue
ldapsearch -x -LLL -h site-bdii -p port -b GLUE2DomainID=site-name,o=glue -s base
• Works for BDII but not for ARC (glite-info-provider-ldap 1.4.6-1)
• Services missing!
ldapsearch -x -LLL -h site-bdii -p port -b GLUE2DomainID=site-name,o=glue
• However, all ARC sites in WLCG seem to have a BDII-like DIT!
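The queries on this slide differ only in search scope: without -s, ldapsearch searches the whole subtree under the base DN, while -s base returns only the domain entry itself. A minimal sketch that assembles both variants is shown here; the helper name, the host, the port value, and the site name are hypothetical placeholders, and the commands are only built, not executed.

```python
import shlex
from typing import List, Optional


def build_glue2_query(host: str, port: int, site_name: str,
                      scope: Optional[str] = None) -> List[str]:
    """Build the ldapsearch argument list used on the slide."""
    cmd = ["ldapsearch", "-x", "-LLL", "-h", host, "-p", str(port),
           "-b", f"GLUE2DomainID={site_name},o=glue"]
    if scope is not None:
        # Restrict the search scope, e.g. "base" for the domain entry only.
        cmd += ["-s", scope]
    return cmd


# Hypothetical host, port, and site name, for illustration only.
subtree = build_glue2_query("site-bdii.example.org", 2170, "EXAMPLE-SITE")
base = build_glue2_query("site-bdii.example.org", 2170, "EXAMPLE-SITE", scope="base")

print(shlex.join(subtree))
print(shlex.join(base))
```

Running both variants against the same site BDII and comparing the returned DNs is one way to see which objects each DIT layout exposes.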
EPEL
• No progress in obtaining packaging status
• Started during holidays
• Postponed due to other priorities
• M. Ellert agreed to release bdii when needed
• EPEL status
• https://twiki.cern.ch/twiki/bin/view/EMI/BDIIEPELstatus
GLUE 2 validation for sites
• Still analysing September results
• Will summarise findings for GDB next week
• Checking that sites can actually fix problems
• Using exclude-known-issues option
• Most errors related to default values being published!
• Estimated Average and Worst waiting times, Max running jobs -> calculated by dynamic scheduler
• Waiting jobs (famous 444444)
• All these attributes rely on batch system configuration!
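Detecting a published default value amounts to comparing an attribute against a known placeholder. The sketch below is not how glue-validator does it; it only illustrates the idea. The value 444444 for waiting jobs comes from the slide, while the attribute name used as the key, the helper name, and the sample entry are assumptions made for this example.

```python
from typing import Dict, List, Mapping

# Only 444444 (waiting jobs) is mentioned on the slide. The attribute name
# is an assumed LDAP rendering, used here for illustration.
KNOWN_DEFAULTS: Dict[str, str] = {
    "GLUE2ComputingShareWaitingJobs": "444444",
}


def find_default_values(entry: Mapping[str, str],
                        known_defaults: Mapping[str, str] = KNOWN_DEFAULTS) -> List[str]:
    """Return the attributes in `entry` that still carry a placeholder value."""
    return sorted(attr for attr, default in known_defaults.items()
                  if entry.get(attr) == default)


# Hypothetical published entry, for illustration only.
entry = {
    "GLUE2ComputingShareWaitingJobs": "444444",
    "GLUE2ComputingShareRunningJobs": "12",
}
print(find_default_values(entry))
```

Further placeholder values (for example for the estimated waiting times) could be added to the mapping once they are confirmed for a given batch system setup.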
GLUE 2 validation for middleware
• Sent a mail to URT to get testing resources to check newer versions
• No answer! IS not a priority for MW developers
• Storage Capacity in GLUE 2
• Discussions need to be restarted
• Is there a need for a usage document for GLUE 2?
Glue-validator in Nagios
• Final version on midmon 01.10.2013
• Validation by COD/ROD team 10.10.2013
• Glue-validator in operations on 01.11.2013
Retirement of GLUE 1
• EGI is preparing the retirement of GLUE 1
• Test GLUE 2 information consumption
• 2014 QR1
• Stop support of GLUE 1 as of May 2014
• If no blocking issues are found
• How will the end of support actually be implemented?
• Modify information providers so they don’t publish GLUE 1?
• This will take time!
• Retiring GLUE 1 will take a long time
• But it won’t be a trustworthy source of information as soon as the support officially ends
FCR
• FCR only in GLUE 1
• Any plans to write it also for GLUE 2?
• CMS queues are removed completely if they are blacklisted
• All the ACBRs are removed
• The object is no longer valid and does not get published
• Do we want to fix this? It’s a known issue
• Entries removed due to FCR are still cached
• Is this OK? Should this be fixed?