Status, open issues Ilija Vukotic
Content • SSB test changes • Remaining site availability issues • To do list
SSB test changes • Move to Rucio • Went according to plan • All endpoints without a Rucio-enabled N2N turned red • The situation is now much more stable, as shown by how rarely “instant” mails reach the cloud support lists. Side note: the “Delays” and “Monitoring” tests are now split off into a new “FAX specialists” view.
SSB test changes • “locate -r” issue • Sometimes the “direct” test failed, but only for the file being tested; all other files kept working. • Andy found the reason: another “delay” test was doing a “locate -r”, which looks the file up after dropping it from the cmsd cache, and that triggers a bug. • Andy will fix the bug; in the meantime SSB uses different files for these two tests. • X509 check • Previously the test used my certificate without the ATLAS VO • That was not a good test, as both DPM and dCache check the DN rather than the VO. • Thanks to Wei, I have a new SLAC VO cert. • Now all working endpoints require an ATLAS cert to read the data.
Deployment • Upgrades are coming (slowly) and are often partial • I’ll write code to check for this • Matevz will implement sending this information to the collector (two weeks)
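A check for partial upgrades could look roughly like the sketch below. It is only an illustration under assumptions: the component names, the version labels, and the shape of the inventory are hypothetical placeholders, not the format the collector will actually receive.

```python
def find_partial_upgrades(reported, target):
    """Map endpoint -> sorted list of components not at the target version.

    Endpoints where every component matches the target, or where none does
    (not upgraded at all), are omitted: only mixed states count as partial.
    """
    partial = {}
    for endpoint, versions in reported.items():
        stale = sorted(c for c, v in versions.items() if v != target.get(c))
        if stale and len(stale) < len(versions):
            partial[endpoint] = stale
    return partial


# Hypothetical inventory: names and version labels are placeholders.
target = {"server": "new", "n2n_plugin": "new", "monitoring": "new"}
reported = {
    "site-a": {"server": "new", "n2n_plugin": "new", "monitoring": "new"},
    "site-b": {"server": "new", "n2n_plugin": "old", "monitoring": "new"},
    "site-c": {"server": "old", "n2n_plugin": "old", "monitoring": "old"},
}
partial = find_partial_upgrades(reported, target)
print(partial)
```

Here only `site-b` is reported, because it mixes upgraded and non-upgraded components; a fully old site is a different problem from a partially upgraded one and is left out on purpose.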
Operations • Should we de-activate part of the FR cloud? Currently no mainland France endpoint works. • LFC-free N2N for dCache exists and has been deployed. • LFC-free N2N for other sites is in testing. • In the process of adding the Beijing endpoint • The only remaining issue is opening the ports. • All of the test datasets are in place (both SSB and FDR) • Looking for a solution to a problem seen at three IT endpoints and one UK endpoint: Xrd: CheckErrorStatus: Server [t2-dpm-01.na.infn.it:1094] declared: Secgsi: ErrParseBuffer: unknown CA: cannot verify client credentials: kXGC_certreq(error code: 3010)
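To see which endpoints report this failure, the error line quoted above can be parsed mechanically. The following is a minimal sketch, assuming the message always has the shape shown on the slide; the regular expression and the field names (`host`, `port`, `reason`, `code`, `unknown_ca`) are choices made for this example, not part of any xrootd tool.

```python
import re

# Pattern for the server-side error line quoted above. The group names are
# labels chosen for this sketch.
ERROR_RE = re.compile(
    r"Server \[(?P<host>[^\]:]+):(?P<port>\d+)\] declared: "
    r"(?P<reason>.*?)\s*\(error code: (?P<code>\d+)\)"
)


def parse_xrd_error(line):
    """Return host, port, reason and error code, or None if no match."""
    m = ERROR_RE.search(line)
    if m is None:
        return None
    return {
        "host": m.group("host"),
        "port": int(m.group("port")),
        "reason": m.group("reason"),
        "code": int(m.group("code")),
        # True when the server could not verify the client's CA.
        "unknown_ca": "unknown CA" in m.group("reason"),
    }


sample = (
    "Xrd: CheckErrorStatus: Server [t2-dpm-01.na.infn.it:1094] declared: "
    "Secgsi: ErrParseBuffer: unknown CA: cannot verify client credentials: "
    "kXGC_certreq(error code: 3010)"
)
info = parse_xrd_error(sample)
print(info["host"], info["port"], info["code"], info["unknown_ca"])
```

Grouping the parsed results by `host` would show whether the “unknown CA” failure is limited to the four affected endpoints.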
Smaller issues • localSetupFAX • Should we move to xrootd client 3.3.6? • For the second time we had a temporary issue with the freeGeoIP service. Will look into hosting it at MWT2. • A large speed-up of the pandamon FAX failover page. Still some way to go.
Plans • Push on deployment • Continue USA cloud stress testing • Further tutorial development