
Summary of the last GridKA Cloud Meeting (07 July 2010)




Presentation Transcript


  1. Summary of the last GridKA Cloud Meeting (07 July 2010) Marc Goulette (University of Geneva) Swiss WLCG Operations Meeting

  2. Cloud Status (Guenter Duckeck)
     * New ATLAS contact at GridKa since 1st July: Andreas Petzold
     * Operations ran smoothly in June, with some problems:
       - GridKa tape reading tests failed, tape library broken
       - Freiburg: extended downtime due to cooling problems
       - DESY-HH: ATLASSCRATCHDISK size/overload
       - DESY-ZN: observed ATLAS jobs with excessive memory usage
     * Amsterdam Jamboree: WLCG meeting on evolution of data and storage element
       - Trend towards more dynamic data distribution (caching) rather than static placement
       - Several demonstrator projects in the next months
       - Might change/increase network usage
     * TAB and HGF-Grid PB meetings:
       - Discussed network situation in DE cloud
       - Started first analysis of ATLAS data transfer patterns using log information provided by sites
         + GridKa dominates
         + DE T2-T2 traffic low (<10%)
         + Interpretation difficult, as GridKa numbers also include FTS 3rd-party transfers between sites (but this is expected to be a small contribution)
         + Some variations between sites (DESY has a relatively large non-DE & CERN transfer fraction)
       - Discussion of network situation in DE cloud (see http://indico.cern.ch/getFile.py/access?contribId=3&resId=0&materialId=0&confId=100512 for details). Mixed situation wrt. network connectivity in Germany.
       - J. Schultes provided a script to parse dCache billing logs

  3. Cloud Status (Guenter Duckeck)
     * ATLAS GridKa F2F operations meeting on June 24
       - Agenda and minutes: http://indico.cern.ch/conferenceDisplay.py?confId=98902
       - Extensive and productive discussion of operation areas, monitoring, testing, documentation
       - Template for operations wiki (to be filled): https://twiki.cern.ch/twiki/bin/view/Sandbox/GridKaSquadPage
       - We should extend our cloud monitoring page http://happyface-goegrid.gwdg.de/cloudmon/CloudMon.html
         + Job info (e.g. running, queued, CPU/Wallt for prod and user)
         + Storage info (e.g. space token usage, IO rates, movers)
         + Will discuss if/how sites could provide this information
     * ATLAS DE cloud computing meeting on July 19/20
       - Main focus on user analysis experience and support
       - Plan to have 2 hrs T1/T2 operations meeting before
       - Preliminary agenda: https://indico.desy.de/conferenceOtherViews.py?view=standard&confId=3161

  4. TIER1 OPERATIONS (Gen Kawamura, Andreas Petzold):
     -------------------------------------------------
     * dCache milestone file space is ready (ongoing this week, 2/3 ready, 1/3 yet to come)
     * FTS updated to latest release, including OS upgrade
     * CREAM CE: cream-3-fzk available (CREAM 1.6 / SL5); cream-2-fzk had been drained and was updated
     * OPS tests switched off, nagios probes used instead
     * Upgrade of VOBOX
     * LFC: new 1.7.5 on SL5 will be installed (no date yet)
     * dCache access statistics:
       - dcap access to all space tokens becoming more important
       - Most accessed files: COND, DBRELEASE, group.phys-top.D2PD (on SCRATCHDISK), DATADISK (see T1 report pdf file)
     * Tape problems of last month: all problems fixed, tape library back online but still not as reliable as required

     PRODUCTION OPERATIONS:
     ----------------------
     * In general no problems to report, almost no production in June, "missing pilots" problems under investigation

  5. DATA MANAGEMENT (Cedric Serfon):
     --------------------------------
     * Smooth operation in June
     * Overall transfer efficiency in the last 30 days: 96% (95% last month)
     * Transfer volume a bit lower (~1.7M files in June vs. ~2.5M in May; 136MB/s in June vs. 300MB/s in May)
     * 2 file losses
       - Wuppertal (~19000 files) due to a problem with a disk controller
       - LRZ (~9000 files): backplane burned
     * It was recently proposed not to export MC (DATA) to T2s that do not have at least 50TB for ATLASMCDISK (ATLASDATADISK)
       - Until now only a proposal, will probably be discussed in software week
       - Current situation: 3 sites in the DE cloud have yet to cross this threshold for at least one of their space tokens: CYFRONET, MPPMU, Innsbruck
         + CYFRONET: will get new hardware this year and will be able to increase tokens to 50TB
         + MPPMU: expects new hardware in September; has increased one token to 50TB
         + Innsbruck: will add new hardware
     * LOCALGROUPDISK usage:
       - 22 users over 1TB (17 in May); total space used: 173TB (110TB in May, 89TB in April)
       - Could run into problems soon with LOCALGROUPDISK filling up
       - Quota system still under development (probably not available before end of summer)
     * Discussion of provenance of files at Tier2 sites (obtained from Dashboard/site services):
       - Most of the transfer volume comes from GridKa
       - Exception: CSCS, where more than 1/2 of the files come from CERN (caused by a group user doing production at CERN)

  6. TIER2 REPORT (Jan Erik Sundermann):
     -----------------------------------
     * Discussion on space token usage (see pdf file)
       - CYFRONET MCDISK close to being full
     * Accounting (see pdf file)
       - Almost no production in June
       - Increased user activity observed (mainly via PANDA pilots)

     SOFTWARE INSTALLATION (Joerg Meyer), see pdf file:
     --------------------------------------------------
     * Most sites have the latest releases installed. Some smaller problems under investigation
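
The month-to-month figures quoted in the data management report (slide 5) are easier to compare as relative changes. The following is a minimal sketch and not part of the original presentation: the helper name `pct_change` is hypothetical, and the inputs are the rounded numbers from the slide, so the results are only approximate.

```python
def pct_change(old: float, new: float) -> float:
    """Relative change from `old` to `new`, in percent."""
    return (new - old) / old * 100.0


# Rounded figures as quoted on slide 5 (assumption: taken at face value).
localgroupdisk_tb = [("April", 89), ("May", 110), ("June", 173)]
transfers = {
    "files (millions)": (2.5, 1.7),  # (May, June)
    "rate (MB/s)": (300, 136),       # (May, June)
}

# LOCALGROUPDISK growth between consecutive months.
for (prev_month, prev_tb), (month, tb) in zip(localgroupdisk_tb, localgroupdisk_tb[1:]):
    print(f"LOCALGROUPDISK {prev_month} -> {month}: "
          f"{tb - prev_tb:+d} TB ({pct_change(prev_tb, tb):+.1f}%)")

# Change in transfer volume from May to June.
for label, (may, june) in transfers.items():
    print(f"Transfers, {label}: {pct_change(may, june):+.1f}%")
```

Under these rounded inputs, LOCALGROUPDISK usage grew faster from May to June than from April to May, which is consistent with the concern on the slide that the space token could fill up soon.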
