
CC-IN2P3 Tier-2s Cloud

Explore the LCG-France Tier-2 and Tier-3 sites and their contribution to the ATLAS project. Learn about the activities, resources, and collaborations involved.



Presentation Transcript


  1. CC-IN2P3 Tier-2s Cloud Frédérique Chollet (IN2P3-LAPP) on behalf of the LCG-France project and Tiers representatives ATLAS visit to Tier-1 Lyon, April 26-27 2007

  2. Contents • LCG-France sites / Tiers of ATLAS • ATLAS Cloud FR Activities • ATLAS and Sites • Discussion Thanks to Ueda, Fabio, Jean-Pierre, Eric, Stéphane

  3. LCG-France sites (1/3) • LCG-France promotes the creation and coordinates the integration of French Tier-2/Tier-3 sites into the WLCG collaboration • WLCG Tier-2s: Analysis Facility in Lyon and 3 Tier-2s • GRIF-Paris Region, acting as a federation of 5 sites (DAPNIA, IPNO, LAL, LLR, LPNHE) • LPC-Clermont • Subatech-Nantes • Resources outside the Tier-1: a set of 3 Tier-2s and 4 Tier-3s • 10 French laboratories involved (1 more candidate: LPSC Grenoble) • Tier-2s and Tier-3s funded by universities, local/regional governments, hosting laboratories, … • Open to EGEE VOs - collaborations established outside HEP • Tier-2 strategy: WLCG Tier-2 (+ Tier-3/EGEE outside MoU) • WLCG Tier-2: ~80 % of GRIF resources • Tier-3 strategy: local analysis facility, happy to be considered as a small Tier-2 (opportunistic use by experiments), open to EGEE VOs

  4. LCG-France sites (2/3) • Scientific and technical project leaders • CC-IN2P3, AF Lyon: F. Malek, F. Hernandez • CPPM Marseille: C. Bee, T. Mouthuy • GRIF Paris Region: JP. Meyer, M. Jouvin • IPHC Strasbourg: D. Bloch, Y. Patois • IPNL Lyon: S. Perries, D. Pugnère • LAPP Annecy: S. Jézéquel, N. Neyroud • LPC Clermont: D. Pallin, J.C. Chevaleyre • Subatech Nantes: L. Aphecetche, JM. Barbet • Technical teams: G. Baulieu, JM. Barbet, C. Barbier, J. Bernier, D. Bouvet, B. Boutherin, L. Caillat, K. Chawoshi, H. Cordier, C. Diarra, S. Elles, E. Fede, Y. Giraud, P. Girard, M. Gougerot, C. Gondrand, Z. Georgette, E. Knoops, P. Micout, P. Larrieu, C. Leroy, C. L’Horphelin, L. Martin, E. Medernach, V. Mendoza, P. Mora de Fraitas, T. Ollivier, Y. Perret, G. Philippon, G. Rahal, M. Ricard, R. Rumler, F. Schaer, J. Schaeffer, L. Schwarz, I. Semeniouk, D. Terront, A. Trunov

  5. LCG-France sites (3/3) • Supported LHC experiments • All sites also support other virtual organizations

  6. Tier-2s Contribution (to WLCG MoU) • Computing resources in 2008 • Tier-2s: 45 % of the total CPU resources pledged in France • Tier-2 planning has been revised according to new estimates of computing capacity requirements • Source: http://lcg.web.cern.ch/LCG/planning/phase2_resources/P2PRCcaps170407.pdf

  7. Tier-2s Planned Capacity in 2008

  8. Tier-3s Planned Evolution • Tier-3, Analysis facilities and EGEE resources (outside MoU)

  9. Tiers of ATLAS LCG-France sites • Map labels by location • Ile de France: Tier-2 GRIF (CEA/DAPNIA, LAL, LLR, LPNHE, IPNO); Tier-2 GRIF (CEA/DAPNIA, LAL, LPNHE) • Strasbourg: Tier-3 IPHC • Nantes: Tier-2 Subatech • Clermont-Ferrand: Tier-2 LPC • Annecy: Tier-3 LAPP • Lyon: Tier-1 CC-IN2P3, AF CC-IN2P3, Tier-3 IPNL • Marseille: Tier-3 CPPM

  10. ATLAS Planned Capacity in 2008 • Chart labels: AF; Tier-2 (dedicated to simulation); MoU contribution eq. 1/30 of ATLAS T2 target; ATLAS T2; CPU; Disk

  11. ATLAS Tier-2s planned evolution • ATLAS resources in Tier-2s

  12. Tier-2/Tier-3 Activities • LCG-France Tier-2/Tier-3 technical activities officially set up in April 2006 • Collaboration tools in place • Mailing list, wiki pages, regular video-conference meetings • Activities • Very active in the Quattor working group • Used by most of the LCG-France sites • Network-level and SRM-level data transfer tests from and to the Tier-1 • Including associated foreign sites • CC-IN2P3 support to Tier-2s • Monitoring of collective services (FTS) and common infrastructure (network): http://cctools.in2p3.fr/dcache/monitoring/ftsmonitor.php • CPU benchmarking… • Meetings held with several potential hardware providers • Sharing of technical and commercial information (hardware evaluation results, commercial conditions, etc.) • DPM day (advanced session) with S. Lemaitre

  13. Tier-2/Tier-3 Activities (cont.) • In close contact with some associated foreign Tier-2s • Europe • Belgium CMS Tier-2 • Romanian Federation ATLAS Tier-2 • Asia • IHEP China - ATLAS and CMS Tier-2 • ICEPP Japan - ATLAS Tier-2 • In close contact with • EGEE SA1: Grid Operations (ROC support) • Experiments: LHC computing tracking • CAF (Computing ATLAS-France), TFEP (Task Force “Efficacité de Production”) • Network experts in IN2P3 and the Renater NREN

  14. Site availability survey • Trying to define LCG-France site reports • Site availability measured as CE & sBDII & SE & SRM from SAM tests: https://lcg-sam.cern.ch:8443/sam/sam.py • Data extraction from SAM (unofficial) • Aiming for 95 % availability
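The availability definition on this slide (a site counts as available only when CE, sBDII, SE and SRM all pass their SAM tests) can be sketched as below. This is a minimal illustration and not the extraction script used by LCG-France: the sample format (one mapping of service name to pass/fail per test round), the function names and the data are all assumptions.

```python
# Hypothetical sketch: site availability as the logical AND of service tests.
SERVICES = ("CE", "sBDII", "SE", "SRM")  # services named on the slide
TARGET = 0.95  # availability goal quoted on the slide


def site_available(sample):
    """A site is up for one test round only if every service passed."""
    return all(sample.get(service, False) for service in SERVICES)


def availability(samples):
    """Fraction of test rounds in which the site was fully available."""
    if not samples:
        raise ValueError("no samples")
    return sum(site_available(s) for s in samples) / len(samples)


# Made-up data: 20 test rounds, one of them with an SRM failure.
ok = {"CE": True, "sBDII": True, "SE": True, "SRM": True}
srm_down = dict(ok, SRM=False)
samples = [ok] * 19 + [srm_down]

score = availability(samples)
print(f"availability = {score:.2%}, target met: {score >= TARGET}")
```

Because the services are combined with AND, a single failing service (for instance SRM) marks the whole site as unavailable for that round, which is why one weak service can dominate the score.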

  15. Site availability survey • GRIF overall availability benefits from the federation (redundancy of site service instances) • Impact of SAM BDII failures (timeouts and information instabilities) still to be assessed • Lower availability score for the ATLAS VO (metrics in real conditions) than for OPS (no space left, permission denied…) • Comparison of overall availability for the OPS and ATLAS VOs: example of GRIF (difference due to SRM failures) • Plots: SAM OPS, SAM ATLAS

  16. Jobs survey - Country view (from EGEE accounting) • Accounting report: data extracted from the EGEE Country View • http://www3.egee.cesga.es/gridsite/accounting/CESGA/country_view.html • Accounting enforcement / benchmarking discussion (ongoing work) • CPU time plots require appropriate SpecInt values to be published and normalized • The APEL tool provides average figures, which are not meaningful for a heterogeneous farm • CC-IN2P3 uses an adapted accounting, normalized per job • ~36 % of the total number of jobs are ATLAS jobs
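The point about average figures on a heterogeneous farm can be illustrated with a small calculation. The sketch below assumes the usual convention of scaling raw CPU hours by the SpecInt2000 rating divided by 1000 (giving kSI2K-hours); the job list, the ratings and the function names are invented for the example and do not come from the slides.

```python
# Hypothetical sketch: per-job vs farm-average CPU normalization.
# Each job: (raw CPU hours, SpecInt2000 rating of the node that ran it).
jobs = [
    (10.0, 1000),  # older worker node
    (10.0, 1000),
    (10.0, 2500),  # newer, faster worker node
]


def normalized_per_job(jobs):
    """Sum of CPU hours scaled by each node's own rating, in kSI2K-hours."""
    return sum(hours * si2k / 1000.0 for hours, si2k in jobs)


def normalized_with_average(jobs, farm_average_si2k):
    """Total CPU hours scaled by one published farm-wide average rating."""
    total_hours = sum(hours for hours, _ in jobs)
    return total_hours * farm_average_si2k / 1000.0


# Assumed published average: the mean over node types (1000 and 2500),
# not weighted by where the jobs actually ran.
farm_average = (1000 + 2500) / 2

print(normalized_per_job(jobs))                     # 45.0
print(normalized_with_average(jobs, farm_average))  # 52.5
```

With these made-up numbers the farm-wide average overstates the delivered work, because most jobs ran on the slower nodes; normalizing each job with the rating of the node that ran it avoids that bias.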

  17. ATLAS FR Cloud • Map labels • Ile de France: Tier-2 GRIF (CEA/DAPNIA, LAL, LPNHE) • Tier-2: LPC • Annecy: Tier-3 LAPP • Tier-1: CC-IN2P3, AF: CC-IN2P3 • Marseille: Tier-3 CPPM • Nantes • Foreign sites: Romania, Beijing, Tokyo

  18. Network Performance Tests: Lyon T1 – Tokyo T2 • Ongoing effort from Tokyo and CC-IN2P3 experts to achieve smooth data transfers over a long-distance network • SL4 (kernel 2.6 with BIC TCP): much better congestion control than SL3 (kernel 2.4) and Solaris 10 • Lyon to Tokyo: 0-5 MB/s (1 stream), 2-20 MB/s (10 streams) • Tokyo to Lyon: 10-15 MB/s (1 stream), 44-60 MB/s (10 streams, max 100 MB/s) • With a software pacer (PSPacer by AIST) in addition: stable and good performance • Lyon to Tokyo: 45 MB/s (1 stream), 45 MB/s (2 to 8 streams) • Tokyo to Lyon: 70 MB/s (1 stream), 100 MB/s (2 to 8 streams) • By courtesy of H. Matsumoto, L. Caillat
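To give a feel for what these rates mean in practice, the sketch below converts the Lyon to Tokyo figures quoted on the slide into transfer times. The dataset size, the choice of the upper end of each quoted range and the function name are assumptions made for illustration only.

```python
# Hypothetical sketch: time needed to move a dataset at the quoted rates.
# Rates in MB/s are read off the slide (upper end of each quoted range).
LYON_TO_TOKYO_MB_S = {
    "1 stream, no pacer": 5,
    "10 streams, no pacer": 20,
    "1 stream, PSPacer": 45,
}


def transfer_hours(size_gb, rate_mb_s):
    """Hours to transfer size_gb gigabytes at a sustained rate in MB/s."""
    if rate_mb_s <= 0:
        raise ValueError("rate must be positive")
    return size_gb * 1000 / rate_mb_s / 3600  # 1 GB taken as 1000 MB


DATASET_GB = 1000  # assumed dataset size, not from the slides

for setup, rate in LYON_TO_TOKYO_MB_S.items():
    hours = transfer_hours(DATASET_GB, rate)
    print(f"{setup}: {hours:.1f} h")
```

Under these assumptions a single paced stream moves the dataset several times faster than ten unpaced streams, which matches the slide's remark that the pacer gives stable and good performance.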

  19. ATLAS FR Cloud activities • CAF activities • Monte Carlo production • Autumn 2006: executor installed at Lyon to distribute production jobs within the FR cloud • Production shift organization • FR sites accounted for 16 % of LCG in 2006 • FR cloud production monitoring: http://atlas-saclay.in2p3.fr/eln/index.php • Improving contacts between the production group and site admins • Sharing a clear understanding of what is going on • By courtesy of E. Lançon, J. Schwindling and CAF

  20. Do we work well? • ATLAS monitoring: http://atlas-php.web.cern.ch/atlas-php/DbAdmin/Ora/php 4.3.4/proddb/monitor/OverViews.php • How to improve site efficiency? • Set up site alerts, but error follow-up is not so easy

  21. Do we work well? • Sites should check: • EXECG_GETOUT_EMPTYOUT, possible reasons: WNs with local disk full; no write permission; dying disk; incorrect ssh keys on the WN; … • WRAPLCG_WNCHECK_SWMISS, problem with the ATLAS software: NFS problems; $VO_ATLAS_SW_DIR not correctly defined; …
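Several of the causes listed on this slide can be checked locally on a worker node before jobs start failing. The sketch below is one possible sanity check, not an ATLAS or LCG tool: the function name, the scratch directory argument and the free-space threshold are assumptions, and only the variable name VO_ATLAS_SW_DIR and the two error codes come from the slide.

```python
import os
import shutil
import tempfile


def check_worker_node(env, scratch_dir, min_free_bytes=1_000_000_000):
    """Return a list of human-readable problems found on this node."""
    problems = []

    # WRAPLCG_WNCHECK_SWMISS: ATLAS software area undefined or missing.
    sw_dir = env.get("VO_ATLAS_SW_DIR")
    if not sw_dir:
        problems.append("VO_ATLAS_SW_DIR is not defined")
    elif not os.path.isdir(sw_dir):
        problems.append(f"VO_ATLAS_SW_DIR does not exist: {sw_dir}")

    # EXECG_GETOUT_EMPTYOUT: local disk full or not writable.
    if not os.path.isdir(scratch_dir):
        problems.append(f"scratch directory missing: {scratch_dir}")
        return problems
    if shutil.disk_usage(scratch_dir).free < min_free_bytes:
        problems.append(f"low disk space in {scratch_dir}")
    try:
        # Writing a small probe file catches permission and disk errors.
        with tempfile.TemporaryFile(dir=scratch_dir) as handle:
            handle.write(b"probe")
    except OSError as error:
        problems.append(f"cannot write to {scratch_dir}: {error}")

    return problems


if __name__ == "__main__":
    for problem in check_worker_node(os.environ, tempfile.gettempdir()):
        print("WARNING:", problem)
```

Passing the environment and the scratch directory as arguments keeps the check easy to exercise outside a real worker node; it does not cover the ssh key or dying-disk cases, which need site-specific tooling.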

  22. Tier-1 – Tier-2, July 2006 • DDM Functional Test • Sites shown: LAL, LPNHE, Saclay, Beijing, LPC, LAPP, Tokyo • By courtesy of G. Rahal, S. Jezequel

  23. BAD OK

  24. ATLAS and Sites concerns • Optimizing processing capacity • Centralized / distributed, done at VO level and/or site level • VO strategy being pushed to sites / site strategy being published • Job priorities based on VOMS groups and roles integration • Optimizing grid-enabled disk storage and integrating data management tools • VOMS groups and roles integration, DPM ACL changes • SRM v2.2, data access protocols • Difficult for Tier-2s to exercise the data transfer infrastructure by themselves • Provisioning specific services according to the experiment requirements • Compatibility with other VOs, security • Clear understanding of specificities and plans • Ensuring service levels and response times • Operating grid services • Supporting experiment activity over the Christmas and summer periods (laboratories may be closed) • Evolution and reliability of the information system

  25. Plans for 2007 • Storage space provision is a major concern for all Tiers • Data access patterns required by the experiments • Managed disk-enabled storage: SRM v2.2 implementation • File system studies (GPFS, Lustre evaluation) and GSI-enabled protocols • VOMS groups and roles integration • Site availability: improve stability despite ongoing activities at sites • Infrastructure consolidation, hardware procurements, OS evolution, middleware upgrade • Electrical and cooling infrastructure is an issue • Running over Christmas and holiday periods… • Efficiency: plans for a close collaboration with the new TFEP (Task Force Efficacité de Production) • Improve the global efficiency of ATLAS production on the FR cloud • More on grid security in connection with IN2P3 security managers • Enhance Tier-2 representation at the GDB • Improve monitoring • Set up a new test period for SRM-level and FTS data transfers if possible

  26. Conclusions • Additional resources coming from Tier-2 and even Tier-3 initiatives • Not in competition with Tier-1 funding, but funding support expected in 2009 • Significant effort in terms of budget, infrastructure, human support… • Collaborative work • Within EGEE SA1 • Resources and baseline services • Within LCG-France • Tier-1 – Tier-2s – Tier-3s integration • Enhance collaboration between site experts and experiment representatives • Relation with the corresponding T1 is fundamental • Working together with experiments • Experiment computing models define task distribution, data distribution and specific data flows between Tier-1s and Tier-2s

  27. ATLAS and Sites • Diagram labels: Tier-1 CC-IN2P3 experts; ATLAS, CAF; Sites, LCG-France T2-T3 • Thanks to experts from ATLAS, sites and CC-IN2P3!

  28. Reference documents • LCG-France Tier-2 Tier-3 Resource Planning 2006-2010: https://edms.in2p3.fr/document/I-008142/18/04/2007 update • W-LCG reference documents: • Summary of Regional Centres Capacity, 17/04/2007 update: http://lcg.web.cern.ch/LCG/planning/phase2_resources/ http://lcg.web.cern.ch/LCG/planning/phase2_resources/P2PRCcaps170407.pdf • Revised Computing Capacity Requirements, October 2006: http://lcg.web.cern.ch/LCG/MB/revised_resources/
