
LHC Computing Grid: CCIN2P3 role and Contribution


Presentation Transcript


  1. LHC Computing Grid: CCIN2P3 role and Contribution. KISTI-CCIN2P3 Workshop, Ghita Rahal. KISTI, December 1st, 2008

  2. Index • LHC computing grid • LCG France • LCG at CCIN2P3 • Infrastructure Validation: An example with ALICE • General issues • Conclusions. Credits to Fabio Hernandez (CC), Latchezar Betev (ALICE)

  3. LHC Computing Grid

  4. Worldwide LCG Collaboration • LHC Computing Grid • Purpose: develop, build and maintain a distributed computing environment for the storage and processing of data for the 4 LHC experiments • Ensure the computing service and the application libraries and tools common to the 4 experiments • Resources contributed by the countries participating in the experiments • Commitments made each October of year N for year N+1 • 5-year forward planning

  5. LHC Data Flow • Raw data generated by the detectors that needs to be permanently stored • These figures include neither the derived nor the simulated data. Accelerator duty cycle: 14 hours/day, 200 days/year. 7 PB of additional raw data per nominal year
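
A back-of-the-envelope cross-check of these figures (not part of the original slides; decimal units and the stated duty cycle assumed):

```python
# Rough check of the raw-data figures quoted on the slide above
# (assumes 14 hours/day over 200 days/year and 7 PB of raw data per nominal year).

RAW_PER_YEAR_PB = 7                 # additional raw data per nominal year
HOURS_PER_DAY = 14                  # accelerator duty cycle
DAYS_PER_YEAR = 200

beam_seconds = HOURS_PER_DAY * 3600 * DAYS_PER_YEAR   # ~1.0e7 s of data taking
raw_bytes = RAW_PER_YEAR_PB * 1e15                    # decimal petabytes

avg_rate_mb_s = raw_bytes / beam_seconds / 1e6
print(f"Average raw-data rate while taking data: ~{avg_rate_mb_s:.0f} MB/s")
# -> ~690 MB/s aggregated over the four experiments
```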

  6. Processing power for LHC data • Computing resource requirements • All LHC experiments for 2009 • About 28,000 quad-core Intel Xeon 2.33 GHz (Clovertown) CPUs (14,000 compute nodes) • … and 5 MW of electrical power!!! • More than 73,000 1 TB disk spindles. Source: WLCG Revised Computing Capacity Requirements, Oct. 2007
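
For scale, the quoted counts translate roughly as follows (a sketch, not from the slides; decimal units):

```python
# Rough totals implied by the 2009 requirement figures quoted above (sketch only).

cpus = 28_000            # quad-core Xeon 2.33 GHz (Clovertown) CPUs
cores_per_cpu = 4
nodes = 14_000           # compute nodes, i.e. two CPUs per node
disks = 73_000           # 1 TB disk spindles

print(f"Total cores:        {cpus * cores_per_cpu:,}")      # 112,000
print(f"CPUs per node:      {cpus // nodes}")               # 2
print(f"Raw disk capacity:  ~{disks / 1000:.0f} PB")        # ~73 PB
```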

  7. WLCG Architecture (cont.) • Resource location per tier level • A significant fraction of the resources is distributed over 130+ centres

  8. Tier-1 centres Source: WLCG Memorandum of Understanding – 2007/12/07

  9. LCG France

  10. LCG-France project • Goal • Set up, develop and maintain a WLCG Tier-1 and an Analysis Facility at CC-IN2P3 • Promote the creation and coordinate the integration of French Tier-2/Tier-3 sites into the WLCG collaboration • Funding • National funding for Tier-1 and AF • Tier-2s and Tier-3s funded by universities, local/regional governments, hosting laboratories, … • Schedule • Started in June 2004 • 2004-2008: setup and ramp-up phase • 2009 onwards: cruise phase • Equipment budget for Tier-1 and Analysis Facility • 2005-2012: 32 M€

  11. LCG-France • GRIF (Île-de-France), Tier-2: APC, CEA/DSM/IRFU, IPNO, LAL, LLR, LPNHE • IPHC (Strasbourg): Tier-3 • Subatech (Nantes): Tier-2 • LAPP (Annecy): Tier-2 • LPC (Clermont-Ferrand): Tier-2 • IPNL (Lyon): Tier-3 • CC-IN2P3 (Lyon): Tier-1 & analysis facility • LPSC (Grenoble): Tier-3 • CPPM (Marseille): Tier-3. Source: http://lcg.in2p3.fr

  12. Associated Tier-2s (linked to CC-IN2P3, Lyon) • Belgium CMS Tier-2s • Romanian ATLAS federation • IHEP: ATLAS/CMS Tier-2 in Beijing • ICEPP: ATLAS Tier-2 in Tokyo

  13. LCG-France sites • Most sites serve the needs of more than one experiment and group of users

  14. Tier-2s planned contribution (LCG-France target)

  15. Connectivity • Excellent connectivity to other national and international institutions provided by RENATER • The role of the national academic & research network is instrumental for the effective deployment of the grid infrastructure. [Map of the RENATER network: dark fiber, 2.5 Gbit/s and 1 Gbit/s (GE) links, with points including Kehl, Le Mans, Angers, Tours, Genève (CERN) and Cadarache. Source: Frank Simon, RENATER]
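
As a quick sanity check (not from the slides), the quoted link speeds can be compared with the 160 MB/s Tier-0 to CC-IN2P3 goal shown later in the deck; decimal units, protocol overhead ignored:

```python
# Hypothetical sanity check: which of the quoted RENATER link speeds could
# sustain the 160 MB/s Tier-0 -> CC-IN2P3 goal mentioned on a later slide?
# (1 Gbit/s = 125 MB/s; protocol overhead ignored.)

GOAL_MB_S = 160

def link_capacity_mb_s(gbit_per_s: float) -> float:
    """Convert a link speed in Gbit/s to MB/s."""
    return gbit_per_s * 1000 / 8

for name, gbit in [("2.5 Gbit/s link", 2.5), ("1 Gbit/s (GE) link", 1.0)]:
    cap = link_capacity_mb_s(gbit)
    verdict = "enough" if cap >= GOAL_MB_S else "not enough"
    print(f"{name}: ~{cap:.0f} MB/s ({verdict} for the {GOAL_MB_S} MB/s goal)")
```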

  16. LCG at CCIN2P3

  17. LCG-France tier-1 & AF • Roughly equivalent to 305 Thumpers (with 1 TB disks) or 34 racks

  18. LCG-France tier-1 & AF contribution

  19. CPU ongoing activity at CC • Plots for ATLAS, CMS, ALICE and LHCb over 2007-2008 • Note: the scale is not the same on all plots

  20. Resource usage (tier-1 + AF)

  21. Resource deployment (× 6.7)

  22. Resource deployment (cont.) (× 3.1)

  23. LCG tier-1: availability & reliability • Scheduled shutdown of services on: 18/09/2007, 03/11/2007, 11/03/2008. Source: WLCG T0 & T1 Site Reliability Reports

  24. LCG tier-1: availability & reliability (cont.) Source: WLCG T0 & T1 Site Reliability Reports

  25. Infrastructure Validation: an example with ALICE

  26. Validation program: Goal • Registration of data at T0 and on the Grid • T0→T1 replication • Condition data on the Grid • Quasi-online reconstruction • Pass 1 at T0 • Reprocessing at T1 • Replication of ESDs: T1→T2/CAFs • Quality control • MC production and user analysis at T2/CAFs

  27. Data flow and rates • First part: ½ nominal p+p acquisition rate (DAQ) + nominal rate for distribution • DAQ → CASTOR2 via rfcp; CASTOR2 → T1 storage via FTS/GridFTP at 60 MB/s; xrootd access for reconstruction at T0 and for the CAF • Average: 60 MB/s, peak: 3 GB/s. Source: L. Betev

  28. CCRC08, 15 February - 10 March • Tests with half the DAQ-to-CASTOR rates • 82 TB total in 90K files (0.9 GB/file) • 70% of the nominal monthly p+p volume
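
The quoted volumes are mutually consistent, as this small check shows (a sketch; the nominal monthly volume is derived, not stated on the slide):

```python
# Consistency check of the CCRC08 phase-1 numbers quoted above (sketch only).

total_tb = 82              # TB transferred between 15 February and 10 March
n_files = 90_000
nominal_fraction = 0.70    # stated: 70% of the nominal monthly p+p volume

avg_file_gb = total_tb * 1000 / n_files
implied_monthly_tb = total_tb / nominal_fraction

print(f"Average file size:              ~{avg_file_gb:.1f} GB")        # ~0.9 GB
print(f"Implied nominal monthly volume: ~{implied_monthly_tb:.0f} TB")  # ~117 TB
```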

  29. T0 → T1 replication • End of data taking • Expected rate: 60 MB/s

  30. T0T1 replication ALL CCRC phase 1 CCRC phase 2

  31. T0 → CC-IN2P3 • End of data taking • Tests before Run III (May)

  32. T0 → CC-IN2P3 (all experiments) • Goal: 160 MB/s or 14 TB/day • Note: the expected rates are still unknown for some experiments (and keep changing). This is the goal according to the Megatable, which is the reference document (even if it is no longer maintained)
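
The two forms of the goal quoted above agree, as a quick conversion shows (decimal units assumed):

```python
# 160 MB/s sustained vs. 14 TB/day: quick unit conversion (decimal units).

rate_mb_s = 160
tb_per_day = rate_mb_s * 86_400 / 1e6    # MB/day -> TB/day
print(f"{rate_mb_s} MB/s sustained = ~{tb_per_day:.1f} TB/day")  # ~13.8 TB/day, i.e. ~14 TB/day
```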

  33. ALICE CCRC08: May period • Detector activities • ALICE offline upgrades • New VO-box installation • New AliEn version • Tuning of reconstruction software • Exercise of the ‘fast lane’ calibration/alignment procedure… • Data replication: T0→T1, scheduled according to ALICE shares

  34. May: all 4 experiments concurrently • Tier-0 → CC-IN2P3 • Goal: 160 MB/s or 14 TB/day • Note: the expected rates are still unknown for some experiments (and keep changing). This is the goal according to the Megatable, which is the reference document (even if it is no longer maintained)

  35. Post mortem of CCRC08 • Reliable central data distribution • High CC-IN2P3 efficiency/stability (dCache, FTS, …) • Good, high performance of the French Tier-2s • Showed large safety margins for transfers between T1 and T2

  36. High priority: Analysis Farm (1/2) • Time to concentrate on user analysis: • Must take place in parallel with other tasks • Unscheduled burst access to the data • Users expect fast return of their output • Interactivity… • Ongoing activity at CC: • Identify the needs • Set up a common infrastructure for the 4 LHC experiments

  37. High priority: Analysis Farm (2/2) • Ongoing activity at CC (cont'd): • Goal: prototype to be tested at the beginning of 2009 • ALICE specifics: • Farm design already in test at CERN; expect to deploy one in France according to the specs, but shareable with the other experiments

  38. General issues for CC-IN2P3 • Improve each component: • Storage: higher performance for HPSS and improved interaction with dCache • Increase the level of redundancy of the services to decrease human intervention (VO boxes, LFC, …) • Monitoring, monitoring, monitoring… • Manpower: need to reach a higher level of staffing, mainly for storage

  39. Conclusion • The 2008 challenge has shown the capability of LCG-France to meet the challenges of LHC computing • It has also shown the need for permanent background testing and monitoring of the worldwide platform • Need to improve the reliability of the storage and data distribution components

  40. Backup Slides

  41. ALICE Computing Model • p-p: • Quasi-online data distribution and first reconstruction at T0 • Further reconstruction at Tier-1s • A-A: • Calibration, alignment and pilot reconstruction during data taking • Data distribution and first reconstruction at T0 • One copy of RAW at T0 and one distributed among the Tier-1s

  42. ALICE Computing Model • T0: • First-pass reconstruction, storage of one copy of RAW, calibration and first-pass ESDs • T1: • Storage of a share of the RAW data, ESDs and AODs on disk • Reconstruction • Scheduled analysis • T2: • Simulation • End-user analysis • Copy of ESDs and AODs
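
The division of roles on the last two slides can be summarized as a small data structure; this is a purely illustrative sketch (the names are hypothetical and do not come from AliEn or any ALICE software):

```python
# Illustrative summary of the ALICE computing model described on the slides
# above. The structure and names are hypothetical, not actual ALICE/AliEn code.

ALICE_COMPUTING_MODEL = {
    "T0": {
        "tasks": ["first-pass reconstruction", "calibration", "first-pass ESD"],
        "storage": ["one copy of RAW"],
    },
    "T1": {
        "tasks": ["reprocessing / further reconstruction", "scheduled analysis"],
        "storage": ["share of RAW", "ESDs and AODs on disk"],
    },
    "T2": {
        "tasks": ["Monte Carlo simulation", "end-user analysis"],
        "storage": ["copies of ESDs and AODs"],
    },
}

for tier, role in ALICE_COMPUTING_MODEL.items():
    print(f"{tier}: tasks={role['tasks']}; storage={role['storage']}")
```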
