
INFN-T1 site report



  1. INFN-T1 site report Andrea Chierici On behalf of INFN-T1 staff HEPiX Fall 2013

  2. Outline • Network • Farming • Storage • Common services Andrea Chierici

  3. Network

  4. WAN Connectivity (Cisco 7600 + Nexus) • LHC OPN peers: RAL, PIC, TRIUMF, BNL, FNAL, TW-ASGC, NDGF • LHC ONE peers: IN2P3, SARA • General IP via GARR Bo1: 10 Gb/s today, 20 Gb/s planned (Q3-Q4 2013) • Dedicated 10 Gb/s CNAF-FNAL link for CDF (Data Preservation) • LHCOPN and LHCONE share a 20 Gb physical link (2x10 Gb) towards the T1 resources, 40 Gb/s planned (Q3-Q4 2013)

  5. Farming and Storage: current connection model • Diagram: LHCOPN/Internet, Cisco 7600 and Nexus 7018 core (10 Gb/s links), BD8810; disk servers at 2x10 Gb/s (up to 4x10 Gb/s); old 2009-2010 resources at 4x1 Gb/s; farming switches with 20 worker nodes per switch • Core switches and routers are fully redundant (power, CPU, fabrics) • Every switch is connected with load sharing on different port modules • Core switches and routers have a strict SLA (next solar day) for maintenance Andrea Chierici

  6. Farming

  7. Computing resources • 195K HS06 • 17K job slots • 2013 tender installed in summer • AMD CPUs, 16 job slots • Whole farm upgraded to SL6 • Per-VO and per-node approach • Some CEs upgraded and serving only some VOs • Older Nehalem nodes got a significant boost switching to SL6 (and activating hyper-threading too…) Andrea Chierici

  8. New CPU tender • 2014 tender delayed until beginning of 2014 • Will probably cover 2015 needs as well • Taking into account TCO (energy consumption), not only sales price • 10 Gbit WN connectivity • 5 MB/s per job (minimum) required • 1 Gbit link is not enough to face the traffic generated by modern multi-core CPUs • Network bonding is hard to configure • Blade servers are attractive • Cheaper 10 Gbit network infrastructure • Cooling optimization • OPEX reduction • BUT: higher street price Andrea Chierici
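A back-of-the-envelope check of the bandwidth argument above; the 5 MB/s-per-job figure is from the slide, while the slots-per-node values are illustrative assumptions:

```python
# Rough per-node network demand, using the 5 MB/s-per-job minimum
# from the slide. The slots-per-node values are illustrative assumptions.
MB_PER_S_PER_JOB = 5

for slots_per_node in (16, 32, 48):
    mbit_per_s = slots_per_node * MB_PER_S_PER_JOB * 8
    print(f"{slots_per_node} slots -> {mbit_per_s} Mb/s")
# 16 slots -> 640 Mb/s, 32 -> 1280 Mb/s, 48 -> 1920 Mb/s:
# a single 1 Gb/s NIC saturates quickly, hence 10 Gbit WN connectivity.
```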

  9. Monitoring & Accounting (1) • Rewritten our local resource accounting and monitoring portal • Old system was completely home-made • Monitoring and accounting were separate things • Adding/removing queues on LSF meant editing lines in monitoring system code • Hard to maintain: >4000 lines of Perl code Andrea Chierici

  10. Monitoring & Accounting (2) • New system: monitoring and accounting share the same database • Scalable and based on open source software (+ a few Python lines) • Graphite (http://graphite.readthedocs.org) • Time-series-oriented database • Django web app to plot on-demand graphs • lsfmonacct module released on GitHub • Automatic queue management Andrea Chierici
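To illustrate the kind of glue code involved, a minimal sketch of pushing one farm metric into Graphite over its plaintext protocol (TCP port 2003); the host name and metric path are hypothetical and not taken from the lsfmonacct module:

```python
import socket
import time

# Push one data point to Graphite via the plaintext protocol:
# "<metric.path> <value> <timestamp>\n" sent to TCP port 2003.
# Host name and metric path below are hypothetical placeholders.
GRAPHITE_HOST = "graphite.example.org"
GRAPHITE_PORT = 2003

def send_metric(path, value, timestamp=None):
    ts = int(timestamp if timestamp is not None else time.time())
    line = f"{path} {value} {ts}\n"
    with socket.create_connection((GRAPHITE_HOST, GRAPHITE_PORT)) as sock:
        sock.sendall(line.encode("ascii"))

# e.g. the number of running jobs in a (hypothetical) LSF queue
send_metric("farm.lsf.queues.alice.running_jobs", 1234)
```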

  11. Monitoring & Accounting (3) Andrea Chierici

  12. Monitoring & Accounting (4) Andrea Chierici

  13. Issues • Grid accounting problems starting from April 2013 • Subtle bugs affecting the log parsing stage on the CEs (DGAS urcollector) and causing it to skip data • WNoDeS issue upgrading to SL6 • Code maturity problems: addressed quickly • Now ready for production • BaBar and CDF will be using it rather soon • Potentially the whole farm can be used with WNoDeS Andrea Chierici

  14. New activities • Investigation of Grid Engine as an alternative batch system ongoing • Testing Zabbix as a platform for monitoring computing resources • Possible alternative to Nagios + Lemon • Dynamic update of WNs to deal mainly with kernel/CVMFS/GPFS upgrades • Evaluating APEL as an alternative to DGAS for the grid accounting system Andrea Chierici
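A possible shape for the WN dynamic update, sketched with the standard LSF administration commands (badmin, bjobs) and yum; the host name, package list and polling interval are assumptions, not the actual procedure used at INFN-T1:

```python
import subprocess
import time

# Sketch of a drain/update/reopen cycle for a single worker node.
# Host name, package list and polling interval are illustrative only.
HOST = "wn-example.cr.cnaf.infn.it"
PACKAGES = ["kernel", "cvmfs", "gpfs.base"]

def drained(host):
    # LSF's bjobs reports "No unfinished job found" when nothing is left
    res = subprocess.run(["bjobs", "-u", "all", "-m", host],
                         capture_output=True, text=True)
    return "No unfinished job found" in (res.stdout + res.stderr)

subprocess.run(["badmin", "hclose", HOST], check=True)   # stop new dispatch
while not drained(HOST):
    time.sleep(300)                                       # wait for running jobs
subprocess.run(["yum", "-y", "update"] + PACKAGES, check=True)
subprocess.run(["badmin", "hopen", HOST], check=True)     # back into production
```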

  15. Storage

  16. Storage Resources • Disk space: 15.3 PB-N (net) on-line • 7 EMC² CX3-80 + 1 EMC² CX4-960 (~2 PB) + 100 servers (2x1 Gb/s connections) • 7 DDN S2A 9950 + 1 DDN SFA 10K + 1 DDN SFA 12K (~11.3 PB) + ~80 servers (10 Gb/s) • Installation of the latest system (DDN SFA 12K, 1.9 PB-N) was completed this summer • ~1.8 PB-N expansion foreseen before the Christmas break • Aggregate bandwidth: 70 GB/s • Tape library SL8500: ~16 PB on-line with 20 T10KB drives and 13 T10KC drives (3 additional drives were added during summer 2013) • 8800 x 1 TB tape capacity, ~100 MB/s of bandwidth for each drive • 1200 x 5 TB tape capacity, ~200 MB/s of bandwidth for each drive • Drives interconnected to library and servers via a dedicated SAN (TAN); 13 Tivoli Storage Manager HSM nodes access the shared drives • 1 Tivoli Storage Manager (TSM) server common to all GEMSS instances • A tender for an additional 470 x 5 TB tapes is under way • All storage systems and disk servers on SAN (4 Gb/s or 8 Gb/s) Andrea Chierici

  17. Storage Configuration • All disk space is partitioned into ~10 GPFS clusters served by ~170 servers • One cluster for each main (LHC) experiment • GPFS deployed on the SAN implements a full HA system • System scalable to tens of PBs and able to serve thousands of concurrent processes with an aggregate bandwidth of tens of GB/s • GPFS coupled with TSM offers a complete HSM solution: GEMSS • Access to storage granted through standard interfaces (POSIX, SRM, XRootD and soon WebDAV) • File systems directly mounted on the WNs Andrea Chierici
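Because the GPFS file systems are mounted directly on the WNs, experiment jobs can read data with plain POSIX calls; a trivial sketch (mount point and file name are hypothetical):

```python
# Plain POSIX access to a GPFS file system mounted on a worker node.
# The mount point and file name are hypothetical examples.
path = "/storage/gpfs_example/atlas/some_dataset/file.root"

with open(path, "rb") as f:      # ordinary open/read; GPFS handles the SAN I/O
    header = f.read(1024)
print(f"read {len(header)} bytes from {path}")
```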

  18. Storage research activities • Studies on more flexible and user-friendly methods for accessing storage over the WAN • Storage federation implementation • Cloud-like approach • We developed an integration between the GEMSS storage system and XRootD in order to match the requirements of CMS and ALICE, using ad-hoc XRootD modifications • The CMS modification was validated by the official XRootD integration build • This integration is currently in production • Another alternative approach for storage federations, based on HTTP/WebDAV (ATLAS use case), is under investigation Andrea Chierici
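For the HTTP/WebDAV federation approach under investigation, a remote read reduces to a standard HTTP request; a minimal sketch with the Python requests library and X.509 proxy authentication (endpoint URL and proxy path are hypothetical placeholders):

```python
import requests

# Minimal HTTP/WebDAV read with X.509 client authentication.
# Endpoint URL and proxy path are hypothetical placeholders.
url = "https://webdav.example.cnaf.infn.it/atlas/some_dataset/file.root"
proxy = "/tmp/x509up_u1000"              # grid proxy: cert and key in one file

resp = requests.get(url,
                    cert=(proxy, proxy),
                    verify="/etc/grid-security/certificates",
                    headers={"Range": "bytes=0-1023"})   # partial read
resp.raise_for_status()
print(f"HTTP {resp.status_code}, got {len(resp.content)} bytes")
```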

  19. LTDP • Long Term Data Preservation (LTDP) for the CDF experiment • FNAL-CNAF data copy mechanism is completed • Copy of the data will follow this timetable: • end 2013 - early 2014 → all data and MC user-level n-tuples (2.1 PB) • mid 2014 → all raw data (1.9 PB) + databases • Bandwidth of 10 Gb/s reserved on the transatlantic link CNAF ↔ FNAL • “Code preservation” issue still to be addressed Andrea Chierici
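As a sanity check on the timetable, the ideal transfer times over the reserved 10 Gb/s link can be estimated (full link utilisation and no protocol overhead assumed):

```python
# Idealised transfer-time estimate for the CDF copy over the 10 Gb/s link,
# assuming full link utilisation and no protocol overhead.
LINK_GBIT = 10
datasets_pb = {"user-level n-tuples": 2.1, "raw data": 1.9}

for name, pb in datasets_pb.items():
    seconds = pb * 1e15 * 8 / (LINK_GBIT * 1e9)
    print(f"{name}: {pb} PB -> {seconds / 86400:.0f} days at {LINK_GBIT} Gb/s")
# ~19 and ~18 days of pure transfer respectively, so a schedule spanning
# several months leaves ample margin for real-world throughput.
```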

  20. Common services

  21. Installation and configuration tools • Quattor is currently the tool used at INFN-T1 • Investigation done on an alternative installation and management tool (study carried out by the storage group) • Integration between two tools: • Cobbler, for the installation phase • Puppet, for server provisioning and management operations • Results of the investigation show Cobbler + Puppet to be a viable and valid alternative • Currently used within CNAF OpenLAB Andrea Chierici
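Cobbler exposes an XML-RPC API that provisioning scripts can drive alongside Puppet; a minimal read-only sketch using Python's standard xmlrpc client (the server URL is a placeholder, and the available methods and fields depend on the Cobbler version):

```python
import xmlrpc.client

# Read-only query against Cobbler's XML-RPC API.
# The server URL is a placeholder; fields may vary with the Cobbler version.
server = xmlrpc.client.ServerProxy("http://cobbler.example.cnaf.infn.it/cobbler_api")

for system in server.get_systems():        # registered machines
    print(system.get("name"), system.get("profile"))
```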

  22. Grid Middleware status • EMI-3 update status • Argus, BDII, CREAM CE, UI, WN, StoRM • Some UIs still at SL5 (will be upgraded soon) • EMI-1 phasing out (only FTS remains) • VOBOX updated to the WLCG release Andrea Chierici
