
GRIF Status grif.fr




  1. GRIF Status (http://grif.fr). Michel Jouvin, LAL / IN2P3, jouvin@lal.in2p3.fr

  2. Objectives
     • Build a Tier2 facility for simulation and analysis in Paris Region
       • 80% LHC 4 experiments, 20% EGEE and local
       • LHC: analysis (2/3) and MC simulation (1/3)
       • Be ready at LHC startup (2nd half of 2007)
     • Resource goals (end of 2007)
       • CPU: 1500 kSI2K (1 kSI2K ~ P4 Xeon 2.8 GHz)
       • Storage: 350 TB of disks (disk only, no MSS)
       • Network: 10 Gb/s backbone inside Tier2, 1 Gb/s external link
     GRIF Tier2 - HEPiX - SLAC 2005

  3. Members
     • Project started by DAPNIA (CEA), LAL (IN2P3, Orsay) and LPNHE (IN2P3, Paris), Fall 2004
       • DAPNIA and LAL involved in Grid effort since beginning of EDG
       • 3 EGEE contracts (2 for operation support)
       • No lab big enough to run a T2 by itself
     • LLR (IN2P3, Palaiseau) and IPNO (IN2P3, Orsay) joined the project in Sept. 05
       • IPNO: nuclear physics (Alice + Agatha)
       • LLR: CMS

  4. Organization
     • 1 EGEE/LCG site, distributed over all labs
       • Computing and storage resources in each lab
       • Computing rooms and financing
       • IPNO will concentrate on funding non-LHC resources
       • 1 Gb/s link for IPNO, LAL, LPNHE, “soon” for DAPNIA
     • Technical Committee: people from every lab
       • 5 FTE in 2005, 6-7 in 2006, more in 2007
       • Currently 15-20 people involved (several part time)
       • M. Jouvin (chairman), P. Micout, P.F. Honoré…
     • Scientific Committee (fund raising)
       • J.P. Meyer (DAPNIA/Atlas, chairman), 1 person / lab

  5. Finances
     • Total budget estimated at 1.6 M€ (2005-2007)
       • 30% from Region council
       • 30% from National Research Agency (ANR)
       • 40% from the labs (CEA, CNRS, Paris6 university)
       • No significant support from IN2P3 / LCG France (focused on T1)
       • ½ budget still uncertain… First answers soon…
     • Progressive investment: no HW replacement before 2009
       • 2005: 150 K€, 2006: 450 K€, 2007: 1 M€
       • If necessary, could use 2008 to spread the effort
       • 2009+: 300 K€/year expected from IN2P3/LCG France
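As a quick cross-check of the Finances figures, the sketch below converts the percentage shares into amounts and sums the yearly investments. The variable and function names are illustrative only; the numbers are the ones quoted on the slide, expressed in K€.

```python
# Figures from the Finances slide, in thousands of euros (K€).
TOTAL_BUDGET_KEUR = 1600  # 1.6 M€ over 2005-2007

# Funding shares as integer percentages, to keep the arithmetic exact.
funding_share_percent = {
    "Region council": 30,
    "ANR": 30,
    "Labs (CEA, CNRS, Paris6 university)": 40,
}

# Planned investment per year, in K€.
yearly_investment_keur = {2005: 150, 2006: 450, 2007: 1000}


def amounts_by_source(total_keur, share_percent):
    """Return the amount in K€ contributed by each funding source."""
    return {
        source: total_keur * percent // 100
        for source, percent in share_percent.items()
    }


funding_keur = amounts_by_source(TOTAL_BUDGET_KEUR, funding_share_percent)
planned_total_keur = sum(yearly_investment_keur.values())

for source, amount in funding_keur.items():
    print(f"{source}: {amount} K€")
print(f"Sum of yearly investments: {planned_total_keur} K€")
```

Under these figures the three yearly investments add up to the stated 1.6 M€ total, with 480 K€ each from the Region council and ANR and 640 K€ from the labs.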

  6. Current Status
     • EGEE/LCG GRIF site created
       • IN2P3-LAL decommissioned, resources moved to GRIF
     • 2 sites with resources, 2 sites ordering
       • DAPNIA: 20 WN CPUs, 12 TB, installation in progress
       • LAL: 26 WN CPUs, 8 TB (SRM/DPM), LCG services
         • 4.5 TB on order
       • LPNHE: 15 WN CPUs, 5 TB ordered soon
       • IPNO: 20 WN CPUs (dual core blades)
     • End of 2005: 80 WN CPUs, 25 TB
     • Separate CE/SE on each site
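The per-site numbers on the Current Status slide can be totalled to compare them with the end-of-2005 figures. This is only a sketch: the dictionary layout is hypothetical, and treating LAL's 4.5 TB order as separate from the listed per-site storage is an assumption, not something the slide states.

```python
# Per-site figures as listed on the Current Status slide.
# storage_tb is None where the slide gives no storage figure (IPNO).
sites = {
    "DAPNIA": {"wn_cpus": 20, "storage_tb": 12.0},
    "LAL": {"wn_cpus": 26, "storage_tb": 8.0},
    "LPNHE": {"wn_cpus": 15, "storage_tb": 5.0},
    "IPNO": {"wn_cpus": 20, "storage_tb": None},
}

# Assumption: the extra LAL order is counted separately from the site list.
LAL_ON_ORDER_TB = 4.5

total_wn_cpus = sum(site["wn_cpus"] for site in sites.values())
total_storage_tb = sum(site["storage_tb"] or 0.0 for site in sites.values())

print(f"WN CPUs: {total_wn_cpus}")
print(f"Storage without the LAL order: {total_storage_tb} TB")
print(f"Storage with the LAL order: {total_storage_tb + LAL_ON_ORDER_TB} TB")
```

With these inputs the CPU total is 81, close to the rounded "80" on the slide, and the storage total without the extra LAL order is 25 TB, which matches the end-of-2005 figure.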

  7. 2005 Main Activities…
     • Setup of resources on each site
     • Global configuration consistency: Quattor chosen
       • Flexible site customization inside a single database
     • Setup of a multi-site technical team
       • Tutorials for new site administrators
       • Sharing management load (e.g. middleware upgrade)
       • Write documentation for sharing information and expertise (Trac)

  8. … 2005 Main Activities
     • Evaluate DPM as a storage solution
       • Successful so far, easy to set up and manage
       • Quattor component written to manage DPM configuration
       • Plan to evaluate a multi-site configuration
         • Disk servers on several sites
       • Current lack of srmcp is a problem with CMS/Phedex
     • Participation in LCG SC3
       • Throughput phase: 35 MB/s sustained for 4 days
       • Plan to join service phase mid-November
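To give a sense of scale for the SC3 throughput phase, the sketch below converts a sustained rate into a total transferred volume. It assumes decimal units (1 TB = 1,000,000 MB) and a rate held constant over the full four days; neither detail is given on the slide, and the function name is illustrative.

```python
SECONDS_PER_DAY = 24 * 3600


def transferred_terabytes(rate_mb_per_s, days):
    """Volume moved at a constant rate, assuming 1 TB = 1,000,000 MB."""
    total_mb = rate_mb_per_s * days * SECONDS_PER_DAY
    return total_mb / 1_000_000


# Figures from the slide: 35 MB/s sustained for 4 days.
sc3_volume_tb = transferred_terabytes(35, 4)
print(f"Approximate volume transferred: {sc3_volume_tb:.1f} TB")
```

Under these assumptions the throughput phase corresponds to roughly 12 TB moved.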

  9. 2006: Mini Tier2
     • Main goal: set up 20+% of final configuration
       • 300 WN CPUs, 70 TB
       • Exact size will depend on fund-raising success…
     • Focus
       • Multi-site or mono-site CE/SE resources
       • Final choice for batch scheduler: evaluation of LSF and SGE
       • Final choice for SE architecture (DPM only, DPM + LUSTRE)
       • Setup of monitoring tools: Nagios?, Lemon?, others?
       • Integration with local operations on each site
     • Miscellaneous
       • Continue active participation in SC
       • Evaluation of 10 Gb/s link feasibility and effectiveness
       • Computer room requirements (electrical power, air cooling…)
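The 2006 targets follow from applying the 20% goal to the end-of-2007 resource goals on the Objectives slide. The sketch below shows the arithmetic; reading 1 kSI2K as roughly one WN CPU relies on the approximate equivalence quoted on that slide (1 kSI2K ~ P4 Xeon 2.8 GHz), so it is an approximation rather than an exact conversion, and the names are illustrative.

```python
# End-of-2007 resource goals from the Objectives slide.
FINAL_CPU_KSI2K = 1500
FINAL_STORAGE_TB = 350

# Fraction of the final configuration targeted for 2006, in percent.
MINI_TIER2_PERCENT = 20


def scaled(value, percent):
    """Scale a resource goal by an integer percentage."""
    return value * percent // 100


mini_cpu_ksi2k = scaled(FINAL_CPU_KSI2K, MINI_TIER2_PERCENT)
mini_storage_tb = scaled(FINAL_STORAGE_TB, MINI_TIER2_PERCENT)

print(f"2006 CPU target: {mini_cpu_ksi2k} kSI2K")
print(f"2006 storage target: {mini_storage_tb} TB")
```

This gives 300 kSI2K and 70 TB, in line with the "300 WN CPUs, 70 TB" quoted for the mini Tier2.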

  10. Storage Challenge
     • Efficient use and management of a large amount of storage seen as the main challenge
       • Access to data from 1000+ CPUs, no staging
     • Decided to start partnership with HP on LUSTRE in the Grid (LCG) context
       • Performance with a large number of clients
       • Geographically distributed LUSTRE configuration
       • Replication of critical data (metadata) among sites
       • SRM and/or xrootd integration
       • Funds requested from ANR, answer soon…
       • Uncertainty with HP troubles in France…

  11. Batch Scheduler
     • 1 unified T2 means 1 batch scheduler
       • Required for a coherent view/publishing of resources
     • Main requirements
       • Efficient use of distributed resources
       • Handle 1000+ running jobs, 10K jobs in queues
     • Torque may not be appropriate
       • Scalability and robustness, lack of dynamic reconfiguration
     • Looking at LSF
       • LAL has experience for its internal use (and contacts…)
       • Multicluster may offer the flexibility of a global unified resource while maintaining some job/resource affinity at each site
       • Evaluation to start soon: 1 cluster+CE per site + cross submission
     • Other candidates: SGE, Condor?
