23 January 2003. Sverre Jarp, Openlab Chief Technologist, IT Division – CERN. CERN opencluster Project: Enterasys Review Meeting
The openlab “advantage” openlab will be able to build on: • CERN-IT’s technical talent • CERN’s existing computing environment • The size and complexity of the LHC computing needs • CERN’s strong role in the development of GRID “middleware” • CERN’s ability to embrace emerging technologies
IT Division • 250 people, ~200 of them engineers • New Leader: W. von Rüden • 10 groups: • Advanced Projects’ Group (part of DI) • (Farm) Architecture and Data Challenges (ADC) • Communications Services (CS) • Grid Deployment (GD) • Databases (DB) • Fabric Infrastructure and Operations (FIO) • Internet (and Windows) Services (IS) • User Services (US) • Product Support (PS) • (Detector) Controls (CO)
Expected LHC needs [chart: expected LHC computing needs vs. Moore’s law, based on 2000] It is the network that will tie everything together!
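The chart itself is not reproduced in this transcript. As a rough illustration of the Moore’s-law baseline it refers to, the sketch below extrapolates capacity from a 2000 starting point; the ~18-month doubling period and the sample target years are assumptions for illustration only, not figures taken from the original chart.

```python
# Rough Moore's-law extrapolation from a 2000 baseline (illustrative only).
# The ~18-month doubling period and the target years below are assumptions,
# not values taken from the original slide's chart.

DOUBLING_PERIOD_YEARS = 1.5  # assumed doubling period of ~18 months


def moores_law_factor(start_year, target_year):
    """Capacity growth factor between two years under the assumed doubling period."""
    return 2 ** ((target_year - start_year) / DOUBLING_PERIOD_YEARS)


if __name__ == "__main__":
    for year in (2003, 2005, 2007):
        print(f"2000 -> {year}: roughly x{moores_law_factor(2000, year):.1f}")
```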
Back to: opencluster • Industrial Collaboration • Enterasys, HP, and Intel are our partners today • Additional partner(s) may be joining soon • Disk/storage subsystem • Technology aimed at the LHC era • Network infrastructure at 10 Gigabits • Rack-mounted HP servers • Itanium processors • Cluster evolution: • 2002: Cluster of 32 systems (64 1 GHz Itanium-2 processors) • 2003: 64 systems (“Madison” processors) • 2004: Possibly 128 systems (“Montecito” processors)
opencluster - phase 1 • Establish the openCluster • 32 nodes + development nodes • Rack-mounted DP Itanium-2 systems • RedHat 2.1 Advanced Workstation • OpenAFS, LSF • GNU, Intel Compilers (+ ORC?) • Database software (MySQL, Oracle?) • CERN middleware: Castor data mgmt • CERN Applications • Porting, Benchmarking, Performance improvements • CLHEP, GEANT4, ROOT, Sixtrack, (CERNLIB?) • Cluster benchmarks • 10 Gigabit network Estimated time scale: 6 months
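As a minimal sketch of the kind of benchmarking harness this phase implies, the snippet below times the same application binary built with different compilers (for example a GNU build and an Intel build of one of the listed applications). The binary names and repetition count are placeholders, not part of the actual phase-1 plan.

```python
# Sketch: compare wall-clock times of the same benchmark built with different
# compilers (e.g. a GNU build vs. an Intel build of one of the applications).
# The binary paths and repetition count are placeholders.
import subprocess
import time


def wall_clock(cmd, repetitions=3):
    """Best-of-N wall-clock time in seconds for one benchmark command."""
    best = float("inf")
    for _ in range(repetitions):
        start = time.perf_counter()
        subprocess.run(cmd, check=True, capture_output=True)
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    builds = {
        "gcc build": ["./benchmark-gcc"],  # placeholder binary names
        "icc build": ["./benchmark-icc"],
    }
    for label, cmd in builds.items():
        print(f"{label}: {wall_clock(cmd):.2f} s")
```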
The compute nodes • HP rx2600 • Rack-mounted (2U) systems • Two Itanium-2 processors • 900 or 1000 MHz • Field upgradeable to next generation • 4 GB memory (max 12 GB) • 3 hot-pluggable SCSI disks (36 or 73 GB) • On-board 100 and 1000 Mbit Ethernet • 4 full-size 133 MHz/64-bit PCI-X slots • Built-in management processor • Accessible via serial port or Ethernet interface
Cluster Interconnect • Urgent need for 10 Gb switches • Attach all 32 “phase-1” systems • How many ports can be at 10 Gb? • Intel is providing 32 10-Gb cards • Demonstrate peak speed: • CPU server to CPU server • CPU server to Disk server • Minimum desired throughput per server: • 200 – 400 MB/s • Rationale: • Feed 2 (then 4, maybe even 8) processors per box • Stream 100++ MB/s per processor • Make data available as if they were in local memory!
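A minimal sketch of the arithmetic behind the 200 – 400 MB/s target, assuming each processor streams on the order of 100 MB/s; the function and constant names are illustrative, not part of the project.

```python
# Illustrative arithmetic behind the per-server throughput target: each
# processor is expected to stream on the order of 100 MB/s, so one box must
# sustain (processors per box) * 100 MB/s in aggregate.

PER_PROCESSOR_STREAM_MB_S = 100  # from the slide: "100++ MB/s per processor"


def required_server_throughput(processors_per_box):
    """Aggregate streaming rate (MB/s) one server must sustain."""
    return processors_per_box * PER_PROCESSOR_STREAM_MB_S


if __name__ == "__main__":
    for cpus in (2, 4, 8):  # 2 today, then 4, maybe even 8 per box
        print(f"{cpus} processors/box -> {required_server_throughput(cpus)} MB/s")
```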
The “perfect” collaboration • Every openlab partner will be solicited: • Enterasys • 10 Gb infrastructure • Intel • Processors • Network cards • Hewlett-Packard • Server hardware • Memory bus and banks, PCI-X bus • CERN • Expertise • Network, Linux (kernel, drivers), Clustering (CPU servers, Disk servers) • Tuning effort on all fronts • Project coordination All must work in very close collaboration!
Interconnect outlook • Phase 2 (also this year – second half) • We will double the size of the cluster • 64 systems (HP) • Combination of 1 GHz and 1.5 GHz processors (Intel) • All with 10 Gb cards (Intel) • Need to understand how to interconnect • Same questions: • What will be our 10 Gb infrastructure by then? • How many ports can run at 10 Gb? • What kind of aggregate throughput can we expect?
opencluster - phase 2 • Focus on EDG: European Data Grid • Integrate OpenCluster alongside EDG testbeds • Porting, Verification • Relevant software packages (hundreds of RPMs) • Understand chain of prerequisites • Interoperability with WP6 • Integration into existing authentication scheme • GRID benchmarks • To be defined later “Golden” opportunity to interconnect a high-speed LAN with a high-speed WAN! Estimated time scale: 9 months (may be subject to change!)
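A minimal sketch, assuming the packages arrive as RPM files, of how the chain of prerequisites could be surveyed before installation. It relies only on the standard rpm query options -qpR and -q --whatprovides; the helper names and command-line interface are illustrative, not part of the EDG tooling.

```python
# Sketch: report the unmet requirements of a set of RPM files before they are
# installed on a node.  Only standard rpm query options are used; the helper
# names are illustrative.
import subprocess
import sys


def requires(rpm_file):
    """Capabilities an RPM file declares as prerequisites (rpm -qpR)."""
    out = subprocess.run(["rpm", "-qpR", rpm_file],
                         capture_output=True, text=True, check=True).stdout
    return [line.split()[0] for line in out.splitlines() if line.strip()]


def is_provided(capability):
    """True if some already-installed package provides the capability."""
    result = subprocess.run(["rpm", "-q", "--whatprovides", capability],
                            capture_output=True, text=True)
    return result.returncode == 0


if __name__ == "__main__":
    for rpm_file in sys.argv[1:]:
        missing = [cap for cap in requires(rpm_file)
                   if not cap.startswith("rpmlib(") and not is_provided(cap)]
        print(f"{rpm_file}: {len(missing)} unmet requirement(s)")
        for cap in missing:
            print(f"  {cap}")
```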
opencluster time line [timeline graphic, Jan. 03 to Jan. 06] • Order/Install 32 nodes • Systems experts in place – Start phase 1 • Complete phase 1 • Start phase 2 • Order/Install G-2 upgrades and 32 more nodes • Complete phase 2 • Start phase 3 • openCluster integration • Order/Install G-3 upgrades; Add nodes • EDG inter-operability • LCG interoperability
openlab technical coordination • Technical meetings • One-on-one (as today) • Desired frequency? • A need for plenary technical meetings • Proposal: During week of February 17th (18th?) • Then: Every 3 months (possibly also as phone conference) • Topical workshops • 2 – 3 per year • Coming up: • Data and Storage Management (March 17th/18th) • Annual events • May/June?
BACKUP