IC HEP Computing Review • Geoff Fayers • G.J.Fayers@ic.ac.uk
Current Experiment Programme • Aleph • BaBar • CMS • Dark Matter (Boulby) • Diamond-based applied research • D0 • LHCb • Silicon-based detector development (esp. CMS) • Zeus
Current Computing Hardware • DEC4610 AXP • 5 DEC AXP W/stns • 4 more? • 2 SUN Enterprise Svrs • 2 SUN W/stns • 1 NT Server + BDC • ~37 PCs • ~18 X-terms • Ageing Macs + some new G3 and later • 2 RIOS PPCs (lynx) • ~150 assigned IP addrs • ~1.25TB storage excl. some local PC disks • Printers mostly HP • 1 VAXStn 4000/60
Current Software • DUX 4.0D on all AXP • Solaris 2.6 on Enterprises (BaBar) • Solaris 2.7 otherwise • SAMBA • NT4 + Office97/2000, eXceed at desktop • NT4 Server • Linux 5.2 (D0 FNAL variety) • AFS on all UNIX • Europractice Selection incl. Synopsys, HSPICE, Xilinx, et al. • AutoCAD2000 ++ • Ranger PCB design • MS, Adobe packages for NT • (Optivity) • Dr. Solomon's AVtk
Some Features • Homogeneous systems, NIS, NIS+, DNS • Automount distributed file systems • Quotas on home dirs. • User-based scratch areas, 7-day file lifetime • User-based stage areas • Per-experiment data areas with rwxrwxr-x permissions • Recent news headlines at login • Optional news via e-mail • Process priority monitor
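The 7-day lifetime on scratch areas could be enforced by a periodic sweep along the following lines. This is a minimal sketch, not the site's actual tool: the function name `expired_files`, the use of modification time as the age criterion, and the report-only behaviour are all assumptions made for illustration.

```python
import os
import time

SEVEN_DAYS = 7 * 24 * 60 * 60  # scratch lifetime from the slide, in seconds


def expired_files(scratch_root, max_age=SEVEN_DAYS, now=None):
    """Return sorted paths of files under scratch_root whose last
    modification is more than max_age seconds before `now`.

    Assumption: "lifetime" is measured from last modification time.
    Symlinks are judged by their own timestamp (lstat), not the target's.
    """
    now = time.time() if now is None else now
    expired = []
    for dirpath, _dirnames, filenames in os.walk(scratch_root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                mtime = os.lstat(path).st_mtime
            except OSError:
                # File vanished between listing and stat; skip it.
                continue
            if now - mtime > max_age:
                expired.append(path)
    return sorted(expired)
```

The function only reports candidates; a real sweep would decide separately whether to delete them, warn the owner first, or exempt particular directories.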
Concerns - 1 • Non-convergence of Linux versions for LHC, US? • Potential fragmentation of resources • Non-scalability of current storage deployment • Management effort for NT: SMS, other? • No NT YP map capability (except Solaris6/NT4) • Vagaries of A/V software • Possible management of all http traffic by ICCS • Costs of MS licences and network management tools • Will 64-bit Linux (e.g. IA64) supplant PC arch.? • Possible withdrawal of ULVC Europractice support
Concerns - 2 • Poor US connectivity, esp. to FNAL for D0 • Network management tools? • Short investment cycles for PCs • Metaframe an expensive thin-client solution • Uncertain future PCI replacement • Security, security, security...
Networking • Before Summer 1999: • 10Base5, 10Base2 • 16-port 10BaseT Switch, half-duplex only • NFS traffic via 2nd NICs through switch • 5/10/20/80m non-standard custom braided fly leads only • Limited division of 10Base2 collision domain to ease congestion • Tolerable response (mostly) • WAN via DECnis600 router: 10Base5 to AGS4+ on Campus FDDI (IP routing only)
Networking Upgrade • Upgrade Summer 1999 • Decisions: • Expected lifetime • Likely technologies and services required • Density of service ports • Future campus connectivity • Supporting hardware
Expected Lifetime • Existing 10Base5/2 is 10 years old: new cabling system to last at least 10 years • DECnis600 is ~6 years old: new network hardware to last at least 5 years
Future Technologies & Services - 1 • Summer 1999: • 802.1d bridging standard updated and fully ratified as 802.1D-1998 (all 802.* technologies embraced) • GBE Adaptors at commodity prices, but... • PCs too slow • PCI-X interim solution • GBE as 1000BaseSX and 1000BaseLX only • GBE over Class D/E UTP expected Mar. 2000 • Proprietary Class E solutions emerging for 4 x 250MHz • Class F as 4 x 300MHz on horizon (but irrelevant for HEP?)
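One plausible reading of "PCs too slow" and "PCI-X interim solution" is that the host bus, rather than the adaptor, limits Gigabit Ethernet throughput. The slide does not say this explicitly, so treat it as an assumption. The arithmetic below compares theoretical peak bus bandwidth (width times clock) with the 2 Gb/s a full-duplex GBE link can carry; the function and constant names are illustrative only.

```python
def bus_peak_gbps(width_bits, clock_mhz):
    """Theoretical peak bandwidth of a parallel bus in Gb/s.

    width_bits * clock_mhz gives Mb/s; divide by 1000 for Gb/s.
    Real, sustained throughput is lower, and the bus is shared
    with every other device on it.
    """
    return width_bits * clock_mhz / 1000.0


# Nominal bus parameters: (data width in bits, clock in MHz).
BUSES = {
    "PCI 32-bit/33 MHz": (32, 33.33),
    "PCI 64-bit/66 MHz": (64, 66.66),
    "PCI-X 64-bit/133 MHz": (64, 133.33),
}

# A full-duplex Gigabit Ethernet link moves up to 1 Gb/s each way.
GBE_FULL_DUPLEX_GBPS = 2.0

for name, (width, clock) in BUSES.items():
    peak = bus_peak_gbps(width, clock)
    verdict = "above" if peak > GBE_FULL_DUPLEX_GBPS else "below"
    print(f"{name}: {peak:.2f} Gb/s peak, {verdict} full-duplex GBE")
```

On these figures a standard 32-bit/33 MHz PCI slot peaks at roughly 1.07 Gb/s, which cannot feed a full-duplex gigabit link even before other bus traffic is counted, while the wider and faster variants have headroom.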
Future Technologies & Services - 2 • IEEE HSSG already convened (appendix) • LANe still complicated • DWDM very likely in SJ4 - ATM will wane
The Other Issues • Current aggregated b/w > 3.7 Gb/sec, so Catalyst 5000 and some competitors useless (received wisdom: switch capacity of 2 x aggregate) • Increasing use of laptops • Outlet density to ISO standards - high • No ATM adaptors for CoreBuilder 3500's, so ICCS ATM spine strategy ditched • Campus CellPlex7000's dumped in favour of 9300's (GBE) • Stuck with CoreBuilder 3500 100BaseFX uplinks (poor performance, 802.1Q implementation lagging) • Possible need for ISDN • DECnis600 replacement routing capability
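The "2 x aggregate" rule of thumb quoted above can be written out as a small check. This is a sketch under stated assumptions: the function names are hypothetical, both figures must be in the same unit, and the only numbers used are the ones on the slides (an aggregate of more than 3.7 and the 17.5 quoted for the Summit48 on the Implementation slide).

```python
def required_fabric(aggregate, headroom=2.0):
    """Minimum switch fabric capacity under the rule of thumb
    'capacity should be headroom x aggregate bandwidth'."""
    return aggregate * headroom


def is_adequate(fabric, aggregate, headroom=2.0):
    """True if a switch with the given fabric capacity satisfies the rule.
    fabric and aggregate must be expressed in the same unit."""
    return fabric >= required_fabric(aggregate, headroom)


# Figures from the slides; 3.7 is a lower bound on the aggregate,
# so the requirement computed here is itself a lower bound.
AGGREGATE = 3.7
SUMMIT48_FABRIC = 17.5

print(required_fabric(AGGREGATE))
print(is_adequate(SUMMIT48_FABRIC, AGGREGATE))
```

Any candidate switch whose capacity falls below twice the aggregate fails the check, which is the sense in which the slide calls some products "useless" for this network.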
Structured Cabling Decision • Full Cat5e compatibility and compliance mandatory • Mixture of Class D and Class E proprietary UTP • MMF and SMF unterminated in corridor ceiling voids • Insist on Class E performance testing to 250MHz • Insist on full compliance to draft Class E standard • Cat6 Guarantee underpinned by Manufacturer Indemnity • Only tied Certified Installers considered • Leverage early take-up and prestige potential
Implementation • 8 'Cat6' passive component manufacturers evaluated • Only one testing fully in 200 - 300 MHz envelope • Obtained flood-wired Cat6 at 10% more than the cost of Cat5 • 17.5 Gb/sec non-blocking Extreme Summit48, Layer 2 and Layer 3 switching, 4 levels of QoS (upgradable) • 2 x 1000BaseSX ports, optional 1000BaseLX transceivers • HP ProCurve 2424M 10/100 switches • HP ProCurve 24 10/100 hubs • 100BaseFX uplinks to Campus spine
Problems • Completed Dec 1999 • Test results presented on CD OK but... • Manufacturing fault on some ports - resolution in hand • HP 2424M fault - HP to replace (lifetime on-site warranty)
Future • RAID-based storage, possibly home-grown solution • Non-proprietary SAN based on dual cpu 64-bit server, GBE to Extreme • Linux Farm(s) with 100BaseT cpus • DHCP • WAN connectivity resilience via FOIL? • Possible Campus Spine bypass? • Managed bandwidth pilot (BaBar) • H.323/320 VC