Cambridge HEP Group - site report
April 2005
John Hill
HEP SYSMAN meeting

Group hardware (non-GRID)
• Three file servers:
  • 1 ALPHAserver (DIGITAL UNIX 4.0F) – soon to be decommissioned (no other D.U. left in group).
  • 2 DELL (Intel) servers (one is running Linux Redhat 7.3, the other SLC 304).
• total of ~10TB disk space served
  • 3TB IDE-SCSI RAID
  • 5.8TB SCSI-SCSI RAID
  • Rest JBODs
• DLT8000 autoloader
• LTO-3 autoloader on order
• all UPS-protected

Group hardware (non-GRID)
• RAID experiences:
  • IDE-SCSI: very painful! Individual disk “failures” occur daily when the array is being used heavily. We found out, about 2 years too late, that this seems to be a feature of IDE disks. Very few of these failures are terminal – the disk works fine when reintroduced into the array. Only a handful of disks have been replaced in 2.5 years.
  • SCSI-SCSI: so far so good. 2 disk failures (out of 48 in use) in the last 4-5 months, neither due to heavy usage. At present I’m assuming the bathtub model.
  • Of course, the downside is that the SCSI-SCSI disks are much more expensive, and it is not a route we can continue down.

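To put the SCSI-SCSI figures in perspective, the quoted 2 failures out of 48 disks in 4-5 months can be turned into a rough annualised rate. This is only a sketch: the function name is made up for illustration, and it assumes a constant failure rate over the window, which the bathtub model says is not true for young disks (early failures would inflate the number).

```python
def annualised_failure_rate(failures, disks, months):
    """Failures per disk per year, assuming (hypothetically) a constant
    failure rate over the observation window."""
    return failures / disks * (12.0 / months)


# Figures from the slide: 2 failures, 48 disks, observed for 4-5 months.
low = annualised_failure_rate(2, 48, 5)   # longer window -> lower rate
high = annualised_failure_rate(2, 48, 4)  # shorter window -> higher rate

print(f"{low:.1%} to {high:.1%} per disk per year")
```

With only two failures in the sample, the statistical uncertainty on this range is large, so it is an order-of-magnitude indication at best.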
Group hardware (non-GRID)
• 69 desktop PCs (give-or-take...)
  • 41 Linux RedHat 7.3 (soon SLC3)
  • 27 Windows XP/2000
  • 1 Windows 95 (for our probestation)
  • wide range of performance (400MHz Pentium II up to 3.2 GHz Pentium 4)
• 1 MAC desktop to complicate matters!
• 33 PC and 6 MAC laptops registered (but certainly not all are real machines...). Purchased with a confusing mix of group, college and private funds.
• Also a few PDAs starting to appear.

Group hardware (non-GRID)
• Printers
  • 2 Lexmark B&W laser printers (20ppm+) – 1 supports A3 for (e.g.) CAD drawings
  • HP Color LaserJet 4550 (for paper)
  • 2 HP Business DeskJets (for transparencies)
  • Epson A2 colour inkjet (specialised printing: CAD, posters,...)

Hardware for GRID work
• Long-standing: 20-CPU analysis farm (MIMCluster from Workstations UK)
  • 1.13 GHz Pentium III CPUs
  • Linux SL3
• Recently added further 10 systems:
  • 2.8 GHz Xeon dual-CPU DELL PowerEdge 1850 servers
  • Linux SLC3
• Also 4 systems provided by GridPP:
  • 2.8 GHz Xeon dual-CPU Streamline Computing servers running SL3
  • Intended as front-end boxes for LCG2 deployment.
• 3TB IDE-SCSI RAID specifically for GRID work (i.e. in addition to that mentioned earlier).
• GRID computers are being used both for HEP grids and for CamGrid – this is creating a few configuration headaches!

Network
• Wired:
  • departmental network based on switches from Extreme Networks
  • Gigabit Ethernet fibre-optic backbone between switches
  • a minimum of Fast Ethernet to all desktops (with Gigabit available, though not yet deployed)
  • Gigabit connection to campus backbone
  • Gigabit Ethernet on campus backbone (likely to be upgraded to 10Gbps soon)
  • Campus network connects to EastNET via a ~8Gbps link and hence to SuperJANET.
  • departmental network currently has a mainly “flat” topology - but VLANs are being slowly introduced.

Network
• Wireless:
  • Group bought a cheap and cheerful interim solution using a Buffalo wireless hub
  • Use limited to registered laptops and PDAs
  • 19 PC and 5 MAC laptops, 3 PDAs registered in hub
  • Intended initially for the convenience of users meeting in the group library (where the hub is sited), but we find that it is also used heavily from offices.
  • Range is limited (~10-15m, which helps with security!) due to the metal framework of the building, so not all group members can see the hub.
  • Department is considering a rollout of wireless, but it is not a high priority (mainly because many influential groups will not permit it in their areas, as it interferes with experiments).

[Figure: network topology diagram. Labels: group hubs, group switch, printer, server, computer; central departmental switch (Physics Department); campus router; city PoP router; SuperJANET. Link legend: 1Gbps optical fibre; 100Mbps UTP. SuperJANET connection via EastNET (~8Gbps total).]

Video Conferencing
• “mid-range” system (Zydacron Z360 (H.323) and ZC206 (ISDN) cards)
• Sony EVI D31 Camera (pan, tilt, zoom)
• hosted by (aging) 500MHz Pentium III PC
• use (existing) data projector to display video on projection screen
• OK for up to ~12 people (though best for 6 or fewer!)
• in use for nearly 4 years now, so beginning to consider a replacement
• Possible that the department may provide a VC system within the next year.

Software
• Nothing special…
• AutoCAD, CADENCE etc. for mechanical and electronic CAD work – all CAD work now PC-based (SUN decommissioned last Christmas).
• Moving to SLC3 on desktops as soon as feasible.
• Departmental license for Mathematica – allows home use at no extra cost.
• XFree86 on Windows XP for X11 provision.
• Group continues to run its own mail server (using Exim) – it would probably be better to use campus facilities, but users are not convinced!
• GRID nodes use Condor for batch – common pool for LCG and CamGrid. Conflicting networking requirements, as well as Condor’s immaturity in some areas, are causing some difficulty with this arrangement at present.

Future plans
• As usual, our plans are very fluid – the world changes quickly, and physicists rarely know what they really want ahead of time! Hence try to avoid large purchases where possible.
• Cycle of desktop replacement is slowing – even 3-4 year-old PCs are adequate and the performance of new PCs is relatively static at present.
• Extra disk space will be needed: ~6-8TB/year is current best guess.
• Continue to enhance our farm provision – partly by taking advantage of CamGrid for HEP use.

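As a rough illustration of what the ~6-8TB/year guess implies, the sketch below projects total served disk space forward from the ~10TB quoted on the non-GRID hardware slide. The function name is hypothetical, and linear growth from that starting point is an assumption made here for illustration, not something stated in the report.

```python
def projected_capacity_tb(current_tb, growth_tb_per_year, years):
    """Total disk space after `years`, assuming (hypothetically) that the
    extra space needed grows linearly at a fixed rate."""
    return current_tb + growth_tb_per_year * years


CURRENT_TB = 10  # ~10TB served today (non-GRID file servers)

for years in (1, 2, 3):
    low = projected_capacity_tb(CURRENT_TB, 6, years)   # lower guess: 6TB/year
    high = projected_capacity_tb(CURRENT_TB, 8, years)  # upper guess: 8TB/year
    print(f"after {years} year(s): {low}-{high} TB")
```

On these assumptions the served disk space would roughly triple within three years, which is consistent with the concern about managing the extra kit.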
Concerns
• Main concern is how we manage all the extra kit in the medium term – especially as the system management team will become heavily embroiled in ATLAS/LHCb commissioning very soon.
• Security is obviously a permanent concern. Departmental procedures are improving (from a very low base...). Most recent incidents have been due to human error rather than software loopholes.
• Also a problem coping with experimental bloatware – which is not confined to LHC experiments.