This computing cluster at Sheffield University consists of interactive and batch clusters, equipped with self-built Linux boxes, AMD Athlon XP CPUs, and ample RAM. The cluster is used for various scientific research projects, making use of software such as Geant4, ROOT, and Atlas 10.0.1.
HEP Computing Status
Sheffield University
Matt Robinson, Paul Hodgson, Andrew Beresford
Interactive Cluster
• 30 self-built Linux boxes
• AMD Athlon XP CPUs, 256/512 MB RAM
• OS: Scientific Linux 3.0.3
• 100 Mbit network
• NIS for authentication; /home etc. mounted over NFS
• System install using kickstart + post-install scripts
• Separate backup machine
• 15 laptops, mostly dual boot
• Some Macs and one Windows box
• 3 disk servers mounted as /data1, /data2, etc. (a few TB)
Batch Cluster
• 100-CPU farm, Athlon XP 2400/2800
• OS: Scientific Linux 3.0.3
• NFS-mounted /home and /data
• OpenPBS batch system for job submission
• Gigabit backbone with 100 Mbit to worker nodes
• Disk server provides 1.3 TB of RAID 5 storage as /data
• Entire cluster assembled in house from OEM components for less than 50k
• Hard part was finding an air-conditioned room with sufficient power
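As an illustration of job submission to a PBS-style batch system, here is a minimal Python sketch that builds a job script. The function name `make_pbs_script`, the job name, and the command `./run_simulation.sh` are hypothetical and not from the slides; the `#PBS` directives shown are standard ones, but the site's actual queue settings are not given in the source.

```python
def make_pbs_script(job_name, command, walltime="01:00:00", nodes=1):
    """Return the text of a simple PBS job script (illustrative only)."""
    lines = [
        "#!/bin/sh",
        f"#PBS -N {job_name}",           # job name shown by qstat
        f"#PBS -l nodes={nodes}",        # number of nodes requested
        f"#PBS -l walltime={walltime}",  # wall-clock limit, HH:MM:SS
        "#PBS -j oe",                    # merge stderr into stdout
        "cd $PBS_O_WORKDIR",             # run from the submission directory
        command,
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical job; the printed text would be saved and passed to qsub.
    print(make_pbs_script("g4sim", "./run_simulation.sh"))
```

The resulting text would normally be written to a file and submitted with `qsub`.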
Software
• PAW, CERNLIB, etc.
• Geant4
• ROOT
• ATLAS 10.0.1
• FLUKA
• ANSYS, LS-DYNA
Comments - Issues
• Have tightened up security in the last year
• Strict firewall policy, limited machine exemptions
• Blocking scripts prevent ssh access after 3 authentication failures within 1 hour
• Cheap disks allow construction of large disk arrays
• Very happy with SL3 for desktop machines
• Use FC3 for laptops (2.6 kernel)
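The blocking rule above (3 authentication failures within 1 hour) can be sketched as a sliding-window counter. This is not the site's actual script, which the slides do not show; the class name `FailureTracker` and its interface are hypothetical, and timestamps are assumed to arrive in non-decreasing order (for example, parsed from the ssh log).

```python
from collections import defaultdict, deque

MAX_FAILURES = 3       # from the slide: 3 authentication failures
WINDOW_SECONDS = 3600  # from the slide: within 1 hour


class FailureTracker:
    """Tracks failed ssh logins per source address in a sliding window."""

    def __init__(self, max_failures=MAX_FAILURES, window=WINDOW_SECONDS):
        self.max_failures = max_failures
        self.window = window
        self._failures = defaultdict(deque)  # address -> failure timestamps

    def record_failure(self, address, timestamp):
        """Record one failure; return True if the address should be blocked."""
        times = self._failures[address]
        times.append(timestamp)
        # Discard failures that fall outside the window ending at `timestamp`.
        while times and timestamp - times[0] >= self.window:
            times.popleft()
        return len(times) >= self.max_failures


if __name__ == "__main__":
    tracker = FailureTracker()
    for t in (0, 100, 200):  # three failures within a few minutes
        blocked = tracker.record_failure("192.0.2.10", t)
    print("block" if blocked else "allow")
```

A real deployment would act on the `True` result by adding a firewall or hosts.deny entry; that step is left out here because the slides do not say which mechanism was used.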
Division of Hardware
• 162 x AMD Opteron 250 (2.4 GHz)
• 4 GB RAM/box (2 GB/CPU)
• 72 GB U320 10K RPM local SCSI disk
• Currently running 32-bit SL303 for maximum compatibility with the grid
• ~2.5 TB storage for experiments
• Middleware: 2.4.0
• Probably the most purple cluster in the grid
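The hardware figures imply a few totals that the slide does not state. The sketch below works them out, assuming all 162 CPUs sit in identical boxes and that each box has a single 72 GB local disk; both assumptions are inferred from the per-box figures rather than stated in the source.

```python
CPUS = 162            # from the slide
RAM_PER_BOX_GB = 4    # from the slide
RAM_PER_CPU_GB = 2    # from the slide
DISK_PER_BOX_GB = 72  # assumption: one 72 GB local disk per box

# 4 GB per box at 2 GB per CPU means two CPUs per box.
cpus_per_box = RAM_PER_BOX_GB // RAM_PER_CPU_GB
boxes = CPUS // cpus_per_box
total_ram_gb = boxes * RAM_PER_BOX_GB
total_local_disk_gb = boxes * DISK_PER_BOX_GB

if __name__ == "__main__":
    print(f"{boxes} boxes, {total_ram_gb} GB RAM, "
          f"{total_local_disk_gb} GB local disk in total")
```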
Usage so far
• We can take quite a bit more.
Monitoring
• Ganglia with a modified web frontend to present queue information
Installation
• Service nodes connected to VPN and Internet
• PXE installation via VPN allows complete control of dhcpd and named
• Red Hat kickstart + post-install script
• ssh servers not exposed
• R-GMA always the hardest part
• Stumbled across routing rules
• WN install takes about 30 minutes; can do up to 40 simultaneously
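The last bullet gives enough to estimate a full reinstall. A rough sketch, assuming installs run in back-to-back batches of at most 40 nodes and that each batch takes the quoted 30 minutes regardless of size; the function name `install_time_minutes` is hypothetical, and the node count in the usage line is only an example.

```python
import math


def install_time_minutes(nodes, minutes_per_batch=30, batch_size=40):
    """Estimate wall-clock time to install `nodes` worker nodes."""
    batches = math.ceil(nodes / batch_size)  # number of sequential batches
    return batches * minutes_per_batch


if __name__ == "__main__":
    # Example node count only; the real figure is not stated on this slide.
    print(install_time_minutes(81), "minutes")
```

Network or install-server contention could make larger batches slower than 30 minutes, so this is a lower bound under the stated assumptions.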
Matt Robinson: Future plans
• Keep up with middleware updates
• Increase available storage as required in ~3-4 TB steps
• Fix things that break
• Try not to mess anything up by screwing around
• Look toward operating with a 64-bit OS