Liverpool HEP - Site Report May 2012 John Bland, Robert Fay
Staff Status • New Grid administrator: Mark Norman, who joins Steve Jones • Plus HEP sysadmins: John Bland, Robert Fay
Current Hardware – Users • Desktops • ~100 Desktops: Scientific Linux 5.5, Windows 7+XP, legacy systems • Minimum spec of 2.66GHz Q8400, 4GB RAM + TFT monitor • Recently upgraded, clean installs, single platform • Opportunistic batch usage (~60 cores) • Laptops • ~60 Laptops: mixed architecture • Windows+VM, MacOS+VM, netbooks • Printers • Samsung and Brother desktop printers • Various heavy-duty HP group printers
Current Hardware – ‘Tier 3’ Batch • ‘Tier3’ Batch Farm • Software repository (0.5TB), storage (3TB scratch, 13TB bulk) • ‘medium64’, ‘short64’ queues consist of 9 64bit SL5 nodes (2xL5420, 2GB/core) • Plus ‘desk64’ for aforementioned desktop usage and ‘all64’ • 2 of the 9 SL5 nodes can also be used interactively • 5 older interactive nodes (dual 32bit Xeon 2.4GHz, 2GB/core) • Using Torque/PBS/Maui+Fairshares • Used for general, short analysis jobs • Grid jobs are occasionally run opportunistically on this cluster (not much recently due to steady local usage)
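As a hedged illustration of how local analysis jobs reach these queues, the sketch below submits a job to the 'medium64' queue with Torque's qsub; the job script name and walltime request are hypothetical examples, not site policy.

    import subprocess

    # Hypothetical wrapper script around the user's analysis binary.
    job_script = "analysis_job.sh"

    # Submit to the 'medium64' queue named on this slide; qsub prints
    # the job ID on success. The walltime request is illustrative.
    job_id = subprocess.check_output(
        ["qsub", "-q", "medium64", "-l", "walltime=04:00:00", job_script]
    ).decode().strip()
    print("Submitted:", job_id)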
Current Hardware – Servers • ~40 core servers (HEP+Tier2) • Some rack Gigabit switches • 1 high-density Force10 switch (400 ports) • Moving to 10Gb • Console access via KVMoIP (when it works) + IPMI • KVM installed • LCG Servers • Most services on virtual machines (lcg-CE, CREAM ×2, sBDII, APEL, Torque, ARGUS, test cluster) • Still more to come (UI) • KVM • Redundant KVM servers for load sharing and testing
Current Hardware – Virtual • Heavy-duty KVM servers for grid and HEP services • Diagram: HVM1 (24 CPU) and HGVM3 (24 CPU) run production; HGVM1 (8 CPU) and HGVM2 (8 CPU) run testing; all share NFS/iSCSI storage
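A minimal sketch of checking which service VMs are active on which KVM host, using the libvirt Python bindings. The host names follow the diagram above, and the qemu+ssh connection URIs are an assumption about how the hosts are reached, not necessarily the site's setup.

    import libvirt

    # Host names taken from the diagram above; remote qemu+ssh URIs
    # are an assumed access method.
    for host in ("hgvm1", "hgvm2", "hvm1", "hgvm3"):
        conn = libvirt.open("qemu+ssh://root@%s/system" % host)
        running = [dom.name() for dom in conn.listAllDomains() if dom.isActive()]
        print("%s: %s" % (host, ", ".join(running) or "no active VMs"))
        conn.close()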
Current Hardware – Nodes • HAMMER • Mixed nodes: • 7 x dual L5420 • 64 x dual E5620 • 16 x dual X5650 • Lots of room to work in • IPMI+KVM makes life so much easier • Also hot-swap drives and motherboards • IPMI monitoring not entirely trustworthy • Sensor reading spikes • Occasionally needs rebooting
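Given the spurious sensor spikes just mentioned, one hedged workaround is to sample a sensor several times and alarm on the median rather than a single reading, so a lone bogus value is discarded. The sensor name and threshold below are hypothetical.

    import subprocess
    import time
    import statistics

    SENSOR = "Ambient Temp"   # hypothetical sensor name; varies by BMC
    ALARM_C = 35.0            # hypothetical alarm threshold

    def read_temp():
        # 'ipmitool sensor reading <name>' prints e.g. "Ambient Temp | 23"
        out = subprocess.check_output(
            ["ipmitool", "sensor", "reading", SENSOR]).decode()
        return float(out.split("|")[1])

    samples = []
    for _ in range(5):
        samples.append(read_temp())
        time.sleep(1)

    # The median ignores a single spurious spike among the samples.
    if statistics.median(samples) > ALARM_C:
        print("temperature alarm:", samples)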
Configuration and deployment • Kickstart used for OS installation and basic post install • Used with PXE boot for some servers and all desktops • Puppet used for post-kickstart node installation (glite-WN, YAIM etc) • Also used for keeping systems up to date and rolling out packages • And used on desktops for software and mount points • Custom local testnode script to periodically check node health and software status • Nodes put offline/online automatically • Keep local YUM repo mirrors, updated when required, no surprise updates
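The actual testnode script isn't reproduced here, but a minimal sketch of the idea follows: run some local health checks and offline or online the node in Torque with pbsnodes accordingly. The specific checks and the /scratch mount point are hypothetical.

    import os
    import shutil
    import socket
    import subprocess

    def node_healthy():
        # Hypothetical checks: scratch area mounted and not nearly full.
        if not os.path.ismount("/scratch"):
            return False
        return shutil.disk_usage("/scratch").free > 5 * 1024**3

    node = socket.gethostname()
    if node_healthy():
        subprocess.call(["pbsnodes", "-c", node])   # clear offline flag
    else:
        subprocess.call(["pbsnodes", "-o", node])   # mark node offline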
HEP Network • Grid cluster is on a sort-of separate subnet (138.253.178/24) • Shares some of this with local HEP systems • Most of these addresses may be freed up with local LAN reassignments • Monitored by Cacti/weathermap, Ganglia, sFlow/ntop (when it works), Snort (sort of) • Grid site behind local bridge/firewall, 2G to CSD, 1G to Janet • Shared with other University traffic • Upgrades to 10G for WAN soon (now in progress!) • Grid LAN under our control, everything outside our firewall controlled by our Computing Services Department (CSD) • We continue to look forward to collaborating with CSD to make things work
Network Monitoring • Ganglia on all worker nodes and servers • Cacti used to monitor building switches and core Force10 switch • Throughput and error readings + weathermap • Ntop monitors core Force10 switch, but still unreliable • sFlowTrend tracks total throughput and biggest users, stable • LanTopolog tracks MAC addresses and building network topology • arpwatch monitors ARP traffic (changing IP/MAC address pairings).
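As a sketch of how the arpwatch output can be audited (see also the MAC auditing under Security later), the snippet below scans syslog for arpwatch "new station" lines and flags MAC addresses missing from a known inventory. The log path and the inventory set are assumptions.

    import re

    KNOWN_MACS = {"0:16:3e:12:34:56"}   # hypothetical inventory of known MACs
    new_station = re.compile(r"arpwatch: new station (\S+) (\S+)")

    # /var/log/messages is the usual syslog location on Scientific Linux.
    with open("/var/log/messages") as log:
        for line in log:
            match = new_station.search(line)
            if match and match.group(2) not in KNOWN_MACS:
                print("unknown device: ip=%s mac=%s" % match.group(1, 2))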
Monitoring – Cacti • Cacti weathermap (screenshot)
Current Hardware – Network • Getting low on cables • Network capacity our biggest upcoming problem • Moving to 10Gb • Hurrah! • Bonding was getting ridiculous as density increased • But worked very well
HEP Network topology • Diagram: 192.168 research VLANs; links at 1G, 2G, 1-3G and 10G; 2G × 24 aggregation; 1500 MTU and 9000 MTU segments
Old 1G Network • Diagram: 2G WAN link into the E600 chassis; 6G to servers, 3G to storage; 1G links to nodes
Current Hardware – Network • New 10Gb kit • Switches! • 3 x S4810 core switches • 10 x S55 rack switches • Optics! • 4x LR • 2x SR • NICs! • 40 x SolarFlare dual port low-latency 10G • Boxes! • 3x Dell R610s • 1x Dell R710
New 10G Network • Diagram: dual 20G WAN links into three S4810 core switches interconnected at 80G; 30G trunks to S55 rack switches for servers and storage; 2G links to nodes; headroom marked "Future…"
Current Hardware – Network • Initial testing • Jumbo frames still of benefit to bandwidth and CPU usage • Packet offloading turned on makes a bigger difference, but should be on by default anyway • 10Gb switches now installed in racks • Relocation of servers upcoming
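A quick hedged way to verify the two settings just tested on a given node: read the interface MTU from sysfs and list the offload flags with ethtool. The interface name is a placeholder.

    import subprocess

    IFACE = "eth2"   # placeholder for the 10Gb interface name

    with open("/sys/class/net/%s/mtu" % IFACE) as f:
        print("MTU:", f.read().strip())   # expect 9000 on the jumbo-frame LAN

    # 'ethtool -k' lists offload settings such as tcp-segmentation-offload.
    for line in subprocess.check_output(["ethtool", "-k", IFACE]).decode().splitlines():
        if "offload" in line:
            print(line.strip())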
Storage • Majority of file stores using hardware RAID6. • Mix of 3ware, Areca SATA controllers and Adaptec SAS/SATA • Arrays monitored with 3ware/Areca software and nagios plugins • Software RAID1 system disks on all servers. • A few RAID10s for bigger arrays and RAID0 on WNs • Now have ~550TB RAID storage in total. Getting to be a lot of spinning disks (~700 enterprise drives in WNs, RAID and servers). • Keep many local spares • Older servers were upgraded 1TB->2TB • Trickle down of 1TB->0.75TB->0.5TB->0.25TB upgrades • Also beefed up some server local system/data disks
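A minimal sketch of the kind of Nagios plugin used to watch the arrays, here assuming 3ware's tw_cli (output format varies by controller and firmware, so the parsing is illustrative): report CRITICAL if any unit is not healthy, using the standard Nagios exit codes.

    import subprocess
    import sys

    # 'tw_cli /c0 show' lists units on controller 0 with their status.
    out = subprocess.check_output(["tw_cli", "/c0", "show"]).decode()
    bad = [l.strip() for l in out.splitlines()
           if "DEGRADED" in l or "REBUILDING" in l or "INOPERABLE" in l]

    if bad:
        print("CRITICAL: " + "; ".join(bad))
        sys.exit(2)   # Nagios CRITICAL
    print("OK: all units healthy")
    sys.exit(0)       # Nagios OK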
Central Clusters • Spent last few years trying to hook up UKI-NORTHGRID-LIV-HEP to NWGRID over at the Computing Services Department (CSD) • Never really worked • Too many problems with OS versions • SGE awkward • Shared environment and configuration by proxy is painful • New £1 million procurement now in progress • We’re on the procurement committee • Physics requirements included from the outset • Steering and technical committees will be formed • Looks promising
Security • Network security • University firewall filters off-campus traffic • Local HEP firewalls to filter on-campus traffic • Monitoring of LAN devices (and blocking of MAC addresses on switch) • Single SSH gateway, Denyhosts • Snort and BASE (still need to refine rules to be useful, too many alerts) • Physical security • Secure cluster room with swipe card access • Laptop cable locks (occasionally some laptops stolen from building) • Promoting use of encryption for sensitive data • Parts of HEP building publically accessible • We had some (non-cluster) kit stolen • Logging • Server system logs backed up daily, stored for 1 year • Auditing logged MAC addresses to find rogue devices
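A hedged sketch of the daily log backup policy mentioned above: archive /var/log and prune anything older than a year. The backup destination is a hypothetical path.

    import os
    import tarfile
    import time

    BACKUP_DIR = "/backup/syslogs"   # hypothetical destination
    stamp = time.strftime("%Y-%m-%d")

    with tarfile.open(os.path.join(BACKUP_DIR, "logs-%s.tar.gz" % stamp), "w:gz") as tar:
        tar.add("/var/log", arcname="var-log-" + stamp)

    # Retain one year of archives, per the policy on this slide.
    cutoff = time.time() - 365 * 86400
    for name in os.listdir(BACKUP_DIR):
        path = os.path.join(BACKUP_DIR, name)
        if name.startswith("logs-") and os.path.getmtime(path) < cutoff:
            os.remove(path)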
Plans and Issues • Research Computing • Research now more of an emphasis • Various projects in pipeline • GPGPU work • Tesla box (dual Tesla M2070s) • Looking forward to MIC • Cluster room cooling still very old, regular failures • Bit more slack after the MAP2 removal but failures increasing • However central Facilities Management maintenance has improved • University data centre strategy continues to evolve • There were plans to move CSD kit into the room, but this is not happening as things stand • While we’re on the subject of cooling, we had an incident…
It’s getting hot in here… • One Saturday morning, a large compressor in the building basement blew a fuse • This took out the main building power • Our air-conditioning is on the main building power • But our cluster power is not… • Ambient temperature hit ~60C before building power was restored • A bit warmer than Death Valley and Africa at their hottest • We had some measures in place • Some systems shut down • But there were some problems • External network link went down with the building power failure • Need to automatically switch off systems more aggressively • Aftermath • Some hard drives failed • Some drives insist on telling us about this one time it got really hot (smartctl health check return code has bit 5 set, indicating a passed check but an attribute that has been <= threshold in the past)
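For reference, the smartctl behaviour described above can be checked programmatically: smartctl's exit status is a bitmask, and bit 5 is exactly the "passed, but an attribute has been <= threshold in the past" flag. A minimal sketch (the device path is a placeholder):

    import subprocess

    DEVICE = "/dev/sda"   # placeholder device

    # 'smartctl -H' runs the health check; its exit status is a bitmask.
    rc = subprocess.call(["smartctl", "-H", DEVICE],
                         stdout=subprocess.DEVNULL,
                         stderr=subprocess.DEVNULL)

    if rc & (1 << 3):
        print(DEVICE, "is FAILING now")
    elif rc & (1 << 5):
        print(DEVICE, "passed, but an attribute was <= threshold in the past")
    else:
        print(DEVICE, "reports healthy (exit mask %d)" % rc)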
Conclusion • New kit in (nodes and 10Gb networking) • New CSD cluster share coming (we hope) • Research becoming much more of a focus • Despite the odd issue (60C is no temperature to run a cluster at), things continue to run well