Presentation Transcript


  1. Liverpool HEP - Site Report May 2012
  John Bland, Robert Fay

  2. Staff Status
  • New Grid administrator: Mark Norman
  • Who joins: Steve Jones
  • Plus HEP sys admins: John Bland, Robert Fay

  3. Current Hardware - Users
  • Desktops
    • ~100 desktops: Scientific Linux 5.5, Windows 7+XP, legacy systems
    • Minimum spec of 2.66GHz Q8400, 4GB RAM + TFT monitor
    • Recently upgraded, clean installs, single platform
    • Opportunistic batch usage (~60 cores)
  • Laptops
    • ~60 laptops: mixed architecture
    • Windows+VM, MacOS+VM, netbooks
  • Printers
    • Samsung and Brother desktop printers
    • Various heavy-duty HP group printers

  4. Current Hardware – ‘Tier 3’ Batch
  • ‘Tier3’ Batch Farm
    • Software repository (0.5TB), storage (3TB scratch, 13TB bulk)
    • ‘medium64’ and ‘short64’ queues consist of 9 64-bit SL5 nodes (2x L5420, 2GB/core)
    • Plus ‘desk64’ for the aforementioned desktop usage, and ‘all64’
    • 2 of the 9 SL5 nodes can also be used interactively
    • 5 older interactive nodes (dual 32-bit Xeon 2.4GHz, 2GB/core)
    • Using Torque/PBS/Maui + fairshares (see the submission sketch below)
    • Used for general, short analysis jobs
    • Grid jobs are occasionally run opportunistically on this cluster (not much recently due to steady local usage)
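To illustrate how these local queues get used, here is a minimal Python sketch of submitting a short analysis job to the ‘medium64’ queue with Torque's qsub. The job script name, resource request and walltime are illustrative assumptions, not site defaults.

    #!/usr/bin/env python
    """Minimal sketch: submit a short analysis job to the local Torque/PBS
    batch farm. Queue names come from the slide above; the job script and
    resource requests are illustrative assumptions."""
    import subprocess

    def submit(script_path, queue="medium64", walltime="02:00:00"):
        """Submit a job script with qsub and return the Torque job ID."""
        cmd = [
            "qsub",
            "-q", queue,                    # medium64, short64, desk64 or all64
            "-l", "nodes=1:ppn=1",          # single core (nodes have 2GB/core)
            "-l", "walltime=" + walltime,
            script_path,
        ]
        return subprocess.check_output(cmd).decode().strip()

    if __name__ == "__main__":
        print(submit("run_analysis.sh"))    # hypothetical job script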

  5. Current Hardware – Servers
  • ~40 core servers (HEP+Tier2)
  • Some rack Gigabit switches
  • 1 high-density Force10 switch (400 ports)
    • Moving to 10Gb
  • Console access via KVMoIP (when it works) + IPMI
    • KVM installed
  • LCG Servers
    • Most services on virtual machines (lcg-CE, CREAM x2, sBDII, APEL, Torque, ARGUS, test cluster)
    • Still more to come (UI)
  • KVM
    • Redundant KVM servers for load sharing and testing

  6. Current Hardware – Virtual
  • Heavy-duty KVM servers for grid and HEP services
  • [Diagram: production KVM hosts HVM1 (24 CPU) and HGVM3 (24 CPU), test hosts HGVM1 (8 CPU) and HGVM2 (8 CPU), sharing NFS/iSCSI storage]

  7. Current Hardware – Nodes
  • HAMMER
  • Mixed nodes:
    • 7 x dual L5420
    • 64 x dual E5620
    • 16 x dual X5650
  • Lots of room to work in
  • IPMI+KVM makes life so much easier
    • Also hot-swap drives and motherboards
  • IPMI monitoring not entirely trustworthy (see the sanity-check sketch below)
    • Sensor reading spikes
    • Occasionally needs rebooting
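Because the BMC sensors occasionally report spurious spikes, a simple sanity filter on ipmitool readings can avoid false alarms. This is a rough Python sketch rather than our monitoring code; the 100C ceiling and 20C jump bound are assumptions.

    #!/usr/bin/env python
    """Rough sketch: sanity-check IPMI temperature readings so that one-off
    sensor spikes do not trigger alarms. Thresholds are assumptions."""
    import subprocess

    def read_temps():
        """Parse 'ipmitool sensor' output into {sensor_name: value_in_C}."""
        out = subprocess.check_output(["ipmitool", "sensor"]).decode()
        temps = {}
        for line in out.splitlines():
            fields = [f.strip() for f in line.split("|")]
            if len(fields) > 2 and fields[2] == "degrees C":
                try:
                    temps[fields[0]] = float(fields[1])
                except ValueError:
                    pass                      # reading is 'na' or similar
        return temps

    def plausible(value, last_value, max_jump=20.0, ceiling=100.0):
        """Reject implausible absolute values and sudden huge jumps."""
        if value > ceiling:
            return False
        return last_value is None or abs(value - last_value) <= max_jump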

  8. Configuration and deployment
  • Kickstart used for OS installation and basic post-install
    • Used with PXE boot for some servers and all desktops
  • Puppet used for post-Kickstart node installation (glite-WN, YAIM etc)
    • Also used for keeping systems up to date and rolling out packages
    • And used on desktops for software and mount points
  • Custom local testnode script periodically checks node health and software status (see the sketch below)
    • Nodes put offline/online automatically
  • Keep local YUM repo mirrors, updated when required, no surprise updates
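To give a flavour of what such a health check looks like, here is a minimal Python sketch that runs a couple of local checks and offlines or onlines the node with Torque's pbsnodes. The specific checks, paths and thresholds are assumptions, not the actual testnode script.

    #!/usr/bin/env python
    """Sketch of a periodic node health check: verify the worker node looks
    healthy and take it out of the batch system if not. The checks and
    thresholds here are illustrative assumptions."""
    import os
    import subprocess

    def node_healthy(scratch="/scratch", min_free_gb=10):
        """A couple of illustrative health checks on the local node."""
        st = os.statvfs(scratch)
        free_gb = st.f_bavail * st.f_frsize / 1e9
        if free_gb < min_free_gb:
            return False, "low scratch space (%.1f GB free)" % free_gb
        if not os.path.exists("/etc/grid-security/certificates"):
            return False, "CA certificates missing"
        return True, "ok"

    def set_node_state(node, healthy, reason):
        """Offline or online the node with Torque's pbsnodes command."""
        if healthy:
            subprocess.call(["pbsnodes", "-c", node])                # clear offline
        else:
            subprocess.call(["pbsnodes", "-o", node, "-N", reason])  # mark offline

    if __name__ == "__main__":
        ok, why = node_healthy()
        set_node_state(os.uname()[1], ok, why)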

  9. HEP Network
  • Grid cluster is on a sort-of separate subnet (138.253.178/24)
    • Shares some of this with local HEP systems
    • Most of these addresses may be freed up with local LAN reassignments
  • Monitored by Cacti/weathermap, Ganglia, sFlow/ntop (when it works), Snort (sort of)
  • Grid site behind local bridge/firewall, 2G to CSD, 1G to Janet
    • Shared with other University traffic
    • Upgrades to 10G for WAN soon (now in progress!)
  • Grid LAN under our control; everything outside our firewall is controlled by our Computing Services Department (CSD)
    • We continue to look forward to collaborating with CSD to make things work

  10. Network Monitoring
  • Ganglia on all worker nodes and servers
  • Cacti used to monitor building switches and the core Force10 switch
    • Throughput and error readings + weathermap
  • ntop monitors the core Force10 switch, but still unreliable
  • sFlowTrend tracks total throughput and biggest users; stable
  • LanTopolog tracks MAC addresses and building network topology
  • arpwatch monitors ARP traffic, i.e. changing IP/MAC address pairings (a toy version is sketched below)
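For illustration only, here is a toy Python version of the IP/MAC pairing check that arpwatch does for us, reading the kernel ARP table from /proc/net/arp. The state-file path is a made-up example.

    #!/usr/bin/env python
    """Toy illustration of arpwatch-style monitoring: spot IP/MAC pairings
    that change between runs. The state file path is an assumption."""
    import json
    import os

    STATE = "/var/tmp/arp_pairings.json"   # hypothetical state file

    def current_pairings():
        """Return {ip: mac} from the local ARP table."""
        pairings = {}
        with open("/proc/net/arp") as arp:
            next(arp)                      # skip the header line
            for line in arp:
                fields = line.split()
                ip, mac = fields[0], fields[3]
                if mac != "00:00:00:00:00:00":
                    pairings[ip] = mac
        return pairings

    def report_changes():
        old = {}
        if os.path.exists(STATE):
            with open(STATE) as f:
                old = json.load(f)
        new = current_pairings()
        for ip, mac in new.items():
            if ip in old and old[ip] != mac:
                print("%s changed from %s to %s" % (ip, old[ip], mac))
        with open(STATE, "w") as f:
            json.dump(new, f)

    if __name__ == "__main__":
        report_changes()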

  11. Monitoring - Cacti
  • Cacti weathermap [screenshot]

  12. Current Hardware – Network
  • Getting low on cables
  • Network capacity our biggest upcoming problem
  • Moving to 10Gb
    • Hurrah!
  • Bonding was getting ridiculous as density increased
    • But worked very well

  13. HEP Network topology
  • [Diagram: link speeds from 1G to 10G, 192.168 research VLANs, 2G x 24 links, mixed 1500 MTU and 9000 MTU segments]

  14. Old 1G Network
  • [Diagram: WAN into the E600 chassis at 2G, with servers, storage and nodes attached over 1G–6G links]

  15. Current Hardware – Network
  • New 10Gb kit
  • Switches!
    • 3 x S4810 core switches
    • 10 x S55 rack switches
  • Optics!
    • 4 x LR
    • 2 x SR
  • NICs!
    • 40 x SolarFlare dual-port low-latency 10G
  • Boxes!
    • 3 x Dell R610
    • 1 x Dell R710

  16. New 10G Network
  • [Diagram: WAN and three S4810 core switches with 20G/80G core links, 30G links to S55 rack switches for servers and storage, 2G to nodes; “Future…” expansion indicated]

  17. Current Hardware – Network
  • Initial testing
    • Jumbo frames still of benefit to bandwidth and CPU usage
    • Packet offloading turned on makes a bigger difference, but should be on by default anyway (see the check sketched below)
  • 10Gb switches now installed in racks
  • Relocation of servers upcoming
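A quick way to confirm both settings during this kind of testing is sketched below in Python: it reads the interface MTU from sysfs and parses ethtool -k for the offload features. The interface name eth2 is an assumption, and the exact feature names vary with the ethtool version.

    #!/usr/bin/env python
    """Sketch: check jumbo frames (MTU 9000) and common packet offloads on a
    10G interface. The interface name is an assumption."""
    import subprocess

    def mtu(iface):
        """Read the interface MTU from sysfs."""
        with open("/sys/class/net/%s/mtu" % iface) as f:
            return int(f.read().strip())

    def offloads(iface):
        """Parse 'ethtool -k' output into {feature: True/False}."""
        out = subprocess.check_output(["ethtool", "-k", iface]).decode()
        feats = {}
        for line in out.splitlines():
            if ":" in line:
                name, state = line.split(":", 1)
                feats[name.strip()] = state.strip().startswith("on")
        return feats

    if __name__ == "__main__":
        iface = "eth2"                       # hypothetical 10G interface
        print("MTU:", mtu(iface))
        feats = offloads(iface)
        # feature names depend on the ethtool version installed
        for want in ("tcp-segmentation-offload", "generic-receive-offload"):
            print(want, "on" if feats.get(want) else "off")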

  18. Storage
  • Majority of file stores using hardware RAID6
    • Mix of 3ware and Areca SATA controllers and Adaptec SAS/SATA
    • Arrays monitored with 3ware/Areca software and Nagios plugins
  • Software RAID1 system disks on all servers (see the example check below)
    • A few RAID10s for bigger arrays and RAID0 on WNs
  • Now have ~550TB RAID storage in total
    • Getting to be a lot of spinning disks (~700 enterprise drives in WNs, RAID arrays and servers)
    • Keep many local spares
  • Older servers were upgraded 1TB -> 2TB
    • Trickle down of 1TB -> 0.75TB -> 0.5TB -> 0.25TB upgrades
    • Also beefed up some server local system/data disks
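As an illustration of the Nagios-style monitoring, here is a hedged Python sketch that checks the software RAID (md) arrays via /proc/mdstat; the hardware controllers are checked with the vendor tools instead, and this is not the actual plugin.

    #!/usr/bin/env python
    """Sketch of a Nagios-style check for software RAID: parse /proc/mdstat
    and go CRITICAL if any md array has a failed member."""
    import re
    import sys

    OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3   # standard Nagios exit codes

    def degraded_arrays(path="/proc/mdstat"):
        """Return md devices whose status line shows a missing member disk."""
        bad = []
        current = None
        with open(path) as f:
            for line in f:
                m = re.match(r"^(md\d+)\s*:", line)
                if m:
                    current = m.group(1)
                # status looks like "[2/2] [UU]"; an underscore marks a failed member
                if current and re.search(r"\[[U_]*_[U_]*\]", line):
                    bad.append(current)
        return bad

    if __name__ == "__main__":
        bad = degraded_arrays()
        if bad:
            print("CRITICAL: degraded arrays: %s" % ", ".join(bad))
            sys.exit(CRITICAL)
        print("OK: all md arrays clean")
        sys.exit(OK)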

  19. Central Clusters
  • Spent last few years trying to hook up UKI-NORTHGRID-LIV-HEP to NWGRID over at the Computing Services Department (CSD)
    • Never really worked
    • Too many problems with OS versions
    • SGE awkward
    • Shared environment and configuration by proxy is painful
  • New £1 million procurement now in progress
    • We’re on the procurement committee
    • Physics requirements included from the outset
    • Steering and technical committees will be formed
    • Looks promising

  20. Security
  • Network security
    • University firewall filters off-campus traffic
    • Local HEP firewalls filter on-campus traffic
    • Monitoring of LAN devices (and blocking of MAC addresses on the switch)
    • Single SSH gateway, DenyHosts
    • Snort and BASE (still need to refine rules to be useful; too many alerts)
  • Physical security
    • Secure cluster room with swipe-card access
    • Laptop cable locks (occasionally laptops are stolen from the building)
    • Promoting use of encryption for sensitive data
    • Parts of the HEP building are publicly accessible
    • We had some (non-cluster) kit stolen
  • Logging
    • Server system logs backed up daily, stored for 1 year
    • Auditing logged MAC addresses to find rogue devices (a toy version is sketched below)
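A toy Python version of that MAC audit is sketched below: it compares addresses found in a switch or DHCP log against a known-device inventory and prints anything unknown. The file names and formats are assumptions, not our actual layout.

    #!/usr/bin/env python
    """Illustrative sketch: flag MAC addresses seen in logs that are not in
    the known-device inventory. Paths and formats are assumptions."""
    import re

    MAC_RE = re.compile(r"\b([0-9a-f]{2}(?::[0-9a-f]{2}){5})\b", re.IGNORECASE)

    def load_known(path="known_macs.txt"):
        """Known-device inventory: one MAC per line, '#' comments allowed."""
        known = set()
        with open(path) as f:
            for line in f:
                line = line.split("#")[0].strip().lower()
                if line:
                    known.add(line)
        return known

    def rogue_macs(logfile, known):
        """Return MAC addresses in the log that are not in the inventory."""
        seen = set()
        with open(logfile) as f:
            for line in f:
                for mac in MAC_RE.findall(line):
                    seen.add(mac.lower())
        return sorted(seen - known)

    if __name__ == "__main__":
        known = load_known()
        for mac in rogue_macs("switch_macs.log", known):   # hypothetical log
            print("unknown device:", mac)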

  21. Plans and Issues
  • Research Computing
    • Research now more of an emphasis
    • Various projects in pipeline
    • GPGPU work
      • Tesla box (dual Tesla M2070s)
      • Looking forward to MIC
  • Cluster room cooling still very old, regular failures
    • Bit more slack after the MAP2 removal but failures increasing
    • However central Facilities Management maintenance has improved
  • University data centre strategy continues to evolve
    • There were plans to move CSD kit into the room, but this is not happening as things stand
  • While we’re on the subject of cooling, we had an incident…

  22. It’s getting hot in here…
  • One Saturday morning, a large compressor in the building basement blew a fuse
    • This took out the main building power
    • Our air-conditioning is on the main building power
    • But our cluster power is not…
  • Ambient temperature hit ~60C before building power was restored
    • A bit warmer than Death Valley and Africa at their hottest
  • We had some measures in place
    • Some systems shut down
  • But there were some problems
    • External network link went down with the building power failure
    • Need to automatically switch off systems more aggressively
  • Aftermath
    • Some hard drives failed
    • Some insist on telling us about this one time it got really hot (the smartctl health check return code has bit 5 set, indicating a passed check but an attribute that has been <= threshold; see the sketch below)
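For reference, a minimal Python sketch of spotting those drives: smartctl -H sets bit 5 of its exit status when the health check passes but some attribute has been at or below its threshold in the past. The device list here is illustrative.

    #!/usr/bin/env python
    """Sketch: flag drives whose 'smartctl -H' exit status has bit 5 set
    (health check passed, but an attribute has been <= threshold before)."""
    import os
    import subprocess

    ATTRIBUTE_WAS_BELOW_THRESHOLD = 1 << 5   # bit 5 of the smartctl exit status

    def heat_flagged(device):
        """Run 'smartctl -H' and report whether bit 5 is set in its exit code."""
        with open(os.devnull, "w") as null:
            rc = subprocess.call(["smartctl", "-H", device],
                                 stdout=null, stderr=null)
        return bool(rc & ATTRIBUTE_WAS_BELOW_THRESHOLD)

    if __name__ == "__main__":
        for dev in ("/dev/sda", "/dev/sdb"):          # illustrative devices
            if heat_flagged(dev):
                print(dev, "remembers running too hot (bit 5 set)")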

  23. Conclusion
  • New kit in (nodes and 10Gb networking)
  • New CSD cluster share coming (we hope)
  • Research becoming much more of a focus
  • Despite the odd issue (60C is no temperature to run a cluster at), things continue to run well
