UKI-SouthGrid Overview
Pete Gronbech, SouthGrid Technical Coordinator
GridPP 25, Ambleside, 25th August 2010
UK Tier 2 reported CPU – Historical View to present
SouthGrid Sites Accounting as reported by APEL
Sites upgrading to SL5 and recalibration of published SI2K values
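As background for the SI2K recalibration (an assumption of this note, not stated on the slide): sites that benchmark in HEP-SPEC06 typically derive the SI2K figure they publish to APEL using the WLCG-agreed conversion of roughly 250 SI2K per HS06. A minimal Python sketch under that assumption:

# Sketch: derive the SI2K figure published to APEL from a HEP-SPEC06
# benchmark, assuming the WLCG-agreed factor of 250 SI2K per HS06.
HS06_TO_SI2K = 250

def published_si2k(total_hs06):
    """SI2K value a site would publish for a given total HS06 capacity."""
    return total_hs06 * HS06_TO_SI2K

# Example: JET's 1772 HS06 corresponds to 1772 * 250 = 443,000 SI2K.
print(published_si2k(1772))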
Site Resources
JET
• Stable operation (SL5 WNs)
• Could handle more opportunistic LHC work
• 1772 HS06, 1.5 TB storage
Birmingham
• Just purchased 40 TB of storage, taking total storage to 10 TB + 6×20 TB + 2×40 TB = 210 TB in a week or two
• Two new 64-bit servers (SL5): site BDII + monitoring VMs, DPM head node
• Everything (except mon) is SL5
• Both clusters have dual lcg-CE/CREAM CE front ends
• Sluggish response/instabilities with GPFS on the shared cluster
• Installed a 4 TB NFS-mounted file server for experiment software/middleware/user areas
(Photo caption: taken on someone else's proprietary (non-SL5) smartphone. He couldn't get signal in there either.)
Bristol LCG
• StoRM SE with GPFS, 102 TB, 90% full of CMS data
• StoRM developers are finishing testing 1.5.4 on SL5 64-bit and plan to provide 1.5.4 for both SLC4 ia32 and SL5 x86_64 to Early Adopters this month (August). Bristol is waiting for a stable, well-tested StoRM v1.5 SL5 64-bit release; in the meantime Bristol's StoRM v1.3 (32-bit on SL4) is working very well!
• On a 1 Gbps network, getting good bandwidth utilisation
• Servers (StoRM & GridFTP) very responsive despite load
HDFS with StoRM
• Prior WNs: Intel Xeon 2.0 GHz; Dec 2009 new WNs: AMD 2.4 GHz. Each AMD WN has 2 × 1 TB drives, with part of one disk used as the WN space
• Dr Metson is experimenting with HDFS using the rest of that disk plus the second disk, and working with INFN on the possibility of StoRM on top of HDFS
• Also experimenting with using Hadoop to process CMS data (see the sketch below)
In Other News...
• Swingeing IT staff cuts being planned at the University of Bristol (and downgrades for those few remaining)
• Started planning for SouthGrid to take over Bristol LCG site administration from April 2011
• Consolidate & reduce PP servers so the Astro admin can inherit them
• PP staff will provide best-effort support for the Bristol AFS server (IS won't)
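To make the Hadoop experiment concrete, here is a minimal, hypothetical Hadoop Streaming mapper in Python. The slides do not describe the actual Bristol/INFN job; the tab-separated input format and per-dataset counting are illustrative assumptions only.

#!/usr/bin/env python
# Hypothetical Hadoop Streaming mapper: counts records per dataset tag.
# Assumption: input lines look like "dataset<TAB>payload". Hadoop pipes
# input splits to this script on stdin and shuffles the emitted
# "key<TAB>1" pairs to a reducer that sums them.
import sys

for line in sys.stdin:
    line = line.rstrip("\n")
    if not line:
        continue
    dataset = line.split("\t", 1)[0]
    print("%s\t1" % dataset)

Paired with a reducer that sums the counts per key, this is the standard streaming pattern for running simple Python analyses over data held in HDFS.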
Bristol
• Plan to run the CEs and other control nodes on virtual machines, using a setup identical to Oxford's, to enable remote management
• The StoRM SE on GPFS will be run by Bob Cregan on site
Cambridge
• 32 CPU cores installed April 2010, bought from GridPP3 tranche 2
• Server to host several virtual machines (BDII, Mon, etc.) just delivered
• Network upgraded last November to provide gigabit Ethernet to all grid systems
• Storage is still 140 TB; CPU capacity will increase with the purchase noted above
• ATLAS production is the main VO running on this site
• Investigating current under-utilisation; possible accounting issues?
RALPP
• We believe we are now through all the messing about with air conditioning, with our machine room now running on the refurbished/upgraded AC plant. Happy days, all except for the leaks shortly after they turned it on!
• We've been running well below nominal capacity for most of this year, but are pretty much back now
• Joining with the Tier 1 for the tender process
• Testing Argus and glexec
• R-GMA and site BDII now moved to SL5 VMs
• Working on setting up a test instance of dCache with the Tier 1, using Tier 2 hardware
Oxford
• For the last six months the cluster has been running with very high utilisation
• Completed the tender for new kit and placed orders in July. Unfortunately the orders had to be cancelled due to manufacturing delays on the particular motherboard we ordered and a pricing problem. Now re-evaluating all suppliers with updated quotes
• New Argus server installed. Report by Kashif: 'Installing Argus was easy, and configuring was also OK once I understood the basic concept of policies, but it took me considerable time because of a bug in Argus which is partly due to the old style of host certificate issued by the UK CA. The same issue was responsible for the GridPP VOMS server problem. I have reported this to the UK CA. Argus uses glexec on the WN; the glexec installed on t2wn41 is being tested. Details on the GridPP wiki: http://www.gridpp.ac.uk/wiki/Oxford'
• Oxford has become an early adopter for CREAM and Argus
Oxford grid cluster setup (CREAM CE & pilot setup): t2ce02 and t2ce06 are CREAM CEs, both gLite 3.2 on SL5, in front of worker nodes t2wn40–87; t2argus02 is the Argus server, and glexec is enabled on t2wn41.
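As an illustration of the pilot/glexec flow sketched above (not taken from the slides): the pilot points glexec at the payload user's proxy via environment variables, Argus authorises the request, and the payload runs under the mapped local account. A hypothetical Python sketch, in which the glexec install path, proxy paths, and payload command are all assumptions:

# Hypothetical pilot-side invocation of glexec on a worker node such as
# t2wn41. GLEXEC_CLIENT_CERT / GLEXEC_SOURCE_PROXY are the standard
# glexec environment variables; every path below is an assumption.
import os
import subprocess

def run_payload_via_glexec(payload_cmd, user_proxy):
    env = dict(os.environ)
    env["GLEXEC_CLIENT_CERT"] = user_proxy   # identifies the payload user
    env["GLEXEC_SOURCE_PROXY"] = user_proxy  # proxy handed to the payload
    # glexec asks Argus whether this user may run here, maps the
    # credential to a local account, and executes the command as it.
    return subprocess.call(["/opt/glite/sbin/glexec"] + payload_cmd, env=env)

if __name__ == "__main__":
    run_payload_via_glexec(["/bin/id"], "/tmp/payload_user_proxy.pem")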
Oxford Dashboard (thanks to Glasgow for the idea/code)
Oxford's ATLAS dashboard
Conclusions
• SouthGrid sites' utilisation is generally improving
• Many sites have had recent hardware upgrades using the GridPP3 second tranche; others are putting out tenders, with some delays at Oxford following vendor issues
• RALPP back to full strength following the AC upgrade
• Monitoring for production running is improving
• Concerns over reduced manpower at sites as we move into GridPP4
Future Meetings
• Look forward to GridPP 26 in Sheffield next April
• If you look in the right places, the views are as good as here in the Lakes.