London Tier2 Status Olivier van der Aa LT2 Team M. Aggarwal, D. Colling, A. Fage, S. George, K. Georgiou, W. Hay, P. Kyberd, A. Martin, G. Mazza, D. McBride, H. Nebrinsky, D. Rand, G. Rybkine, G. Sciacca, K. Septhon, B. Waugh
Back in September • Back in September there were two important issues in the LT2: • Not all available CPU was online: • Brunel had a 128 CPU cluster. • IC-LeSC had a 400 CPU cluster running SGE. • Not all available disk was online: • QMUL had 18 TB of storage that needed to come online.
CPU: Brunel • At the beginning of November the Argo cluster gradually went online. • There were difficulties with the setup, which uses a Windows PXE server. • Some of the machines had a power supply problem that needed to be fixed. • We have set up a second CE for the Argo cluster; this proved quick to do.
CPU: IC-LeSC • Integrated the SGE job manager with the LCG job manager. • Introduced the concept of virtual queues: • SGE does not have the concept of queues; it works à la Condor, where you state your requirements and it allocates CPU for you. • Each virtual queue maps to a set of requirements. • SGE is flexible enough to do the accounting. • The information system was the most difficult part, and it took some time to make it stable; see the plot. • Documentation: http://wiki.gridpp.ac.uk/wiki/LCG-on-SGE
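The virtual-queue idea above can be sketched as a simple lookup: LCG expects named queues, while SGE takes resource requirements, so each published queue name is translated into an `-l` resource request at submission time. This is a minimal illustration; the queue names and limits below are invented, not the actual LT2 configuration.

```python
# Map each "virtual queue" name (what LCG sees) to a set of SGE
# resource requirements (what SGE actually schedules on).
# Names and limits here are illustrative only.
VIRTUAL_QUEUES = {
    "short": {"h_rt": "01:00:00", "h_vmem": "512M"},
    "long":  {"h_rt": "48:00:00", "h_vmem": "2G"},
}

def sge_submit_args(queue_name):
    """Translate a virtual queue name into 'qsub -l' resource requests."""
    reqs = VIRTUAL_QUEUES[queue_name]
    return ["-l", ",".join(f"{k}={v}" for k, v in sorted(reqs.items()))]

print(sge_submit_args("short"))
# ['-l', 'h_rt=01:00:00,h_vmem=512M']
```

The information system then only has to advertise the virtual queue names and their limits, while SGE remains free to allocate CPUs by requirement matching.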
Online CPU evolution • [Plot: online CPU over time for IC-LeSC and Brunel]
Online Storage • After the upgrade of QMUL to LCG 2.6.0, Alex and Giuseppe installed a DPM on their poolfs file system. • poolfs does the work of distributing file writes across the different disks; DPM is used with only one pool. • Need to check how poolfs will scale with the load on the DPM SRM. • Installing DPM almost doubled the storage available in the LT2.
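The division of labour above means DPM sees one big pool while poolfs spreads the writes underneath it. As a rough illustration of that idea (this is not the real poolfs algorithm, and the disk names are made up), a round-robin placement of incoming files looks like:

```python
from itertools import cycle

def distribute(files, disks):
    """Assign each incoming file to the next disk in turn (round robin).

    Illustrative sketch only: poolfs's actual placement policy may
    differ, e.g. by weighting disks by free space.
    """
    assignment = {}
    disk_cycle = cycle(disks)
    for f in files:
        assignment[f] = next(disk_cycle)
    return assignment

print(distribute(["a", "b", "c"], ["disk1", "disk2"]))
# {'a': 'disk1', 'b': 'disk2', 'c': 'disk1'}
```

The open question for scaling is exactly this layer: whether the placement keeps up once the DPM SRM puts many concurrent writes on it.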
SRM in the LT2 • We decided to have DPM at all sites: • Size of the disk resources. • Ease of configuration. • But we kept an eye on dCache via the IC-HEP setup.
Service Challenge 4 • Need to work to a tight schedule to finish the SRM installation in the LT2 by the end of January; 50% of sites should have completed 1 TB transfer tests. • Have started transfer tests between the T1 and IC-HEP: • Poor bandwidth of 80 Mb/s. • Attributed to one pool that was degrading the overall transfer rate. • Have now upgraded to dCache 1.6.6 but have problems with the information system not publishing the available size correctly. • Will use Glasgow for DPM-to-DPM transfer tests.
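To see why 80 Mb/s counts as poor for the 1 TB test, a quick back-of-the-envelope check (decimal TB assumed) shows the test alone would take more than a day at that rate:

```python
def transfer_hours(size_tb, rate_mbps):
    """Hours to move size_tb terabytes at rate_mbps megabits per second."""
    bits = size_tb * 1e12 * 8            # decimal TB -> bits
    seconds = bits / (rate_mbps * 1e6)   # Mb/s -> bits/s
    return seconds / 3600.0

print(round(transfer_hours(1, 80), 1))    # ~27.8 hours at the observed rate
print(round(transfer_hours(1, 1000), 1))  # ~2.2 hours on a full Gb link
```

This is also why the Gb connectivity issue at RHUL and Brunel matters for SC4.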
Issues • RHUL and Brunel are unlikely to have Gb connections for SC4. • Need APEL accounting to be installed on our SGE setup. • Need to see how the DPM setups will behave under load; some sites have chosen to install it on an NFS/poolfs file system. • Upgrade to the coming 2.7.0 release: • Need a clear schedule for the upgrade within the LT2. How can we help each other efficiently?
Conclusion • The last quarter was productive for: • Bringing more resources online: • Increased CPU by 49%. • Increased disk by 100%. • Preparing for SC4 in terms of SRM. • We need to focus more on the production quality of our setup. • Need to keep up with the SC4 preparation and detailed testing of the DPM SRM: • Focus on the transfer tests: 50% of sites by the end of January.
LT2 Thanks to all of the Team M. Aggarwal, D. Colling, A. Fage, S. George, K. Georgiou, W. Hay, P. Kyberd, A. Martin, G. Mazza, D. McBride, H. Nebrinsky, D. Rand, G. Rybkine, G. Sciacca, K. Septhon, B. Waugh