Southgrid Status Report
Rhys Newman: September 2004
GridPP 11 - Liverpool
Southgrid Member Institutions
• Oxford
• RAL PPD
• Cambridge
• Birmingham
• Bristol
• Warwick
Tier 2 Management Board
• The Tier 2 Board meets regularly, every 3 months.
• Where possible this is a face-to-face meeting, although a couple of people "phone in".
• The MOU process is still ongoing.
• Southgrid is the most difficult Tier 2 to organise as it has the most institutes.
• Many concerns about imposing security policy on the institutes.
• Confusion as to who at each site is authorised to sign the MOU.
• Final signatures are being collected as I speak.
Status at Warwick
• A recent addition to Southgrid.
• Third-line institute – no resources as yet, but remains interested in being involved in the future.
• Will not receive GridPP resources and so does not need to sign the MOU yet.
Operational Status
• RAL PPD
• Cambridge
• Bristol
• Birmingham
• Oxford
Status at RAL PPD
• Always on the leading edge of software deployment (co-located with the RAL Tier 1).
• Currently (10 Sept) up to LCG 2.2.
• CPUs: 24 × 2.4 GHz, 18 × 2.8 GHz
  • 100% dedicated to LCG
• 0.5 TB storage
  • 100% dedicated to LCG
Status at Cambridge
• Consistently the first institute to keep up with LCG releases.
• Currently LCG 2.1.1 (since its date of release); will upgrade by October.
• CPUs: 32 × 2.8 GHz – to increase to 40 soon
  • 100% dedicated to LCG
• 3 TB storage
  • 100% dedicated to LCG
Status at Bristol
• Limited involvement for the last 6 months due to a manpower shortage.
• Current plans to switch the BaBar farm to LCG by October.
• 1.25 FTE of computer support posts to be filled soon, which should improve the situation (a Bristol initiative, not GridPP).
• CPUs: 80 × 866 MHz PIII (planned BaBar farm)
  • Shared with LHC under an LCG install
• 2 TB storage (planned)
  • Shared with LHC under an LCG install
• A possible new computing centre (>500 CPUs) is still under discussion.
• A possible new post is still under discussion.
Status at Birmingham
• Second-line institute, reliably up to date with software within about 6 weeks of release.
• Currently LCG 2.2 (since mid August).
• Southgrid's "Hardware Support Post" to be allocated here to assist.
• CPUs: 22 × 2.0 GHz Xeon (+48 soon)
  • 100% LCG
• 2 TB storage, awaiting "Front End Machines"
  • 100% LCG
Status at Oxford
• Second-line institute that has only recently come online; until May it had limited resources.
• Currently LCG 2.1.1 (since early August).
• Hosted the LCG2 Administrator's Course, which impacted the installation timeline.
• CPUs: 80 × 2.8 GHz
  • 100% LCG
• 1.5 TB storage – upgrade to 3 TB planned
  • 100% LCG
Resource Summary
• CPU (3 GHz equivalents): 155.2 total (see the back-of-envelope check below)
• Storage: 7 TB total
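The totals above can be reproduced from the per-site figures on the preceding slides, assuming Bristol's planned BaBar CPUs and storage are excluded because they are not yet deployed. A minimal Python sketch of that arithmetic, under that assumption:

```python
# Back-of-envelope check of the Resource Summary, assuming the totals
# exclude Bristol's planned (not yet deployed) BaBar CPUs and storage.

# Per-site CPU inventories as (number of CPUs, clock speed in GHz),
# taken from the preceding status slides.
cpus = {
    "RAL PPD":    [(24, 2.4), (18, 2.8)],
    "Cambridge":  [(32, 2.8)],
    "Birmingham": [(22, 2.0)],
    "Oxford":     [(80, 2.8)],
}

# Per-site disk in TB, again excluding Bristol's planned 2 TB.
storage_tb = {"RAL PPD": 0.5, "Cambridge": 3.0, "Birmingham": 2.0, "Oxford": 1.5}

# "3 GHz equivalents" = total GHz summed over all CPUs, divided by 3 GHz.
total_ghz = sum(n * ghz for site in cpus.values() for n, ghz in site)
print(f"CPU (3GHz equiv): {total_ghz / 3.0:.1f}")           # 155.2
print(f"Storage (TB):     {sum(storage_tb.values()):.0f}")  # 7
```

The planned Cambridge increase to 40 CPUs and the Oxford storage upgrade to 3 TB would raise these totals further.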
LCG2 Administrator's Course
• The main activity at Oxford in the weeks leading up to July.
• Received very well – the lack of machines was identified as a problem, even though we used 20 servers!
• A good measure of the complexity:
  • An expert could do an LCG install in 1 day.
  • A novice could do it with expert help in 3 days.
  • A novice alone could take weeks!
• A lot of interest in a repeat, especially once the 8.5 "Hardware Support" posts are filled (suggestions welcome).
Ongoing Issues
• Complexity of the installation. We can't compare with "Google Compute" – is winning a PR exercise useful?
• Difficulty sharing resources – almost all of the resources listed are 100% LCG because of difficult sharing issues.
• How will we manage clusters without LCFGng? Quattor has a learning curve (it uses a new language) – should we all get training?
Future Issues
• We need 100,000 1 GHz machines "… to scale up the computing power available by a factor of ten …" – Tony Doyle (GridPP summary of the All Hands meeting). See the rough scale estimate sketched below.
• What are we learning now? gLite (aka EGEE1) may be completely different.
• Can't we get some cycle stealing? There are 20,000 "decent" machines in Oxford University alone!
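For a sense of scale, here is a rough, illustrative estimate only: the 100,000-machine figure refers to GridPP as a whole, not to Southgrid, so this simply compares the quoted target with the Southgrid totals from the Resource Summary.

```python
# Rough scale estimate only: compares the quoted GridPP-wide target of
# 100,000 x 1 GHz machines with the Southgrid totals from the Resource
# Summary slide. Purely illustrative, not a Southgrid quota.

target_ghz = 100_000 * 1.0      # 100,000 machines at 1 GHz each
southgrid_ghz = 155.2 * 3.0     # Southgrid's 155.2 "3 GHz equivalents"

print(f"Target:    {target_ghz:,.0f} GHz")
print(f"Southgrid: {southgrid_ghz:,.1f} GHz")
print(f"Ratio:     ~{target_ghz / southgrid_ghz:.0f}x")  # ~215x
```

The same gap motivates the cycle-stealing question: even at modest clock speeds, the 20,000 machines cited for Oxford University alone would dwarf the dedicated capacity currently deployed across Southgrid.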
LHC At Home!! (Thanks Mark)
• LHC at Home: http://lhcathome.cern.ch
• Started 1st September.
• Still in beta.
• 1004 computers already.
• How can we leverage this?