GridPP3 project status Sarah Pearce 24 April 2010 GridPP25 Ambleside
Skiddaw • The 4th-highest mountain in England (or the 3rd, depending) • The “simplest of the mountains of this height to ascend” • A well-trodden tourist track • The first summit of the ‘Bob Graham Round’ fell running challenge • The view from the top is ‘panoramic’
Since the last meeting • LHC continues to take data – see Pete’s talk • EGEE finished and EGI started – see Jeremy and Andy’s discussion • Tier-1 running well • Tier-2s procuring more equipment from the 2nd round of hardware grants • GridPP4 proposal reviewed and accepted – see Dave’s talk
Tier-1 • CPU hardware delivered and commissioned in time to meet WLCG pledge • One tranche of disk delivery still going through acceptance • Procurements for next round of CPU and disk have started • Testing for upgrade to CASTOR 2.1.9 (from 2.1.7) • Operations very stable
Tier-2s • RHUL cluster successfully running in new RHUL machine room • UCL-Central removed from list of UK sites • All grants for 2nd tranche of hardware issued: sites procuring hardware to meet 2010 pledge • Several sites made significant upgrades, including: • Sheffield (inc. air conditioning and temperature-monitoring equipment) • Lancaster kit for new machine room • Cambridge increased disk and CPU • IC moved site outside firewall – 2x improvement in performance • Some issues with staffing (Durham, likely at Bristol) • Discussion today at PMB/DB on how to cover sites with little or no dedicated staff
EGI, EGI-InSPIRE etc. • EGI started operations on 1 May 2010 • Governed by EGI Council • Executive Board reports to Council – Neil Geddes elected member of the EB • Key staff now in Amsterdam (except Neasan) • First Technical Forum will be held 14-17 September in Amsterdam • EGI-InSPIRE also started • Grant Agreement with EC not signed yet – so no money so far • e-ScienceTalk will start 1 September • Funds UK staff at IC and QMUL
UKI CPU contribution (LHC) • Chart: since April 2010 • Chart: country CPU statistics, August 2010 (GStat2.0)
UKI VOs • Chart: since April 2010 • Chart: previous year
UKI Tier-1 & Tier-2 contributions • Chart: since April 2010 • Chart: previous year
Storage • From GStat (and previous talks…): September 2008, March 2009, September 2009, April 2010 • From GStat2.0 (today)
Project map – statistics • Metrics • Milestones
Experiments • ATLAS • T1 data acceptance from CERN, T1s and T2s up from 79% to 96% • Data availability in T2 storage is green, but this hides quite significant SE issues at some sites • LHCb • Sharp drop in the proportion of production computing taking place in the UK, from 28% to 16% – early user jobs at CERN • Issue with data transfer from the T2s to RAL (1.2.5) • Ganga milestone (integrate XML job summary from DIRAC into Ganga) delayed due to setting up new DAST • CMS • Some data loss at T1 and T2, but not considered significant by CMS • Going well – CMS recognises the UK’s contribution • Other experiments • MINOS, D0 and BaBar mainly this quarter • Red milestones for experiment satisfaction/user support questionnaire – waiting on ATLAS reply
Grid services • Operations • Fraction of job slots used (2.1.3): target 80%, achieved 37%. Overall occupancy low this quarter. • Security • No incidents this quarter • Networking • No red metrics. Second (resilient) OPN link from RAL is operational • Data and storage • Record FTS transfer rates (2.4.4), with an average of over 370 MB/s sustained over the whole quarter • Still questions over published storage values
Tier-1 • T1 operating extremely well. Nearly all metrics for front-end systems at 100%. • CASTOR SAM tests at 100% for the first time (3.4.8) • Red metric for farm occupancy (3.2.11): 43% against a target of 80% • Red milestone for acceptance of 2009 disk hardware. One tranche of disk capacity failed acceptance – firmware fix applied and now running again. • Red milestone on moving out of Atlas Centre – revised and will be met next quarter
Tier-2s • Percentage of promised CPU available – green for all Tier-2s (metric 2). Percentage of disk is red for NorthGrid, but procurements are underway. Next quarter will be measured against 2010 pledge. • SAM availability and reliability tests green or orange (so above 90%) for most Tier-2s (metrics 3 and 4). Range of issues at SouthGrid sites. • Other red metrics: • CPU utilisation (wall clock time and CPU time, metrics 7/8) for LondonGrid and SouthGrid – but generally low • Number of management meetings for NorthGrid (metric 11) • Staff changes at several sites (Durham, Glasgow, Manchester, QMUL)
Management and external • Project execution – red metrics • All quarterly reports in by target time (though some earlier than others…) • Red metric for number of UB meetings • Rest of Map • No red metrics • EGEE/EGI metrics being revised to reflect EGI start
Risk register • Three high-level risks • Recruitment and retention – more of an issue as we get closer to GridPP4 • Sudden loss of key staff – as above • Uncertain long-term funding. GridPP4 approved, but government funding is an issue everywhere
Finances • Substantial reduction in the Tier-1 FY10 hardware line • STFC requested reduced capital spend of £1.1m • New experiment resource requirements from C-RRB in April 2010. Overall (to 2015) reduction in disk and CPU but increase in custodial storage. • Second tranche of Tier-2 hardware grants all issued • Bridging posts for EGEE-funded staff • Travel costs £173k for 2009/10 – within budget • Small amount of funding for R-GMA over 6 months
And the view is… Panoramic?