User Board Overview
Dan Tovey, University of Sheffield
Tier-1 Planning
• Quarterly UB meeting in April (see minutes) produced updated Tier-1 planning figures.
• A shortfall of Tier-1 resources in future years (especially 2008) is evident.
• Will need to consider whether experiment requirements can be met by Tier-2 resources; experiments need to demonstrate a clear need for Tier-1 functionality.
• Requests which can be met by Tier-2 resources are to be discussed with the Tier-2 Board.
• The 'Other Experiments' line has been removed from the Tier-1 schedule following the detailed Tier-1 Board plan; all users must now make representation to the UB to get access to resources.
Tier-1 Planning
• Tier-1 utilisation figures frequently fall significantly short of both requests and allocations; this sends the wrong message (see the accounting sketch below).
• Often this is not the fault of the experiments (e.g. middleware / operational problems), but experiments must work to produce more realistic estimates.
• The move to strict allocation of disk resources (no over-allocation) helps the Tier-1 team.
• Also synchronise allocations with the spending cycle; the aim is to ensure complete use of all new resources as soon as they come on-line.
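To make the utilisation point concrete, here is a minimal sketch in Python of the request / allocation / usage comparison the UB reviews. The experiment names are real GridPP users, but every figure below is a made-up placeholder, not a number from the minutes.

# Compare usage against requests and allocations per experiment.
# All figures are illustrative placeholders, not UB data.
resources = {
    # experiment: (requested_TB, allocated_TB, used_TB)
    "ATLAS": (40.0, 30.0, 18.0),
    "CMS":   (35.0, 30.0, 24.0),
    "LHCb":  (20.0, 15.0, 12.0),
}

for expt, (req, alloc, used) in resources.items():
    print(f"{expt}: used {used/req:.0%} of request, "
          f"{used/alloc:.0%} of allocation")

A report like this makes the gap between paper requests and delivered work visible per experiment, which is the "wrong message" the bullet above refers to.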
DB Links
• Stronger links with the Deployment Board are seen as vital; there is a standing invitation for DB representation at UB meetings.
UB Concerns
• How are experiments that globally are not moving to the Grid to be handled?
• Site stability and user support.
• Balance of effort at the Tier-1: much is used for CMS (SRM) and later the LCG SC (Service Challenge), but what about smaller user communities?
• 'Non-standard' operating systems at Tier-2 sites can render them useless to some experiments; the UB and Tier-2 Board need to persuade sites to work towards standardisation.
Questionnaire
• User Board questionnaire updated for the latest OsC process.
• No big changes from February.
• Some new comments/concerns:
  • fragmented support structure
  • 'all stick and no carrot'
  • held up by problems with establishing the VO
  • not all experiments supported by large Tier-2s
• Further details at: http://www.gridpp.ac.uk/eb/workdoc/gridusebyexpts_0605.doc
Pleasure: LHCb
• Shared data (LHCb RTTC production, May/June).
• The data reported are preliminary (accuracy at the 5% level).
• 5% produced with plain DIRAC sites.
• 95% produced with LCG sites.
Pleasure: ATLAS
• Using the Grid for 100% of Simulation, Digitisation and Reconstruction.
• 8.5M fully simulated ATLAS events produced.
• 20% of LCG jobs run in the UK.
• Overall throughput good, and improving …
Pain: ATLAS
• But … experience has been painful!
• Significant throughput problems experienced in January/February; production goals were descoped (15M events planned vs. 8.5M events actual).
• Identified problems (highlights – see also questionnaire):
  • The system appears to function best when only one person is submitting jobs!
  • Lack of a distributed mechanism for prioritising jobs.
  • Lack of inter-operability between LCG and other Grids: load balancing and data replication have to be done 'by hand', leading to production errors (e.g. the same sample produced multiple times on different Grids).
  • Too much human intervention required to set, adjust and enforce priorities.
  • Could not saturate CPU resources on LCG easily (the rate doubled with a simple change of scripts/person!): production time does not scale with CPU requirements.
  • Job definition/submission is very (expert) labour intensive; a submission sketch follows below.
  • Absolute need for an SE/SRM solution for small files.
  • Urgent need for VOMS, integrated with other Grid tools, for resource allocation/access/monitoring/accounting.
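To illustrate how labour-intensive per-job definition was, here is a minimal sketch of the kind of wrapper a production shifter might use to batch up LCG-2 submissions. The JDL attributes and the edg-job-submit command are standard LCG-2 UI features; the run_atlas_sim.sh script, the sample numbering and the file names are hypothetical.

# Generate one JDL per simulation job and submit it via the LCG-2 UI.
# run_atlas_sim.sh and the sample/event numbering are hypothetical.
import subprocess

JDL_TEMPLATE = """\
Executable    = "run_atlas_sim.sh";
Arguments     = "{sample} {first_event}";
StdOutput     = "sim_{sample}.out";
StdError      = "sim_{sample}.err";
InputSandbox  = {{"run_atlas_sim.sh"}};
OutputSandbox = {{"sim_{sample}.out", "sim_{sample}.err"}};
"""

for sample in range(10):   # ten jobs instead of ten hand-edited JDLs
    jdl = f"sim_{sample}.jdl"
    with open(jdl, "w") as f:
        f.write(JDL_TEMPLATE.format(sample=sample, first_event=sample * 1000))
    # -o appends the returned job ID to a file for later status polling
    subprocess.run(["edg-job-submit", "-o", "jobids.txt", jdl], check=True)

This assumes a valid grid proxy on an LCG-2 User Interface node; the point is only that a ten-line wrapper replaces repeated hand-editing and re-submission, which is exactly the expert labour the bullet above complains about.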
H1 Tests
• 30 jobs failed: 22 of these due to Grid problems (gridproxy / misc.).