A detailed plan for the GridPP3 project covering timeframe, budget, hardware requirements, and operational strategy. It includes cost estimates, proposed hardware allocation, and funding considerations for Tier-1 and Tier-2 resources.
Preliminary Project Plan for GridPP3 David Britton 15/May/06
Boundary Conditions
Timeframe:
• GridPP2+: Sep 07 to Mar 08
• GridPP3: Apr 08 to Mar 11
Budget Line: Not known exactly. Scale set by exploitation review input (both from GridPP and from the experiments).
Exploitation Review input from GridPP:
• FY07: £7,343k
• FY08-FY10: £29,302k
• Total: £36,643k
GridPP3
LHC Hardware Requirements
[Diagram: options for scaling the global requirements. Labels shown: "(Global Requirements)", "(Number of Tier1s)", "(Global T1 author frac.)", "~ UK Authorship fraction", "~50%", "??"]
GridPP Exploitation Review input: took the global hardware requirements and multiplied by the UK authorship fraction:
• ALICE 1%
• ATLAS 10%
• CMS 5%
• LHCb 15%
Using “Authors” in the denominator is problematic when not all authors (globally) have an associated Tier-1. Such an algorithm applied globally would not result in sufficient hardware.
GridPP has asked the experiments for their requirements, and their input (relative to global requirements) is:
• ALICE ~1.3%
• ATLAS ~13.7%
• CMS ~10.5%
• LHCb ~16.8%
Proposed Hardware
The proposal from the User Board is that the hardware requirements in the GridPP3 proposal are:
• those defined by the LHC experiments;
• plus those defined by BaBar (historically well understood);
• plus a 5% provision for “Other” experiments at the Tier-2s only.
Proposal
• We propose to use the UB input to define the hardware request (and not include alternative scenarios).
• We will note that these hardware requirements are not very elastic. Strategic decisions on the UK obligations, roles, and priorities will need to be made if the hardware is to be significantly reduced.
• (Internally, we should continue to discuss how to respond to lower funding scenarios.)
Hardware Costs
Hardware costs are rather uncertain. We have previously quantified this uncertainty as 10% per year of extrapolation (10% in 2007, 20% in 2008, etc.). This translates to an uncertainty of about £3.8m in the proposal.
(Actual numbers here will be updated; these are a few months old.)
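The "10% per year of extrapolation" rule can be written down as a short calculation. The following is a minimal sketch, not the method used to produce the £3.8m figure: the base year of 2006 is an assumption inferred from "10% in 2007, 20% in 2008", and the function names and the example cost of £1,000k are hypothetical. The £3.8m total is not reproduced here because the year-by-year cost profile is not given on the slide.

```python
# Sketch of the "10% per year of extrapolation" uncertainty rule.
# ASSUMPTION: estimates are made in 2006, so 2007 is one year out.
BASE_YEAR = 2006


def cost_uncertainty_fraction(year, base_year=BASE_YEAR, per_year=0.10):
    """Fractional uncertainty on a hardware cost estimated for `year`."""
    years_out = max(year - base_year, 0)
    return per_year * years_out


def cost_range(estimate_k, year):
    """Return (low, high) in k-pounds for a cost estimate for `year`."""
    frac = cost_uncertainty_fraction(year)
    return estimate_k * (1 - frac), estimate_k * (1 + frac)


if __name__ == "__main__":
    # Hypothetical cost of 1000 k-pounds in each year of the project.
    for year in range(2007, 2011):
        low, high = cost_range(1000.0, year)
        frac = cost_uncertainty_fraction(year)
        print(f"{year}: +/-{frac:.0%} -> {low:.0f}k to {high:.0f}k")
```

Summing `estimate * fraction` over the real yearly hardware costs would give the overall uncertainty under this rule.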
Tier-1 Hardware (Work in progress: numbers are still evolving!)
Running Costs (Work in progress)
Running Costs
• Running costs are traditionally charged indirectly (at institutes and CCLRC). They are normally averaged over larger communities, which tends to be to the advantage of particle physics.
• We hope this continues as long as possible.
• Exploitation review input contained ~£1.8m running costs split between Tier-1 and Tier-2, which is only 50% of the current estimate.
• Should we avoid explicitly including running costs in the GridPP3 proposal (on the basis that it is not known how these will be charged)? Instead, include a footnote pointing out the assumption that running costs are funded by other mechanisms (SLA, FEC).
Tier-2 Resources
• In GridPP2 we paid for staff in return for provision of hardware, which is not a sustainable model. We need a transition to a sustainable model that generates sufficient (but not excessive) hardware, and which institutes will buy into.
• Such a model should acknowledge that:
  • we are building a Grid (not a computer centre);
  • historically, Tier-2s have allowed us to lever resources/funding;
  • Tier-2s are designed to provide different functions and different levels of service from the Tier-1;
  • dual funding opportunities may continue for a while;
  • institutes may have strategic gain by continuing to be part of the "World's largest Grid".
Tier-2 Resources
A possible model:
- GridPP funds ~15 FTE at the Tier-2s (same as Tier-1).
- Tier-2 hardware requirements are defined by the UB request.
- GridPP pays the cost of purchasing hardware to satisfy the following year's requirements at the current year's price, divided by the nominal hardware lifetime (4 years for disk; 5 years for CPU).
E.g. 2253 TB of disk is required in 2008. In January 2007, this would cost ~£1.0k/TB. With a lifetime of 4 years, the 1-year “value” is 2253 × £1.0k / 4 = £563k.
Note: This does not necessarily reimburse the full cost of the hardware because, in subsequent years, the money GridPP pays depreciates with the falling cost of hardware, whereas the Tier-2s that actually made a purchase have been locked into a cost determined by the purchase date. However, GridPP does pay the cost up to 1 year before the actual purchase date, and institutes which already own resources can delay the spend further.
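The disk example above can be checked with a few lines of arithmetic. This is a minimal sketch of the payment rule as stated on the slide (requirement × current unit price ÷ nominal lifetime); the function and constant names are hypothetical, and only the 2008 disk figures come from the slide.

```python
# Sketch of the proposed Tier-2 hardware payment rule:
#   annual payment = next year's requirement * current unit price / lifetime
# Names are illustrative; only the 2008 disk numbers are from the slide.
DISK_LIFETIME_YEARS = 4  # nominal lifetime for disk
CPU_LIFETIME_YEARS = 5   # nominal lifetime for CPU


def annual_payment_k(required_units, unit_price_k, lifetime_years):
    """One-year "value" in k-pounds of hardware meeting the requirement."""
    return required_units * unit_price_k / lifetime_years


# 2253 TB required in 2008, priced in January 2007 at ~1.0 k-pounds per TB.
disk_2008 = annual_payment_k(2253, 1.0, DISK_LIFETIME_YEARS)

if __name__ == "__main__":
    print(f"2008 disk payment: about {disk_2008:.0f}k pounds")
```

The same function applies to CPU by passing a requirement in KSI2K, the current price per KSI2K, and `CPU_LIFETIME_YEARS`.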
Tier-2 Resources
Sanity Checks:
• Can apply the model and compare the cost of hardware at the Tier-1 and Tier-2, integrated over the lifetime of the project:

                         Tier-1    Tier-2
  CPU (K£/KSI2K-year):   0.071     0.043
  DISK (K£/TB-year):     0.142     0.107
  TAPE (K£/TB-year):     0.05

• Total cost of ownership: can compare the total cost of the Tier-2 facilities with the cost of placing the same hardware at the Tier-1 (estimate that doubling the Tier-1 hardware requires a 35% increase in staff). Including staff and hardware, the cost of the Tier-2 facilities is 80% of the cost of an enlarged Tier-1.
Question: Would institutes be prepared to participate at this level?
Staff Effort
Currently using the GridPP input to the exploitation review as the baseline (with the addition of Dissemination + Industrial Liaison).
[Figure labels: GridPP2+, GridPP3]
Staff Costs
Tier-1 Staff (Exploitation review input)
The staff required will be 15 FTE to run and operate the CPU, disk, tape, networking and core services, as well as to provide Tier-1 operations, deployment and experiment support, managed in an effective manner. Support will be during daytime working hours (08:30-17:00, Monday to Friday) with on-call cover outside this period. CCLRC may provide additional effort to underpin the service. To provide staff present on-site for 24x7 cover, a further 5 FTE would be needed (a further 2 FTE for weekend cover).
9 FTE in GridPP1; 13.5 FTE in GridPP2.
Tier-2 Staff (Exploitation review input)
Currently GridPP provides 9.0 FTE of effort for hardware support at the Tier-2s (London 2.5, NorthGrid 4.5, ScotGrid 1.0 and SouthGrid 1.0). This is acknowledged to be too low, and operating the Tier-2s is a significant drain on rolling-grant funded System Managers and Physicist Programmers. Large facilities require at least one FTE per site, whereas smaller sites need at least half an FTE. On the basis of currently available hardware, an allocation for HEP computing would be 4 FTE to London (5 sites), 6 FTE to NorthGrid (4 sites), 2 FTE to ScotGrid (3 sites) and 3 FTE to SouthGrid (5 sites), making a total of 15 FTE.
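The FTE figures quoted above can be tallied as a quick consistency check. This is a sketch only: the dictionary layout and variable names are illustrative, and the per-site average is a derived quantity that does not appear on the slide.

```python
# Tally of the Tier-2 staff figures quoted on the slide.
# Data layout and names are illustrative, not from the source.
current_fte = {"London": 2.5, "NorthGrid": 4.5, "ScotGrid": 1.0, "SouthGrid": 1.0}

# Proposed allocation: region -> (FTE, number of sites)
proposed = {
    "London": (4, 5),
    "NorthGrid": (6, 4),
    "ScotGrid": (2, 3),
    "SouthGrid": (3, 5),
}

total_current = sum(current_fte.values())
total_fte = sum(fte for fte, _ in proposed.values())
total_sites = sum(sites for _, sites in proposed.values())

# Average FTE per site in each region (derived, not stated on the slide).
fte_per_site = {region: fte / sites for region, (fte, sites) in proposed.items()}

if __name__ == "__main__":
    print(f"Current: {total_current} FTE; proposed: {total_fte} FTE over {total_sites} sites")
    for region, value in fte_per_site.items():
        print(f"{region}: {value:.2f} FTE per site")
```

Under these numbers every region averages at least the half FTE per site that the slide gives as the minimum for smaller sites.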
Grid Support Staff (Exploitation review input)
From the middleware support side, at least one FTE is required for each of the following areas: security support; storage systems; workload management; networking; underlying file transfer and data management systems; and information systems and monitoring (where an additional FTE of effort is anticipated, ensuring that our main contribution to EGEE is supported in the longer term).
It would be inappropriate to reduce to this level of effort abruptly at precisely the time that LHC is expected to start producing data in 2007. Rather it is advised to phase the reduction to this level over FY08 and FY09, thereby sustaining a necessary and appropriate level of support at this critical time.
…there will remain core Grid application interfaces supporting the experiment applications that will continue into the LHC running period. These stand to some extent independent of the experiment-specific programmes, although they serve them. A total of 7 FTEs is required for these common application interface support tasks.
It should be noted that the proposed effort in this combined area is a significant reduction from the current effort in these Grid developments of more than 30 FTEs.
Grid Operations (Exploitation review input) In order to operate and monitor the deployment of such a Grid, a further 8 FTEs of effort is needed, corresponding to the Production Manager, 4 Tier-2 Regional Coordinators and 3 members of the Grid Operations Centre.
Future Management (Last CB: Steve’s slide)
• Project Leader appointed by CB search committee
• Others by Project Leader?
• 2.5 → 1.5 over time?
• What about CB itself?
• What about Dissemination?
Dissemination
4. The bid(s) should: a) show how developments build upon PPARC’s existing investment in e-Science and IT investment, leverage investment by the e-science Core programme and demonstrate close collaboration with other science and industry and with key international partners such as CERN. It is expected that a plan for collaboration with industry will be presented or justification if such a plan is not appropriate.
For the exploitation review it was assumed dissemination was absorbed by PPARC. Unlikely at this point!
Presently we have effectively 1.5 FTE working on dissemination alone (Sarah Pearce plus events officer).
We want to maintain a significant dissemination activity (insurance policy), so adding in industrial liaison suggests maintaining the level at 1.5 FTE.
Full Proposal (work-in-progress)
Compares with exploitation review input of £36,643k, which included £1,800k running costs excluded above.