The Lab’s Computing Support Strategy for CDF and D0
Victoria White, Associate Lab Director for Computing and CIO
October 3, 2011
Our job in the Computing Sector
• Is to enable science and to optimize the support (human and technological) of the scientific programs of the lab (including the Experiment program)
  • Within funding and resource constraints
  • In the face of growing demands
  • To meet emerging needs
  • To deal with rapidly changing technology
• We also have to provide computing to support the lab’s operations and provide all the standard services that an organization needs (and often expects 24x7)
Computing Division -> Computing Sector
Fermilab Computing Facilities
EPA Energy Star award 2010
Feynman Computing Center (FCC)
• High availability services, e.g. core network, email, etc.
• Tape Robotic Storage (3 libraries of 10,000 slots each)
• UPS & Standby Power Generation
• ARRA project: upgrade cooling and add HA computing room (completed)
Grid Computing Center (GCC)
• High Density Computational Computing
• CMS, RUNII, GridFarm batch worker nodes
• Lattice HPC nodes
• Tape Robotic Storage (4 libraries of 10,000 slots each)
• UPS & taps for portable generators
Lattice Computing Center (LCC)
• High Performance Computing (HPC)
• Accelerator Simulation, Cosmology nodes
• No UPS
Facilities: more than just space, power, and cooling. They need continuous planning.
[Photo: the new ARRA-funded high-availability computer room in the Feynman Computing Center]
Cooling problems at GCC this summer
The air intake to the condensers can reach temperatures of 120°F (20-25°F above ambient on the pad), causing the cooling to shut down.
• $650-950k to move condensers to a platform for Computer Rooms B and C
  • Rough estimate from FESS
  • Does not include Computer Room A
  • Better estimate when the study is complete in November
• Soaker hoses to cool the concrete condenser pad
• Increased computer room operating temperatures
• Numerous air management improvements inside the computer room, including a cold aisle containment test
• Extended monitoring outside to the condenser pad
• Executed the load shed plan twice during the hottest days
• Rented portable air conditioning for use in CRB and outside under the condensers (the latter was effective, not efficient)
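The temperatures on this slide are quoted in Fahrenheit, and the "20-25F above ambient" figure is a temperature difference rather than an absolute reading, so it converts to Celsius with the scale factor only. A minimal sketch of that arithmetic follows; the function names are illustrative and not taken from the talk.

```python
def fahrenheit_to_celsius(temp_f: float) -> float:
    """Convert an absolute temperature from Fahrenheit to Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0


def fahrenheit_delta_to_celsius(delta_f: float) -> float:
    """Convert a temperature difference; only the 5/9 scale applies, no offset."""
    return delta_f * 5.0 / 9.0


# Figures quoted on the slide.
intake_f = 120.0                  # condenser air intake temperature
delta_low_f, delta_high_f = 20.0, 25.0  # rise above ambient on the pad

intake_c = fahrenheit_to_celsius(intake_f)
delta_low_c = fahrenheit_delta_to_celsius(delta_low_f)
delta_high_c = fahrenheit_delta_to_celsius(delta_high_f)

# Ambient temperature implied by the two slide figures taken together.
ambient_low_c = fahrenheit_to_celsius(intake_f - delta_high_f)
ambient_high_c = fahrenheit_to_celsius(intake_f - delta_low_f)

print(f"Intake: {intake_c:.1f} C")
print(f"Rise above ambient: {delta_low_c:.1f} to {delta_high_c:.1f} C")
print(f"Implied ambient: {ambient_low_c:.1f} to {ambient_high_c:.1f} C")
```

The implied ambient range is derived only from the two numbers on the slide; it is not a measurement reported in the talk.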
Need to fix the Grid Computing Center quickly, so it is ready for next summer
• Need to be able to use the computer rooms as designed and plan for that going forward.
• Need to move forward with CRA renovations for greater power per rack.
• We cannot keep running everyone ragged and being unreliable every summer.
Run II Computing Strategy
• Production processing and Monte Carlo production capability after the end of data taking
  • Ability to do some reprocessing if needed
  • Monte Carlo production at the current rate through mid-2013?
• Analysis computing capability for at least 5 years, but diminishing after the end of 2012
  • Push for 2012 conferences for many results: no large drop in computing requirements through this period
• Continued support for up to 5 years for
  • Code management and science software infrastructure
  • Data handling for production (+MC) and Analysis Operations
• Curation of the data: > 10 years, with possibly some support for continuing analyses
We have pushed/insisted on sharing strategies for computing for many years. Why?
• Cost
• Coherent technical approaches and architectures
• Support over the entire lifecycle of an experiment/project
Experiment/Project Lifecycle and funding
[Diagram: lifecycle phases of an experiment or project, with funding labeled "Expt or Project specific", "Project specific", and "Shared services"]
• Early period: R&D, simulations, LOI, proposals
• Mature phase: construction, operations, analysis
• Final data-taking and beyond: final analysis, data preservation and access
Sharing via the Grid: FermiGrid
[Diagram: FermiGrid architecture]
• User Login & Job Submission
• Grids: TeraGrid, WLCG, NDGF, Open Science Grid
• FermiGrid Infrastructure Services
• FermiGrid Monitoring/Accounting Services
• FermiGrid Authentication/Authorization Services
• FermiGrid Site Gateway
• Clusters: CMS 7485 slots, D0 6916 slots, CDF 5600 slots, GRIDFarm 3284 slots
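The slot counts listed for the four FermiGrid clusters can be summed to see how the shared pool is divided. The sketch below uses only the numbers on the slide; the `share` helper and variable names are illustrative, and the split reflects nominal slot ownership, not actual usage.

```python
# Batch slot counts as listed on the slide.
slots = {"CMS": 7485, "D0": 6916, "CDF": 5600, "GRIDFarm": 3284}

total = sum(slots.values())


def share(cluster: str) -> float:
    """Fraction of all listed slots that belong to one cluster."""
    return slots[cluster] / total


for name, count in slots.items():
    print(f"{name:9s}{count:6d} slots  {share(name):6.1%}")
print(f"{'Total':9s}{total:6d} slots")

# Combined share of the two Run II experiments.
run2_share = share("D0") + share("CDF")
print(f"D0 + CDF: {run2_share:.1%} of listed slots")
```

By this count, the two Run II experiments together own a bit over half of the listed slots.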
Budget/resource allocation for 2012+
• There is always upward pressure for computing
  • more disk and more CPU leads to faster results and greater flexibility
  • more help with software & operations is always requested
• Within a fixed budget each experiment can usually optimize between tape drives, tapes, disk, CPU, servers
  • assuming basic shared services are provided.
• With so many experiments in so many different stages, we intend to convene a “Scientific Computing Portfolio Management Team” to examine the needs/computing models of the different Fermilab-based experiments and help in allocating the finite dollars to optimize scientific output.
“Data Preservation” for Tevatron data
• Data will be stored and migrated to new tape technologies for ~10 years
  • Eventually 16 PB of data will seem modest
• If we want to maintain the ability to reprocess and do analysis on the data, there is a lot of work to be done to keep the entire environment viable
  • Code, access to databases, libraries, I/O routines, operating systems, documentation, ...
• If there is a goal to provide “open data” that scientists outside of CDF and D0 could use, there is even more work to do.
  • 4th Data Preservation Workshop at Fermilab in May
• The collaboration has to decide soon whether we need to do more than maintain data for collaboration use.