Scientific Cluster Support Program
SCS Steering Committee Report
ITSD/CSAC Retreat March 3, 2004
Overview
• Growing interest in the use of Linux clusters for scientific research at Berkeley Lab
• Multi-node compute clusters are hard to manage efficiently
• Findings from the Berkeley Lab Midrange Computing Workshop (March 2002) and subsequent discussions with scientists identified a need for affordable centralized support
• The ultimate goals:
  - to increase the use of scientific computing in Lab research projects
  - to introduce parallel computing to Berkeley Lab researchers
  - to develop efficient, cost-effective methods for managing production clusters
• Four-year program started 1/03
Mid Range Computing Gap
Scientific Cluster Support Program
• Ten research projects from seven of the Lab's scientific Divisions were selected to participate in the four-year, Laboratory-funded program after a Lab-wide application process that was completed in September 2002.
• These projects are eligible to receive the following services:
  - Pre-purchase consulting
  - Procurement assistance
  - Setup and configuration
  - Ongoing systems administration and cyber security
  - Computer room space with networking and cooling
SCSC
• The SCS Steering Committee is a working group of CSAC. As such, it informs CSAC of SCS project status and issues on a quarterly basis, or as needed, and solicits input from CSAC members to aid in decision making and priority setting. In turn, CSAC members, as representatives of their divisions and CSAC, have the responsibility to communicate to the Steering Committee any information that is important to its role in governance of the SCS Project.
• The SCS Steering Committee chartered by this document will work with the full CSAC committee and the ITSD project team to ensure success and visibility of the first phase (implementation phase) of the SCS Project. This phase will conclude once all of the clusters have been purchased or integrated into the project and the support activities for the clusters have become routine, which is anticipated in March 2004. The Steering Committee will meet bi-monthly or as needed during this first period.
• The Steering Committee is responsible for governance of the implementation phase of the project. ITSD is responsible for day-to-day management. This governance includes the following activities:
  - Provide oversight to ensure accountability
  - Participate in decision making
  - Participate in priority setting
SCSC Members
Committee Chair - Alessandra Ciocio - Physics, CSAC and MRC Working Group
Paul D. Adams - PBD, CSAC and MRC Working Group
Shane Canon - NERSC
Tom Daley - ESD
Damir Sudar - LSD, CSAC
Gary Jung - SCS Project Manager
Tammy Welcome - SCS Project Director
Jim Triplett (advisor)
Status
• SCS Program is progressing well despite the scheduling complexity caused by the availability of each project's procurement funds and differences in readiness.
• 6 of 10 clusters in production (Chakraborty, Gadgill/Brown, Hoversten/Majer, Miller, Lester, Eisen)
• 2 clusters in progress (Adams/Kim/Holbrook/Brenner, Cooper/Tainer)
• 1 cluster upcoming (Head-Gordon)
• 1 cluster opting out (White)
Progress
• Development of Linux cluster expertise has encouraged wider use of Linux clusters at Berkeley Lab.
• These clusters are funded on recharge and include:
  - PBD Berkeley Structural Genomics Center (14 processors)
  - LANL Tuberculosis Structural Genomics Consortium (24 processors)
  - Yucca Mountain Project (12 processors)
  - Yucca Mountain Project (32 processors)
Developments
• ITSD has developed high-quality, cost-effective Linux cluster support for Berkeley Lab
  - Development of standard toolsets and procedures
    • Warewulf Cluster Implementation Toolkit
    • Developed locally by Greg Kurtzer
    • GPL licensed through Lab Tech Transfer
    • Showcased at Supercomputing 2003
    • Used widely outside the Lab (Univ of Kentucky supercomputer)
  - Higher level of cybersecurity (SecurID cards)
  - Continual review of support costs
• Summary: 360 processors in production (278 SCS, 82 Non-SCS)
Past and Present Issues for the SCS Steering Committee
• Conforming to developed standards (cluster distribution, architecture)
• US Export Controls (deemed export)
• How do we select SCS replacements or additions?
  - What to do beyond SCS program?
  - Extensibility
  - Many other projects are interested
  - Revisit institutional computing resource
  - What about GRIDs?