Scientific Cluster Support Program

The report highlights the impact and progress of the Scientific Cluster Support Program at Berkeley Lab, focusing on cluster management, support services, project selection, and challenges for future development. The document discusses achievements, factors contributing to success, and new challenges in scientific computing.

hoskinsr

Presentation Transcript


  1. Scientific Cluster Support Program: SCS Steering Committee Report
     Report to CSAC - March 4, 2005

  2. Overview
     • Growing interest in the use of Linux clusters for scientific research at Berkeley Lab (2000)
     • Hard to efficiently manage a multi-node compute cluster
     • Findings from the Berkeley Lab Midrange Computing Workshop (March 2002) and subsequent discussions with scientists identified a need for affordable centralized support
     • The ultimate goal:
       - to increase the use of scientific computing in Lab research projects
       - to introduce parallel computing to Berkeley Lab researchers
       - to develop efficient, cost-effective methods for managing production clusters
     • $1.3M, four-year program started January 2003

  3. Mid Range Computing Gap

  4. Scientific Cluster Support Program
     • Ten research projects from seven of the Lab's scientific Divisions were selected to participate in the four-year Laboratory-funded program after a Lab-wide application process that was completed in September 2002.
     • These projects are eligible to receive the following services:
       - Pre-purchase consulting
       - Procurement assistance
       - Setup and configuration
       - Ongoing systems administration and cyber security
       - Computer room space with networking and cooling

  5. * Replaced with Nuclear Science, PI: I-Yang Lee, GRETINA detector, 16 AMD Opteron processors

  6. SCSC
     • The SCS Steering Committee is a working group of CSAC. As such, it informs CSAC of SCS project status and issues on a quarterly basis, or as needed, and solicits input from CSAC members as needed to aid in decision making and priority setting. In turn, CSAC members, as representatives of their divisions and CSAC, have the responsibility to communicate information to the Steering Committee that is important to its role in governance of the SCS Project.
     • The SCS Steering Committee will work with the full CSAC committee and the ITSD project team to ensure success and visibility of the first phase (implementation phase) of the SCS Project. This phase will conclude once all of the clusters have been purchased or integrated into the project and the support activities for the clusters have become routine, anticipated March 2004. The Steering Committee will meet bimonthly or as needed during this first period.
     • The Steering Committee is responsible for governance of the implementation phase of the project. ITSD is responsible for day-to-day management. This governance includes the following activities:
       - Provide oversight to ensure accountability
       - Participate in decision making
       - Participate in priority setting

  7. SCSC Members
     • Committee Chair: Alessandra Ciocio - Physics, CSAC and MRC Working Group
     • Paul D. Adams - PBD, CSAC and MRC Working Group
     • Shane Canon - NERSC
     • Tom Daley - ESD
     • Damir Sudar - LSD, CSAC
     • Gary Jung - SCS Project Manager
     • Tammy Welcome - SCS Project Director
     • Jim Triplett (advisor)

  8. Past Challenges
     • Technical:
       - Scheduling: the SCS Program progressed well despite the complexity of project scheduling, which was driven by the availability of each project's procurement funds and by differences in readiness
       - Security: export control, one-time password tokens, firewall
       - Software: licensing; conforming to developed standards (cluster distribution, architecture)
     • Strategic:
       - How do we select SCS replacements or additions?
       - What to do beyond the SCS program?
       - Extensibility
       - Many other projects are interested
       - Revisit institutional computing resource
       - What about GRIDs?

  9. Accomplishments
     • Development of Linux cluster expertise has encouraged the use of Linux clusters at Berkeley Lab
       - 14 clusters in production: 10 SCS funded, 3 fully recharged, 1 ITSD test cluster
       - 698 processors online (360 a year ago)
     • Development of high-quality, cost-effective Linux cluster support for Berkeley Lab
       - Warewulf cluster software
       - Standard SCS cluster distribution
     • Enabling science
       - T-cell discovery (Oct 2003)
       - Work on photosynthesis (Nov 2004)
     • Driving costs down
       - Standardization
       - Outsourcing
       - Competitive bid procurement

  10. Factors Contributing to SCS Success
     • Initial funding was key to getting started
     • Prominent scientists were the customers
     • Talented, motivated staff
       - Creative, but focused on production use
       - Development of technical depth
     • Adherence to standards
     • Supportive Steering Committee
     • Positive feedback

  11. New Challenges
     • Larger systems
       - Scalability issues, e.g. parallel filesystems
       - Moving up the technology curve: InfiniBand, PCI Express
       - Assessing integration risks
     • Increasing cluster utilization
     • Charting path forward

  12. What’s next?
     • Identification of path forward for scientific cluster support at LBNL
       - Form new charter for new group
       - Membership (CSAC, cluster users)
     • Review process and timeline for activity
       - Gather information*: Jan-Jun 05
       - Analyze and synthesize data: July-Aug 05
       - Identify options: Oct 05
       - Select path forward: Nov 05
       - Communicate to gain support: Dec 05
       - Submit proposal: Spring 2006
     * Includes the science that gets accomplished by having a well-supported cluster
