
Implementation and experience with a 20.4 TFLOPS IBM BladeCenter cluster


Presentation Transcript


  1. Implementation and experience with a 20.4 TFLOPS IBM BladeCenter cluster Craig A. Stewart Matthew Link, D. Scott McCaulay, Greg Rodgers, George Turner, David Hancock, Richard Repasky, Faisal Saied, Marlon Pierce, Ross Aiken, Matthias Mueller, Matthias Jurenz, Matthias Lieber 26 June 2007

  2. Outline • Background about Indiana University • Brief history of implementation • System architecture • Performance analysis • User experience and science results • Lessons learned to date

  3. Introduction - IU in a nutshell • ~$2B annual budget • One university with 8 campuses; 90,000 students, 3,900 faculty • 878 degree programs, including the nation’s 2nd largest school of medicine • President-elect: Michael A. McRobbie • IT organization: >$100M/year IT budget, 7 divisions • Research Technologies Division - responsible for HPC, grid, storage, advanced visualization • Pervasive Technology Labs (Gannon, Fox, Lumsdaine) • Strategic priorities: life sciences and IT

  4. Big Red - Basics and history • Spring 2006: assembled in 17 days at an IBM facility, disassembled, shipped to IU, reassembled in 10 days • 20.4 TFLOPS theoretical peak, 15.04 TFLOPS achieved on Linpack; 23rd on the June 2006 Top500 list (see the worked check below) • In production for local users on 22 August 2006, for TeraGrid users on 1 October 2006 • Best Top500 ranking in IU history • Upgraded to 30.72 TFLOPS in Spring 2007, ??? on the June 2007 Top500 list • Named after the nickname for IU sports teams
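
As a rough cross-check of the figures above: theoretical peak follows from blade count, cores per blade, clock rate, and floating-point operations per core per cycle. The configuration assumed in this sketch (512 JS21 blades, each with two dual-core 2.5 GHz PowerPC 970MP processors, 4 flops per core per cycle) is the commonly reported Big Red configuration rather than something stated on this slide, so treat it as illustrative arithmetic, not a specification.

```python
# Back-of-the-envelope check of the quoted peak and Linpack figures.
# Assumed configuration (not given on this slide): 512 JS21 blades, each with
# two dual-core 2.5 GHz PowerPC 970MP processors, and 4 floating-point
# operations per core per cycle (two FPU pipelines with fused multiply-add).
blades = 512
cores_per_blade = 4            # 2 sockets x 2 cores
clock_ghz = 2.5
flops_per_cycle = 4

peak_tflops = blades * cores_per_blade * clock_ghz * flops_per_cycle / 1000.0
linpack_tflops = 15.04         # achieved figure quoted above (510-node run)

print(f"theoretical peak:   {peak_tflops:.2f} TFLOPS")            # ~20.48
print(f"Linpack efficiency: {linpack_tflops / peak_tflops:.1%}")  # ~73%

# The same arithmetic for a 768-blade system gives
# 768 * 4 * 2.5 * 4 / 1000 = 30.72 TFLOPS, matching the upgraded figure.
```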

  5. Motivations and goals • Initial goals for 20.4 TFLOPS system: • Local demand for cycles exceeded supply • TeraGrid Resource Partner commitments to meet • Support life science research (Indiana Metabolomics and Cytomics Initiative - MetaCYT) • Support applications at 100s to 1000s of processors • 2nd phase upgrade to 30.7 TFLOPS • Support economic development in State of Indiana

  6. TeraGrid • Motivation for being part of TeraGrid: • Support national research agendas • Improve ability of IU researchers to use national cyberinfrastructure • Testbed for IU computer science research • Compete for funding for larger center grants

  7. Why a PowerPC-based cluster? • Processing power per node • Density, good power efficiency relative to available processors • Possibility of performance gains through use of Altivec unit & VMX instructions • Blade architecture provides flexibility for the future • Results of RFP

  8. HPCC and Linpack Results (510 nodes)

  9. IBM e1350 vs Cray XT3, per process (core) and per processor

  10. IBM e1350 vs HP XC4000

  11. Difference: 4 KB vs 16 MB page size

  12. Comparative performance: NAMD

  13. Simulation of TonB-dependent transporter (TBDT) • Used systems at NCSA, IU, and PSC • Modeled mechanisms that allow transport of molecules through the cell membrane • Work by Emad Tajkhorshid and James Gumbart of the University of Illinois at Urbana-Champaign • Mechanics of Force Propagation in TonB-Dependent Outer Membrane Transport. Biophysical Journal 93:496-504 (2007)

  14. ChemBioGrid • Analyzed 555,007 abstracts in PubMed in ~8,000 CPU hours • Used OSCAR3 to find SMILES strings -> SDF format -> 3D structure (GAMESS) -> into the Varuna database and then on to other applications (a sketch of this pipeline follows below) • “Calculate and look up” model for ChemBioGrid
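
The pipeline bullet above is the technical core of this slide. The sketch below only illustrates that flow end to end; every helper function is a hypothetical placeholder standing in for the real OSCAR3 text mining, SMILES-to-SDF conversion, GAMESS structure calculation, and Varuna database steps, whose actual interfaces are not shown in this talk.

```python
# Illustrative sketch of the ChemBioGrid "calculate and look up" flow described
# above. All helpers are hypothetical stand-ins, not the project's real APIs.

def extract_smiles_oscar3(abstract_text):
    """Stand-in for OSCAR3 text mining: return SMILES strings found in the text."""
    return ["CCO"]                                 # dummy result (ethanol)

def smiles_to_sdf(smiles):
    """Stand-in for SMILES -> SDF conversion."""
    return f"{smiles}\n  placeholder SDF record\n$$$$\n"

def optimize_with_gamess(sdf_record):
    """Stand-in for a GAMESS 3D-structure job, run in batch on the cluster."""
    return {"sdf": sdf_record, "energy_hartree": None}

def process_abstract(abstract_text, database):
    """Mine one abstract, compute structures, and store them for later look-up."""
    for smiles in extract_smiles_oscar3(abstract_text):
        structure = optimize_with_gamess(smiles_to_sdf(smiles))
        database[smiles] = structure               # stand-in for Varuna insertion

if __name__ == "__main__":
    db = {}
    process_abstract("...ethanol (CCO) inhibited the enzyme...", db)
    print(sorted(db))                              # ['CCO']
```

Once a structure is stored, later requests for the same compound become database look-ups rather than new calculations, which is the point of the “calculate and look up” model.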

  15. WxChallenge • Over 1,000 undergraduate students, 64 teams, 56 institutions • Usage on Big Red: • ~16,000 CPU hours on Big Red (most of any TeraGrid resource) • 63% of processing done on Big Red • Most of the students who used Big Red couldn’t tell you what it is

  16. Overall usage to date

  17. Overall user reactions • NAMD, WRF users very pleased • Some community codes essentially excluded • Porting from Intel instruction set a significant perceived challenge in a cycle-rich environment • MILC optimization with VMX not successful so far

  18. Overall evaluation & conclusions • The manageability of the system is excellent • For a select group of applications, Big Red provides excellent performance and reasonable scalability • We are likely to expand the 10 GigE network from Big Red to the rest of the IU cyberinfrastructure • We are installing a 7 TFLOPS Intel cluster; the model going forward is Intel-compatible processors as the “default entry point,” with more specialized systems for highly scalable codes

  19. Pace of change • The most powerful system attached to the TeraGrid has changed 3 times since June 2006 • Absolute rate of change feels very fast

  20. Conclusions • A 20.4 TFLOPS system with “not the usual” processors was successfully implemented, serving local Indiana University researchers and the national research audience via the TeraGrid (IU is currently 5th in providing cycles to the TeraGrid) • We had excellent success with the system in some regards, and excellent response in some niches • In the future, Science Gateways may become more and more important in improving usability: • It is unrealistic to expect most scientists to chase after the fastest available system when the fastest system changes 3 times a year • Programmability of increasingly unusual architectures is not likely to become easier • For applications with broad potential user bases, or with extreme scalability on specialized systems, Science Gateways can successfully hide complexity from researchers

  21. Acknowledgements - funding agencies • IU’s involvement as a TeraGrid Resource Partner is supported in part by the National Science Foundation under Grant Nos. ACI-0338618, OCI-0451237, OCI-0535258, and OCI-0504075 • The IU Data Capacitor is supported in part by the National Science Foundation under Grant No. CNS-0521433 • This research was supported in part by the Indiana METACyt Initiative. The Indiana METACyt Initiative of Indiana University is supported in part by Lilly Endowment, Inc. • This work was supported in part by Shared University Research grants from IBM, Inc. to Indiana University • The LEAD portal is developed under the leadership of IU Professors Dr. Dennis Gannon and Dr. Beth Plale, and supported by grants ###___ • The ChemBioGrid Portal is developed under the leadership of IU Professor Dr. Geoffrey C. Fox and Dr. Marlon Pierce and funded via the Pervasive Technology Labs (supported by the Lilly Endowment, Inc.) and the National Institutes of Health ###))) • Many of the ideas presented in this talk were developed under a Fulbright Senior Scholar’s award to Stewart, funded by the • Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation (NSF), National Institutes of Health (NIH), Lilly Endowment, Inc., or any other funding agency

  22. Acknowledgements - People • Malinda Lingwall deserves thanks for most of the .ppt layout work • Maria Morris contributed to the graphics used in this talk • Marcus Christie and Suresh Marru of the Extreme! Computing Lab contributed the LEAD graphics • John Morris (www.editide.us) and Cairril Mills (cairril.com Design & Marketing) contributed graphics • This work would not have been possible without the dedicated and expert efforts of the staff of the Research Technologies Division of University Information Technology Services, the faculty and staff of the Pervasive Technology Labs, and the staff of UITS generally • Thanks to the faculty and staff with whom we collaborate locally at IU and globally (via the TeraGrid, and especially at Technische Universitaet Dresden)

  23. Author affiliations
  Craig A. Stewart; stewart@iu.edu; Office of the Vice President and CIO, Indiana University, 601 E. Kirkwood, Bloomington, IN
  Matthew Link; mrlink@indiana.edu; University Information Technology Services, Indiana University, 2711 E. 10th St., Bloomington, IN 47408
  D. Scott McCaulay; smccaula@indiana.edu; University Information Technology Services, Indiana University, 2711 E. 10th St., Bloomington, IN 47408
  Greg Rodgers; rodgersg@us.ibm.com; IBM Corporation, 2455 South Road, Poughkeepsie, New York 12601
  George Turner; turnerg@indiana.edu; University Information Technology Services, Indiana University, 2711 E. 10th St., Bloomington, IN 47408
  David Hancock; dyhancoc@iupui.edu; University Information Technology Services, Indiana University-Purdue University Indianapolis, 535 W. Michigan Street, Indianapolis, IN 46202
  Richard Repasky; rrepasky@indiana.edu; University Information Technology Services, Indiana University, 2711 E. 10th St., Bloomington, IN 47408
  Peng Wang; pewang@indiana.edu; University Information Technology Services, Indiana University-Purdue University Indianapolis, 535 W. Michigan Street, Indianapolis, IN 46202
  Faisal Saied; faied@purdue.edu; Rosen Center for Advanced Computing, Purdue University, 302 W. Wood Street, West Lafayette, Indiana 47907
  Marlon Pierce; Community Grids Lab, Pervasive Technology Labs at Indiana University, 501 N. Morton Street, Bloomington, IN 47404
  Ross Aiken; rmaiken@us.ibm.com; IBM Corporation, 9229 Delegates Row, Precedent Office Park Bldg 81, Indianapolis, IN 46240
  Matthias Mueller; matthias.mueller@tu-dresden.de; Center for Information Services and High Performance Computing (ZIH), Dresden University of Technology, D-01062 Dresden, Germany
  Matthias Jurenz; Matthias.Jurenz@tu-dresden.de; Center for Information Services and High Performance Computing (ZIH), Dresden University of Technology, D-01062 Dresden, Germany
  Matthias Lieber; Matthias.Lieber@tu-dresden.de; Center for Information Services and High Performance Computing (ZIH), Dresden University of Technology, D-01062 Dresden, Germany

  24. Thank you • Any questions?
