
SURA Presentation for IBM HW for the SURA GRID for 2007





Presentation Transcript


  1. SURA Presentation for IBM HW for the SURA GRID for 2007. Janis Landry-Lane (janisll@us.ibm.com), IBM World Wide Deep Computing

  2. AGENDA for the presentation
• IBM pSERIES offering for SURA for 2007 (dileep@us.ibm.com)
• IBM e1350 HS21 Blade offering for SURA for 2007 (jpappas@us.ibm.com)
• INTEL Clovertown performance benchmarks (Omar.stradella@intel.com, Michael.greenfield@intel.com)
• SURA services/collaboration (frankli@us.ibm.com, janisll@us.ibm.com)
• INTEL/IBM partnership (mark.e.spargo@intel.com)

  3. Power5+ p575 High Performance Computing Solutions for SURA. Dileep Bhattacharya, Product Manager, High-End System p Servers (dileep@us.ibm.com)

  4. Power5+ p575 Server and Rack: General Description

  5. p5 575 System Solution Characteristics
• Robust hardware with high-reliability components
• 16-CPU scalability within a node
• Low-latency, high-performance switch technology
• Industrial-strength OS and HPC software subsystems
• High compute-density packaging
• Ability to scale to very large configurations

  6. [Rack diagram] 0.97 TFlop solution for SURA: a bulk power assembly (BPA), eight 16-way (16W) 2U p5 575 nodes, and a 4U Federation switch (Fed SW)

  7. [Rack diagram] 1.7 TFlop solution for SURA: a bulk power assembly (BPA), fourteen 16-way (16W) 2U p5 575 nodes, and a 4U Federation switch (Fed SW)
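
The peak figures on slides 6 and 7 follow directly from node count x CPUs per node x clock x flops per cycle. A minimal sketch of that arithmetic, assuming a 1.9 GHz Power5+ clock and 4 floating-point operations per cycle per CPU (two fused multiply-add pipelines); neither value is stated on the slides:

```python
# Peak-FLOPS arithmetic for the two p5 575 rack configurations above.
# Assumptions (not stated on the slides): 1.9 GHz Power5+ clock and
# 4 flops/cycle/CPU (two FMA pipelines, 2 flops each per cycle).
GHZ = 1.9
FLOPS_PER_CYCLE = 4

def peak_tflops(nodes: int, cpus_per_node: int = 16) -> float:
    """Peak TFLOPS = nodes * CPUs/node * GHz * flops/cycle / 1000."""
    return nodes * cpus_per_node * GHZ * FLOPS_PER_CYCLE / 1000.0

print(f"8-node rack:  {peak_tflops(8):.2f} TFLOPS")   # 0.97, as on slide 6
print(f"14-node rack: {peak_tflops(14):.2f} TFLOPS")  # 1.70, as on slide 7
```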

  8. p5 575 Software
• AIX 5.3
• General Parallel File System (GPFS) with WAN support
• LoadLeveler
• Cluster Systems Management (CSM)
• Compilers (XL Fortran, XL C)
• Engineering and Scientific Subroutine Library (ESSL)
• IBM Parallel Environment (PE)
• Simultaneous Multi-Threading (SMT) support
• Virtualization, Micro-Partitioning, DLPAR

  9. Growing the SURAgrid: IBM e1350 Cluster Offerings. John Pappas, IBM Southeast Linux Cluster Sales, March 12, 2007

  10. Outline
• IBM HPC Offerings
• e1350 Overview
• SURA Offerings Approach
• SURA e1350 Configurations

  11. [Diagram] IBM SURA Offerings in 2006 and 2007, built on the IBM HPC Foundation Offerings

  12. What is an IBM Cluster 1350?

  13. An IBM portfolio of components that have been cluster-configured, tested, and work with a defined supporting software stack.
• Factory assembled
• Onsite installation
• One phone number for support
• Selection of options to customize your configuration, including Linux operating system (RHEL or SUSE), CSM, xCAT, GPFS, and Deep Computing Visualization (DCV)
[Diagram: IBM Cluster 1350 hardware. IBM servers as management, compute, and storage nodes; 8-way servers (3755, p5 520); 4-way servers (3550, 3650, 3950); 3455, 3655, 3755; p5 505; OpenPower 710; blade servers (HS21, LS21, JS21, QS20). Storage: IBM TotalStorage®, Fibre Channel, iSCSI, SCSI, SATA; ServeRAID storage software. Networks: Ethernet (10/100 MbE, 1000 MbE, 10 GbE), InfiniBand (1X, 4X), Myrinet®. Core technologies: Intel®, AMD®, Cell BE, PowerPC®, and IBM POWER5™ processors.]

  14. BladeCenter Efficiency
• BladeCenter helps clients simplify their IT infrastructure and gain operational efficiency: fewer outages, fewer fabric connections, fewer cables, less power, less cooling, smarter management

  15. Introducing the IBM BladeCenter HS21 XM
• Delivers leadership performance and efficiency for most applications
• Double the memory of the 30mm HS21 blade; integer, floating-point, and many finance-specific applications benefit greatly
• A new core architecture, dual-core processors, and fully buffered DIMMs all contribute to the gains
• Supports new solid-state Modular Flash Devices: starting with the IBM 4GB Modular Flash Drive, solid-state drives are optimized for durability, reliability, and power efficiency; RAID arrays can potentially be simplified, and flash drives virtually eliminate yet another potential point of failure inside the BladeCenter
• Designed to extract the most from multi-core processors: more memory, NICs, and I/O, designed for diskless operation and low power; 64-bit matched with up to 32GB of memory supports even the most demanding solutions
• 10Gb enabled for BladeCenter H; can support both PCI-X and PCI-Express I/O cards, as well as all the traditional I/O cards in the BladeCenter family

  16. IBM BladeCenter HS21 XM: A Closer Look
• 8 fully buffered (FB) DIMM slots; up to 32GB of memory per blade
• 2.5" SAS HDD (36, 73, or 146GB)
• 2 NICs: Broadcom 5708S (TOE enabled)
• Diskless ready: iSCSI and SAN boot for all OSes; support for the IBM Modular Flash Device 4GB
• Dual- and quad-core processors: 65W and 80W Woodcrest at 1.60-3.00 GHz; 80W Clovertown at 1.60-2.67 GHz
• Supports the Concurrent KVM Mezzanine Card (cKVM)
• Supports PEU2 and SIO expansion units
• Support for the new MSIM Combo Form Factor (CFF) card to double the port count per blade
[Diagram callouts: 8 standard FB DIMMs, CFF-V card, CFF-H card, IBM Modular Flash Drive, MCH nested between the CPUs]
HS21 XM availability: early March 2007

  17. Complete Systems Management for BladeCenter
• Integrated management for resource efficiency; automation for productivity
• Application management: LoadLeveler, compilers, libraries
• AMM with BladeCenter Address Manager: MAC address and WWN (I/O) address virtualization; manage, control, and install from a single point; Concurrent KVM; RAS; PowerExecutive
• xCAT or CSM: manage across chassis and other platforms; virtualize and optimize; stateless computing; maintain and update; manage/report policy

  18. CoolBlue™ from IBM: Energy Management Innovation
• Power has become a difficult limitation for many clients to manage; it takes a holistic approach to power to make a difference
• IBM mainframe-inspired thinking put to work inside BladeCenter
• Steps to extract the most from your data center: save power with smartly designed servers; plan better with accurate planning tools (Power Configurator); monitor and manage power with PowerExecutive™; use room-level solutions (Rear Door Heat Exchanger) as needed; plan for the future
Really interested in power? Download the Power and Cooling Customer Presentation: http://w3-1.ibm.com/sales/systems/portal/_s.155/254?navID=f220s220&geoID=All&prodID=BladeCenter&docID=bcleadershipPowCool or http://www-1.ibm.com/partnerworld/sales/systems/myportal/_s.155/307?navID=f220s240&geoID=All&prodID=System x&docID=bcleadershipPowCool

  19. SURA e1350 Offerings Approach
• Surveyed the SURA members for useful cluster configurations
• Keep it simple
• Leverage the latest hardware and cluster technology
• Minimize cost of ownership
• Provide a complete, integrated, ready-to-use cluster
• Leverage the existing MOU with SURA for the pSeries

  20. SURA e1350 Offerings Architecture
• Configuration basis: new IBM BladeCenter H, new HS21 XM blades, and Intel quad-core processors
• Create a 3 TFLOP and a 6 TFLOP cluster configuration
• 3 TFLOP: cost-conscious solution for HPC; one-rack solution using GigE interconnect; 1GB/core; combination management/user node with storage
• 6 TFLOP: performance-focused solution for HPC; two-rack solution using DDR InfiniBand; 2GB/core; combination management/user node with storage; optional SAN supporting 4Gbps storage at 4.6TB

  21. Common e1350 Features
• BladeCenter H-based chassis: redundant power supplies and fan units, Advanced Management Module, dual 10Gbps backplanes
• Fully integrated, tested, and installed e1350 cluster
• Onsite configuration, setup, and skills transfer from our Cluster Enablement Team
• Quad-core Intel processors (8 cores/node)
• Single point of support for the cluster
• Terminal server connection to every node
• IBM 42U Enterprise Racks with pull-out console monitor, keyboard, and mouse
• Redundant power and fans on all nodes
• 3-year onsite warranty: 9x5 next-day onsite for compute nodes; 24x7 4-hour onsite for the management node, switches, and racks (optional storage)

  22. 3 TFLOP e1350 Cluster
• 34 HS21 XM blade servers in 3 BladeCenter H chassis: dual quad-core 2.67 GHz Clovertown processors; 1GB memory per core; 73GB SAS disk per blade; GigE Ethernet to each blade with 10Gbit uplink; serial terminal server connection to every blade; redundant power/fans
• x3650 2U management/user node: dual quad-core 2.67 GHz Clovertown processors; 1GB memory per core; Myricom 10Gb NIC card; RAID controller with (6) 300GB 10K hot-swap SAS drives; redundant power/fans
• Force10 48-port GigE switch with 2 10Gb uplinks
• SMC 8-port 10Gb Ethernet switch
• (2) 32-port Cyclades terminal servers
• Red Hat ES 4 license and media kit (3 years of update support)
• Console manager, pull-out console, keyboard, mouse
• One 42U Enterprise rack, all cables, PDUs
• Shipping and installation
• 5 days of onsite consulting for configuration and skills transfer
• 3-year onsite warranty

  23. 6 TFLOP e1350 Cluster
• 70 HS21 XM blade servers in 5 BladeCenter H chassis: dual quad-core 2.67 GHz Clovertown processors; 2GB memory per core; 73GB SAS disk per blade; GigE Ethernet to each blade; DDR non-blocking Voltaire InfiniBand low-latency network; serial terminal server connection to every blade; redundant power/fans
• x3650 2U management/user node: dual quad-core 2.67 GHz Clovertown processors; 1GB memory per core; Myricom 10Gb NIC card; RAID controller with (6) 300GB 10K hot-swap SAS drives; redundant power/fans
• DDR non-blocking InfiniBand network
• Force10 48-port GigE switch
• (3) 32-port Cyclades terminal servers
• Red Hat ES 4 license and media kit (3 years of update support)
• Console manager, pull-out console, keyboard, mouse
• One 42U Enterprise rack, all cables, PDUs
• Shipping and installation
• 10 days of onsite consulting for configuration and skills transfer
• 3-year onsite warranty
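
As a cross-check, the "3 TFLOP" and "6 TFLOP" labels on slides 22 and 23 can be derived from the blade counts. A rough sketch, assuming 4 double-precision flops per cycle per core (Core microarchitecture) and a 2.66 GHz clock (the Xeon X5355, which the slides round to 2.67 GHz); both assumptions are mine, not stated on the slides:

```python
# Peak-FLOPS and aggregate-memory arithmetic for the two e1350 configurations.
# Assumptions: 4 DP flops/cycle/core (Core microarchitecture) and a
# 2.66 GHz clock (Xeon X5355; the slides round this to 2.67 GHz).
GHZ = 2.66
FLOPS_PER_CYCLE = 4
CORES_PER_BLADE = 8  # two quad-core Clovertown sockets per blade

def peak_tflops(blades: int) -> float:
    return blades * CORES_PER_BLADE * GHZ * FLOPS_PER_CYCLE / 1000.0

def memory_gb(blades: int, gb_per_core: int) -> int:
    return blades * CORES_PER_BLADE * gb_per_core

print(f"34 blades: {peak_tflops(34):.2f} peak TFLOPS, {memory_gb(34, 1)} GB")
# -> 2.89 peak TFLOPS, 272 GB (the '3 TFLOP' cluster)
print(f"70 blades: {peak_tflops(70):.2f} peak TFLOPS, {memory_gb(70, 2)} GB")
# -> 5.96 peak TFLOPS, 1120 GB (the '6 TFLOP' cluster)
```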

  24. 6 TFLOP e1350 Cluster Storage Option
• x3650 storage node: dual quad-core 2.67 GHz Clovertown processors; 1GB memory per core; Myricom 10Gb NIC card; (2) 3.5" 73GB 10K hot-swap SAS drives; (2) IBM 4Gbps FC dual-port PCI-E HBAs; redundant power/fans; 3-year 24x7 4-hour onsite warranty
• DS4700 storage subsystem: 4Gbps Fibre Channel performance; EXP810 expansion system; (32) 4Gbps FC 146.8GB/15K Enhanced Disk Drive Modules (E-DDM); 4.6TB total storage capacity

  25. Why the IBM SURA Cluster Offerings
• Outstanding performance at 2.67 GHz; industry-leading quad-core
• 3 TFLOP: 2.89 peak TFLOPS, 1.46 TFLOPS estimated actual, 50% efficiency
• 6 TFLOP: 5.96 peak TFLOPS, 4.29 TFLOPS estimated actual, 72% efficiency
• Chassis-based switches make modular growth simpler
• Redundant power and fans
• 38% less power and cooling required for the BladeCenter solution than for a 1U rack-mount cluster
• Smaller footprint
• Stateless computing with xCAT or CSM
• Academic Initiative: CSM, LoadLeveler, compilers
• Complete, integrated, installed cluster with onsite skills transfer
• Application migration support from IBM
• Single point of support for 3 years
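
The efficiency percentages quoted above are simply the ratio of estimated actual to peak TFLOPS; a quick check using the slide's own figures:

```python
# Efficiency = estimated actual TFLOPS / peak TFLOPS (figures from slide 25).
for name, actual, peak in (("3 TFLOP", 1.46, 2.89), ("6 TFLOP", 4.29, 5.96)):
    print(f"{name}: {actual / peak:.1%}")
# 3 TFLOP: 50.5% (rounded to 50% on the slide); 6 TFLOP: 72.0%
```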

  26. HPC performance slides for SURA comparing Clovertown and other offerings. Michael Greenfield, Principal Software Engineer, Enterprise System Software Division, Intel. Office: 253 371 7154

  27.-30. [Benchmark charts; the chart images are not preserved in this transcript. Each slide carried the standard Intel performance disclaimer:] Performance tests and ratings are measured using specific computer systems and/or components and reflect the approximate performance of Intel® products as measured by those tests. Any difference in system hardware or software design or configuration may affect actual performance. Buyers should consult other sources of information to evaluate the performance of systems or components they are considering purchasing. For more information on performance tests and on the performance of Intel products, see http://www.intel.com/performance/resources/benchmark_limitations.htm or call (U.S.) 1-800-628-8686 or 1-916-356-3104. * Other brands and names may be claimed as the property of others.

  31. Performance Comparison of the Intel® Xeon® Processor 5160 and Intel® Xeon® Processor X5355 for Life Sciences Applications. Omar G. Stradella, PhD, Life Sciences Applications Engineer, Intel Corporation

  32. HPC Life Sciences Applications [diagram]: computational chemistry; sequence analysis and biological databases; docking; de novo design; secondary structure prediction; QSAR, QSPR; pharmacophore modeling, shape matching; homology modeling; pathway analysis; X-ray and NMR. Focus for today's presentation: 7 applications in computational chemistry and bioinformatics

  33. Summary of Comparison Platforms

  34. Intel relative performance for Life Sciences apps. Clovertown relative performance compared to Woodcrest (one thread per core): Clovertown is 34-70% better than Woodcrest across computational chemistry and bioinformatics. [Chart: higher is better. Source: Intel internal measurement. Standard Intel performance disclaimer as on slides 27-30. * Other brands and names may be claimed as the property of others.]

  35. Gaussian* and GAMESS* relative performance. Clovertown relative performance compared to Woodcrest (one thread per core). Gaussian* (Gaussian, Inc.), version 03-D.01; GAMESS* (Iowa State University), version 12 Apr 2006. [Chart: higher is better; numbers on bars are elapsed times in seconds. Source: Intel internal measurement. Standard Intel performance disclaimer as on slides 27-30.]

  36. Amber* and GROMACS* relative performance. Clovertown relative performance compared to Woodcrest (one thread per core). Amber* (UCSF), version 9; GROMACS* (Groningen University), version 3.3. [Chart: higher is better; numbers on bars are elapsed times in seconds. Source: Intel internal measurement. Standard Intel performance disclaimer as on slides 27-30.]

  37. Intel scalability assessment for Life Sciences apps. Parallel speedups vs. 1 thread for computational chemistry and bioinformatics. [Chart: higher is better. Source: Intel internal measurement. Standard Intel performance disclaimer as on slides 27-30.]

  38. Summary
• On fully subscribed systems, Clovertown shows 34-70% better performance than Woodcrest
• Clovertown and Woodcrest scalabilities on 2 and 4 cores are the same
• Clovertown parallel speedups on 8 cores range from 5x to 7.8x (relative to 1 core)
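
To put the quoted speedups in perspective, parallel efficiency is speedup divided by core count; a small illustrative calculation (the efficiency figures are derived here, not taken from the slides):

```python
# Parallel efficiency for the 8-core speedup range quoted on slide 38.
CORES = 8
for speedup in (5.0, 7.8):
    print(f"{speedup}x on {CORES} cores -> {speedup / CORES:.1%} efficiency")
# 5.0x -> 62.5%, 7.8x -> 97.5%
```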

  39. SURA Services: Our Track Record. Presentation for the SURA GRID for 2007. Frank N. Li (frankli@us.ibm.com), IBM Deep Computing, Americas

  40. IBM SURA Install Team
• IBM Architect and Solution Team: cross-brand architecture, solutioning, and support (Brent Kranendonk, Frank Li)
• IBM Cluster Enablement Team (CET): implements complex HPC systems, both pSeries and Linux x86 clusters, covering hardware (server, cluster, storage, tape) and software (OS, cluster management tools, scheduler, GPFS) (Glen Corneau, Steve Cary, Larry Cluck)
• IBM Advanced Technology Services (ATS) Team: system testing (HPL) and performance tuning (Joanna Wong)
• IBM Deep Computing Technical Team: assists with migrating mission-critical applications and benchmarking (Carlos Sosa and others)
• IBM Grid Enablement for SURAgrid: Martin Maldonado and Chris McMahon
• Successful SURA installations at Georgia State University (GSU) and Texas A&M (TAMU)

  41. Cluster Enablement Team
The xSeries Linux Cluster Enablement Team (CET) is a full-service enablement team providing customers with direct access to IBM experts skilled in the implementation of Linux clustering hardware and software technologies. CET provides the following types of clustering engagements:
• Pre-configuration and cluster burn-in at our manufacturing site or the customer's location
• Integration with existing clusters and cluster software upgrades
• Software installation, including OS, cluster management, file system, compilers, schedulers, and customer applications
• Executing customer acceptance testing
• Installing storage and GPFS front-ends
• Onsite project management
• Customer training/education
Each CET project is professionally managed by a dedicated project manager who ensures efficient delivery and deployment of your cluster. Each offering includes rigorous burn-in testing, and the staging facility is conveniently located near IBM's manufacturing site. CET offerings can be added to your next e1350 order using part number 26K7785 (1 day of CET consulting).

  42. IBM SURA Collaborations
• IBM CET installations
• LSU/IBM performance work with the ADCIRC application
• U. Miami/IBM commitment to work on a parallel version of ADCIRC
• U. Miami/IBM commitment to optimize HYCOM and WAM
• TAMU request for help on MM5 and WRF
• Phil Bogden presents in the IBM booth at SC06

  43. SURA Partnership with INTEL and IBM. Presentation for the SURA GRID for 2007. Mark Spargo (mark.e.spargo@intel.com), Intel/IBM Relationship Executive

  44. Intel & IBM: Delivering Together
• Proven track record of delivering innovative solutions
• IBM is the fastest-growing Intel server vendor
• IBM/Intel BladeCenter collaboration
• Enterprise X-Architecture platform validation and development collaboration
• Jointly delivering superior server solutions with exceptional price/performance; the collaboration spans design, development, manufacturing, and marketing & sales

  45. IBM & Intel: Industry Collaboration [diagram]: founders of Geneseo; co-inventors of SMASH; IDF San Francisco & Taiwan; commitment to 4th-generation technology supporting the quad-core Xeon® MP; BladeCenter products & openness; virtualization (vConsolidate and others)
