
SDSC RP Update TeraGrid Roundtable 01-14-10



Presentation Transcript


  1. SDSC RP Update TeraGrid Roundtable 01-14-10

  2. Reviewing Dash
  • Unique characteristics:
    • A pre-production/evaluation “data-intensive” supercomputer based on SSD flash memory and virtual shared memory
    • Nehalem processors
  • Integrating into TeraGrid:
    • Add to TeraGrid Resource Catalog
    • Target friendly users interested in exploring unique capabilities
    • Available initially for start-up allocations (March 2010)
    • As it stabilizes, and depending on user interest, evaluate more routine allocations at the TRAC level
    • Appropriate CTSS kits will be installed
    • Planned to support TeraGrid wide-area filesystem efforts (GPFS-WAN, Lustre-WAN)

  3. Introducing Gordon (SDSC’s Track 2d System)
  • Unique characteristics:
    • A “data-intensive” supercomputer based on SSD flash memory and virtual shared memory
    • Emphasizes memory and I/O over FLOPS
    • A system designed to accelerate access to the massive databases being generated in all fields of science, engineering, medicine, and social science
    • Sandy Bridge processors
  • Integrating into TeraGrid:
    • Will be added to the TeraGrid Resource Catalog
    • Appropriate CTSS kits will be installed
    • Planned to support TeraGrid wide-area filesystem efforts
    • Coming summer 2011

  4. The Memory Hierarchy
  • Potential 10x speedup for random I/O to large files and databases
  • [Diagram: memory hierarchy; the flash SSD tier offers O(TB) capacity at a latency on the order of 1,000 cycles]
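
For context on the 10x claim, here is a minimal back-of-the-envelope sketch in Python. The latency constants are generic assumptions for illustration, not Dash or Gordon measurements; it simply shows how random 4 KB read rates compare between a flash SSD and a spinning disk.

```python
# Back-of-the-envelope comparison of random 4 KB reads from flash vs. spinning
# disk. The latency figures are generic assumptions, not measured Dash/Gordon
# numbers.

FLASH_LATENCY_S = 100e-6    # assume ~100 microseconds per random flash read
DISK_LATENCY_S = 7e-3       # assume ~7 ms average seek + rotation for disk
READ_SIZE_BYTES = 4 * 1024  # 4 KB transfers, as in the IOR runs on slide 10

for name, latency in (("flash SSD", FLASH_LATENCY_S), ("spinning disk", DISK_LATENCY_S)):
    iops = 1.0 / latency                      # one outstanding request at a time
    mb_per_s = iops * READ_SIZE_BYTES / 1e6
    print(f"{name:13s}: {iops:8.0f} IOPS, {mb_per_s:6.1f} MB/s at 4 KB")
```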

  5. Gordon Architecture: “Supernode”
  • 32 Appro Extreme-X compute nodes
    • Dual-processor Intel Sandy Bridge
    • 240 GFLOPS
    • 64 GB RAM
  • 2 Appro Extreme-X I/O nodes
    • Intel SSD drives
    • 4 TB each
    • 560,000 IOPS
  • ScaleMP vSMP virtual shared memory
    • 2 TB RAM aggregate
    • 8 TB SSD aggregate
  • [Diagram: supernode built from 240 GF / 64 GB compute nodes and a 4 TB SSD I/O node, joined by vSMP memory virtualization]
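
The supernode aggregates quoted above follow directly from the per-node figures; a quick sketch of the arithmetic (the ~7.7 TFLOPS total is derived here, not quoted on the slide):

```python
# Aggregate one Gordon supernode from the per-node figures on this slide.
COMPUTE_NODES = 32
IO_NODES = 2
GFLOPS_PER_NODE = 240       # dual-socket Sandy Bridge compute node
RAM_GB_PER_NODE = 64
SSD_TB_PER_IO_NODE = 4

ram_tb = COMPUTE_NODES * RAM_GB_PER_NODE / 1024    # 2 TB RAM aggregate
ssd_tb = IO_NODES * SSD_TB_PER_IO_NODE             # 8 TB SSD aggregate
tflops = COMPUTE_NODES * GFLOPS_PER_NODE / 1000    # ~7.7 TFLOPS (derived)

print(f"{ram_tb:.0f} TB RAM, {ssd_tb} TB SSD, {tflops:.2f} TFLOPS per supernode")
```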

  6. Gordon Architecture: Full Machine
  • 32 supernodes = 1,024 compute nodes
  • Dual-rail QDR InfiniBand network
    • 3D torus (4x4x4)
  • 4 PB rotating-disk parallel file system
    • >100 GB/s
  • [Diagram: 32 supernodes on the torus network, connected to the disk subsystem]
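
Scaling the per-supernode figures out to the full machine gives rough whole-system totals. The DRAM and flash totals below are derived from the earlier slides under the assumption that every supernode is fully populated; they are not quoted on this slide.

```python
# Rough full-machine totals, derived from the supernode figures (illustrative).
SUPERNODES = 32
COMPUTE_PER_SN, IO_PER_SN = 32, 2
RAM_GB_PER_NODE, SSD_TB_PER_IO_NODE = 64, 4

compute_nodes = SUPERNODES * COMPUTE_PER_SN                 # 1024, as on the slide
dram_tb = compute_nodes * RAM_GB_PER_NODE / 1024            # ~64 TB DRAM (derived)
flash_tb = SUPERNODES * IO_PER_SN * SSD_TB_PER_IO_NODE      # ~256 TB flash (derived)

print(f"{compute_nodes} compute nodes, ~{dram_tb:.0f} TB DRAM, ~{flash_tb} TB flash")
```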

  7. Comparing Dash and Gordon systems
  • Doubling capacity halves accessibility to any random data on a given medium
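
Reading “accessibility” as random IOPS per unit of capacity makes the point concrete: if a device’s random IOPS stays roughly fixed while its capacity doubles, the time to touch every block at random doubles, i.e. accessibility halves. A small sketch with purely illustrative numbers:

```python
# Time to read an entire device once via random 4 KB I/O at a fixed IOPS rate.
# Doubling capacity at the same IOPS doubles this time ("halves accessibility").
# The 100,000 IOPS figure is an illustrative assumption, not a Dash/Gordon spec.

def hours_to_touch_all(capacity_tb: float, iops: float, io_bytes: int = 4096) -> float:
    total_ios = capacity_tb * 1e12 / io_bytes
    return total_ios / iops / 3600

for tb in (1, 2, 4):
    print(f"{tb} TB at 100,000 IOPS: {hours_to_touch_all(tb, 100_000):.1f} hours")
```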

  8. Data mining applications will benefit from Gordon
  • De novo genome assembly from sequencer reads & analysis of galaxies from cosmological simulations and observations
    • Will benefit from large shared memory
  • Federations of databases and interaction-network analysis for drug discovery, social science, biology, epidemiology, etc.
    • Will benefit from low-latency I/O from flash

  9. Data-intensive predictive science will benefit from Gordon
  • Solution of inverse problems in oceanography, atmospheric science, & seismology
    • Will benefit from a balanced system, especially large RAM per core & fast I/O
  • Modestly scalable codes in quantum chemistry & structural engineering
    • Will benefit from large shared memory

  10. We won the SC09 Data Challenge with Dash!
  • With these numbers:
    • IOR, 4 KB transfers:
      • RAMFS: 4 million+ IOPS on up to 0.75 TB of DRAM (one supernode’s worth)
      • Flash: 88K+ IOPS on up to 1 TB of flash (one supernode’s worth)
    • Sped up Palomar Transients database searches by 10x to 100x
    • Best IOPS per dollar
  • Since then we have boosted flash IOPS to 540K, hitting our 2011 performance targets
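
As a sanity check, IOPS at a 4 KB transfer size converts directly to bandwidth. The sketch below only restates the figures quoted on this slide in GB/s; the bandwidths are derived, not additional measurements.

```python
# Convert the quoted IOR 4 KB IOPS figures into equivalent bandwidth.
XFER_BYTES = 4 * 1024  # 4 KB transfers, per the IOR runs

quoted_iops = {
    "RAMFS, one supernode": 4_000_000,   # "4 million+" IOPS
    "flash, SC09 run":         88_000,   # "88K+" IOPS
    "flash, after tuning":    540_000,   # later 540K IOPS figure
}

for label, iops in quoted_iops.items():
    gb_per_s = iops * XFER_BYTES / 1e9
    print(f"{label:22s}: {iops:>9,} IOPS = {gb_per_s:5.2f} GB/s")
```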

  11. Deployment Schedule
  • Summer 2009 – present
    • Internal evaluation and testing with internal apps – SSD and vSMP
  • Starting ~March 2010
    • Dash will be allocated via start-up requests by friendly TeraGrid users
  • Summer 2010
    • Expect to change status to an allocable system starting ~October 2010 via TRAC requests
    • Preference given to applications that target the unique technologies of Dash
  • October 2010 – June 2011
    • Operate Dash as an allocable TeraGrid resource, available through the normal POPS/TRAC cycles, with appropriate caveats about preferred applications and friendly-user status
    • Help fill the SMP gap created by the Altix systems being retired in 2010
  • March 2011 – July 2011
    • Gordon build and acceptance
  • July 2011 – June 2014
    • Operate Gordon as an allocable TeraGrid resource, available through the normal POPS/TRAC cycles

  12. Consolidating Archive Systems
  • SDSC has historically operated two archive systems: HPSS and SAM-QFS
  • Due to budget constraints, we’re consolidating to one: SAM-QFS
  • We’re currently migrating HPSS user data to SAM-QFS
  • [Timeline diagram, Jul 2009 → Mid 2010 → Mar 2011 → Jun 2013: HPSS moves from read/write to read-only and is then retired; SAM-QFS moves from read/write to “legacy data read-only, allocated data read/write,” then to read-only with longer-term status TBD. Hardware (6 silos / 12 PB / 64 tape drives and 2 silos / 6 PB / 32 tape drives) remains unchanged throughout.]
