
Presentation Transcript


  1. Status Update
  TeraGrid Science Advisory Board Meeting
  July 19, 2010
  Dr. Mike Norman, PI
  Dr. Allan Snavely, Co-PI

  2. Gordon Objective
  • Deploy a computational resource to the national community that is specifically designed for data-intensive computing
  • 245 TFlop, 1,024-node cluster based on the Intel Sandy Bridge processor
  • Large (2 TB) shared-memory "supernodes," each composed of 32 compute nodes
  • High-performance I/O subsystem based on enterprise-class SSDs
  • Low-latency, high-speed interconnect via a dual-rail QDR InfiniBand 3D torus network
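
A quick back-of-envelope check (the arithmetic is mine; the figures are the slides') ties these numbers together: 1,024 nodes at the 240 GFLOPS per-node peak quoted on the supernode slide reproduce the ~245 TFlop aggregate, and 32 nodes per supernode give 32 supernodes.

    # Back-of-envelope check of Gordon's quoted aggregate figures.
    NODES = 1024                 # compute nodes in the cluster
    GFLOPS_PER_NODE = 240        # per-node peak quoted on the supernode slide
    NODES_PER_SUPERNODE = 32

    peak_tflops = NODES * GFLOPS_PER_NODE / 1000   # 1 TFLOP = 1,000 GFLOPS
    supernodes = NODES // NODES_PER_SUPERNODE

    print(f"Aggregate peak: ~{peak_tflops:.0f} TFLOPS")  # ~246, vs. the quoted 245
    print(f"Supernodes: {supernodes}")                   # 32 supernodes of 32 nodes each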

  3. The Gordon Sweet Spot
  • Data Mining
    • De novo genome assembly from sequencer reads, and analysis of galaxies from cosmological simulations and observations
    • Federations of databases, and interaction-network analysis for drug discovery, social science, biology, epidemiology, etc.
  • Predictive Science
    • Solution of inverse problems in oceanography, atmospheric science, and seismology
    • Modestly scalable codes in quantum chemistry and structural engineering
  Common thread: large shared memory; a low-latency, fast interconnect; and a fast I/O system.

  4. Gordon Aggregate Capabilities

  5. Gordon Supernode Architecture
  • 32 Appro Green Blade compute nodes per supernode
    • Dual-processor Intel Sandy Bridge
    • 240 GFLOPS
    • 64 GB/node
    • Number of cores TBD
  • 2 Appro I/O nodes per supernode
    • Intel SSD drives, 4 TB each
    • 560,000 IOPS
  • ScaleMP vSMP virtual shared memory
    • 2 TB RAM aggregate (64 GB x 32)
    • 8 TB SSD aggregate (256 GB x 32)
  [Slide diagram: 240 GF compute nodes (64 GB RAM each) and a 4 TB SSD I/O node, tied together by vSMP memory virtualization]
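
To make the vSMP idea concrete: the layer presents the 32 blades as a single SMP host, so ordinary shared-memory code, with no MPI, should see the aggregated cores and the ~2 TB of RAM through the standard Linux counters. A minimal, illustrative sketch (this assumes vSMP exposes the pooled resources via the usual sysconf counters, which is its stated design goal; it is not a Gordon-specific measurement):

    # Illustrative sketch: a single process on a vSMP supernode should see
    # the pooled resources of all 32 blades via the usual Linux counters.
    import os

    cores = os.cpu_count()  # counts cores across every blade in the supernode
    ram_gb = os.sysconf("SC_PHYS_PAGES") * os.sysconf("SC_PAGE_SIZE") / 2**30
    print(f"One process sees {cores} cores and ~{ram_gb:.0f} GB RAM")  # ~2,048 GB expected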

  6. Before Gordon There is Dash
  • Dash has been deployed as a risk mitigator for Gordon
  • Dash is an Appro cluster that embodies the core architectural features of Gordon and provides a platform for testing, evaluation, and porting/optimizing applications
    • 64 nodes, dual-socket, 4-core Nehalem
    • 48 GB memory per node
    • 4 TB of Intel SLC flash (X25-E)
    • InfiniBand interconnect
    • vSMP Foundation supernodes
  • Using Dash for:
    • SSD testing (vendors, controllers, RAID, file systems)
    • 16-way vSMP acceptance testing
    • 32-way vSMP acceptance testing
    • Early user testing
    • Development of processes and procedures for systems administration, security, and networking
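
The SSD testing above centers on IOPS under random access. For flavor, here is a toy random-read probe (a sketch only: the path is a hypothetical placeholder, a real evaluation would use a dedicated tool such as fio, and reads served from the page cache make the result an upper bound rather than a device measurement):

    # Toy random-read probe in the spirit of the SSD tests listed above.
    import os, random, time

    PATH = "/ssd/testfile"   # hypothetical: a large pre-created file on the SSD under test
    BLOCK = 4096             # 4 KiB reads, the usual unit for quoting IOPS
    SECONDS = 5.0

    fd = os.open(PATH, os.O_RDONLY)
    blocks = os.fstat(fd).st_size // BLOCK
    ops, start = 0, time.time()
    while time.time() - start < SECONDS:
        os.pread(fd, BLOCK, random.randrange(blocks) * BLOCK)  # one random 4 KiB read
        ops += 1
    os.close(fd)
    print(f"~{ops / SECONDS:,.0f} random reads/s")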

  7. Dash TeraGrid Resource
  • Two 16-node virtual clusters:
    • SSD-only: 16 nodes; dual-socket Nehalem, 8 cores per node; 48 GB RAM; 1 TB of SSD (16 drives)
      • SSDs are local to the nodes
      • Standard queues available
    • vSMP + SSD: 16 nodes; dual-socket Nehalem, 8 cores per node; 48 GB RAM; 960 GB of SSD (15 drives)
      • SSDs are local to the nodes but are presented as a single file system via vSMP
      • The supernode is treated as a single shared resource
  • GPFS-WAN
  • An additional 32 nodes will be brought online in early August, after the 32-way vSMP acceptance testing is complete
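
From a user's point of view, the practical difference between the two clusters is that on the vSMP + SSD side the 15 node-local drives appear as one pooled file system. A simple capacity query can confirm that (a sketch; the mount point is a hypothetical placeholder, not a documented path):

    # Check the pooled capacity of the vSMP-presented SSD file system.
    import shutil

    MOUNT = "/ssd"   # hypothetical mount point of the aggregated file system
    total, used, free = shutil.disk_usage(MOUNT)
    print(f"{total / 10**9:.0f} GB total, {free / 10**9:.0f} GB free")  # ~960 GB expected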

  8. Gordon Timeline

  9. Dash Early User Success Stories
  • Palomar Transient Factory (astrophysics): large, random queries, with 100 new transients every minute; query performance increased by up to 161%
  • NIH Biological Networks Pathway Analysis: queries on graph data generate a great deal of random I/O, requiring significant IOPS; 186% speedup on Dash with vSMP
  • Protein Data Bank, Alignment Database: predictive science with queries on pair-wise comparisons and alignments of protein structures; 69% speedup on Dash
  • Supercomputing Conference (SC) 2009 HPC Storage Challenge winner; in 2010, two publications accepted and finalists for Best Paper and Best Student Paper
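
The percentages above read most naturally as multiplicative factors. A small conversion, under my assumption that an "X% speedup" means X% faster than the baseline:

    # Convert the quoted "% speedup" figures to multiplicative factors,
    # assuming "X% speedup" means 1 + X/100 times the baseline performance.
    quoted = {
        "Palomar Transient Factory queries": 161,
        "NIH pathway analysis (vSMP)": 186,
        "Protein Data Bank alignments": 69,
    }
    for workload, pct in quoted.items():
        print(f"{workload}: {1 + pct / 100:.2f}x baseline")
    # e.g. the 186% vSMP speedup corresponds to ~2.9x baseline performance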

  10. Dash/Gordon Documentation
  • Dash User Guide (SDSC site --> User Support --> Resources --> Dash): http://www.sdsc.edu/us/resources/dash/
  • TeraGrid Resource Catalog (TeraGrid site --> User Support --> Resources --> Compute & Viz Resources): https://www.teragrid.org/web/user-support/compute_resources
    • Gordon is mentioned under Dash's listing in the TG Resource Catalog as a future resource; it will receive its own entry as the production date nears
  • TeraGrid Knowledge Base, two articles (TeraGrid site --> Help & Support --> KB --> search on "Dash" or "Gordon"): https://www.teragrid.org/web/user-support/kb
    • "On the TeraGrid, what is Dash?"
    • "On the TeraGrid, what is Gordon?"

  11. Dash Allocations
  This project is getting at the heart of the value of Gordon. (Need more text here – why focus on this allocation?)

  12. Gordon/Dash Education, Outreach and Training Activities
  • Supercomputing Conference 2010, New Orleans, LA, November 13-19, 2010

  13. Grand Challenges in Data-Intensive Sciences
  October 26-29, 2010, San Diego Supercomputer Center, UC San Diego
  • Confirmed conference topics and speakers:
    • Needs and Opportunities in Observational Astronomy – Alex Szalay, JHU
    • Transient Sky Surveys – Peter Nugent, LBNL
    • Large Data-Intensive Graph Problems – John Gilbert, UCSB
    • Algorithms for Massive Data Sets – Michael Mahoney, Stanford U.
    • Needs and Opportunities in Seismic Modeling and Earthquake Preparedness – Tom Jordan, USC
    • Needs and Opportunities in Fluid Dynamics Modeling and Flow Field Data Analysis – Parviz Moin, Stanford U.
    • Needs and Emerging Opportunities in Neuroscience – Mark Ellisman, UCSD
    • Data-Driven Science in the Globally Networked World – Larry Smarr, UCSD
