
TeraGrid Coordination Meeting June 10, 2010


Presentation Transcript


  1. TeraGrid Forum Meeting, June 16, 2010 • TeraGrid Coordination Meeting, June 10, 2010

  2. The Gordon Sweet Spot • Data Mining • De novo genome assembly from sequencer reads & analysis of galaxies from cosmological simulations and observations. • Federations of databases and interaction network analysis for drug discovery, social science, biology, epidemiology, etc. • Predictive Science • Solution of inverse problems in oceanography, atmospheric science, & seismology. • Modestly scalable codes in quantum chemistry & structural engineering. These workloads need large shared memory; a low-latency, fast interconnect; and a fast I/O system.

  3. The Usual (HPC) Suspects are, well, suspect.

  4. Typical HPC I/O involves very little random I/O, yet random I/O is the sweet spot for SSDs and data-intensive computing • For example, a NERSC study* of 50 applications found: • Random access is rare in HPC applications; I/O is dominated by sequential operations. • Application I/O is dominated by append-only writes. • The majority of applications have adopted a one-file-per-processor approach to disk I/O, where each process of a parallel application writes to its own separate file rather than using parallel/shared I/O APIs to write from all of the processes into a single file. * Source: Characterizing and Predicting the I/O Performance of HPC Applications Using a Parameterized Synthetic Benchmark (Shalf et al., SC '08)
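
  The last bullet contrasts two write patterns. The fragment below is an illustrative sketch added for this transcript, not code from the presentation or the NERSC study: each MPI rank first writes its own file (the file-per-processor pattern the study found to dominate), then all ranks write disjoint offsets of a single shared file through MPI-IO. File names and sizes are arbitrary.

  /* Illustrative only: file-per-process writes vs. one shared file via MPI-IO. */
  #include <mpi.h>
  #include <stdio.h>
  #include <string.h>

  #define CHUNK 1048576   /* 1 MiB of output per rank */

  int main(int argc, char **argv) {
      MPI_Init(&argc, &argv);
      int rank;
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);

      static char buf[CHUNK];
      memset(buf, 'a' + (rank % 26), CHUNK);

      /* Pattern 1: one file per process (append-only POSIX writes). */
      char name[64];
      snprintf(name, sizeof name, "out.%06d", rank);
      FILE *fp = fopen(name, "wb");
      fwrite(buf, 1, CHUNK, fp);
      fclose(fp);

      /* Pattern 2: all ranks write disjoint offsets of a single shared file. */
      MPI_File fh;
      MPI_File_open(MPI_COMM_WORLD, "out.shared",
                    MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);
      MPI_File_write_at_all(fh, (MPI_Offset)rank * CHUNK, buf, CHUNK,
                            MPI_BYTE, MPI_STATUS_IGNORE);
      MPI_File_close(&fh);

      MPI_Finalize();
      return 0;
  }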

  5. Data Intensive Workshop, October 26-29, 2010 • Identify "Grand Challenges" in data-intensive science across a broad range of topics • Identify applications and disciplines that will benefit from Gordon's unique architecture and capabilities • Invite potential users of Gordon to speak and participate • Make leaders in data-intensive science aware of what SDSC is doing in this space • Raise awareness among disciplines poorly served by current HPC offerings • Better understand Gordon's niche in the data-intensive cosmos and potential usage modes • Logistics: ~100 attendees; at SDSC; including a 1-day hands-on session; plenary speakers; astronomy, geoscience, neuroscience, physics, engineering, social science, and data-related technologies

  6. Gordon Highlights • 245 TF; 1024 nodes; 64 GB/node (64 TB total) • Sandy Bridge processors • Dual socket • Core count TBD • 8 flops/clock/core via the AVX instruction set • 256 TB of enterprise Intel SSD via 64 Nehalem/Westmere I/O nodes (4 TB per node) • Dual-rail QDR InfiniBand 3D torus interconnect • Shared-memory supernodes via ScaleMP vSMP Foundation • 32 compute nodes per supernode • 128-node version launching in the fall • Message passing between supernodes coming • 4 PB Data Oasis disk
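
  As a quick sanity check (added for this transcript, not part of the deck), the headline figures follow directly from numbers quoted in the slides; the 240 GF per-node peak used below comes from the supernode slide that follows.

  /* Back-of-envelope check of the Gordon aggregate figures, using only
     numbers that appear in the deck. */
  #include <stdio.h>

  int main(void) {
      const int    nodes              = 1024;
      const double gf_per_node        = 240.0;  /* per-node peak, slide 7 */
      const int    gb_per_node        = 64;
      const int    io_nodes           = 64;
      const int    tb_ssd_per_io_node = 4;

      printf("peak  : %.1f TF\n", nodes * gf_per_node / 1000.0);  /* ~245 TF */
      printf("memory: %d TB\n",   nodes * gb_per_node / 1024);    /*  64 TB  */
      printf("flash : %d TB\n",   io_nodes * tb_ssd_per_io_node); /* 256 TB  */
      return 0;
  }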

  7. Gordon Supernode Architecture • 32 Appro GreenBlade compute nodes • Dual-processor Intel Sandy Bridge • 240 GFLOPS • 64 GB/node • # cores TBD • 2 Appro I/O nodes per 32-node supernode • Intel SSD drives, 4 TB each • 560,000 IOPS • ScaleMP vSMP virtual shared memory • 2 TB aggregate RAM (64 GB x 32) • 8 TB aggregate SSD (256 GB x 32) • [Slide diagram: 4 TB SSD I/O node and 240 GF compute nodes with 64 GB RAM each, joined by vSMP memory virtualization]
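
  IOPS figures like the 560,000 quoted here are typically measured with small random reads, the access pattern highlighted on slide 4. Below is a minimal random-read microbenchmark of that kind, added for this transcript as an illustration only; the file path is a placeholder, since the slides do not name the SSD mount point on the I/O nodes, and a single synchronous stream like this one understates device IOPS (real tests keep many requests in flight).

  /* Minimal random-read sketch; build with: cc -O2 iops.c -o iops */
  #define _GNU_SOURCE
  #include <fcntl.h>
  #include <stdio.h>
  #include <stdlib.h>
  #include <time.h>
  #include <unistd.h>

  int main(void) {
      const char  *path = "/ssd/scratch/testfile";  /* hypothetical SSD path */
      const size_t blk  = 4096;                     /* 4 KiB random reads */
      const long   n    = 100000;

      int fd = open(path, O_RDONLY | O_DIRECT);     /* bypass the page cache */
      if (fd < 0) { perror("open"); return 1; }

      off_t nblk = lseek(fd, 0, SEEK_END) / blk;
      void *buf;
      if (posix_memalign(&buf, blk, blk)) return 1; /* O_DIRECT needs alignment */

      struct timespec t0, t1;
      clock_gettime(CLOCK_MONOTONIC, &t0);
      for (long i = 0; i < n; i++) {
          off_t off = (off_t)(rand() % nblk) * blk; /* random block offset */
          if (pread(fd, buf, blk, off) < 0) { perror("pread"); return 1; }
      }
      clock_gettime(CLOCK_MONOTONIC, &t1);

      double s = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
      printf("%.0f random 4 KiB reads per second\n", n / s);
      free(buf);
      close(fd);
      return 0;
  }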

  8. Project Milestones • Dash is now a TeraGrid resource • Allocation processes • Allocated users • Account setup • Application environment • 16-way vSMP acceptance approved • SDSC is becoming a center of excellence for flash in HPC, working closely with Dr. Steve Swanson of UCSD's Center for Magnetic Recording Research (CMRR) • Education, Outreach and Training • Data Intensive Workshop set for October 26-29 at SDSC • NVM Workshop held at UCSD in April • SC '10 papers submitted • TeraGrid 2010 papers, tutorial, and BOF submitted • Data-intensive use cases being developed

  9. Production Dash as of April 1 • Two 16-node virtual clusters • SSD-only: 16 nodes; Nehalem, dual-socket, 8 cores; 48 GB RAM; 1 TB SSD (16) • SSDs are local to the nodes • Standard queues available • vSMP + SSD: 16 nodes; Nehalem, dual-socket, 8 cores; 48 GB RAM; 960 GB SSD (15) • SSDs are local to the nodes • Treated as a single shared resource • GPFS-WAN • An additional 32 nodes will be brought online after the vSMP 32-way acceptance testing in July

  10. Gordon Timeline

  11. The Road Ahead • Understanding data-intensive applications and how they can benefit from Gordon's unique architecture • Identifying new user communities • Education, Outreach and Training • Managing to the schedule and milestones • Track and assess flash technology developments • Education, Outreach and Training • I/O performance • Parallel file systems • InfiniBand/3D torus routing • Individual roles and responsibilities • Systems management processes • Education, Outreach and Training • Staffing ramp-up in October • Have fun doing this!

  12. TeraGrid Support has been Instrumental • Diane Baxter • Jeff Bennett • Leo Carson • Larry Diegel • Jerry Greenberg • Dave Hart • Jiahua He • Eva Hocks • Tom Hutton • Arun Jagatheesen • Adam Jundt • Richard Moore • Mike Norman • Wayne Pfeiffer • Susan Rathbun • Scott Sakai • Allan Snavely • Mark Sheddon • Shawn Strande • Mahidhar Tatineni • And many others…

  13. SDSC’s Summer Education Program • TeacherTech summer workshops http://education.sdsc.edu/teachertech • Conference of New Teachers in Genomics • Modeling Instruction in High School Physics: An Introduction • Introduction to Adobe Photoshop and the World of Digital Art • TeacherTECH Begins a Collaboration with UCSD-TV – Tune In! • Newton’s Laws of Gravity: From the Celestial to the Terrestrial • Earthquake Science: Beyond Static Images and Flat Maps • Student summer workshops http://education.sdsc.edu/teachertech/index.php?module=ContentExpress&func=display&ceid=18 • Exploring the World of Digital Art and Design • Introduction to Matlab: An Interactive Visual Math Experience • UCSD Biotechnology Academy • "Full Color Heroes" in Digital Art & Design: Comic Book Coloring! • 2D – 3D Insani-D! • 3D Photography: Experience It! • Photography + Photoshop = Fun! • Exploring Digital Photography and the Wonders of Photoshop • Introduction to Maya and 3D Modeling

  14. SDSC’s Summer Education Program (cont.) • Research Experience For High School Students (REHS) (21 students) http://education.sdsc.edu/teachertech/index.php?module=ContentExpress&func=display&ceid=37 • Supercomputer-based Workflow for Managing Large Biomedical Images • Refinement of Data Mining Software and Application to Space Plasmas for Data Analysis and Visualization • Sonification of UCSD Campus Energy Consumption • Visualization and 3D Content Creation • The Cooperative Association for Internet Data Analysis Web Development Intern • Documentation Assistant – Health Info Databases Project
