NCSA is the Leading Edge Site for the National Computational Science Alliance
www.ncsa.uiuc.edu
Scientific Applications Continue to Require Exponential Growth in Capacity
[Chart: memory requirement in bytes (10^8 to 10^14) versus machine requirement in FLOPS (10^8 to 10^20). Plotted applications: Molecular Dynamics for Biological Molecules, Atomic/Diatomic Interaction, Computational Cosmology, QCD, 100-year climate model in hours, Turbulent Convection in Stars. Reference levels: 1995 NSF Capability, 2000 NSF Leading Edge, NSF in 2004 (Projected), ASCI in 2004. Legend: Recent Computations by NSF Grand Challenge Research Teams; Next Step Projections by NSF Grand Challenge Research Teams; Long Range Projections from Recent Applications Workshop.]
From Bob Voigt, NSF
The Promise of the Teraflop - From Thunderstorm to National-Scale Simulation
Simulation by Wilhelmson et al.; figure from Supercomputing and the Transformation of Science, Kaufmann and Smarr, Freeman, 1993
Accelerated Strategic Computing Initiative is Coupling DOE Defense Labs to Universities
• Access to ASCI Leading Edge Supercomputers
• Academic Strategic Alliances Program
• Data and Visualization Corridors
http://www.llnl.gov/asci-alliances/centers.html
Comparison of the DoE ASCI and the NSF PACI Origin Array Scale Through FY99
• Los Alamos Origin System, FY99: 5,000-6,000 processors
• NCSA Proposed System, FY99: 6x128 + 4x64 = 1,024 processors
www.lanl.gov/projects/asci/bluemtn/Hardware/schedule.html
NCSA Combines Shared Memory Programming with Massive Parallelism
[Images: CM-2, CM-5]
Future Upgrade Under Negotiation with NSF
The Exponential Growth of NCSA's SGI Shared Memory Supercomputers - Doubling Every Nine Months!
[Chart of system capacity over time: Challenge, Power Challenge, Origin, SN1]
TOP500 Systems by Vendor
[Stacked chart, Jun-93 through Jun-98: number of systems (0-500) by vendor - CRI, SGI, IBM, HP, Convex, Sun, TMC, Intel, DEC, Japanese vendors, other]
TOP500 Reports: http://www.netlib.org/benchmark/top500.html
Why NCSA Switched From Vector to RISC Processors
[Histogram: number of users versus average user MFLOPS on the Cray Y-MP4/64, March 1992 - February 1993, for users with > 0.5 CPU hour. The 1992 supercomputing community averaged 70 MFLOPS; peak speeds of the Y-MP1 and the MIPS R8000 are marked for comparison.]
Replacement of Shared Memory Vector Supercomputers by Microprocessor SMPs
[Chart, Jun-93 through Jun-98: Top500 installed supercomputers (0-500) by class - MPP, SMP/DSM, PVP]
TOP500 Reports: http://www.netlib.org/benchmark/top500.html
Top500 Shared Memory Systems
[Two charts, Jun-93 through Jun-98, number of systems by region (USA, Japan, Europe): SMP + DSM systems (microprocessors) rising, PVP systems (vector processors) declining]
TOP500 Reports: http://www.netlib.org/benchmark/top500.html
Simulation of the Evolution of the Universe on a Massively Parallel Supercomputer
Virgo Project - Evolving a Billion Pieces of Cold Dark Matter in a Hubble Volume
[Images at 4 Billion Light Year and 12 Billion Light Year scales]
688-processor CRAY T3E at the Garching Computing Centre of the Max-Planck-Society
http://www.mpg.de/universe.htm
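The heart of a dark-matter simulation like the Virgo run is the gravitational force calculation. The sketch below shows only that core idea as a direct O(N^2) summation with a softening length; the constants and the function name are illustrative assumptions, and a production code at this scale would use tree or particle-mesh methods rather than direct summation.

```python
# Minimal sketch of an N-body force kernel: direct-summation gravitational
# accelerations with a softening length (G, EPS are assumed code units).

G = 1.0        # gravitational constant in code units (assumption)
EPS = 0.01     # softening length to avoid singular close encounters (assumption)

def accelerations(pos, mass):
    """Return the acceleration on each particle due to all other particles."""
    n = len(pos)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            dx = [pos[j][k] - pos[i][k] for k in range(3)]
            r2 = sum(d * d for d in dx) + EPS * EPS
            inv_r3 = r2 ** -1.5
            for k in range(3):
                acc[i][k] += G * mass[j] * dx[k] * inv_r3
    return acc
```

Because the softened pairwise forces are equal and opposite, the total momentum change summed over all particles is zero, which is a useful correctness check.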
Limitations of Uniform Grids for Complex Scientific and Engineering Problems
Gravitation causes a continuous increase in density until there is a large mass in a single grid zone.
512x512x512 run on a 512-node CM-5
Source: Greg Bryan, Mike Norman, NCSA
Use of Shared Memory Adaptive Grids to Achieve Dynamic Load Balancing
64x64x64 run with seven levels of adaption on an SGI Power Challenge, locally equivalent to 8192x8192x8192 resolution
Source: Greg Bryan, Mike Norman, John Shalf, NCSA
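The idea behind the adaptive grids above can be sketched in a few lines: flag zones whose density exceeds a threshold and subdivide only those. This is a toy 1-D stand-in (the names and threshold are assumptions, and the real cosmology code works on 3-D patch hierarchies), but it shows why seven levels of factor-of-two refinement make a 64-zone base grid locally equivalent to 8192 zones: 64 x 2^7 = 8192.

```python
# Toy sketch of adaptive refinement on a flat 1-D grid: zones above a
# density threshold are split in two, each child keeping the parent's
# density (so mass, density x volume, is conserved).

REFINE_FACTOR = 2          # each flagged zone splits into 2 subzones (1-D)
DENSITY_THRESHOLD = 4.0    # refinement criterion (assumption)

def refine(grid):
    """Return a finer grid in which flagged zones are subdivided."""
    new = []
    for rho in grid:
        if rho > DENSITY_THRESHOLD:
            new.extend([rho] * REFINE_FACTOR)   # split the dense zone
        else:
            new.append(rho)                     # keep the coarse zone
    return new

coarse = [1.0, 2.0, 9.0, 3.0]
fine = refine(coarse)   # only the dense zone (9.0) is subdivided
```

Applying `refine` repeatedly to the densest region gives the multi-level hierarchy the slide describes, with work concentrated where the collapse is happening.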
Extreme and Large PIs Dominate Usage of the NCSA Origin, January through April 1998
Disciplines Using the NCSA Origin 2000 - CPU-Hours in March 1995
Solving 2D Navier-Stokes Kernel - Performance of Scalable Systems
Preconditioned Conjugate Gradient Method with Multi-level Additive Schwarz Richardson Preconditioner (2D 1024x1024)
Source: Danesh Tafti, NCSA
A Variety of Discipline Codes - Single Processor Performance, Origin vs. T3E
Alliance PACS Origin2000 Repository
Kadin Tseng, BU; Gary Jensen, NCSA; Chuck Swanson, SGI
John Connolly, U Kentucky, Developing Repository for HP Exemplar
http://scv.bu.edu/SCV/Origin2000/
High-End Architecture 2000 - Scalable Clusters of Shared Memory Modules, Each 4 Teraflops Peak
• NEC SX-5: 32 x 16 vector processor SMP, 512 processors, 8 gigaflop peak per processor
• IBM SP: 256 x 16 RISC processor SMP, 4096 processors, 1 gigaflop peak per processor
• SGI Origin follow-on: 32 x 128 RISC processor DSM, 4096 processors, 1 gigaflop peak per processor
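The peak figures above all follow from the same arithmetic: modules x processors-per-module x per-processor gigaflops. A quick sanity check (system names and numbers taken from the slide):

```python
# Peak performance arithmetic for the three year-2000 candidate systems:
# (modules x processors per module, peak gigaflops per processor).

systems = {
    "NEC SX-5":             (32 * 16,  8.0),   # 512 vector processors
    "IBM SP":               (256 * 16, 1.0),   # 4096 RISC processors
    "SGI Origin follow-on": (32 * 128, 1.0),   # 4096 RISC processors
}

for name, (procs, gf_per_proc) in systems.items():
    peak_tf = procs * gf_per_proc / 1000.0     # gigaflops -> teraflops
    print(f"{name}: {procs} processors, {peak_tf:.3f} TF peak")
```

All three land at 4.096 TF, i.e. the "4 Teraflops Peak" the slide rounds to, reached by very different balances of processor count and per-processor speed.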
Emerging Portable Computing Standards • HPF • MPI • OpenMP • Hybrids of MPI and OpenMP
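The hybrid model in the last bullet maps MPI onto the distributed level (between shared-memory modules) and OpenMP onto the shared-memory level (within a module). The sketch below is only a pure-Python analogy of that two-level decomposition, using plain loops for the "ranks" and a thread pool for the intra-module level; no actual MPI or OpenMP is involved, and a real hybrid code would use `MPI_Allreduce` plus `omp parallel for` in C or Fortran.

```python
# Illustrative two-level decomposition in the spirit of an MPI+OpenMP
# hybrid: the outer loop stands in for MPI ranks (one per shared-memory
# module); a thread pool inside each "rank" stands in for OpenMP
# loop-level parallelism.

from concurrent.futures import ThreadPoolExecutor

def rank_partial_sum(data, num_threads=4):
    """Shared-memory level: split this rank's data across threads."""
    chunk = (len(data) + num_threads - 1) // num_threads
    pieces = [data[i:i + chunk] for i in range(0, len(data), chunk)]
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        return sum(pool.map(sum, pieces))

# Distributed-memory level: each "rank" owns a disjoint slice of the data.
data = list(range(1000))
num_ranks = 4
per_rank = len(data) // num_ranks
partials = [rank_partial_sum(data[r * per_rank:(r + 1) * per_rank])
            for r in range(num_ranks)]
total = sum(partials)   # stands in for the final MPI reduction
```

The design point is the same as on the slide: coarse-grained message passing between modules, fine-grained shared-memory parallelism within them.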
Basket of Applications - Average Performance as Percentage of Linpack Performance
Application codes: CFD, Biomolecular, Chemistry, Materials, QCD
[Chart values: 14%, 19%, 22%, 25%, 26%, 33%]
Harnessing Distributed UNIX Workstations - University of Wisconsin Condor Pool
[Chart: Condor cycles]
CondorView, courtesy of Miron Livny, Todd Tannenbaum (UWisc)
NT Workstation Shipments Rapidly Surpassing UNIX Source: IDC, Wall Street Journal, 3/6/98
First Scaling Testing of ZEUS-MP on CRAY T3E and Origin vs. NT Supercluster
"Supercomputer performance at mail-order prices" -- Jim Gray, Microsoft
• Alliance Cosmology Team
• Andrew Chien, UIUC
• Rob Pennington, NCSA
ZEUS-MP Hydro Code Running Under MPI
access.ncsa.uiuc.edu/CoverStories/SuperCluster/super.html
NCSA NT Supercluster Solving Navier-Stokes Kernel
Single processor performance: MIPS R10k 117 MFLOPS; Intel Pentium II 80 MFLOPS
Preconditioned Conjugate Gradient Method with Multi-level Additive Schwarz Richardson Preconditioner (2D 1024x1024)
Danesh Tafti, Rob Pennington, Andrew Chien, NCSA
Near Perfect Scaling of Cactus - 3D Dynamic Solver for the Einstein GR Equations
Cactus was developed by Paul Walker, MPI-Potsdam, UIUC, NCSA
Ratio of GFLOPs: Origin = 2.5x NT SC
Danesh Tafti, Rob Pennington, Andrew Chien, NCSA
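"Near perfect scaling" has a precise meaning: speedup close to the processor count, i.e. parallel efficiency close to 1. The sketch below shows the arithmetic; the GFLOP rates used are hypothetical placeholders, not the measured Cactus numbers from the slide.

```python
# Parallel speedup and efficiency from measured aggregate rates.

def scaling(rate_1proc, rate_nproc, n):
    """Return (speedup, efficiency) for an n-processor run."""
    speedup = rate_nproc / rate_1proc
    efficiency = speedup / n
    return speedup, efficiency

# Hypothetical rates: 0.1 GFLOPs on 1 processor, 12.0 GFLOPs on 128.
s, e = scaling(0.1, 12.0, 128)
print(f"speedup {s:.1f}x on 128 processors, efficiency {e:.1%}")
```

An efficiency in the 90%+ range on over a hundred processors is what justifies calling scaling "near perfect"; the slide's 2.5x Origin-to-NT ratio then compares per-processor rates, not scaling behavior.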
NCSA Symbio - A Distributed Object Framework Bringing Scalable Computing to NT Desktops
Parallel Computing on NT Clusters - Briand Sanderson, NCSA; Microsoft co-funds development
Features: based on Microsoft DCOM; batch or interactive modes; application development wizards
Current status and future plans: Symbio Developer Preview 2 released; Princeton University testbed
http://access.ncsa.uiuc.edu/Features/Symbio/Symbio.html
The Road to Merced http://developer.intel.com/solutions/archive/issue5/focus.htm#FOUR