CSE 190

Presentation Transcript


  1. CSE 190 Honors Seminar in High Performance Computing, Spring 2000 Prof. Sid Karin skarin@ucsd.edu x45075

  2. Definitions • History • SDSC/NPACI • Applications

  3. Definitions of Supercomputers • The most powerful machines available. • Machines that cost about $25M (in year-2000 dollars). • Machines sufficiently powerful to model physical processes with accurate laws of nature and realistic geometry, incorporating large quantities of observational/experimental data.

  4. Supercomputer Performance Metrics • Benchmarks • Applications • Kernels • Selected Algorithms • Theoretical Peak Speed • (Guaranteed not to exceed speed) • TOP 500 List
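A minimal sketch of how a benchmark number of this kind is obtained in practice: time a dense linear solve (the kernel behind the Linpack benchmark), divide the standard operation count by the elapsed time, and compare against theoretical peak. The problem size, the 100 GFLOPS peak figure, and the use of Python/NumPy here are illustrative assumptions, not anything taken from the slides.

# Hedged sketch: a Linpack-style flop-rate measurement (not the official benchmark).
import time
import numpy as np

n = 2000                                   # assumed problem size
A = np.random.rand(n, n)
b = np.random.rand(n)

t0 = time.perf_counter()
x = np.linalg.solve(A, b)                  # LU factorization + triangular solves
elapsed = time.perf_counter() - t0

flops = (2.0 / 3.0) * n**3 + 2.0 * n**2    # standard Linpack operation count
sustained = flops / elapsed / 1e9          # sustained GFLOPS
peak = 100.0                               # hypothetical theoretical peak, GFLOPS
print(f"sustained: {sustained:.1f} GFLOPS  ({sustained / peak:.0%} of assumed peak)")

The gap between a sustained number like this and the theoretical peak ("guaranteed not to exceed speed") is exactly what the Bailey paper on the next slide is about.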

  5. "Misleading Performance Specifications in the Supercomputer Field," David H. Bailey, RNR Technical Report RNR-92-005, December 1, 1992. http://www.nas.nasa.gov/Pubs/TechReports/RNRreports/dbailey/RNR-92-005/RNR-92-005.html

  6. Definitions • History • SDSC/NPACI • Applications

  7. Applications • Cryptography • Nuclear Weapons Design • Weather / Climate • Scientific Simulation • Petroleum Exploration • Aerospace Design • Automotive Design • Pharmaceutical Design • Data Mining • Data Assimilation

  8. Applications cont’d. • Processes too complex to instrument • Automotive crash testing • Air flow • Processes too fast to observe • Molecular interactions • Processes too small to observe • Molecular interactions • Processes too slow to observe • Astrophysics

  9. Applications cont’d. • Performance • Price • Performance / Price

  10. [Diagram] Theory, Experiment, and Simulation, linked by data-intensive computing (mining), data-intensive computing (assimilation), and numerically intensive computing.

  11. Supercomputer Architectures • Vector • Parallel Vector, Shared Memory • Parallel • Hypercubes • Meshes • Clusters • SIMD vs. MIMD • Shared vs. Distributed Memory • Cache Coherent Memory vs. Message Passing • Clusters of Shared Memory Parallel Systems
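As a concrete illustration of the message-passing model named above (in contrast to cache-coherent shared memory), here is a minimal MIMD sketch; the use of Python with mpi4py is an assumption for illustration, not something the slide specifies.

# Hedged sketch: distributed-memory message passing (assumes mpi4py is installed).
# Each rank owns private memory; data moves only through explicit MPI calls.
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()                      # this process's id
size = comm.Get_size()                      # total number of processes

local = float(rank) ** 2                    # each rank computes on its own data
total = comm.allreduce(local, op=MPI.SUM)   # explicit communication replaces shared memory

if rank == 0:
    print(f"{size} ranks, global sum = {total}")

Launched as, say, mpirun -n 8 python sketch.py, the same program runs unchanged on a shared-memory SMP or on a distributed-memory cluster, which is one reason clusters of shared-memory systems are usually still programmed with message passing.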

  12. The Cray-1 • A vector computer that worked • A balanced computing system • CPU • Memory • I/O • A photogenic computer

  13. [Chart: number of machines vs. performance] 1976: the supercomputing "island"; today: a continuum.

  14. The Cray X-MP • Shared Memory • Parallel Vector • Followed by the Cray Y-MP, C-90, J-90, T90, …

  15. The Cray-2 • Parallel Vector, Shared Memory • Very Large Memory (256 MW) • Actually 256 × 1,024 = 262,144 KW ≈ 262 MW • One word = 8 bytes • Liquid immersion cooling
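For scale, and taking the slide's figure of one word = 8 bytes, that memory works out to roughly

$$256\ \text{MW} \times 8\ \text{bytes/word} \approx 2\ \text{GB},$$

a very large memory for its day.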

  16. Cray Companies • Control Data • Cray Research Inc. • Cray Computer Company Inc. • SRC Inc.

  17. Thinking Machines • SIMD vs. MIMD • Evolution from CM-1 to CM-2 • ARPA Involvement

  18. 1st Teraflops System for US Academia: "Blue Horizon," Nov 1999 • 1 TFLOPS IBM SP • 144 8-processor compute nodes • 12 2-processor service nodes • 1,176 Power3 processors at 222 MHz • > 640 GB memory (4 GB/node), 10.6 GB/s bandwidth, upgrade to > 1 TB later • 6.8 TB switch-attached disk storage • Largest SP with 8-way nodes • High-performance access to HPSS • Trailblazer switch interconnect (currently ~115 MB/s bandwidth) with subsequent upgrade
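As a consistency check on the processor count quoted above (arithmetic only, no new data):

$$144 \times 8 + 12 \times 2 = 1152 + 24 = 1176\ \text{Power3 processors}$$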

  19. UCSD Currently #10 on Dongarra's Top 500 List • Actual Linpack benchmark sustained 558 Gflops on 120 nodes • Projected Linpack benchmark is 650 Gflops on 144 nodes • Theoretical peak 1.023 Tflops
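These figures are mutually consistent if the Power3 is credited with 4 floating-point operations per cycle (two fused multiply-adds), its usual peak rating; the derivation below is an illustration, not something stated on the slide:

$$144 \times 8 \times 222\ \text{MHz} \times 4\ \text{flops/cycle} \approx 1.023\ \text{TFLOPS (peak)}$$
$$120 \times 8 \times 222\ \text{MHz} \times 4 \approx 0.852\ \text{TFLOPS, so}\ 558\ \text{GFLOPS} \approx 65\%\ \text{of peak on 120 nodes}$$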

  20. First Tera MTA is at SDSC

  21. Tera MTA • Architectural Characteristics • Multithreaded architecture • Randomized, flat, shared memory • 8 CPUs, 8 GB RAM now going to 16 (later this year) • High bandwidth to memory (word per cycle per CPU) • Benefits • Reduced programming effort: single parallel model for one or many processors • Good scalability
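A back-of-the-envelope way to see why multithreading substitutes for caches: if memory is L cycles away and the CPU can switch to a ready thread every cycle, about L concurrent threads keep the pipeline full. The latency figure below is an illustrative assumption, not an MTA specification:

$$\text{threads per CPU} \approx \frac{\text{memory latency}}{\text{issue interval}} \approx \frac{150\ \text{cycles}}{1\ \text{cycle}} = 150$$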

  22. SDSC’s road to terascale computing

  23. ASCI Blue Mountain Site Prep [floor plan: 12,000 sq ft, 120 ft × 100 ft]

  24. ASCI Blue Mountain Site Prep, continued [12,000 sq ft, 120 ft × 100 ft]

  25. ASCI Blue Mountain Facilities Accomplishments • 12,000 sq. ft. of floor space • 1.6 MWatts of power • 530 tons of cooling capability • 384 cabinets to house the 6,144 CPUs • 48 cabinets for metarouters • 96 cabinets for disks • 9 cabinets for 36 HIPPI switches • About 348 miles of fiber cable

  26. ASCI Blue Mountain SST System Final Configuration • Cray Origin 2000: 3.072 TeraFLOPS peak • 48 × 128-CPU Origin 2000 (250 MHz R10K) • 6,144 CPUs: 48 SMPs of 128 CPUs each • 1,536 GB memory total: 32 GB per 128-CPU SMP • 76 TB Fibre Channel RAID disks • 36 × HIPPI-800 switch cluster interconnect • To be deployed later this year: 9 × HIPPI-6400 32-way switch cluster interconnect
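The quoted peak follows from the CPU count and clock if each 250 MHz R10K is credited with its usual 2 floating-point operations per cycle (one multiply-add); this is a consistency check, not additional data:

$$48 \times 128 = 6144\ \text{CPUs}, \qquad 6144 \times 250\ \text{MHz} \times 2\ \text{flops/cycle} = 3.072\ \text{TFLOPS}, \qquad 48 \times 32\ \text{GB} = 1536\ \text{GB}$$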

  27. ASCI Blue Mountain Accomplishments • On-site integration of the 48 × 128 system completed (including upgrades) • HiPPI-800 interconnect completed • 18 GB Fibre Channel disk completed • Integrated visualization (16 IR pipes) • Most site prep completed • System integrated into LANL secure computing environment • Web-based tool for tracking status

  28. ASCI Blue Mountain Accomplishments, cont'd • Linpack: achieved 1.608 TeraFLOPS • Accelerated schedule: 2 weeks after install • System validation • Run on a 40 × 126 configuration • f90/MPI version run of over 6 hours • sPPM turbulence modeling code • Validated full-system integration • Used all 12 HiPPI boards per SMP and 36 switches • Used special "MPI" HiPPI bypass library • ASCI codes scaling
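With the same assumed 2 flops/cycle per R10K (500 Mflops per CPU), the quoted Linpack run corresponds roughly to:

$$40 \times 126 = 5040\ \text{CPUs} \;\Rightarrow\; 5040 \times 0.5\ \text{GFLOPS} = 2.52\ \text{TFLOPS peak}, \qquad 1.608 / 2.52 \approx 64\%\ \text{efficiency}$$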

  29. Summary • Installed ASCI Blue Mountain computer ahead of schedule and achieved Linpack record two weeks after install. ASCI application codes are being developed and used.

  30. [Photo] Half

  31. [Photo] Rack

  32. [Photo] Trench

  33. Network Design Principles • Connect any pair of the DSM computers through the crossbar switches • Connect only computers directly to switches, optimizing latency and bandwidth (there are no direct DSM<==>DSM or switch<==>switch links) • Support a 3-D toroidal 4 × 4 × 3 DSM configuration by establishing non-blocking simultaneous links across all sets of 6 faces of the computer grid • Maintain full interconnect bandwidth for subsets of DSM computers (48 DSM computers divided into 2, 3, 4, 6, 8, 12, 24, or 48 separate, non-interacting groups)
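To make the toroidal principle concrete, the sketch below enumerates the neighbor links of a 4 × 4 × 3 torus of 48 DSM computers (each DSM has six toroidal neighbors, one per face). The node numbering is invented for illustration; the actual cabling and switch assignments are not taken from the slides.

# Hedged sketch: count the DSM-to-DSM links implied by a 4 x 4 x 3 torus.
# All links would be routed through the crossbar switches (no direct DSM-DSM cables).
NX, NY, NZ = 4, 4, 3                       # 48 DSM computers

def node_id(x, y, z):
    return (z * NY + y) * NX + x           # hypothetical linear numbering

links = set()
for x in range(NX):
    for y in range(NY):
        for z in range(NZ):
            me = node_id(x, y, z)
            # six toroidal neighbors: +/- one step in each dimension, with wraparound
            neighbors = [((x + 1) % NX, y, z), ((x - 1) % NX, y, z),
                         (x, (y + 1) % NY, z), (x, (y - 1) % NY, z),
                         (x, y, (z + 1) % NZ), (x, y, (z - 1) % NZ)]
            for nx, ny, nz in neighbors:
                links.add(frozenset((me, node_id(nx, ny, nz))))

print(len(links), "bidirectional nearest-neighbor links")   # 144 for a 4 x 4 x 3 torus

This is only a topology count; how those logical links map onto the physical 16×16 crossbar switches is what the next slide's diagram summarizes.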

  34. [Diagram] 18 separate networks: 18 16×16 crossbar switches (numbered 1–18) connecting 6 groups of 8 computers each.

  35. [Diagram] sPPM Hydro on 6,144 CPUs • Problem domain: 4 × 4 × 3 DSM layout (48 DSMs / 6,144 CPUs) • Problem subdomain: 8 × 4 × 4 process layout (128 CPUs per DSM) • Labels: 1 HiPPI-800 NIC per router CPU; 12 HiPPI-800 NICs per DSM; router CPU on neighbor SMP

  36. sPPM Scaling on Blue Mountain

  37. Definitions • History • SDSC/NPACI • Applications

  38. SDSC A National Laboratory for Computational Science and Engineering

  39. A Distributed National Laboratory for Computational Science and Engineering

  40. [Diagram] Continuing Evolution, 1985 → 2000: SDSC (resources, applications, individuals) evolving into NPACI (resources, education, outreach & training, enabling technologies, technology & applications thrusts, partners)

  41. NPACI is a Highly Leveraged National Partnership of Partnerships • 46 institutions • 20 states • 4 countries • 5 national labs • Many projects • Vendors and industry • Government agencies

  42. Mission: Accelerate Scientific Discovery • Through the development and implementation of computational and computer science techniques

  43. Vision Changing How Science is Done • Collect data from digital libraries, laboratories, and observation • Analyze the data with models run on the grid • Visualize and share data over the Web • Publish results in a digital library

  44. Goals: Fulfilling the Mission Embracing the Scientific Community • Capability Computing • Provide compute and information resources of exceptional capability • Discovery Environments • Develop and deploy novel, integrated, easy-to-use computational environments • Computational Literacy • Extend the excitement, benefits, and opportunities of computational science
