
Role of spectral turbulence simulations in developing HPC systems

  1. Role of spectral turbulence simulations in developing HPC systems. YOKOKAWA, Mitsuo, Next-Generation Supercomputer R&D Center, RIKEN. One-day Meeting, INI, September 26th, 2008

  2. Background • Experience of developing the Earth Simulator, a 40 Tflops vector-type distributed-memory supercomputer system • A simulation code for box turbulence was used in the final adjustment of the system • A large box turbulence simulation was carried out • A peta-flops supercomputer project

  3. Contents • Simulations on the Earth Simulator • A Japanese peta-scale supercomputer project • Trends in HPC systems • Summary

  4. Simulations on the Earth Simulator

  5. The Earth Simulator • Completed in 2002 • Achieved 35.86 Tflops sustained on the LINPACK benchmark • Chosen as one of the best inventions of 2002 by TIME

  6. Why did I do it? • It is important to evaluate the performance of the Earth Simulator in the final adjustment phase • Suitable codes had to be chosen • to evaluate the performance of the vector processors • to measure the performance of all-to-all communication among compute nodes through the crossbar switch • to make operation of the Earth Simulator stable • Candidates • LINPACK benchmark? • Atmospheric general circulation model (AGCM)? • Any other code?

  7. Why did I do it? (cont'd) • Spectral turbulence simulation code • intensive computational kernel and a lot of data communication • simple code • significance to computational science • one of the grand challenges in computational science and high-performance computing • A new spectral code for the Earth Simulator (a minimal sketch follows) • Fourier spectral method for spatial discretization • mode-truncation and phase-shift techniques to control aliasing error in the nonlinear terms • fourth-order Runge-Kutta method for time integration
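
A minimal sketch of the scheme just described, for orientation only: this is not the Earth Simulator code. It substitutes 1D viscous Burgers for 3D Navier-Stokes, NumPy's FFT for the machine-tuned transforms, and uses the 2/3 truncation rule as the mode-truncation technique (a phase-shift technique could be used instead); all names and parameters are illustrative.

    import numpy as np

    # Pseudo-spectral sketch: Fourier discretization, dealiasing by mode
    # truncation (2/3 rule), classical 4th-order Runge-Kutta in time.
    N, nu, dt = 256, 0.01, 1e-3
    x = np.linspace(0.0, 2.0 * np.pi, N, endpoint=False)
    k = 2.0 * np.pi * np.fft.fftfreq(N, d=2.0 * np.pi / N)  # integer wavenumbers
    dealias = np.abs(k) < (2.0 / 3.0) * (N / 2)             # truncate aliased modes

    def rhs(uh):
        # du/dt = -d/dx(u^2/2) + nu * u_xx in spectral space; the nonlinear
        # term is formed on the grid, transformed back, and dealiased
        u = np.fft.ifft(uh).real
        return -1j * k * dealias * np.fft.fft(0.5 * u * u) - nu * k**2 * uh

    def rk4_step(uh):
        k1 = rhs(uh)
        k2 = rhs(uh + 0.5 * dt * k1)
        k3 = rhs(uh + 0.5 * dt * k2)
        k4 = rhs(uh + dt * k3)
        return uh + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0

    uh = np.fft.fft(np.sin(x))  # initial condition u(x, 0) = sin(x)
    for _ in range(1000):
        uh = rk4_step(uh)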

  8. Points of coding • Optimization for the Earth Simulator • coordinated assignment of the calculation to three levels of parallelism (vector processing, micro-tasking, and MPI parallelization) • higher-radix FFT (a back-of-envelope illustration follows) • B/F ratio (data transfer rate between CPU and memory vs. arithmetic performance) • removal of redundant processing and variables
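
Why a higher-radix FFT helps on a bandwidth-limited machine can be seen from a simple model, sketched below. It assumes each radix-r stage streams the whole complex array through memory once (read plus write) and the usual ~5N log2 N flop count; these are modeling assumptions, not measurements of the Earth Simulator code.

    import math

    # Rough B/F model of a length-N FFT: a radix-r transform makes
    # log_r(N) passes over the data, each streaming the array through
    # memory once; arithmetic follows the standard 5*N*log2(N) count.
    def fft_bytes_per_flop(n, radix, bytes_per_complex=16):
        stages = math.log(n, radix)                   # log_r(N) memory passes
        traffic = 2 * n * bytes_per_complex * stages  # total bytes moved
        flops = 5 * n * math.log2(n)                  # arithmetic work
        return traffic / flops

    for r in (2, 4, 8, 16):
        print(f"radix-{r:2}: ~{fft_bytes_per_flop(2**24, r):.1f} bytes per flop")
    # Higher radix -> fewer memory passes -> lower B/F demand.

Under this model the demand falls from about 6.4 bytes per flop at radix 2 to 1.6 at radix 16, which is why a higher-radix kernel suits a machine whose B/F ratio is the constraint.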

  9. Calculation time for one time step [Figure: wall time per step (log scale, 0.01-100 s) vs. number of nodes (64, 128, 256, 512); marked values of 30.7 s and 3.21 s per step; a complete calculation takes 3 days on 512 PNs]

  10. Performance [Figure: sustained performance (Tflops, log scale) vs. number of PNs (64-512); 16.4 Tflops achieved, 50% of peak (single precision and analytical FLOP count)]

  11. Achievement of box turbulence flow simulations [Figure: number of grid points vs. year, 1960-2010] • Orszag (1969), IBM 360-95: 32³ • Siggia (1981), Cray-1, NCAR: 64³ • Kerr (1985), Cray-1S, NCAR: 128³ • Jimenez et al. (1993), Caltech Delta machine: 512³ • Yamamoto (1994), Numerical Wind Tunnel: 240³ • Gotoh & Fukayama (2001), VPP5000/56, NUCC: 1024³ • K & I & Y (2002), Earth Simulator: 2048³, 4096³

  12. A Japanese Peta-Scale Supercomputer Project

  13. Next-Generation Supercomputer Project • Objectives, as one of Japan's Key Technologies of National Importance: • to develop the world's most advanced, highest-performance supercomputer • to develop and deploy its usage technologies as well as application software • Period and budget: FY2006-FY2012, ~1 billion US$ (expected) • RIKEN (The Institute of Physical and Chemical Research) plays the central role in the project, developing the supercomputer under the law

  14. Goals of the project • Development and installation of the most advanced high-performance supercomputer system, with a LINPACK performance of 10 petaflops • Development and deployment of application software, designed to exploit the system's maximum capability, in various fields of science and engineering • Establishment of an "Advanced Computational Science and Technology Center (tentative)" as a center of excellence for research, personnel development, and training built around the supercomputer

  15. Major applications for the system [Figure: Grand Challenge applications]

  16. Configuration of the system • The Next-Generation Supercomputer will be a hybrid general-purpose supercomputer that provides the optimum computing environment for a wide range of simulations • Calculations will be performed in the processing units best suited to the particular simulation • Parallel processing in a hybrid configuration of scalar and vector units will make larger and more complex simulations possible

  17. Roadmap of the project [Figure: project roadmap; current position marked "We are here"]

  18. Location of the supercomputer site: Kobe City, 450 km (280 miles) west of Tokyo [Map: Kobe and Tokyo]

  19. Artist's image of the building

  20. Photos of the site under construction, taken from the south side: June 10, 2008; July 17, 2008; Aug. 20, 2008

  21. Trends in HPC systems

  22. Trends in HPC systems • Systems will have a large number of processors, around one million or more • Each chip will be a multi-core (8, 16, or 32 cores) or many-core (more than 64 cores) processor • low performance per core • small main-memory capacity per core • fine-grained parallelism • Each processor will consume little energy (low-power processors) • Narrow bandwidth between CPU and main memory • bottleneck in the number of signal pins • Bisection bandwidth among compute nodes will be narrow • one-to-one connections are very expensive and power-hungry

  23. Impact on spectral simulations • High performance in the LINPACK benchmark • the more processors, the higher the LINPACK performance • LINPACK performance does not necessarily reflect real-world application performance, especially for spectral simulations • Small memory capacity per processor • fine-grained decomposition of space • increasing communication cost among parallel compute nodes • Narrow memory bandwidth and narrow inter-node bisection bandwidth • memory-wall problem and low all-to-all communication performance • a low-B/F algorithm will be needed in place of the FFT (a rough model of the transpose cost follows)
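
The all-to-all cost behind the last point can be made concrete with a hedged back-of-envelope sketch: a distributed 3D FFT performs global transposes in which essentially the whole N³ field crosses the machine. The node count, bisection bandwidth, and bytes per point below are illustrative assumptions, not figures from the talk.

    # Lower bound on one global transpose of an N^3 field spread over
    # `nodes` compute nodes: every node sends nearly all of its local
    # data, and about half of the traffic must cross the bisection.
    def transpose_time(n, nodes, bisection_gb_s, bytes_per_point=8):
        total = n**3 * bytes_per_point            # bytes moved by the transpose
        per_node = total / nodes                  # send volume per node
        t = (total / 2) / (bisection_gb_s * 1e9)  # bisection-limited time, seconds
        return per_node, t

    per_node, t = transpose_time(8192, 100_000, bisection_gb_s=1000)
    print(f"per-node volume ~{per_node / 1e6:.0f} MB, one transpose >= {t:.1f} s")
    # A 3D FFT needs several such transposes, and a time step needs several
    # FFTs, so a narrow bisection puts a hard floor under the time per step.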

  24. Impact on spectral simulations (cont'd) • The trend does not fit 3D FFTs well; box turbulence simulations are becoming difficult to perform • We will be able to use more and more computational resources in the near future, but... • Finer-resolution simulations by spectral methods will need long calculation times because communication among parallel compute nodes is extremely slow, and we might not be able to obtain final results in a reasonable time

  25. Estimates for simulations larger than 4096³ • Assuming a sustained simulation performance of 500 Tflops: • an 8192³ simulation needs • 7 seconds per time step • 100 TB of total memory • 8 days for 100,000 steps and 1 PB of data for a complete simulation • a 16384³ simulation needs • 1 minute per time step • 800 TB of total memory • 3 months for 125,000 steps and 10 PB of data in total for a complete simulation (the arithmetic is reproduced in the sketch below)
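
The wall-clock figures above follow from simple arithmetic; the sketch below just reproduces them from the quoted per-step times and step counts (no new data).

    # Reproduce the slide's wall-time estimates at 500 Tflops sustained.
    cases = {
        "8192^3":  {"sec_per_step": 7,  "steps": 100_000},  # quoted: ~8 days
        "16384^3": {"sec_per_step": 60, "steps": 125_000},  # quoted: ~3 months
    }
    for name, c in cases.items():
        days = c["sec_per_step"] * c["steps"] / 86_400  # seconds per day
        print(f"{name}: about {days:.0f} days of wall time")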

  26. Summary • Spectral methods are very useful for evaluating HPC systems • In this sense, the trend in HPC system architecture is getting worse • even if the peak performance of the system is very high... • we cannot expect high sustained performance • it may take a long time to finish a simulation because of very slow data transfer between nodes • Can we discard spectral methods and change the algorithm? Or do we have to • put strong pressure on the computer-architecture community, and • think about international collaboration to develop a supercomputer system that fits turbulence studies? • I would think of such an HPC system as a particle accelerator, like CERN's
