
DMI Update WWW.DMI.DK

DMI Update WWW.DMI.DK. Leif Laursen ( ll@dmi.dk ) Jan Boerhout ( jboerhout@hpce.nec.com ). CAS2K3, September 7-11, 2003 Annecy, France. Danish Meteorological Institute. DMI is the national weather service for Denmark, Greenland and the Faeroes.


Presentation Transcript


  1. DMI Update WWW.DMI.DK Leif Laursen ( ll@dmi.dk ) Jan Boerhout ( jboerhout@hpce.nec.com ) CAS2K3, September 7-11, 2003 Annecy, France

  2. Danish Meteorological Institute • DMI is the national weather service for Denmark, Greenland and the Faeroes. • Weather forecasting, Oceanography, Climate Research and Environmental studies • Use of numerical models in all areas • Increased use of automatic products • Demanding high availability of systems

  3. [Diagram: operational data flow] • Inputs: GTS observations, ECMWF boundary files • 2 × 4-processor SGI ORIGIN 200: data processing, graphics, verification • Operational database • Mass storage device • NEC-SX6: preprocessing, analysis, initialisation, forecast, postprocessing • Link speeds shown: 32 Kbyte/s and 10 Mbyte/s

  4. [Figure] ECMWF boundaries: 00Z, 06Z, 12Z, 18Z

  5. Evolution in RMS for MSLP

  6. Quality of 24h forecasts of 10m wind speeds >= 8 m/s

  7. Weibull distributions for 24-hour forecasts. E, D, ECMWF and UKMO are shown, as well as the curve for the observations.

  8. The new NEC-SX6 computer at DMI

  9. Some events during the migration to the NEC-SX6 • Oct. 01: Signature of contract between NEC and DMI • April 02: Upgrade (advection scheme for q, CW and TKE) • May 02: Installation of phase 1 of SX6 • May 02: Parallel system on SX6 • June 02: DMI-HIRLAM-I (0.014 degree, 602x600 grid) on SX-6 • July 02: Stability test passed • Sep. 02: Operational suite on SX6, later removal of SX4 • Sep. 02: Testing of new developments (diff. and convection) • Dec. 02: Upgrade: 40 levels, reduced time step, AMSU-A data • Jan. 03: Revised contract between NEC and DMI • Mar. 03: Installation of phase 2 of SX6 • July 03: Stability test passed. • Sep. 03: Improvement in data-assimilation (FGAT, QuikScat etc.) • Early 04: New operational HIRLAM set-up using 6 nodes

  10. HIRLAM Scalability Optimization • Methods • Implementation • Performance

  11. Optimization Focus • Data transposition • from 2D to FFT distribution and reverse • from FFT to TRI distribution and reverse • Exchange of halo points • between north and south • between east and west • GRIB File I/O • Statistics

  12. Approach • First attempt: straightforward conversion from SHMEM to MPI-2 put/get calls • it works, but: • too much overhead due to fine granularity • Redesign of transposition and halo swap routines • fewer and larger messages • independent message passing process groups

  13. 2D Sub Grids • HIRLAM sub grid definition in TWOD data distribution • Processors: [Figure: sub grids labelled 0 to 11; axes: longitude, latitude, levels]

  14. Original FFT Sub Grids • HIRLAM sub grid definition in FFT data distribution • Each processor handles slabs of full longitude lines [Figure axes: longitude, latitude, levels]
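The two distributions on slides 13 and 14 can be sketched as index ranges per processor. This is only an illustration under stated assumptions, not the HIRLAM code: the row-major rank numbering, the 4 x 3 processor layout, and the choice to split the vertical levels among the processors of one row in the FFT distribution are all assumptions made here. The 602 x 600 grid and 40 levels are taken from the migration timeline on slide 9.

```python
def split_range(n, parts):
    """Split range(n) into contiguous chunks whose sizes differ by at most one."""
    base, extra = divmod(n, parts)
    bounds, start = [], 0
    for p in range(parts):
        size = base + (1 if p < extra else 0)
        bounds.append((start, start + size))
        start += size
    return bounds


def twod_subgrids(nlon, nlat, nprocx, nprocy):
    """2D (TWOD) distribution: each rank owns a longitude x latitude tile, all levels."""
    lon = split_range(nlon, nprocx)
    lat = split_range(nlat, nprocy)
    # Assumption: ranks are numbered row by row (row-major).
    return {py * nprocx + px: {"lon": lon[px], "lat": lat[py]}
            for py in range(nprocy) for px in range(nprocx)}


def fft_subgrids(nlon, nlat, nlev, nprocx, nprocy):
    """FFT distribution: each rank owns full longitude lines.

    Assumption: a processor row keeps its latitude band and the levels
    are divided among the nprocx ranks of that row.
    """
    lat = split_range(nlat, nprocy)
    lev = split_range(nlev, nprocx)
    return {py * nprocx + px: {"lon": (0, nlon), "lat": lat[py], "lev": lev[px]}
            for py in range(nprocy) for px in range(nprocx)}


if __name__ == "__main__":
    twod = twod_subgrids(602, 600, nprocx=4, nprocy=3)
    fft = fft_subgrids(602, 600, 40, nprocx=4, nprocy=3)
    for rank in sorted(twod):
        print(rank, twod[rank], fft[rank])
```

Ranges are half-open, so `(151, 302)` covers indices 151 through 301.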

  15. 2D↔FFT Redistribution • Sub grid data to be distributed to all processors: send-receive pairs [Figure axes: longitude, latitude, levels]

  16. 2D↔FFT Redistribution • Sub grids in east-west direction form full longitude lines • nprocy independent sets of nprocx² send-receive pairs, or: • send-receive pairs • nprocy × fewer messages [Figure: sub grids labelled 3, 4, 5; axes: longitude, latitude, levels]
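The message-count argument on slides 15 and 16 can be checked with a few lines of arithmetic. This is a sketch under one assumption: the count for the original scheme is not preserved in the transcript, so it is taken here to be one send-receive pair per ordered pair of processors, including a processor paired with itself, i.e. (nprocx × nprocy)². That assumption reproduces the "nprocy × fewer messages" claim. The function names are illustrative only.

```python
def all_to_all_pairs(nprocx, nprocy):
    """Assumed original scheme: every processor exchanges with every processor."""
    nproc = nprocx * nprocy
    return nproc * nproc


def row_group_pairs(nprocx, nprocy):
    """Redesigned scheme: nprocy independent groups, each with nprocx**2 pairs."""
    return nprocy * nprocx ** 2


if __name__ == "__main__":
    nprocx, nprocy = 4, 3  # assumed layout for 12 processors
    before = all_to_all_pairs(nprocx, nprocy)
    after = row_group_pairs(nprocx, nprocy)
    print(before, after, before // after)
```

For the assumed 4 x 3 layout this gives 144 pairs against 48, a reduction by a factor of 3, which equals nprocy.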

  17. Transpositions 2D↔FFT↔TRI [Figure: placement of processors 0 to 11 in the 2D, FFT and TRI distributions; axes: longitude, latitude, levels]

  18. MPI Methods • Transfer Methods • Remote Memory Access: mpi_put, mpi_get • Async Point-to-Point: mpi_isend, mpi_irecv • All-to-All: mpi_alltoallv, mpi_alltoallw • Buffering vs. direct • Explicit buffering • MPI derived types (Method selection by environment variables)
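The difference between explicit buffering and a derived-type description, and the idea of choosing between them through an environment variable, can be illustrated without MPI. The sketch below is plain Python on a flat list, not the HIRLAM implementation: the variable name `TRANSPOSE_PACKING` and all function names are hypothetical. The tuple returned by `vector_type` mirrors the count, block length and stride arguments of an MPI vector type, with an added starting offset.

```python
import os


def pack_block(field, nx, x0, x1, y0, y1):
    """Explicit buffering: copy the block row by row into a contiguous buffer."""
    buf = []
    for y in range(y0, y1):
        buf.extend(field[y * nx + x0:y * nx + x1])
    return buf


def vector_type(nx, x0, x1, y0, y1):
    """Describe the same block as (offset, count, blocklength, stride)."""
    return (y0 * nx + x0, y1 - y0, x1 - x0, nx)


def read_with_type(field, vtype):
    """Walk the strided description, as a library would for a derived type."""
    offset, count, blocklength, stride = vtype
    out = []
    for i in range(count):
        start = offset + i * stride
        out.extend(field[start:start + blocklength])
    return out


def extract_block(field, nx, x0, x1, y0, y1, env=os.environ):
    """Select the method from an environment variable (hypothetical name)."""
    method = env.get("TRANSPOSE_PACKING", "buffer")
    if method == "buffer":
        return pack_block(field, nx, x0, x1, y0, y1)
    if method == "derived":
        return read_with_type(field, vector_type(nx, x0, x1, y0, y1))
    raise ValueError(f"unknown packing method: {method!r}")


if __name__ == "__main__":
    field = list(range(12))  # 3 rows of 4 points, stored row by row
    print(extract_block(field, 4, 1, 3, 0, 2))
```

Both paths return the same data; in a real MPI program they differ in who performs the copy, which is where the performance difference reported later in the talk would arise.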

  19. Performance

  20. Parallel Speedup on NEC SX-6 • Cluster of 8 NEC SX-6 nodes at DMI • Up to 60 processors: • 7 nodes with 8 processors per node • 1 node with 4 processors • Parallel efficiency 78% on 60 processors

  21. Performance - Observations • New data redistribution method much more efficient (78% vs. 45% on 60 processors) • No performance advantage with RMA (one-sided MP) or All-to-All over plain Point-to-Point method • Elegant code with MPI derived types, but: • Explicit buffering faster
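The efficiency figures on slides 20 and 21 translate into speedups with one multiplication. The sketch assumes the usual definition, parallel efficiency = speedup / number of processors; the slides do not state the baseline they used, so the resulting numbers are indicative only.

```python
def speedup(efficiency, nproc):
    """Speedup implied by a parallel efficiency, assuming efficiency = speedup / nproc."""
    return efficiency * nproc


if __name__ == "__main__":
    nproc = 60
    new = speedup(0.78, nproc)  # new redistribution method
    old = speedup(0.45, nproc)  # previous method
    print(f"new: {new:.1f}x, old: {old:.1f}x, ratio: {new / old:.2f}")
```

Under that assumption, 78% on 60 processors corresponds to a speedup of about 46.8, and 45% to about 27, so the new redistribution method is roughly 1.7 times faster at that processor count.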

  22. Questions? • Thank you!
