BoM Modelling and Computing Update
Michael Naughton, BMRC
Acknowledgements: Bill Bourke & BMRC colleagues; Phil Tannenbaum & HPCCC colleagues; NEC/A applications support staff
• BoM/CSIRO HPCCC Supercomputer Upgrade • New BoM Head Office • MPI versions of BoM major applications codes • BoM Modelling
BoM High Performance Computing from 1982 to 1996 • FACOM • CSIRO CYBER205 • ETA-10 • CRAY X-MP, Y-MP
BoM/CSIRO HPCCC • High Performance Computing and Communications Centre • established 1997 as joint high performance computing facility • 50:50 BoM-CSIRO partnership 1997-2003 • NEC SX-4/16 1997 • NEC SX-4/32 1997-2000 • NEC SX-4/32 & SX-5/16 2000-2001 • 2 X NEC SX-5/16 2001-2003 • Continued BoM-CSIRO partnership 2003-2007
BoM High Performance Computing from 1982 to 2003 • FACOM • CSIRO CYBER205 • ETA-10 • CRAY X-MP, Y-MP • NEC SX-4, SX-5
1. BoM Supercomputer Upgrade
BoM Upgrade Phases • 4+2+2 phases • Initial contract: 4 years from acceptance • First optional contract extension: 2008-2009 • Second optional contract extension: 2010-2011 • RFT performance target: 150 / 300 / 750 / 2000 Gflops sustained performance on BoM operational applications (cf. 35 Gflops on the BoM SX-5)
BoM Supercomputer Upgrade Timelines • MPI development versions of BoM major applications: 2000-2002 • Pre-tender benchmarks release: August 2002 • RFT: October 2002 • Tender deadline: February 2003 • NEC selected as preferred vendor: April 2003 • Contract signed with NEC: June 2003 • Interim 2-node SX-6 machine for porting: September 2003 • Delivery: December 2003 • Installation and acceptance testing: Dec 2003 - Feb 2004 • Operational: March 2004
BoM High Performance Computing from 1982 to 2012 • FACOM • CSIRO CYBER205 • ETA-10 • CRAY X-MP, Y-MP • NEC SX-4, SX-5 • NEC SX-6 Slope: approx. factor 2.5 every 2 years
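The quoted slope can be turned into more familiar growth figures. A minimal sketch of the arithmetic (derived only from the "factor 2.5 every 2 years" trend on the chart; the round numbers below are illustrative):

```python
# Illustrative arithmetic for the quoted trend: BoM HPC capacity grows by a
# factor of ~2.5 every 2 years (sketch only, not measured BoM data).
factor_per_2yr = 2.5
annual_growth = factor_per_2yr ** 0.5          # ~1.58x per year
per_decade = factor_per_2yr ** (10 / 2)        # ~98x over 10 years

print(f"annual growth factor : {annual_growth:.2f}")
print(f"growth over a decade : {per_decade:.0f}x")
```

That is roughly 58% per year, or about two orders of magnitude per decade.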
New HPCCC Facility Summary • Initial installation: 4Q 2003 • on availability of new CCF • 18 node NEC SX-6 (SX-6/144M18) • multi-node vector processor based architecture • 8 CPUs x 8 Gfl peak performance per node • 1152 Gfl peak capacity • 2 x 12 CPU TX-7 Linux front-end • 1.3 GHz 64-bit Itanium processor • GFS server with failover • ~ 14 TB GFS disk storage • cross-compilation environment
New HPCCC Facility Summary • Initial: Dec 2003 • 18 node NEC SX-6 (SX-6/144M18) • 1152 Gfl peak capacity • 2 x 12 CPU TX-7 front-end • ~ 14 TB GFS disk storage • Upgrade: 4Q 2004 • 28 node NEC SX-6 (SX-6/224M28) • 1792 Gfl peak capacity • 2 x 16 CPU TX-7 front-end • ~ 22 TB GFS disk storage • Optional Extension 1: Jan 2008 • Optional Extension 2: Jan 2010
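The quoted peak capacities follow directly from the node counts, since each SX-6 node has 8 CPUs of 8 Gflops peak and 64 GB of memory. A quick sketch of the arithmetic:

```python
# Peak-capacity arithmetic for the two quoted configurations:
# each SX-6 node has 8 CPUs at 8 Gflops peak and 64 GB of memory.
CPUS_PER_NODE, GFLOPS_PER_CPU, GB_PER_NODE = 8, 8, 64

for label, nodes in [("Initial (Dec 2003)", 18), ("Upgrade (4Q 2004)", 28)]:
    cpus = nodes * CPUS_PER_NODE
    peak_gflops = cpus * GFLOPS_PER_CPU
    memory_gb = nodes * GB_PER_NODE
    print(f"{label}: {cpus} CPUs, {peak_gflops} Gflops peak, {memory_gb} GB memory")
# -> 144 CPUs / 1152 Gflops / 1152 GB, and 224 CPUs / 1792 Gflops / 1792 GB
```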
Comparisons Between SX-5 and SX-6
• SX-5: processors - 16 CPUs per cabinet; 4, 8, 10 GFLOPS flavours; long-vector tuned; 200+ SX-5 systems; memory - SDRAM, 256 GB per cabinet (128); IOP - 32 bit PCI; OS - SUPER-UX
• SX-6: processors - 8 CPUs per cabinet; 8 GFLOPS flavour; short-vector tuned; 32+ SX-6 systems; memory - DDR-SDRAM, 64 GB per cabinet; IOP - 64 bit PCI; OS - SUPER-UX
SX-6 Processor Architecture [diagram]: logically 8-wide vector unit (mask register, logical/vector data registers; multiply, add/shift, divide and load/store pipes), scalar unit (scalar registers, scalar execution, cache memory), XMU, shared main memory, and input/output processor (IOP).
HPCCC Configuration [diagram]: SX-6 nodes of 8 CPUs each, with local main memory and IOPs, connected via the Internode Crossbar (IXS) with 8 GB/s bidirectional links.
• Initial, 18 nodes: 144 CPUs (1152 GFLOPS), 1152 GB memory (64 GB/node)
• Upgrade, 28 nodes: 224 CPUs (1792 GFLOPS), 1792 GB memory (64 GB/node)
BoM CCF Forward Plan (2004-05) [diagram]: data-centric architecture built around shared GFS / SAN storage accessed by SUPER-UX, HP-UX, Solaris, AIX and Linux systems.
Scheduling • Some SMP nodes: traditional scheduling strategy • Some MPI clusters: huge resources, non-swapping status, treated similarly to an MPP
2. New BoM Head Office: background to the move • 29 years at “Celsius House”, 150 Lonsdale St, Melbourne • Building inadequate for current and future CCF needs • Floor loading problems for computers • Space limitations, esp. for dual operation during upgrades • Supercomputer CCF area: ~100 m² to 1000 m² • 25% reduction in individuals’ office space • e.g. mine: 16 m² to 12 m² • Decision to remain in Melbourne • Decision to continue co-locating the CCF and Head Office
New BoM Head Office • Henry Hunt Building, named after Australia’s first Commonwealth Meteorologist (1907-1931) [building photo, CCF marked]
New BoM Head Office • 700 Collins Street, Docklands Precinct, Melbourne CBD
3. BoM Tender Benchmarks • Target Performance • 150 Gfl on BoM operational applications • Applications • GASP :: BoM Global Spectral NWP Model • T479L50 • LAPS :: BoM Regional NWP Model • 8 km, 800x600, 29 levels • ASSIM :: BoM MVSI Data Analysis • T479L50, 1780 analysis subvolumes • CLIMAT :: BoM Global Spectral Climate Model • T95L50 • OCEAN :: BoM version of MOM-2 Ocean Model • clog :: I/O tester and I/O interference program
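To give a feel for the benchmark problem sizes, the spectral truncations can be translated into rough grid-point counts. The sketch below assumes a quadratic Gaussian grid of roughly 3T+1 longitudes (an assumption for illustration; the actual benchmark grids may differ):

```python
# Rough problem-size estimates for the benchmark resolutions.
# Assumes a quadratic Gaussian grid (~3T+1 longitudes, half as many latitudes);
# illustrative only, not the benchmark specification.
def spectral_grid_points(truncation, levels):
    nlon = 3 * truncation + 1
    nlat = nlon // 2
    return nlon * nlat * levels

print("GASP   T479L50  :", spectral_grid_points(479, 50), "grid points")  # ~52M
print("CLIMAT T95L50   :", spectral_grid_points(95, 50), "grid points")   # ~2M
print("LAPS 800x600x29 :", 800 * 600 * 29, "grid points")                 # ~14M
```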
BoM Tender Benchmarks • Best Effort calculations for each application, both • single run and • ensembles of several runs (to take account of possible scaling limitations in applications) • Workload Test • simulation of a real-time combination of research jobs and pre-emptive, high-priority operational suites • data and timing start and finish on a front-end server, to reflect that network speed matters for delivery of application results • MPI Developments for BoM Applications
MPI Version of Global Spectral Model (GASP) • Initial development done during a research visit by Dr Atsushi Kubota, Hiroshima City University • Transpose method • grid point arrays distributed over latitude • spectral arrays distributed over zonal (Fourier) wavenumber • transpose Fourier coefficients between the spectral and grid MPI distributions (and vice versa) during each spectral transform • hybrid approach: combination of MPI and multitasking parallelism
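A minimal sketch of the transpose idea, written with mpi4py and NumPy rather than the actual GASP Fortran code: Fourier data distributed over latitudes is redistributed with a single all-to-all so that each rank holds all latitudes for its own block of zonal wavenumbers, which is what the Legendre (spectral) transform needs.

```python
# Sketch of the spectral-transform transpose (not BoM code): redistribute
# Fourier coefficients from a latitude decomposition to a wavenumber
# decomposition with one all-to-all. Assumes nlat and nwave divide evenly
# by the number of MPI ranks.
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
size, rank = comm.Get_size(), comm.Get_rank()

nlat, nwave = 8 * size, 4 * size                 # illustrative dimensions
my_lats, my_waves = nlat // size, nwave // size

# Each rank starts with its band of latitudes and *all* wavenumbers.
fourier_by_lat = np.random.rand(my_lats, nwave)

# Pack so that block k (the wavenumbers owned by rank k) is sent to rank k.
sendbuf = np.ascontiguousarray(
    fourier_by_lat.reshape(my_lats, size, my_waves).transpose(1, 0, 2))
recvbuf = np.empty_like(sendbuf)

comm.Alltoall(sendbuf, recvbuf)

# Each rank now holds *all* latitudes for its own wavenumber block,
# ready for the Legendre (Fourier <-> spectral) transform.
fourier_by_wave = recvbuf.reshape(nlat, my_waves)
```

The inverse transpose is the same exchange in reverse; in the hybrid approach, multitasking parallelism is then applied within each node on top of this MPI decomposition.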
MPI Version of Global Spectral Model (GASP) • Other relevant computational features • multi-pass strategy used to allow memory saving in storage of grid point arrays; allowed T479L50 benchmark calculations to be developed and run on SX-5 during benchmark preparations • "slab physics" drivers combine multiple latitudes together to achieve longer vectors (especially beneficial on SX-5 for climate applications, i.e. at lower resolutions than NWP) • Eulerian dynamics MPI version completed, used in benchmarks • Semi-Lagrangian dynamics MPI version under development • 2-D decompositions possible within code architecture; considering 2-D version for future scalability extension
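A sketch of the "slab physics" idea (illustrative only, not the actual GASP driver; the array shapes and the trivial physics routine are assumptions): several latitude rows are stacked side by side before calling a vectorisable physics routine, so the inner loop length grows from nlon to nslab*nlon.

```python
# Sketch of a "slab physics" driver (not BoM code): combine several latitude
# rows into one long slab so the vectorised inner loop is nslab*nlon long
# instead of nlon, which better fills the vector pipes at low resolution.
import numpy as np

nlat, nlon, nlev, nslab = 48, 192, 50, 4     # illustrative sizes

temp = np.random.rand(nlat, nlev, nlon)      # e.g. a temperature field

def physics_column_slab(t_slab):
    # Placeholder for a real parameterisation; operates on a
    # (nlev, nslab*nlon) slab so the trailing dimension vectorises well.
    return t_slab * 1.0001                   # trivial "tendency"

for j0 in range(0, nlat, nslab):
    # Stack nslab latitude rows side by side along the longitude axis.
    slab = temp[j0:j0 + nslab].transpose(1, 0, 2).reshape(nlev, -1)
    result = physics_column_slab(slab)
    temp[j0:j0 + nslab] = result.reshape(nlev, nslab, nlon).transpose(1, 0, 2)
```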
MPI Version of Australian Region Model (LAPS) • MPI distribution over latitudes • Semi-Lagrangian dynamics, semi-implicit timestepping MPI version completed, used in tender benchmarks • Eulerian dynamics, explicit timestepping MPI version now developed (corresponds to current operational LAPS) • Hybrid M/T and MPI
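A latitude decomposition in a grid-point model needs neighbouring rows from adjacent ranks for horizontal differencing, and wider halos for semi-Lagrangian departure points. A minimal north/south halo-exchange sketch, again in mpi4py rather than the actual LAPS code; the halo width and field shape are assumptions:

```python
# Sketch of a north/south halo exchange for a latitude-decomposed grid-point
# model (illustrative only; not the LAPS arrangement).
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
size, rank = comm.Get_size(), comm.Get_rank()

nlon, nlat_global = 800, 600                 # benchmark-sized horizontal grid
my_lats = nlat_global // size                # assumes an even split
north = rank + 1 if rank + 1 < size else MPI.PROC_NULL
south = rank - 1 if rank > 0 else MPI.PROC_NULL

# Local field with one halo row at each end: row 0 = south halo, row -1 = north halo.
field = np.zeros((my_lats + 2, nlon))
field[1:-1, :] = rank                        # interior initialised per rank

# Send the northernmost interior row north while receiving the south halo from
# the southern neighbour, then the reverse; PROC_NULL makes edge calls no-ops.
comm.Sendrecv(field[-2], dest=north, recvbuf=field[0], source=south)
comm.Sendrecv(field[1], dest=south, recvbuf=field[-1], source=north)
```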
MPI Version of GenSI Data Analysis (ASSIM) • MPI distribution over analysis subvolumes • approximately 1800 subvolumes over the globe • Hybrid M/T and MPI • Observation space assimilation scheme • Uses iterative method to calculate localised inverse • use all obs that influence subvolume • update analysis for obs only inside subvolume • designed to support global and regional analysis systems
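Because the subvolumes are essentially independent (observations influencing a subvolume are used, but the analysis is updated only inside it), the MPI distribution is close to task parallelism. A rough sketch of how the ~1800 subvolumes might be shared out; solve_subvolume is a hypothetical stand-in for the real iterative localised-inverse solve, not the GenSI code:

```python
# Sketch of distributing independent analysis subvolumes over MPI ranks
# (illustrative only).
from mpi4py import MPI

comm = MPI.COMM_WORLD
size, rank = comm.Get_size(), comm.Get_rank()

N_SUBVOLUMES = 1780                              # as in the T479L50 benchmark

def solve_subvolume(sv):
    # Placeholder: in reality this selects all observations influencing the
    # subvolume, iteratively computes the localised inverse, and returns
    # analysis increments for points inside the subvolume only.
    return float(sv)

# Round-robin assignment gives a simple static load balance over subvolumes.
my_results = {sv: solve_subvolume(sv) for sv in range(rank, N_SUBVOLUMES, size)}

# Gather partial results so the full analysis can be assembled on rank 0.
all_results = comm.gather(my_results, root=0)
if rank == 0:
    analysis = {k: v for part in all_results for k, v in part.items()}
    assert len(analysis) == N_SUBVOLUMES
```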
BoM Applications Performance on NEC SX-6 Sustained performance in optimum conditions • ~ 30% Peak on O(100) processors • 98% -- 99.7% scalable • Expect ~ 80% of this performance for operational models on busy system
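These figures can be related back to the 150 Gflops RFT target. The arithmetic below uses round numbers taken from the bullets above, not measured benchmark results:

```python
# Rough arithmetic linking the quoted efficiency to the 150 Gflops RFT target
# (round numbers only, not measured benchmark figures).
cpus = 100                     # O(100) processors
peak_per_cpu = 8.0             # Gflops per SX-6 CPU
fraction_of_peak = 0.30        # ~30% of peak in optimum conditions
busy_system_factor = 0.80      # ~80% of that on a busy system

sustained = cpus * peak_per_cpu * fraction_of_peak
print(f"optimum sustained : {sustained:.0f} Gflops")                        # ~240
print(f"busy system       : {sustained * busy_system_factor:.0f} Gflops")   # ~192
```

Both figures sit comfortably above the 150 Gflops first-phase target.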
BoM Applications Performance on NEC SX-6 Single processor % of peak • measure of vectorisation efficiency of BoM applications
BoM Applications Performance on NEC SX-6 Scalability of MPI versions • 99.7% parallel on up to O(100) processors => potentially scalable to O(300) processors. • MPI and multitasked scaling results very similar.
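The scalability claim can be read through Amdahl's law: with a 99.7% parallel fraction, efficiency falls off only gradually up to a few hundred processors. This is a simple model that ignores communication and load-imbalance costs:

```python
# Amdahl's-law reading of "99.7% parallel" (a simple model that ignores
# communication and load-imbalance costs).
def amdahl_speedup(n, parallel_fraction=0.997):
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / n)

for n in (8, 100, 300):
    s = amdahl_speedup(n)
    print(f"{n:4d} CPUs: speedup {s:6.1f}, efficiency {s / n:5.1%}")
# -> ~7.8x at 8 CPUs, ~77x (77%) at 100 CPUs, ~158x (53%) at 300 CPUs
```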
Percentage of peak performance comparison between SX-4 & SX-5 & SX-6 • GASP global NWP model results • SX-4 and SX-6 deliver better % peak than SX-5
Capability vs Capacity • Attempt to balance the need for high peak capability for operational systems with good throughput capacity for research and trials. • The Workload and Ensemble Best Effort benchmarks both addressed this requirement. • Peak capability to meet deadlines for operational systems remains the most significant determinant.
4. BoM Modelling -- Current Systems Current operational or quasi-operational systems • Short Range • LAPS 0.375 deg Australian Region data assimilation and 3 day prediction system • LAPS 0.125 deg Australian Region 2 day prediction system • MESOLAPS 0.05 deg mesoscale 2 day prediction system • TCLAPS 0.15 deg 2 day Tropical Cyclone prediction system • AAQFS: Australian Air Quality Forecast System
BoM Modelling -- Current Systems • Medium Range • GASP T239L29 global data assimilation and 10 day prediction system • GASP 33-member T119L19 Ensemble Prediction System • Seasonal • POAMA coupled atmosphere-ocean 9-month forecast system • T47L17 BMRC atmospheric model, ACOM-2 ocean model • Decadal and Climate Change • AMIP, CMIP, C20C, Climate Feedback research
BoM Modelling -- Future Plans • Short Range • non-hydrostatic mesoscale and regional model • LAPS Ensemble Prediction System • warm running assimilation cycle • demonstrate impact of increased vertical resolution • demonstrate impact of increased horizontal resolution
BoM Modelling -- Future Plans • Medium Range • T359L50 likely resolution for 2004-2005 • T479L50 likely resolution for 2006-2007 • Generalised Statistical Interpolation analysis scheme • Ensemble Kalman Filter assimilation scheme research • demonstrate impact of greater use of available observational data • demonstrate impact of increased vertical resolution • demonstrate impact of increased horizontal resolution • GASP EPS: experiment with increased number of members and resolution
BoM Modelling -- Future Plans • Seasonal • POAMA2: • increase ensemble size for hindcast phase • POAMA3: • update ocean model to MOM-4 • increase ocean & atmosphere resolution • Decadal, Climate Change • further development of the coupled model for longer time-scale applications, esp. to reduce coupled model biases
Summary • BoM is in the process of upgrading its supercomputer facility for the next 4-8 year period. • NEC was the successful vendor in the 2003 BoM tender. • BoM is moving to a new Head Office in Melbourne Docklands with enhanced CCF facilities. • New MPI versions of BoM's major applications codes have proved to be fast and scalable to O(100+) processors. • BoM is continuing to expand its modelling capability and applications in line with scientific advances in NWP, seasonal and climate modelling.
Usual Suspects -- where are they now? • Bill Bourke • BMRC modelling & HPC expert; invited last time, invited this time, apologies both times • Geoff Love • last time’s BoM presenter • newly appointed Director, BoM, 1 month ago • Phil Tannenbaum • ex HPC industry • BoM-CSIRO HPCCC Manager since mid-2002
WGNE Table of Operational NWP Centres (November 2002; ref. Kamal Puri)