This project aims to enhance the performance of Arctic forecast models on advanced architectures. It focuses on the sea ice model (CICE), the global ocean model (HYCOM), and the wave model (WaveWatch III), addressing challenges in computational intensity, parallelism, and data transfer to enable more accurate predictions of polar ice and global ocean conditions.
Accelerated Prediction of Polar Ice and Global Ocean (APPIGO): Overview
Phil Jones (LANL), Eric Chassignet (FSU), Elizabeth Hunke and Rob Aulwes (LANL), Alan Wallcraft and Tim Campbell (NRL-SSC), Mohamed Iskandarani and Ben Kirtman (Univ. Miami)
Arctic Prediction • Polar amplification • Rapid ice loss, feedbacks • Impacts on global weather • Human activities • Infrastructure, coastal erosion, permafrost melt • Resource extraction • Shipping • Security/safety, staging • Regime change • Thin ice leads to more variability [Photos: Shell's Kulluk Arctic oil rig aground in the Gulf of Alaska (USCG photo); LNG carrier Ob River in a winter crossing with icebreakers]
Interagency Arctic efforts • Earth System Prediction Capability (ESPC) Focus Area • Sea ice prediction: up to seasonal • Sea Ice Prediction Network (SIPN) • Sea Ice Outlook • This project – enabling better prediction through model performance
APPIGO • Enhance performance of Arctic forecast models on advanced architectures with a focus on: • Los Alamos CICE – sea ice model • HYCOM – global ocean model • WaveWatch III – wave model • Components of Arctic Cap Nowcast/Forecast System (ACNFS), Global Ocean Forecast System (GOFS)
Proposed Approach • Refactoring: incremental • Profile • Accelerate sections (slower at first) • Expand accelerated sections • Can test along the way • Try directive-based and other approaches • Optimized • Best possible for specific kernels • Abstractions, larger-scale changes (data structures) • In parallel: optimized operator library • Stennis (HYCOM, Phi/many-core), LANL (GPU, CICE, HYCOM), Miami (operators), FSU (validation, science)
APPIGO proposed timeline • Year 1 • Initial profiling • Initial acceleration (deceleration!) • CICE: GPU • HYCOM: GPU, Phi (MIC) • WW3: hybrid scalability • Begin operator libs • Year 2 • Continued optimization • Expand accelerated regions (change sign) • Abstractions, operator lib • Year 3 • Deploy in models and validate with science
Focus on CICE: Challenges • CICE • Dynamics (EVP rheology) • Transport • Column physics (thermodynamics, ridging, etc.) • Quasi-2D • Number of levels and thickness classes is small • Parallelism • Not enough in the horizontal domain decomposition alone • Computational intensity • Maybe not enough work for efficient kernels • BGC and new improvements help
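Below is a minimal sketch (C with OpenACC, not actual CICE code) of one way to expose more parallelism than the horizontal decomposition alone provides: collapsing the small thickness-category loop with the horizontal-cell loop. The routine name, array layout, and the update itself are illustrative assumptions.

```c
/* Hedged sketch, not CICE source: expose more parallelism for column
 * physics by collapsing the (small) thickness-category loop with the
 * horizontal-cell loop. Names and sizes are illustrative only. */
#define NCAT   5        /* thickness categories: small             */
#define NCELLS 100000   /* horizontal cells in one block           */

void column_physics_step(double *restrict t_ice,      /* [NCAT][NCELLS] */
                         const double *restrict flux, /* [NCELLS]       */
                         double dt)
{
    /* collapse(2) gives NCAT*NCELLS independent work items instead of
     * only NCELLS, which helps keep a GPU busy despite the quasi-2D
     * nature of the column physics. */
    #pragma acc parallel loop collapse(2) copy(t_ice[0:NCAT*NCELLS]) copyin(flux[0:NCELLS])
    for (int n = 0; n < NCAT; n++) {
        for (int i = 0; i < NCELLS; i++) {
            /* stand-in for the real thermodynamic update */
            t_ice[n * NCELLS + i] += dt * flux[i];
        }
    }
}
```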
Accelerating CICE with OpenACC • Focused on dynamics • Halo updates presented a significant challenge • Attempted to use GPUDirect to avoid extra GPU-CPU data transfers • What we tried • Refactored loops to get more computation onto the GPU • Fused separate kernels • Used OpenACC streams (async queues) to get concurrent execution and hide data-transfer latencies
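A hedged sketch of the async-queue idea described above, in C with OpenACC rather than the model's Fortran; the routine, array names, and the update are illustrative, not the actual CICE dynamics. The point is that the halo rows are computed and copied back on one queue while the interior runs on another, hiding part of the transfer latency behind computation.

```c
/* Hedged sketch, not the actual CICE dynamics: split the halo rows
 * from the interior so the device-to-host copy needed for the MPI
 * halo exchange can overlap with interior computation. */
void evp_like_update(double *restrict u, const double *restrict strain,
                     int nx, int ny)
{
    const int n = nx * ny;

    #pragma acc data copy(u[0:n]) copyin(strain[0:n])
    {
        /* 1. Halo rows first, on queue 1 ... */
        #pragma acc parallel loop async(1)
        for (int i = 0; i < nx; i++) {
            u[i]          += 0.5 * strain[i];           /* south row */
            u[n - nx + i] += 0.5 * strain[n - nx + i];  /* north row */
        }
        /* ... so copying them back for the halo exchange can start
         * while the interior is still being computed. */
        #pragma acc update self(u[0:nx], u[n-nx:nx]) async(1)

        /* 2. Interior rows on queue 2, overlapping the queue-1 copy. */
        #pragma acc parallel loop async(2)
        for (int i = nx; i < n - nx; i++)
            u[i] += 0.5 * strain[i];

        #pragma acc wait   /* both queues done; MPI exchange would follow */
    }
}
```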
HYCOM Progress: Large Benchmark • Standard DoD HPCMP HYCOM 1/25 global benchmark • 9000 by 6595 by 32 layers • Includes typical I/O and data sampling • Benchmark updated from HYCOM version 2.2.27 to 2.2.98 • Land masks in place of do-loop land avoidance • Dynamic vs. static memory allocation
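For readers unfamiliar with the land-mask change, here is a minimal sketch (C, not HYCOM source) of the idea: replace loop bounds or branches that skip land with a precomputed 0/1 sea mask so the loop body is uniform and vectorizes cleanly. The routine and array names are hypothetical.

```c
/* Hedged sketch of the land-mask idea, not HYCOM source: multiply by a
 * precomputed 0/1 sea mask instead of branching around land points, so
 * the loop body is branch-free and easy for the compiler to vectorize. */
void update_layer_thickness(double *restrict h,
                            const double *restrict tend,
                            const double *restrict seamask, /* 1.0 = ocean, 0.0 = land */
                            int n, double dt)
{
    for (int i = 0; i < n; i++)
        h[i] += dt * tend[i] * seamask[i];   /* land points stay unchanged */
}
```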
HYCOM Progress: Large Benchmark • On the Cray XC40: • Using huge pages improves performance by about 3% • Making the first dimension of all arrays a multiple of 8 saved 3-6% • Requires changing only a single number in the run-time patch.input file • ifort -align array64byte • [Plot: total core hours per model day vs. number of cores, across three generations of Xeon cores; no single-core improvement, but 8 vs. 12 vs. 16 cores per socket]
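A small sketch of the padding and alignment idea, written in C rather than HYCOM's Fortran; alloc_field and the dimension names are illustrative assumptions. It rounds the leading dimension up to a multiple of 8 doubles and allocates on a 64-byte boundary, which is the same effect the patch.input change and ifort -align array64byte aim for.

```c
/* Hedged sketch, not HYCOM code: pad the leading (fastest-varying)
 * dimension to a multiple of 8 doubles so every column starts on a
 * 64-byte cache-line / SIMD boundary. */
#include <stdlib.h>

#define ROUND_UP8(n) (((n) + 7) & ~7)   /* next multiple of 8 */

double *alloc_field(int idm, int jdm, int *idm_padded)
{
    *idm_padded = ROUND_UP8(idm);       /* e.g. 1501 -> 1504 */
    double *p = NULL;

    /* 64-byte-aligned base plus a padded leading dimension keeps each
     * column starting on an aligned address. */
    if (posix_memalign((void **)&p, 64,
                       (size_t)(*idm_padded) * (size_t)jdm * sizeof(double)) != 0)
        return NULL;
    return p;
}
```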
HYCOM on Xeon Phi • Standard gx1v6 HYCOM benchmark run in native mode on 48 cores of a single 5120D Phi attached to the Navy DSRC's Cray XC30 • No additional code optimization • Compared to 24 cores of a single Xeon E5-2697v2 node • Individual subroutines run 6 to 13 times slower • Overall, 10 times slower • Memory capacity is too small • I/O is very slow • Native mode is not practical • Decided not to optimize for Knights Corner; Knights Landing is very different • Self-hosted Knights Landing nodes • Up to 72 cores per socket, lots of memory • Scalability of 1/25 global HYCOM makes this a good target • May need additional vector (AVX-512F) optimization • I/O must perform well
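As an illustration of the kind of loop that would need to thread and vectorize well on self-hosted Knights Landing, here is a hedged C/OpenMP sketch (not HYCOM source; the routine and arrays are assumptions): threads across the outer loop, an explicit SIMD hint on the inner loop, and an AVX-512 target at compile time (e.g. icc -qopenmp -xMIC-AVX512).

```c
/* Hedged sketch, not HYCOM source: OpenMP threads over the outer (j)
 * loop and a SIMD hint on the inner (i) loop so the compiler can emit
 * AVX-512 code for the many cores of a Knights Landing socket. */
#include <stddef.h>

void pressure_like_update(double *restrict p, const double *restrict dp,
                          int idm, int jdm)
{
    #pragma omp parallel for
    for (int j = 0; j < jdm; j++) {
        #pragma omp simd
        for (int i = 0; i < idm; i++)
            p[(size_t)j * idm + i] += dp[(size_t)j * idm + i];
    }
}
```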
Validation Case • CESM test case • HYCOM (2.2.35), CICE • Implementation of flux exchange • HYCOM, CICE in G compset • Three 50-year experiments • CORE v2 forcing • HYCOM in CESM w/ CICE • POP in CESM w/ CICE • HYCOM standalone w/ CICE
Lessons Learned • Hosted accelerators suck • Programming models, software stack immature • Couldn't even build at the Hackathon a year ago • Substantial improvement since: can build and run to break-even at the 2015 Hackathon • OpenACC can compete with CUDA, 2-3x speedup • Based on ACME atmosphere experience • GPU Direct • Need to expand accelerated regions beyond a single routine to gain performance (see the data-region sketch below) • We have learned a great deal and gained valuable experience
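The sketch below illustrates what expanding accelerated regions beyond a single routine can look like with OpenACC: an enclosing data region keeps fields resident on the GPU across several accelerated calls, so data moves only at the start and end. This is a generic C example under assumed names (timestep_loop, dynamics_kernel, transport_kernel), not the project's actual code.

```c
/* Hedged sketch: two accelerated routines share one enclosing data
 * region, so the fields stay on the device for the whole time loop. */
static void dynamics_kernel(double *restrict u, const double *restrict v, int n)
{
    /* "present" asserts the data is already on the device,
     * so this call performs no host<->device transfers of its own. */
    #pragma acc parallel loop present(u[0:n], v[0:n])
    for (int i = 0; i < n; i++)
        u[i] += 0.1 * v[i];
}

static void transport_kernel(const double *restrict u, double *restrict v, int n)
{
    #pragma acc parallel loop present(u[0:n], v[0:n])
    for (int i = 0; i < n; i++)
        v[i] += 0.1 * u[i];
}

void timestep_loop(double *u, double *v, int n, int nsteps)
{
    /* One data region spanning both routines: fields are copied to the
     * GPU once, reused every step, and copied back once at the end. */
    #pragma acc data copy(u[0:n], v[0:n])
    {
        for (int step = 0; step < nsteps; step++) {
            dynamics_kernel(u, v, n);
            transport_kernel(u, v, n);
        }
    }
}
```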
APPIGO Final Year • CICE • Continue, expand OpenACC work • Column physics • HYCOM • Revisit OpenACC • Continue work toward Intel Phi • Continue validation/comparison • Coupled and uncoupled
APPIGO Continuation? • Focus on path to operational ESPC model • Continued optimization, but focus on coverage, incorporation into production models • CICE, HYCOM on Phi (threading), GPU (OpenACC) • WWIII? • Science application • Use coupled sims to understand Arctic regime change • Throw Mo under the bus: Abandon stencils • Too fine granularity