2005 ROMS/TOMS Workshop Scripps Institution of Oceanography La Jolla, CA, October 25, 2005

How Does One Build anAdjoint for ROMS/TOMS? Hernan G. Arango IMCS, Rutgers Bruce D. Cornuelle SIO, UCSD Emanuele Di Lorenzo Georgia Tech Arthur J. Miller SIO, UCSD Andrew M. Moore PAOS, U. Colorado 2005 ROMS/TOMS Workshop Scripps Institution of Oceanography La Jolla, CA, October 25, 2005

Long-Term Goals • To design, develop, and test an expert ocean modeling system for high-resolution scientific and operational applications over a wide range of scales from estuaries to regional to global. • To provide the ocean modeling community with analysis and prediction tools that are available in meteorology and Numerical Weather Prediction (NWP) . Are we closer to operational weather prediction systems?

Objectives • To explore the factors that limit the predictability of the circulation in regional models in a variety of dynamical regimes • To build 4D variational assimilation platforms: strong and weak constraint 4DVAR • To build a Generalized Stability Theory (GST) analysis platforms to study the dynamics, sensitivity and stabilityof the ocean circulationto naturally occurring perturbations • To build an ensemble prediction platform by perturbing forcing, initial, and boundary conditions with GST singular vectors

ROMS/TOMS Framework

Accomplishments • ROMS/TOMS version 2.2 released to full user community and version 3.0 released beta testers on May 26, 2005. Currently, ROMS and TOMS are identical. • Rewrote tangent linear (TLM), representer (RPM) and adjoint (ADM) models in Fortran 90 to improve the efficiency and multiple levels of nesting. • Parallelized TLM, RPM and ADM. • Designed a single makefile structure to facilitate compiling in any computer architecture: 477 files, 340514 lines of code, 1093905 words, 13440936 characters • Continued to develop web-based documentation • http://www.ocean-modeling.org/ • http://marine.rutgers.edu/po/models/roms/index.php

Build tangent linear model by linearizing the nonlinear model around a small perturbation NLM TLM • It is called tangent because the linearization is around a time-evolving solution which geometrically represents the tangent slopes to the NLM trajectory in phase space. • The TLM is hand-coded using Giering and Kaminski (1998) recipes. How to Build and Adjoint

The discrete adjoint model operator relative to the L2-norm can be derived by multiplying each line of the tangent linear model code by the corresponding adjoint variable, , and then differentiate with respect to the tangent linear variable, : ADM Adjoint Operator

A Simple Example • Let’s consider a simple 1D horizontal diffusion code • The hand-coding of the tangent linear, adjoint, and representer codes is illustrated in the following FLASH animations (Need to install Swiff Point Player to insert and play flash movies in PowerPoint, checkhttp://www.globfx.com) Alternatively, http://marine.rutgers.edu/po/adjoint/tlm.html http://marine.rutgers.edu/po/adjoint/adm.html http://marine.rutgers.edu/po/adjoint/rpm.html

Tangent Linear Model Recipes f= c f=x f= c *x f =xn f=x*y f= 1 /x f= 1 / sqrt(x) f= c /x f= x/ y f= 1 / (x+y) f= 1 / ((x+y) * (u+v)) f= (c +u) / (x+y) f= max(x,y) f=min(x,y) f= max(x, c) f= max(c,x) f=min(x, c) f= min(c,x) f= abs(x) f= sqrt(x) f=sqrt(x2+y2) f= log(x) f= 1 /log(x) f= exp(x) f=sin(x) f=cos(x) f=tan(x) f=sinh(x) f=cosh(x) f=asin(x) f=acos(x) f=atan(x) f=atan2(x,y) tl_f= 0 tl_f=tl_x tl_f= c *tl_x tl_f= n *xn-1 *tl_x tl_f= tl_x*y+x*tl_y tl_f= -f2 *tl_x tl_f= -f3 * 0.5 *tl_x tl_f= - c *tl_x / x2 =-f* tl_x*x tl_f= (y*tl_x– x*tl_y) /y2 tl_f= -f2 * (tl_x+tl_y) tl_f= -f2* ((tl_x+tl_y) * (u+v) -(x+y) * (tl_u +tl_v)) tl_f= +tl_u/ (x+y) -f2* (tl_x+tl_y) tl_f= (0.5 +sign(0.5,x-y)) *tl_x+ (0.5 – sign(0.5,x-y)) *tl_y tl_f= (0.5 + sign(0.5,y-x)) *tl_x+ (0.5 – sign(0.5,y- x)) *tl_y tl_f= (0.5 + sign(0.5,x- c)) *tl_x tl_f= (0.5 - sign(0.5, c –x)) *tl_x tl_f= (0.5 + sign(0.5, c -x)) *tl_x tl_f= (0.5 - sign(0.5,x– c)) *tl_x tl_f= sign(1.0,x) *tl_x tl_f= 0.5 *tl_x/ f tl_f= (x* tl_x+y* tl_y) / f tl_f= tl_x/x tl_f= -f2* (tl_x/ x) tl_f= exp(x) *tl_x tl_f= cos(x) *tl_x tl_f= -sin(x) *tl_x tl_f=tl_x/ (cos(x) *cos(x)) tl_f=cosh(x) *tl_x tl_f= -sinh(x) *tl_x tl_f= tl_x/ sqrt(1.0– x2) tl_f= -tl_x/ sqrt(1.0 –x2) tl_f=tl_x/sqrt(1.0 +x2) tl_f= (y*tl_x–x* tl_y) / (x2 +y2) Nonlinear Tangent

Representer Model Recipes Consider the product of two time-dependent state variablesxandyand let x=xo+ x’ and x’ =x - xo(1) y=yo+ y’ and y’ = y - yo(2) wherexoandyoare the basic state variables andx’andy’are their perturbations. Then, x* y=(xo+ x’)*(yo+ y’)=xo*yo+ x’ *yo+xo* y’ + x’ * y’ Eliminating prime terms using (1) and (2) and neglecting high order terms involvingprime variables yield x*y=xo*yo+(x-xo)*yo+xo*(y-yo) =xo*yo+x*yo-xo*yo+xo*y-xo*yo =x*yo+xo*y-xo*yo using the above previous nomenclature (tl_x=x,tl_y=y,x =xo,y=yo) we get f=x*ytl_f=tl_x*y+x*tl_y–x *y =tl_x*y+ x*tl_y–f Notice that to obtain the RPM from the TLM for the above expression, we just need tosubtract x*y which is the same asf.

f= c f=x f= c *x f=xn f =x *y f= x *y*u f= 1 /x f= 1 /sqrt(x) f= c /x f=x/y f= (x*y) /u f= 1 / (x*y) f= 1 / (x+y) f= 1 / ((x+y) * (u+v)) f= max(c,x) f=sqrt(x) f=sqrt(x2+y2) f= exp(x) f= exp(c *x) f=sin(x) tl_f= c tl_f= tl_x SAME! tl_f= c *tl_x SAME! tl_f= n *xn-1*tl_x– (n – 1) *f tl_f=tl_x*y+x*tl_y– f tl_f=tl_x*y*u+x* (tl_y* u+ y * tl_u) – 2*f tl_f= -f2 *tl_x+ 2*f tl_f= -f3* 0.5 *tl_x+ 1.5 *f=f* (1.5 -f2* 0.5 *tl_x) tl_f= -f* tl_x/x+ 2 *f= f*(2 -tl_x / x) tl_f= (y*tl_x–x*tl_y) /y2 +f tl_f= (u* (tl_x*y+x*tl_y) –x*y*tl_u) /u2SAME! tl_f= -f2* (tl_x*y+x*tl_y) + 3 *f tl_f= -f2* (tl_x+tl_y) + 2 *f tl_f= -f2* (tl_x+tl_y) * (u+v) -f2 * (x+y) * (tl_u +tl_v) + 2 *f tl_f= (0.5 - sign(0.5, c –x)) *(tl_x–x)+ (0.5 + sign(0.5, c -x)) * c tl_f= 0.5 *tl_x / f+ 0.5 *f= 0.5 * (f+tl_x/ f) tl_f= (x* tl_x + y* tl_y) /f SAME! tl_f=tl_x*f+ (1 –x) *f tl_f= c *tl_x*f+ (1 – c *x) *f tl_f=tl_x*cos(x) +f–xcos(x) Representer Model Recipes

Recursive expressions DOi=IstrU, Iend DC1(i,0)= DC(i,0)! intermediate cff1 = 1.0_r8 / ( CF(i,0) * on_u(i,j) ) DC(i,0)= ( DC(i,0) * on_u(i,j) - DU_avg1(i,j) ) * cff1! recursive END DO • Vertical integrations (pressure gradient) • Tri-diagonal algorithms (implicit vertical mixing, vertical sinking, parabolic spline reconstruction) • Rescaling of state variables (T*Hz) • Use of adjoint private arrays • Zeroing out of global and private adjoint variables after being used (TLM to ADM) • Redundant operations are wrong in adjoint codes • Model initialization (ini_fields: compute u and v) Beware

Parallelization • The NLM, TLM, and RPM can be run in either shared-memory (OpenMP) or distributed-memory (MPI) • The ADM can only be run in distributed-memory (ADM violates shared-memory collision rules) • Aggregation of variables for MPI communications CALL ad_mp_exchange2d (ng, iADM, 3, Istr, Iend, Jstr, Jend, & & LBi, UBi, LBj, UBj, & & NghostPoints, EWperiodic, NSperiodic, & &ad_Zt_avg1, ad_DU_avg1, ad_DV_avg1)

East-West MPI Communications Nonlinear With Respect To Tile R With Respect To Tile R Adjoint

North-South MPI Communications With Respect to Tile B Nonlinear Adjoint

Profiling Elapsed CPU time (seconds): Resolution: 0256x0128x030, Parallel Nodes: 4, Tiling: 002x002 Node 0 Node 1 Node 2 Node 3 Total CPU: 27.910 27.913 27.917 27.916 111.656 Model Elapsed Time Profile: Initialization ................................... 2.092 ( 1.8737 %) 2.086 ( 1.8680 %) Reading of input data ............................ 1.942 ( 1.7389 %) 1.943 ( 1.7398 %) Processing of input data ......................... 0.264 ( 0.2362 %) 0.264 ( 0.2360 %) Computation of vertical boundary conditions ...... 0.007 ( 0.0058 %) 0.008 ( 0.0072 %) Computation of global information integrals ...... 0.024 ( 0.0212 %) 0.032 ( 0.0288 %) Writing of output data ........................... 1.093 ( 0.9786 %) 1.106 ( 0.9908 %) Model 2D kernel..................................4.393 ( 3.9345 %) 5.927 ( 5.3084 %) 2D/3D coupling, vertical metrics ................. 0.067 ( 0.0602 %) 0.131 ( 0.1171 %) Omega vertical velocity .......................... 0.132 ( 0.1181 %) 0.160 ( 0.1436 %) Equation of state for seawater ................... 0.199 ( 0.1781 %) 0.320 ( 0.2863 %) 3D equations right-side terms .................... 0.382 ( 0.3422 %) 0.548 ( 0.4905 %) 3D equations predictor step ...................... 0.759 ( 0.6797 %) 1.104 ( 0.9888 %) Pressure gradient ................................ 0.414 ( 0.3704 %) 0.627 ( 0.5619 %) Harmonic stress tensor, S-surfaces ............... 0.159 ( 0.1422 %) 0.244 ( 0.2181 %) Corrector time-step for 3D momentum .............. 0.487 ( 0.4362 %) 0.569 ( 0.5097 %) Corrector time-step for tracers .................. 0.495 ( 0.4431 %)0.652 ( 0.5842 %) Total: 12.906 11.5589 15.791 14.1428 Message Passage profile: Message Passage: 2D halo exchanges ............... 0.790 ( 0.7072 %) 1.115 ( 0.9990 %) Message Passage: 3D halo exchanges ............... 0.272 ( 0.2437 %) 0.286 ( 0.2559 %) Message Passage: 4D halo exchanges ............... 0.028 ( 0.0254 %) 0.052 ( 0.0462 %) Message Passage: data broadcast .................. 0.024 ( 0.0217 %) 0.024 ( 0.0213 %) Message Passage: data reduction .................. 0.002 ( 0.0020 %) 0.010 ( 0.0092 %) Message Passage: data gathering .................. 0.371 ( 0.3321 %) 0.367 ( 0.3283 %) Message Passage: data scattering.................. 2.612 ( 2.3392 %) 2.603 ( 2.3316 %) Total: 4.099 3.6713 4.457 3.9915 TLM ADM

Makefile

CPP Options Adjoint it

Validation Tests • ROMS/TOMS is continuously evolving so we need a process to check the validity of all its algorithms • Currently, we have four drivers in ocean_control.F to test the correctness of the TLM, RPM, and ADM: • tlcheck_ocean.hTLM_CHECK • picard_ocean.hPICARD_TEST • grad_ocean.hGRADIENT_CHECK • pert_ocean.hINNER_PRODUCT orSANITY_CHECK

This is the most stringent test: it never fails! • It is highly recommended to run this test on all applications and CPP configurations • It checks the symmetry between TLM and ADM state vectors: A – (A )T = 0 within round off • If this difference is large, it tell you that either the TLM or ADM is incorrect Sanity Check

How To Run Sanity Check • Simply activate SANITY_CHECK CPP option in your application • Initialize from rest and force with desired nonlinear background state • Specify interior point to perturb with a delta function in ocean.in generic user parameters: • INT(user(1))TLM state variable to perturb (zeta, ubar, vbar, u, v, t, …) • INT(user(2))ADMstate variable to perturb ( 1 , 2 , 3 , 4, 5, 6, …) • INT(user(3))I-index ofTLMvariable to perturb • INT(user(4))I-index ofADMvariable to perturb • INT(user(5))J-index ofTLMvariable to perturb • INT(user(6))J-index ofADMvariable to perturb • INT(user(7))K-index ofTLMvariable to perturb • INT(user(8))K-index ofADMvariable to perturb

Sanity Check Examples Sanity Check - Tangent: tl_zeta perturbed at (i,j) = 50 50 Sanity Check - Adjoint: ad_zeta perturbed at (i,j) = 50 50 Sanity Check - Perturbing variable: zeta Sanity Check - Tangent: 1.624754814645E-05 at (i,j) 50 50 Sanity Check - Adjoint: 1.624754814645E-05 at (i,j) 50 50 Sanity Check - Difference: 9.893344823930E-19 at (i,j) 50 50 Sanity Check - Tangent: tl_u perturbed at (i,j,k) = 110 40 30 Sanity Check - Adjoint: ad_u perturbed at (i,j,k) = 110 40 30 Sanity Check - Perturbing variable: u Sanity Check - Tangent: 1.105644477697E-02 at (i,j,k) 110 40 30 Sanity Check - Adjoint: 1.105644477697E-02 at (i,j,k) 110 40 30 Sanity Check - Difference: -3.469446951954E-18 at (i,j,k) 110 40 30

Final Remarks • Maintenance of TLM, RPM, and ADM algorithms • Automatic differentiation • Linearization of physics • Non-differentiable algorithms (vertical mixing parameterizations) • Training and documentation

2005 ROMS/TOMS Workshop Scripps Institution of Oceanography La Jolla, CA, October 25, 2005