Integrating GEOS-Chem into the NASA GEOS-5 GCM
or: What I learned on my field trip to NASA/Goddard Space Flight Center
Bob Yantosca, Senior Software Engineer, GEOS-Chem Support Team
10 Apr 2013
With: Mike Long, Christoph Keller, Melissa Payer (Harvard); Steven Pawson, Arlindo da Silva, Eric Nielsen (GSFC)
Integrating GEOS-Chem into the GEOS-5 GCM
Table of Contents
• Motivation
• The project
• The GEOS-5 GCM
• Modifications to the existing GEOS-Chem
• Results
• Future directions
Motivation
We'd like to perform these “Big Science” simulations:
• With ultra-high resolution (i.e. ¼ degree)
• With satellite data products
• With aerosol microphysics
• With coupled climate-chemistry
• With data assimilation
All of these require lots of computational power!
Motivation
The current GEOS-Chem has limitations:
• Can only run with offline met fields
• Not enough memory for global simulations at high resolution (½ or ¼ degree)
• Only 1-way nesting is possible now
• Lots of legacy code; needs cleanup
The Project
We proposed to do the following:
• Develop a Grid-Independent GEOS-Chem that can be embedded into the NASA GEOS-5 GCM
• Use this combined GEOS-5 GCM/GEOS-Chem model for data assimilation simulations
• Create a new standalone driver for the grid-independent GEOS-Chem
Grid-Independent GEOS-Chem
The problem:
• Historical development of GEOS-Chem relied on a code structure that used 3D (lon, lat, alt) or 4D (lon, lat, alt, quantity) arrays. At ultra-fine horizontal resolution, the memory requirements of these large arrays can make global simulations impractical.
The solution:
• We have begun an innovative new project with NASA/GMAO to make GEOS-Chem grid-independent.
• We leverage the fact that all relevant chemical processes (e.g. chemistry, emissions, deposition) can be applied to a single atmospheric column.
• We shall send groups of individual columns to each CPU in a cluster computer to improve parallelization.
• We remove all hardwiring that would prevent GEOS-Chem from operating in this fashion.
Advantages of grid-independence:
• This approach shall facilitate the introduction of MPI parallelization and the Earth System Modeling Framework (ESMF) into GEOS-Chem. Goal: to run on 100s or 1000s of CPUs simultaneously.
• The grid-independent GEOS-Chem can be interfaced with existing data assimilation systems and Earth System Models (ESMs).
• The grid-independent GEOS-Chem can be driven by other met data products.
Vision:
• The grid-independent GEOS-Chem will become the standard GEOS-Chem model. GCST has already begun to add modifications for grid-independence into standard GEOS-Chem. This will ensure that the …
[Figure: chemistry on one column, from Psfc up to Ptop, with IIPAR*JJPAR columns in the entire “world” grid. Panel label: Current GEOS-Chem]
The Project
Development is moving along these lines:
• “Chemistry Component”
  • Will connect GEOS-Chem's chemistry solver, drydep, photolysis, and related routines to the GEOS-5 GCM
• “Emissions Component”
  • Will read & regrid GEOS-Chem emissions, and pass them as inputs to the Chemistry Component
The Project
For now we shall let the GEOS-5 GCM handle the following operations:
• Advection
• PBL mixing
• Cloud convection
• Wet scavenging
Eventually these will also be handled by the standalone GEOS-Chem.
About the GEOS-5 GCM
• The GEOS-5 GCM is developed by the Global Modeling and Assimilation Office (GMAO) at the NASA Goddard Space Flight Center in Greenbelt, MD.
• Mike and I visited there in May 2012 and in Feb 2013 to hand off GEOS-Chem code to GMAO.
[Map of NASA/GSFC, from maps.google.com]
About the GEOS-5 GCM
• The GEOS-5 GCM forms part of the GEOS-DAS assimilation system.
• Atmospheric observations from several different platforms are assimilated into the GCM.
• The DAS produces regularly-gridded met data fields, which can be used to drive CTMs such as GEOS-Chem and the GMI model.
[Figure labels: GEOS-5 GCM; GEOS met field products: Assim + Fcst]
About the GEOS-5 GCM
• The GEOS-5 GCM is a hierarchical structure of individual components.
• Each component can be run standalone, or can be coupled into the framework of the GCM.
• Components pass data between each other using the ESMF library and MPI parallelization.
• The path from the top level to the GEOS-Chem Chemistry Component (red text in the original chart) is src → GEOSgcs → GEOSgcm → GEOSagcm → GEOSphysics → GEOSchem → GEOSCHEMchem.
• Some GCM components have been omitted for clarity. Directory names end in _GridComp; that suffix has been omitted from this chart.

    src
      GEOSgcs
        GEOSgcm
          GEOSagcm
            GEOSphysics
              GEOSchem
                GEOSCHEMchem
                GOCART
                CARMAchem
                MAMchem
                GAAS
                GMIchem
                STRATchem
              GEOSgwd
              GEOSmoist
              GEOSrad
              GEOSsurf
              GEOSturb
            GEOSsuperdyn
          GEOSogcm
About the GEOS-5 GCM
• The GEOS-5 GCM runs on the Discover supercomputer of the NASA Center for Climate Simulation (NCCS), also located at NASA/GSFC.
• As of November 2012, Discover was the 53rd fastest computer on the planet Earth (cf. top500.org).
About the GEOS-5 GCM
More about the Discover supercomputer:
• # of racks: 67
• # of CPUs: 43,240
  • CPUs are grouped into nodes, with 12 CPUs/node
• Computational speed: 1 Pflop/s
• Storage: 2.46 PB
  • External archival storage system (Dirac)
NOTE: P = Peta = 10^15
Software layers in the GEOS-5 GCM between you and Discover
• Linux (Operating System)
• MAPL (Convenience library for ESMF)
• Earth System Modeling Framework (ESMF) library
• netCDF (I/O Library)
• Message Passing Interface (Parallelization Library)
• Fortran 90 (Programming Language)
Fortran-90 Pointers
• Pointers let you “alias” whole arrays or slices of other arrays.
• Advantage: you avoid making a duplicate copy of the data.

    REAL*8, POINTER :: p_STT(:,:,:)        ! Declaration
    p_STT => STT(:,:,:,IDTNOx)             ! Just get NOx
    p_STT => STT(:,:,LLPAR:1:-1,IDTCO)     ! Get CO, flip in vertical
    NULLIFY( p_STT )                       ! Disassociate the pointer (STT itself is untouched)
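The pointer statements above can be tried out in a small standalone program. This is only a sketch under stated assumptions: the array sizes, the tracer indices, and the program name are made up for illustration and are not taken from GEOS-Chem. Note that the array being aliased must carry the TARGET attribute.

```fortran
PROGRAM pointer_demo
  IMPLICIT NONE

  ! Hypothetical small dimensions and tracer indices (illustration only)
  INTEGER, PARAMETER :: IIPAR = 4, JJPAR = 3, LLPAR = 5, N_TRACERS = 2
  INTEGER, PARAMETER :: IDTNOx = 1, IDTCO = 2

  ! An array can only be aliased if it has the TARGET attribute
  REAL*8, TARGET  :: STT(IIPAR,JJPAR,LLPAR,N_TRACERS)
  REAL*8, POINTER :: p_STT(:,:,:)
  INTEGER         :: L

  ! Fill CO so that each level holds its own level index
  STT = 0d0
  DO L = 1, LLPAR
     STT(:,:,L,IDTCO) = DBLE( L )
  ENDDO

  ! Alias the NOx slice: writing through the pointer changes STT itself
  p_STT => STT(:,:,:,IDTNOx)
  p_STT(1,1,1) = 42d0
  IF ( STT(1,1,1,IDTNOx) /= 42d0 ) ERROR STOP 'alias did not write through'

  ! Alias the CO slice, flipped in the vertical: level 1 now maps to LLPAR
  p_STT => STT(:,:,LLPAR:1:-1,IDTCO)
  IF ( p_STT(1,1,1) /= DBLE( LLPAR ) ) ERROR STOP 'vertical flip failed'

  ! NULLIFY only breaks the association; no memory is released
  NULLIFY( p_STT )
  IF ( ASSOCIATED( p_STT ) ) ERROR STOP 'pointer is still associated'
  IF ( STT(1,1,1,IDTNOx) /= 42d0 ) ERROR STOP 'target data was lost'

  PRINT *, 'pointer_demo: all checks passed'
END PROGRAM pointer_demo
```

The last two checks show why NULLIFY is not a deallocation: the pointer is gone, but the data in STT is still there.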
Fortran-90 Derived Type Objects
• Derived type objects are a “bucket o' variables”.
• You get to define the data fields that constitute the object.

    TYPE myType
       INTEGER         :: myInt          ! Integer value
       REAL*8          :: myDouble       ! REAL*8 value
       REAL*8, POINTER :: myArray(:,:)   ! Dynamic array
    END TYPE myType

    SUBROUTINE mySub( myObject )
       TYPE(myType), INTENT(IN) :: myObject
       ...
       Var1 = myObject%myInt
       Var2 = myObject%myDouble
       ...etc...
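A complete, compilable version of the derived-type example might look like the sketch below. The module name, the extra `total` argument, and the values are assumptions added so the program has something to check; only `myType` and its three fields come from the slide.

```fortran
MODULE my_type_mod
  IMPLICIT NONE

  TYPE myType
     INTEGER         :: myInt                     ! Integer value
     REAL*8          :: myDouble                  ! REAL*8 value
     REAL*8, POINTER :: myArray(:,:) => NULL()    ! Dynamic array
  END TYPE myType

CONTAINS

  ! Reads the fields of the object and returns their sum (hypothetical task)
  SUBROUTINE mySub( myObject, total )
    TYPE(myType), INTENT(IN)  :: myObject
    REAL*8,       INTENT(OUT) :: total

    total = DBLE( myObject%myInt ) + myObject%myDouble   &
          + SUM( myObject%myArray )
  END SUBROUTINE mySub

END MODULE my_type_mod

PROGRAM derived_type_demo
  USE my_type_mod
  IMPLICIT NONE

  TYPE(myType) :: obj
  REAL*8       :: total

  ! Fill the "bucket": 2 + 0.5 + (6 array elements of 1.0) = 8.5
  obj%myInt    = 2
  obj%myDouble = 0.5d0
  ALLOCATE( obj%myArray(2,3) )
  obj%myArray  = 1d0

  CALL mySub( obj, total )
  IF ( ABS( total - 8.5d0 ) > 1d-12 ) ERROR STOP 'unexpected total'
  PRINT *, 'total = ', total

  DEALLOCATE( obj%myArray )
END PROGRAM derived_type_demo
```

Putting the type and the routine in one module gives the caller an explicit interface, so the compiler checks that the object passed in really is a `myType`.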
MPI Parallelization in the GEOS-5 GCM
Each CPU can only “see” a small part of the “world”.
• Arrays are decomposed horizontally such that each CPU only gets a small piece of the array to work on.
• But arrays represent geophysical data, so each CPU is working on a different part of the “world”.
• Each CPU runs a completely independent GEOS-5 simulation from start to finish using its assigned geographic domain.
[Figure labels: Start of run; CPU0, CPU1, CPU2, CPU3; CPU0; End of run]
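To make the idea of horizontal decomposition concrete, here is a serial sketch that works out which longitude indices each CPU would own. It makes several assumptions: the split is one-dimensional (longitude only), the 288-point grid and the four CPUs are illustrative, and the routine name `get_bounds` is hypothetical. It does not call MPI, and it does not claim to reproduce how GEOS-5 actually lays out its domains.

```fortran
PROGRAM decomp_demo
  IMPLICIT NONE

  INTEGER, PARAMETER :: IM_WORLD = 288   ! longitudes in the "world" grid
  INTEGER, PARAMETER :: N_CPUS   = 4     ! hypothetical number of CPUs
  INTEGER            :: rank, i1, i2, covered

  ! Loop over the ranks serially; in a real MPI run each CPU would
  ! compute only its own bounds.
  covered = 0
  DO rank = 0, N_CPUS-1
     CALL get_bounds( IM_WORLD, N_CPUS, rank, i1, i2 )
     PRINT '(a,i2,a,i4,a,i4)', 'CPU', rank, ' owns longitudes ', i1, ' to ', i2
     covered = covered + ( i2 - i1 + 1 )
  ENDDO

  ! Every longitude must belong to exactly one CPU
  IF ( covered /= IM_WORLD ) ERROR STOP 'decomposition does not cover the grid'

CONTAINS

  ! Block decomposition: the first MOD(n,nproc) ranks get one extra point
  SUBROUTINE get_bounds( n, nproc, rank, i1, i2 )
    INTEGER, INTENT(IN)  :: n, nproc, rank
    INTEGER, INTENT(OUT) :: i1, i2
    INTEGER              :: base, extra

    base  = n / nproc
    extra = MOD( n, nproc )
    i1    = rank*base + MIN( rank, extra ) + 1
    i2    = i1 + base - 1
    IF ( rank < extra ) i2 = i2 + 1
  END SUBROUTINE get_bounds

END PROGRAM decomp_demo
```

Each CPU then allocates arrays sized to its own `i1:i2` range rather than to the whole world, which is where the memory savings come from.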
Earth System Modeling Framework
Defines a standard set of objects and functions for passing data between components of an Earth System Model (GCM, CTM, etc.)
• Grids: Geospatial domain of the data
• Fields: A data array + a Grid
• States: Collections of Fields
• Gridded Components: Connect model code to other parts of the GCM
MAPL
• Makes ESMF “less wordy”
  • You can specify inputs, outputs, and internal data in a text file (called “The Registry”).
  • At compile time, MAPL generates fragments of code with the proper ESMF function calls for you!
• Introduces new concepts that ESMF lacks
  • Internal State
  • History Component (i.e. diagnostics)
The GEOS-5 GCM … again
• We again see the directory structure of the GEOS-5 GCM.
• Each individual piece of GEOS-5 has its own subdirectory.
• Furthermore, each subdirectory contains a Fortran module that defines an ESMF Gridded Component.

    src
      GEOSgcs
        GEOSgcm
          GEOSagcm
            GEOSphysics
              GEOSchem
                GEOSCHEMchem
                GOCART
                CARMAchem
                MAMchem
                GAAS
                GMIchem
                STRATchem
              GEOSgwd
              GEOSmoist
              GEOSrad
              GEOSsurf
              GEOSturb
            GEOSsuperdyn
          GEOSogcm
ESMF/MAPL Gridded Component
Import State → GEOSCHEMchem_GridComp → Export State

    MODULE GEOSCHEMchem_GridComp
       PRIVATE :: Internal State
    CONTAINS
       SUBROUTINE Initialize
       SUBROUTINE Run
       SUBROUTINE Finalize
Import State
The Import State specifies all of the inputs to the Gridded Component. For the G-C Chemistry Component, these are:
• Met fields
• Parameters for dry deposition
  • Land surface info
  • Leaf area indices
• Emissions
  • From the G-C Emissions Component, when that is ready
Export State
The Export State specifies all of the outputs from the Gridded Component. For the G-C Chemistry Component, these are:
• Diagnostic quantities
  • Fluxes
  • Prod/loss rates
  • J-value rates
  • Etc.
Internal State
The Internal State specifies quantities that “belong to” the Gridded Component.
• Quantities in the Internal State will be automatically written to a restart file.
For the G-C Chemistry Component, these are:
• Advected tracer concentrations
• Chemical species concentrations
Friendlies
Any field in the Internal State can be declared “friendly” to the following operations:
• Dynamics (aka advection)
• Turbulence (aka PBL mixing)
• Cloud convection
For the G-C Chemistry Component:
• Advected tracers: friendly to Dynamics, Turbulence, and Convection (D, T, C)
• Chemical species: friendly to themselves
The Initialize Subroutine
• The Initialize routine is called only at startup.
• It gets info from the ESMF/MAPL environment:
  • Grid size, # of CPUs, start & end dates, timesteps, etc.
• It passes that info to GEOS-Chem INIT_* routines, which in turn:
  • Allocate all GEOS-Chem module arrays
  • Read the input.geos and chemistry input files
  • Set up tracer & species indices, etc.
The Run Subroutine
• The Run routine is called every timestep.
• It gets information from the GCM:
  • Met fields and related information from the Import State
  • Tracer & species concentrations from the Internal State
  • Current date & time
  • Pressures at level edges (defines vertical grid)
• It passes that info to GEOS-Chem run routines:
  • DO_DRYDEP: Calls the GEOS-Chem dry deposition
  • DO_CHEMISTRY: Calls GEOS-Chem chemistry
• It passes the modified concentrations back to the Internal State.
The Finalize Subroutine
• The Finalize routine is called at end-of-run.
• It calls the GEOS-Chem routine CLEANUP to free memory from all GEOS-Chem module arrays.
• It shuts down the ESMF/MAPL interface.
• It stops all of the CPUs and closes any open files:
  • Restart file
  • Diagnostic files
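The Initialize / Run / Finalize life cycle described on the last three slides can be imitated in plain Fortran. The sketch below is a toy: it uses no ESMF or MAPL calls, and the module name `toy_gridcomp`, the `tracers` array, and the `loss_frac` argument are all hypothetical stand-ins (for the Internal State and for an Import State field, respectively).

```fortran
MODULE toy_gridcomp
  IMPLICIT NONE
  PRIVATE
  PUBLIC :: Initialize, Run, Finalize

  ! Stand-in for the Internal State: data owned by the component
  REAL*8, ALLOCATABLE :: tracers(:)

CONTAINS

  ! Called once at startup: allocate and fill the component's arrays
  SUBROUTINE Initialize( n, RC )
    INTEGER, INTENT(IN)  :: n
    INTEGER, INTENT(OUT) :: RC

    ALLOCATE( tracers(n), STAT=RC )
    IF ( RC == 0 ) tracers = 1d0
  END SUBROUTINE Initialize

  ! Called every timestep: take an input, update the internal data
  SUBROUTINE Run( loss_frac, RC )
    REAL*8,  INTENT(IN)  :: loss_frac
    INTEGER, INTENT(OUT) :: RC

    RC = 0
    IF ( .NOT. ALLOCATED( tracers ) ) THEN
       RC = 1                      ! Run was called before Initialize
       RETURN
    ENDIF
    tracers = tracers * ( 1d0 - loss_frac )
  END SUBROUTINE Run

  ! Called once at end-of-run: free the component's arrays
  SUBROUTINE Finalize( RC )
    INTEGER, INTENT(OUT) :: RC

    RC = 0
    IF ( ALLOCATED( tracers ) ) DEALLOCATE( tracers, STAT=RC )
  END SUBROUTINE Finalize

END MODULE toy_gridcomp

PROGRAM driver
  USE toy_gridcomp
  IMPLICIT NONE
  INTEGER :: step, RC

  CALL Initialize( 10, RC )
  IF ( RC /= 0 ) ERROR STOP 'Initialize failed'

  DO step = 1, 3
     CALL Run( 0.1d0, RC )
     IF ( RC /= 0 ) ERROR STOP 'Run failed'
  ENDDO

  CALL Finalize( RC )
  IF ( RC /= 0 ) ERROR STOP 'Finalize failed'
  PRINT *, 'driver: completed 3 timesteps'
END PROGRAM driver
```

In the real GCM the role of `driver` is played by the parent component, which calls the three entry points through ESMF rather than directly.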
MAPL History Component
• MAPL provides a diagnostic output writing capability for the GEOS-5 GCM.
• Any field in the Export State or Internal State may be sent to netCDF files.
• Diagnostic output options:
  • Instantaneous (a new file each hour)
  • Daily-averaged (a new file each day)
Modifications to GEOS-Chem
What stays the same:
• CMN_SIZE_mod.F
  • Gets grid dimensions (IIPAR, JJPAR, LLPAR, etc.)
  • But these are now set by the GCM (they're not fixed)
• time_mod.F
  • Passes the GCM time & date to GEOS-Chem routines
• pressure_mod.F
  • Passes pressures at edges and centers of each grid box (from the GCM) to GEOS-Chem routines
Modifications to GEOS-Chem
But G-C subroutines now look like this:

    SUBROUTINE mySub &
       ( am_I_Root, Input_Opt, State_Met, State_Chm, RC )

Where:
• am_I_Root: Logical flag to decide if we are on the root CPU
• Input_Opt: “Input Options” derived type object
• State_Met: “Meteorology State” derived type object
• State_Chm: “Chemistry State” derived type object
• RC: “Return code” status (=0 is success, otherwise failure)
Modifications to GEOS-Chem
am_I_Root is used to restrict printing and/or ASCII file I/O to the root CPU:

    IF ( am_I_Root ) THEN
       WRITE( 6, 100 ) FILENAME
 100   FORMAT( 'Reading file ', a )
    ENDIF
Modifications to GEOS-Chem
Objects replace existing module arrays:
• Input_Opt: Replaces most things in logical_mod.F, tracer_mod.F, directory_mod.F, unix_cmds_mod.F
• State_Met: Replaces all module arrays from dao_mod.F
• State_Chm:
  • State_Chm%TRACERS replaces STT
  • State_Chm%SPECIES replaces CSPEC_FULL
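The new calling convention can be sketched with heavily simplified stand-ins for the three objects. Everything beyond the argument list itself is an assumption made for illustration: the type names `OptInput`, `MetState`, and `ChmState`, the fields `LCHEM` and `T`, the 250 K threshold, and the "halve the tracer" operation are hypothetical and are not the real GEOS-Chem definitions. Only `State_Chm%Tracers` is named on the slide.

```fortran
MODULE toy_state_mod
  IMPLICIT NONE

  TYPE OptInput                          ! stand-in for Input Options
     LOGICAL :: LCHEM = .TRUE.           ! hypothetical on/off switch
  END TYPE OptInput

  TYPE MetState                          ! stand-in for Meteorology State
     REAL*8, POINTER :: T(:,:,:) => NULL()          ! temperature [K]
  END TYPE MetState

  TYPE ChmState                          ! stand-in for Chemistry State
     REAL*8, POINTER :: Tracers(:,:,:,:) => NULL()  ! replaces STT
  END TYPE ChmState

CONTAINS

  SUBROUTINE mySub( am_I_Root, Input_Opt, State_Met, State_Chm, RC )
    LOGICAL,        INTENT(IN)    :: am_I_Root   ! on the root CPU?
    TYPE(OptInput), INTENT(IN)    :: Input_Opt   ! read-only options
    TYPE(MetState), INTENT(IN)    :: State_Met   ! read-only met fields
    TYPE(ChmState), INTENT(INOUT) :: State_Chm   ! read-write chemistry
    INTEGER,        INTENT(OUT)   :: RC          ! 0 = success

    RC = 0
    IF ( .NOT. Input_Opt%LCHEM ) RETURN

    ! Only the root CPU prints
    IF ( am_I_Root ) WRITE( 6, '(a)' ) 'mySub: halving tracer 1 in cold boxes'

    ! Toy "chemistry": halve tracer 1 wherever it is colder than 250 K
    WHERE ( State_Met%T < 250d0 )
       State_Chm%Tracers(:,:,:,1) = 0.5d0 * State_Chm%Tracers(:,:,:,1)
    END WHERE
  END SUBROUTINE mySub

END MODULE toy_state_mod

PROGRAM state_demo
  USE toy_state_mod
  IMPLICIT NONE

  TYPE(OptInput) :: Input_Opt
  TYPE(MetState) :: State_Met
  TYPE(ChmState) :: State_Chm
  INTEGER        :: RC

  ! Array sizes are chosen at run time, not hardwired at compile time
  ALLOCATE( State_Met%T(2,2,2), State_Chm%Tracers(2,2,2,1) )
  State_Met%T       = 200d0
  State_Chm%Tracers = 1d0

  CALL mySub( .TRUE., Input_Opt, State_Met, State_Chm, RC )
  IF ( RC /= 0 ) ERROR STOP 'mySub failed'
  IF ( ANY( State_Chm%Tracers /= 0.5d0 ) ) ERROR STOP 'unexpected tracer value'
  PRINT *, 'state_demo: all checks passed'

  DEALLOCATE( State_Met%T, State_Chm%Tracers )
END PROGRAM state_demo
```

Because all data arrives through the argument list, the routine no longer depends on module arrays with fixed dimensions, which is the point of the grid-independent design.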
Modifications to GEOS-Chem
RC returns “success” or “failure”:
• If success, we let the run proceed
• If failure, we do two things:
  • Print an error traceback
  • Stop the run gracefully
NOTE: Error trapping is not yet 100% implemented.
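One way such return-code trapping could work is sketched below: a low-level routine sets RC on failure, and each caller adds a line to the traceback and returns instead of stopping on the spot. The routine names, the constants `RC_SUCCESS` and `RC_FAILURE`, and the negative-concentration check are hypothetical; the slides do not show how GEOS-Chem implements this.

```fortran
MODULE toy_error_mod
  IMPLICIT NONE

  ! Hypothetical return codes (the slide only says 0 means success)
  INTEGER, PARAMETER :: RC_SUCCESS = 0
  INTEGER, PARAMETER :: RC_FAILURE = 1

CONTAINS

  ! Low-level routine: flags a failure instead of stopping the run
  SUBROUTINE check_conc( conc, RC )
    REAL*8,  INTENT(IN)  :: conc(:)
    INTEGER, INTENT(OUT) :: RC

    RC = RC_SUCCESS
    IF ( ANY( conc < 0d0 ) ) THEN
       WRITE( 6, '(a)' ) 'ERROR: negative concentration found'
       WRITE( 6, '(a)' ) ' -> at check_conc'
       RC = RC_FAILURE
    ENDIF
  END SUBROUTINE check_conc

  ! Mid-level routine: adds its own name to the traceback and returns
  SUBROUTINE do_step( conc, RC )
    REAL*8,  INTENT(IN)  :: conc(:)
    INTEGER, INTENT(OUT) :: RC

    CALL check_conc( conc, RC )
    IF ( RC /= RC_SUCCESS ) THEN
       WRITE( 6, '(a)' ) ' -> at do_step'
       RETURN
    ENDIF
    ! ... the rest of the timestep would run here ...
  END SUBROUTINE do_step

END MODULE toy_error_mod

PROGRAM rc_demo
  USE toy_error_mod
  IMPLICIT NONE
  INTEGER :: RC

  ! Good input: the run proceeds
  CALL do_step( (/ 1d0, 2d0 /), RC )
  IF ( RC /= RC_SUCCESS ) ERROR STOP 'good input was rejected'

  ! Bad input: the failure propagates up with a traceback
  CALL do_step( (/ 1d0, -2d0 /), RC )
  IF ( RC == RC_SUCCESS ) ERROR STOP 'bad input was not trapped'

  WRITE( 6, '(a)' ) 'Failure trapped: a real run would now clean up and stop'
END PROGRAM rc_demo
```

Returning up the call chain, rather than calling STOP deep inside the model, gives the top level a chance to close files and shut down all CPUs in an orderly way.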
Results
Movies from a 1-day simulation:
• 1.25° lon x 1° lat grid (288 x 181 boxes)
• Chemistry, Dynamics, Turbulence, Convection
• KPP solver
• Emissions: turned off
• Drydep: turned off
• Strat chem: turned off
Reminder: tracers are advected, species are not.
Simulation by Mike Long; movies by Bob Yantosca.
Trop column OH (species) The footprint of OH follows the daylit side of the Earth, which is what we would expect.
Trop column Br2 (tracer) Br2 is a nighttime tracer. But there is some “striping” evident. Possibly an artifact in the solver or in the photolysis. We don't see this striping for other chemical species or tracers.
Trop column NOx (tracer) NOx seems to get titrated very quickly. Not sure if this is because we are running without emissions.
Trop column HNO3 (species) The HNO3 from the initial conditions contained a signature of ship emissions.
Trop column NO2 (species) NO2 is sensitive to photolysis, as is evidenced in this animation.
Trop column NO (species) NO follows the daylit side of the world.
Trop Column Ox (adv. tracer) You can see how the Ox is transported by the GEOS-5 advection scheme. Wait until we get to ¼ degree resolution!
Surface SALA (adv. tracer) This is the sea salt aerosol (accumulation mode), which has chemistry applied to it outside of the KPP solver. It illustrates how GEOS-5 will advect tracers with an amazing level of detail.
Future Directions
• Activate strat chem when running in the GCM
• Emissions Component
  • Bring into G-C std code and into GEOS-5
• KPP / Flexchem
  • Totally replaces SMVGEAR!
  • Work has already begun!
• Flexible precision
  • Pick REAL*4 or REAL*8 at compile time
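Flexible precision is commonly done in Fortran with a single KIND parameter that every declaration refers to. The sketch below shows that general technique; it is not the planned GEOS-Chem design, which the slide does not describe. The module name and the parameter names `f4`, `f8`, and `fp` are assumptions.

```fortran
MODULE toy_precision_mod
  IMPLICIT NONE

  ! Kinds that correspond to 4-byte and 8-byte reals on common compilers
  INTEGER, PARAMETER :: f4 = SELECTED_REAL_KIND(  6,  37 )
  INTEGER, PARAMETER :: f8 = SELECTED_REAL_KIND( 15, 307 )

  ! Change this one line (or select it with a preprocessor flag) to
  ! switch every REAL(fp) variable between the two precisions.
  INTEGER, PARAMETER :: fp = f8

END MODULE toy_precision_mod

PROGRAM precision_demo
  USE toy_precision_mod
  IMPLICIT NONE

  REAL(fp) :: x

  ! Literals carry the same kind, so no precision is lost in constants
  x = 1.0_fp / 3.0_fp

  IF ( PRECISION( x ) < 15 ) ERROR STOP 'fp is not double precision'
  PRINT *, 'decimal digits of precision: ', PRECISION( x )
  PRINT *, '1/3 = ', x
END PROGRAM precision_demo
```

With `fp = f4` the same source compiles to single precision; the precision check in this demo would then stop with an error, which is expected for that setting.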
Thank you for your attention!
• “It's a feature!”
• “Inside every large program is a small program struggling to get out”
• “The solution to a problem changes the problem”
• “When all else fails, read the manual”
• “Variables won't, constants aren't”
[Cartoon labels: legacy code...; Totalview; GEOS-5!]
netCDF
• Is a self-describing file format
  • De facto standard for atmospheric applications
  • Data + metadata are stored together
  • Data can be compressed via zlib
  • Data can be read/written in parallel
• netCDF library
  • Contains functions that let you write data from Fortran to a netCDF file (or read data from a file)
Input Options object
• Definition: Holds user-defined options for GEOS-Chem that are read from “input.geos” and similar files. Intended to replace existing variables in logical_mod.F, tracer_mod.F, directory_mod.F, unix_cmds_mod.F.
• Derived Type: Headers/gigc_input_opt_mod.F90
• Use: Populated at startup, then passed as an input (read-only) to lower-level routines:
Meteorology State object
• Definition: Holds all GEOS-Chem met fields and derived quantities. Replaces the module arrays in GeosCore/dao_mod.F.
• Derived Type: Headers/gigc_state_met_mod.F90
• Use: Passed as an input (read-only) to most GEOS-Chem routines
  • Except where met fields are read from disk or modified, etc.
Chemistry State object
• Definition: Holds fields required for the G-C chemistry routines (advected tracers, chemical species, emissions tendencies, strat chem fields). Replaces the STT and CSPEC arrays.
• Derived Type: Headers/gigc_state_chm_mod.F90
• Use: Passed as an input/output (read-write) to most GEOS-Chem routines