Skip to content

Example NEMODataTrees

In this section, we demonstrate how to construct NEMODataTrees using example global, regional and coupled NEMO ocean sea-ice outputs.

Users can also explore the following examples using the example_nemodatatrees.ipynb Jupyter Notebook available in the recipes directory.

To get started, let's begin by importing the Python packages we'll be using:

import xarray as xr
import nemo_cookbook as nc
from nemo_cookbook import NEMODataTree

Global Ocean Sea-Ice Models:


1. AGRIF_DEMO

Let's start by creating a NEMODataTree using example outputs from the global AGRIF_DEMO NEMO reference configuration.

AGRIF_DEMO is based on the ORCA2_ICE_PISCES reference configuration with the inclusion of 3 online nested domains.

Here, we will only consider the 2° global parent domain.

Further information on this reference configuration can be found here.


NEMO Cookbook includes a selection of example NEMO model output datasets accessible via cloud object storage.

nemo_cookbook.examples.get_filepaths() is a convenience function used to download and generate local filepaths for an available NEMO reference configuration.

filepaths = nc.examples.get_filepaths("AGRIF_DEMO")

filepaths

The get_filepaths() function will download each of the files to your local machine, returning a dictionary of filepaths for the chosen configuration (AGRIF_DEMO):

{'domain_cfg.nc': '/Users/me/Library/Caches/nemo_cookbook/AGRIF_DEMO/domain_cfg.nc',
 '2_domain_cfg.nc': '/Users/me/Library/Caches/nemo_cookbook/AGRIF_DEMO/2_domain_cfg.nc',
 '3_domain_cfg.nc': '/Users/me/Library/Caches/nemo_cookbook/AGRIF_DEMO/3_domain_cfg.nc',
 'ORCA2_5d_00010101_00010110_grid_T.nc': '/Users/me/Library/Caches/nemo_cookbook/AGRIF_DEMO/ORCA2_5d_00010101_00010110_grid_T.nc',
 ...
 '3_Nordic_5d_00010101_00010110_icemod.nc': '/Users/me/Library/Caches/nemo_cookbook/AGRIF_DEMO/3_Nordic_5d_00010101_00010110_icemod.nc'
 }

Next, we need to define the paths dictionary, which contains the filepaths corresponding to our global parent domain.

We populate the parent dictionary with the filepaths to the domain_cfg and gridT/U/V/W netCDF files produced for the AGRIF_DEMO parent domain.

paths = {"parent": {
         "domain": filepaths["domain_cfg.nc"],
         "gridT": filepaths["ORCA2_5d_00010101_00010110_grid_T.nc"],
         "gridU": filepaths["ORCA2_5d_00010101_00010110_grid_U.nc"],
         "gridV": filepaths["ORCA2_5d_00010101_00010110_grid_V.nc"],
         "gridW": filepaths["ORCA2_5d_00010101_00010110_grid_W.nc"],
         "icemod": filepaths["ORCA2_5d_00010101_00010110_icemod.nc"]
        },
        }

We can construct a new NEMODataTree called nemo using the .from_paths() constructor.

Notice, that we also need to specify that our global parent domain is zonally periodic (iperio=True) and north folding on T-points (nftype = "T") rather than a closed (regional) domain.

nemo = NEMODataTree.from_paths(paths, iperio=True, nftype="T")

nemo

This returns the following xarray.DataTree, where we have truncated the outputs for improved readability:

<xarray.DataTree>
Group: /
│   Dimensions:               (time_counter: 2, axis_nbounds: 2, ncatice: 5)
│   ...
│ 
├── Group: /gridT
│       Dimensions:               (time_counter: 2, axis_nbounds: 2, j: 148, i: 180,
│                                  ncatice: 5, k: 31)
│       Coordinates:
│           time_centered         (time_counter) object 16B 0001-01-03 12:00:00 0001-...
│         * deptht                (k) float32 124B 5.0 15.0 25.0 ... 4.75e+03 5.25e+03
│           time_instant          (time_counter) object 16B ...
│           gphit                 (j, i) float64 213kB ...
│           glamt                 (j, i) float64 213kB ...
│         * k                     (k) int64 248B 1 2 3 4 5 6 7 ... 25 26 27 28 29 30 31
│         * j                     (j) int64 1kB 1 2 3 4 5 6 ... 143 144 145 146 147 148
│         * i                     (i) int64 1kB 1 2 3 4 5 6 ... 175 176 177 178 179 180
│       Dimensions without coordinates: axis_nbounds
│       Data variables: (12/87)
│           time_centered_bounds  (time_counter, axis_nbounds) object 32B 0001-01-01 ...
│           time_counter_bounds   (time_counter, axis_nbounds) object 32B 0001-01-01 ...
│           simsk                 (time_counter, j, i) float32 213kB ...
│           simsk05               (time_counter, j, i) float32 213kB ...
│           simsk15               (time_counter, j, i) float32 213kB ...
│           snvolu                (time_counter, j, i) float32 213kB ...
│           ...                    ...
│           e1t                   (j, i) float64 213kB ...
│           e2t                   (j, i) float64 213kB ...
│           top_level             (j, i) int32 107kB ...
│           bottom_level          (j, i) int32 107kB ...
│           tmask                 (k, j, i) bool 826kB False False False ... False False
│           tmaskutil             (j, i) bool 27kB False False False ... False False
│       Attributes:
│           name:         ORCA2_5d_00010101_00010110_icemod
│           description:  ice variables
│           title:        ice variables
│           Conventions:  CF-1.6
│           timeStamp:    2025-Sep-13 17:44:13 GMT
│           uuid:         c6c24bd5-1d2b-4d7b-98b5-8d379c94e84b
│           nftype:       T
│           iperio:       True
├── Group: /gridU
│       Dimensions:               (k: 31, axis_nbounds: 2, time_counter: 2, j: 148,
│                                  i: 180)
│       ...
├── Group: /gridV
│       Dimensions:               (k: 31, axis_nbounds: 2, time_counter: 2, j: 148,
│                                  i: 180)
│       ...
├── Group: /gridW
│       Dimensions:               (k: 31, axis_nbounds: 2, time_counter: 2, j: 148,
│                                  i: 180)
│       ...
└── Group: /gridF
        Dimensions:       (j: 148, i: 180, k: 31)
        ...

2. NOC Near-Present Day eORCA1

Next, we'll consider monthly-mean outputs from the National Oceanography Centre Near-Present-Day global eORCA1 configuration of NEMO forced using JRA55-do from 1976-2024.

For more details on this model configuration and the available outputs, users can explore the Near-Present-Day documentation here.


The eORCA1 JRA55v1 NPD data are publicly accessible as remote Zarr v2 stores via JASMIN Object Store, so we will use the NEMODataTree .from_datasets() constructor.

base_url = "https://noc-msm-o.s3-ext.jc.rl.ac.uk/npd-eorca1-jra55v1"

# Opening domain_cfg:
ds_domain = (xr.open_zarr(f"{base_url}/domain/domain_cfg", consolidated=True, chunks={})
             .squeeze(drop=True)
             .rename({"z": "nav_lev"})
             )

# Opening gridT dataset, including sea surface temperature (°C) and salinity (g kg-1):
ds_gridT = xr.merge([xr.open_zarr(f"{base_url}/T1m/{var}", consolidated=True, chunks={})[var] for var in ['tos_con', 'sos_abs']], compat="override")

Next, let's create a NEMODataTree from a dictionary of eORCA1 JRA55v1 xarray.Datasets, specifying that our global domain is zonally periodic (iperio=True) and north folding on T-points (nftype = "F").

datasets = {"parent": {"domain": ds_domain, "gridT": ds_gridT}}

nemo = NEMODataTree.from_datasets(datasets=datasets, iperio=True, nftype="F")

nemo
<xarray.DataTree>
Group: /
│   Dimensions:        (time_counter: 577)
│   ...
│ 
├── Group: /gridT
│   Dimensions:        (time_counter: 577)
│       Dimensions:        (time_counter: 577, j: 331, i: 360, k: 75)
│       Coordinates:
│           time_centered  (time_counter) datetime64[ns] 5kB dask.array<chunksize=(1,), meta=np.ndarray>
│           gphit          (j, i) float64 953kB dask.array<chunksize=(331, 360), meta=np.ndarray>
│           glamt          (j, i) float64 953kB dask.array<chunksize=(331, 360), meta=np.ndarray>
│         * k              (k) int64 600B 1 2 3 4 5 6 7 8 9 ... 68 69 70 71 72 73 74 75
│         * j              (j) int64 3kB 1 2 3 4 5 6 7 8 ... 325 326 327 328 329 330 331
│         * i              (i) int64 3kB 1 2 3 4 5 6 7 8 ... 354 355 356 357 358 359 360
│       Data variables:
│           tos_con        (time_counter, j, i) float32 275MB dask.array<chunksize=(1, 331, 360), meta=np.ndarray>
│           sos_abs        (time_counter, j, i) float32 275MB dask.array<chunksize=(1, 331, 360), meta=np.ndarray>
│           e1t            (j, i) float64 953kB dask.array<chunksize=(331, 360), meta=np.ndarray>
│           e2t            (j, i) float64 953kB dask.array<chunksize=(331, 360), meta=np.ndarray>
│           top_level      (j, i) int32 477kB dask.array<chunksize=(331, 360), meta=np.ndarray>
│           bottom_level   (j, i) int32 477kB dask.array<chunksize=(331, 360), meta=np.ndarray>
│           tmask          (k, j, i) bool 9MB False False False ... False False False
│           tmaskutil      (j, i) bool 119kB False False False ... False False False
│       Attributes:
│           nftype:   F
│           iperio:   True
├── Group: /gridU
│       Dimensions:       (j: 331, i: 360, k: 75)
│       ...
├── Group: /gridV
│       Dimensions:       (j: 331, i: 360, k: 75)
│       ...
├── Group: /gridW
│       Dimensions:       (j: 331, i: 360, k: 75)
│       ...
└── Group: /gridF
        Dimensions:       (j: 331, i: 360, k: 75)
        ...

Regional Ocean Models:


AMM12

We can also construct a NEMODataTree using outputs from regional NEMO ocean model simulations.

Here, we will consider example outputs from the regional AMM12 NEMO reference configuration.

The AMM, Atlantic Margins Model, is a regional model covering the Northwest European Shelf domain on a regular lat-lon grid at approximately 12km horizontal resolution. AMM12 uses the vertical s-coordinates system, GLS turbulence scheme, and tidal lateral boundary conditions using a flather scheme.

Further information on this reference configuration can be found here.


filepaths = nc.examples.get_filepaths("AMM12")

filepaths
{'domain_cfg.nc': '/Users/me/Library/Caches/nemo_cookbook/AMM12/domain_cfg.nc',
 'AMM12_1d_20120102_20120110_grid_T.nc': '/Users/me/Library/Caches/nemo_cookbook/AMM12/AMM12_1d_20120102_20120110_grid_T.nc',
 'AMM12_1d_20120102_20120110_grid_U.nc': '/Users/me/Library/Caches/nemo_cookbook/AMM12/AMM12_1d_20120102_20120110_grid_U.nc',
 'AMM12_1d_20120102_20120110_grid_V.nc': '/Users/me/Library/Caches/nemo_cookbook/AMM12/AMM12_1d_20120102_20120110_grid_V.nc'
 }

As we showed in the AGRIF_DEMO example, we need to populate the paths dictionary with the domain_cfg and gridT/U/V filepaths corresponding to our regional model domain.

paths = {"parent": {
         "domain": filepaths["domain_cfg.nc"],
         "gridT": filepaths["AMM12_1d_20120102_20120110_grid_T.nc"],
         "gridU": filepaths["AMM12_1d_20120102_20120110_grid_U.nc"],
         "gridV": filepaths["AMM12_1d_20120102_20120110_grid_V.nc"],
        },
        }

Next, we can construct a new NEMODataTree called nemo using the .from_paths() constructor.

Note, we do not actually need to specify that our regional domain is not zonally periodic in this case, given that, by default, iperio=False.

nemo = NEMODataTree.from_paths(paths, iperio=False)

nemo
<xarray.DataTree>
Group: /
│   Dimensions:               (time_counter: 8, axis_nbounds: 2)
│   ...
│ 
├── Group: /gridT
│   Dimensions:               (time_counter: 8, axis_nbounds: 2)
│   ...
├── Group: /gridT
│       Dimensions:               (time_counter: 8, axis_nbounds: 2, j: 224, i: 198,
│                                  k: 51)
│       Coordinates:
│           time_centered         (time_counter) datetime64[ns] 64B ...
│           gphit                 (j, i) float64 355kB ...
│           glamt                 (j, i) float64 355kB ...
│         * k                     (k) int64 408B 1 2 3 4 5 6 7 ... 45 46 47 48 49 50 51
│         * j                     (j) int64 2kB 1 2 3 4 5 6 ... 219 220 221 222 223 224
│         * i                     (i) int64 2kB 1 2 3 4 5 6 ... 193 194 195 196 197 198
│       Dimensions without coordinates: axis_nbounds
│       Data variables:
│           time_centered_bounds  (time_counter, axis_nbounds) datetime64[ns] 128B ...
│           time_counter_bounds   (time_counter, axis_nbounds) datetime64[ns] 128B ...
│           tos                   (time_counter, j, i) float32 1MB ...
│           sos                   (time_counter, j, i) float32 1MB ...
│           zos                   (time_counter, j, i) float32 1MB ...
│           e1t                   (j, i) float64 355kB ...
│           e2t                   (j, i) float64 355kB ...
│           top_level             (j, i) int32 177kB ...
│           bottom_level          (j, i) int32 177kB ...
│           tmask                 (k, j, i) bool 2MB False False False ... False False
│           tmaskutil             (j, i) bool 44kB False False False ... False False
│       Attributes:
│           nftype:   None
│           iperio:   False
├── Group: /gridU
│       Dimensions:               (time_counter: 8, axis_nbounds: 2, j: 224, i: 198,
│                                  k: 51)
│       ...
├── Group: /gridV
│       Dimensions:               (time_counter: 8, axis_nbounds: 2, j: 224, i: 198,
│                                  k: 51)
│       ...
├── Group: /gridW
│       Dimensions:       (j: 224, i: 198, k: 51)
│       ...
└── Group: /gridF
        Dimensions:       (j: 224, i: 198, k: 51)
        ...

Nested Global Ocean Sea-Ice Models:


AGRIF_DEMO

Returning to our AGRIF_DEMO NEMO reference configuration, we can also construct a more complex NEMODataTree to store the outputs of the global parent and its child domains in a single data structure.

We will make use of the two successively nested domains located in the Nordic Seas, with the finest grid (1/6°) spanning the Denmark strait. This grandchild domain also benefits from “vertical nesting”, meaning that it has 75 geopotential z-coordinate levels, compared with 31 levels in its parent domain.


filepaths = nc.examples.get_filepaths("AGRIF_DEMO")

Let's start by defining the paths dictionary for the ORCA2 global parent domain and its child and grandchild domains. Notice, that for child and grandchild domains, we must also specify a unique domain number, given that we could include further child or grandchild nests.

paths = {"parent": {
        "domain": filepaths["domain_cfg.nc"],
        "gridT": filepaths["ORCA2_5d_00010101_00010110_grid_T.nc"],
        "gridU": filepaths["ORCA2_5d_00010101_00010110_grid_U.nc"],
        "gridV": filepaths["ORCA2_5d_00010101_00010110_grid_V.nc"],
        "gridW": filepaths["ORCA2_5d_00010101_00010110_grid_W.nc"],
        "icemod": filepaths["ORCA2_5d_00010101_00010110_icemod.nc"]
        },
        "child": {
        "1":{
        "domain": filepaths["2_domain_cfg.nc"],
        "gridT": filepaths["2_Nordic_5d_00010101_00010110_grid_T.nc"],
        "gridU": filepaths["2_Nordic_5d_00010101_00010110_grid_U.nc"],
        "gridV": filepaths["2_Nordic_5d_00010101_00010110_grid_V.nc"],
        "gridW": filepaths["2_Nordic_5d_00010101_00010110_grid_W.nc"],
        "icemod": filepaths["2_Nordic_5d_00010101_00010110_icemod.nc"]
        }},
        "grandchild": {
        "2":{
        "domain": filepaths["3_domain_cfg.nc"],
        "gridT": filepaths["3_Nordic_5d_00010101_00010110_grid_T.nc"],
        "gridU": filepaths["3_Nordic_5d_00010101_00010110_grid_U.nc"],
        "gridV": filepaths["3_Nordic_5d_00010101_00010110_grid_V.nc"],
        "gridW": filepaths["3_Nordic_5d_00010101_00010110_grid_W.nc"],
        "icemod": filepaths["3_Nordic_5d_00010101_00010110_icemod.nc"]
        }},
        }

Next, we need to construct a nests dictionary which contains the properties which define each nested domain. These include:

  • Unique domain number (mapping properties to entries in our paths directory).
  • Parent domain (to which unique domain does this belong).
  • Zonal periodicity of child / grandchild domain (iperio).
  • Horizontal grid refinement factors (rx, ry).
  • Start (imin, jmin) and end (imax, jmax) grid indices in both directions (i, j) of the parent grid.

The latter information should be copied directly from the AGRIF_FixedGrids.in anicillary file used to define nested domains in NEMO.


Example AGRIF_FixedGrids.in

1 (Number of nested domains - parent).

121 146 113 133 4 4 4 (imin, imax, jmin, jmax, rx, ry, rt)

1 (Number of nested domains - child)

20 60 27 60 3 3 3 (imin, imax, jmin, jmax, rx, ry, rt)

0 (Number of nested domains - grandchild)


Important: we must specify the start and end grid indices using Fortran (1-based) indexes rather than Python (0-based) indexes.

nests = {
    "1": {
    "parent": "/",
    "rx": 4,
    "ry": 4,
    "imin": 121,
    "imax": 146,
    "jmin": 113,
    "jmax": 133,
    "iperio": False
    },
    "2": {
    "parent": "1",
    "rx": 3,
    "ry": 3,
    "imin": 20,
    "imax": 60,
    "jmin": 27,
    "jmax": 60,
    "iperio": False
    }
    }

Finally, we can construct a new NEMODataTree called nemo using the .from_paths() constructor.

Again, we also need to specify that our global parent domain is zonally periodic (iperio=True) and north folding on T-points (nftype = "T") rather than a closed (regional) domain.

We can also include additional keyword arguments to pass onto xarray.open_dataset or xr.open_mfdataset when opening NEMO model output files.

nemo = NEMODataTree.from_paths(paths=paths, nests=nests, iperio=True, nftype="T", engine="netcdf4")

nemo
<xarray.DataTree>
Group: /
│   Dimensions:               (time_counter: 2, axis_nbounds: 2, ncatice: 5)
│   ...
│ 
├── Group: /gridT
│   │   Dimensions:               (time_counter: 2, axis_nbounds: 2, j: 148, i: 180,
│   │                              ncatice: 5, k: 31)
│   │   ...
│   └── Group: /gridT/1_gridT
│       │   Dimensions:               (time_counter: 2, axis_nbounds: 2, j1: 80, i1: 100,
│       │                              ncatice: 5, k1: 29)
│       │   ...
│       └── Group: /gridT/1_gridT/2_gridT
│               Dimensions:               (time_counter: 2, axis_nbounds: 2, j2: 99, i2: 120,
│                                          ncatice: 5, k2: 60)
│               ...
├── Group: /gridU
│   │   Dimensions:               (k: 31, axis_nbounds: 2, time_counter: 2, j: 148,
│   │                              i: 180)
│   │   ...
│   └── Group: /gridU/1_gridU
│       │   Dimensions:               (k1: 29, axis_nbounds: 2, time_counter: 2, j1: 80,
│       │                              i1: 100)
│       │   ...
│       └── Group: /gridU/1_gridU/2_gridU
│               Dimensions:               (k2: 60, axis_nbounds: 2, time_counter: 2, j2: 99,
│                                          i2: 120)
│               ...
├── Group: /gridV
│   │   Dimensions:               (k: 31, axis_nbounds: 2, time_counter: 2, j: 148,
│   │                              i: 180)
│   │   ...
│   └── Group: /gridV/1_gridV
│       │   Dimensions:               (k1: 29, axis_nbounds: 2, time_counter: 2, j1: 80,
│       │                              i1: 100)
│       │   ...
│       └── Group: /gridV/1_gridV/2_gridV
│               Dimensions:               (k2: 60, axis_nbounds: 2, time_counter: 2, j2: 99,
│                                          i2: 120)
│               ...
├── Group: /gridW
│   │   Dimensions:               (k: 31, axis_nbounds: 2, time_counter: 2, j: 148,
│   │                              i: 180)
│   │   ...
│   └── Group: /gridW/1_gridW
│       │   Dimensions:               (k1: 29, axis_nbounds: 2, time_counter: 2, j1: 80,
│       │                              i1: 100)
│       │   ...
│       └── Group: /gridW/1_gridW/2_gridW
│               Dimensions:               (k2: 60, axis_nbounds: 2, time_counter: 2, j2: 99,
│                                          i2: 120)
│               ...
└── Group: /gridF
    │   Dimensions:       (j: 148, i: 180, k: 31)
    │   ...
    └── Group: /gridF/1_gridF
        │   Dimensions:       (j1: 80, i1: 100, k1: 29)
        │   ...
        └── Group: /gridF/1_gridF/2_gridF
                Dimensions:       (j2: 99, i2: 120, k2: 60)
                ...

Coupled Climate Models:

UKESM1-0-LL

In addition to ocean-only and ocean sea-ice hindcast simulations (prescribing surface atmospheric forcing), NEMO models are also used as the ocean components in many coupled climate models, including the UK Earth System Model (UKESM) developed jointly by the UK Met Office and Natural Environment Research Council (NERC).

Here, we show how to construct a NEMODataTree from the 1° global ocean sea-ice component of UKESM1-0-LL included in the sixth Coupled Model Intercomparsion Project (CMIP6) using outputs accessible via the CEDA Archive.

Since CMIP6 outputs are processed and formatted according to the CMIP Community Climate Model Output Rewriter (CMOR) software, we will need to include a few additional pre-processing steps to reformat our NEMO model outputs in order to construct a NEMODataTree

Important: only CMIP model outputs variables stored on their original NEMO ocean model grid (i.e, gn) can be used to construct a NEMODataTree

# Open domain_cfg:
ds_domain_cfg = xr.open_dataset("/path/to/MOHC/Ofx/domain_cfg_Ofx_UKESM1.nc")

# Define time decoder to handle CMIP6 time units:
time_decoder = xr.coders.CFDatetimeCoder(use_cftime=True)

# Open potential temperature (°C) dataset:
base_filepath = "/badc/cmip6/data/CMIP6/CMIP/MOHC/UKESM1-0-LL/historical/r4i1p1f2/Omon/thetao/gn/latest"
ds_ukesm1_gridT = xr.open_mfdataset(f"{base_filepath}/thetao_Omon_UKESM1-0-LL_historical_r4i1p1f2_gn_*.nc",
                                    data_vars='all',
                                    decode_times=time_decoder
                                   )

# Adding mixed layer depth (m) to dataset:
ds_ukesm1_gridT['mlotst'] = xr.open_mfdataset(f"{base_filepath}/mlotst_Omon_UKESM1-0-LL_historical_r4i1p1f2_gn_*.nc",
                                              data_vars='all',
                                              decode_times=time_decoder
                                              )['mlotst']

Now we have defined our domain and gridT datasets, let's define a datasets dictionary ensuring that we rename CMORISED dimensions to be consistent with standard NEMO model outputs.

We can then define a NEMODataTree using the .from_datasets() constructor, specifying that our global parent domain is zonally periodic and north-folding on F-points.

datasets = {"parent": {
                "domain": ds_domain_cfg.rename({'z':'nav_lev'}),
                "gridT": ds_ukesm1_gridT.rename({'time':'time_counter', 'i':'x', 'j':'y', 'lev':'deptht'}),
                }}

nemo = NEMODataTree.from_datasets(datasets=datasets, iperio=True, nftype="F")

nemo