Transforming Vertical Coordinates#

A common need in the analysis of ocean and atmospheric data is to transform the vertical coordinate from its original coordinate (e.g. depth) to a new coordinate (e.g. density). Xgcm supports this sort of one-dimensional coordinate transform on Axis and Grid objects using the transform method. Two algorithms are implemented:

- Linear interpolation: Linear interpolation is designed to interpolate intensive quantities (e.g. temperature) from one coordinate to another. This method is suitable when the target coordinate is monotonically increasing or decreasing and the data variable is intensive. For example, you want to visualize oxygen on density surfaces from a z-coordinate ocean model.
  - Logarithmic interpolation: Logarithmic interpolation (which is linear interpolation done after applying a logarithm to the target coordinate) is also available. This method is suitable when variation of the intensive quantity is best related to the logarithm of the target coordinate, rather than the target coordinate itself. For example, you want to analyze data from a sigma-coordinate atmospheric model on isobaric (constant pressure) surfaces.
- Conservative remapping: This algorithm is designed to conserve extensive quantities (e.g. transport, heat content). It requires knowledge of cell bounds in both the source and target coordinate. It also handles non-monotonic target coordinates.
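The difference between linear and logarithmic interpolation can be sketched for a single column with plain NumPy. This is only an illustration under assumed values (the pressure levels and the temperature profile are made up), not xgcm's implementation: it interpolates the same profile once in pressure and once in the logarithm of pressure.

```python
import numpy as np

# Hypothetical atmospheric column: pressure (hPa) decreasing upward, and a
# made-up temperature profile that varies linearly with log(pressure).
pressure = np.array([1000.0, 850.0, 700.0, 500.0, 300.0, 100.0])
temperature = 200.0 + 40.0 * np.log10(pressure / 100.0)

target_pressure = np.array([925.0, 600.0, 200.0])

# np.interp expects increasing sample points, so reverse the column first.
p_inc = pressure[::-1]
t_inc = temperature[::-1]

# Linear interpolation in pressure itself.
t_linear = np.interp(target_pressure, p_inc, t_inc)

# "Logarithmic" interpolation: linear interpolation in log(pressure).
t_log = np.interp(np.log(target_pressure), np.log(p_inc), t_inc)

exact = 200.0 + 40.0 * np.log10(target_pressure / 100.0)
print("linear error:", np.abs(t_linear - exact))
print("log error:   ", np.abs(t_log - exact))
```

Because the assumed profile is linear in log(pressure), the logarithmic variant reproduces it exactly, while interpolating in pressure leaves a visible error between the sample points.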

On this page, we explain how to use these coordinate transformation capabilities.

[1]:

from xgcm import Grid
import xarray as xr
import numpy as np
import matplotlib.pyplot as plt

Realistic Data Example#

To illustrate these features in a more realistic example, we use data from the CNRM CMIP6 model. These data are available from the Pangeo Cloud Data Library. We can see that this is a full, global, 4D ocean dataset.

[18]:

import intake
col = intake.open_esm_datastore("https://storage.googleapis.com/cmip6/pangeo-cmip6.json")
cat = col.search(
    source_id = 'CNRM-ESM2-1',
    member_id = 'r1i1p1f2',
    experiment_id = 'historical',
    variable_id= ['thetao','so','vo','areacello'],
    grid_label = 'gn'
)
ddict = cat.to_dataset_dict(zarr_kwargs={'consolidated':True, 'use_cftime':True}, aggregate=False)


--> The keys in the returned dictionary of datasets are constructed as follows:
        'activity_id.institution_id.source_id.experiment_id.member_id.table_id.variable_id.grid_label.zstore.dcpp_init_year.version'


[19]:

thetao = ddict['CMIP.CNRM-CERFACS.CNRM-ESM2-1.historical.r1i1p1f2.Omon.thetao.gn.gs://cmip6/CMIP6/CMIP/CNRM-CERFACS/CNRM-ESM2-1/historical/r1i1p1f2/Omon/thetao/gn/v20181206/.nan.20181206']
so = ddict['CMIP.CNRM-CERFACS.CNRM-ESM2-1.historical.r1i1p1f2.Omon.so.gn.gs://cmip6/CMIP6/CMIP/CNRM-CERFACS/CNRM-ESM2-1/historical/r1i1p1f2/Omon/so/gn/v20181206/.nan.20181206']
vo = ddict['CMIP.CNRM-CERFACS.CNRM-ESM2-1.historical.r1i1p1f2.Omon.vo.gn.gs://cmip6/CMIP6/CMIP/CNRM-CERFACS/CNRM-ESM2-1/historical/r1i1p1f2/Omon/vo/gn/v20181206/.nan.20181206']
areacello = ddict['CMIP.CNRM-CERFACS.CNRM-ESM2-1.historical.r1i1p1f2.Ofx.areacello.gn.gs://cmip6/CMIP6/CMIP/CNRM-CERFACS/CNRM-ESM2-1/historical/r1i1p1f2/Ofx/areacello/gn/v20181206/.nan.20181206'].areacello

vo = vo.rename({'y':'y_c', 'lon':'lon_v', 'lat':'lat_v', 'bounds_lon':'bounds_lon_v', 'bounds_lat':'bounds_lat_v'})

ds = xr.merge([thetao,so,vo], compat='override')
ds = ds.assign_coords(areacello=areacello.fillna(0))
ds

[19]:

<xarray.Dataset>
Dimensions:       (y: 294, x: 362, nvertex: 4, lev: 75, axis_nbounds: 2,
                   time: 1980, y_c: 294)
Coordinates: (12/13)
    bounds_lat    (y, x, nvertex) float64 dask.array<chunksize=(294, 362, 4), meta=np.ndarray>
    bounds_lon    (y, x, nvertex) float64 dask.array<chunksize=(294, 362, 4), meta=np.ndarray>
    lat           (y, x) float64 dask.array<chunksize=(294, 362), meta=np.ndarray>
  * lev           (lev) float64 0.5058 1.556 2.668 ... 5.698e+03 5.902e+03
    lev_bounds    (lev, axis_nbounds) float64 dask.array<chunksize=(75, 2), meta=np.ndarray>
    lon           (y, x) float64 dask.array<chunksize=(294, 362), meta=np.ndarray>
    ...            ...
    time_bounds   (time, axis_nbounds) object dask.array<chunksize=(1980, 2), meta=np.ndarray>
    bounds_lat_v  (y_c, x, nvertex) float64 dask.array<chunksize=(294, 362, 4), meta=np.ndarray>
    bounds_lon_v  (y_c, x, nvertex) float64 dask.array<chunksize=(294, 362, 4), meta=np.ndarray>
    lat_v         (y_c, x) float64 dask.array<chunksize=(294, 362), meta=np.ndarray>
    lon_v         (y_c, x) float64 dask.array<chunksize=(294, 362), meta=np.ndarray>
    areacello     (y, x) float32 dask.array<chunksize=(294, 362), meta=np.ndarray>
Dimensions without coordinates: y, x, nvertex, axis_nbounds, y_c
Data variables:
    thetao        (time, lev, y, x) float32 dask.array<chunksize=(4, 75, 294, 362), meta=np.ndarray>
    so            (time, lev, y, x) float32 dask.array<chunksize=(5, 75, 294, 362), meta=np.ndarray>
    vo            (time, lev, y_c, x) float32 dask.array<chunksize=(3, 75, 294, 362), meta=np.ndarray>
Attributes: (12/57)
    CMIP6_CV_version:        cv=6.2.3.0-7-g2019642
    Conventions:             CF-1.7 CMIP-6.2
    EXPID:                   CNRM-ESM2-1_historical_r1i1p1f2
    activity_id:             CMIP
    arpege_minor_version:    6.3.2
    branch_method:           standard
    ...                      ...
    xios_commit:             1442-shuffle
    status:                  2019-11-05;created;by nhn2@columbia.edu
    netcdf_tracking_ids:     hdl:21.14100/9c34b796-c31d-4c1f-be90-21d032267f6...
    version_id:              v20181206
    intake_esm_varname:      None
    intake_esm_dataset_key:  CMIP.CNRM-CERFACS.CNRM-ESM2-1.historical.r1i1p1f...

The grid is missing an outer coordinate for the Z axis, so we will construct one. This will be needed for conservative interpolation.

[20]:

import cf_xarray
level_outer_data = cf_xarray.bounds_to_vertices(ds.lev_bounds, 'axis_nbounds').load().data

ds = ds.assign_coords({'level_outer': level_outer_data})
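What the bounds-to-vertices step does can be sketched with NumPy for contiguous cells. The bounds below are made-up values, and cf_xarray itself may handle more cases than this minimal version:

```python
import numpy as np

# Hypothetical cell bounds for three contiguous layers, shape (n_cells, 2):
# each row holds the (upper edge, lower edge) of one layer in metres.
lev_bounds = np.array([[0.0, 10.0],
                       [10.0, 25.0],
                       [25.0, 50.0]])

# This shortcut is only valid if adjacent cells share an edge.
assert np.allclose(lev_bounds[1:, 0], lev_bounds[:-1, 1])

# The vertices are every upper edge plus the final lower edge.
level_outer = np.append(lev_bounds[:, 0], lev_bounds[-1, 1])

print(level_outer)  # n_cells + 1 values
```

The result has one more element than there are cells, which is what the outer position of an axis requires.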

Linear Interpolation#

Depth to Depth#

To illustrate linear interpolation, we will first interpolate salinity onto a uniformly spaced vertical grid.

[21]:

grid = Grid(ds, coords={'Z': {'center': 'lev'},
                        },
                periodic=False
            )
grid

[21]:

<xgcm.Grid>
Z Axis (not periodic, boundary=None):
  * center   lev

[22]:

target_depth_levels = np.arange(0,500,50)
salt_on_depth = grid.transform(ds.so, 'Z', target_depth_levels, target_data=None, method='linear')
salt_on_depth

[22]:

<xarray.DataArray 'so' (time: 1980, y: 294, x: 362, lev: 10)>
dask.array<transpose, shape=(1980, 294, 362, 10), dtype=float32, chunksize=(5, 294, 362, 10), chunktype=numpy.ndarray>
Coordinates:
    lat        (y, x) float64 dask.array<chunksize=(294, 362), meta=np.ndarray>
    lon        (y, x) float64 dask.array<chunksize=(294, 362), meta=np.ndarray>
  * time       (time) object 1850-01-16 12:00:00 ... 2014-12-16 12:00:00
    areacello  (y, x) float32 dask.array<chunksize=(294, 362), meta=np.ndarray>
  * lev        (lev) int64 0 50 100 150 200 250 300 350 400 450
Dimensions without coordinates: y, x

Note that the computation is lazy. (No data has been downloaded or computed yet.) We can trigger computation by plotting something.

[23]:

salt_on_depth.isel(time=0).sel(lev=50).plot()

[23]:

<matplotlib.collections.QuadMesh at 0x7fc8587b8550>

Depth to Potential Temperature#

We can also interpolate salinity onto temperature surfaces through linear interpolation.

[24]:

target_theta_levels = np.arange(-2, 36)
salt_on_theta = grid.transform(ds.so, 'Z', target_theta_levels, target_data=ds.thetao, method='linear')
salt_on_theta

[24]:

<xarray.DataArray 'so' (time: 1980, y: 294, x: 362, thetao: 38)>
dask.array<transpose, shape=(1980, 294, 362, 38), dtype=float32, chunksize=(4, 294, 362, 38), chunktype=numpy.ndarray>
Coordinates:
    lat        (y, x) float64 dask.array<chunksize=(294, 362), meta=np.ndarray>
    lon        (y, x) float64 dask.array<chunksize=(294, 362), meta=np.ndarray>
  * time       (time) object 1850-01-16 12:00:00 ... 2014-12-16 12:00:00
    areacello  (y, x) float32 dask.array<chunksize=(294, 362), meta=np.ndarray>
  * thetao     (thetao) int64 -2 -1 0 1 2 3 4 5 6 ... 27 28 29 30 31 32 33 34 35
Dimensions without coordinates: y, x
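For a single water column, the same idea can be sketched with NumPy. The profile values are made up, and marking out-of-range targets as NaN is an assumption of this sketch rather than a statement about xgcm's internals:

```python
import numpy as np

# Hypothetical column: temperature decreases with depth, salinity increases.
theta = np.array([20.0, 15.0, 10.0, 5.0])   # degC, surface to bottom
salt = np.array([35.0, 35.2, 35.4, 35.6])   # psu

target_theta = np.array([25.0, 17.5, 7.5, 0.0])

# np.interp expects increasing sample points, so flip the column, and mark
# targets outside the column's temperature range as missing.
salt_on_theta_column = np.interp(target_theta, theta[::-1], salt[::-1],
                                 left=np.nan, right=np.nan)
print(salt_on_theta_column)
```

Temperature levels that do not occur in the column (here 25 and 0 degC) have no salinity value, which is why maps on a given temperature surface only cover part of the ocean.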

[25]:

salt_on_theta.isel(time=0).sel(thetao=20).plot()

/home/jthielen/miniconda3/envs/test_env_xgcm/lib/python3.10/site-packages/numba/np/ufunc/gufunc.py:170: RuntimeWarning: invalid value encountered in _interp_1d_linear
  return self.ufunc(*args, **kwargs)

[25]:

<matplotlib.collections.QuadMesh at 0x7fc85a7b3760>

[26]:

salt_on_theta.isel(time=0).mean(dim='x').plot(x='y')

/home/jthielen/miniconda3/envs/test_env_xgcm/lib/python3.10/site-packages/numba/np/ufunc/gufunc.py:170: RuntimeWarning: invalid value encountered in _interp_1d_linear
  return self.ufunc(*args, **kwargs)

[26]:

<matplotlib.collections.QuadMesh at 0x7fc85a40d1e0>

Conservative Interpolation#

To do conservative interpolation, we will attempt to calculate the meridional overturning in temperature space. Note that this is not a perfectly precise calculation. However, it’s sufficient to illustrate the basic principles of the calculation.

Create another grid object for conservative interpolation.

[27]:

grid = Grid(ds, coords={'Z': {'center': 'lev', 'outer': 'level_outer'},
                        # the dataset has no `x_c` dimension, so X only gets a center position
                        'X': {'center': 'x'},
                        'Y': {'center': 'y', 'right': 'y_c'}
                        },
            periodic=False,
            )
grid

To use conservative interpolation, we have to go from an intensive quantity (velocity) to an extensive one (velocity times cell thickness). We fill any missing values with 0, since they don’t contribute to the transport.

[ ]:

thickness = grid.diff(ds.level_outer, 'Z')
v_transport =  ds.vo * thickness
v_transport = v_transport.fillna(0.).rename('v_transport')
v_transport
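The conversion from an intensive to an extensive quantity can be sketched for one column with NumPy. The interface depths and velocities are made-up values:

```python
import numpy as np

# Hypothetical column: layer interfaces (m) and layer-mean velocity (m/s),
# with NaN marking a layer below the sea floor.
level_outer_column = np.array([0.0, 10.0, 25.0, 50.0, 100.0])
v = np.array([0.2, 0.1, -0.05, np.nan])

# Thickness of each layer is the difference between its two interfaces.
layer_thickness = np.diff(level_outer_column)

# Velocity times thickness is extensive; missing layers contribute nothing.
column_transport = v * layer_thickness
column_transport = np.where(np.isnan(column_transport), 0.0, column_transport)

print(column_transport, column_transport.sum())
```

Unlike the velocities, the layer transports can be summed meaningfully, and that sum is what a conservative transform is meant to preserve.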

We also need to interpolate thetao, our target data for the transformation, to the same horizontal position as v_transport. This means moving from the cell center to the cell face along Y (the y_c position). This step introduces considerable errors, particularly near bathymetric boundaries. (Xgcm currently has no special treatment for internal boundary conditions; see issue 222.)

[ ]:

ds['theta'] = grid.interp(ds.thetao, ['Y'], boundary='extend')
ds.theta

We can transform v_transport to temperature space (target_theta_levels).

[ ]:

v_transport_theta = grid.transform(v_transport, 'Z', target_theta_levels,
                                   target_data=ds.theta, method='conservative')
v_transport_theta

Notice that this produced a warning. The conservative transformation method natively needs target_data to be provided on the cell bounds (here level_outer). Since transforming onto tracer coordinates is a very common scenario, xgcm uses linear interpolation to infer the values on the outer axis position.
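Why cell bounds matter can be seen in a minimal single-column sketch of conservative remapping. The helper `conservative_remap_column` is hypothetical and is not xgcm's algorithm; it simply spreads each cell's extensive value over the target bins in proportion to how much of the cell's target-coordinate range falls into each bin. All values are made up.

```python
import numpy as np

def conservative_remap_column(q, phi_outer, target_bounds):
    """Spread each cell's extensive value over target bins by overlap fraction.

    Illustrative helper, not part of xgcm.
    q             : extensive value per source cell, length n
    phi_outer     : target coordinate at the source cell bounds, length n + 1
    target_bounds : increasing bin edges in the target coordinate
    """
    out = np.zeros(len(target_bounds) - 1)
    for k in range(len(q)):
        lo = min(phi_outer[k], phi_outer[k + 1])
        hi = max(phi_outer[k], phi_outer[k + 1])
        if hi == lo:
            continue  # degenerate cell, skipped in this sketch
        # Length of the intersection of [lo, hi] with every target bin.
        overlap = np.minimum(hi, target_bounds[1:]) - np.maximum(lo, target_bounds[:-1])
        out += q[k] * np.clip(overlap, 0.0, None) / (hi - lo)
    return out

# Temperature at the cell bounds; note that it is not monotonic (12 -> 13).
theta_bounds = np.array([20.0, 16.0, 12.0, 13.0, 4.0])
cell_transport = np.array([2.0, 1.5, -1.25, 0.5])
target_bounds = np.arange(0.0, 25.0, 5.0)   # bin edges 0, 5, 10, 15, 20

binned = conservative_remap_column(cell_transport, theta_bounds, target_bounds)
print(binned, binned.sum(), cell_transport.sum())
```

Each cell is treated independently, so a non-monotonic profile poses no problem, and the total is preserved as long as the target bins cover the full range of the column.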

To demonstrate how to provide target_data on the outer grid position, we reproduce the steps xgcm executes internally:

[ ]:

theta_outer = grid.interp(ds.theta,['Z'], boundary='extend')
# the data cannot be chunked along the transformation axis
theta_outer = theta_outer.chunk({'level_outer': -1}).rename('theta')
theta_outer
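For one column, interpolating from cell centers to the outer position with an 'extend' boundary can be sketched as below. This assumes that the interpolation averages neighbouring values in index space after repeating the edge values; that reading of the behaviour is an assumption of this sketch, and the temperatures are made up.

```python
import numpy as np

theta_center = np.array([20.0, 15.0, 10.0, 5.0])   # hypothetical cell-center values

# 'extend' boundary (as assumed here): repeat the edge values at both ends ...
padded = np.concatenate([theta_center[:1], theta_center, theta_center[-1:]])

# ... then average neighbouring pairs to get values on the n + 1 cell bounds.
theta_on_bounds = 0.5 * (padded[:-1] + padded[1:])
print(theta_on_bounds)
```

The outermost bounds simply take the value of the nearest cell center, which is a crude but common choice at the surface and the bottom.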

When we apply the transformation we can see that the results in this case are equivalent:

[ ]:

v_transport_theta_manual = grid.transform(v_transport, 'Z', target_theta_levels,
                                   target_data=theta_outer, method='conservative')

# Warning: this step takes a long time to compute. We will only compare the first time value
xr.testing.assert_allclose(v_transport_theta_manual.isel(time=0), v_transport_theta.isel(time=0))

Now we verify visually that the vertically integrated transport is conserved under this transformation.

[ ]:

v_transport.isel(time=0).sum(dim='lev').plot(robust=True)

[ ]:

v_transport_theta.isel(time=0).sum(dim='theta').plot(robust=True)

Finally, we attempt to plot a crude meridional overturning streamfunction for a single timestep.

[ ]:

dx = 110e3 * np.cos(np.deg2rad(ds.lat_v))
(v_transport_theta.isel(time=0) * dx).sum(dim='x').cumsum(dim='theta').plot.contourf(x='y_c', levels=31)

Performance#

By default, xgcm performs some simple checks when using method='linear'. It checks whether the last value of the data is larger than the first, and if not, the data is flipped. This ensures that monotonically decreasing variables, like temperature, are interpolated correctly. These checks have a performance penalty (~30% in some preliminary tests).

If you have manually flipped your data and ensured that it is monotonically increasing, you can switch the checks off to get even better performance.

grid.transform(..., method='linear', bypass_checks=True)
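The kind of check described above can be sketched for a single column as follows. The helper `interp_with_check` is hypothetical and only mimics the described behaviour with NumPy; its `bypass_checks` argument borrows the name from the xgcm option but is otherwise an assumption of this sketch.

```python
import numpy as np

def interp_with_check(target_levels, target_data, data, bypass_checks=False):
    """Hypothetical helper mimicking the first-versus-last check for one column."""
    if not bypass_checks and target_data[-1] < target_data[0]:
        # Decreasing along the column: flip both arrays so np.interp is valid.
        target_data, data = target_data[::-1], data[::-1]
    return np.interp(target_levels, target_data, data)

theta_column = np.array([20.0, 15.0, 10.0, 5.0])   # decreasing with depth
salt_column = np.array([35.0, 35.2, 35.4, 35.6])
levels = np.array([17.5, 7.5])

# With the check enabled, the decreasing column is flipped automatically.
checked = interp_with_check(levels, theta_column, salt_column)

# With the check bypassed, the caller must supply increasing data.
manual = interp_with_check(levels, theta_column[::-1], salt_column[::-1],
                           bypass_checks=True)
print(checked, manual)
```

Both calls give the same result; skipping the check is only safe because the second call passes data that was already flipped.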

xgcm v0.8 documentation
