Transforming Vertical Coordinates#
A common need in the analysis of ocean and atmospheric data is to transform the vertical coordinate from its original coordinate (e.g. depth) to a new coordinate (e.g. density). Xgcm supports this sort of one-dimensional coordinate transform on a single Axis of an xgcm.Grid
object using the transform
method. Two algorithms are implemented:
Linear interpolation: Linear interpolation is designed to interpolate intensive quantities (e.g. temperature) from one coordinate to another. This method is suitable when the target coordinate is monotonically increasing or decreasing and the data variable is intensive. For example, you want to visualize oxygen on density surfaces from a z-coordinate ocean model.
Logarithmic interpolation: Logarithmic interpolation (which is linear interpolation done after applying a logarithm to the target coordinate) is also available. This method is suitable when variation of the intensive quantity is best related to the logarithm of the target coordinate, rather than the target coordinate itself. For example, you want to analyze data from a sigma-coordinate atmospheric model on isobaric (constant pressure) surfaces.
Conservative remapping: This algorithm is designed to conserve extensive quantities (e.g. transport, heat content). It requires knowledge of cell bounds in both the source and target coordinate. It also handles non-monotonic target coordinates.
On this page, we explain how to use these coordinate transformation capabilities.
[1]:
from xgcm import Grid
import xarray as xr
import numpy as np
import matplotlib.pyplot as plt
1D Toy Data Example#
First we will create a simple, one-dimensional dataset to illustrate how the transform
function works. This dataset contains
a coordinate called
z
, representing the original depth coordinatea data variable called
theta
, a function ofz
, which we want as our new vertical coordinatea data variable called
phi
, also a function ofz
, which represents the data we want to transform into this new coordinate space
In an oceanic context theta
might be density and phi
might be oxygen. In an atmospheric context, theta
might be potential temperature and phi
might be potential vorticity.
[2]:
z = np.arange(2, 12)
theta = xr.DataArray(np.log(z), dims=['z'], coords={'z': z})
phi = xr.DataArray(np.flip(np.log(z)*0.5+ np.random.rand(len(z))), dims=['z'], coords={'z':z})
ds = xr.Dataset({'phi': phi, 'theta': theta})
ds
[2]:
<xarray.Dataset> Dimensions: (z: 10) Coordinates: * z (z) int64 2 3 4 5 6 7 8 9 10 11 Data variables: phi (z) float64 1.633 1.165 1.233 1.637 ... 1.657 1.358 1.423 1.133 theta (z) float64 0.6931 1.099 1.386 1.609 ... 2.079 2.197 2.303 2.398
Let’s plot this data. Note that, for a simple 1D profile, we can easily visualize phi
in theta
space by simply plotting phi
vs. theta
. This is essentially a form of linear interpolation, performed automatically by matplotlib when it draws lines between the discrete points of our data.
[3]:
def plot_profile():
fig, (ax1, ax2, ax3) = plt.subplots(ncols=3, figsize=[8,5])
ds.theta.plot(ax=ax1, y='z', marker='.', yincrease=False)
ds.phi.plot(ax=ax2, y='z', marker='.', yincrease=False)
ds.swap_dims({'z': 'theta'}).phi.plot(ax=ax3, y='theta', marker='.', yincrease=False)
fig.subplots_adjust(wspace=0.5)
return ax3
plot_profile();

Linear transformation#
Ok now lets transform phi
to theta
coordinates using linear interpolation. A key part of this is to define specific theta
levels onto which we want to interpolate the data.
[4]:
# First create an xgcm grid object
grid = Grid(ds, coords={'Z': {'center':'z'}}, periodic=False)
# define the target values in density, linearly spaced
theta_target = np.linspace(0, 3, 20)
# and transform
phi_transformed = grid.transform(ds.phi, 'Z', theta_target, target_data=ds.theta)
phi_transformed
[4]:
<xarray.DataArray 'phi' (theta: 20)> array([ nan, nan, nan, nan, nan, 1.52174096, 1.33957156, 1.16664694, 1.20395286, 1.29590695, 1.5814605 , 1.56436105, 1.30629331, 1.56431596, 1.36623482, 1.22265461, nan, nan, nan, nan]) Coordinates: * theta (theta) float64 0.0 0.1579 0.3158 0.4737 ... 2.526 2.684 2.842 3.0
Now let’s see what the result looks like.
[5]:
ax = plot_profile()
phi_transformed.plot(ax=ax, y='theta', marker='.')
[5]:
[<matplotlib.lines.Line2D at 0x7fc864c01c90>]

Not too bad. We can increase the number of interpolation levels to capture more of the small scale structure.
[6]:
target_theta = np.linspace(0,3, 100)
phi_transformed = grid.transform(ds.phi, 'Z', target_theta, target_data=ds.theta)
ax = plot_profile()
phi_transformed.plot(ax=ax, y='theta', marker='.')
[6]:
[<matplotlib.lines.Line2D at 0x7fc864af5450>]

Note that by default, values of theta_target
which lie outside the range of theta
have been masked (set to NaN
). To disable this behavior, you can pass mask_edges=False
; values outside the range of theta
will be filled with the nearest valid value.
[7]:
target_theta = np.linspace(0,3, 60)
phi_transformed = grid.transform(ds.phi, 'Z', target_theta, target_data=ds.theta, mask_edges=False)
ax = plot_profile()
phi_transformed.plot(ax=ax, y='theta', marker='.')
[7]:
[<matplotlib.lines.Line2D at 0x7fc85b30f3a0>]

Conservative transformation#
Conservative transformation is designed to preseve the total sum of phi
over the Z
axis. It presumes that phi
is an extensive quantity, i.e. a quantity that is already volume weighted, with respect to the Z axis: for example, units of Kelvins * meters
for heat content, rather than just Kelvins
. The conservative method requires more input data at the moment. You have to not only specify the coordinates of the cell centers, but also the cell faces (or bounds/boundaries). In
xgcm we achieve this by defining the bounding coordinates as the outer
axis position. The target theta
values are likewise intepreted as cell boundaries in theta
-space. In this way, conservative transformation is similar to calculating a histogram.
[8]:
# define the cell bounds in depth
zc = np.arange(1,12)+0.5
# add them to the existing dataset
ds = ds.assign_coords({'zc': zc})
ds
[8]:
<xarray.Dataset> Dimensions: (z: 10, zc: 11) Coordinates: * z (z) int64 2 3 4 5 6 7 8 9 10 11 * zc (zc) float64 1.5 2.5 3.5 4.5 5.5 6.5 7.5 8.5 9.5 10.5 11.5 Data variables: phi (z) float64 1.633 1.165 1.233 1.637 ... 1.657 1.358 1.423 1.133 theta (z) float64 0.6931 1.099 1.386 1.609 ... 2.079 2.197 2.303 2.398
[9]:
# Recreate the grid object with a staggered `center`/`outer` coordinate layout
grid = Grid(ds, coords={'Z':{'center':'z', 'outer':'zc'}},
periodic=False)
grid
[9]:
<xgcm.Grid>
Z Axis (not periodic, boundary=None):
* center z --> outer
* outer zc --> center
Currently the target_data
(theta
in this case) has to be located on the outer
coordinate for the conservative method (compared to the center
for the linear method).
We can easily interpolate theta
on the outer coordinate with the grid object.
[10]:
ds['theta_outer'] = grid.interp(ds.theta, 'Z', boundary='fill')
ds['theta_outer']
[10]:
<xarray.DataArray 'theta_outer' (zc: 11)> array([0.34657359, 0.89587973, 1.24245332, 1.49786614, 1.70059869, 1.86883481, 2.01267585, 2.13833306, 2.24990484, 2.35024018, 1.19894764]) Coordinates: * zc (zc) float64 1.5 2.5 3.5 4.5 5.5 6.5 7.5 8.5 9.5 10.5 11.5
Now lets transform the data using the conservative method. Note that the target values will now be interpreted as cell bounds and not cell centers as before.
[11]:
# define the target values in density
theta_target = np.linspace(0,3, 20)
# and transform
phi_transformed_cons = grid.transform(ds.phi,
'Z',
theta_target,
method='conservative',
target_data=ds.theta_outer)
phi_transformed_cons
/home/jthielen/develop/xgcm/xgcm/transform.py:247: FutureWarning: ``output_sizes`` should be given in the ``dask_gufunc_kwargs`` parameter. It will be removed as direct parameter in a future version.
out = xr.apply_ufunc(
[11]:
<xarray.DataArray 'phi' (theta_outer: 19)> array([0. , 0. , 0.37785116, 0.46936054, 0.46936054, 0.48939379, 0.53079432, 0.62434095, 0.91765916, 1.18077243, 1.46775848, 1.57323631, 1.6610291 , 2.16457924, 2.03956194, 0. , 0. , 0. , 0. ]) Coordinates: * theta_outer (theta_outer) float64 0.07895 0.2368 0.3947 ... 2.763 2.921
[12]:
phi_transformed_cons.plot(y='theta_outer', marker='.', yincrease=False)
[12]:
[<matplotlib.lines.Line2D at 0x7fc85b1c52a0>]

There is no point in comparing phi_transformed_cons
directly to phi
or the results of linear interoplation, since here we have reinterpreted phi
as an extensive quantity. However, we can verify that the sum of the two quantities over the Z axis is exactly the same.
[13]:
ds.phi.sum().values
[13]:
array(13.96569797)
[14]:
phi_transformed_cons.sum().values
[14]:
array(13.96569797)
Logarithmic Interpolation#
Since logarithmic interpolation is most often used when tranforming to pressure (isobaric) coordinates, let’s use a new set of example data:
[15]:
ds = xr.Dataset(
{
'temperature': ('sigma', [271.75452, 272.79956, 274.8517, 279.22043, 296.48782]),
'pressure': ('sigma', [100180.625, 96250.0, 87369.14, 72186.66, 53718.586]),
},
{
'sigma': [0.9969, 0.9558, 0.8631, 0.7046, 0.5117]
}
)
Let’s now transform temperature
to pressure
coordinates. As before, a key part of this is to define specific pressure
levels onto which we want to interpolate the data.
[16]:
# Create an xgcm grid object
grid = Grid(ds, coords={'Z': {'center':'sigma'}}, periodic=False)
# Define isobaric levels of interest (a few standard levels)
isobaric_target_levels = np.array([1.0e5, 8.5e4, 7.0e4])
# and transform
temperature_isobaric = grid.transform(ds.temperature, 'Z', isobaric_target_levels, target_data=ds.pressure, method='log')
temperature_isobaric
[16]:
<xarray.DataArray 'temperature' (pressure: 3)> array([271.80163709, 275.48086957, 281.01789905]) Coordinates: * pressure (pressure) float64 1e+05 8.5e+04 7e+04
And finally, let’s see what the result looks like.
[17]:
fig, (ax1, ax2, ax3) = plt.subplots(ncols=3, figsize=[8,5])
ds.pressure.plot(ax=ax1, y='sigma', marker='.', yincrease=False)
ds.temperature.plot(ax=ax2, y='sigma', marker='.', yincrease=False)
ds.swap_dims({'sigma': 'pressure'}).temperature.plot(ax=ax3, y='pressure', marker='.', yincrease=False)
temperature_isobaric.plot(ax=ax3, y='pressure', marker='.')
fig.subplots_adjust(wspace=0.7)
plt.show();

Realistic Data Example#
To illustrate these features in a more realistic example, we use data from the CNRM CMIP6 model. These data are available from the Pangeo Cloud Data Library. We can see that this is a full, global, 4D ocean dataset.
[18]:
import intake
col = intake.open_esm_datastore("https://storage.googleapis.com/cmip6/pangeo-cmip6.json")
cat = col.search(
source_id = 'CNRM-ESM2-1',
member_id = 'r1i1p1f2',
experiment_id = 'historical',
variable_id= ['thetao','so','vo','areacello'],
grid_label = 'gn'
)
ddict = cat.to_dataset_dict(zarr_kwargs={'consolidated':True, 'use_cftime':True}, aggregate=False)
--> The keys in the returned dictionary of datasets are constructed as follows:
'activity_id.institution_id.source_id.experiment_id.member_id.table_id.variable_id.grid_label.zstore.dcpp_init_year.version'
[19]:
thetao = ddict['CMIP.CNRM-CERFACS.CNRM-ESM2-1.historical.r1i1p1f2.Omon.thetao.gn.gs://cmip6/CMIP6/CMIP/CNRM-CERFACS/CNRM-ESM2-1/historical/r1i1p1f2/Omon/thetao/gn/v20181206/.nan.20181206']
so = ddict['CMIP.CNRM-CERFACS.CNRM-ESM2-1.historical.r1i1p1f2.Omon.so.gn.gs://cmip6/CMIP6/CMIP/CNRM-CERFACS/CNRM-ESM2-1/historical/r1i1p1f2/Omon/so/gn/v20181206/.nan.20181206']
vo = ddict['CMIP.CNRM-CERFACS.CNRM-ESM2-1.historical.r1i1p1f2.Omon.vo.gn.gs://cmip6/CMIP6/CMIP/CNRM-CERFACS/CNRM-ESM2-1/historical/r1i1p1f2/Omon/vo/gn/v20181206/.nan.20181206']
areacello = ddict['CMIP.CNRM-CERFACS.CNRM-ESM2-1.historical.r1i1p1f2.Ofx.areacello.gn.gs://cmip6/CMIP6/CMIP/CNRM-CERFACS/CNRM-ESM2-1/historical/r1i1p1f2/Ofx/areacello/gn/v20181206/.nan.20181206'].areacello
vo = vo.rename({'y':'y_c', 'lon':'lon_v', 'lat':'lat_v', 'bounds_lon':'bounds_lon_v', 'bounds_lat':'bounds_lat_v'})
ds = xr.merge([thetao,so,vo], compat='override')
ds = ds.assign_coords(areacello=areacello.fillna(0))
ds
[19]:
<xarray.Dataset> Dimensions: (y: 294, x: 362, nvertex: 4, lev: 75, axis_nbounds: 2, time: 1980, y_c: 294) Coordinates: (12/13) bounds_lat (y, x, nvertex) float64 dask.array<chunksize=(294, 362, 4), meta=np.ndarray> bounds_lon (y, x, nvertex) float64 dask.array<chunksize=(294, 362, 4), meta=np.ndarray> lat (y, x) float64 dask.array<chunksize=(294, 362), meta=np.ndarray> * lev (lev) float64 0.5058 1.556 2.668 ... 5.698e+03 5.902e+03 lev_bounds (lev, axis_nbounds) float64 dask.array<chunksize=(75, 2), meta=np.ndarray> lon (y, x) float64 dask.array<chunksize=(294, 362), meta=np.ndarray> ... ... time_bounds (time, axis_nbounds) object dask.array<chunksize=(1980, 2), meta=np.ndarray> bounds_lat_v (y_c, x, nvertex) float64 dask.array<chunksize=(294, 362, 4), meta=np.ndarray> bounds_lon_v (y_c, x, nvertex) float64 dask.array<chunksize=(294, 362, 4), meta=np.ndarray> lat_v (y_c, x) float64 dask.array<chunksize=(294, 362), meta=np.ndarray> lon_v (y_c, x) float64 dask.array<chunksize=(294, 362), meta=np.ndarray> areacello (y, x) float32 dask.array<chunksize=(294, 362), meta=np.ndarray> Dimensions without coordinates: y, x, nvertex, axis_nbounds, y_c Data variables: thetao (time, lev, y, x) float32 dask.array<chunksize=(4, 75, 294, 362), meta=np.ndarray> so (time, lev, y, x) float32 dask.array<chunksize=(5, 75, 294, 362), meta=np.ndarray> vo (time, lev, y_c, x) float32 dask.array<chunksize=(3, 75, 294, 362), meta=np.ndarray> Attributes: (12/57) CMIP6_CV_version: cv=6.2.3.0-7-g2019642 Conventions: CF-1.7 CMIP-6.2 EXPID: CNRM-ESM2-1_historical_r1i1p1f2 activity_id: CMIP arpege_minor_version: 6.3.2 branch_method: standard ... ... xios_commit: 1442-shuffle status: 2019-11-05;created;by nhn2@columbia.edu netcdf_tracking_ids: hdl:21.14100/9c34b796-c31d-4c1f-be90-21d032267f6... version_id: v20181206 intake_esm_varname: None intake_esm_dataset_key: CMIP.CNRM-CERFACS.CNRM-ESM2-1.historical.r1i1p1f...
The grid is missing an outer
coordinate for the Z axis, so we will construct one. This will be needed for conservative interpolation.
[20]:
import cf_xarray
level_outer_data = cf_xarray.bounds_to_vertices(ds.lev_bounds, 'axis_nbounds').load().data
ds = ds.assign_coords({'level_outer': level_outer_data})
Linear Interpolation#
Depth to Depth#
To illustrate linear interpolation, we will first interpolate salinity onto a uniformly spaced vertical grid.
[21]:
grid = Grid(ds, coords={'Z': {'center': 'lev'},
},
periodic=False
)
grid
[21]:
<xgcm.Grid>
Z Axis (not periodic, boundary=None):
* center lev
[22]:
target_depth_levels = np.arange(0,500,50)
salt_on_depth = grid.transform(ds.so, 'Z', target_depth_levels, target_data=None, method='linear')
salt_on_depth
[22]:
<xarray.DataArray 'so' (time: 1980, y: 294, x: 362, lev: 10)> dask.array<transpose, shape=(1980, 294, 362, 10), dtype=float32, chunksize=(5, 294, 362, 10), chunktype=numpy.ndarray> Coordinates: lat (y, x) float64 dask.array<chunksize=(294, 362), meta=np.ndarray> lon (y, x) float64 dask.array<chunksize=(294, 362), meta=np.ndarray> * time (time) object 1850-01-16 12:00:00 ... 2014-12-16 12:00:00 areacello (y, x) float32 dask.array<chunksize=(294, 362), meta=np.ndarray> * lev (lev) int64 0 50 100 150 200 250 300 350 400 450 Dimensions without coordinates: y, x
Note that the computation is lazy. (No data has been downloaded or computed yet.) We can trigger computation by plotting something.
[23]:
salt_on_depth.isel(time=0).sel(lev=50).plot()
[23]:
<matplotlib.collections.QuadMesh at 0x7fc8587b8550>

Depth to Potential Temperature#
We can also interpolate salinity onto temperature surface through linear interpolation.
[24]:
target_theta_levels = np.arange(-2, 36)
salt_on_theta = grid.transform(ds.so, 'Z', target_theta_levels, target_data=ds.thetao, method='linear')
salt_on_theta
[24]:
<xarray.DataArray 'so' (time: 1980, y: 294, x: 362, thetao: 38)> dask.array<transpose, shape=(1980, 294, 362, 38), dtype=float32, chunksize=(4, 294, 362, 38), chunktype=numpy.ndarray> Coordinates: lat (y, x) float64 dask.array<chunksize=(294, 362), meta=np.ndarray> lon (y, x) float64 dask.array<chunksize=(294, 362), meta=np.ndarray> * time (time) object 1850-01-16 12:00:00 ... 2014-12-16 12:00:00 areacello (y, x) float32 dask.array<chunksize=(294, 362), meta=np.ndarray> * thetao (thetao) int64 -2 -1 0 1 2 3 4 5 6 ... 27 28 29 30 31 32 33 34 35 Dimensions without coordinates: y, x
[25]:
salt_on_theta.isel(time=0).sel(thetao=20).plot()
/home/jthielen/miniconda3/envs/test_env_xgcm/lib/python3.10/site-packages/numba/np/ufunc/gufunc.py:170: RuntimeWarning: invalid value encountered in _interp_1d_linear
return self.ufunc(*args, **kwargs)
[25]:
<matplotlib.collections.QuadMesh at 0x7fc85a7b3760>

[26]:
salt_on_theta.isel(time=0).mean(dim='x').plot(x='y')
/home/jthielen/miniconda3/envs/test_env_xgcm/lib/python3.10/site-packages/numba/np/ufunc/gufunc.py:170: RuntimeWarning: invalid value encountered in _interp_1d_linear
return self.ufunc(*args, **kwargs)
[26]:
<matplotlib.collections.QuadMesh at 0x7fc85a40d1e0>

Conservative Interpolation#
To do conservative interpolation, we will attempt to calculate the meridional overturning in temperature space. Note that this is not a perfectly precise calculation. However, it’s sufficient to illustrate the basic principles of the calculation.
Create another grid object for conservative interpolation.
[27]:
grid = Grid(ds, coords={'Z': {'center': 'lev', 'outer': 'level_outer'},
'X': {'center': 'x', 'right': 'x_c'},
'Y': {'center': 'y', 'right': 'y_c'}
},
periodic=False,
)
grid
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Input In [27], in <cell line: 1>()
----> 1 grid = Grid(ds, coords={'Z': {'center': 'lev', 'outer': 'level_outer'},
2 'X': {'center': 'x', 'right': 'x_c'},
3 'Y': {'center': 'y', 'right': 'y_c'}
4 },
5 periodic=False,
6 )
7 grid
File ~/develop/xgcm/xgcm/grid.py:1271, in Grid.__init__(self, ds, check_dims, periodic, default_shifts, face_connections, coords, metrics, boundary, fill_value)
1269 for pos, dim in positions.items():
1270 if not (dim in ds.variables or dim in ds.dims):
-> 1271 raise ValueError(
1272 f"Could not find dimension `{dim}` (for the `{pos}` position on axis `{axis}`) in input dataset."
1273 )
1274 if dim not in ds.dims:
1275 raise ValueError(
1276 f"Input `{dim}` (for the `{pos}` position on axis `{axis}`) is not a dimension in the input datasets `ds`."
1277 )
ValueError: Could not find dimension `x_c` (for the `right` position on axis `X`) in input dataset.
To use conservative interpolation, we have to go from an intensive quantity (velocity) to an extensive one (velocity times cell thickness). We fill any missing values with 0, since they don’t contribute to the transport.
[ ]:
thickness = grid.diff(ds.level_outer, 'Z')
v_transport = ds.vo * thickness
v_transport = v_transport.fillna(0.).rename('v_transport')
v_transport
We also need to interpolate theta
or thetao, our target data for interpolation, to the same horizontal position as v_transport
. This means moving from cell center to cell corner. This step introduces some considerable errors, particularly near the boundaries of bathymetry. (Xgcm currently has no special treatment for internal boundary conditions–see issue 222.)
[ ]:
ds['theta'] = grid.interp(ds.thetao, ['Y'], boundary='extend')
ds.theta
We can transform v_transport
to temperature space (target_theta_levels
).
[ ]:
v_transport_theta = grid.transform(v_transport, 'Z', target_theta_levels,
target_data=ds.theta, method='conservative')
v_transport_theta
Notice that this produced a warning. The conservative
transformation method natively needs target_data
to be provided on the cell bounds (here level_outer
). Since transforming onto tracer coordinates is a very common scenario, xgcm uses linear interpolation to infer the values on the outer
axis position.
To demonstrate how to provide target_data
on the outer grid position, we reproduce the steps xgcm executes internally:
[ ]:
theta_outer = grid.interp(ds.theta,['Z'], boundary='extend')
# the data cannot be chunked along the transformation axis
theta_outer = theta_outer.chunk({'level_outer': -1}).rename('theta')
theta_outer
When we apply the transformation we can see that the results in this case are equivalent:
[ ]:
v_transport_theta_manual = grid.transform(v_transport, 'Z', target_theta_levels,
target_data=theta_outer, method='conservative')
# Warning: this step takes a long time to compute. We will only compare the first time value
xr.testing.assert_allclose(v_transport_theta_manual.isel(time=0), v_transport_theta.isel(time=0))
Now we verify visually that the vertically integrated transport is conserved under this transformation.
[ ]:
v_transport.isel(time=0).sum(dim='lev').plot(robust=True)
[ ]:
v_transport_theta.isel(time=0).sum(dim='theta').plot(robust=True)
Finally, we attempt to plot a crude meridional overturning streamfunction for a single timestep.
[ ]:
dx = 110e3 * np.cos(np.deg2rad(ds.lat_v))
(v_transport_theta.isel(time=0) * dx).sum(dim='x').cumsum(dim='theta').plot.contourf(x='y_c', levels=31)
Logarithmic Interpolation#
As noted previously, logarithmic interpolation is most often used to interpolate data from atmospheric models with a non-isobaric vertical coordinate (such as sigma or hybrid sigma) to isobaric levels suitable for analysis. And so, in place of the previous 4D ocean dataset, let’s generate a synthetic 3D atmospheric dataset (loosely based on the Sanders 1971 Analytic Model) to use to explore logarithmic interpolation:
[28]:
def generate_analytic_model(
nx=250,
ny=150,
nsigma=30,
L=3e6,
y_max=1.8e6,
y_scale_factor=0.333,
alpha=0.722,
terrain_rise=1.5e3,
pressure_variability=1e3,
pressure_sea_level_mean=1e5,
pressure_top=5e3,
scale_height=8e3,
temperature_variability=5,
temperature_mean_surface=300,
temperature_mean_tropopause=228.5,
a=1.12e-5,
b=2.3e-3,
k=20
):
"""Generate sythetic data for an atmospheric trough over a Gaussian terrain.
Parameters
----------
nx : int
Count of grid points in x direction
ny : int
Count of grid points in y direction
nsigma : int
Count of vertical sigma coordinate levels
L : float
Zonal wavelength of trough [meters]
y_max : float
Meridional extent of domain [meters]
alpha : float
Vertical control parameter for tropopause location [dimensionless]
terrain_rise : float
Max height of the terrain [meters]
pressure_variability : float
Wave amplitude of sea-level pressure field [Pa]
pressure_sea_level_mean : float
Domain average of sea-level pressure field [Pa]
pressure_top : float
Uniform pressure at top of domain [Pa]
scale_height : float
Control parameter for conversion of sea-level pressure to surface pressure [meters]
temperature_variability : float
Wave amplitude of temperature perturbation field [K]
temperature_mean_surface : float
Horizontal mean of temperature at ground level [K]
temperature_mean_tropopause : float
Horizontal mean of temperature at model top [K]
a : float
Meridional control parameter for temperature field shape [K / m]
b : float
Meridional control parameter for temperature field shape [Pa / m]
k : float
Vertical control parameter for mean temperature profile sharpness in vertical
"""
# Constants
R = 287.05
g = 9.81
# Define coordinates (and broadcasted versions)
x = np.linspace(-L / 2, L / 2, nx)
y = np.linspace(0, y_max, ny)
sigma = np.linspace(1, 0, nsigma)
x_2d, y_2d = np.meshgrid(x, y)
x_3d, y_3d, sigma_3d = np.meshgrid(x, y, sigma)
# Sea-level pressure (2D) as a trough shape
pressure_sea_level = (
pressure_sea_level_mean
- b * y_2d * y_scale_factor
- pressure_variability * np.cos(2 * np.pi / L * x_2d) * np.cos(2 * np.pi / L * y_2d * y_scale_factor)
)
# Terrain height as an offset Gaussian shape (uniform in meridional direction)
geopotential_height_surface = np.exp(-((x_2d + L / 3) / (L / 3))**2) * terrain_rise
# Surface pressure (2D) from sea-level pressure and terrain height
pressure_surface = pressure_sea_level * np.exp(
-geopotential_height_surface / scale_height
)
# Pressure (3D) from definition of sigma coordinate
pressure = sigma_3d * (pressure_surface - pressure_top)[..., None] + pressure_top
# Trough component of temperature
temperature_pertubation = (
-(1 + alpha * np.log(pressure_sea_level_mean / pressure))
* (
a * y_3d * y_scale_factor
+ temperature_variability * np.cos(2 * np.pi / L * x_3d) * np.cos(2 * np.pi / L * y_3d * y_scale_factor)
)
)
# Vertical component of temperature
temperature_mean = (
temperature_mean_tropopause
+ (
np.log(1 + np.exp(k * (sigma + alpha - 1)))
/ np.log(1 + np.exp(k * alpha))
) * (temperature_mean_surface - temperature_mean_tropopause)
)
# Combine and calcuate temperature and geopotential 3D
temperature = temperature_mean[None, None] + temperature_pertubation
geopotential = g * geopotential_height_surface[..., None] - np.concatenate(
(
np.zeros_like(x_2d)[..., None],
np.cumsum(
(
R
* (temperature[..., :-1] + temperature[..., 1:])
/ (sigma_3d[..., :-1] + sigma_3d[..., 1:])
* np.diff(sigma_3d, axis=-1)
),
axis=-1
)
),
axis=-1
)
# Return dataset!
return xr.Dataset(
{
'temperature': (('y', 'x', 'sigma'), temperature),
'geopotential': (('y', 'x', 'sigma'), geopotential),
'pressure': (('y', 'x', 'sigma'), pressure),
},
{
'y': y,
'x': x,
'sigma': sigma
}
), {
'p_init': pressure_sea_level,
't_init': temperature_pertubation[..., 0]
}
ds, stuff = generate_analytic_model()
ds
[28]:
<xarray.Dataset> Dimensions: (y: 150, x: 250, sigma: 30) Coordinates: * y (y) float64 0.0 1.208e+04 2.416e+04 ... 1.788e+06 1.8e+06 * x (x) float64 -1.5e+06 -1.488e+06 ... 1.488e+06 1.5e+06 * sigma (sigma) float64 1.0 0.9655 0.931 ... 0.06897 0.03448 0.0 Data variables: temperature (y, x, sigma) float64 305.5 302.2 298.9 ... 215.3 214.1 212.2 geopotential (y, x, sigma) float64 1.146e+04 1.452e+04 ... 3.418e+05 pressure (y, x, sigma) float64 8.728e+04 8.444e+04 ... 8.238e+03 5e+03
This synthetic model gives a low-pressure trough centered in the domain, with a sloping terrain in the zonal direction. Here is what that terrain (surface geopotential height) looks like:
[29]:
(ds['geopotential'] / 9.81).rename('geopotential_height').sel(sigma=1, y=0).plot()
[29]:
[<matplotlib.lines.Line2D at 0x7fc85a69d4b0>]

If we were to inspect a given vertical level of these data, it would be difficult to interpret due to the terrain-following nature of the sigma coordinate:
[30]:
ds['geopotential'].isel(sigma=15).plot.contourf()
[30]:
<matplotlib.contour.QuadContourSet at 0x7fc854a50df0>

And so, let’s interpolate to some common meteorological upper-air levels (750 hPa, 500 hPa, and 300 hPa) for plotting and analysis:
[31]:
grid = Grid(
ds,
coords={
'X': {'center': 'x'},
'Y': {'center': 'y'},
'Z': {'center': 'sigma'}
},
periodic=False
)
isobaric_levels = np.array([7.5e4, 5.0e4, 3.0e4])
geopotential_isobaric = grid.transform(
ds['geopotential'],
'Z',
isobaric_levels,
target_data=ds['pressure'],
method='log'
)
temperature_isobaric = grid.transform(
ds['temperature'],
'Z',
isobaric_levels,
target_data=ds['pressure'],
method='log'
)
ds_isobaric = xr.merge([geopotential_isobaric, temperature_isobaric])
ds_isobaric
[31]:
<xarray.Dataset> Dimensions: (y: 150, x: 250, pressure: 3) Coordinates: * y (y) float64 0.0 1.208e+04 2.416e+04 ... 1.788e+06 1.8e+06 * x (x) float64 -1.5e+06 -1.488e+06 ... 1.488e+06 1.5e+06 * pressure (pressure) float64 7.5e+04 5e+04 3e+04 Data variables: geopotential (y, x, pressure) float64 2.529e+04 6.027e+04 ... 9.459e+04 temperature (y, x, pressure) float64 291.3 262.7 242.8 ... 240.8 221.8
An additional benefit of now having our data in isobaric coordinates is that the form of kinematics and dynamics formulas are more straightforward compared to non-isobaric forms. To demonstrate this, let’s plot the 300 hPa Temperature and Geostrophic Wind:
[32]:
# Add inner coords for derivatives
ds_isobaric.coords['x_inner'] = (ds_isobaric['x'].values[:-1] + ds_isobaric['x'].values[1:]) / 2
ds_isobaric.coords['y_inner'] = (ds_isobaric['y'].values[:-1] + ds_isobaric['y'].values[1:]) / 2
# Create new grid
grid_isobaric = Grid(
ds_isobaric,
coords={
'X': {'center': 'x', 'inner': 'x_inner'},
'Y': {'center': 'y', 'inner': 'y_inner'},
'Z': {'center': 'pressure'}
},
periodic=False
)
# Calculate geostrophic wind components (without metrics)
f = 1.2e-4 # f-plane approximation of Coriolis force
u_wind = grid_isobaric.interp(
-grid_isobaric.diff(ds_isobaric['geopotential'], 'Y', to='inner') / grid_isobaric.diff(ds_isobaric['y'], 'Y', to='inner') / f,
to='inner',
axis='X'
)
v_wind = grid_isobaric.interp(
grid_isobaric.diff(ds_isobaric['geopotential'], 'X', to='inner') / grid_isobaric.diff(ds_isobaric['x'], 'X', to='inner') / f,
to='inner',
axis='Y'
)
# Plot
level = 3.0e4
fig, ax = plt.subplots(1, 1, figsize=[8,5])
ds_isobaric['temperature'].sel(pressure=level).plot.contourf(ax=ax, alpha=0.3)
ds_isobaric['geopotential'].sel(pressure=level).plot.contour(ax=ax, colors='k')
barb_reduce = slice(10, -10, 25)
ax.barbs(
u_wind.x_inner.values[barb_reduce],
u_wind.y_inner.values[barb_reduce],
u_wind.sel(pressure=level).values[barb_reduce, barb_reduce],
v_wind.sel(pressure=level).values[barb_reduce, barb_reduce]
)
ax.set_aspect('equal')
plt.show();

Performance#
By default xgcm performs some simple checks when using method='linear'
. It checks if the last value of the data is larger than the first, and if not, the data is flipped.This ensures that monotonically decreasing variables, like temperature are interpolated correctly. These checks have a performance penalty (~30% in some preliminary tests).
If you have manually flipped your data and ensured that its monotonically increasing, you can switch the checks off to get even better performance.
grid.transform(..., method='linear', bypass_checks=True)