# Generate missing C-grid coordinates¶

Climate model output is usually provided with geometric information like the position of the cell center and boundaries. Gridded observational datasets often lack these informations and provide only the position of either the gridcell center or cell boundary.

This makes the calculation of common vector calculus operators like the gradient difficult, and results depend on the particular method used.

`xgcm`

can infer the cell boundary or cell center location depending on the geometry of the gridded source dataset. This enables consistent and easily reproducible calculations across model and observational datasets.

The autogenerate module enables the user to apply the same methods to both model output and observational products, which enables better comparison and a unified workflow using different sources of data.

In this case xgcm can infer the missing coordinates to enable the creation of a grid object. Below we will illustrate how to infer coordinates for several example datasets (nonperiodic and periodic) and show how the resulting dataset can be used to perform common calculations like gradients and distance/area weighted averages on observational data.

## Non periodic 1D example: Ocean temperature profile¶

First let’s import xarray and xgcm

```
[ ]:
```

```
import numpy as np
import xarray as xr
from xgcm import Grid
from xgcm.autogenerate import generate_grid_ds
import matplotlib.pyplot as plt
%matplotlib inline
```

Below we will create a synthetic temperature profile with decreasing temperature at depth, with unevenly spaced depth intervals (commonly found in hydrographic data).

```
[ ]:
```

```
# create a synthetic ocean temperature profile with uneven spacing in depth
z = np.hstack([np.arange(1,10, 1), np.arange(10,200, 10), np.arange(200,700, 20)])
# Create synthetic temperature profile with maximum temperature gradient at mid depth (e.g. the thermocline)
temp = ((np.cos(np.pi*z/700) + 1) + np.exp(-z/350) / 2) * 10
# Convert to xarray.Dataset
ds = xr.DataArray(temp, dims=['depth'], coords={'depth':z}).to_dataset(name='temperature')
ds.temperature.plot(y='depth', yincrease=False)
```

### Infer the cell boundaries using xgcm.autogenerate¶

generate_grid_ds can infer the missing cell positions based on the given position (defaults to cell center) and the axis, which is defined by passing a dictionary with the physical axis as key and the dataset dimensions belonging to that axis as values.

```
[ ]:
```

```
# Generate 'full' dataset, which includes additional coordinate `depth_left` and appropriate attributes.
ds_full = generate_grid_ds(ds, {'Z':'depth'})
print(ds_full)
print(ds.depth.data)
print(ds_full.depth_left.data)
```

We see now that a new dimension `depth_left`

was created, with cell boundaries shifted towards the surface

The default behaviour of generate_grid_ds is to extrapolate the grid position to the ‘left’ (e.g. towards the surface for a depth profile), assuming that the spacing in the two cells closest to the boundary (here: the first and second cell) is equal. Particular geometries might require adjustments of the boundary treatment, by specifying e.g. `pad=0`

to ensure the topmost cell boundary is located at the sea surface.

Finally we can create the xgcm.Grid object like we would from model output (see for example here)

```
[ ]:
```

```
# Create grid object
grid = Grid(ds_full, periodic=False)
print(grid)
```

### Calculate vertical gradient¶

Now we have all the tools we need to calculate the vertical gradient just like with numerical model output

```
[ ]:
```

```
# Calculate vertical distances located on the cellboundary
ds_full.coords['dzc'] = grid.diff(ds_full.depth, 'Z', boundary='extrapolate')
# Calculate vertical distances located on the cellcenter
ds_full.coords['dzt'] = grid.diff(ds_full.depth_left, 'Z', boundary='extrapolate')
```

```
[ ]:
```

```
# note that the temperature gradient is located on the `depth_left` dimension,
# but no temperature information is available, to infer the gradient in the topmost grid cell.
# The following will pad with nan towards the surface. Alternatively the values could be padded with
# with a particular value or linearly extrapolated.
dtemp_dz = grid.diff(ds.temperature, 'Z', boundary='fill', fill_value=np.nan) / ds_full.dzc
print(dtemp_dz)
fig, ax1 = plt.subplots()
ax1.invert_yaxis()
ax2 = ax1.twiny()
ds.temperature.plot(ax=ax1, y='depth', color='C0')
ax1.set_xlabel('temperature [deg C]', color='C0')
dtemp_dz.plot(ax=ax2, y='depth_left', color='C1')
ax2.set_xlabel('vertical temperature gradient [deg C / m]', color='C1');
```

### Depth weighted average¶

Another common operation for many climate datasets is a weighted mean along an unevenly spaced dimension. Using the grid spacing for the tracer cells earlier this becomes trivial.

```
[ ]:
```

```
mean_temp = ds_full.temperature.mean('depth')
mean_temp_weighted = (ds_full.temperature * ds_full.dzt).sum('depth') / ds_full.dzt.sum('depth')
print(mean_temp.data)
print(mean_temp_weighted.data)
```

## Periodic 2D example¶

Below we will show how to apply these methods similarly to a global surface wind field, which is periodic in the longitudinal (‘x’) direction.

```
[1]:
```

```
# hack to make file name work with nbsphinx and binder
import os
fname = '../datasets/uvwnd.10m.gauss.2018.nc'
if not os.path.exists(fname):
fname = '../' + fname
```

```
[ ]:
```

```
ds = xr.open_dataset(fname)
ds_full = generate_grid_ds(ds, {'X':'lon', 'Y':'lat'})
ds_full
```

As in the depth profile the longitude and latitude values are extended to the `left`

when the defaults are used. However, since the latitude is not periodic we can specify the position to extend to as `outer`

(more details here). This will extend the latitudinal positions both to the left and right, avoiding missing values later on.

```
[ ]:
```

```
grid = Grid(ds_full, periodic=['X'])
grid
```

Now we compute the difference (in degrees) along the longitude and latitude for both the cell center and the cell face. Since we are not taking the difference of a data variable across the periodic boundary, we need to specify the `boundary_discontinutity`

in order to avoid the introduction of artefacts at the boundary.

```
[ ]:
```

```
dlong = grid.diff(ds_full.lon, 'X', boundary_discontinuity=360)
dlonc = grid.diff(ds_full.lon_left, 'X', boundary_discontinuity=360)
dlonc_wo_discontinuity = grid.diff(ds_full.lon_left, 'X')
dlatg = grid.diff(ds_full.lat, 'Y', boundary='fill', fill_value=np.nan)
dlatc = grid.diff(ds_full.lat_left, 'Y', boundary='fill', fill_value=np.nan)
dlonc.plot()
dlonc_wo_discontinuity.plot(linestyle='--')
```

Without adding the `boundary_discontinuity`

input, the last cell distance is calculated incorectly.

The values we just calculated are actually not cell distances, but instead differences in longitude and latitude (in degrees). In order to calculate operators like the gradient `dlon`

and `dlat`

have to be converted into approximate cartesian distances on a globe.

```
[ ]:
```

```
def dll_dist(dlon, dlat, lon, lat):
"""Converts lat/lon differentials into distances in meters
PARAMETERS
----------
dlon : xarray.DataArray longitude differentials
dlat : xarray.DataArray latitude differentials
lon : xarray.DataArray longitude values
lat : xarray.DataArray latitude values
RETURNS
-------
dx : xarray.DataArray distance inferred from dlon
dy : xarray.DataArray distance inferred from dlat
"""
distance_1deg_equator = 111000.0
dx = dlon * xr.ufuncs.cos(xr.ufuncs.deg2rad(lat)) * distance_1deg_equator
dy = ((lon * 0) + 1) * dlat * distance_1deg_equator
return dx, dy
ds_full.coords['dxg'], ds_full.coords['dyg'] = dll_dist(dlong, dlatg, ds_full.lon, ds_full.lat)
ds_full.coords['dxc'], ds_full.coords['dyc'] = dll_dist(dlonc, dlatc, ds_full.lon, ds_full.lat)
```

Based on the distances we can estimate the area of each grid cell and compute the area-weighted meridional average temperature

```
[ ]:
```

```
ds_full.coords['area_c'] = ds_full.dxc * ds_full.dyc
```

### Compute zonal gradient of the surface wind field¶

Now that all needed grid metrics are available, we can compute the zonal temperature gradient similar as above.

```
[ ]:
```

```
du_dx = grid.diff(ds_full.uwnd, 'X') / ds_full.dxg
du_dx
```

The values of the gradient are correctly located on the cell boundary on the x-axis and on the cell center in the y-axis

```
[ ]:
```

```
import cartopy.crs as ccrs
fig, axarr = plt.subplots(ncols=2, nrows=3, figsize=[16,20],
subplot_kw=dict(projection=ccrs.Orthographic(0, -30)))
for ti,tt in enumerate(np.arange(0,30, 10)):
ax1 = axarr[ti,0]
ax2 = axarr[ti,1]
time = ds_full.time.isel(time=tt).data
ds_full.uwnd.isel(time=tt).plot(ax=ax1, transform=ccrs.PlateCarree(),robust=True)
du_dx.isel(time=tt).plot(ax=ax2, transform=ccrs.PlateCarree(), robust=True)
ax1.set_title('Zonal Surface Wind %s' %time)
ax2.set_title('Zonal Gradient of Zonal Surface Wind %s' %time)
for ax in [ax1, ax2]:
ax.set_global(); ax.coastlines();
```

The resulting gradient is correctly computed across the periodic (x-axis) boundary.

### Area weighted average¶

By using the approximated cell area, we can easily compute the area weighted average of the zonal wind.

```
[ ]:
```

```
u = ds_full.uwnd.mean('time')
u_mean = u.mean('lat')
u_mean_weighted = (u * u.area_c).sum('lat') / (u.area_c).sum('lat')
u_mean.plot(label='unweighted mean')
u_mean_weighted.plot(label='area weighted mean')
plt.legend()
plt.ylabel('zonal wind [m/s]')
plt.gca().autoscale('tight')
```

Thes two lines show the importance of applying a weighted mean, when the grid spacing is irregular (e.g. datasets gridded on a regular lat-lon grid).