Concatenating a gridded rainfall reanalysis dataset into a time series
Context¶
Purpose¶
To load and extract a region of interest from a gridded rainfall reanalysis dataset, and concatenate into a time series using the Iris package.
Preprocessing description¶
Time series data allows us to carry out a wide range of analyses including but not limited to trend, seasonality, anomaly detection and causality. As most of the climatological datasets are gridded, we provide a general tool to preprocess them into time series. The example global dataset from NCEP/NCAR reanalysis has a fairly low resolution (T62 Gaussian grid or approximately 1.9 * 1.9 degrees lat/long) which allows easy execution Kretschmer et al., 2021. It is openly available with a variety of atmospheric variables at near surface levels in daily and monthly frequencies as well as long-term monthly mean in NetCDF format, which is described in and can be obtained from the NOAA Physical Sciences Laboratory.
This notebook uses a single sample data file for global daily precipitation rate (monthly mean) included with the notebook.
Highlights¶
- Data for the entire globe is loaded and plotted using Iris
- Seasonal means are created by aggregating the data
- The Indonesian Borneo region is extracted and plotted
- The area-averaged time series for Indonesian Borneo region is created
- A particular season and timeframe are extracted from the time series
Contributions¶
Dataset originator/creator¶
- NOAA National Center for Environmental Prediction (creator)
Dataset authors¶
- Eugenia Kalnay, Director, NCEP Environmental Modeling Center
Load libraries¶
Source
import os
import iris
import iris.quickplot as qplt
import iris.coord_categorisation as coord_cat
import cf_units
import nc_time_axis
import matplotlib.pyplot as plt
import urllib.request
import holoviews as hv
import geoviews as gv
import warnings
warnings.filterwarnings(action='ignore')
%matplotlib inline
hv.extension('bokeh')
Set project structure¶
notebook_folder = './notebook'
if not os.path.exists(notebook_folder):
os.makedirs(notebook_folder)
Retrieve and/or load a sample data file¶
filepath = 'https://downloads.psl.noaa.gov/Datasets/ncep.reanalysis.derived/surface_gauss/'
filename = 'prate.sfc.mon.mean.nc'
if not os.path.exists(os.path.join(notebook_folder,filename)):
urllib.request.urlretrieve(os.path.join(filepath,filename), os.path.join(notebook_folder, filename))
# Load monthly precipitation data into an iris cube
precip = iris.load_cube(os.path.join(notebook_folder, filename), 'Monthly Mean of Precipitation Rate')
precip.coord('latitude').guess_bounds()
precip.coord('longitude').guess_bounds()
print(precip)
Monthly Mean of Precipitation Rate / (unknown) (time: 926; latitude: 94; longitude: 192)
Dimension coordinates:
time x - -
latitude - x -
longitude - - x
Attributes:
Conventions 'COARDS'
NCO '20121013'
References 'http://www.psl.noaa.gov/data/gridded/data.ncep.reanalysis.derived.html'
actual_range array([-2.3283064e-10, 5.8903999e-04], dtype=float32)
dataset 'NCEP Reanalysis Derived Products'
dataset_title 'NCEP-NCAR Reanalysis 1'
description 'Data is from NMC initialized reanalysis\n(4x/day). It consists of T42 ...'
history 'Mon Jul 5 23:55:54 1999: ncrcat prate.mon.mean.nc /Datasets/ncep.reanalysis.derived/surface_gauss/prate.mon.mean.nc ...'
invalid_units 'Kg/m^2/s'
least_significant_digit np.int16(0)
level_desc 'Surface'
parent_stat 'Individual Obs'
platform 'Model'
precision np.int16(1)
statistic 'Mean'
title 'monthly mean prate.sfc from the NCEP Reanalysis'
valid_range array([-400., 700.], dtype=float32)
var_desc 'Precipitation Rate'
From print(precip)
we have an idea whether the metadata is complete and where the possible gaps are. In case the iris cube does not contain a unit, we can set it as follows:
# Set unit of precipitation data, if the cube does not contain it
unit_to_add = 'kg m-2 s-1'
if precip.units == 'unknown':
precip.units = cf_units.Unit (unit_to_add)
Visualisation¶
Here we use the Iris quickplot
wrapper to matplotlib
(static plot with limited interactivity), and holoviews
to interactive plotting the gridded data with added coastline.
Static plot¶
Source
# Plot data of the first time step using iris quickplot with pcolormesh
qplt.pcolormesh(precip[0])
plt.gca().coastlines(color='white')
plt.show()

Interactive plot¶
Source
# Declare some options
options = dict(width=600, height=350, yaxis='left', colorbar=True,
toolbar='above', cmap='viridis', infer_projection=True, tools=['hover'])
# Create a geoviews dataset object
rainfall_ds = gv.Dataset(precip[0], ['longitude', 'latitude'], 'Monthly Mean Of Precipitation Rate (kg m-2 s-1)',
group='Monthly mean of precipitation rate')
plot_rainfall = rainfall_ds.to.image().opts(**options) * gv.feature.coastline().opts(line_color='white')
plot_rainfall
Create seasonal means¶
Here we construct seasonal means from the monthly data for each grid, for the purpose of extracting a particular season of interest later on.
# Add auxiliary coordinates to the cube to indicate each season
coord_cat.add_season(precip, 'time', name='clim_season')
coord_cat.add_season_year(precip, 'time', name='season_year')
print(precip)
Monthly Mean of Precipitation Rate / (kg m-2 s-1) (time: 926; latitude: 94; longitude: 192)
Dimension coordinates:
time x - -
latitude - x -
longitude - - x
Auxiliary coordinates:
clim_season x - -
season_year x - -
Attributes:
Conventions 'COARDS'
NCO '20121013'
References 'http://www.psl.noaa.gov/data/gridded/data.ncep.reanalysis.derived.html'
actual_range array([-2.3283064e-10, 5.8903999e-04], dtype=float32)
dataset 'NCEP Reanalysis Derived Products'
dataset_title 'NCEP-NCAR Reanalysis 1'
description 'Data is from NMC initialized reanalysis\n(4x/day). It consists of T42 ...'
history 'Mon Jul 5 23:55:54 1999: ncrcat prate.mon.mean.nc /Datasets/ncep.reanalysis.derived/surface_gauss/prate.mon.mean.nc ...'
invalid_units 'Kg/m^2/s'
least_significant_digit np.int16(0)
level_desc 'Surface'
parent_stat 'Individual Obs'
platform 'Model'
precision np.int16(1)
statistic 'Mean'
title 'monthly mean prate.sfc from the NCEP Reanalysis'
valid_range array([-400., 700.], dtype=float32)
var_desc 'Precipitation Rate'
# Aggregate by season
annual_seasonal_mean = precip.aggregated_by(
['clim_season', 'season_year'],
iris.analysis.MEAN)
# Check this worked
for season, year in zip(
annual_seasonal_mean.coord('clim_season')[:10].points,
annual_seasonal_mean.coord('season_year')[:10].points):
print(season + ' ' + str(year))
djf 1948
mam 1948
jja 1948
son 1948
djf 1949
mam 1949
jja 1949
son 1949
djf 1950
mam 1950
Extract Borneo region¶
Here we extract our area of study which covers the Indonesian Borneo region, as specified by Melendy et al. (2014) (available at https://
# Create a constraint for the latitude and Longitude extents
Borneo_lat = iris.Constraint(latitude=lambda v: v > -4.757 and v <= 3.211 )
Borneo_lon = iris.Constraint(longitude=lambda v: v > 107.815 and v <= 117.987 )
# Extract data based on the spatial extent
Borneo = annual_seasonal_mean.extract(Borneo_lat & Borneo_lon)
print(Borneo)
Monthly Mean of Precipitation Rate / (kg m-2 s-1) (time: 309; latitude: 4; longitude: 5)
Dimension coordinates:
time x - -
latitude - x -
longitude - - x
Auxiliary coordinates:
clim_season x - -
season_year x - -
Cell methods:
0 clim_season: season_year: mean
Attributes:
Conventions 'COARDS'
NCO '20121013'
References 'http://www.psl.noaa.gov/data/gridded/data.ncep.reanalysis.derived.html'
actual_range array([-2.3283064e-10, 5.8903999e-04], dtype=float32)
dataset 'NCEP Reanalysis Derived Products'
dataset_title 'NCEP-NCAR Reanalysis 1'
description 'Data is from NMC initialized reanalysis\n(4x/day). It consists of T42 ...'
history 'Mon Jul 5 23:55:54 1999: ncrcat prate.mon.mean.nc /Datasets/ncep.reanalysis.derived/surface_gauss/prate.mon.mean.nc ...'
invalid_units 'Kg/m^2/s'
least_significant_digit np.int16(0)
level_desc 'Surface'
parent_stat 'Individual Obs'
platform 'Model'
precision np.int16(1)
statistic 'Mean'
title 'monthly mean prate.sfc from the NCEP Reanalysis'
valid_range array([-400., 700.], dtype=float32)
var_desc 'Precipitation Rate'
# Plot data of the first season in the study region using iris quickplot with pcolormesh
qplt.pcolormesh(Borneo[0])
plt.gca().coastlines(color='white')
plt.show()

Create area-averaged time series¶
To construct a seasonal rainfall time series for the study region, we first compute the areal average rainfall. Note that due to the spherical nature of the planet Earth, the area of every grid-box is not the same, therefore we need to perform the weighted mean based on the weights by area.
# Create area-weights array
grid_area_weights = iris.analysis.cartography.area_weights(Borneo)
# Perform the area-weighted mean using the computed grid-box areas.
Borneo_mean = Borneo.collapsed(['latitude', 'longitude'],
iris.analysis.MEAN,
weights=grid_area_weights)
We then extract the temporal timescale of interest (Boreal Summers from 1950 - 2019).
jja_constraint = iris.Constraint(clim_season='jja')
year_constraint = iris.Constraint(season_year=lambda v: v > 1949 and v <= 2019 )
Borneo_jja = Borneo_mean.extract(jja_constraint & year_constraint)
print(Borneo_jja)
Monthly Mean of Precipitation Rate / (kg m-2 s-1) (time: 70)
Dimension coordinates:
time x
Auxiliary coordinates:
clim_season x
season_year x
Scalar coordinates:
latitude 0.0 degrees, bound=(-3.80947, 3.80947) degrees
longitude 114.375 degrees, bound=(109.6875, 119.0625) degrees
Cell methods:
0 clim_season: season_year: mean
1 latitude: longitude: mean
Attributes:
Conventions 'COARDS'
NCO '20121013'
References 'http://www.psl.noaa.gov/data/gridded/data.ncep.reanalysis.derived.html'
actual_range array([-2.3283064e-10, 5.8903999e-04], dtype=float32)
dataset 'NCEP Reanalysis Derived Products'
dataset_title 'NCEP-NCAR Reanalysis 1'
description 'Data is from NMC initialized reanalysis\n(4x/day). It consists of T42 ...'
history 'Mon Jul 5 23:55:54 1999: ncrcat prate.mon.mean.nc /Datasets/ncep.reanalysis.derived/surface_gauss/prate.mon.mean.nc ...'
invalid_units 'Kg/m^2/s'
least_significant_digit np.int16(0)
level_desc 'Surface'
parent_stat 'Individual Obs'
platform 'Model'
precision np.int16(1)
statistic 'Mean'
title 'monthly mean prate.sfc from the NCEP Reanalysis'
valid_range array([-400., 700.], dtype=float32)
var_desc 'Precipitation Rate'
Finally, we use the Iris quickplot
wrapper to matplotlib pyplot
(static with limited interactivity) and holoviews
to interactive plotting the time series generated.
Static plot¶
Source
# Plot time series using iris quickplot
qplt.plot(Borneo_jja)
plt.title('Borneo JJA Precipitation')
plt.show()

Interactive plot¶
Source
# As holoviews does not support direct plotting of a non-gridded cube object, we need to decompose the cube into its x- and y-axes.
time = Borneo_jja.coord('season_year').points
data = Borneo_jja.data
# Create a holoviews time series object
Borneo_jja_dynamic = hv.Curve((time, data), 'Time', 'Monthly Mean Of Precipitation Rate (kg m-2 s-1)')
# Show the plot and declare some options
plot_timeseries = Borneo_jja_dynamic.opts(width=600, height=350, yaxis='left', tools=['hover'], title="Borneo JJA Precipitation")
plot_timeseries
Save as a new NetCDF file¶
iris.save(Borneo_jja, os.path.join(notebook_folder, 'Borneo_precip_mean.nc'))
Summary¶
This notebook has demonstrated the use of the Iris
package to easily load, plot and manipulate gridded environmental NetCDF data.
Citing this Notebook¶
Please see CITATION.cff for the full citation information. The citation file can be exported to APA or BibTex formats (learn more here).
Additional information¶
Review: This notebook has been reviewed by one or more members of the Environmental Data Science book community. The open review is available here.
License: The code in this notebook is licensed under the MIT License. The Environmental Data Science book is licensed under the Creative Commons by Attribution 4.0 license. See further details here.
Contact: If you have any suggestion or report an issue with this notebook, feel free to create an issue or send a direct message to environmental
Notebook repository version: v2.0.0
Last tested: 2025-03-27
- Kretschmer, M., Adams, S. V., Arribas, A., Prudden, R., Robinson, N., Saggioro, E., & Shepherd, T. G. (2021). Quantifying Causal Pathways of Teleconnections. Bulletin of the American Meteorological Society, 102(12), E2247–E2263. 10.1175/BAMS-D-20-0117.1
- Kalnay, E., Kanamitsu, M., Kistler, R., Collins, W., Deaven, D., Gandin, L., Iredell, M., Saha, S., White, G., Woollen, J., Zhu, Y., Chelliah, M., Ebisuzaki, W., Higgins, W., Janowiak, J., Mo, K. C., Ropelewski, C., Wang, J., Leetmaa, A., … Joseph, D. (1996). The NCEP/NCAR 40-Year Reanalysis Project. Bulletin of the American Meteorological Society, 77(3), 437–472. https://doi.org/10.1175/1520-0477(1996)077<0437:TNYRP>2.0.CO;2