Francesco Nazzaro <f.nazzaro@bopen.eu> - @fnazzaro89
B-Open, Rome - http://bopen.eu (now hiring!)
12th of July, 2017
Temperature compared to the 1961-1990 average
3 scenarios:
We are here: +1°C
The main library to deal with climate data is xarray.
Built on NumPy and pandas, it adds dimension names and
coordinate indexes to ndarray
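A minimal sketch of what dimension names and coordinate indexes add to a bare ndarray, using xarray. The array values and the coordinate labels below are made up for illustration and do not come from the dataset used in this talk.

```python
import numpy as np
import xarray as xr

# Plain ndarray: axes are anonymous and addressed by position only.
values = np.arange(24, dtype="float32").reshape(2, 3, 4)

# Same data with named dimensions and coordinate indexes
# (hypothetical coordinates, chosen only for this example).
tas = xr.DataArray(
    values,
    dims=("time", "lat", "lon"),
    coords={
        "time": np.array(["2016-01-31", "2016-02-29"], dtype="datetime64[ns]"),
        "lat": [-45.0, 0.0, 45.0],
        "lon": [0.0, 90.0, 180.0, 270.0],
    },
    name="tas",
)

# Label-based selection replaces positional indexing.
by_label = tas.sel(lat=45.0, lon=90.0)
by_position = values[:, 2, 1]
```

Both selections return the same numbers, but the labelled one does not require remembering which axis is which.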
from ecmwfapi import ECMWFDataServer

# Requires an ECMWF API key in ~/.ecmwfapirc
server = ECMWFDataServer()
dataset_path = "dataset.nc"
server.retrieve({
    'stream': "oper",
    'levtype': "sfc",
    # 228.128 = total precipitation, 167.128 = 2 metre temperature
    'param': "228.128/167.128",
    'dataset': "interim",
    'step': "0",
    'grid': "0.75/0.75",
    'time': "00",
    'date': "1979-01-01/to/2016-12-31",
    'type': "an",
    'class': "ei",
    'format': "netcdf",
    'target': dataset_path,
})
To get climate data from ECMWF, browse the archive catalogue at http://apps.ecmwf.int/archive-catalogue/
>>> import xarray as xr
>>> dataset = xr.open_dataset(dataset_path)
>>> print(dataset)
<xarray.Dataset>
Dimensions: (lat: 241, lon: 480, time: 452)
Coordinates:
* lat (lat) float32 -90.0 -89.25 -88.5 -87.75 -87.0 -86.25 -85.5 ...
* lon (lon) float64 -180.0 -179.2 -178.5 -177.8 -177.0 -176.2 -175.5 ...
* time (time) datetime64[ns] 1979-01-31 1979-02-28 1979-03-31 ...
Data variables:
tas (time, lat, lon) float32 249.589 249.589 249.589 249.589 ...
tprate (time, lat, lon) float32 9.30046e-10 9.30046e-10 9.30046e-10 ...
[Figure: the dataset as a cube with variables precipitation (tprate) and temperature (tas) over the dimensions lat, lon and time]
You can select a single variable:
>>> print(dataset.tas)
<xarray.DataArray 'tas' (time: 452, lat: 241, lon: 480)>
dask.array<open_dataset-00bb1754472e239912b40328fcbb22f6tas, shape=(452, 241, 480), dtype=float32, chunksize=(452, 241, 480)>
Coordinates:
* lat (lat) float32 -90.0 -89.25 -88.5 -87.75 -87.0 -86.25 -85.5 ...
* lon (lon) float64 -180.0 -179.2 -178.5 -177.8 -177.0 -176.2 -175.5 ...
* time (time) datetime64[ns] 1979-01-31 1979-02-28 1979-03-31 ...
Attributes:
standard_name: air_temperature
long_name: 2 metre temperature
units: K
>>> dataset.tas.shape
(452, 241, 480)
>>> tas_2016 = dataset.tas.sel(time='2016-01')
>>> tas_2016.shape
(1, 241, 480)
>>> print(tas_2016)
<xarray.DataArray 'tas' (time: 1, lat: 241, lon: 480)>
dask.array<getitem, shape=(1, 241, 480) ...
Coordinates:
* lat (lat) float32 -90.0 -89.25 ...
* lon (lon) float64 -180.0 -179.2 ...
* time (time) datetime64[ns] 2016-01-31
Attributes:
standard_name: air_temperature
long_name: 2 metre temperature
units: K
>>> tas_2016.plot(cmap='RdBu')
tas_2016 depends only on lat and lon, so plot() draws it as a map.
xarray Dataset and DataArray objects expose most of the methods and attributes of NumPy arrays, plus many more methods for operating on labelled data.
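As a small illustration of DataArray behaving like a NumPy array while adding label-aware methods, the sketch below compares a NumPy reduction by axis number with the same reduction by dimension name. The data is synthetic.

```python
import numpy as np
import xarray as xr

data = np.arange(12, dtype="float64").reshape(3, 4)
field = xr.DataArray(data, dims=("lat", "lon"))

# NumPy-style attributes are available on the DataArray.
same_shape = field.shape == data.shape

# NumPy reduces along an axis number...
numpy_mean = data.mean(axis=1)
# ...while xarray can name the dimension to reduce over.
xarray_mean = field.mean(dim="lon")
```

Naming the dimension keeps the code correct even if the axis order of the underlying array changes.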
>>> tas_rimini = dataset.tas.sel(lat=44.05755, lon=12.56528, method='nearest')
>>> print(tas_rimini)
<xarray.DataArray 'tas' (time: 452)>
dask.array<getitem, shape=(452,), dtype=float32, chunksize=(452,)>
Coordinates:
lat float32 44.25
lon float64 12.75
* time (time) datetime64[ns] 1979-01-31 1979-02-28 1979-03-31 ...
Attributes:
standard_name: air_temperature
long_name: 2 metre temperature
units: K
>>> tas_rimini.plot()
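The nearest-neighbour lookup used for Rimini can be reproduced without downloading anything. This sketch assumes a regular 0.75 degree grid like the one shown in the dataset summary and fills it with zeros, since only the coordinates matter here.

```python
import numpy as np
import xarray as xr

# Regular 0.75 degree grid: 241 latitudes and 480 longitudes.
lat = np.arange(-90, 90.75, 0.75)
lon = np.arange(-180, 180, 0.75)

# Placeholder values; only the coordinate indexes are of interest.
grid = xr.DataArray(
    np.zeros((lat.size, lon.size), dtype="float32"),
    dims=("lat", "lon"),
    coords={"lat": lat, "lon": lon},
)

# method="nearest" snaps the requested point to the closest grid node.
point = grid.sel(lat=44.05755, lon=12.56528, method="nearest")
nearest_lat = float(point.lat)
nearest_lon = float(point.lon)
```

The selected node is lat 44.25, lon 12.75, matching the coordinates printed for tas_rimini.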
>>> climatology = tas_rimini.groupby('time.month').mean('time')
>>> print(climatology)
<xarray.DataArray 'tas' (month: 12)>
dask.array<stack, shape=(12,), dtype=float32, chunksize=(1,)>
Coordinates:
lat float32 44.25
lon float64 12.75
* month (month) int64 1 2 3 4 5 6 7 8 9 10 11 12
>>> climatology.plot()
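To see what groupby('time.month').mean('time') computes, here is a sketch on a synthetic two-year monthly series. The values are invented so that the result is easy to check by hand: the first year holds the month number and the second year holds the month number plus 2.

```python
import numpy as np
import pandas as pd
import xarray as xr

# 24 monthly time stamps (month starts, chosen for this example).
time = pd.date_range("2000-01-01", periods=24, freq="MS")

# Year 1: month number; year 2: month number + 2.
months = time.month.values.astype("float64")
values = months + np.repeat([0.0, 2.0], 12)

series = xr.DataArray(values, dims=("time",), coords={"time": time}, name="tas")

# Group all Januaries, all Februaries, ... and average each group.
climatology = series.groupby("time.month").mean("time")
```

Each of the 12 groups averages two values, so the climatology for month m is m + 1.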
Dask divides arrays into many small pieces, called chunks, each of which is presumed to be small enough to fit into memory.
Operations on dask arrays are lazy. They queue up a series of tasks mapped over blocks, and no computation is performed until you actually ask for the values to be computed.
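A minimal sketch of chunking and lazy evaluation with dask.array alone, independent of xarray. The array size and chunk size are arbitrary choices for illustration.

```python
import dask.array as da

# A 1000 x 1000 array of ones, split into chunks of 250 x 250.
x = da.ones((1000, 1000), chunks=(250, 250))

# Number of chunks along each axis.
n_blocks = x.numblocks

# This only builds a task graph; nothing is computed yet.
lazy_total = (x + 1).sum()

# compute() runs the queued tasks and returns a concrete value.
total = lazy_total.compute()
```

Until compute() is called, lazy_total is still a dask array describing the work to do rather than a number.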
Open the dataset with chunks:
>>> from dask.dot import dot_graph
>>> dataset = xr.open_dataset('dataset.nc', chunks={'lat':200, 'lon':200}) # chunks of at most 200 points along lat and lon
>>> tas_rimini = dataset.tas.sel(lat=44.05755, lon=12.56528, method='nearest')
>>> climatology = tas_rimini.groupby('time.month').mean('time')
>>> dot_graph(climatology.data.dask)
[Figure: a NetCDF file divided into chunks]
Resources from CDS
Python editor with xarray and custom climate tools
Application preview for End Users