GIS for energy simulation and analysis 1 : case studies.

Introduction

Case Studies :

Gridded wind generation, data, modelling and visualisation

PhD: Gridded modelling of wind generation, using GIS

A grid was chosen for its simplicity, ability to harmonise datasets and allow array based maths.
The power of using GIS and Python in this case was the ability to create a bespoke framework and adapt data and simulation methods to it.
Numpy arrays used to create model grid
- conceptual as did not need to be spatially accurate
- Only a common key necessary
- Looping, slicing and mathematical operation become trivial
Assigning data to this conceptual framework, using a geographical grid was done in ArcGIS
- Spatial referencing necessary
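
The conceptual grid idea can be sketched as below. This is a minimal illustration, not the PhD code: the grid size, the `gis_records` values and all variable names are hypothetical, and the only thing linking the array to the GIS layer is the shared cell ID.

```python
import numpy as np

# Hypothetical grid: 4 rows x 3 columns (illustrative, not the PhD grid size).
n_rows, n_cols = 4, 3

# Common key: a unique integer ID per cell, shared with the GIS layer.
cell_id = np.arange(n_rows * n_cols).reshape(n_rows, n_cols)

# Values exported from the GIS as {cell ID: value}; the numbers are made up
# (e.g. available land in km2).
gis_records = {0: 2.5, 5: 1.0, 7: 4.0, 11: 0.5}

# Assign the GIS values to the conceptual grid using only the key.
available_land = np.zeros((n_rows, n_cols))
for key, value in gis_records.items():
    row, col = np.unravel_index(key, cell_id.shape)
    available_land[row, col] = value

# Slicing and array maths are then one-liners.
northern_half = available_land[: n_rows // 2, :]
share = available_land / available_land.sum()
```

The array itself carries no coordinates; spatial referencing stays in the GIS and is only needed when data are assigned or mapped.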

SpWind - PhD Spatiotemporal Wind model III

Once the grid has been chosen there is a four phase process for creating a generation timeseries

Analyse available land, Exclude land from development
Identify high value areas
Allocate annual capacity to available zones
Simulate

PhD: Gridded modelling of wind generation, Phase 1: GIS analysis of available land

Layers of spatial restrictions on development merged
Land use of GB wind farms examined - only common types used.

Plots show uses of onshore turbines and the evaluation of the exclusion analysis - very few farms built in exclusion zones
ArcGIS and matplotlib

Phase 2 & 3 : Allocate capacity to available high quality areas

Using National Grid Energy Scenarios (circa 2012)

Allocate these to land identified in previous slide, where wind quality is good.

Phase 4 : Wind Simulation - Past and Present Research

Simulating generation from wind turbines has evolved from station data to reanalysis data

Generation from wind turbines has been simulated from weather data for many decades
The most recent crop of simulation studies can find roots in work from Graham Sinden, who used onshore MIDAS station data
Teams at Reading, Imperial and Edinburgh Universities (as well as UCL of course) have adapted these methods for reanalysis data,
- making significant improvements in accuracy, scope and resolution
- mostly focussed on wind turbine performance and demand supply matching
  - Only UCL modelling demand
- recently this field has expanded to include research in Germany
- and incorporated into energy systems optimisations
- Since I last worked on this there has, no doubt, been further progress
Hardware, software and data have all improved and continue to do so - the relationship between weather and power remains more or less the same ...

Phase 4 : Wind Simulation Fundamentals

Establish the wind speed at the location of simulation
- Assume that nearby measurement represents the site
  - Mast data or grid point reanalysis
- Or try to more closely represent conditions by altering wind speeds somehow
  - Statistical or dynamic downscaling
Estimate the wind speed at the height of turbine
- Extrapolate upwards using law of choice
- Or down if using pressure level wind speeds
Convert wind speed to power
- Using measured relationship
- Or physical relationship
- Choose whether to include swept area and air density

Simulating or estimating generation relies on a fundamental 3 step process; details will come in the following model examples

SpWind - PhD Spatiotemporal Wind model

GB only, CFSR driven, gridded model of wind generation for scenario disaggregation

Use only grid point centroid values, assuming that these represent the conditions within the grid square.
- To establish that this was the case, points were evaluated against 10 m MIDAS data
- Strong correlation and low error found (plots)
- High sites less well represented, these aren't used for development however
- Might not be suitable for areas with complex terrain
Only 1 near surface height therefore extrapolate using the power law with a single assumed exponent for onshore and one for offshore
- Simplification of surface roughness and atmospheric conditions (room for improvement)
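
The power law height correction can be sketched as follows. The exponents, the 10 m reference height and the 80 m hub height are assumptions for illustration only; the slides do not state the values used in SpWind.

```python
def power_law(u_ref, z_ref, z_hub, alpha):
    """Extrapolate wind speed from height z_ref to z_hub using the power law."""
    return u_ref * (z_hub / z_ref) ** alpha


# Assumed shear exponents, for illustration only (not the SpWind values).
ALPHA_ONSHORE = 1.0 / 7.0
ALPHA_OFFSHORE = 0.11

u10 = 6.0  # hypothetical wind speed at 10 m, in m/s

# Same near surface speed, different assumed exponent by surface type.
u_hub_onshore = power_law(u10, 10.0, 80.0, ALPHA_ONSHORE)
u_hub_offshore = power_law(u10, 10.0, 80.0, ALPHA_OFFSHORE)
```

A single exponent per surface type ignores changes in roughness and atmospheric stability, which is the simplification noted above.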

ESTIMO_wind - RESTLESS wind simulation

MERRA driven, wind farm specific, Global Capacity factor timeseries derivation

Closest four points bilinearly interpolated to farm location
Curve fitted using wind speeds at 2, 10 and 50 m using a linearised version of the log law equation
- friction velocity and roughness length estimated using least squares optimisation (scipy).
This stage of the model can be evaluated by comparing timeseries to met masts
Some sites are better than others - higher degree of accuracy where all close points have similar surface conditions
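
The linearised log law fit can be sketched as below. This is an illustration rather than the ESTIMO_wind code: `np.polyfit` is used here as a stand-in for the scipy least squares routine, the wind speeds are made up, and the function names are hypothetical. The log law u(z) = (u*/k) ln(z/z0) is rewritten as u = a ln(z) + b, so that a = u*/k and b = -a ln(z0).

```python
import numpy as np

KAPPA = 0.4  # von Karman constant


def fit_log_law(heights, speeds):
    """Least squares fit of the log law to speeds at several heights.

    Returns (friction velocity u_star, roughness length z0).
    """
    a, b = np.polyfit(np.log(heights), speeds, 1)  # straight line in ln(z)
    u_star = a * KAPPA
    z0 = np.exp(-b / a)
    return u_star, z0


def log_law(z, u_star, z0):
    """Wind speed at height z from the fitted parameters."""
    return (u_star / KAPPA) * np.log(z / z0)


heights = np.array([2.0, 10.0, 50.0])  # heights of the MERRA wind speeds, in m
speeds = np.array([3.2, 4.9, 6.5])     # hypothetical speeds for one hour, m/s

u_star, z0 = fit_log_law(heights, speeds)
u_hub = log_law(80.0, u_star, z0)      # assumed 80 m hub height
```

The fit is repeated for every timestep; calm hours with no increase of speed with height (a close to zero) would need handling separately in a real model.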

SpWind - PhD Spatiotemporal Wind model II

Convert to power using one offshore and one onshore archetypal curve and uniform hub heights
- A simplified method ...
Not a very high degree of accuracy
Method does, however, facilitate scenario analysis, particularly when paired with demand modelling, described later
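
Converting hub height wind speed to power with an archetypal curve can be sketched as below. The curve points, the cut out speed and the 500 MW capacity are invented for illustration and are not the curves used in SpWind.

```python
import numpy as np

# Hypothetical archetypal power curve: wind speed (m/s) -> capacity factor (0-1).
CURVE_SPEED = np.array([0.0, 3.0, 5.0, 8.0, 11.0, 13.0, 25.0])
CURVE_CF = np.array([0.0, 0.0, 0.1, 0.45, 0.9, 1.0, 1.0])
CUT_OUT = 25.0  # assumed cut out speed, m/s


def speed_to_capacity_factor(speed):
    """Linearly interpolate the curve; output is zero above cut out."""
    speed = np.asarray(speed, dtype=float)
    cf = np.interp(speed, CURVE_SPEED, CURVE_CF)
    return np.where(speed > CUT_OUT, 0.0, cf)


hub_speeds = np.array([2.0, 6.5, 12.0, 30.0])  # hypothetical hourly speeds
capacity_factor = speed_to_capacity_factor(hub_speeds)

# Scenario generation is the capacity factor times the capacity in the cell.
generation_mw = capacity_factor * 500.0  # hypothetical 500 MW allocated
```

Using one onshore and one offshore curve means only two lookup tables are needed for the whole grid, which is what makes the scenario analysis cheap.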

ESTIMO_wind - RESTLESS wind simulation

All wind farms simulated separately using data from thewindpower.net
Site specific wind speed heights and farm curves
Increased detail necessitates the use of Legion - great when it works ...

ESTIMO_wind - RESTLESS wind simulation

The use of MERRA, including multiple wind speed heights, combined with site specific simulation and metadata significantly improves the accuracy of historic simulation
Evaluating simulated wind generation is relatively straightforward due to good information on capacity and generation
Countries with large diverse capacity are easier to simulate, as demonstrated by the plots of simulated vs measured German and UK wind

ESTIMO_wind - RESTLESS wind simulation

Location specific simulations or countries with less spatial diversity are less easy to simulate and verify, as demonstrated by the plots of a single UK wind farm and Finland.

ESTIMO_wind - RESTLESS wind simulation

Scenario modelling using European data can be more sophisticated than SpWIND due to more data
- e.g. data on different classifications of wind farm as shown in the plots
Scenarios first use operational, then under construction, approved and planned
- Therefore representative of evolving capacity.
Overall the RESTLESS method significantly improves accuracy and capability of Energy Space Time wind simulation
- The maintained MERRA database also enables significant other work

Method improvements

Factors not incorporated in model

Wake Effects
Downtime
Lag in turbine operation
Energy Density over blades
Air density
Decline in performance with age
Wind speed changes < 1 hour in frequency
Location specific turbine curves
Operating efficiency
Atmospheric stability, surface roughness and orography in height correction

Despite the considerable work and excellent data there is still a lot of room for improvement in the simulation, mainly down to the factors described to the right
Some of this can be theoretically incorporated through statistical alterations of the curves, as seen on the right
Or better calibration, though this is not trivial
Improved wind speed data can also be introduced in future studies
Particularly in regional studies or those addressing climate change.

PhD: Gridded modelling of wind generation - analysis and visualisation using Python

Matplotlib 3d wireframes, animated using a video editor

SpWind - PhD outputs

Hourly variability in wind generation, electricity demand and residual demand. Matplotlib and ArcGIS

Increased variability in both scenarios, with higher capacity factors throughout the later years under Gone Green (on the left), especially in the colder months.

Predictable variability under both scenarios for all years. Little evidence of the impact of heat pumps on the temporal patterns of electricity demand.

The Gone Green scenario experiences greater variability as a result of more wind capacity, particularly offshore.

Wind Generation

Electricity Demand

Residual Demand

PhD: Gridded modelling of wind generation - analysis and visualisation using Python

Hourly variability in residual demand - matplotlib images (adapted)

PhD: Gridded modelling of wind generation - analysis and visualisation using Python

See Sinden (2007) for original method

PhD: Gridded modelling of wind generation - analysis and visualisation using Python

Correlation coefficient between timeseries of generation from pairs of grid squares under different scenarios
Demonstrating that in terms of increasing spatial diversity, much can be achieved using onshore only farms in GB
Due to the long thin nature of the land mass and the weather patterns.
Matplotlib 2d histograms
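
The pairwise correlation step can be sketched as below, using synthetic data in place of the simulated grid square timeseries. The array shapes, the random generation series and the variable names are all assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_hours, n_cells = 1000, 5

# Synthetic generation: a shared weather signal plus local noise per cell.
shared = rng.random(n_hours)
generation = 0.6 * shared[:, None] + 0.4 * rng.random((n_hours, n_cells))

# Correlation matrix between all cells (columns are cells, rows are hours).
corr = np.corrcoef(generation, rowvar=False)

# Keep each pair once: the upper triangle, excluding the diagonal.
i, j = np.triu_indices(n_cells, k=1)
pair_corr = corr[i, j]
```

Pairing `pair_corr` with the distance between each pair of cells gives two arrays that can be binned into a 2d histogram, for example with `matplotlib.pyplot.hist2d`.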

PhD: Gridded modelling of wind generation - analysis and visualisation using Python

Blog: Animated maps of renewable energy modelling

PhD: Gridded modelling of wind generation - animated output

Case Studies :

Gridded solar generation, data, modelling and visualisation

Solar generation

MERRA driven, gridded, global capacity factor timeseries derivation

Like wind, methods for simulating solar generation are well established
Even better, some of them are already coded in Python.
For that reason Stefan Pfenninger's GSEE model was used for PV generation estimation in RESTLESS, as described in the plot
It was adapted slightly to run for all MERRA grid points for all hours
Assuming a 1 MW fixed panel with optimum direction and tilt
207,937 points * 8760 hours * 37 years
- > 67 billion generation values
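
The size of the problem can be checked with a few lines of arithmetic. The storage estimate is an added assumption (one 4 byte float per value, uncompressed) and is not a figure from the original work.

```python
n_points = 207_937      # MERRA grid points
hours_per_year = 8760   # ignores leap days
n_years = 37

# Total number of hourly generation values across the whole grid.
n_values = n_points * hours_per_year * n_years

# Assumption: each value stored as a 4 byte float with no compression.
size_gb = n_values * 4 / 1e9
```

Under these assumptions the output is on the order of hundreds of gigabytes, which is consistent with the need for a cluster.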

* Legion required again

PV output simulated using top of atmosphere and ground level irradiance and ground temperature data and a physical model - adapted from Stefan's method
All hours from 1980 simulated for entire grid - Non tracking panels 1 MW per grid centroid assumed.
Aggregated to country by assuming either geographically distributed or population weighted capacity (method currently being tested). Can be aggregated using any desired geography.
Output = long time series of capacity factors for all global countries - easy to use in scenarios. Renewables.ninja provides European timeseries (only 10 years)

Module - Global PV simulation

Simulated monthly mean global capacity factors using 1980 meteorology

Solar generation - global hourly capacity factors using 1980 meteorology

Solar generation

Evaluation and calibration of a solar simulation model is harder to carry out than wind, due to limited information on installed capacity and uncertain measured generation timeseries
The meteorology is much simpler, however, therefore if some confidence can be gained in the simulation scenario modelling means simply multiplying capacity factors by assumed future capacities
Climate change is unlikely to affect irradiance as much as wind speed over larger spatial scopes.
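
Scenario modelling by scaling capacity factors can be sketched as below. The capacity factors and the scenario capacities are hypothetical values chosen for illustration.

```python
import numpy as np

# Hypothetical hourly PV capacity factors for one country (fraction of capacity).
capacity_factor = np.array([0.0, 0.12, 0.55, 0.31, 0.0])

# Hypothetical installed capacity by scenario year, in MW.
scenario_capacity_mw = {"2020": 13_000.0, "2030": 40_000.0}

# Generation is the capacity factor timeseries times the assumed capacity.
generation_mw = {
    year: capacity_factor * capacity
    for year, capacity in scenario_capacity_mw.items()
}
```

The same capacity factor series is reused for every scenario, so the cost of the meteorological simulation is paid only once.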

Case Studies :

Capacity visualisation

Blog: Animated maps of renewable energy capacity https://esenergyvis.wordpress.com/

Blog: Animated maps of renewable energy capacity

Animated map of historical wind capacity

ArcGIS mapping, Python plotting and a video editor

Case Studies :

Air Pollution

DEFRA Gridded background pollution projections

DEFRA gridded 1km x 1km 2011 - 2030 projections of NOx and PM
Disaggregated by source including roads, domestic, commercial, rail, vehicles and point sources
Internal and external
Current year tethered to reality, projections = scenarios
Can be used in environmental modelling
Rasters visualised using ArcGIS - harmonised legend values, export to image, gif through photoshop
Photoshop good for small gifs
Larger number of images = after effects
Significant levels of background pollution come from outside of the country
Move to Northern Scotland for clean air!

The 5 worst polluted areas in the country

Using the same data and visualisation only, it is possible to examine some of the aspects of the projections
E.g. road pollution clearly reduces significantly through electrification in these scenarios
But pollution from other forms of transport, including air and sea, becomes proportionally more significant
Can you tell which areas are the worst?

GB Boroughs ranked by background NOx pollution : 2011 - 2030

Using a GIS, the gridded air pollution data was joined with census geographies representing boroughs - a fundamental GIS process.
That data was then exported to a spreadsheet (this could be done in Python, including internally in ArcGIS)
The mean NOx value by borough was calculated each year and the boroughs ranked each year
The result is a simplification of some spatiotemporal data into a much easier to understand form
The plot shows interesting artefacts of the scenarios
- London boroughs remain the most polluted
- There are significant improvements for boroughs with high levels of transport pollution (e.g. Lewisham)
- Boroughs with airports and sea ports nearby clearly fall
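
The spreadsheet step (mean NOx by borough, then rank by year) can be sketched in Python as below. The borough names and NOx values are invented; each record stands for one grid cell that has already been joined to a borough in the GIS.

```python
from collections import defaultdict
from statistics import mean

# (borough, year, NOx value) per joined grid cell - hypothetical data.
cells = [
    ("A", 2011, 60.0), ("A", 2011, 50.0), ("B", 2011, 40.0), ("C", 2011, 45.0),
    ("A", 2030, 30.0), ("A", 2030, 26.0), ("B", 2030, 32.0), ("C", 2030, 20.0),
]

# Collect the cell values for each (year, borough) pair.
values = defaultdict(list)
for borough, year, nox in cells:
    values[(year, borough)].append(nox)

# Mean NOx per borough per year.
means = {key: mean(v) for key, v in values.items()}

# Rank boroughs within each year, 1 = most polluted.
ranks = {}
for year in sorted({y for y, _ in means}):
    ordered = sorted(
        (b for y, b in means if y == year),
        key=lambda b: means[(year, b)],
        reverse=True,
    )
    ranks[year] = {borough: i + 1 for i, borough in enumerate(ordered)}
```

Plotting `ranks` by year for each borough gives the kind of rank chart described above.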

Gridded Air Pollution

The London Atmospheric Emissions Inventory provides data on background air pollution on a 20 m grid

Roadside air pollution

DEFRA also provide estimates of roadside air pollution, estimated from traffic counts and emissions factors as well as measured pollution and dispersion modelling.
This dataset was combined with data on schools to identify establishments within 150 m of roads exceeding the EU limit value of 40 µg/m3 of NO2.
- Used in the poisoned playgrounds campaign
A GIS was used to calculate Euclidean distances to all roads in the dataset
Results presented in a web GIS
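
The distance test that the GIS performs can be sketched in plain Python as below. The coordinates, school names and road segments are hypothetical, and the sketch assumes projected coordinates in metres (for example British National Grid) so that Euclidean distance is meaningful.

```python
import math


def point_segment_distance(px, py, ax, ay, bx, by):
    """Shortest Euclidean distance from point P to the segment A-B."""
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:  # degenerate segment: A and B coincide
        return math.hypot(px - ax, py - ay)
    # Position of the projection of P along the segment, clamped to [0, 1].
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


# Hypothetical road segments exceeding the NO2 limit, as ((x1, y1), (x2, y2)).
exceeding_roads = [((0.0, 0.0), (1000.0, 0.0)), ((1000.0, 0.0), (1000.0, 500.0))]
schools = {
    "School A": (500.0, 120.0),
    "School B": (400.0, 300.0),
    "School C": (1100.0, 600.0),
}

LIMIT_M = 150.0  # distance threshold from the slide
flagged = {}
for name, (x, y) in schools.items():
    nearest = min(point_segment_distance(x, y, *a, *b) for a, b in exceeding_roads)
    if nearest <= LIMIT_M:
        flagged[name] = nearest
```

A GIS does the same thing with spatial indexing, which matters when every school is tested against every road in a national dataset.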

Case Studies :

Simulating energy demand

SpWind - PhD SpDEAM - Spatiotemporal demand model

PhD: Demand modelling necessitates more data harmonisation

Very few datasets are in the correct framework, therefore considerable work done harmonising to grid. ArcGIS outputs.

SpWind - PhD SpDEAM

Case Studies :

Simulating energy demand : Buildings

SpWind - PhD SpDEAM

Extending the Estimation of energy demand to non domestic

Building footprints from Ordnance Survey Mastermap
Building heights from OS
- More data from e.g. Lidar
Derive simple volume
Non domestic building use from OS Addressbase premium
Assign to built form
Handle multiple uses
Energy intensity by use from academic research
Energy demand = floor area * intensity
Simple method, but uses data that covers whole of GB and requires very little in model computation
Significant improvements are available through better data, e.g. VOA, metered data and Lidar
- Massive increase in complexity
- see simstock and 3d stock
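
The simple method above can be sketched as below. The storey height, the energy intensities and the equal split of floor area between multiple uses are all assumptions made for illustration; the slides do not say how these were handled in SpDEAM.

```python
STOREY_HEIGHT_M = 3.0  # assumed average storey height

# Hypothetical energy intensities by use, in kWh per m2 per year.
INTENSITY = {"office": 200.0, "retail": 250.0}


def floor_area(footprint_m2, height_m):
    """Estimate floor area from footprint and height (at least one storey)."""
    storeys = max(1, round(height_m / STOREY_HEIGHT_M))
    return footprint_m2 * storeys


def annual_demand_kwh(footprint_m2, height_m, uses):
    """Energy demand = floor area * intensity, split equally between uses."""
    area_per_use = floor_area(footprint_m2, height_m) / len(uses)
    return sum(area_per_use * INTENSITY[use] for use in uses)


# Hypothetical building: 400 m2 footprint, 9 m high, two recorded uses.
volume_m3 = 400.0 * 9.0
demand = annual_demand_kwh(400.0, 9.0, ["office", "retail"])
```

Because each building needs only a handful of multiplications, the method scales to the whole of GB with very little computation.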

Mastermap vs. Openstreetmap

LIDAR point cloud to building extrusion - UCL

Lidar data continued

There are LIDAR derived building height datasets available from EMU analytics, or the raw data from the environment agency

See website

Case Studies :

Simulating energy demand : People

SpWind - PhD SpDEAM

Gridded population data

Datasets vary in scope and resolution and also in spatial referencing
All are based on census data, redistributed to a grid
Therefore represent evening domestic population
Some datasets use ancillary data, e.g. nighttime lights or surveys of workplaces to improve representation
Work to be done on day time population, therefore not the best option for all questions
1 km x 1 km longitudinal data for Europe available via geostat
1 km x 1 km data for GB via Centre for Ecology and Hydrology - same grid as air pollution, useful for modelling
Global data available from multiple sources, the most popular is the Global Rural Urban Mapping Project (GRUMP).
- 30 arc second grid, close to 1 km, but causes a grid mismatch with regional coordinate systems (see plot)
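
The grid mismatch can be illustrated by converting 30 arc seconds to kilometres at different latitudes. This is a rough sketch assuming a spherical Earth with a mean radius of 6371 km; the function name is hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius, spherical approximation


def cell_size_km(lat_deg, arc_seconds=30.0):
    """Approximate (east-west, north-south) size of a grid cell in km."""
    step_rad = math.radians(arc_seconds / 3600.0)
    north_south = EARTH_RADIUS_KM * step_rad
    # Meridians converge towards the poles, so the cell narrows with latitude.
    east_west = north_south * math.cos(math.radians(lat_deg))
    return east_west, north_south


sizes = {lat: cell_size_km(lat) for lat in (0.0, 55.0)}
```

Under this approximation a cell is about 0.93 km square at the equator but only about 0.53 km wide at 55°N, so it cannot line up with a regular 1 km grid in a projected regional coordinate system.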

Population data

Gridded population densities can be used in conjunction with projections of population for simulation models etc.
Need to be careful of data artefacts in many cases due to calibration etc
e.g. plot

Population data

Much of the analysis done on datasets which include geotagging is essentially population mapping, albeit at a potentially high temporal resolution, for example google location services

Here Python mapping modules were used to show all of the location data my phone collected (before I turned it off, because it is creepy)

Case Studies :

Simulating energy demand : Transport

Roads

Department for Transport produce vehicle km data for major roads from measurements of Annual Average Daily flow
- by vehicle type
- by road
- these can cut city boundaries, so a GIS is needed to estimate what proportion is in the city (see plot)
Minor road data is not attributed to roads, but there are a subset of measurements which can be used
See also, OS opendata for roads shapefiles and sub national road consumption statistics
Spatial transport energy demand can then be estimated

Minor roads - Birmingham

Major roads - Birmingham

Rail

Passenger km data are available from the Office of Rail Regulation (ORR), disaggregated by operator
The General Transit Feed Specification (GTFS) provides data on route, stations, stops and operators
The two data were joined
A simplified rail network describing trips between stations by operator was created (see plot)
The network was clipped around cities and a proportion of the passenger km assigned to travel within the city boundaries
The GIS provides the ability to make more reasonable assumptions on data allocation to different geographies.
Freight data are available by operator from ORR, but GTFS aggregate these trips to a single code so assignment must be done on a land based proportion

Data :

Meteorological data - overview.

Overview

Choosing data
Met mast data
- MIDAS
- Wunderground
- Offshore Buoys
- Data above 10 m
Reanalysis datasets
- NASA MERRA
- NCEP CFSR
- Alternatives and developments

Forecasted data
- Short term
- Long term
Maintained Database
- Access
- Contents
- Usage case studies
  - Population weighting
  - Points
  - Grids

Provide an overview of the different types of data, their strengths and weaknesses, access and usage. For more detail see knowledge_base.doc, which includes links; these are also provided at the end of this presentation

Choosing Historical Meteorological Data

Met Mast Data

For academic projects in GB the Met Office Integrated Data Archive System (MIDAS) Land and Marine Surface Stations Data (1853-current) database is the best choice

Free for research
100's of stations spread out across GB
Accessible through the Centre for Environmental Data Analysis (CEDA), registration necessary
Multiple variables
Hourly data available inc. wind speed and rainfall
< 10 m elevation
Biased to populated areas
Not homogeneous w. some data quality issues

Offshore Buoys

If offshore sites are of interest there are data from weather stations at sea

The British Oceanographic Data Centre (BODC) National Oceanographic Database (NODB)
Approx 20 offshore buoys and permanent "lightships"
Data at 6 m above sea
Wind speed, temp, wave height
Timeseries available as well as current conditions
API and bulk download

Offshore Masts

Temporary masts are often installed at planned wind farm locations

Wind speed timeseries at hub equivalent heights e.g. 100m
Data available from the Marine Data Exchange - not all reported data can be extracted - map shows accessible timeseries
These sites will occur less outside of GB, excluding Denmark, due to less offshore wind capacity
Several years often available
Wind resource evaluation
Model calibration

Reanalysis Datasets

Gridded global or regional data, huge number of weather variables for up to 100 years

Derived from Numerical Weather Prediction Models (NWP)
Tethered to observed data
Convention to provide 3 hourly "forecasts"
- High resolution data = hourly
From 10 - 100 years of hindcasted data
Homogeneous and clean
Resolution is compromised for scope in both space and time, four common combinations
- Long global coarse in space and time
- Long global coarse in space
- Long regional fine in space coarse in time
- Short regional fine in space and time
Quality varies regionally .....

NASA MERRA

37 years of global data at fine(ish) spatial and fine temporal resolution

Preferred dataset, db maintained to near present day
1979 - Present (1-2 month lag)
- MERRA 1 ends 2010, MERRA 2 = latest product
0.5° × 0.66° grid with 72 layers.
Very large number of variables. Many Hourly.
- Wind, Temperature, Water, Humidity - see database slide for those maintained on NAS
Wind speed at 2, 10, 50 m above surface as well as pressure levels - key for simulation ...
Widely used in research and commercially - may be replaced by regional reanalyses or ERA5 in near future
Files retrievable in NetCDF or GRIB form
- GRIB is the meteorologist's file format of choice
- NetCDF files contain better metadata - see later slides on access
Interpolated and extrapolated data used in simulation (will describe in detail later)
- Can estimate the accuracy of the data and method by comparing to high met mast data
- Plots show performance .....

NCEP CFSR

37 years of global data at fine(ish) spatial and fine temporal resolution

Very similar to MERRA in many ways
37 years of hourly data, multiple layers
CFSR V1- 2010, CFSR v2 from 2010
Slightly improved spatial resolution on MERRA
Only 1 wind speed height above surface
- multiple pressure levels can be used - see next presentation
Used in PhD and HighRES
Gives some novelty as MERRA is more widely used.
GB database on NAS, not maintained

Alternative Reanalysis Datasets

Regional reanalyses provide enhanced spatial resolution, other satellite derived data are available;

The choice of regional reanalysis depends on Geography
COSMO REA - see figure
- Fewer variables, wind at 2 and 10 m
- pressure level data available
- 2 or 6 km resolution!
- 2 km only for central Europe
ECMWF ERA 5
- currently under development
- global, hourly, 30 km
- improvement on MERRA according to some
- 1950 - present (long!)
- Python module for access
CMSAF AVHRR
- satellite derived irradiance
- improved accuracy on MERRA
- Less stable
Using Multiple products may improve individual accuracy, but detaches homogeneous meteorology
Others exist - see region specific originators + downscaling

Using reanalysis data - fundamentals

Reanalysis data represent a high quality record of past meteorology, which can be used to represent the future, with caution.

Long time series from reanalysis provide information on how climate has changed as well as how weather varies over multiple temporal resolutions (hours - decades) - plot shows minimum annual temperatures by country in Europe.
- These timeseries have therefore been used to examine how other things are affected by this change and variability, e.g. weather driven energy generation and demand
- The process is often called hindcasting - projecting a past timeseries into the future
However - Not all change and variability is captured
- Weather and climate will not be the same in the future as it has been in the past.
Therefore when using the data it may be better to select extreme or "average" periods or introduce stochasticity through the use of multiple periods for single simulations etc. DISCUSS .....

Forecasts

Where hindcasting is not appropriate, forecasted data are available at multiple scales and resolutions. Don't just add 2 degrees to reanalysis data!

Forecast modelling, as opposed to simulation modelling should use a different dataset - for example:
Short term prediction or forecasting, e.g. simulation of wind generation on grid to balance grid, predict price or reduce carbon content of activity (Baking messaging....)
- Global Forecast System (GFS) - see gif, gives gridded 3 hourly via FTP
  - up to ten days, 0.5 degrees
  - multi decade historical data for training
- ECMWF also provide gridded forecasts to members
- Wunderground (e.g.) give site specific hourly forecasts
  - to ten days
  - free via api (with limits)
- Ensembles (multiple datasets) can be used to fill time, accuracy degrades with time
Long term projections of climate change are available from multiple sources
- UKCP09 = weather generator, Coupled Model Intercomparison Project (CMIP5) to 2100
- Coordinated Regional Downscaling Experiment (CORDEX) 10 km, same framework as COSMO

Examples of visualising gridded met data

https://www.windy.com/?51.448,-0.138,5

Population weighting weather data

Population weighting a gridded weather dataset is a way to get a single value for each timestep that represents the weather experienced by a subset of people, for example in a country.

Recreate MERRA grid in GIS, using centroid coordinates and Voronoi polygons
GRUMP population GPW v4 data
- Extract population by country
- Sum all pop grids within a MERRA grid cell, pop grids are smaller, approximate method, with some crossover.
- New pop grid by country with 0's outside borders
- Multiply timeslice weather by new grid, sum and divide by total country population
temperature at 2m (K), wind speed at 2m (m/s), net downward solar radiation at the surface (W/m2), specific humidity at 2 m (kg/kg), Wet bulb temperature (K) and Dew Point temperature (K) for all countries, 1980 - present on NAS
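
The final weighting step can be sketched as below, assuming the population has already been summed onto the MERRA grid for one country. The tiny 2 x 3 grid, the population counts and the temperatures are hypothetical.

```python
import numpy as np

# Population of one country summed onto the weather grid; 0 outside its borders.
population = np.array([[0.0, 1000.0, 3000.0],
                       [0.0,  500.0,  500.0]])

# Weather variable with shape (time, row, col); here 2 m temperature in K.
temperature = np.array([
    [[280.0, 282.0, 284.0], [279.0, 281.0, 283.0]],
    [[285.0, 286.0, 288.0], [284.0, 285.0, 287.0]],
])

# Multiply each timeslice by the population grid, sum over space and
# divide by the total country population.
weighted = (temperature * population).sum(axis=(1, 2)) / population.sum()
```

Broadcasting applies the same population grid to every timestep, so a multi-decade hourly series is weighted in a single array operation.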

Accessing and interpreting the maintained database

On the UCL network (inc. VPN), a Network Attached Storage (NAS) can be accessed (details available from me)
NASA MERRA
- 1979 - near present, global grid, 23 hourly variables (see next slide)
- Mix of MERRA 1 and MERRA 2
- MERRA grid shapefiles
NCEP CFSR GB subset to approx 2010
Population weighted weather timeseries for whole globe as detailed previously
Access via browser:
- http://128.40.180.9

MERRA data on the maintained database

Links and further material

MIDAS data @ CEDA http://catalogue.ceda.ac.uk/uuid/220a65615218d5c9cc9e4785a3234bd0

Weather Underground https://www.wunderground.com/wundermap

Buoys Map http://www.ndbc.noaa.gov/maps/United_Kingdom.shtml

Buoys data https://www.bodc.ac.uk/data/bodc_database/nodb/search/

Marine Data exchange http://www.marinedataexchange.co.uk/wind-data.asp

Reanalysis general info http://www.reanalysis.org

MERRA 2 download https://disc.gsfc.nasa.gov/daac-bin/FTPSubset2.pl

ECMWF ERA5 https://software.ecmwf.int/wiki/display/CKB/How+to+download+ERA5+data+via+the+ECMWF+Web+API

COSMO REA2 http://reanalysis.meteo.uni-bonn.de/?Download_Data___COSMO-REA2

CMSAF https://wui.cmsaf.eu/safira/action/viewDoiDetails?acronym=SARAH_V002

Global Forecast System https://www.ncdc.noaa.gov/data-access/model-data/model-datasets/global-forcast-system-gfs

NAS - see me

Methods for evaluating accuracy and variability

Mean Absolute Deviation (MAD) – using the in built pandas method .mad(), which does not totally agree with the manual method. A higher number describes larger deviation around the mean (or median) value.
Unbiased Variance (Var) - using the in built pandas method .cov() (the variance of a timeseries is its covariance with itself). See Covariance description below.
Standard Deviation (StD) - using the in built pandas method .std(). A measure of dispersion around the mean. Low standard deviation indicates values are close to the mean.
Mean - using the in built pandas method .mean(). Indicates average value of timeseries.
Pearson’s Correlation Coefficient (P) - using the in built pandas method .corr(), based on the unbiased variance. 1 equals perfect positive correlation, -1 perfect negative correlation. Correlation is a normalised version of Covariance (therefore between -1 and 1), which allows comparison of different variables.
Kendall’s Tau Correlation Coefficient (KTau) - using the in built pandas method .corr(method='kendall'). A rank based measure of correlation, also between -1 and 1.
R2 coefficient of determination (R2) – using scipy.stats.linregress(x, y) and squaring the returned rvalue - 1 equals a perfect linear fit
Root Mean Squared Error (RMSE m/s) – using a formula. The square root of the mean squared difference between timeseries, so large errors are weighted more heavily than in a simple average difference.
Covariance (COV) - using the in built pandas method .cov(). The same measure as variance, but between timeseries (pairwise covariance). Covariance is a numerical measure of the linear inter-dependency between two variables. A covariance of 0 indicates no linear relationship (not necessarily that the variables are independent), while a high and positive covariance value means that one variable tends to be big when the other is big.
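
The "manual" versions of some of these metrics can be sketched with numpy as below. This is an illustration with made up wind speeds, not the original evaluation code, and the function names are hypothetical. Note that the pandas .mad() method was removed in pandas 2.0, so a manual version such as this one is needed with current releases.

```python
import numpy as np


def mean_absolute_deviation(x):
    """Mean absolute deviation around the mean."""
    x = np.asarray(x, dtype=float)
    return np.mean(np.abs(x - x.mean()))


def rmse(simulated, measured):
    """Root mean squared error between two timeseries."""
    diff = np.asarray(simulated, dtype=float) - np.asarray(measured, dtype=float)
    return np.sqrt(np.mean(diff ** 2))


def pearson(x, y):
    """Pearson correlation: unbiased covariance normalised by the std devs."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    cov = np.sum((x - x.mean()) * (y - y.mean())) / (len(x) - 1)
    return cov / (x.std(ddof=1) * y.std(ddof=1))


measured = np.array([4.0, 6.0, 5.0, 8.0, 7.0])   # hypothetical mast speeds, m/s
simulated = np.array([4.5, 5.5, 5.5, 7.0, 7.5])  # hypothetical model speeds, m/s

mad = mean_absolute_deviation(measured)
error = rmse(simulated, measured)
r = pearson(simulated, measured)
r_squared = r ** 2  # for a simple linear regression, R2 is the square of r
```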

GIS for energy simulation and analysis 1 : case studies

By Ed Sharp

Guest lecture for the Energy Systems and Data Analytics MSc 4th March 2020